WO2013061019A1 - A method for organizing data and generating images of reference biological structures and related materials and the images and materials so generated


Info

Publication number
WO2013061019A1
Authority
WO
WIPO (PCT)
Prior art keywords
resource information
graph
biomedical
ontology
images
Prior art date
Application number
PCT/GB2012/000813
Other languages
French (fr)
Inventor
Bernard DE BONO
Original Assignee
De Bono Bernard
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by De Bono Bernard
Publication of WO2013061019A1

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00 - ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks

Definitions

  • Derivative subgraph - a graph consisting of a subset of nodes from a reference graph, constructed in such a way that the intervening edges of the derivative graph maintain the same connective topology between the nodes as in the original reference graph
  • Biological structure - The physical structure of the whole or part of an organism, including viruses, such as the structure of a human body, an internal organ of an animal, a single cell of a plant, a molecular complex or a single molecule.
  • Biomedical resource - A resource that is typified by an electronic dataset, whether text or image based or by a computational model, and bearing some biomedical significance
  • Biomedical resource component - A part of a biomedical resource, for example a data column in a clinical trial spreadsheet or database table, a biological variable in the code of a physiology model, a specific spatial region in a radiology image, or a disease label in a patient's health record
  • Metadata - Machine readable documentation material that is linked to a corresponding resource or component of a resource indicating how the actual content of that component resource should be interpreted
  • MySQL - A relational database management system that runs as a server providing multi-user access to a number of databases simultaneously
  • Ontology - A set of information about the nature and existence of a subject, typically consisting of terms and their relations. In the pharmaceutical domain, for instance, one ontology could characterize a taxonomic classification of drugs; in this case, every term would represent a drug or a family of drugs. In this way, the parent term “Analgesic drug family” would have a child term “Paracetamol”, and the two terms may be linked by way of a specialization relation
  • Ontology of a biological structure - A set of facts about a biological structure, typically consisting of terms describing parts of that structure and their relations. For example, an anatomy ontology for humans may describe the way in which organs and tissue regions make up the body. In such an ontology, the term “hand” would be a child node to the parent term “upper limb”, and the term “thumb” would be a child to the parent term “hand”, each linked to the other by way of a partonomy relation
  • Ontology graph - A graphic representation of an ontology in which graph nodes represent ontology terms and graph edges represent relations between terms
  • Parent-child tile nesting - Nesting consists of placing a tile representing a child node entirely within the bounds of the tile representing its parent node
  • Parent-child relation - The analogy of parent-child relations is applicable to the invention when a tree graph is assigned a root node. In the case where two nodes in such a graph are linked by the same edge, one of the two nodes, the parent node, is necessarily closer to the root than the other node, the child node. More formally, a smaller number of edges has to be traversed to reach the root from the parent node, compared to the path taken from the child node, and the path from the child to the root must necessarily traverse the parent node
  • RDF - Resource Description Framework
  • Semantic metadata - Metadata as defined here, that links the component of a resource to an ontology term
  • Tile - A spatial graphical artefact, typically in two or three dimensions, which, when depicted within a treemap manifestation of an ontology graph, represents an ontology term
  • Tile map - A type of tree map
  • Tree graph - A graphic representation of discrete facts and values of a subject that utilizes connecting and branching lines analogous to the structure of a tree to demonstrate ordinal relationships between those facts and values.
  • Treemap - A tree graph organized in a manner similar to a flowchart to show a progression of operations. Treemaps display hierarchical, tree-structured graphs as a set of nested rectangles. Each node of the tree is represented by a rectangle, which rectangle is then tiled internally with smaller rectangles representing child nodes
  • Triplestore - A purpose-built database for the storage and retrieval of RDF metadata
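The parent-child relation defined above can be illustrated with a minimal sketch. The three-term partonomy fragment and the helper names `path_to_root` and `depth` are hypothetical and are not taken from the specification; the sketch only shows that a parent is one edge closer to the root than its child and that the child's path to the root traverses the parent.

```python
# Hypothetical partonomy fragment: child term -> parent term.
# "body" is the root because it has no parent entry.
PARENT_OF = {
    "upper limb": "body",
    "hand": "upper limb",
    "thumb": "hand",
}


def path_to_root(term, parent_of):
    """Return the terms visited when walking from `term` up to the root."""
    path = [term]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def depth(term, parent_of):
    """Number of edges traversed to reach the root from `term`."""
    return len(path_to_root(term, parent_of)) - 1
```

Here `depth("hand", PARENT_OF)` is one less than `depth("thumb", PARENT_OF)`, and "hand" appears on the path from "thumb" to the root.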
  • The present invention relates to the general field of knowledge engineering and the specific field of ontology visualisation.
  • the visual depiction of biological structures requires that the complex arrangement of their parts is laid out within the constraints of scale imposed by the medium in which they are drawn. Automating the consistent visual depiction of biological structures across multiple scales is technically difficult to achieve, such that biologists have had to rely on manual methods to create complex visual representations. For instance, the generation of small-scale depictions of complex structures, such as skeletal parts and internal organs of living organisms, is constrained by the limitation imposed by the size of the media used for those depictions. This means that descriptions of larger scale features of structures and the relationships between structures and parts of structures must be omitted or may only be achieved through cumbersome manual systems of annotation and linking between separate illustrations. Prior to the computerization of formal reference knowledge about biological structures, the art encompassed only the use of manual linking between smaller and larger scale depictions.
  • It is desirable that formal knowledge about biological structures, referred to in the art as reference ontologies, be related to depictions of those structures so that the resulting visual composites of reference illustrations and related resource information may serve as visual aids to education, research and clinical practice by the biomedical community.
  • Where a reference ontology is used as a blueprint in the automated generation of a biological structure depiction in one medium, that same depiction may be used to automatically display, in the same medium, information about resource material related to that ontology.
  • A significant proportion of biomedical resources carry information that cross-references biological structures.
  • Examples of resources include a surgical report detailing a surgical procedure involving the stomach, a microscopy image of a liver cell, and a biochemical model of a glucose metabolic pathway.
  • The management of biomedical resources includes actions such as the indexing, classification, comparison and searching of data and models, carried out by computers operating through appropriate programming.
  • effective management of resources requires effective visualisation tools that display information about these resources in a meaningful way.
  • An effective approach to achieving visualisation is to display resource information in the context of diagrams of biological structures related to these resources. For instance, the schematic outline of the human body, where the body is the biological structure in issue, is routinely used by surgeons to report on a surgical procedure on some part of the body. Such a report is an example of a biomedical resource.
  • a second example pertains to the diagrammatic depiction of cell compartments as a backdrop to, or more general, higher-order context of biochemical pathways that involve these cell compartments. In this case, the pathway information is the biomedical resource, and the cell is the biological structure.
  • the present invention is concerned with the automated generation of schematic depictions of biological structures to provide a diagrammatic context to biomedical resources that are related to such structures.
  • the use of the invention provides biomedical resource information in a diagrammatic context by overlaying graphical representations of such information onto a schematic diagram of the biological structure with which it is associated.
  • the automated generation of schematic depictions relies on reference ontologies of biological structures as a source of machine-processable structural knowledge. This knowledge is used to direct the automated composition and layout of schematic diagrams of biological structures.
  • reference ontologies of biological structure may be related to biomedical resources by explicit cross-referencing.
  • Such cross-referencing generates linkages that, in the art, are referred to as semantic metadata.
  • the overlay of biomedical resource information onto the related biological structure schematic is, therefore, dependent on semantic metadata.
  • This type of metadata is utilized by the invention to associate components from a resource to terms in the depicted reference ontology.
  • This effort in the field of biomedical ontologies has resulted in standardized, machine-readable semantic metadata that link components of biomedical resources with corresponding terms in associated ontologies so as to convey explicit meaning to those resources.
  • components within biomedical resources include data columns used in clinical models, specific spatial regions in radiological images, and disease labels in patient health records.
  • Semantic metadata makes possible the automated graphical depiction of relationships between ontological terms and biomedical resources. This graphical depiction allows users of such systems to readily find resource material relating to a particular biological structure, as well as to determine the proximate relationship between ontological terms deemed pertinent to a line of inquiry and other terms.
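As a rough illustration of how semantic metadata supports finding resource material for a given structure, the sketch below indexes annotations by ontology term. The annotation identifiers and the function `index_by_term` are hypothetical; a real metadata store would more likely hold RDF triples in a triplestore or rows in a relational table.

```python
from collections import defaultdict

# Hypothetical annotations: (resource component, ontology term).
# The resource kinds echo the examples given in the text; the identifiers are invented.
ANNOTATIONS = [
    ("surgical_report#procedure_site", "stomach"),
    ("microscopy_image#region_1", "liver cell"),
    ("pathway_model#glucose_variable", "liver cell"),
]


def index_by_term(annotations):
    """Group resource components under the ontology term they are annotated with."""
    index = defaultdict(list)
    for component, term in annotations:
        index[term].append(component)
    return dict(index)
```

Looking up a term in the resulting index returns every resource component linked to it, which is the information a tile would need in order to show its metadata symbols.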
  • ontology terms and the relationships between them are depicted by ontology treemaps with each term represented by a single "tile.” Multiple ontology graphs may be arranged in treemaps to correctly display information about multiple biological structures so as to provide a beneficial and clearly visualized composite image.
  • One such example of multiple ontology graphs arranged as treemaps involves an ontology of cardiac structure, and a second ontology of white blood cell structure, such that the two ontologies may be combined to depict a nested treemap illustrating the substructures of a white cell in the lumen of the left ventricle of the heart.
  • the visualisation of formal knowledge, as terms in the context of their relations within an ontology together with associated semantic metadata is an area of active research within the biomedical community.
  • The treemap approach allows users to navigate through complex ontology graphs and to visualise metadata annotated to their terms.
  • In a treemap corresponding to an ontology graph, one tile represents a single ontology term.
  • The positioning of a tile within another tile, known in the art as “nesting”, is typically applied to represent parent-child relationships between nodes. For instance, if the anatomical term “hand” and the term “upper limb” are related in an ontology by the fact that the former is a part of the latter, the nesting of the “hand” tile within the “upper limb” tile is adopted to reflect this relation.
  • Semantic metadata associated with an ontology term may be graphically depicted within the bounds of the treemap tile representing that term. For instance, if biomedical data describing palmar injuries is associated with the term “hand” by semantic metadata, then a symbolic link to this data resource may be graphically located within the “hand” tile of the above treemap.
  • the nesting of tiles in a treemap will consistently reflect the topology of the reference ontology graph from which the treemap is drawn.
  • the calculation of the relative position of the tiles in the treemap may vary in response to key conditions such as the aspect ratio of the image to be generated and the area requirements associated with each tile node, taking into account that the size of a particular tile may need to be enlarged with respect to other tiles depending on the amount of metadata symbols it is required to contain.
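The nesting and sizing behaviour described above can be sketched with a simple slice layout. This is not the layout algorithm of the invention, which the text does not specify; the weighting rule (one unit per term plus one unit per metadata symbol) and the names `weight`, `layout` and `contains` are assumptions made for illustration.

```python
def weight(node, children, symbols):
    """Assumed area requirement: 1 per term plus 1 per metadata symbol, summed over the subtree."""
    own = 1 + symbols.get(node, 0)
    return own + sum(weight(c, children, symbols) for c in children.get(node, []))


def layout(node, rect, children, symbols, tiles=None):
    """Assign each term a rectangle (x, y, w, h) nested inside its parent's rectangle."""
    if tiles is None:
        tiles = {}
    tiles[node] = rect
    kids = children.get(node, [])
    if not kids:
        return tiles
    x, y, w, h = rect
    total = sum(weight(k, children, symbols) for k in kids)
    offset = 0.0
    for k in kids:
        share = weight(k, children, symbols) / total
        if w >= h:
            # Wide parent: slice along the horizontal axis.
            child_rect = (x + offset * w, y, share * w, h)
        else:
            # Tall parent: slice along the vertical axis.
            child_rect = (x, y + offset * h, w, share * h)
        offset += share
        layout(k, child_rect, children, symbols, tiles)
    return tiles


def contains(outer, inner, eps=1e-9):
    """True if rectangle `inner` lies within rectangle `outer` (with float tolerance)."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix + eps and oy <= iy + eps
            and ix + iw <= ox + ow + eps and iy + ih <= oy + oh + eps)


# Hypothetical example data.
CHILDREN = {"upper limb": ["arm", "hand"], "hand": ["thumb", "palm"]}
SYMBOLS = {"hand": 2}  # two metadata symbols attached to "hand"
```

Calling `layout("upper limb", (0.0, 0.0, 100.0, 50.0), CHILDREN, SYMBOLS)` and then repeating the call with a tall rectangle such as `(0.0, 0.0, 50.0, 100.0)` gives the same nesting but a different arrangement, which is the kind of layout variation the later paragraphs identify as a problem.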
  • a typical user of biological resource ontologies will require structural knowledge from some arbitrary portion of the reference ontology of biological structure for the purpose of visualization.
  • a physician specializing in endocrine medicine may only be interested in depicting the spatial relationship between hormone-producing organs in the body, known in the art as endocrine organs, but may not require details about blood vessel connections between the gastrointestinal tract and the liver.
  • a vascular surgeon whose interest according to the specializations within the art is mainly focused on the repair of damaged blood vessels, would be primarily interested in schematics that draw upon that portion of the formal knowledge within a larger anatomy reference ontology that describes the cardiovascular system in treemap form.
  • a third scenario may involve a research scientist in the domain of diabetes, who requires the overlay of genetic metadata onto a visual representation of the vascular route between the endocrine glands in the gastrointestinal tract, where for instance, pancreatic insulin is produced, and the sites of hormone action in the liver.
  • a vascular surgeon and a diabetes researcher each draws upon a different subset of nodes and relations from the same overall reference ontology of human anatomy. They each draw upon a different derivative subgraph from the same ontology of anatomical structure.
  • the present invention ensures that the relative layout of such derivative subgraphs is consistent across all three scenarios such that, for example and not by way of limitation, both the endocrine physician and the vascular surgeon may intuitively and seamlessly interpret the graphical depiction generated by the diabetes researcher.
  • the relative spatial position of tiles in a treemap relating to any biological structure may be organized in response to prevailing conditions of the medium.
  • the use of different parameters for treemap construction reflecting prevailing conditions, and requirements for the relative size and orientation of tiles results in the generation of different spatial layouts for the same ontology graph, as well as for sets of derivative subgraphs from the same reference ontology graph.
  • Variations in the layout of graphic images based on the same underlying combinations of information impair the benefit of the resulting images as visualization aids to biomedical learning, research and clinical practice. These inconsistencies typically result in user disorientation and frustration, particularly so at times when parameters are changed or maps updated.
  • a goal of the invention is to foster effective biomedical resource management through the consistent diagrammatic depiction of biological structures in order to visualise semantic resource metadata related to these structures.
  • Another goal of the invention is to provide a method by which images of diagrammatic depictions of biological structures are generated and in such a way as to reflect the consistency of those depictions.
  • a further goal of the invention is to provide a method for applying templates to biological structure reference ontologies and associated metadata so as to generate treemaps in which the relative position of tiles representing particular ontology terms are constrained by this application so as to remain consistent when layout conditions change.
  • a still further goal of the invention is to generate images of consistent diagrammatic depictions of biological structures.
  • the invention is a method for organizing biomedical resource information and generating images and text by depicting, describing and relating biological structures through the association of digital graph templates and reference ontologies, the computer programmes and databases for performing the method and the images and text so generated.
  • the method consists of applying constraints to ensure the consistent layout of treemaps for different derivative subgraphs extracted from the same reference ontology graph.
  • The constraints are applied through the use of graph templates associated with parent nodes in a graph-based relationship with subordinate child nodes so as to control the relative location, size and orientation of child tiles in subgraphs of the ontological reference graphs and to generate consistent displays of visual depictions of biological structures at all scales of size and complexity.
  • a result of the performance of the invention is the production of more meaningful and usable depictions of treemaps of complex biological structures and their associated biomedical resource semantic metadata.
  • Figure 1 is a chart depicting how the invention may be performed.
  • Figure 2 demonstrates a treegraph representing an ontology.
  • Figure 3 demonstrates the association between resources and reference ontologies performed through semantic metadata.
  • Figure 4 demonstrates a treemap representing an ontology where the treemap is of the ontological graph demonstrated in Figure 2.
  • Figure 5 demonstrates how the graph of an ontology may be expressed as a treemap and how that treemap may vary in its proportion according to different parameters.
  • Figure 6 explains the creation of different derivative subgraphs from the same reference ontology treegraph.
  • Figure 7 demonstrates how the invention uses templates to construct a treemap of an entire ontology and of a node set within that ontology.
  • Figure 8 demonstrates how the invention may be used to apply a template to order the depictions of treemaps of large-scale structures.
  • Figure 8 demonstrates how the invention would organize a treemap depiction of an ontology containing information relating to the human right thumb.
  • Figure 9 demonstrates how the invention may be used so that the relative position of tiles constrained by a template is maintained when any subset of nodes in that template is selected for display.
  • the invention utilizes a computer programme together with a database containing ontologies, layout templates and stylesheets.
  • the computer programme component of the invention is designed to import reference ontologies of biological structures as graphs and to carry out analysis of the graphs so as to depict the graph as a treemap of the ontological terms and the relationships between them.
  • the computer programme applies templates to constrain the layout of tiles that represent child nodes of the same parent nodes.
  • the database component of the invention contains ontologies of biological structures that have been modified and extended into one or more treegraphs and reorganized to improve the distribution of terms over a treemap representing biological structures across multiple scales.
  • the database also contains templates, in this case associated with parent nodes, for the layout of tiles representing child nodes of those parent nodes in a manner consistent with the method of the invention and stylesheets for the symbolic representation of ontology terms within their corresponding treemap tile.
  • The computer programme component of the invention utilizes modifications to the imported ontology files to make the resulting graph or graphs more readily translated to treemap form.
  • the programme uses specialized templates to organize biomedical information to constrain the layout of tiles and so to ensure consistent spatial relationships when any derivative subgraph is to be generated by the invention.
  • The programme draws upon a repository of stylesheets stored within the database to render the graphical representations of the treemapped ontological material and associated iconography more medically relevant and more easily understood and used as a visual or educational tool.
  • A template may be associated with any of the term nodes or relation edges in the reference ontology graph. For example, given a particular parent term node, a template linked to that node will explicitly describe the relative position of any subset of its child node tiles when they are automatically laid out in a treemap. While techniques to constrain treemaps for the layout of the same graph under different conditions are well known in the art, no known application applies constraints to ensure the consistent layout of treemaps for different derivative subgraphs from the same reference ontology graph of biological structure.
  • the invention has been developed to import reference ontologies of biological structure as a graph, and to carry out graph analysis in support of the treemap depiction of ontology terms and the relations between them.
  • a key feature of this tool is that it applies templates associated with a parent node to constrain the tiled layout of its children nodes within derivative subgraphs.
  • the domain of application of the invention is in the visual organization and recapitulation of semantic metadata in the context of the biological structures related to this metadata.
  • physicians may use the invention to generate a body anatomy treemap to navigate through, create and query metadata linked to electronic health records of a patient, as well as related medical training material relevant to the particular medical issues the physician is interested in.
  • gene expression metadata may be organized over a whole-body treemap to reflect the location of the tissues of origin of such data, as a first step to depict, for instance, endocrine pathway information.
  • the invention also has applicability to the organization and recapitulation of semantic metadata associated with other knowledge domains.
  • the computer programme component of the invention is installed on a computer or on a computer connected to a network of computers or operating remotely with access to the computer onto which the programme component of the invention and associated data are installed.
  • the architecture of the preferred programme component of the invention consists of three interacting components, namely:
  • a relational database bearing tables describing reference ontologies of biological structure in a manner that reflects their graph topology, templates associated with parent terms of biological structure ontologies and that establish the topological rules for the layout of their child terms, and stylesheet rules that instruct the graphical engine as to, among other things, how to display certain terms depending on the type of the relation with their parent terms.
  • a database typically in the form of a relational database or an RDF triplestore containing metadata that link to ontology terms.
  • This metadata store may be available to the user of the invention as a community webservice and is not necessarily part of the local installation of the programme framework.
  • An application programming interface that combines instructions from the user with relevant data imported from the relational database to lay out treemaps as a set of two-dimensional co-ordinates for each tile and then to make use of the metadata store to overlay relevant metadata symbolic links to the graphical rendering of the treemap.
  • The relational database is available through a MySQL server and the application programming interface is encoded in the Java programming language.
  • Step 1 The programme user selects from the database those terms from a reference ontology of biological structure for display as a two-dimensional treemap.
  • the set of terms chosen by the programme user for display is designated the Target Set.
  • Step 2 The programme user specifies the breadth and width parameters of the two-dimensional treemap in terms of the required co-ordinate system.
  • Step 3 The programme user selects, by way of parameters, the set of templates to be applied during the layout operation.
  • Step 4 The programme user selects, by way of parameters, the set of stylesheets to be applied during the rendering operation.
  • Step 5 The programme user chooses, by way of parameters, the manner of reference ontology conversion to a simple tree graph.
  • As ontologies are typically in the form of a directed acyclic graph (DAG), and not in a simple tree-like form, ontology graphs are first converted into a simple tree-like form in preparation for the depiction of their derivative subgraphs as a treemap.
  • the programme user is allowed to input parameters that regulate the manner in which the DAG-to-tree conversion occurs.
  • the parameterisation of the programme provides a method to prioritise the use of one type of relation over another during the conversion step.
  • the simple tree generated in this step is designated the Reference Tree Graph.
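One way to picture the parameterised DAG-to-tree conversion of Step 5 is to keep, for each term, only the parent reached through the highest-priority relation type. The specification does not disclose the conversion procedure, so the function `dag_to_tree`, the relation names `part_of` and `is_a`, and the example edges are assumptions for illustration only.

```python
def dag_to_tree(edges, priority):
    """Reduce a DAG to one parent per term.

    edges:    list of (child, relation, parent) tuples.
    priority: relation names, most preferred first; other relations are ignored.
    Returns a dict mapping each child to its single retained parent.
    """
    rank = {rel: i for i, rel in enumerate(priority)}
    best = {}  # child -> (relation, parent) of the best edge seen so far
    for child, rel, parent in edges:
        if rel not in rank:
            continue
        current = best.get(child)
        if current is None or rank[rel] < rank[current[0]]:
            best[child] = (rel, parent)
    return {child: parent for child, (rel, parent) in best.items()}


# Hypothetical edges: "left ventricle" has two parents via different relations.
EDGES = [
    ("left ventricle", "part_of", "heart"),
    ("left ventricle", "is_a", "cardiac chamber"),
    ("heart", "part_of", "body"),
]
```

With `priority=["part_of", "is_a"]` the left ventricle is placed under the heart; reversing the priority places it under "cardiac chamber" instead. The sketch does not guarantee a single root, which a full conversion to a Reference Tree Graph would additionally need to ensure.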
  • Step 6 A derivative subgraph is extracted from the Reference Tree Graph generated in Step 5.
  • The nodes of this derivative subgraph include those terms in the Target Set designated in Step 1.
  • The edge topology of the derivative subgraph is based on an edge-reduction process carried out on the Reference Tree Graph with the purpose of ensuring that the root node of the Reference Tree Graph is connected to all Target Set nodes via the smallest set of derived edges, such that nodes in the Reference Tree Graph that are not part of the Target Set are not included in the derivative graph.
  • the programme component while calculating the paths between the root node and the Target Set nodes, includes any template-associated parent nodes that are found on such paths in the relevant derivative subgraph.
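The extraction of a derivative subgraph described above can be sketched as follows, assuming the Reference Tree Graph is held as a child-to-parent mapping. The function name `derivative_subgraph` and the example terms are hypothetical, and the sketch assumes the root is always retained.

```python
def derivative_subgraph(parent_of, root, target_set, templated):
    """Return derived child -> parent edges linking the root to every Target Set node.

    parent_of:  child -> parent mapping of the Reference Tree Graph.
    templated:  set of parent nodes that have a layout template attached.
    """
    kept = set(target_set) | {root}
    # Retain template-associated parents that lie on a root-to-target path.
    for term in target_set:
        node = term
        while node != root:
            node = parent_of[node]
            if node in templated:
                kept.add(node)
    # Connect each retained node to its nearest retained ancestor (edge reduction).
    derived = {}
    for term in kept - {root}:
        node = parent_of[term]
        while node not in kept:
            node = parent_of[node]
        derived[term] = node
    return derived


# Hypothetical Reference Tree Graph fragment.
PARENT_OF = {
    "upper limb": "body",
    "hand": "upper limb",
    "thumb": "hand",
    "index finger": "hand",
    "lower limb": "body",
    "foot": "lower limb",
}
```

For a Target Set of "thumb" and "foot" with a template attached to "hand", the intermediate limb nodes are skipped, while "hand" is kept because it carries a template and lies on the path to "thumb".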
  • Step 7 The treemap layout algorithm is then applied to the layout of the derivative subgraph derived in Step 6.
  • the layout process proceeds in the root-to-leaf direction, such that an iteration of the treemap calculation on a specific set of child nodes acts upon the parameterised parent tile area coordinates passed on from the previous iteration.
  • When a template-associated parent node is met, the mesh layout calculation of the whole template is carried out such that the relative size of each node in the mesh is proportional to the total number of children each node has. Therefore, a mesh node that is not part of the Target Set and that has no children is allocated a size of 0.
  • If the template-associated parent node is not part of the Target Set, the resulting child tiles occupy the whole area of the parent tile and the parent tile itself is not part of the final treemap.
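A minimal sketch of the template mesh calculation is given below, modelling a template as rows of slots. The weighting rule (number of children, plus one if the node is itself in the Target Set) is an assumption chosen so that an absent, childless node receives size 0 as the text requires; the names `mesh_weights` and `layout_mesh` and the four-chamber heart template are hypothetical.

```python
def mesh_weights(template_rows, target_set, child_count):
    """Assumed sizing rule: children in the derivative subgraph, plus 1 if the node is a target."""
    return {
        term: child_count.get(term, 0) + (1 if term in target_set else 0)
        for row in template_rows
        for term in row
    }


def layout_mesh(template_rows, weights, rect):
    """Lay out template slots row by row inside rect = (x, y, w, h).

    Slots with weight 0 receive no tile; the remaining tiles keep the
    template's row order and left-to-right order.
    """
    x, y, w, h = rect
    row_totals = [sum(weights.get(t, 0) for t in row) for row in template_rows]
    total = sum(row_totals)
    tiles = {}
    if total == 0:
        return tiles
    cy = y
    for row, row_total in zip(template_rows, row_totals):
        row_h = h * row_total / total
        cx = x
        for term in row:
            wt = weights.get(term, 0)
            if wt > 0:
                tile_w = w * wt / row_total
                tiles[term] = (cx, cy, tile_w, row_h)
                cx += tile_w
        cy += row_h
    return tiles


# Hypothetical template for the parent term "heart".
HEART_TEMPLATE = [
    ["right atrium", "left atrium"],
    ["right ventricle", "left ventricle"],
]
```

Whether the Target Set contains all four chambers or only the left atrium and right ventricle, the left atrium tile never falls below the right ventricle tile, which is the consistency property the template is meant to enforce across derivative subgraphs.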
  • Step 8 The final set of Target Set tile co-ordinates generated by the iterations of Step 7 above is then passed to another part of the programme tool and used by that part to render the image of the treemap according to the stylesheets selected in Step 4 and to overlay any symbolic representations of metadata derived from the Metadata Store and associated with Target Set terms within the appropriate region of the corresponding tile.
  • the method of the invention allows programme users to add custom-made templates to the database by way of customizing the child tile layout for specific parent terms.
  • the API of the invention is embedded in a patient record management programme suite at a general-practitioner physician's clinic.
  • Anatomy-associated surgical and pathology reports are mapped by way of metadata to a reference ontology of anatomy such as that provided by the Foundational Model of Anatomy.
  • the general practitioner may use the invention to build a treemap indicating body sites that are associated with surgical and pathology reports.
  • The invention may be used to generate a Target Set of Foundational Model of Anatomy (FMA) terms from the patient-specific report metadata.
  • the clinic's programme suite would then also pass on the relevant parameters applicable to Steps 2 to 4 of the operation of the computer programme component of the invention, optionally including the use of a standardised set of default parameters previously chosen by the clinic or the physician, in order to generate the treemap resulting from Step 8 of the operation of the programme.
  • the general practitioner physician can click on a metadata symbol to retrieve and study the selected reports in more detail.
  • the invention as described can be used by a biomedical researcher to overview the anatomical and cellular location of expressed human genes that are imputed to be involved in some disease mechanism.
  • the gene set under such circumstances is typically the result of a genome-wide association study (GWAS).
  • The treemap showing the location of GWAS-imputed gene expression may then be used to show additional metadata links to biomedical literature abstracts such as those that may be derived from publicly accessible community internet servers like PubMed through the website at http://www.ncbi.nlm.nih.gov/pubmed/.
  • These literature abstracts discuss the role of a particular gene expressed in some body part and the effect of that gene on the function of that body part. For instance, given that the gene termed the Junctional adhesion molecule-1 is known to be expressed in the brainstem body part, the brainstem tile generated by application of the invention could bear the PubMed link to the article by H.
  • Figure 1 depicts a chart describing how the method of the invention may be performed.
  • the computer programme component 61 of the invention processes that information and modifies it to conform to the parameters 62 the invention requires.
  • the modified information is stored in the database 63 component of the invention.
  • the programme component 61 applies templates 64 to the information so as to accurately reflect the spatial relationships between items of information within the database 63 defined as terms 65.
  • The programme component 61 generates treemap depictions 66 of the information at the levels required by a user. Metadata linkages 67 are established between terms 65 through user-installed annotations 68 that are added to the stored information within the database 63, displayed within the treemap depictions 66 relevant to such annotations, and accessible at the time of future use of the invention.
  • Figure 2 demonstrates a treegraph representing an ontology 101 (shown in Figure 3).
  • the invention makes use of ontologies comprised of information about a discrete subject. This information is organized according to terms 102 defining various aspects of the subject of the ontology 101.
  • a relationship between two terms in a rooted treegraph, where one term is a parent term 104 and the other is a subordinate child term 105, is referred to here as an edge 103.
  • Metadata annotations 106 associated with a particular ontology term may be linked to the ontology term at nodes of reference 104-105.
  • Figure 3 demonstrates the association between resources and reference ontologies performed through semantic metadata (Figure reproduced from de Bono, B., et al. "The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions," (2011) 4(313) BioMed Central Research Notes, Figure 1C).
  • An ontology 101 provides explicit machine readable information for the annotation of semantic metadata 106.
  • a biological structure ontology 101 may be used to illustrate, among other things, the anatomical hierarchy of parts of the heart.
  • a term 105 used as part of this ontology might refer to such a part, such as the interventricular septum 107 or the mitral valve 108.
  • the invention makes use of explicit representations of formal knowledge as well-defined terms 102 and edges 103 between such terms 102 to compare resource metadata 106 associated with terms from the same ontology 101 precisely and automatically.
  • Figure 4 demonstrates a treemap representing an ontology where the treemap is of the ontological graph demonstrated in Figure 2. This treemap demonstrates how the invention is used to organize or nest tiles of terms 102 in parent 104 and child
  • Figure 5 demonstrates how the graph of an ontology may be expressed as a treemap and how that treemap may vary in its proportion and layout according to different prevailing conditions that are parameterised during the automated execution of treemap construction.
  • the ontology graph 109 may be expressed as treemap 110 or treemap 111 or treemap 112.
  • the internal layout of treemap 110 is different from that of treemap 111 because different aspect ratios were applied to the same ontological graph, resulting in different proportions and arrangements of the term tiles 102 within that ontology.
  • the internal layout of treemap 112 differs from that of treemap 110 because the area parameter of the term tile 113 is greater in treemap 112 than it is in treemap 110.
  • Figure 6 explains the process by which different derivative subgraphs may be extracted from the same reference ontology graph.
  • the intervening edges of the derivative graphs maintain the same connective topology between the nodes as in the original reference graph.
  • because the layout of the two derivative graphs is not constrained, the layout of the three nodes shared by the two derivative subgraphs, nodes 1, 26 and 31, is not consistent across the two treemaps.
  • Figure 7 demonstrates how the invention uses templates 140 and 141 to construct a treemap 142 of an entire ontology 143 and of a node 25 within that ontology.
  • Portions of Figure 7 numbered 1-56 represent tiles nested according to their relationship to each other which in turn represent the terms comprising the ontology 143.
  • the template 140 refers to the arrangement of tiles within the overall treemap of the ontology 143.
  • the child nodes to parent node 1 are organized by their corresponding number designations, those being 2, 35, 29, 31, 6, 32, 34, 30, 35, in a matrix of three rows of three entries each, depicting the spatial relationships between the nine parent nodes.
  • the number for child tile 2 is shown to the left of child tile 35, which is to the left of child tile 29, these three tiles forming the top row of the template 140.
  • a second row wherein are arranged the numbers designating the placement of child tiles 31, 6 and 32 in order from left to right.
  • tile 2 is itself a parent tile with its own set of internal, subordinate children tiles, and is shown to the left of parent tile 35 and above parent tile 31 in the resulting treemap of the ontology arranged according to the template 140.
  • the placement of the tiles within treemap graph 142 is constrained according to the template 140 so as to represent the ontology 143.
  • the template 141 refers to the arrangement of the child tiles within the parent tile numbered 325. Template 141 orders the tiles of terms numbered 40, 37, 36, 38, 41, 39 in two rows of three terms each such that tile 40 is to the left of tile 37 and directly above tile 38. Tile 37 is to the left of tile 36 and directly above tile 41. Tile 36 is directly above tile 39.
  • the application of template 141 to the child nodes of parent tile 325 of the ontology 143 according to the invention results in the spatial arrangement of the six child nodes as shown in the treemap graph 144. Treemap 142 is topologically identical to treemap 144.
  • Figure 8 demonstrates how the invention may be used to apply a template 145 to order the depiction of treemaps of large-scale structures, such as the human body.
  • the ontological maptree graph 146 of such a structure would be complex.
  • the second highest order of parent tiles may be grouped into five first order parent tiles, designated by a different graphical symbol, with varying numbers of second order tiles within each first order grouping.
  • the invention organizes such complexity according to designated parameters to produce a template 145 comprised of four rows of six tiles and showing all twenty-four second order parent tiles.
  • the first order grouping relating to the vascular system 147 contains four tiles, designated by a triangular symbol.
  • This restructuring and apportioning of an anatomy ontology is done so as to generate images of tiles that can be laid out so as to be readily visualized by a user.
  • Within each of the second order tiles shown here would be numerous lower level tiles containing information about biological structures making up the higher level structures of which they are a part.
  • the invention anticipates that this grouping of lower level substructures within higher level structures can be extended to the molecular level. In this way, each portion of a biological structure, such as the human body, may be reduced through its component parts to the level of the molecules making up each part.
  • Figure 9 demonstrates robustness of the template-constrained layout achieved by this invention. Specifically, Figure 9 demonstrates how the invention may be used so that the relative position of tiles constrained by a template is maintained for any subset of nodes in that template which are selected for display.
  • Figures 9a and 9b illustrate two treemaps that have been generated from two derivative subgraphs. The subgraph in Figure 9b has three fewer tiles than the subgraph in Figure 9a. The three missing tiles in Figure 9b are each marked with an X in Figure 9a.
  • the application of the invention results in tiles within a template being arranged in the same relative positions so long as two or more tiles within the template are selected for display.
  • Figure 9 shows how the invention may be used to apply two templates to constrain the relative layout of two sets of child tiles.
  • the first set of child tiles constrained by a template is comprised of child nodes to the parent node representing a human body (the twenty four nodes constrained by this template are outlined by a coarse dashed line).
  • the second set of child tiles constrained by a template are child nodes to the parent node representing the human stomach (the six nodes constrained by this template are outlined by a fine dashed line).
  • of the three nodes removed from the derivative subgraph shown in Figure 9a to create the derivative subgraph in Figure 9b, the two labelled "Vascular Caudal" and "Vascular Cephalic" are constrained by one template, while the third, "Body of Stomach", is constrained by another template.
  • a comparison of the two treemaps in Figure 9 demonstrates how the invention utilizes the information associated with two templates to maintain the same overall spatial layout of the tiles involved in the two templates. Arrows are used in the diagrams to indicate how, in this particular example, tiles neighbouring those that were removed from Figure 9b have been expanded to take up the unused space left when tiles are removed. Tiles not constrained by templates are subject to inconsistent layout, exemplified by the two tiles labelled 1 and 2 in both Figures 9a and 9b.
  • the tiles are labelled with the anatomical names given to the parts of the human body. These labels are used in the figure for demonstration purposes and the words expressed as labels do not form part of the description of the invention.
  • Tiles representing information about lower-order anatomical divisions of the body are nested within the tiles relating to the higher order divisions of the body.
  • the relationships between parts are maintained both in terms of their anatomical relationships to other parts and in terms of their spatial, visual depiction so as to aid the visualization of these relationships by a user of the invention and the images it generates.
  • through metadata linkages, a user of the invention can navigate through lower and higher level representations of information about biological structures.
  • the use of templates by the invention to constrain how these depictions are displayed achieves an advantageous visual experience for the user by limiting the inconsistency within the information displayed at any one time about a biological structure at a particular level of detail, and in this way ensures consistency in how the relationships between the parts of a structure at the same level and at different levels are represented.
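The template behaviour described for Figures 7 and 9 can be illustrated with a minimal Python sketch. It is not the patented algorithm: the function name `layout_from_template`, the equal-share sizing rule and the canvas dimensions are assumptions made for illustration. Only the two-row arrangement of tiles 40, 37, 36, 38, 41 and 39 is taken from the description of template 141.

```python
from typing import Dict, List, Set, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def layout_from_template(template: List[List[int]], selected: Set[int],
                         width: float, height: float) -> Dict[int, Rect]:
    """Place the selected tiles while preserving the template's relative order.

    Hypothetical helper: rows keep their top-to-bottom order and tiles keep
    their left-to-right order within a row; remaining tiles grow to fill gaps.
    """
    # Keep only selected tiles, preserving the template's row and column order.
    rows = [[term for term in row if term in selected] for row in template]
    rows = [row for row in rows if row]
    tiles: Dict[int, Rect] = {}
    if not rows:
        return tiles
    row_height = height / len(rows)
    for r, row in enumerate(rows):
        tile_width = width / len(row)  # neighbours expand into unused space
        for c, term in enumerate(row):
            tiles[term] = (c * tile_width, r * row_height, tile_width, row_height)
    return tiles


# Template 141 as described: two rows of three child tiles.
template_141 = [[40, 37, 36], [38, 41, 39]]

full = layout_from_template(template_141, {40, 37, 36, 38, 41, 39}, 300.0, 100.0)
reduced = layout_from_template(template_141, {40, 36, 38, 41, 39}, 300.0, 100.0)
```

In this sketch, removing tile 37 widens tiles 40 and 36, yet tile 40 remains to the left of tile 36 and above tile 38 in both layouts.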

Abstract

The invention is a method for organizing biomedical resource information and generating images and text by depicting, describing and relating biological structures through the association of digital graph templates and reference ontologies, the computer programmes and databases for performing the method and the images and text so generated. The method consists of applying constraints to ensure the consistent layout of treemaps for different derivative subgraphs extracted from the same reference ontology graph. The constraints are applied through the use of graph templates associated with nodes in a graph-based relationship with other nodes so as to control the relative location, size and orientation of tiles in subgraphs of the ontological reference graphs and to generate consistent displays of visual depictions of biological structures at all scales of size and complexity. A result of the performance of the invention is the production of more meaningful and usable depictions of treemaps of complex biological structures and their associated biomedical resource semantic metadata.

Description

TITLE:
A Method for organizing Data and generating Images of Reference Biological Structures and Related Materials and the Images and Materials so generated
INVENTOR: B. de Bono
APPLICANT: B. de Bono
DEFINITION OF TERMS
Aspect ratio - the ratio of the longer dimension to the shorter dimension of a two dimensional image
Biological structure - the physical structure of the whole or part of an organism, including viruses, such as the structure of a human body, an internal organ of an animal, a single cell of a plant, a molecular complex or a single molecule.
Biomedical community - Researchers in genetics, biology, biochemistry, biophysics, bio-engineering, ecology and medicine, as well as practitioners in the pharmaceutical, biotech and clinical domain
Biomedical resource - A resource that is typified by an electronic dataset, whether text or image based or by a computational model, and bearing some biomedical significance
Biomedical resource component - A part of a biomedical resource, for example a data column in a clinical trial spreadsheet or database table, a biological variable in the code of a physiology model, a specific spatial region in a radiology image, or a disease label in a patient's health record
Derivative subgraph - a graph consisting of a subset of nodes from a reference graph, constructed in such a way that the intervening edges of the derivative graph maintain the same connective topology between the nodes as in the original reference graph
Metadata - Machine readable documentation material that is linked to a corresponding resource or component of a resource indicating how the actual content of that component resource should be interpreted
MySQL - A relational database management system that runs as a server providing multi-user access to a number of databases simultaneously
Ontology - A set of information about the nature and existence of a subject, typically consisting of terms and their relations. In the pharmaceutical domain, for instance, one ontology could characterize a taxonomic classification of drugs - in this case, every term would represent a drug or a family of drugs. In this way, the parent term "Analgesic drug family" would have a child term "Paracetamol" and the two terms may be linked by way of a specialization relation
Ontology of a biological structure - A set of facts about a biological structure, typically consisting of terms describing parts of that structure and their relations. For example, an anatomy ontology for humans may describe the way in which organs and tissue regions make up the body. In this example, the term "hand" would be a child node to the parent term "upper limb", and the term "thumb" would be a child to the parent term "hand", each linked to the other by way of a partonomy relation
Ontology graph - A graphic representation of an ontology in which graph nodes represent ontology terms and graph edges represent relations between terms
Parent-child tile nesting - In a treemap representation of a graph, a tree-like graphic structure, nesting consists of placing a tile representing a child node entirely within the bounds of the tile representing its parent node
Parent-child relation - The analogy of parent-child relations is applicable to the invention when a tree graph is assigned a root node. In the case where two nodes in such a graph are linked by the same edge, one of the two nodes, the parent node, is necessarily closer to the root than the other node, the child node. More formally, a smaller number of edges has to be traversed to reach the root from the parent node, compared to the path taken from the child node, and the path from the child to the root must necessarily traverse the parent node
Relational database - a database that matches data by using common characteristics found within the data set
Resource description framework (RDF) - a family of World Wide Web Consortium specifications designed as a metadata model, currently used as a general guideline for the conceptual description and modeling of information that is implemented in web resources using a variety of syntax formats.
Semantic metadata - Metadata, as defined here, that links the component of a resource to an ontology term
Tile - A spatial graphical artefact, typically in two or three dimensions, which, when depicted within a treemap manifestation of an ontology graph, represents an ontology term
Tile map - A type of tree map
Tree - In graph theory, a tree is an undirected graph in which any two vertices or nodes are connected by one simple path of edges and being without cycles
Tree graph - A graphic representation of discrete facts and values of a subject that utilizes connecting and branching lines analogous to the structure of a tree to demonstrate ordinal relationships between those facts and values. A graphic representation of a tree-structured graph as nodes connected by linear edges
Treemap - A tree graph organized in a manner similar to a flowchart to show a progression of operations. Treemaps display hierarchical, tree-structured graphs as a set of nested rectangles. Each node of the tree is represented by a rectangle, which rectangle is then tiled internally with smaller rectangles representing child nodes
Triplestore - a purpose-built database for the storage and retrieval of RDF metadata
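The definition of a derivative subgraph given above can be illustrated with a short Python sketch. It assumes, as one possible reading of "the same connective topology", that each retained node is attached to its nearest retained ancestor in the reference tree. The function name `derivative_subgraph` and the small anatomy fragment are hypothetical; only the terms "hand", "upper limb" and "thumb" come from the definitions.

```python
from typing import Dict, Optional, Set


def derivative_subgraph(parent_of: Dict[str, Optional[str]],
                        keep: Set[str]) -> Dict[str, Optional[str]]:
    """Return a child -> parent map restricted to the nodes in `keep`.

    Each kept node is linked to its nearest kept ancestor, so the
    ancestor/descendant ordering of the reference tree is preserved.
    """
    derived: Dict[str, Optional[str]] = {}
    for node in keep:
        ancestor = parent_of[node]
        # Walk towards the root, skipping ancestors that were not selected.
        while ancestor is not None and ancestor not in keep:
            ancestor = parent_of[ancestor]
        derived[node] = ancestor
    return derived


# Hypothetical fragment of an anatomy partonomy (child -> parent).
reference = {
    "body": None,
    "upper limb": "body",
    "hand": "upper limb",
    "thumb": "hand",
    "abdomen": "body",
    "stomach": "abdomen",
}

subgraph = derivative_subgraph(reference, {"body", "thumb", "stomach"})
```

Here "thumb" and "stomach" both become direct children of "body", because their intermediate ancestors were not selected.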
INTRODUCTION AND BACKGROUND OF THE INVENTION
The present invention relates to the general field of knowledge engineering and the specific field of ontology visualisation.
The visual depiction of biological structures requires that the complex arrangement of their parts is laid out within the constraints of scale imposed by the medium in which they are drawn. Automating the consistent visual depiction of biological structures across multiple scales is technically difficult to achieve, such that biologists have had to rely on manual methods to create complex visual representations. For instance, the generation of small-scale depictions of complex structures, such as skeletal parts and internal organs of living organisms, is constrained by the limitation imposed by the size of the media used for those depictions. This means that descriptions of larger scale features of structures and the relationships between structures and parts of structures must be omitted or may only be achieved through cumbersome manual systems of annotation and linking between separate illustrations. Prior to the computerization of formal reference knowledge about biological structures, the art encompassed only the use of manual linking between smaller and larger scale depictions.
It is desirable that formal knowledge about biological structures, referred to in the art as reference ontologies, be related to depictions of those structures so that the resulting visual composites of reference illustrations and related resource information may serve as visual aids to education, research and clinical practice by the biomedical community. Where a reference ontology is used as a blueprint in the automated generation of a biological structure depiction in one medium, then that same depiction may be used to automatically display, in the same medium, information about resource material related to that ontology. The considerable growth in the amount of formal knowledge about biological structures, as contained in corresponding reference ontologies, requires apposite methodologies to automatically manage the layout and ensure the consistency in the depictions of such biological structures.
Practice, research and training in biomedicine generate considerable amounts of electronic data and computational models, which are known in the art as biomedical resources. A significant proportion of such resources carry information that cross-references biological structures. Examples of resources include a surgical report detailing a surgical procedure involving the stomach, a microscopy image of a liver cell, and a biochemical model of a glucose metabolic pathway.
The automated management of biomedical resources, such as the actions of indexing, classification, comparison and searching of data and models carried out by computers operating through appropriate programming, is the subject of extensive efforts by members of the biomedical community. In particular, effective management of resources requires effective visualisation tools that display information about these resources in a meaningful way. An effective approach to achieving visualisation is to display resource information in the context of diagrams of biological structures related to these resources. For instance, the schematic outline of the human body, where the body is the biological structure in issue, is routinely used by surgeons to report on a surgical procedure on some part of the body. Such a report is an example of a biomedical resource. A second example pertains to the diagrammatic depiction of cell compartments as a backdrop to, or more general, higher-order context of, biochemical pathways that involve these cell compartments. In this case, the pathway information is the biomedical resource, and the cell is the biological structure.
The present invention is concerned with the automated generation of schematic depictions of biological structures to provide a diagrammatic context to biomedical resources that are related to such structures. Specifically, the use of the invention provides biomedical resource information in a diagrammatic context by overlaying graphical representations of such information onto a schematic diagram of the biological structure with which it is associated.
According to the invention, the automated generation of schematic depictions relies on reference ontologies of biological structures as a source of machine-processable structural knowledge. This knowledge is used to direct the automated composition and layout of schematic diagrams of biological structures.
In the art, reference ontologies of biological structure may be related to biomedical resources by explicit cross-referencing. Such cross-referencing generates linkages that, in the art, are referred to as semantic metadata. The overlay of biomedical resource information onto the related biological structure schematic is, therefore, dependent on semantic metadata. This type of metadata is utilized by the invention to associate components from a resource to terms in the depicted reference ontology.
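As a rough illustration of how semantic metadata might link resource components to ontology terms, the following Python sketch stores each link as a small record and groups the records by term, which is the lookup needed to overlay resource information onto a tile. The record layout, the identifiers and the function name `index_by_term` are assumptions for illustration only; the description does not prescribe a particular data structure.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class SemanticAnnotation(NamedTuple):
    resource: str   # hypothetical identifier of a biomedical resource
    component: str  # the part of the resource being annotated
    term: str       # the ontology term the component refers to


# Illustrative annotations; the identifiers are invented for this sketch.
annotations = [
    SemanticAnnotation("surgical_report_1", "procedure site", "stomach"),
    SemanticAnnotation("microscopy_image_7", "region A", "liver cell"),
    SemanticAnnotation("health_record_3", "disease label", "stomach"),
]


def index_by_term(items: List[SemanticAnnotation]) -> Dict[str, List[SemanticAnnotation]]:
    """Group annotations by ontology term so each tile can list its resources."""
    by_term: Dict[str, List[SemanticAnnotation]] = defaultdict(list)
    for item in items:
        by_term[item.term].append(item)
    return dict(by_term)


overlay = index_by_term(annotations)
```

A tile for the term "stomach" could then display links to the two resources held in `overlay["stomach"]`.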
To manage the increase in biomedical resources, and to display information about such resources effectively, an effort is ongoing within the public domain to adopt common standards for the classification of such information using biomedical ontologies. A key feature of this effort involves the adoption of common standards for the terminology used to express concepts within individual ontologies. An example of such a community effort is that carried out by the Open Biological and Biomedical Ontologies Foundry (OBO), www.obofoundry.org. An aim of this collective standardization has been to facilitate the automation of the utilization of formal knowledge about biological structures in terms of classification, indexing, searching, retrieving and manipulation of that information in machine-readable, digital storage files. See Smith, B., et al. "The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration," (2007) 25(11) Nature Biotechnology, at 1251-1255, and de Bono, B., et al. "The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions," (2011) 4(313) BioMed Central Research Notes. Furthermore, in the field of biomedicine, reference ontologies in general and biological structure ontologies specifically may be used as educational resources in addition to their use as tools of research and clinical practice.
This effort in the field of biomedical ontologies has resulted in standardized, machine-readable semantic metadata that link components of biomedical resources with corresponding terms in associated ontologies so as to convey explicit meaning to those resources. Examples of components within biomedical resources include data columns used in clinical models, specific spatial regions in radiological images, and disease labels in patient health records.
While computerization has generally facilitated the utilization of information on many subjects, such as through the ease of reproducing digital files and the speed with which those files may be searched and disseminated, it has not provided a feasible solution to the spatial problem of ensuring consistent depictions of biological structures as well as associated resources.
Semantic metadata makes possible the automated graphical depiction of relationships between ontological terms and biomedical resources. This graphical depiction allows users of such systems to readily find resource material relating to a particular biological structure, as well as to determine the proximate relationship between ontological terms deemed pertinent to a line of inquiry and other terms. In this application of the art, ontology terms and the relationships between them are depicted by ontology treemaps with each term represented by a single "tile." Multiple ontology graphs may be arranged in treemaps to correctly display information about multiple biological structures so as to provide a beneficial and clearly visualized composite image. One such example of multiple ontology graphs arranged as treemaps involves an ontology of cardiac structure, and a second ontology of white blood cell structure, such that the two ontologies may be combined to depict a nested treemap illustrating the substructures of a white cell in the lumen of the left ventricle of the heart. The visualisation of formal knowledge, as terms in the context of their relations within an ontology together with associated semantic metadata, is an area of active research within the biomedical community. The use of treemaps to portray ontology terms and their relations to each other as tiled maps is now a well-established approach to laying out complex ontologies in a visually meaningful manner, especially if such ontologies have, or are converted to, a tree-like graph structure.
The treemap approach allows users to navigate through complex ontology graphs and to visualise metadata annotated to their terms. In a treemap corresponding to an ontology graph, one tile represents a single ontology term. The positioning of a tile within another tile, known in the art as "nesting," is typically applied to represent parent-child relationships between nodes. For instance, if the anatomical term "hand" and the term "upper limb" are related in an ontology by the fact that the former is a part of the latter, the nesting of the "hand" tile within the "upper limb" tile is adopted to reflect this relation. In addition, semantic metadata associated with an ontology term may be graphically depicted within the bounds of the treemap tile representing that term. For instance, if biomedical data describing palmar injuries is associated with the term "hand" by semantic metadata, then a symbolic link to this data resource may be graphically located within the "hand" tile of the above treemap.
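Nesting of this kind can be sketched in a few lines of Python. The example uses a simple slice-and-dice scheme, which is a generic treemap technique chosen here for brevity and is not stated in the description to be the method used. The function names `nest` and `contains`, the padding value and the "index finger" term are assumptions added for illustration.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def nest(tree: Dict[str, List[str]], node: str, rect: Rect,
         out: Dict[str, Rect], depth: int = 0, pad: float = 1.0) -> Dict[str, Rect]:
    """Assign a rectangle to `node` and nest its children inside it."""
    out[node] = rect
    children = tree.get(node, [])
    if not children:
        return out
    x, y, w, h = rect
    # Shrink the usable area so child tiles sit visibly inside the parent tile.
    x, y, w, h = x + pad, y + pad, w - 2 * pad, h - 2 * pad
    n = len(children)
    for i, child in enumerate(children):
        if depth % 2 == 0:  # even depth: split the width between children
            child_rect = (x + i * w / n, y, w / n, h)
        else:               # odd depth: split the height between children
            child_rect = (x, y + i * h / n, w, h / n)
        nest(tree, child, child_rect, out, depth + 1, pad)
    return out


def contains(outer: Rect, inner: Rect) -> bool:
    """True when `inner` lies entirely within the bounds of `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


# Partonomy fragment; "index finger" is an illustrative addition.
tree = {
    "body": ["upper limb"],
    "upper limb": ["hand"],
    "hand": ["thumb", "index finger"],
}
tiles = nest(tree, "body", (0.0, 0.0, 100.0, 60.0), {})
```

With these inputs, `contains(tiles["upper limb"], tiles["hand"])` holds, mirroring the nesting of the "hand" tile within the "upper limb" tile.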
In the art the nesting of tiles in a treemap will consistently reflect the topology of the reference ontology graph from which the treemap is drawn. However, for a given graph the calculation of the relative position of the tiles in the treemap may vary in response to key conditions such as the aspect ratio of the image to be generated and the area requirements associated with each tile node, taking into account that the size of a particular tile may need to be enlarged with respect to other tiles depending on the amount of metadata symbols it is required to contain.
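The sensitivity of an unconstrained layout to aspect ratio and tile area can be shown with a deliberately simple Python sketch. The rule used here, splitting the canvas along its longer side in proportion to each tile's area weight, is an assumption chosen to make the effect visible and is not taken from the description. The function name `split_longer_side` and the tile labels are likewise illustrative.

```python
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def split_longer_side(weights: Dict[str, float],
                      width: float, height: float) -> Dict[str, Rect]:
    """Tile a canvas along its longer side, proportionally to area weights."""
    total = sum(weights.values())
    tiles: Dict[str, Rect] = {}
    offset = 0.0
    for name, weight in weights.items():
        share = weight / total
        if width >= height:  # landscape canvas: tiles sit side by side
            tiles[name] = (offset, 0.0, width * share, height)
            offset += width * share
        else:                # portrait canvas: tiles are stacked vertically
            tiles[name] = (0.0, offset, width, height * share)
            offset += height * share
    return tiles


equal = {"left kidney": 1.0, "right kidney": 1.0}
landscape = split_longer_side(equal, 200.0, 100.0)
portrait = split_longer_side(equal, 100.0, 200.0)

# A larger area requirement for one tile changes the proportions again.
weighted = split_longer_side({"left kidney": 3.0, "right kidney": 1.0}, 200.0, 100.0)
```

The same two tiles are side by side on the landscape canvas and stacked on the portrait canvas, which is the kind of inconsistency that a template constraint is intended to prevent.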
The automated generation of different layout solutions for the same graph due to such altered conditions leads to inconsistency in the relative position and orientation of tiles and severely impairs the visualisation benefit of the diagrams so generated. This inconsistency results in user disorientation and frustration due to the lack of spatial consistency, particularly during real-time interaction with the treemap when conditions are being changed and the map automatically updated. If treemaps are to be usefully applied to the real-time visualization of biological structure reference ontologies and associated semantic metadata, the relative position of tiles representing particular ontology terms needs to be constrained so as to remain consistent when layout conditions change.
Achieving consistent automated layouts of biological structure ontologies is rendered more challenging when the size of the reference ontologies is taken into account. In the biomedical art, ontological resources relating to particular biological structures describe large numbers of terms and relationships between structures and parts of structures. A typical biological structure ontology contains between 500 and 50,000 of such terms. A comprehensive visual depiction of a complete biological structure ontology would be difficult to comprehend even if the medium used for the display could encompass the entire depiction together with the corresponding resource material.
In practice, a typical user of biological resource ontologies will require structural knowledge from some arbitrary portion of the reference ontology of biological structure for the purpose of visualization. For example, a physician specializing in endocrine medicine may only be interested in depicting the spatial relationship between hormone-producing organs in the body, known in the art as endocrine organs, but may not require details about blood vessel connections between the gastrointestinal tract and the liver. Conversely, a vascular surgeon, whose interest according to the specializations within the art is mainly focused on the repair of damaged blood vessels, would be primarily interested in schematics that draw upon that portion of the formal knowledge within a larger anatomy reference ontology that describes the cardiovascular system in treemap form. A third scenario may involve a research scientist in the domain of diabetes, who requires the overlay of genetic metadata onto a visual representation of the vascular route between the endocrine glands in the gastrointestinal tract, where for instance, pancreatic insulin is produced, and the sites of hormone action in the liver. In these three scenarios involving an endocrine physician, a vascular surgeon and a diabetes researcher, each draws upon a different subset of nodes and relations from the same overall reference ontology of human anatomy. They each draw upon a different derivative subgraph from the same ontology of anatomical structure. The present invention ensures that the relative layout of such derivative subgraphs is consistent across all three scenarios such that, for example and not by way of limitation, both the endocrine physician and the vascular surgeon may intuitively and seamlessly interpret the graphical depiction generated by the diabetes researcher.
According to the art, the relative spatial position of tiles in a treemap relating to any biological structure may be organized in response to prevailing conditions of the medium. The use of different parameters for treemap construction reflecting prevailing conditions, and requirements for the relative size and orientation of tiles, results in the generation of different spatial layouts for the same ontology graph, as well as for sets of derivative subgraphs from the same reference ontology graph. Variations in the layout of graphic images based on the same underlying combinations of information impair the benefit of the resulting images as visualization aids to biomedical learning, research and clinical practice. These inconsistencies typically result in user disorientation and frustration, particularly so at times when parameters are changed or maps updated. In order to optimize the use of treemaps for the visualization of biological structure ontologies and associated semantic metadata, it would be beneficial to constrain the relative positions of tiles representing particular ontology terms so that they remain consistent when layout conditions change. As the graphs of biological structure ontologies are large and complex, the graphical representations of these ontologies are routinely subdivided into derivative subgraphs extracted from the same reference source for the creation of treemap layouts. If the relative position of tiles in treemaps from different derivative subgraphs is to remain spatially consistent, it is necessary that this relative position be constrained so as to most effectively reflect the nature of the biological structures they represent. For instance, in an anatomy treemap, it may be desirable for the layout of the tiles representing the right and left kidney terms to consistently reflect the same laterality so that the tile for the left kidney is always displayed to the left of the right kidney tile.
Achieving this goal would make it possible to produce more meaningful and more usable treemaps of biological structure. Without such a solution, the navigation of complex biological treemaps and associated biomedical resource semantic metadata is considerably less manageable.
A goal of the invention is to foster effective biomedical resource management through the consistent diagrammatic depiction of biological structures in order to visualise semantic resource metadata related to these structures. Another goal of the invention is to provide a method by which images of diagrammatic depictions of biological structures are generated and in such a way as to reflect the consistency of those depictions. A further goal of the invention is to provide a method for applying templates to biological structure reference ontologies and associated metadata so as to generate treemaps in which the relative position of tiles representing particular ontology terms are constrained by this application so as to remain consistent when layout conditions change. A still further goal of the invention is to generate images of consistent diagrammatic depictions of biological structures.
SUMMARY OF THE INVENTION
The invention is a method for organizing biomedical resource information and generating images and text by depicting, describing and relating biological structures through the association of digital graph templates and reference ontologies, the computer programmes and databases for performing the method and the images and text so generated. The method consists of applying constraints to ensure the consistent layout of treemaps for different derivative subgraphs extracted from the same reference ontology graph. The constraints are applied through the use of graph templates associated with parent nodes in a graph-based relationship with subordinate child nodes so as to control the relative location, size and orientation of child tiles in subgraphs of the ontological reference graphs and to generate consistent displays of visual depictions of biological structures at all scales of size and complexity. A result of the performance of the invention is the production of more meaningful and usable depictions of treemaps of complex biological structures and their associated biomedical resource semantic metadata.
LIST OF FIGURES
Figure 1 is a chart depicting how the invention may be performed.
Figure 2 demonstrates a treegraph representing an ontology.
Figure 3 demonstrates the association between resources and reference ontologies performed through semantic metadata.
Figure 4 demonstrates a treemap representing an ontology where the treemap is of the ontological graph demonstrated in Figure 2.
Figure 5 demonstrates how the graph of an ontology may be expressed as a treemap and how that treemap may vary in its proportion according to different parameters.
Figure 6 explains the creation of different derivative subgraphs from the same reference ontology treegraph.
Figure 7 demonstrates how the invention uses templates to construct a treemap of an entire ontology and of a node set within that ontology.
Figure 8 demonstrates how the invention may be used to apply a template to order the depictions of treemaps of large-scale structures.
Figure 8 demonstrates how the invention would organize a treemap depiction of an ontology containing information relating to the human right thumb.
Figure 9 demonstrates how the invention may be used so that the relative position of tiles constrained by a template is maintained when any subset of nodes in that template is selected for display.
DETAILED DESCRIPTION OF THE INVENTION The invention utilizes a computer programme together with a database containing ontologies, layout templates and stylesheets. The computer programme component of the invention is designed to import reference ontologies of biological structures as graphs and to carry out analysis of the graphs so as to depict the graph as a treemap of the ontological terms and the relationships between them. The computer programme applies templates to constrain the layout of tiles that represent child nodes of the same parent nodes.
The database component of the invention contains ontologies of biological structures that have been modified and extended into one or more treegraphs and reorganized to improve the distribution of terms over a treemap representing biological structures across multiple scales. The database also contains templates, in this case associated with parent nodes, for the layout of tiles representing child nodes of those parent nodes in a manner consistent with the method of the invention and stylesheets for the symbolic representation of ontology terms within their corresponding treemap tile.
The computer programme component of the invention utilizes modifications to the imported ontology files to make the resulting graph or graphs more readily translated to treemap form. The programme uses specialized templates to organize biomedical information to constrain the layout of tiles and so to ensure consistent spatial relationships when any derivative subgraph is to be generated by the invention. The programme draws upon a repository of stylesheets stored within the database to render the graphical representations of the treemapped ontological material and associated iconography more medically relevant and more easily understood and used as a visual or educational tool.
The original contribution of the invention is a computational method, implemented in a programme tool that makes use of graph-based templates to preserve and constrain the layout of derivative subgraphs originating from the same reference ontology graph. A template may be associated with any of the term nodes or relation edges in the reference ontology graph. For example, given a particular parent term node, a template linked to that node will explicitly describe the relative position of any subset of its child node tiles when they are automatically laid out in a treemap. While techniques to constrain treemaps for the layout of the same graph under different conditions are well known in the art, no known application applies constraints to ensure the consistent layout of treemaps for different derivative subgraphs from the same reference ontology graph of biological structure.
The invention has been developed to import reference ontologies of biological structure as a graph, and to carry out graph analysis in support of the treemap depiction of ontology terms and the relations between them. A key feature of this tool is that it applies templates associated with a parent node to constrain the tiled layout of its children nodes within derivative subgraphs.
The domain of application of the invention is in the visual organization and recapitulation of semantic metadata in the context of the biological structures related to this metadata. In an example of an application of the invention, and not by way of limitation, physicians may use the invention to generate a body anatomy treemap to navigate through, create and query metadata linked to electronic health records of a patient, as well as related medical training material relevant to the particular medical issues the physician is interested in. In another example of an application of the invention in a pharmaceutical scenario, and not by way of limitation, gene expression metadata may be organized over a whole-body treemap to reflect the location of the tissues of origin of such data, as a first step to depict, for instance, endocrine pathway information. The invention also has applicability to the organization and recapitulation of semantic metadata associated with other knowledge domains.
The invention will now be described by way of preferred embodiments given as examples of configurations of the invention and as means of its implementation and not by way of limitation. The computer programme component of the invention is installed on a computer or on a computer connected to a network of computers or operating remotely with access to the computer onto which the programme component of the invention and associated data are installed. The architecture of the preferred programme component of the invention consists of three interacting components, namely:
A relational database bearing tables describing reference ontologies of biological structure in a manner that reflects their graph topology, templates associated with parent terms of biological structure ontologies and that establish the topological rules for the layout of their child terms, and stylesheet rules that instruct the graphical engine as to, among other things, how to display certain terms depending on the type of the relation with their parent terms.
A database typically in the form of a relational database or an RDF triplestore containing metadata that link to ontology terms. This metadata store may be available to the user of the invention as a community webservice and is not necessarily part of the local installation of the programme framework.
An application programming interface that combines instructions from the user with relevant data imported from the relational database to lay out treemaps as a set of two-dimensional co-ordinates for each tile and then to make use of the metadata store to overlay relevant metadata symbolic links onto the graphical rendering of the treemap.
In its currently implemented form the relational database is available through a MySQL server and the application programming interface is encoded in the Java programming language.
The method of the invention will now be described through another preferred embodiment relating to the operation of the computer programme component of the invention and not by way of limitation. In the first step of the method of the invention, ontologies about biological structure in electronic form and available in the public domain and accessible through open biomedical community websites such as http://www.bioontology.org and http://www.obofoundry.org are downloaded typically in Web Ontology Language (OWL) or Open Biological Ontologies (OBO) format and processed by the computer programme component of the invention to conform to the database schema in the relational database. The following sequence of steps describes the typical operation of the computer programme component:
Step 1. The programme user selects from the database those terms from a reference ontology of biological structure for display as a two-dimensional treemap. The set of terms chosen by the programme user for display is designated the Target Set.
Step 2. The programme user specifies the breadth and width parameters of the two-dimensional treemap in terms of the required co-ordinate system.
Step 3. The programme user selects, by way of parameters, the set of templates to be applied during the layout operation.
Step 4. The programme user selects, by way of parameters, the set of stylesheets to be applied during the rendering operation.
Step 5. The programme user chooses, by way of parameters, the manner of reference ontology conversion to a simple tree graph. As ontologies are typically in the form of a directed acyclic graph (DAG), and not in the form of a simple tree-like form, ontology graphs are first converted into a simple tree-like form in preparation for the depiction of their derivative subgraphs as a treemap. At this step in the operation of the programme, the programme user is allowed to input parameters that regulate the manner in which the DAG-to-tree conversion occurs. For instance, if a DAG consists of three different types of edge relations, such as regional_part, constitutional_part, and subclass edge relations, then the parameterisation of the programme provides a method to prioritise the use of one type of relation over another during the conversion step. The simple tree generated in this step is designated the Reference Tree Graph.
Step 6. A derivative subgraph is extracted from the Reference Tree Graph generated in Step 5. The nodes of this derivative subgraph include those terms in the Target Set designated in Step 1. The edge topology of the derivative subgraph is based on an edge-reduction process carried out on the Reference Tree Graph with the purpose of ensuring that the root node of the Reference Tree Graph is connected to all Target Set nodes via the smallest set of derived edges such that nodes in the Reference Tree Graph that are not part of the Target Set are not included in the derivative graph. The programme component, while calculating the paths between the root node and the Target Set nodes, includes any template-associated parent nodes that are found on such paths in the relevant derivative subgraph.
Step 7. The treemap layout algorithm is then applied to the layout of the derivative subgraph derived in Step 6. The layout process proceeds in the root-to-leaf direction, such that an iteration of the treemap calculation on a specific set of child nodes acts upon the parameterised parent tile area coordinates passed on from the previous iteration. When a template-associated parent node is met, the mesh layout calculation of the whole template is carried out such that the relative size of each node in the mesh is proportional to the total number of children each node has. Therefore, a mesh node that is not part of the Target Set and that has no children is allocated a size of 0. Similarly, if the template-associated parent node is not part of the Target Set the resulting child tiles occupy the whole area of the parent tile and the parent tile itself is not part of the final treemap.
Step 8. The final set of Target Set tile co-ordinates generated by the iterations of Step 7 above is then passed to another part of the programme tool and used by that part to render the image of the treemap according to the stylesheets selected in Step 4 and to overlay any symbolic representations of metadata derived from the Metadata Store and associated with Target Set terms within the appropriate region of the corresponding tile.
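By way of illustration only, Steps 5 and 6 above may be sketched as follows. The sketch is written in Python for brevity and is not the programme described in this specification, which is stated to be encoded in Java. The function names, the triple representation of edges, the alphabetical tie-break between equally ranked parents and the example anatomical terms are all assumptions made for the purpose of the example.

```python
def dag_to_tree(edges, priority):
    """Step 5 sketch: keep one parent per child.

    edges: iterable of (parent, child, relation) triples from the ontology DAG.
    priority: list of relation names, most preferred first.
    Ties between equally ranked parents are broken alphabetically (an assumption).
    Returns a dict mapping each child to its single retained parent.
    """
    rank = {rel: i for i, rel in enumerate(priority)}
    best = {}  # child -> (rank of relation, parent)
    for parent, child, rel in edges:
        candidate = (rank.get(rel, len(rank)), parent)
        if child not in best or candidate < best[child]:
            best[child] = candidate
    return {child: parent for child, (_, parent) in best.items()}


def derivative_subgraph(parent_of, root, target_set, template_parents=frozenset()):
    """Step 6 sketch: return the child -> parent edges of the derivative subgraph.

    Retained nodes are the root, the Target Set, and any template-associated
    parent lying on a root-to-target path. Every retained node is linked to its
    nearest retained ancestor, so omitted nodes are bridged by a derived edge.
    """
    keep = set(target_set) | {root}
    for node in target_set:
        cur = parent_of.get(node)
        while cur is not None:
            if cur in template_parents:
                keep.add(cur)
            cur = parent_of.get(cur)
    derived = {}
    for node in keep - {root}:
        cur = parent_of.get(node)
        while cur is not None and cur not in keep:
            cur = parent_of.get(cur)
        if cur is not None:
            derived[node] = cur
    return derived


# Hypothetical example: "stomach" has two parents in the DAG.
edges = [
    ("body", "abdomen", "regional_part"),
    ("body", "digestive organ", "subclass"),
    ("abdomen", "stomach", "regional_part"),
    ("digestive organ", "stomach", "subclass"),
    ("stomach", "body of stomach", "regional_part"),
    ("stomach", "pylorus", "regional_part"),
]
tree = dag_to_tree(edges, ["regional_part", "constitutional_part", "subclass"])
plain = derivative_subgraph(tree, "body", {"pylorus"})
with_template = derivative_subgraph(tree, "body", {"pylorus"}, {"stomach"})
```

In this example the first call links the single Target Set node directly to the root, while the second call also retains the template-associated parent found on the path between them.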
In addition to these steps and in operations supplemental thereto, the method of the invention allows programme users to add custom-made templates to the database by way of customizing the child tile layout for specific parent terms.
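The template-constrained layout of Step 7 may be illustrated by the following Python sketch, which is a simplified stand-in and not the mesh calculation actually used by the programme. It assumes that a template is stored as rows of node identifiers, that row heights are shared in proportion to the total weight of each row, and that widths within a row are shared in proportion to each node's weight. The function name and the weighting scheme are assumptions; a node given a weight of 0 receives no tile and its neighbours take up the space.

```python
def layout_template(rows, weight, rect):
    """Lay out a template grid inside rect = (x, y, width, height).

    rows: list of rows, each a list of node ids in template order.
    weight: dict mapping node id to a non-negative size weight; missing means 0.
    Returns {node: (x, y, w, h)} for nodes with a positive weight only.
    """
    x0, y0, width, height = rect
    row_totals = [sum(weight.get(n, 0) for n in row) for row in rows]
    total = sum(row_totals)
    tiles = {}
    if total == 0:
        return tiles
    y = y0
    for row, row_total in zip(rows, row_totals):
        if row_total == 0:
            continue  # an empty row collapses and the rows below move up
        h = height * row_total / total
        x = x0
        for node in row:
            w_node = weight.get(node, 0)
            if w_node == 0:
                continue  # size 0: the tile is omitted and neighbours expand
            w = width * w_node / row_total
            tiles[node] = (x, y, w, h)
            x += w
        y += h
    return tiles


# Hypothetical template of two rows of three child nodes.
template = [["a", "b", "c"], ["d", "e", "f"]]
full = layout_template(template, {n: 1 for n in "abcdef"}, (0, 0, 6, 4))
# The same template with "b" and "f" absent from the Target Set.
part = layout_template(template, {n: 1 for n in "acde"}, (0, 0, 6, 4))
```

Because the order of rows and the order of nodes within a row are taken from the template and never from the data, "a" remains to the left of "c" and above "d" in both layouts.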
In another preferred embodiment of the invention and not by way of limitation, the API of the invention is embedded in a patient record management programme suite at a general-practitioner physician's clinic. Anatomy-associated surgical and pathology reports are mapped by way of metadata to a reference ontology of anatomy such as that provided by the Foundational Model of Anatomy. In order to obtain an intuitive overview of a patient's past medical history the general practitioner may use the invention to build a treemap indicating body sites that are associated with surgical and pathology reports. By using a patient's unique medical record identifier to call for information concerning a particular patient from the database component of the invention or from a database to which the computer programme component of the invention has access, the invention may be used to generate a Target Set of Foundational Model of Anatomy terms from the patient-specific report metadata. The clinic's programme suite would then also pass on the relevant parameters applicable to Steps 2 to 4 of the operation of the computer programme component of the invention, optionally including the use of a standardised set of default parameters previously chosen by the clinic or the physician, in order to generate the treemap resulting from Step 8 of the operation of the programme. Given that the metadata symbols overlaid onto the treemap bear metadata-encoded links to the original reports, the general practitioner physician can click on a metadata symbol to retrieve and study the selected reports in more detail.
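The derivation of a Target Set from patient-specific report metadata might, purely as an illustration, be sketched in Python as follows. The record layout, the key names, the patient identifiers, the term labels and the report locations are all hypothetical; the specification does not define the schema of the metadata store.

```python
def target_set_for_patient(report_metadata, patient_id):
    """Collect the ontology terms annotated on one patient's reports.

    report_metadata: iterable of dicts with the hypothetical keys
    'patient_id', 'term_id' and 'report_url'.
    Returns (target_set, links), where links maps each term to the reports
    that a metadata symbol on the corresponding tile would point to.
    """
    links = {}
    for record in report_metadata:
        if record["patient_id"] != patient_id:
            continue
        links.setdefault(record["term_id"], []).append(record["report_url"])
    return set(links), links


# Hypothetical metadata records for two patients.
records = [
    {"patient_id": "P1", "term_id": "stomach", "report_url": "reports/001"},
    {"patient_id": "P1", "term_id": "left kidney", "report_url": "reports/002"},
    {"patient_id": "P2", "term_id": "stomach", "report_url": "reports/003"},
    {"patient_id": "P1", "term_id": "stomach", "report_url": "reports/004"},
]
target_set, links = target_set_for_patient(records, "P1")
```

The resulting set of terms would then be supplied as the Target Set of Step 1, and the links retained for the overlay of metadata symbols in Step 8.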
In another preferred embodiment of the invention and not by way of limitation, the invention as described can be used by a biomedical researcher to overview the anatomical and cellular location of expressed human genes that are imputed to be involved in some disease mechanism. The gene set under such circumstances is typically the result of a genome-wide association study (GWAS). Given an arbitrary set of human genes, a list of those organs, tissues and cells that are known to express such genes may be obtained using biomedical community databases and services in such a way that the list is mapped to reference ontology terms and in so doing a Target Set of biological structures is generated. The treemap showing the location of GWAS-imputed gene expression may then be used to show additional metadata links to biomedical literature abstracts such as those that may be derived from publicly accessible community internet servers like PubMed through the website at http://www.ncbi.nlm.nih.gov/pubmed/. These literature abstracts discuss the role of a particular gene expressed in some body part and the effect of that gene on the function of that body part. For instance, given that the gene termed the Junctional adhesion molecule-1 is known to be expressed in the brainstem body part, the brainstem tile generated by application of the invention could bear the PubMed link to the article by H. Waki, et al, "Junctional adhesion molecule-1 is upregulated in spontaneously hypertensive rats: evidence for a prohypertensive role within the brain stem," [2007] Hypertension, June, 49(6): 1321-7. PMID: 17420334.
The invention will now be described in relation to the drawings. Figure 1 depicts a chart describing how the method of the invention may be performed. In the first step of the method, ontologies available in the public domain 60 about biological structure in electronic form are made available to the invention. The computer programme component 61 of the invention processes that information and modifies it to conform to the parameters 62 the invention requires. The modified information is stored in the database 63 component of the invention. The programme component 61 applies templates 64 to the information so as to accurately reflect the spatial relationships between items of information within the database 63 defined as terms 65. The programme component 61 generates treemap depictions 66 of the information at the levels required by a user. Metadata linkages 67 are established between terms 65 through user-installed annotations 68 that are added to the stored information within the database 63, displayed within the treemap depictions 66 relevant to such annotations, and accessible at the time of future use of the invention.
Figure 2 demonstrates a treegraph representing an ontology 101 (shown in Figure 3). The invention makes use of ontologies comprised of information about a discrete subject. This information is organized according to terms 102 defining various aspects of the subject of the ontology 101. A relationship between two terms in a rooted treegraph, where one term is a parent term 104 and the other is a subordinate child term 105 is referred to here as an edge 103. Metadata annotations 106 associated with a particular ontology term may be linked to the ontology term at nodes of reference 104-105.
Figure 3 demonstrates the association between resources and reference ontologies performed through semantic metadata (Figure reproduced from de Bono, B., et al "The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions," (2011) 4(313) BioMed Central Research Notes, Figure 1C). An ontology 101 provides explicit machine readable information for the annotation of semantic metadata 106. A biological structure ontology 101 may be used to illustrate, among other things, the anatomical hierarchy of parts of the heart. A term 105 used as part of this ontology might refer to such a part, such as the interventricular septum 107 or the mitral valve 108. The invention makes use of explicit representations of formal knowledge as well-defined terms 102 and edges 103 between such terms 102 to compare resource metadata 106 associated with terms from the same ontology 101 precisely and automatically.
Figure 4 demonstrates a treemap representing an ontology where the treemap is of the ontological graph demonstrated in Figure 2. This treemap demonstrates how the invention is used to organize or nest tiles of terms 102 in parent 104 and child 105 relationships within the ontology 101, such that the nesting of tiles is analogous to the edges 103 between parent terms 104 and child terms 105 depicted in Figure 2. Symbolic representations of semantic metadata annotations 106 associated with a particular ontology 101 term are placed within the area of the corresponding tile.
Figure 5 demonstrates how the graph of an ontology may be expressed as a treemap and how that treemap may vary in its proportion and layout according to different prevailing conditions that are parameterised during the automated execution of treemap construction. The ontology graph 109 may be expressed as treemap 110 or treemap 111 or treemap 112. The internal layout of treemap 110 is different from that of treemap 111 because different aspect ratios were applied to the same ontological graph resulting in different proportions and arrangements of the term tiles 102 within that ontology. The internal layout of treemap 112 differs from that of treemap 110 because the area parameter of the term tile 113 is greater in treemap 112 than it is in treemap 110.
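The sensitivity of an unconstrained layout to such parameters can be illustrated with a short Python sketch. The slice-and-dice scheme used here is a well-known treemap technique chosen only for its simplicity; the specification does not state which treemap algorithm produced Figure 5, and the function names and the two-tile example are assumptions.

```python
def subtree_size(tree, sizes, node):
    """Area weight of a node: its own size if a leaf, else the sum of its children."""
    children = tree.get(node, [])
    if not children:
        return sizes.get(node, 1)  # leaf sizes are assumed to be positive
    return sum(subtree_size(tree, sizes, c) for c in children)


def slice_and_dice(tree, sizes, node, rect, tiles=None):
    """Recursively tile rect = (x, y, w, h) with the descendants of node.

    tree: dict mapping a node to the list of its children.
    Children are placed along the longer side of the parent rectangle, so the
    arrangement depends on the aspect ratio and not on any template.
    """
    if tiles is None:
        tiles = {}
    tiles[node] = rect
    children = tree.get(node, [])
    if not children:
        return tiles
    x, y, w, h = rect
    weights = [subtree_size(tree, sizes, c) for c in children]
    total = sum(weights)
    offset = 0.0
    for child, wt in zip(children, weights):
        share = wt / total
        if w >= h:  # wide parent: children side by side
            child_rect = (x + offset * w, y, share * w, h)
        else:       # tall parent: children stacked top to bottom
            child_rect = (x, y + offset * h, w, share * h)
        slice_and_dice(tree, sizes, child, child_rect, tiles)
        offset += share
    return tiles


tree = {"root": ["a", "b"]}
wide = slice_and_dice(tree, {"a": 1, "b": 1}, "root", (0, 0, 4, 2))
tall = slice_and_dice(tree, {"a": 1, "b": 1}, "root", (0, 0, 2, 4))
bigger_b = slice_and_dice(tree, {"a": 1, "b": 3}, "root", (0, 0, 4, 2))
```

With the wide rectangle tile "b" lies to the right of tile "a", whereas with the tall rectangle it lies below it; increasing the area weight of "b" changes the proportions again. It is this kind of variation that the templates of the invention are intended to remove.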
Figure 6 explains the process by which different derivative subgraphs may be extracted from the same reference ontology graph. In the lower half of the diagram, the intervening edges of the derivative graphs maintain the same connective topology between the nodes as in the original reference graph. As the layout of the two derivative graphs is not constrained, the layout of the three nodes shared by the two derivative subgraphs, nodes 1, 26 and 31, is not consistent across the two treemaps.
Figure 7 demonstrates how the invention uses templates 140 and 141 to construct a treemap 142 of an entire ontology 143 and of a node 25 within that ontology. Portions of Figure 7 numbered 1-56 represent tiles nested according to their relationship to each other which in turn represent the terms comprising the ontology 143. The template 140 refers to the arrangement of tiles within the overall treemap of the ontology 143. In the template 140, the child nodes to parent node 1 are organized by their corresponding number designations, those being 2, 35, 29, 31, 6, 32, 34, 30, 33, in a matrix of three rows of three entries each, depicting the spatial relationships between the nine parent nodes. In template 140, the number for child tile 2 is shown to the left of child tile 35, which is to the left of child tile 29, these three tiles forming the top row of the template 140. Below that top row is a second row wherein are arranged the numbers designating the placement of child tiles 31, 6 and 32 in order from left to right. Below this second row is a third row designating the placement of child tiles 34, 30 and 33. According to this arrangement, tile 2 is itself a parent tile with its own set of internal, subordinate children tiles, and is shown to the left of parent tile 35 and above parent tile 31 in the resulting treemap of the ontology arranged according to the template 140. The placement of the tiles within treemap graph 142 is constrained according to the template 140 so as to represent the ontology 143.
The template 141 refers to the arrangement of the child tiles within the parent tile numbered 25. Template 141 orders the tiles of terms numbered 40, 37, 36, 38, 41, 39 in two rows of three terms each such that tile 40 is to the left of tile 37 and directly above tile 38. Tile 37 is to the left of tile 36 and directly above tile 41. Tile 36 is directly above tile 39. The application of template 141 to the child nodes of parent tile 25 of the ontology 143 according to the invention results in the spatial arrangement of the six child nodes as shown in the treemap 144. Treemap 142 is topologically identical to treemap 144.
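The two templates described for Figure 7 can be written down as simple grids, from which the stated spatial relationships follow mechanically. The Python sketch below is illustrative only; the grid representation and the function name are assumptions, and the tile numbers are those given in the description above, with the third row of template 140 taken as 34, 30 and 33.

```python
# Template 140: three rows of three child tiles of parent node 1.
TEMPLATE_140 = [
    [2, 35, 29],
    [31, 6, 32],
    [34, 30, 33],
]

# Template 141: two rows of three child tiles.
TEMPLATE_141 = [
    [40, 37, 36],
    [38, 41, 39],
]


def adjacency(rows):
    """Return the ('left_of', a, b) and ('above', a, b) facts implied by a grid.

    'left_of' relates horizontally adjacent tiles in the same row;
    'above' relates vertically adjacent tiles in the same column.
    """
    facts = set()
    for r, row in enumerate(rows):
        for c, node in enumerate(row):
            if c + 1 < len(row):
                facts.add(("left_of", node, row[c + 1]))
            if r + 1 < len(rows) and c < len(rows[r + 1]):
                facts.add(("above", node, rows[r + 1][c]))
    return facts


facts_140 = adjacency(TEMPLATE_140)
facts_141 = adjacency(TEMPLATE_141)
```

Storing a template in this way keeps the relative positions independent of tile sizes, so that the same facts hold however the parent tile is scaled.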
Figure 8 demonstrates how the invention may be used to apply a template 145 to order the depiction of treemaps of large-scale structures, such as the human body. The ontological tree graph 146 of such a structure would be complex. In this graph, the second highest order of parent tiles may be grouped into five first order parent tiles, designated by a different graphical symbol, with varying numbers of second order tiles within each first order grouping. The invention organizes such complexity according to designated parameters to produce a template 145 comprised of four rows of six tiles and showing all twenty-four second order parent tiles. In this example, the first order grouping relating to the vascular system 147 contains four tiles, designated by a triangular symbol. This restructuring and apportioning of an anatomy ontology is done so as to generate images of tiles that can be laid out so as to be readily visualized by a user. Within each of the second order tiles shown here would be numerous lower level tiles containing information about biological structures making up the higher level structures of which they are a part. The invention anticipates that this grouping of lower level substructures within higher level structures can be extended to the molecular level. In this way, each portion of a biological structure, such as the human body, may be reduced through its component parts to the level of the molecules making up each part.
Figure 9 demonstrates the robustness of the template-constrained layout achieved by this invention. Specifically, Figure 9 demonstrates how the invention may be used so that the relative position of tiles constrained by a template is maintained for any subset of nodes in that template which are selected for display. Figures 9a and 9b illustrate two treemaps that have been generated from two derivative subgraphs. The subgraph in Figure 9b has three fewer tiles than the subgraph in Figure 9a. The three missing tiles in Figure 9b are each marked with an X in Figure 9a. The application of the invention results in tiles within a template being arranged in the same relative positions so long as two or more tiles within the template are selected for display. Figure 9 shows how the invention may be used to apply two templates to constrain the relative layout of two sets of child tiles. The first set of child tiles constrained by a template is comprised of child nodes to the parent node representing a human body (the twenty-four nodes constrained by this template are outlined by a coarse dashed line). The second set of child tiles constrained by a template is comprised of child nodes to the parent node representing the human stomach (the six nodes constrained by this template are outlined by a fine dashed line). Of the three nodes removed from the derivative subgraph shown in Figure 9a to create the derivative subgraph in Figure 9b, those labelled "Vascular Caudal" and "Vascular Cephalic" are constrained by one template, while the "Body of Stomach" is constrained by another template. A comparison of the two treemaps in Figure 9 demonstrates how the invention utilizes the information associated with two templates to maintain the same overall spatial layout of the tiles involved in the two templates.
Arrows are used in the diagrams to indicate how, in this particular example, tiles neighbouring those that were removed from Figure 9b have been expanded to take up the unused space left when tiles are removed. Tiles not constrained by templates are subject to inconsistent layout, exemplified by the two tiles labelled 1 and 2 in both Figures 9a and 9b.
In the treemaps shown in this figure, the tiles are labelled with the anatomical names given to the parts of the human body. These labels are used in the figure for demonstration purposes and the words expressed as labels do not form part of the description of the invention. Tiles representing information about lower-order anatomical divisions of the body are nested within the tiles relating to the higher order divisions of the body. The relationships between parts are maintained both in terms of their anatomical relationships to other parts and in terms of their spatial, visual depiction so as to aid the visualization of these relationships by a user of the invention and the images it generates. Through metadata linkages, a user of the invention can navigate through lower and higher level representations of information about biological structures. The application of templates by the invention to constrain how these depictions are displayed achieves an advantageous visual experience for the user through limiting the inconsistency within the information displayed at any one time about a biological structure at a particular level of detail and in this way ensures consistency in how the relationships between the parts of a structure at the same level and at different levels are represented.

Claims

CLAIMS: And for my invention I claim:
1. A method for organizing biomedical resource information and generating images and other media artifacts thereof and text related thereto by depicting and describing and relating biological structures and the biological processes in which they participate through the association of digital graph templates and reference ontologies and computer programmes and databases containing biomedical resource information achieved by applying constraints thereon through the application of graph templates associated with nodes in a graph-based relationship with other nodes so as to control the relative location and size and orientation of related tiles in derivative sub-graphs of the ontological reference graphs and thereby to generate consistent visual depictions of biological structures and the biological processes in which they participate at all scales of size and complexity and to ensure the consistent layout of treemaps for different derivative sub-graphs extracted from the same reference ontology graph.
2. Biomedical resource information organized according to the method of claim 1.
3. Images and other media artifacts of biomedical resource information produced according to the method of claim 1.
4. Text relating to biomedical resource information generated according to the method of claim 1.
5. Biomedical resource information and images and other media artifacts of biomedical resource information organized or produced according to the method of claim 1.
6. Images and media artifacts of biomedical resource information and text relating to biomedical resource information produced or generated according to the method of claim 1.
7. Biomedical resource information and text relating to biomedical resource information organized or generated according to the method of claim 1.
8. Biomedical resource information and images and media artifacts of biomedical resource information and text relating to biomedical resource information organized or produced or generated according to the method of claim 1.
PCT/GB2012/000813 2011-10-26 2012-10-25 A method for organizing data and generating images of reference biological structures and related materials and the images and materials so generated WO2013061019A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB1118530.3A GB201118530D0 (en) 2011-10-26 2011-10-26 A method for organizing data and generating images of reference biological structures and related materials and the images and materials so generated
GB1118530.3 2011-10-26

Publications (1)

Publication Number Publication Date
WO2013061019A1 (en) 2013-05-02

Family

ID=45373478

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2012/000813 WO2013061019A1 (en) 2011-10-26 2012-10-25 A method for organizing data and generating images of reference biological structures and related materials and the images and materials so generated

Country Status (2)

Country Link
GB (1) GB201118530D0 (en)
WO (1) WO2013061019A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070185904A1 (en) * 2003-09-10 2007-08-09 International Business Machines Corporation Graphics image generation and data analysis

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
BAEHRECKE ERIC H ET AL: "Visualization and analysis of microarray and gene ontology data with treemaps", BMC BIOINFORMATICS, BIOMED CENTRAL, LONDON, GB, vol. 5, no. 1, 28 June 2004 (2004-06-28), pages 84, XP021000670, ISSN: 1471-2105, DOI: 10.1186/1471-2105-5-84 *
DE BONO BERNARD ET AL: "ApiNATOMY: A Novel Toolkit for Visualizing Multiscale Anatomy Schematics with Phenotype-Related Information", HUMAN MUTATION, JOHN WILEY & SONS, INC, US, vol. 33, no. 5, 1 May 2012 (2012-05-01), pages 837 - 848, XP008155421, ISSN: 1059-7794, DOI: 10.1002/HUMU.22065 *
DE BONO, B. ET AL.: "The RICORDO approach to semantic interoperability for biomedical data and models: strategy, standards and solutions", BIOMED CENTRAL RESEARCH NOTES, vol. 4, no. 313, 2011
H. WAKI ET AL.: "Junctional adhesion molecule-1 is upregulated in spontaneously hypertensive rats: evidence for a prohypertensive role within the brain stem", HYPERTENSION, vol. 49, no. 6, June 2007 (2007-06-01), pages 1321 - 7
IKEHATA Y ET AL: "Hierarchical data visualization using a fast rectangle-packing algorithm", IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, IEEE SERVICE CENTER, LOS ALAMITOS, CA, US, vol. 10, no. 3, 1 May 2004 (2004-05-01), pages 302 - 313, XP011108751, ISSN: 1077-2626, DOI: 10.1109/TVCG.2004.1272729 *
SMITH, B. ET AL.: "The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration", NATURE BIOTECHNOLOGY, vol. 25, no. 11, 2007, pages 1251 - 1255

Also Published As

Publication number Publication date
GB201118530D0 (en) 2011-12-07

Similar Documents

Publication Publication Date Title
Balhoff et al. A semantic model for species description applied to the ensign wasps (Hymenoptera: Evaniidae) of New Caledonia
Hu et al. Ontology-based clinical pathways with semantic rules
Catalano et al. Semantics and 3D media: Current issues and perspectives
Smith et al. Biomedical imaging ontologies: A survey and proposal for future work
Blondé et al. Reasoning with bio-ontologies: using relational closure rules to enable practical querying
Gambino et al. A framework for data-driven adaptive GUI generation based on DICOM
Kerren et al. Network visualization for integrative bioinformatics
Bernasconi et al. From a conceptual model to a knowledge graph for genomic datasets
Rohn et al. Creating views on integrated multidomain data
Costa et al. A scientific software product line for the bioinformatics domain
Zdravković et al. A case of using the Semantic Interoperability Framework for custom orthopedic implants manufacturing
Geniesse et al. NeuMapper: A scalable computational framework for multiscale exploration of the brain’s dynamical organization
Dinkla et al. Comparison of multiple weighted hierarchies: visual analytics for microbe community profiling
Gotz et al. Multifaceted visual analytics for healthcare applications
Kokash et al. Knowledge representation for multi-scale physiology route modeling
Allegri et al. CompositeView: A network-based visualization tool
de Bono et al. ApiNATOMY: Towards multiscale views of human anatomy
Streit et al. Navigation and exploration of interconnected pathways
Erson et al. Design of a framework for modeling, integration and simulation of physiological models
WO2013061019A1 (en) A method for organizing data and generating images of reference biological structures and related materials and the images and materials so generated
Cvjetković et al. The ontology supported intelligent system for experiment search in the scientific research center
Bucur et al. Clinical decision support framework for validation of multiscale models and personalization of treatment in oncology
Travillian et al. An ontology-based comparative anatomy information system
Brasch et al. VANLO-interactive visual exploration of aligned biological networks
Barillot et al. Federating distributed and heterogeneous information sources in neuroimaging: the NeuroBase project

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12808857

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12808857

Country of ref document: EP

Kind code of ref document: A1