WO2003088088A1 - System and method for semantics driven data processing - Google Patents
- Publication number
- WO2003088088A1 (PCT/US2003/011025)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- metadata
- metalife
- recited
- repository
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/20—Heterogeneous data integration
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/30—Data warehousing; Computing architectures
Definitions
- the present invention relates in general to the field of computer technology, and more particularly, to collecting, categorizing, integrating and analyzing any amount of heterogeneous metadata, both from internally generated sources and externally acquired sources, especially as it relates to life science data.
- the benefit of the present invention is its ability to enable the humans and machines involved to understand and exchange the metadata using the same 'Lingua Franca' (universal language) and to cross-fertilize with all business platforms and technologies, regardless of the type of data, as long as the data source is computational or stored as bytes of information.
- One form of the present invention is a metadata conduit driven software for integrating and analyzing life sciences data from one or more data sources comprising a modeler, a metadata repository, a virtual data access/integration engine, a portal and adapters for disparate data sources, wherein an integration server consumes the metadata stored in the repository to direct queries to data sources, aggregates data and provides functional views of this data to information consumers.
- Another form of the present invention is the ability to embed components of the metadata into the instrumentation (hardware) involved in research/drug development (e.g., High Throughput Screening ("HTS"), Mass Spectrometry and other diagnostics instruments for drug discovery) and enable exchange of the output data using XML.
- This capability can be further enhanced by developing alert mechanisms to inform persons involved in drug development of results of interest in near real-time or real-time, potentially speeding up the discovery process.
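The alert mechanism mentioned above could be sketched as a threshold check applied to each instrument result as it arrives. This is only an illustration: the `activity` field, the threshold value and the `notify` callback are assumptions, not details given in the source.

```python
def make_alerter(threshold, notify):
    """Return a handler that notifies when a result looks interesting.

    `threshold` and the "activity" key are hypothetical; a real system
    would take its rules from the metadata repository.
    """
    def on_result(result):
        if result["activity"] >= threshold:
            notify(result)          # e.g. send an e-mail or push a message
            return True
        return False
    return on_result


# Usage: collect alerts in a list instead of sending real notifications.
sent = []
alert = make_alerter(threshold=0.9, notify=sent.append)
for reading in [
    {"compound": "C-001", "activity": 0.95},
    {"compound": "C-002", "activity": 0.40},
]:
    alert(reading)
```

Only the first reading crosses the assumed threshold, so only it is forwarded.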
- the present invention may also be used for providing subscription based web services to one or more businesses and/or companies that require data integration.
- An example would be a Patent Filing Web Service that automates the process of preparing and filing patents.
- businesses/companies may work independently, accessing only specific data sources as needed, or may be combined to allow access to several independent data sources, including each other's data sources.
- FIGURE 1 is a block diagram of a system in accordance with one embodiment of the present invention.
- FIGURE 2 is a block diagram of a system in accordance with another embodiment of the present invention.
- FIGURE 3 is a flow chart of a method in accordance with one embodiment of the present invention.
- FIGURE 4 is a block diagram of a system in accordance with another embodiment of the present invention.
- FIGURE 5 is a flow chart of a method in accordance with another embodiment of the present invention.
- FIGURE 6 is a screen shot of a MetaLife Modeler in accordance with one embodiment of the present invention.
- FIGURE 7 is a block diagram of a MetaLife Integration Server in accordance with one embodiment of the present invention.
- FIGURE 8 is a block diagram of a system in accordance with another embodiment of the present invention.
- FIGURE 9 is a diagram illustrating the uses of the MetaLife Modeler in accordance with one embodiment of the present invention.
- FIGURE 10 is a MetaModel for a BioAssay in accordance with one embodiment of the present invention.
- FIGURE 11 is a MetaModel for an ArrayDesign in accordance with another embodiment of the present invention.
- FIGURE 12 is a block diagram of a data flow in accordance with one embodiment of the present invention.
- FIGURE 13 is a block diagram of a system in accordance with another embodiment of the present invention.
- FIGURE 14 is a block diagram of a MetaLife Integration Server in accordance with another embodiment of the present invention.
- FIGURE 15 is a block diagram of a data flow in accordance with another embodiment of the present invention.
- FIGURE 16 is a block diagram of a system in accordance with another embodiment of the present invention.
- FIGURE 17 is a block diagram of a system in accordance with another embodiment of the present invention.
- FIGURE 18 is a block diagram of a system in accordance with another embodiment of the present invention.
- the system of the present invention represents a revolutionary advance for the most critical portion of a business — the data that drives it.
- In the life sciences industry, to investigate a single drug candidate, a researcher and other persons involved might be required to examine several different databases many times over, each database housing different types of data such as genetic, proteomic, bibliographic, and patent information, often using separate software applications to address each database.
- This approach is not only time-consuming (searching for the same answer many times over) but prevents near real-time or real-time access to constantly expanding biological, proteomic and chemistry databases, since researchers must collect, reformat, and assimilate the continuous worldwide production of new life sciences data, and republish their databases at frequent intervals.
- the present invention will enable access to all current and historic data sources relevant to scientific investigations focused on drug development from a single, browser-based interface.
- the present invention mediates near real-time or real-time access between one or more persons and the multiple data sources they need to access.
- Metadata is data about the content, quality, condition, and other characteristics of data.
- the present invention informs users that new life science databases have entered the application service.
- the present invention provides a significantly improved method for those persons attempting to analyze isolated, incompatible data sources. By freeing a person from the tedious and time-consuming task of data integration and updates, the present invention saves businesses and/or whole industries time and money and frees employees from time-consuming data analysis, allowing them to focus on their real work.
- the present invention solves some of the current problems by providing a person or business a way to quickly and effectively integrate their data (from one or more sources) into 'functional views' they need. These functional views can be supplied to specialized applications that will help them identify possible candidates for new drugs and rapidly test those hypotheses.
- the present invention also offers solutions that process this data without always requiring the presence of one or more persons.
- the present invention is able to leverage components that a person and/or business is already utilizing because it is a hybrid model that ensures not only that the person or business is satisfied with the software but also that the software is part of an integrated solution that interfaces with the person's or business's existing system(s).
- the present invention, also referred to as 'MetaNome™', is a novel industry standards-based, scalable, platform independent repertoire of authentic semantics and business rules for the life sciences industry that aims to streamline the costly drug development process and enhance competitive edge.
- MetaNome is also a novel, industry standards-based, scalable, platform independent, horizontal metadata conduit for the life sciences industry that is understood by humans and machines to facilitate the understanding and integration of enterprise assets.
- FIGURE 1 is a block diagram of a system 100 in accordance with one embodiment of the present invention.
- the system 100 includes a MetaLife Integration Server 102, a MetaLife Classifier 104, a MetaLife Modeler 106, a MetaLife Repository 108, a MetaLife Pre-Processor 110 and a MetaLife Portal 112.
- the MetaLife Repository 108 is communicably coupled to the MetaLife Integration Server 102, the MetaLife Classifier 104 (optional), the MetaLife Modeler 106 and the MetaLife Portal 112.
- the MetaLife Classifier 104 is also communicably coupled to the MetaLife Pre-Processor 110 (optional).
- the dashed lines between the MetaLife Classifier 104 and the MetaLife Repository 108 and the MetaLife Pre-Processor 110 indicate that the MetaLife Classifier 104 and the MetaLife Pre-Processor 110 are optional.
- the MetaLife Integration Server 102 provides run-time execution of Metadata for data integration and web services.
- the MetaLife Classifier 104 provides an additional capability to classify the metadata into functional views. The functional views can be output from the MetaLife Classifier 104, built manually in the MetaLife Modeler 106 and accessed from the MetaLife Repository 108.
- the MetaLife Modeler 106 is used to design MetaModels, PIMs, PSMs, XML Schemas and Web Services.
- the MetaLife Repository 108 stores MetaModels, PIMs/PSMs, Web Services' definitions and XML Schemas, SOAP, WSDL and UDDI, etc.
- the MetaModels may include CWM, MOF and UML.
- the PIMs/PSMs may include gene expression, genomeMaps, ChemInformatics, BioMolecular Sequence Analysis, Clinical Image Access Service, etc.
- the Web Service can be internal or external and may include Search GenBank, SearchMed, SearchProt and Patent Filing, etc.
- the MetaLife Pre-Processor 110 gathers, maps and integrates Metadata from various metadata sources.
- the MetaLife Portal 112 provides browser-based 'views and reports' of MetaLife repository components and metadata updates.
- the Metadata Repository Models/Metamodels serves as the central hub into which a Virtual Data Access Engine, XML DTDs/Schemas, UDDI Repository and Adapters flow.
- Clinical Trials Data Repositories, Genomic Databases, Chemical Databases, Proteomics Databanks, Lab Instruments, Flat Files and XML/HTML Documents are examples of data sources that may, together or independently, flow into the Adapters.
- Flow is in either direction between the Metadata Repository Models, Metamodels and one or all of the following components: ETL Engine, Transform, UDDI Repository, XML, DTDs/Schemas, Virtual Data Access Engine. From the ETL Engine and the Virtual Data Access Engine flow may go to an Integrated Data Layer and Portal or web services.
- the destinations may include one or more Web browsers, PC applications, Visualization Applications, and Wireless Devices.
- Users of the System include Administrators, Lab Technicians, researchers, Chemists, Clinical Research Organizations, Proteomics Specialists, businesses and any other person requiring access to the system.
- Metadata is the primary means by which interoperability is achieved in a heterogeneous environment. Although interoperability is essentially facilitated by standard API's, it ultimately depends upon shared metadata as the definitions of systems' semantics and capabilities. Therefore, the capability to gather, store and publish application and system-level metadata is a 'must have.' Applications, tools, databases, and other components expose and discover metadata to enable cross-talk.
- the system of the present invention includes data management software that will vastly simplify the task of categorizing, integrating and analyzing the vast amounts of heterogeneous data, both from internally generated sources and from external life sciences research data.
- the present invention will remove the data integration and analysis burden from researchers and allow them to focus their efforts on research and development.
- the present invention addresses the following design challenges: standardization of diverse interpretations of data (often the same data in regional flavors or based on business rules), resolved by creating a metadata repository that manages metadata as well as a directory of services (UDDI), which differentiates the present invention from others; and establishment of a common Lingua Franca (common language) and an ATM (Adapter-Translation Mechanism) that allows a standard format for data exchange and transformation, resolved by the use of XML and ATM hubs.
- the present invention may include one or more of the following software components: MetaLife Pre-processor, MetaLife Classifier, MetaLife Modeler, MetaLife Repository, Virtual Data Access Engine, Portal, ETL Engine (Extract, Transformation & Load) and Adapters for various data sources.
- the ETL Engine may include one of several commercially available software products such as Informatica (www.informatica.com); Sagent (www.sagenttech.com); and/or
- the purpose of the ETL Engine is to extract, transform and load data from disparate sources into a new integrated physical data store.
- Atomic data from disparate sources may be aggregated and manipulated for faster performance (queries).
- integrated data may also be exchanged among disparate applications.
- the ETL Tool is an optional component of the present invention.
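The extract, transform and load steps described above can be illustrated with a minimal sketch. The source names commercial ETL products; the code below is not any of those products. It uses in-memory lists as stand-ins for disparate sources and SQLite as the "new integrated physical data store", and every field name is hypothetical.

```python
import sqlite3

# Hypothetical source records; field names are illustrative only.
ASSAY_ROWS = [
    {"compound": "C-001", "ic50_nM": "12.5"},
    {"compound": "C-002", "ic50_nM": "340"},
]
COMPOUND_ROWS = [
    {"id": "C-001", "name": "alpha"},
    {"id": "C-002", "name": "beta"},
]


def extract():
    """Extract: read raw rows from each (here in-memory) source."""
    return ASSAY_ROWS, COMPOUND_ROWS


def transform(assays, compounds):
    """Transform: join on compound id and convert nM strings to uM floats."""
    names = {c["id"]: c["name"] for c in compounds}
    return [
        (a["compound"], names[a["compound"]], float(a["ic50_nM"]) / 1000.0)
        for a in assays
    ]


def load(rows):
    """Load: write the integrated rows into a new physical store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE integrated (compound_id TEXT, name TEXT, ic50_uM REAL)"
    )
    conn.executemany("INSERT INTO integrated VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = load(transform(*extract()))
result = conn.execute(
    "SELECT name, ic50_uM FROM integrated ORDER BY ic50_uM"
).fetchall()
```

Once loaded, the integrated table can be queried directly, which is the performance benefit the text attributes to aggregating atomic data ahead of time.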
- the metadata repository is the container for managing enterprise metadata.
- the metadata repository should conform to industry standards and provide the 'glue' that drives interoperability among applications.
XMI: XML Metadata Interchange
- Metadata will be stored and exchanged via industry standards, such as XML Metadata Interchange ("XMI"). Metadata will essentially be the key to the driven web services of the present invention.
UDDI: Universal Description, Discovery and Integration
- Metadata repository will manage XML DTD's and/or Schemas.
- the Virtual Data Access Engine is used to create 'virtual' views of data from disparate sources.
- This layer may be viewed as a 'virtual mapping' or a 'roadmap' to the underlying data sources that may be integrated at run-time and provide 'context rich' views of disparate data.
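As a rough illustration of the 'virtual mapping' idea above, the sketch below resolves a logical view at run time from mapping metadata, without copying data into a new store. The mapping structure, source names and field names are assumptions made for this example and are not taken from the source.

```python
# Hypothetical mapping metadata: logical field -> (source name, source field)
VIEW_MAPPING = {
    "gene": ("genomic_db", "symbol"),
    "protein": ("proteomics_db", "protein_name"),
}

# Stand-ins for disparate sources, keyed by a shared accession.
SOURCES = {
    "genomic_db": {"ACC1": {"symbol": "TP53"}},
    "proteomics_db": {"ACC1": {"protein_name": "p53"}},
}


def virtual_view(accession, mapping=VIEW_MAPPING, sources=SOURCES):
    """Resolve each logical field at call time; nothing is copied or stored."""
    row = {}
    for logical_field, (source, field) in mapping.items():
        record = sources[source].get(accession, {})
        row[logical_field] = record.get(field)   # None if the source lacks it
    return row


view = virtual_view("ACC1")
```

Because the mapping is plain data, changing which source backs a logical field only requires editing the mapping, not the code that consumes the view.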
- Xaware's www.xaware.com
- Metamatrix's Integration Server
- Adapters are software modules that facilitate connectivity to data. These include ODBC, JDBC and native drivers to relational databases like Oracle, Sybase, DB2 and others. Custom adapters will be developed if necessary, although an extensive range of commercially available Adapters already exists and is in use in most IT organizations.
- a Connector Development Kit will be provided to develop any specialized connector.
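One way to picture the adapter role described above is a small common interface that hides how each source is reached. The sketch below is an assumption-laden illustration: the `Adapter` class and its `fetch` method are hypothetical names, SQLite stands in for a relational database reached through a driver, and a CSV string stands in for a flat file.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod


class Adapter(ABC):
    """Hypothetical common interface; not an API named in the source."""

    @abstractmethod
    def fetch(self):
        """Return records as a list of dicts."""


class RelationalAdapter(Adapter):
    def __init__(self, conn, table):
        self.conn, self.table = conn, table

    def fetch(self):
        # The table name is trusted here; real code should validate it.
        cur = self.conn.execute(f"SELECT * FROM {self.table}")
        cols = [d[0] for d in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]


class FlatFileAdapter(Adapter):
    def __init__(self, text):
        self.text = text

    def fetch(self):
        return [dict(r) for r in csv.DictReader(io.StringIO(self.text))]


# Usage: two different sources consumed through the same interface.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compounds (id TEXT, name TEXT)")
conn.execute("INSERT INTO compounds VALUES ('C-001', 'alpha')")
adapters = [
    RelationalAdapter(conn, "compounds"),
    FlatFileAdapter("id,name\nC-002,beta\n"),
]
records = [r for a in adapters for r in a.fetch()]
```

A specialized connector of the kind the text mentions would, under this sketch, simply be another subclass.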
- the system of the present invention will generate a web service query that will search the respective Chemical Libraries, Bioassay, Human Genome Sequence, Proteomics databanks and Clinical/Pre-clinical trials databases and retrieve a results set. Additional data transformation and aggregation may then be performed by the researcher before sharing these results or performing another web service query.
- the present invention can also be used to provide a "patent filing web service." This service will automate the process of patent filing, including searching and providing additional information requested (Toxicology/Adverse impact analysis data, for example).
- the present invention may also include specialized web services such as patent preparation/submission, hooks (via web services) into industry (e.g., hospitals, business or government data stores), and for the healthcare industry such things as disease outcomes and diagnostic codes data.
- the architecture provided by the present invention is integrated (ability to generate disparate sources and types of metadata), scalable (ability to sustain growth (content and usability of metadata)), robust (provide extensive functionality and performance), customizable (ability to tailor the metadata solution to satisfy the content complexity and business needs), open (accessibility of metadata to systems, applications and user interfaces), conformant with industry standards (ability to implement established industry metadata standards: MOF, CWM and XMI for example), bi-directional (permit metadata exchange (update) between the metadata sources and metadata repository) and closed-loop (allow metadata repository to feed metadata back to operational systems).
- the components described above in system 100 may be variants of commercially available metadata repository products:
- the commercially available components listed above cannot be taken "off the shelf" and combined together to create system 100 for life sciences without special modifications.
- the present invention provides an integrated system that is not currently available.
- the MetaLife Repository supports numerous industry standards.
- the supported standards from the Object Management Group include Meta Object Facility (“MOF”), XML Metadata Interchange (“XMI”), Unified Modeling Language (“UML”), Common Warehouse MetaModel (“CWM”), Software Process Engineering MetaModel (“SPEM”), Component Collaboration Architecture (“EDOC CCA”), and Software Portfolio Management Facility (“SPMF”).
- Supported life sciences domain standards include gene expression, genome maps, clinical image access service, lab instrument control interface, and biomolecular sequence analysis. Life sciences markup languages and ontologies are also supported.
- the Reusable Asset Specification (“RAS”) and Java Metadata Interface (“JMI”) are supported.
- FIGURE 2 is a block diagram of a system 200 in accordance with another embodiment of the present invention.
- the system 200 includes a MetaLife Classifier 104, a MetaLife Modeler 106, a MetaLife Repository 108, a MetaLife Pre-Processor 110 and a MetaLife Portal 112.
- the components are the same as described in FIGURE 1, except that they are connected differently.
- FIGURE 3 is a flow chart of a method 300 in accordance with one embodiment of the present invention.
- the method 300 obtains metadata from a metadata source in block 302. Thereafter, the metadata is mapped to a MetaModel in block 304 and the mapped metadata is integrated and classified into functional views in block 306. The integrated and classified metadata is then stored in a repository in block 308. The stored metadata is retrieved in block 310 and used in an application web service in block 312.
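The blocks of method 300 can be read as a simple pipeline. The following sketch mirrors that order with plain Python functions. The MetaModel mapping, the classification rule and the dictionary used as a repository are all hypothetical simplifications, not the structures used by the invention.

```python
# Hypothetical MetaModel: source attribute name -> canonical attribute name
METAMODEL = {
    "seq_id": "sequence_id",
    "organism_name": "organism",
    "assay": "bioassay",
}

# Hypothetical rule assigning canonical attributes to functional views
VIEW_RULES = {
    "sequence_id": "genomics",
    "organism": "genomics",
    "bioassay": "screening",
}

repository = {}  # stand-in for the metadata repository


def obtain_metadata():                       # block 302
    return [{"seq_id": "string"}, {"organism_name": "string"}, {"assay": "float"}]


def map_to_metamodel(items):                 # block 304
    return [{METAMODEL[k]: v for k, v in item.items()} for item in items]


def classify(items):                         # block 306
    views = {}
    for item in items:
        for name, dtype in item.items():
            views.setdefault(VIEW_RULES[name], {})[name] = dtype
    return views


def store(views):                            # block 308
    repository.update(views)


def retrieve(view_name):                     # block 310
    return repository[view_name]


store(classify(map_to_metamodel(obtain_metadata())))
genomics_view = retrieve("genomics")  # block 312 would hand this to a web service
```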
- FIGURE 4 is a block diagram of a system 400 in accordance with another embodiment of the present invention.
- the system 400 includes a testing or data analysis/instrument device 402 having an embedded interface 404.
- the testing or data analysis/instrument device 402 produces a standard raw data output 406.
- the metadata from the testing or data analysis/instrument device 402 is processed or consumed by the embedded interface 404 using a MetaLife Model 410, which can be downloaded from a MetaLife Repository.
- the output data is then provided to a MetaLife Repository or other selected output 408, such as an XML file or another device.
- FIGURE 5 is a flow chart of a method 500 in accordance with another embodiment of the present invention.
- the method 500 corresponds to the system 400 (FIGURE 4).
- the Embedded Interface 404 receives the data from the Testing or Data Analysis/Instrument Device 402 in block 502 and processes or consumes that data using the MetaLife Model 410 in block 504. Thereafter, the processed data is provided to a MetaLife Repository or other output device/application 408 in block 506.
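A minimal sketch of method 500, assuming the MetaLife Model can be reduced to a mapping from raw instrument fields to XML element names and value converters. The element names, field names and the `Reading` root element are invented for illustration; the source only states that output may be an XML file.

```python
import xml.etree.ElementTree as ET

# Hypothetical MetaLife Model: raw field -> (XML element name, converter)
MODEL = {
    "well": ("Well", str),
    "signal": ("Signal", float),
}


def process(raw_record, model=MODEL):
    """Blocks 502-504: receive a raw record and shape it using the model."""
    root = ET.Element("Reading")
    for raw_field, (tag, convert) in model.items():
        child = ET.SubElement(root, tag)
        child.text = str(convert(raw_record[raw_field]))
    return root


def emit(element):
    """Block 506: serialize for a repository or another device."""
    return ET.tostring(element, encoding="unicode")


xml_out = emit(process({"well": "A1", "signal": "0.82"}))
```

Because the converter for each field comes from the model, a malformed value (for example a non-numeric signal) fails at the interface rather than downstream.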
- FIGURE 6 is a screen shot 600 of a MetaLife Modeler 106 (FIGURES 1 and 2) in accordance with one embodiment of the present invention.
- the MetaLife Modeler is a graphical user interface that enables metadata modeling conformant to OMG's Model Driven Architecture (“MDA") using UML.
MDA: Model Driven Architecture
- the MetaLife Modeler allows abstraction of metadata at design time and run time using semantics and business rules.
- the MetaLife Modeler permits complete integration and exchange of metadata with existing modeling tools, such as ETL and DW, via XML.
- the MetaLife Modeler also allows complete modeling of web services/application as well as more than 90% of the code generation.
- the screen 600 is split into a project window 602, documentation window 604, model window 606 and output window 608.
- the project window 602 lists the various models 610, such as biosequence, bioassay, gene expression, bioevent, genome, proteomic, clinical trial and toxicology models, that are available in a standard file-tree structure. Once selected, the various models 610 can be displayed in the model window 606 and manipulated.
- the MetaLife Modeler promotes understanding of business needs, satisfies questions, provides focus on important issues, removes ambiguity, tests ideas, compares alternatives, provides rigor, reduces cost of changes and corrections, and supports new iterations.
- FIGURE 7 is a block diagram of a MetaLife Integration Server 700 in accordance with one embodiment of the present invention.
- the MetaLife Integration Server 700 provides bi-directional integration of disparate enterprise systems.
- the MetaLife Integration Server 700 can also decompose XML data to enterprise systems, manage transactions across systems, apply business rules, workflow logic and transformations to data, aggregate data from disparate systems to create virtual business objects, and reuse semantic accuracy of enterprise metadata.
- the MetaLife Integration Server 700 includes a MetaLife Integration Server 702 communicably coupled to one or more MetaLife Adapters 704, one or more MetaLife Connectors 706 and a manager 708.
- the MetaLife Integration Server 702 is an XML-based bi-directional server (Java and C++) that can be deployed on J2EE servers and .Net servers, Windows and Unix platforms.
- the MetaLife Adapters 704 connect the MetaLife Integration Server 702 to enterprise systems, such as RDBMS, XML, DBMS, HTTP, EJB's, JMS, Java, API, SOAP, mainframe, ERP, CRM, SNMP and SOCKET.
- the MetaLife Connectors 706 connect other applications to the MetaLife Integration Server 702, such as XQUERY, EJB, JMS, SERVLET, SOAP, CGI, ISAPI, CORBA, HTTP and API.
- the Manager 708 manages the MetaLife Integration Server 702.
- FIGURE 8 is a block diagram of a system 800 in accordance with another embodiment of the present invention.
- the system 800 includes three tiers: a MetaLife access tier 820, a data storage and processing tier 822 and a data source tier 824.
- Various users 802 use the access tier 820, which includes the MetaLife Portal, to access and use and manipulate metadata that is stored or accessible via the data storage and processing tier 822.
- the various users 802 may include researchers 804, informatics specialists 806, chemists 808, toxicologists 810, pharmacologists 812, clinical trials specialists 814, FDA liaisons 816, proteomics specialists 818 and others.
- the data storage and processing tier 822 includes the MetaLife Repository (software services/applications directory), the MetaLife Integration Server, and the messaging/information request/response infrastructure.
- the data source tier 824 includes internal and external data sources, internal and partner applications, and internal and external services.
- FIGURE 9 is a diagram illustrating the uses of the MetaLife Modeler 106 (FIGURES 1 and 2) in accordance with one embodiment of the present invention.
- the MetaLife Modeler 600 allows the user to create and manipulate MetaModels using disparate XML DTDs/Schemas 900, Semantics 902, MetaModels 904 and 906, and MetaModel output 908.
- the Semantics 902 may include a treatment, which is the experimental manipulation of a sample such as a cell culture, tissue, or organism prior to extraction of a preparation, or a virtual array. With a virtual array, the BioAssayData resulting from a BioAssayCreation and a series of BioAssayTreatments may abstract away the actual lower-level design elements so that the user sees the results only at the composite sequence or reporter level.
- the virtual array allows description and annotation of these design elements for reference in the BioAssayData.
- MetaModel 904 is a model for BioAssayData and is shown in more detail in FIGURE 10.
- MetaModel 906 is a model for ArrayDesign and is shown in more detail in FIGURE 11.
- FIGURE 12 is a block diagram of a data flow 1200 in accordance with one embodiment of the present invention.
- Life sciences standards 1202, such as gene expression and genome maps, are modeled as PIMs in a MetaLife Modeler 106 (FIGURES 1 and 2).
- the MetaModels can then be used in MetaPrograms (J2EE or .Net) 1204 to provide .Net web services 1206 and J2EE web services 1208.
- the MetaModels can also be exported via XMI to the MetaLife Repository 1210.
- the Metadata and MetaModels in the MetaLife Repository 1210 may then be used by various tools 1212, such as XML Schema Tools, Data Modeling Tools and ETL Tools, via XMI.
- XML Schema and MetaLife Object(s) may also be exported from the MetaLife Repository 1210 to the MetaLife Integrator 1214, which, in turn, provides integrated data to applications 1216.
- FIGURE 13 is a block diagram of a system 1300 in accordance with another embodiment of the present invention.
- System 1300 is used to generate applications 1310 and web services 1312.
- the PIM Model 1302 uses UDDI, WSDL, SOAP and XML Schemas in the MetaLife Repository 1304 to provide a MetaModel to the MetaLife Machine 1308.
- the MetaLife Repository 1304 is also used to generate MetaPrograms 1306, which are applied to the MetaLife Machine 1308.
- the MetaLife Machine 1308 then generates code to produce applications 1310 (J2EE or .Net) and web services 1312.
- FIGURE 14 is a block diagram of a MetaLife Integration Server 1400 in accordance with another embodiment of the present invention.
- the first tier 1402 contains databases, legacy applications, web services, application servers and other data sources.
- the second tier 1404 contains adapters 1404 that are used to process metadata from the first tier to the third tier 1406, which contains a virtual XML information server 1406, business rules processing and work flow manager 1408, and XML doc processor and transformation processor 1410.
- the third tier 1406 works with the fourth tier 1412, which contains cross applications views, to provide metadata integration.
- the fifth tier 1414 contains connectors that are used to supply integrated metadata to the sixth tier, which includes reporting applications, web applications, EJB's, Pads, HTS and other lab instruments.
- FIGURE 15 is a block diagram of a data flow 1500 in accordance with another embodiment of the present invention.
- Data flow 1500 illustrates the prediction of highly effective chemical compounds, gene and protein structures for drug discovery, diagnostics and improvement of the HTS process.
- Chem-informatics data 1502, bio-assays data 1504 and protein databases 1506 are fed to the MetaLife Pre-Processor 1508.
- the MetaLife Pre-Processor 1508 provides pre-processed metadata to the MetaLife Classifier 1510, which may include SVM or Neural Network algorithms. Chemical structures are then classified with protein regions interaction 1512 to produce faster discovery of lead compounds 1514.
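The classification step above is described only as possibly using SVM or Neural Network algorithms. As a stand-in, the sketch below trains a single-layer perceptron, the simplest neural network, on made-up two-dimensional toy features. It is not the classifier of the invention, and the feature values and labels carry no chemical meaning.

```python
# Toy, linearly separable data: (features, label). Values are invented.
DATA = [
    ([2.0, 1.0], 1),
    ([1.5, 2.0], 1),
    ([-1.0, -0.5], 0),
    ([-2.0, -1.5], 0),
]


def train_perceptron(data, epochs=10):
    """Classic perceptron rule: nudge weights by the prediction error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = y - (1 if activation > 0 else 0)
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    return weights, bias


WEIGHTS, BIAS = train_perceptron(DATA)


def predict(x):
    """Return 1 for the 'interacting' class, 0 otherwise (toy labels)."""
    return 1 if sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS > 0 else 0
```

In practice the pre-processor would supply real descriptors and a library implementation of an SVM or neural network would replace this toy model.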
- FIGURE 16 is a block diagram of a system 1600 in accordance with another embodiment of the present invention.
- the present invention provides device driven interoperability by creating output data that can be bi-directionally exchanged between devices.
- a first testing or data analysis/instrument device 1602, such as Bio-chips, Bio-assays, sequencers or HTS, has a first embedded interface 1604.
- the first testing or data analysis/instrument device 1602 uses the first embedded interface 1604 to produce first output data 1616, which may be in XML.
- the first embedded interface 1604 processes or consumes the metadata generated by the first testing or data analysis/instrument device 1602 using a MetaLife Model 1606, which may be downloaded from MetaLife Repository 1614.
- a second testing or data analysis/instrument device 1608 such as gel electrophoresis or mass-spectrometry, has a second embedded interface 1610.
- the second testing or data analysis/instrument device 1608 produces second output data 1618, which may be in XML.
- the second embedded interface 1610 processes or consumes the metadata generated by the second testing or data analysis/instrument device 1608 using a MetaLife Model 1612, which may be downloaded from MetaLife Repository 1614.
- FIGURE 17 is a block diagram of a system 1700 in accordance with another embodiment of the present invention.
- the system 1700 includes Metadata sources 1702, which are used to gather and integrate metadata, a Metadata Repository 1704, which is used to store and update metadata, and Metadata Users 1706, which deliver, exchange and publish metadata.
- the Metadata sources 1702 include such sources 1708 as reference data repositories, enrichment systems, data modeling tools, ETL Tools, data quality tools, reporting tools, data dictionary, intranet/internet and external metadata.
- the Metadata Repository 1704 includes regional MetaLife Repositories 1710, repository administration web or client server 1712, enterprise MetaLife Repository 1714, repository design and development tools 1716, Metadata warehouses 1718 and MetaPortal 1720.
- Metadata sources 1708 are communicably coupled to regional Metadata Repositories 1710.
- the Metadata Users 1706 include metadata, web services exploration, reporting, WinX/Browser 1722 and research data, proteomics, clinical trials, cheminformatics, toxicology, etc. 1724.
- the regional MetaLife Repositories 1710 are communicably coupled to repository administration web or client server 1712 and enterprise MetaLife Repository 1714.
- Enterprise MetaLife Repository 1714, which contains business and technical metadata, is communicably coupled to repository design and development tools 1716, Metadata warehouses 1718, MetaPortal 1720 and reference data, research data, clinical trials, cheminformatics and toxicology 1724.
- the MetaPortal 1720 is also communicably coupled to the Metadata warehouse 1718 and the Metadata, web services exploration, reporting, WinX/Browser 1722.
- FIGURE 18 is a block diagram of a system 1800 in accordance with another embodiment of the present invention.
- System 1800 includes design tools Metadata 1802, core Metadata producers 1804 and other Metadata sources 1806.
- The design tools Metadata 1802 includes Power Designer 1808, Rational Rose 1810, Erwin Client 1812, Open Source (MetaNology, etc.) 1814 and Designer 2K Client 1816, all communicably coupled to the Erwin, ModelMart, Designer 2K and Rose repositories 1818, which are communicably coupled to the Meta ETL process 1820.
- The core Metadata producers 1804 include reference data repositories 1822 and data dictionary, business and/or transformation rules docs 1824, each communicably coupled to the Meta ETL process 1820.
- The other Metadata sources 1806 include OLAP tools, catalogs and repositories 1826, ETL/DQ tools repository 1828, UDDI registry 1830 and vendor applications 1832, each communicably coupled to the Meta ETL process 1820.
- The Meta ETL process (MetaLife Pre-Processor) 1820 maps, extracts and transforms metadata using Metadata exchange APIs to provide XML input/output.
- The Meta ETL process 1820 is communicably coupled to the integration bridges and/or Metadata repository integration utility 1834.
- The integration bridges 1834 are communicably coupled to the MetaLife repository 1836 to load and update the repository information.
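The flow through the Meta ETL process 1820 and the integration bridges 1834 can be illustrated with a minimal Python sketch. It is not the patented implementation: the source names, the field mapping `FIELD_MAP`, the XML element names and the in-memory dictionary standing in for the MetaLife repository 1836 are all hypothetical, chosen only to show the extract, map/transform, XML output and load/update steps in order.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source-specific field names to a common model.
FIELD_MAP = {
    "erwin": {"entity": "name", "desc": "description"},
    "data_dictionary": {"term": "name", "definition": "description"},
}


def extract(source_name, records):
    """Copy each raw record and tag it with the source it came from."""
    return [(source_name, dict(record)) for record in records]


def transform(source_name, record):
    """Rename source fields to the common model; unmapped fields are dropped."""
    mapping = FIELD_MAP[source_name]
    return {common: record[src] for src, common in mapping.items() if src in record}


def to_xml(source_name, item):
    """Serialize one mapped item as an XML string (the 'XML output' step)."""
    elem = ET.Element("metadata", attrib={"source": source_name})
    for key, value in sorted(item.items()):
        ET.SubElement(elem, key).text = str(value)
    return ET.tostring(elem, encoding="unicode")


def load(repository, xml_text):
    """Insert or update a repository entry keyed by the item's name."""
    name = ET.fromstring(xml_text).findtext("name")
    repository[name] = xml_text
    return name


def run_meta_etl(sources, repository):
    """Run extract -> transform -> XML -> load for every source, in order."""
    for source_name, records in sources.items():
        for src, record in extract(source_name, records):
            load(repository, to_xml(src, transform(src, record)))
    return repository


# Usage: two hypothetical sources describe the same item; the later one updates it.
sources = {
    "erwin": [{"entity": "Protein", "desc": "A protein table", "extra": 1}],
    "data_dictionary": [{"term": "Protein", "definition": "Updated definition"}],
}
repo = run_meta_etl(sources, {})
```

Keying the repository by item name is what makes the final step behave as "load and update": a second source describing the same item replaces the earlier entry instead of duplicating it. A real repository would need a merge policy rather than last-writer-wins.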
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP03746705A EP1500005A4 (en) | 2002-04-12 | 2003-04-11 | System and method for semantics driven data processing |
CA002501114A CA2501114A1 (en) | 2002-04-12 | 2003-04-11 | System and method for semantics driven data processing |
AU2003226053A AU2003226053A1 (en) | 2002-04-12 | 2003-04-11 | System and method for semantics driven data processing |
IL16449504A IL164495A0 (en) | 2002-04-12 | 2004-10-11 | System and method for semantics driven data processing |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US37227402P | 2002-04-12 | 2002-04-12 | |
US60/372,274 | 2002-04-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2003088088A1 (en) | 2003-10-23 |
Family
ID=29250829
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2003/011025 WO2003088088A1 (en) | 2002-04-12 | 2003-04-11 | System and method for semantics driven data processing |
Country Status (6)
Country | Link |
---|---|
US (1) | US20030233365A1 (en) |
EP (1) | EP1500005A4 (en) |
AU (1) | AU2003226053A1 (en) |
CA (1) | CA2501114A1 (en) |
IL (1) | IL164495A0 (en) |
WO (1) | WO2003088088A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10083215B2 (en) | 2015-04-06 | 2018-09-25 | International Business Machines Corporation | Model-based design for transforming data |
Families Citing this family (54)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7213018B2 (en) * | 2002-01-16 | 2007-05-01 | Aol Llc | Directory server views |
US7219104B2 (en) * | 2002-04-29 | 2007-05-15 | Sap Aktiengesellschaft | Data cleansing |
US7401064B1 (en) * | 2002-11-07 | 2008-07-15 | Data Advantage Group, Inc. | Method and apparatus for obtaining metadata from multiple information sources within an organization in real time |
US7373350B1 (en) * | 2002-11-07 | 2008-05-13 | Data Advantage Group | Virtual metadata analytics and management platform |
US20050203920A1 (en) * | 2004-03-10 | 2005-09-15 | Yu Deng | Metadata-related mappings in a system |
US7426523B2 (en) * | 2004-03-12 | 2008-09-16 | Sap Ag | Meta Object Facility compliant interface enabling |
US7428552B2 (en) * | 2004-07-09 | 2008-09-23 | Sap Aktiengesellschaft | Flexible access to metamodels, metadata, and other program resources |
US7505989B2 (en) | 2004-09-03 | 2009-03-17 | Biowisdom Limited | System and method for creating customized ontologies |
US7493333B2 (en) | 2004-09-03 | 2009-02-17 | Biowisdom Limited | System and method for parsing and/or exporting data from one or more multi-relational ontologies |
GB0419607D0 (en) * | 2004-09-03 | 2004-10-06 | Accenture Global Services Gmbh | Documenting processes of an organisation |
US7496593B2 (en) | 2004-09-03 | 2009-02-24 | Biowisdom Limited | Creating a multi-relational ontology having a predetermined structure |
US7882170B1 (en) * | 2004-10-06 | 2011-02-01 | Microsoft Corporation | Interfacing a first type of software application to information configured for use by a second type of software application |
US7925540B1 (en) | 2004-10-15 | 2011-04-12 | Rearden Commerce, Inc. | Method and system for an automated trip planner |
US20060101385A1 (en) * | 2004-10-22 | 2006-05-11 | Gerken Christopher H | Method and System for Enabling Roundtrip Code Protection in an Application Generator |
US8024703B2 (en) * | 2004-10-22 | 2011-09-20 | International Business Machines Corporation | Building an open model driven architecture pattern based on exemplars |
WO2006043012A1 (en) * | 2004-10-22 | 2006-04-27 | New Technology/Enterprise Limited | Data processing system and method |
US20060101387A1 (en) * | 2004-10-22 | 2006-05-11 | Gerken Christopher H | An Open Model Driven Architecture Application Implementation Service |
US7376933B2 (en) * | 2004-10-22 | 2008-05-20 | International Business Machines Corporation | System and method for creating application content using an open model driven architecture |
US7831633B1 (en) * | 2004-12-22 | 2010-11-09 | Actuate Corporation | Methods and apparatus for implementing a custom driver for accessing a data source |
US7970666B1 (en) | 2004-12-30 | 2011-06-28 | Rearden Commerce, Inc. | Aggregate collection of travel data |
US20080147450A1 (en) * | 2006-10-16 | 2008-06-19 | William Charles Mortimore | System and method for contextualized, interactive maps for finding and booking services |
US20060224613A1 (en) * | 2005-03-31 | 2006-10-05 | Bermender Pamela A | Method and system for an administrative apparatus for creating a business rule set for dynamic transform and load |
US20070022106A1 (en) * | 2005-07-21 | 2007-01-25 | Caterpillar Inc. | System design using a RAS-based database |
US9117223B1 (en) | 2005-12-28 | 2015-08-25 | Deem, Inc. | Method and system for resource planning for service provider |
US20070150349A1 (en) * | 2005-12-28 | 2007-06-28 | Rearden Commerce, Inc. | Method and system for culling star performers, trendsetters and connectors from a pool of users |
US8141038B2 (en) * | 2005-12-29 | 2012-03-20 | International Business Machines Corporation | Virtual RAS repository |
US8086994B2 (en) | 2005-12-29 | 2011-12-27 | International Business Machines Corporation | Use of RAS profile to integrate an application into a templatable solution |
US20070263010A1 (en) * | 2006-05-15 | 2007-11-15 | Microsoft Corporation | Large-scale visualization techniques |
US7962470B2 (en) * | 2006-06-01 | 2011-06-14 | Sap Ag | System and method for searching web services |
US7941374B2 (en) | 2006-06-30 | 2011-05-10 | Rearden Commerce, Inc. | System and method for changing a personal profile or context during a transaction |
US7774463B2 (en) * | 2006-07-25 | 2010-08-10 | Sap Ag | Unified meta-model for a service oriented architecture |
US20080065750A1 (en) * | 2006-09-08 | 2008-03-13 | O'connell Margaret M | Location and management of components across an enterprise using reusable asset specification |
US20080155557A1 (en) * | 2006-12-21 | 2008-06-26 | Vladislav Bezrukov | Unified metamodel for web services description |
US8601495B2 (en) * | 2006-12-21 | 2013-12-03 | Sap Ag | SAP interface definition language (SIDL) serialization framework |
US20080183725A1 (en) * | 2007-01-31 | 2008-07-31 | Microsoft Corporation | Metadata service employing common data model |
US20090063438A1 (en) * | 2007-08-28 | 2009-03-05 | Iamg, Llc | Regulatory compliance data scraping and processing platform |
US20090182750A1 (en) * | 2007-11-13 | 2009-07-16 | Oracle International Corporation | System and method for flash folder access to service metadata in a metadata repository |
US8156144B2 (en) * | 2008-01-23 | 2012-04-10 | Microsoft Corporation | Metadata search interface |
US7949654B2 (en) * | 2008-03-31 | 2011-05-24 | International Business Machines Corporation | Supporting unified querying over autonomous unstructured and structured databases |
US20100211419A1 (en) * | 2009-02-13 | 2010-08-19 | Rearden Commerce, Inc. | Systems and Methods to Present Travel Options |
CN101963965B (en) * | 2009-07-23 | 2013-03-20 | 阿里巴巴集团控股有限公司 | Document indexing method, data query method and server based on search engine |
CA2679494C (en) | 2009-09-17 | 2014-06-10 | Ibm Canada Limited - Ibm Canada Limitee | Consolidating related task data in process management solutions |
DE102010011664A1 (en) * | 2009-09-29 | 2011-03-31 | Siemens Aktiengesellschaft | View server and method for providing specific data of objects and / or object types |
CA2707251A1 (en) | 2010-06-29 | 2010-09-15 | Ibm Canada Limited - Ibm Canada Limitee | Target application creation |
US8954375B2 (en) * | 2010-10-15 | 2015-02-10 | Qliktech International Ab | Method and system for developing data integration applications with reusable semantic types to represent and process application data |
US20140088880A1 (en) * | 2012-09-21 | 2014-03-27 | Life Technologies Corporation | Systems and Methods for Versioning Hosted Software |
US8954456B1 (en) | 2013-03-29 | 2015-02-10 | Measured Progress, Inc. | Translation and transcription content conversion |
US20140351678A1 (en) * | 2013-05-22 | 2014-11-27 | European Molecular Biology Organisation | Method and System for Associating Data with Figures |
CN103309954A (en) * | 2013-05-27 | 2013-09-18 | 复旦大学 | Html webpage based data extracting system |
US9626388B2 (en) | 2013-09-06 | 2017-04-18 | TransMed Systems, Inc. | Metadata automated system |
US10394828B1 (en) | 2014-04-25 | 2019-08-27 | Emory University | Methods, systems and computer readable storage media for generating quantifiable genomic information and results |
US9684699B2 (en) * | 2014-12-03 | 2017-06-20 | Sas Institute Inc. | System to convert semantic layer metadata to support database conversion |
US10387476B2 (en) * | 2015-11-24 | 2019-08-20 | International Business Machines Corporation | Semantic mapping of topic map meta-models identifying assets and events to include modeled reactive actions |
AU2022236779A1 (en) * | 2021-03-19 | 2023-11-02 | Portfolio4 Pty Ltd | Data management |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020156792A1 (en) * | 2000-12-06 | 2002-10-24 | Biosentients, Inc. | Intelligent object handling device and method for intelligent object data in heterogeneous data environments with high data density and dynamic application needs |
US20030110058A1 (en) * | 2001-12-11 | 2003-06-12 | Fagan Andrew Thomas | Integrated biomedical information portal system and method |
US20030115243A1 (en) * | 2001-12-18 | 2003-06-19 | Intel Corporation | Distributed process execution system and method |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5257363A (en) * | 1990-04-09 | 1993-10-26 | Meta Software Corporation | Computer-aided generation of programs modelling complex systems using colored petri nets |
US5848273A (en) * | 1995-10-27 | 1998-12-08 | Unisys Corp. | Method for generating OLE automation and IDL interfaces from metadata information |
US5978804A (en) * | 1996-04-11 | 1999-11-02 | Dietzman; Gregg R. | Natural products information system |
JP3288264B2 (en) * | 1997-06-26 | 2002-06-04 | 富士通株式会社 | Design information management system, design information access device, and program storage medium |
US5937409A (en) * | 1997-07-25 | 1999-08-10 | Oracle Corporation | Integrating relational databases in an object oriented environment |
US5966707A (en) * | 1997-12-02 | 1999-10-12 | International Business Machines Corporation | Method for managing a plurality of data processes residing in heterogeneous data repositories |
US6535868B1 (en) * | 1998-08-27 | 2003-03-18 | Debra A. Galeazzi | Method and apparatus for managing metadata in a database management system |
US6574635B2 (en) * | 1999-03-03 | 2003-06-03 | Siebel Systems, Inc. | Application instantiation based upon attributes and values stored in a meta data repository, including tiering of application layers objects and components |
US6381743B1 (en) * | 1999-03-31 | 2002-04-30 | Unisys Corp. | Method and system for generating a hierarchial document type definition for data interchange among software tools |
US6523035B1 (en) * | 1999-05-20 | 2003-02-18 | Bmc Software, Inc. | System and method for integrating a plurality of disparate database utilities into a single graphical user interface |
US6477580B1 (en) * | 1999-08-31 | 2002-11-05 | Accenture Llp | Self-described stream in a communication services patterns environment |
WO2001052118A2 (en) * | 2000-01-14 | 2001-07-19 | Saba Software, Inc. | Information server |
AU2001226401A1 (en) * | 2000-01-14 | 2001-07-24 | Saba Software, Inc. | Method and apparatus for a business applications server |
US6985905B2 (en) * | 2000-03-03 | 2006-01-10 | Radiant Logic Inc. | System and method for providing access to databases via directories and other hierarchical structures and interfaces |
US6311194B1 (en) * | 2000-03-15 | 2001-10-30 | Taalee, Inc. | System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising |
US7177798B2 (en) * | 2000-04-07 | 2007-02-13 | Rensselaer Polytechnic Institute | Natural language interface using constrained intermediate dictionary of results |
AU2001257450A1 (en) * | 2000-05-04 | 2001-11-12 | Kickfire, Inc. | An information repository system and method for an itnernet portal system |
US6772160B2 (en) * | 2000-06-08 | 2004-08-03 | Ingenuity Systems, Inc. | Techniques for facilitating information acquisition and storage |
WO2002013065A1 (en) * | 2000-08-03 | 2002-02-14 | Epstein Bruce A | Information collaboration and reliability assessment |
US20020059566A1 (en) * | 2000-08-29 | 2002-05-16 | Delcambre Lois M. | Uni-level description of computer information and transformation of computer information between representation schemes |
US20030028415A1 (en) * | 2001-01-19 | 2003-02-06 | Pavilion Technologies, Inc. | E-commerce system using modeling of inducements to customers |
US20020099563A1 (en) * | 2001-01-19 | 2002-07-25 | Michael Adendorff | Data warehouse system |
US6725232B2 (en) * | 2001-01-19 | 2004-04-20 | Drexel University | Database system for laboratory management and knowledge exchange |
US20020103811A1 (en) * | 2001-01-26 | 2002-08-01 | Fankhauser Karl Erich | Method and apparatus for locating and exchanging clinical information |
US7363372B2 (en) * | 2001-02-06 | 2008-04-22 | Mtvn Online Partners I Llc | System and method for managing content delivered to a user over a network |
US7299202B2 (en) * | 2001-02-07 | 2007-11-20 | Exalt Solutions, Inc. | Intelligent multimedia e-catalog |
US20020161778A1 (en) * | 2001-02-24 | 2002-10-31 | Core Integration Partners, Inc. | Method and system of data warehousing and building business intelligence using a data storage model |
US20020169560A1 (en) * | 2001-05-12 | 2002-11-14 | X-Mine | Analysis mechanism for genetic data |
US20020178150A1 (en) * | 2001-05-12 | 2002-11-28 | X-Mine | Analysis mechanism for genetic data |
US20020194201A1 (en) * | 2001-06-05 | 2002-12-19 | Wilbanks John Thompson | Systems, methods and computer program products for integrating biological/chemical databases to create an ontology network |
US7054847B2 (en) * | 2001-09-05 | 2006-05-30 | Pavilion Technologies, Inc. | System and method for on-line training of a support vector machine |
US6649909B2 (en) * | 2002-02-20 | 2003-11-18 | Agilent Technologies, Inc. | Internal introduction of lock masses in mass spectrometer systems |
- 2003
  - 2003-04-11 WO PCT/US2003/011025 patent/WO2003088088A1/en not_active Application Discontinuation
  - 2003-04-11 CA CA002501114A patent/CA2501114A1/en not_active Abandoned
  - 2003-04-11 AU AU2003226053A patent/AU2003226053A1/en not_active Abandoned
  - 2003-04-11 EP EP03746705A patent/EP1500005A4/en not_active Withdrawn
  - 2003-04-11 US US10/412,663 patent/US20030233365A1/en not_active Abandoned
- 2004
  - 2004-10-11 IL IL16449504A patent/IL164495A0/en unknown
Non-Patent Citations (1)
Title |
---|
See also references of EP1500005A4 * |
Also Published As
Publication number | Publication date |
---|---|
IL164495A0 (en) | 2005-12-18 |
EP1500005A4 (en) | 2006-12-13 |
US20030233365A1 (en) | 2003-12-18 |
AU2003226053A1 (en) | 2003-10-27 |
CA2501114A1 (en) | 2003-10-23 |
EP1500005A1 (en) | 2005-01-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20030233365A1 (en) | System and method for semantics driven data processing | |
US7702639B2 (en) | System, method, software architecture, and business model for an intelligent object based information technology platform | |
Hartley et al. | The BioImage archive–building a home for life-sciences microscopy data | |
Gardner et al. | Common data model for neuroscience data and data model exchange | |
Smith et al. | Biomedical imaging ontologies: A survey and proposal for future work | |
Taylor et al. | Bringing chemical data onto the semantic web | |
Ara et al. | Metabolonote: a wiki-based database for managing hierarchical metadata of metabolome analyses | |
Bugacov et al. | Experiences with DERIVA: An asset management platform for accelerating eScience | |
Hastings et al. | A grid-based image archival and analysis system | |
Spasić et al. | MeMo: a hybrid SQL/XML approach to metabolomic data management for functional genomics | |
Schuler et al. | Chisel: a user-oriented framework for simplifing database evolution | |
Willighagen et al. | Beautifying data in the real world | |
Venkatesh et al. | Integromics: challenges in data integration | |
Sernadela et al. | A nanopublishing architecture for biomedical data | |
Hartley et al. | The BioImage Archive-home of life-sciences microscopy data | |
Crichton et al. | A Distributed Information Services Architecture to Support Biomarker Discovery in Early Detection of Cancer. | |
Dunlay et al. | Overview of informatics for high content screening | |
Prodanov | Data ontology and an information system realization for web-based management of image measurements | |
Swedlow | The Open Microscopy Environment: A collaborative data modeling and software development project for biological image informatics | |
Mihaylov et al. | An approach for semantic data integration in cancer studies | |
Lyon et al. | eBank UK: linking research data, scholarly communication and learning | |
Curcin et al. | It service infrastructure for integrative systems biology | |
Kawano | Glycobiology meets the semantic web | |
Nuzzo et al. | Phenotypic and genotypic data integration and exploration through a web-service architecture | |
Cuellar et al. | Efficient data management infrastructure for the integration of imaging and omics data in life science research |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SD SE SG SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
WWE | Wipo information: entry into national phase | Ref document number: 2003746705 Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 2003226053 Country of ref document: AU |
WWE | Wipo information: entry into national phase | Ref document number: 3588/DELNP/2004 Country of ref document: IN |
WWP | Wipo information: published in national office | Ref document number: 2003746705 Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 2501114 Country of ref document: CA |
NENP | Non-entry into the national phase | Ref country code: JP |
WWW | Wipo information: withdrawn in national office | Ref document number: JP |