WO2006002505A1 - Very large dataset representation system and method - Google Patents

Publication number
WO2006002505A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
subplans
subplan
delegation
dataset
Prior art date
Application number
PCT/CA2004/000973
Other languages
French (fr)
Inventor
Marc Desbiens
Original Assignee
Cognos Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cognos Incorporated
Priority to PCT/CA2004/000973
Priority to EP04737912A (EP1782271A1)
Publication of WO2006002505A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/10: Office automation; Time management

Definitions

  • the present invention relates generally to electronic databases, and more particularly to the dimensional modelling of a very large dataset.
  • a data warehouse contains collections of related data known as datasets.
  • When these datasets are relatively small, such as when a data warehouse has been recently implemented, users can easily access and work with complete datasets directly on their personal computer systems.
  • the present invention is directed to a very large dataset representation system and method.
  • the system includes a delegation modelling object and a subplan manager for filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object.
  • the delegation modelling object includes a master dataset definition, one or more than one data dimension-to-user mapping, a target organization definition defining relationships between the master dataset definition and the data dimension-to-user mappings, and a subplan definition derived from each data dimension-to-user mapping.
  • the system further includes an organizational hierarchy description of the data dimension-to-user mappings provided by an organization modelling object having one or more than one data dimension reference, one or more than one user identifier defining intended recipients, and a mapping between each data dimension reference and one or more than one user identifier.
  • the system further includes a consolidator for, upon completion of user interaction, extracting data from each delegated subplan not found in its superior subplans and returning that extracted data to its original dataset.
  • the method includes the steps of constructing a delegation modelling object, filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object, and executing the delegation modelling object to extract and generate subplans.
  • the delegation modelling object is constructed by defining a master dataset, mapping each data dimension to one or more than one user identifier, defining relationships between the master dataset and the data dimension-to-user mappings, and deriving a subplan definition from each data dimension-to-user mapping.
  • the method further includes the step of describing an organization hierarchy of the data dimension-to-user mappings by constructing an organization modelling object through referencing one or more than one data dimension, defining intended recipients with one or more than one user identifier, and mapping each data dimension reference to one or more than one user identifier.
  • the method further includes the step of consolidating data from the delegated subplans upon completion of user interaction by extracting data from each delegated subplan not found in its superior subplans, and returning the extracted data to its original dataset.
  • the system provides the ability to delegate from data sources directly, and to directly create data source plans, thereby providing a manageable solution for queries that generate very large datasets, datasets that have heretofore proved difficult to manage.
  • the system further enables a plan manager to update and maintain a data warehouse application in a consistent manner.
  • FIG. 1 is an overview of a very large dataset representation system in accordance with an embodiment of the present invention
  • FIG. 2 shows the system including an organization modelling object
  • FIG. 3 shows the system including an organization modelling object, consolidator, and background server process
  • FIG. 4 is an overview of a very large dataset representation method in accordance with an embodiment of the present invention
  • FIG. 5 shows the method including the step of constructing an organization modelling object
  • FIG. 6 shows the method including the step of constructing an organization modelling object, providing a background server process, and consolidating data from delegated subplans;
  • FIG. 7 illustrates region dimensions;
  • FIG. 8 illustrates an organization modelling object with defined associations
  • FIG. 9 illustrates subplan definitions for an organization modelling object
  • FIG. 10 illustrates a budget plan
  • FIG. 11 illustrates the filtering of subplans in a subplan manager.
  • Embodiments of the present invention are directed to a very large dataset representation system 10 and method 100.
  • the system 10 includes a delegation modelling object 12 and a subplan manager 14 for filtering data from subplan definitions 22 in accordance with a predetermined data size limitation in advance of executing the delegation modelling object 12.
  • the delegation modelling object 12 includes a master dataset definition 16, one or more than one data dimension-to-user mapping 18, a target organization definition 20 defining relationships between the master dataset definition 16 and the data dimension-to-user mappings 18, and a subplan definition 22 derived from each data dimension-to-user mapping 18.
  • the data dimension-to-user mappings 18 are provided by an organization modelling object 24 having one or more than one data dimension reference 26, one or more than one user identifier 28 defining intended recipients, and a mapping between each data dimension reference and one or more than one user identifier 18, as illustrated in FIG. 2.
  • the system 10 further includes a consolidator 30 for, upon completion of user interaction, extracting data from each delegated subplan 22a not found in its superior subplans 22a and returning that extracted data to its original dataset, as illustrated in FIG. 3.
  • the method 100 includes the steps of constructing a delegation modelling object 102, filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object 104, and executing the delegation modelling object to extract and generate subplans 106.
  • the delegation modelling object is constructed by defining a master dataset 108, mapping each data dimension to one or more than one user identifier 110, defining relationships between the master dataset and the data dimension-to-user mappings 112, and deriving a subplan definition from each data dimension-to-user mapping 114.
  • the step of mapping each data dimension to one or more user identifiers 110 is provided by the step of constructing an organization modelling object 116 by referencing one or more than one data dimension 118, defining intended recipients with one or more than one user identifier 120, and mapping each data dimension reference to one or more than one user identifier 122, as illustrated in FIG. 5.
  • the method 100 further includes the step of consolidating data from the delegated subplans upon completion of user interaction 124 by extracting data from each delegated subplan not found in its superior subplans 126, and returning the extracted data to its original dataset 128, as illustrated in FIG. 6.
  • a "cube" as defined herein is a data-modelling object created either manually or automatically from data sources by a planning modeller.
  • the term cube is often used in the art to describe, in a tangible manner, a conceptual understanding of multi-dimensional data structures, whereby data values can be perceived as being stored in the cells of a multi-dimensional array.
  • a "plan" as defined herein is a guide to providing a "snapshot" of a cube and is created by a database modeller and delivered to the manager of a plan. Unlike cubes, plan dimensions are not modifiable by users. By intention, only plan owners or managers can modify plans.
  • a "subplan" 22a as defined herein is a read-only portion of a plan distributed to user classes based upon a specified organization. Subplans 22a are generated by a delegation process that will be defined below.
  • a "proposal" 36 as defined herein is a modifiable version of a subplan definition 22 created by a subplan owner to aid in a planning process.
  • An "organization" 24 as defined herein is a first-class business-modelling object that defines the relationship between dimensional data and user/role identifiers as defined by a business application's security model.
  • An organization modelling object 24 defines the contents of each in a series of subplans and their hierarchical relationship to one another, defines the contents of each subset of data to be extracted, and associates each data subset with a user who will receive and manage that data subset.
  • a "delegation" 12 as defined herein is a first-class business-modelling object that associates a dataset with an organization modelling object 24, and manages the workflow and scheduling around the delivery of subsets of data.
  • Delegation modelling objects 12 provide a formal definition of this process by defining a master dataset and associating the organization hierarchy by which specific datasets or subplans 22a will be generated from the master dataset.
  • a delegation modelling object 12 automates the creation and delivery of subplans 22a and keeps track of changes to subplans 22a over time.
  • a delegation modelling object 12 also provides control to shutdown, as well as clean up an entire delegation process.
  • Delegation modelling objects 12 are described in detail in Applicant's co-pending United States application for patent, titled “Delegation Modelling Object as a First-Class Business Modelling Object, and Method and System for Providing Same” filed February 19, 2003, the teachings of which are hereby incorporated by reference in their entirety.
  • a “dataset” as defined herein is a set of related source data to be used by a delegation modelling object 12 in data extraction and consolidation processes.
  • a dataset should therefore contain elements of the dimensionality referenced in an organization modelling object 24.
  • a single dataset can be the source of more than one distinct delegation 12.
  • Subplan filtering as defined herein describes a process by which each subplan definition 22 is filtered for distribution down to a maximum data size, while respecting the hierarchy as defined by its organization modelling object 24 in order that each user can work with that data on their individual computer systems.
  • Very large dataset delegations 12 are reusable definitions that provide data extraction methods based on business organizational rules, workflow management, and subplan filtering while respecting the organizational integrity defined by the organization 24, as well as consolidation back into an original dataset.
  • a delegation modelling object 12 contains a reference to an organization modelling object 24 in order to define how a master dataset is to be broken out and delivered.
  • a delegation 12 provides a relationship between dimensional data and management roles provided by a data dimension-to-user mapping 18 in order to establish areas of responsibility.
  • each generated subset of data represents the subplan 22a of a larger data subplan 22a generated at a higher level of a management hierarchy.
  • the hierarchy of those subplans 22a is defined in an organization modelling object 24, and since the top-level plan in a so-called "regular" delegation contains the entire dataset, no consolidation back to the original dataset is required.
  • each subplan 22a delegated to a user will roll back up the chain of delegated subplans 22a to a "top-level" plan.
  • because each higher-level or "superior" subplan 22a will contain all the data from each of its subordinate subplans 22a, higher-level subplans 22a will become increasingly large, with high-level subplans 22a in larger organizations ultimately becoming unmanageable.
  • the very large dataset representation system 10 enables a plan manager to define, based upon an organization modelling object 24, a delegation modelling object 12 for a very large dataset. This creates a very large dataset delegation 12 of multiple subplans 22a that can then be individually filtered for specific size restrictions.
  • the system 10 enables a plan manager to filter the definition 22 of each subplan prior to the execution of the delegation modelling object 12 precluding any need for higher-level subplans 22a to contain all the data contained in their subordinate subplans 22a.
  • a subsequent consolidator 30 process will then extract data not found at higher levels from each delegated subplan 22a, and return that data to its original dataset.
  • the very large dataset representation system 10 associates or "maps" an organization's 24 hierarchal structure to an external source of data such as a data warehouse in order to define a set of related subplans 22a.
  • data is extracted directly from an external source and delivered to the computer systems of individual users, with each subplan 22a generated on an individual basis having been filtered in accordance with data size limits.
  • ABC Co. has a budget-related dataset in its data warehouse that it wishes to distribute to each of ABC Co.'s regional managers.
  • This budget dataset contains a master dimension 26 that includes the category dimensions 26 "Account Measures", "Territories", "Vendor Segments" and "Years".
  • in addition to the category dimensions 26, the budget dataset further contains the region dimensions 26 "United States", "Brazil" and "Canada", all subordinated to an "Americas" region dimension 26.
  • the budget dataset also contains a measures dimension 26 as illustrated in TABLE 1.
  • a budget manager for ABC Co. would advantageously create a new organization modelling object 24 that would better define these associations.
  • This newly created organization modelling object 24 defines the four subplan definitions 22 illustrated in FIG. 9. If delegated for a large organization, it can be seen by one of skill in the art that the provided organization modelling object 24 would likely define a hierarchy of subplans 22a that would all easily exceed the maximum subplan 22a data size for each user, based on current computing capacity common in most organizations at the user level.
  • the system 10 can be leveraged to distribute and subsequently consolidate the ABC Co. budget.
  • the "Budget Plan" illustrated in FIG. 10 has been pre-filtered to contain only a summary of each region. If executed, the subplan definitions 22 shown with "not available" icons 32 would all have exceeded the maximum subplan 22a data size.
  • a plan manager is able to define a "deliverable" subplan 22a for each user.
  • the use of a very large dataset in combination with the delegation process has allowed the plan manager to create an organization modelling object 24 based on the region dimension 26, and subsequently assign different region members 26 to each district user class 28, or area of responsibility.
  • the plan manager is then able to create a delegation modelling object 12 for that budget plan, and using the subplan manager 14 edit each subplan definition 22 in that delegation modelling object 12 by selecting only those measures they feel necessary in order to meet the size restrictions of a particular application, as illustrated in FIG. 11.
  • the delegation modelling object 12 is then executed in order to extract and generate deliverable subplans 22a to each designated user.
  • a subsequent consolidation process 30 then reintegrates all data subsets 22a directly back into their original source dataset.
  • Each data subset 22a generated by a very large dataset delegation 12 is a part of that delegation's workflow.
  • Modified subplans 22a are returned "up" an organization's management chain, where managers can then accept or reject subordinate subplans 22a returned to them by subordinate users 28.
  • the management workflow process then culminates in the reconstitution of all accepted subplans 22a back into their original dataset.
  • Data from each subplan 22a not found in its respective higher level or "superior" subplan 22a is extracted and consolidated directly back into the original source dataset. In this manner, the manager of a plan can have firm control over which subplans 22a will, and which subplans 22a will not be used for a specific process.
  • Subplan definitions 22 can also be updated whenever a data warehouse reporting system is likewise updated. If so desired, the system 10 can be independent of a delegation process 12, enabling a plan manager to initiate an update of a data warehouse reporting system at any point along the process.
  • the system 10 can further include a background server process 34 for improved performance when generating a large number of datasource-based subplans 22a from a plan delegation process 12, as illustrated in FIG. 3.
  • the method 100 can further include the step of providing a background server process 130 for improved performance when generating a large number of datasource-based subplans from a plan delegation process, as illustrated in FIG. 6.
  • the system 10 initiates a process by which very large sets of data, typically those datasets greater than about five million cells, external to a data warehouse solution can be imported into that process as a much more manageable set of related planning data subsets 22a.
  • system 10 provides the ability to delegate directly from data sources, and to directly create data source plans, thereby providing a manageable solution for queries that generate very large datasets, datasets that have heretofore proved difficult to manage.
  • the system 10 further enables a plan manager to update and maintain a data warehouse application in a consistent manner.
  • the system 10 enables the smooth extraction, management, and consolidation of very large datasets.
  • Any hardware, software or a combination of hardware and software having the above-described functions may implement the very large dataset representation system 10 and method 100 according to the present invention, and methods described above.
  • the software code either in its entirety or a part thereof, may be in the form of a computer program product such as a computer-readable memory having the model and/or method stored therein.
  • a computer data signal representation of that software code may be embedded in a carrier wave for transmission via communications network infrastructure.
  • Such a computer program product and a computer data signal are also within the scope of the present invention, as well as the hardware, software and combination thereof.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A system and method for representing a very large dataset that enables a plan manager to define, based upon an organization modelling object, a delegation modelling object for a very large dataset. A very large dataset delegation of multiple subplans is created whose subplans can then be individually filtered for specific size restrictions. This enables a plan manager to filter the definition of each subplan prior to the execution of the delegation modelling object, precluding any need for higher-level subplans to contain all the data contained in their subordinate subplans. This allows subplans to contain increased levels of detail not included in their superior subplans, detail that will instead only be summarized in higher-level subplans. A subsequent consolidation process will then extract data not found at higher levels from each delegated subplan, and return that data to its original dataset.

Description

VERY LARGE DATASET REPRESENTATION SYSTEM AND METHOD
Field of the Invention
[0001] The present invention relates generally to electronic databases, and more particularly to the dimensional modelling of a very large dataset.
Background of the Invention
[0002] With advances in contemporary business information systems, all levels of an organization can now enjoy access to repositories of business data known as data warehouses. Data warehousing techniques enable businesses to eliminate extensive amounts of unnecessary workload generated by multiple redundant reporting tasks, and can further facilitate the standardization of data throughout an organization. Business planning applications such as budgeting and forecasting systems are increasingly being integrated into advanced data warehousing solutions in order to maximize returns on what have often been considerable investments in both computing facilities and the collections of data they contain.
[0003] A data warehouse contains collections of related data known as datasets. When these datasets are relatively small, such as when a data warehouse has been recently implemented, users can easily access and work with complete datasets directly on their personal computer systems. However, difficulties arise when datasets get larger. Datasets can eventually grow within a data warehouse facility to contain billions upon billions of individual data values, many times larger than can be handled by the computational capacity of any single user's computer system.
[0004] In order to provide a workable solution for handling these very large datasets, prior art methods have been employed to extract and deliver subsets of these larger datasets to designated users. This has required close management of the size of each data subset to ensure that users receiving these data subsets can consistently access them given the computational limitations of their individual computer systems, limitations such as calculation size limits, fixed memory limitations, and other hard limits. Upon completion of user interaction in these prior art methods, all data subsets must be returned to their "superior" datasets within the data warehouse through a process known as consolidation.
[0005] The problem with these prior art methods has been that they employ manual techniques or scripts that must be manually run and maintained in order to extract the data subsets. The consolidation process has also been a mostly manual process of running database-specific scripts. In addition, the administrator responsible for creating and executing the extraction scripts must also keep track of what data has been delivered to which user.
[0006] The result has been that prior art data warehouse extraction and consolidation methods are highly time-consuming to define, execute and maintain for very large datasets. Furthermore, the delivery of data subsets to designated users lacks integrated tracking, and is often independent of, and therefore outside the control of the organizational security structure employed by the querying application. Therefore, what is needed is a more manageable data model for supporting very large datasets.
[0007] For the foregoing reasons, there is a need for an improved modelling system and method for handling data queries that generate very large datasets.
Summary of the Invention
[0008] The present invention is directed to a very large dataset representation system and method. The system includes a delegation modelling object and a subplan manager for filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object. The delegation modelling object includes a master dataset definition, one or more than one data dimension-to-user mapping, a target organization definition defining relationships between the master dataset definition and the data dimension-to-user mappings, and a subplan definition derived from each data dimension-to-user mapping.
[0009] In an aspect of the present invention, the system further includes an organizational hierarchy description of the data dimension-to-user mappings provided by an organization modelling object having one or more than one data dimension reference, one or more than one user identifier defining intended recipients, and a mapping between each data dimension reference and one or more than one user identifier.
[0010] In an aspect of the present invention, the system further includes a consolidator for, upon completion of user interaction, extracting data from each delegated subplan not found in its superior subplans and returning that extracted data to its original dataset.
[0011] The method includes the steps of constructing a delegation modelling object, filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object, and executing the delegation modelling object to extract and generate subplans.
[0012] The delegation modelling object is constructed by defining a master dataset, mapping each data dimension to one or more than one user identifier, defining relationships between the master dataset and the data dimension-to-user mappings, and deriving a subplan definition from each data dimension-to-user mapping.
[0013] In an aspect of the present invention, the method further includes the step of describing an organization hierarchy of the data dimension-to-user mappings by constructing an organization modelling object through referencing one or more than one data dimension, defining intended recipients with one or more than one user identifier, and mapping each data dimension reference to one or more than one user identifier.
[0014] In an aspect of the present invention, the method further includes the step of consolidating data from the delegated subplans upon completion of user interaction by extracting data from each delegated subplan not found in its superior subplans, and returning the extracted data to its original dataset.
[0015] The system provides the ability to delegate from data sources directly, and to directly create data source plans, thereby providing a manageable solution for queries that generate very large datasets, datasets that have heretofore proved difficult to manage. The system further enables a plan manager to update and maintain a data warehouse application in a consistent manner.
[0016] By providing a highly scalable system of subplans, each within the computational limits of existing computer systems but that are in combination capable of representing a planning problem of virtually any size, the system enables the smooth extraction, management, and consolidation of very large datasets.
[0017] Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.
Brief Description of the Drawings
These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
FIG. 1 is an overview of a very large dataset representation system in accordance with an embodiment of the present invention;
FIG. 2 shows the system including an organization modelling object;
FIG. 3 shows the system including an organization modelling object, consolidator, and background server process;
FIG. 4 is an overview of a very large dataset representation method in accordance with an embodiment of the present invention;
FIG. 5 shows the method including the step of constructing an organization modelling object;
FIG. 6 shows the method including the step of constructing an organization modelling object, providing a background server process, and consolidating data from delegated subplans;
FIG. 7 illustrates region dimensions;
FIG. 8 illustrates an organization modelling object with defined associations;
FIG. 9 illustrates subplan definitions for an organization modelling object;
FIG. 10 illustrates a budget plan; and
FIG. 11 illustrates the filtering of subplans in a subplan manager.
Detailed Description of the Presently Preferred Embodiment
[0018] Embodiments of the present invention are directed to a very large dataset representation system 10 and method 100. As illustrated in FIG. 1, the system 10 includes a delegation modelling object 12 and a subplan manager 14 for filtering data from subplan definitions 22 in accordance with a predetermined data size limitation in advance of executing the delegation modelling object 12. The delegation modelling object 12 includes a master dataset definition 16, one or more than one data dimension-to-user mapping 18, a target organization definition 20 defining relationships between the master dataset definition 16 and the data dimension-to-user mappings 18, and a subplan definition 22 derived from each data dimension-to-user mapping 18.
[0019] In an embodiment of the present invention, the data dimension-to-user mappings 18 are provided by an organization modelling object 24 having one or more than one data dimension reference 26, one or more than one user identifier 28 defining intended recipients, and a mapping between each data dimension reference and one or more than one user identifier 18, as illustrated in FIG. 2.
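The relationships among elements 12 through 28 can be pictured as a small set of record types. The following Python sketch is illustrative only and is not taken from the specification: the class names, field names, the `derive_subplan_definitions` helper, and the user identifiers are all assumptions chosen to mirror the reference numerals above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DimensionUserMapping:
    """One data dimension-to-user mapping (18): a dimension reference (26)
    paired with the user identifiers (28) of its intended recipients."""
    dimension_ref: str
    user_ids: tuple[str, ...]


@dataclass
class OrganizationModellingObject:
    """Organization modelling object (24) holding the mappings."""
    mappings: list[DimensionUserMapping]

    @property
    def user_ids(self) -> set[str]:
        # Every user named by at least one mapping.
        return {user for m in self.mappings for user in m.user_ids}


@dataclass
class SubplanDefinition:
    """Subplan definition (22), derived from one mapping (18)."""
    dimension_ref: str
    user_ids: tuple[str, ...]
    measures: list[str]


@dataclass
class DelegationModellingObject:
    """Delegation modelling object (12): master dataset definition (16),
    target organization (20), and derived subplan definitions (22)."""
    master_dataset: str
    organization: OrganizationModellingObject
    subplan_definitions: list[SubplanDefinition] = field(default_factory=list)

    def derive_subplan_definitions(self, measures: list[str]) -> None:
        # Hypothetical helper: one subplan definition per mapping.
        self.subplan_definitions = [
            SubplanDefinition(m.dimension_ref, m.user_ids, list(measures))
            for m in self.organization.mappings
        ]


# Hypothetical usage with made-up user identifiers.
org = OrganizationModellingObject([
    DimensionUserMapping("Americas", ("vp_americas",)),
    DimensionUserMapping("Canada", ("mgr_canada", "analyst_canada")),
])
delegation = DelegationModellingObject("budget", org)
delegation.derive_subplan_definitions(["Revenue", "Cost"])
```

Each subplan definition starts with its own copy of the measure list, so that a later filtering pass could trim one definition without affecting the others.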
[0020] In an embodiment of the present invention, the system 10 further includes a consolidator 30 for, upon completion of user interaction, extracting data from each delegated subplan 22a not found in its superior subplans 22a and returning that extracted data to its original dataset, as illustrated in FIG. 3.
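One way to read the consolidator 30 is as a set difference taken along the chain of superior subplans. The sketch below assumes, purely for illustration, that a subplan is a dictionary of cells keyed by a (region, measure) pair and that each subplan holds a reference to its immediate superior; none of these names or representations come from the specification.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical cell key: (region member, measure name).
Cell = tuple[str, str]


@dataclass
class Subplan:
    name: str
    cells: dict[Cell, float]
    superior: Optional[Subplan] = None


def superior_keys(subplan: Subplan) -> set[Cell]:
    """Collect the cell keys held by every subplan above this one."""
    keys: set[Cell] = set()
    node = subplan.superior
    while node is not None:
        keys.update(node.cells)
        node = node.superior
    return keys


def consolidate(subplans: list[Subplan], dataset: dict[Cell, float]) -> dict[Cell, float]:
    """Write back only the cells a subplan holds that its superiors do not."""
    for subplan in subplans:
        covered = superior_keys(subplan)
        for key, value in subplan.cells.items():
            if key not in covered:
                dataset[key] = value
    return dataset


# Hypothetical usage: the top-level plan holds only a summary cell.
top = Subplan("Americas", {("Americas", "Total"): 100.0})
canada = Subplan(
    "Canada",
    {("Americas", "Total"): 999.0, ("Canada", "Rent"): 40.0},
    superior=top,
)
original = consolidate([top, canada], {})
```

In this sketch the summary cell is taken from the top-level plan, and the subordinate contributes only the detail cell that no superior holds.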
[0021] As illustrated in FIG. 4, the method 100 includes the steps of constructing a delegation modelling object 102, filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object 104, and executing the delegation modelling object to extract and generate subplans 106.
[0022] The delegation modelling object is constructed by defining a master dataset 108, mapping each data dimension to one or more than one user identifier 110, defining relationships between the master dataset and the data dimension-to-user mappings 112, and deriving a subplan definition from each data dimension-to-user mapping 114.
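The ordering of steps 102, 104 and 106 can be sketched in Python as three functions applied in sequence. This is a minimal sketch under stated assumptions: the function names, the dictionary layout, and the sample figures are all hypothetical, and each subplan is assumed to cover a single dimension member, so its cell count equals its number of measures.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (dimension member, measure)


def construct_delegation(mappings: Dict[str, List[str]],
                         measures: List[str]) -> Dict[str, dict]:
    """Step 102: derive one subplan definition per dimension-to-user mapping."""
    return {
        member: {"users": list(users), "measures": list(measures)}
        for member, users in mappings.items()
    }


def filter_definitions(definitions: Dict[str, dict],
                       keep: Dict[str, List[str]],
                       max_cells: int) -> None:
    """Step 104: narrow each definition before execution and check the size limit."""
    for member, definition in definitions.items():
        wanted = keep.get(member, definition["measures"])
        definition["measures"] = [m for m in definition["measures"] if m in wanted]
        # One member per subplan, so the cell count is the number of measures.
        if len(definition["measures"]) > max_cells:
            raise ValueError(f"subplan for {member} still exceeds {max_cells} cells")


def execute_delegation(definitions: Dict[str, dict],
                       master: Dict[Cell, float]) -> Dict[str, Dict[Cell, float]]:
    """Step 106: extract and generate one subplan per filtered definition."""
    return {
        member: {
            (member, measure): master[(member, measure)]
            for measure in definition["measures"]
        }
        for member, definition in definitions.items()
    }


# Illustrative master dataset; the values are invented.
master = {
    ("Canada", "Revenue"): 120.0,
    ("Canada", "Net Income"): 30.0,
    ("Canada", "Operating Cost"): 70.0,
    ("Brazil", "Revenue"): 90.0,
    ("Brazil", "Net Income"): 20.0,
    ("Brazil", "Operating Cost"): 55.0,
}
definitions = construct_delegation(
    {"Canada": ["district4"], "Brazil": ["district3"]},
    ["Revenue", "Net Income", "Operating Cost"],
)
filter_definitions(
    definitions,
    {"Canada": ["Revenue", "Net Income"], "Brazil": ["Revenue"]},
    max_cells=2,
)
subplans = execute_delegation(definitions, master)
```

Filtering operates on the definitions rather than on extracted data, so nothing oversized is ever generated.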
[0023] In an embodiment of the present invention, the step of mapping each data dimension to one or more user identifiers 110 is provided by the step of constructing an organization modelling object 116 by referencing one or more than one data dimension 118, defining intended recipients with one or more than one user identifier 120, and mapping each data dimension reference to one or more than one user identifier 122, as illustrated in FIG. 5.
[0024] In an embodiment of the present invention, the method 100 further includes the step of consolidating data from the delegated subplans upon completion of user interaction 124 by extracting data from each delegated subplan not found in its superior subplans 126, and returning the extracted data to its original dataset 128, as illustrated in FIG. 6.
[0025] However, before further description of detailed embodiments of the present invention is provided and explained, the following glossary of terms is provided in order to aid in understanding the various elements associated with the present invention.
Glossary
[0026] A "cube" as defined herein is a data-modelling object created either manually or automatically from data sources by a planning modeller. The term cube is often used in the art to describe, in a tangible manner, a conceptual understanding of multi-dimensional data structures, whereby data values can be perceived as being stored in the cells of a multi-dimensional array.
[0027] A "plan" as defined herein is a guide to providing a "snapshot" of a cube and is created by a database modeller and delivered to the manager of a plan. Unlike cubes, plan dimensions are not modifiable by users. By intention, only plan owners or managers can modify plans.
[0028] A "subplan" 22a as defined herein is a read-only portion of a plan distributed to user classes based upon a specified organization. Subplans 22a are generated by a delegation process that will be defined below.
[0029] A "proposal" 36 as defined herein is a modifiable version of a subplan definition 22 created by a subplan owner to aid in a planning process.
[0030] An "organization" 24 as defined herein is a first-class business-modelling object that defines the relationship between dimensional data and user/role identifiers as defined by a business application's security model. An organization modelling object 24 defines the contents of each in a series of subplans and their hierarchical relationship to one another, defines the contents of each subset of data to be extracted, and associates each data subset with a user who will receive and manage that data subset.
[0031] Organizations and their embodiment in organization modelling objects 24 are the subject of the Applicant's co-pending United States application for patent titled "Organization Modelling Object as a First-Class Business Modelling Object, and Method and System for Providing Same" filed February 19, 2003, the teachings of which are hereby incorporated by reference in their entirety.
[0032] A "delegation" 12 as defined herein is a first-class business-modelling object that associates a dataset with an organization modelling object 24, and manages the workflow and scheduling around the delivery of subsets of data. Delegation modelling objects 12 provide a formal definition of this process by defining a master dataset and associating the organization hierarchy by which specific datasets or subplans 22a will be generated from the master dataset. A delegation modelling object 12 automates the creation and delivery of subplans 22a and keeps track of changes to subplans 22a over time. A delegation modelling object 12 also provides control to shut down, as well as clean up, an entire delegation process.
[0033] Delegation modelling objects 12 are described in detail in Applicant's co-pending United States application for patent, titled "Delegation Modelling Object as a First-Class Business Modelling Object, and Method and System for Providing Same" filed February 19, 2003, the teachings of which are hereby incorporated by reference in their entirety.
[0034] A "dataset" as defined herein is a set of related source data to be used by a delegation modelling object 12 in data extraction and consolidation processes. A dataset should therefore contain elements of the dimensionality referenced in an organization modelling object 24. Furthermore, a single dataset can be the source of more than one distinct delegation 12.
[0035] "Subplan filtering" as defined herein describes a process by which each subplan definition 22 is filtered for distribution down to a maximum data size, while respecting the hierarchy as defined by its organization modelling object 24 in order that each user can work with that data on their individual computer systems.
[0036] "Consolidation" as defined herein describes a process for the reintegration of all data subsets 22a back into their original dataset.
[0037] Returning now to a detailed description of the present invention, the delegation of a very large dataset is a process by which the extractions of data from a data warehouse are described by an organization 24, and managed by a delegation 12. Very large dataset delegations 12 are reusable definitions that provide data extraction methods based on business organizational rules, workflow management, and subplan filtering while respecting the organizational integrity defined by the organization 24, as well as consolidation back into an original dataset. A delegation modelling object 12 contains a reference to an organization modelling object 24 in order to define how a master dataset is to be broken out and delivered. A delegation 12 provides a relationship between dimensional data and management roles provided by a data dimension-to-user mapping 18 in order to establish areas of responsibility.
[0038] As opposed to a very large dataset delegation 12 in accordance with the present invention, in a "regular" delegation as embodied and described in Applicant's aforementioned co-pending United States application for patent titled "Delegation Modelling Object as a First-Class Business Modelling Object, and Method and System for Providing Same", each generated subset of data represents the subplan 22a of a larger data subplan 22a generated at a higher level of a management hierarchy. The hierarchy of those subplans 22a is defined in an organization modelling object 24, and since the top-level plan in a so-called "regular" delegation contains the entire dataset, no consolidation back to the original dataset is required.
[0039] Therefore, in a planning area consolidation process for example, each subplan 22a delegated to a user will roll back up the chain of delegated subplans 22a to a "top-level" plan. However, since each higher-level or "superior" subplan 22a will contain all the data from each of its subordinate subplans 22a, higher-level subplans 22a will become increasingly large, with high-level subplans 22a in larger organizations ultimately becoming unmanageable.
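The growth described above can be made concrete with a small Python sketch. It assumes a "regular" delegation in which every superior subplan holds its own cells plus all cells of its subordinates; the function name and the cell counts are invented purely for illustration.

```python
from typing import Dict, List


def regular_subplan_sizes(children: Dict[str, List[str]],
                          own_cells: Dict[str, int]) -> Dict[str, int]:
    """Size of each subplan when a superior contains all subordinate data."""
    sizes: Dict[str, int] = {}

    def size(node: str) -> int:
        if node not in sizes:
            sizes[node] = own_cells[node] + sum(size(c) for c in children.get(node, []))
        return sizes[node]

    for node in own_cells:
        size(node)
    return sizes


# Hypothetical hierarchy and cell counts.
children = {"Americas": ["United States", "Brazil", "Canada"]}
own_cells = {
    "Americas": 0,               # cells held only at this level
    "United States": 2_000_000,
    "Brazil": 2_000_000,
    "Canada": 2_000_000,
}
sizes = regular_subplan_sizes(children, own_cells)
```

With these assumed figures each regional subplan stays at two million cells, while the "Americas" subplan reaches six million, which is the kind of accumulation the paragraph above describes.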
[0040] The very large dataset representation system 10 enables a plan manager to define, based upon an organization modelling object 24, a delegation modelling object 12 for a very large dataset. This creates a very large dataset delegation 12 of multiple subplans 22a that can then be individually filtered for specific size restrictions. The system 10 enables a plan manager to filter the definition 22 of each subplan prior to the execution of the delegation modelling object 12, precluding any need for higher-level subplans 22a to contain all the data contained in their subordinate subplans 22a. This allows subplans 22a to contain increased levels of detail not included in their superior subplans 22a, detail that will instead only be summarized in higher-level subplans 22a. A subsequent consolidator 30 process will then extract data not found at higher levels from each delegated subplan 22a, and return that data to its original dataset.
[0041] Much like the previously defined regular delegation modelling object that binds a dataset to an organization modelling object 24 in order to define a set of related data subsets, the very large dataset representation system 10 associates or "maps" an organization's 24 hierarchical structure to an external source of data such as a data warehouse in order to define a set of related subplans 22a. When a very large dataset delegation 12 is run, data is extracted directly from an external source and delivered to the computer systems of individual users, with each subplan 22a generated on an individual basis having been filtered in accordance with data size limits.
[0042] An example of the use of an embodiment of the very large dataset representation system 10 is illustrated in the following discussion and accompanying figures. In this example, ABC Co. has a budget-related dataset in its data warehouse that it wishes to distribute to each of ABC Co.'s regional managers. This budget dataset contains a master dimension 26 that includes the category dimensions 26 "Account Measures", "Territories", "Vendor Segments" and "Years".
[0043] As illustrated in FIG. 7, in addition to the category dimensions 26 the budget dataset further contains the region dimensions 26, "United States", "Brazil" and "Canada", all subordinated to an "Americas" region dimension 26. The budget dataset also contains a measures dimension 26 as illustrated in TABLE 1.
Table 1: Measures Dimension
Revenue
Net Income
Gross Margin
Gross Profit
Break Even
Net Margin
Return on Assets
Current Ratio
Debt/Asset
Cost of Goods Sold
Operating Cost
Total Operating Expenses
[0044] As well, ABC Co. has defined the management roles 28 illustrated in TABLE 2.
Table 2: Management Roles
District 1 Manager: Responsible for all of the "Americas" regions
District 2 Manager: Responsible for the "United States" region, and reporting to "District 1" manager
District 3 Manager: Responsible for the "Brazil" region, and reporting to "District 1" manager
District 4 Manager: Responsible for the "Canada" region, and reporting to "District 1" manager
[0045] Therefore as illustrated in FIG. 8, a budget manager for ABC Co. would advantageously create a new organization modelling object 24 that would better define these associations. This newly created organization modelling object 24 defines the four subplan definitions 22 illustrated in FIG. 9. If delegated for a large organization, it can be seen by one of skill in the art that the provided organization modelling object 24 would likely define a hierarchy of subplans 22a that would all easily exceed the maximum subplan 22a data size for each user, based on current computing capacity common in most organizations at the user level.
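The ABC Co. organization can be written down as two small lookup tables, from which the four subplan definitions follow directly. The sketch below is illustrative Python; the region hierarchy and role assignments come from the example, while the variable names, function names, and dictionary layout are assumptions.

```python
from typing import Dict, List, Optional

# Region hierarchy from the ABC Co. example: member -> superior member.
REGION_PARENT: Dict[str, Optional[str]] = {
    "Americas": None,
    "United States": "Americas",
    "Brazil": "Americas",
    "Canada": "Americas",
}

# Management roles from TABLE 2: region member -> responsible role.
REGION_ROLE: Dict[str, str] = {
    "Americas": "District 1 Manager",
    "United States": "District 2 Manager",
    "Brazil": "District 3 Manager",
    "Canada": "District 4 Manager",
}


def superiors(member: str) -> List[str]:
    """Return the chain of superior region members, nearest first."""
    chain: List[str] = []
    parent = REGION_PARENT[member]
    while parent is not None:
        chain.append(parent)
        parent = REGION_PARENT[parent]
    return chain


def subplan_definitions() -> List[dict]:
    """Derive one subplan definition per region-to-role mapping."""
    return [
        {
            "member": member,
            "owner": REGION_ROLE[member],
            "reports_to": [REGION_ROLE[s] for s in superiors(member)],
        }
        for member in REGION_PARENT
    ]


definitions = subplan_definitions()
```

The resulting list has one entry per region, with the "Americas" definition at the top of the reporting chain and the three country definitions reporting to it.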
[0046] However, the system 10 can be leveraged to distribute and subsequently consolidate the ABC Co. budget. In accordance with an embodiment of the very large dataset representation system 10, the "Budget Plan" illustrated in FIG. 10 has been pre-filtered to contain only a summary of each region. If executed, the subplan definitions 22 shown with "not available" icons 32 would all have exceeded the maximum subplan 22a data size. However, using the subplan manager 14 to filter each subplan definition 22 prior to executing the delegation 12 in accordance with the system 10, a plan manager is able to define a "deliverable" subplan 22a for each user.
[0047] Thus, the use of a very large dataset in combination with the delegation process has allowed the plan manager to create an organization modelling object 24 based on the region dimension 26, and subsequently assign different region members 26 to each district user class 28, or area of responsibility. The plan manager is then able to create a delegation modelling object 12 for that budget plan and, using the subplan manager 14, edit each subplan definition 22 in that delegation modelling object 12 by selecting only those measures they feel necessary in order to meet the size restrictions of a particular application, as illustrated in FIG. 11. The delegation modelling object 12 is then executed in order to extract and generate deliverable subplans 22a to each designated user.
[0048] While the budget plan continues to contain a summary of the regions and all of the measures, and each subplan 22a continues to contain all of its specific region members 26, only a subset of the measures data is actually provided to each user. Therefore, every generated subplan 22a in the executed delegation 12 is now smaller than the pre-determined maximum subplan 22a data size.
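The measure selection described above can be sketched as a function that narrows a subplan definition to the chosen measures and rejects the result if it still exceeds the size limit. This is a hedged Python illustration: the function names, the year members, and the toy limit of ten cells are assumptions, and the cell count is simply taken as the product of the member counts of each dimension.

```python
from math import prod
from typing import Dict, List


def cell_count(definition: Dict[str, List[str]]) -> int:
    """Cells in a subplan: product of the member counts of its dimensions."""
    return prod(len(members) for members in definition.values())


def filter_by_measures(definition: Dict[str, List[str]],
                       selected_measures: List[str],
                       max_cells: int) -> Dict[str, List[str]]:
    """Keep only the selected measures and verify the size restriction."""
    unknown = [m for m in selected_measures if m not in definition["Measures"]]
    if unknown:
        raise ValueError(f"unknown measures: {unknown}")
    filtered = dict(definition)
    filtered["Measures"] = list(selected_measures)
    if cell_count(filtered) > max_cells:
        raise ValueError(f"{cell_count(filtered)} cells exceeds limit of {max_cells}")
    return filtered


definition = {
    "Regions": ["Canada"],
    "Years": ["2003", "2004"],  # hypothetical members
    "Measures": [
        "Revenue", "Net Income", "Gross Margin", "Gross Profit",
        "Break Even", "Net Margin", "Return on Assets", "Current Ratio",
        "Debt/Asset", "Cost of Goods Sold", "Operating Cost",
        "Total Operating Expenses",
    ],
}
# Toy limit chosen so the unfiltered definition (24 cells) is too large.
filtered = filter_by_measures(definition, definition["Measures"][:4], max_cells=10)
```

The region members are left untouched, so the filtered subplan still covers its whole area of responsibility with fewer measures.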
[0049] A subsequent consolidation process 30 then reintegrates all data subsets 22a directly back into their original source dataset. Each data subset 22a generated by a very large dataset delegation 12 is a part of that delegation's workflow. Modified subplans 22a are returned "up" an organization's management chain, where managers can then accept or reject subordinate subplans 22a returned to them by subordinate users 28. The management workflow process then culminates in the reconstitution of all accepted subplans 22a back into their original dataset. Data from each subplan 22a not found in its respective higher level or "superior" subplan 22a is extracted and consolidated directly back into the original source dataset. In this manner, the manager of a plan can have firm control over which subplans 22a will, and which subplans 22a will not be used for a specific process.
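The consolidation rule, in which only accepted subplans contribute and only cells absent from their superior subplans are written back, can be sketched as follows. The Python below is a minimal illustration; the function names, the cell keys, and every value are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[str, str]        # (region member, measure)
Cells = Dict[Cell, float]


def superior_chain(name: str, superior_of: Dict[str, Optional[str]]) -> List[str]:
    """Names of all superior subplans, nearest first."""
    chain: List[str] = []
    parent = superior_of.get(name)
    while parent is not None:
        chain.append(parent)
        parent = superior_of.get(parent)
    return chain


def consolidate(source: Cells,
                subplans: Dict[str, Cells],
                superior_of: Dict[str, Optional[str]],
                accepted: Set[str]) -> Cells:
    """Return the source dataset updated from accepted subplans only."""
    result = dict(source)
    for name, cells in subplans.items():
        if name not in accepted:
            continue  # rejected subplans are not reintegrated
        covered: Set[Cell] = set()
        for superior in superior_chain(name, superior_of):
            covered.update(subplans.get(superior, {}))
        for cell, value in cells.items():
            if cell not in covered:  # data not found in any superior subplan
                result[cell] = value
    return result


source = {
    ("Canada", "Revenue"): 100.0,
    ("Canada", "Operating Cost"): 40.0,
    ("Brazil", "Operating Cost"): 25.0,
}
subplans = {
    "Americas": {("Canada", "Revenue"): 100.0},
    "Canada": {("Canada", "Revenue"): 100.0, ("Canada", "Operating Cost"): 45.0},
    "Brazil": {("Brazil", "Operating Cost"): 30.0},
}
superior_of = {"Americas": None, "Canada": "Americas", "Brazil": "Americas"}
consolidated = consolidate(source, subplans, superior_of, accepted={"Americas", "Canada"})
```

In this run the Canada detail is written back because it does not appear in the "Americas" subplan, while the rejected Brazil subplan leaves the source value unchanged.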
[0050] When a very large dataset delegation 12 is run, data is extracted directly from the external data source, and each subplan 22a is subsequently generated directly, and independently of the delegation 12. Subplan definitions 22 can also be updated whenever a data warehouse reporting system is likewise updated. If so desired, the system 10 can be independent of a delegation process 12, enabling a plan manager to initiate an update of a data warehouse reporting system at any point along the process.
[0051] In an embodiment of the present invention, the system 10 can further include a background server process 34 for improved performance when generating a large number of datasource-based subplans 22a from a plan delegation process 12, as illustrated in FIG. 3. In an embodiment of the present invention, the method 100 can further include the step of providing a background server process 130 for improved performance when generating a large number of datasource-based subplans from a plan delegation process, as illustrated in FIG. 6.
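One possible way to move subplan generation off the requesting thread is sketched below in Python. It substitutes a standard-library thread pool for the background server process 34, which the specification does not describe in detail; the function names, worker count, and sample data are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (region member, measure)


def generate_subplan(member: str, master: Dict[Cell, float]) -> Dict[Cell, float]:
    """Extract the cells of the master dataset that belong to one member."""
    return {cell: value for cell, value in master.items() if cell[0] == member}


def generate_in_background(members: List[str],
                           master: Dict[Cell, float],
                           workers: int = 4) -> Dict[str, Dict[Cell, float]]:
    """Generate every subplan on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {m: pool.submit(generate_subplan, m, master) for m in members}
        return {m: future.result() for m, future in futures.items()}


# Invented sample data.
master = {
    ("Canada", "Revenue"): 120.0,
    ("Brazil", "Revenue"): 90.0,
    ("Brazil", "Net Income"): 20.0,
}
generated = generate_in_background(["Canada", "Brazil"], master)
```

A real deployment would more likely hand the work to a separate server process or job queue; the thread pool is used here only because it keeps the example self-contained.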
[0052] The system 10 initiates a process by which very large sets of data, typically those datasets greater than about five million cells, external to a data warehouse solution can be imported into that process as a much more manageable set of related planning data subsets 22a.
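To give a sense of scale for the threshold mentioned above, the short Python sketch below estimates a dataset's cell count as the product of its dimension sizes and compares it with five million. The dimension names follow the ABC Co. example, but the member counts and the names `VERY_LARGE_THRESHOLD`, `total_cells`, and `is_very_large` are hypothetical.

```python
from math import prod
from typing import Dict

# "About five million cells" per the text; the exact cut-off is an assumption.
VERY_LARGE_THRESHOLD = 5_000_000


def total_cells(dimension_sizes: Dict[str, int]) -> int:
    """Number of cells in a fully populated multi-dimensional dataset."""
    return prod(dimension_sizes.values())


def is_very_large(dimension_sizes: Dict[str, int]) -> bool:
    return total_cells(dimension_sizes) > VERY_LARGE_THRESHOLD


# Hypothetical member counts for each dimension.
budget_dimensions = {
    "Account Measures": 12,
    "Territories": 500,
    "Vendor Segments": 200,
    "Years": 5,
}
cells = total_cells(budget_dimensions)
```

With these assumed counts the dataset holds 12 × 500 × 200 × 5 = 6,000,000 cells, which places it above the threshold even though no single dimension is especially large.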
[0053] In addition, the system 10 provides the ability to delegate directly from data sources, and to directly create data source plans, thereby providing a manageable solution for queries that generate very large datasets, datasets that have heretofore proved difficult to manage. The system 10 further enables a plan manager to update and maintain a data warehouse application in a consistent manner.
[0054] By providing a highly scalable system of subplans 22a, each within the computational limits of existing computer systems, but whose combined structure is capable of representing a planning problem of virtually any size, the system 10 enables the smooth extraction, management, and consolidation of very large datasets.
[0055] Any hardware, software or a combination of hardware and software having the above-described functions may implement the very large dataset representation system 10 and method 100 according to the present invention, and methods described above. The software code, either in its entirety or a part thereof, may be in the form of a computer program product such as a computer-readable memory having the model and/or method stored therein.
[0056] Furthermore, a computer data signal representation of that software code may be embedded in a carrier wave for transmission via communications network infrastructure. Such a computer program product and a computer data signal are also within the scope of the present invention, as well as the hardware, software and combination thereof.
[0057] Therefore, although the present invention has been described in considerable detail with reference to certain preferred embodiments thereof, other versions are possible. Therefore, the spirit and scope of the appended claims should not be limited to the description of the preferred embodiments contained herein.

Claims

What is claimed is:
1. A very large dataset representation system comprising: a delegation modelling object including: a master dataset definition; one or more than one data dimension-to-user mapping; a target organization definition defining relationships between said master dataset definition and said data dimension-to-user mappings; and a subplan definition derived from each data dimension-to-user mapping; and a subplan manager for filtering data from said subplan definitions in accordance with a predetermined data size limitation in advance of executing said delegation modelling object.
2. The system according to claim 1, wherein said data dimension-to-user mappings are described by an organization hierarchy description.
3. The system according to claim 2, wherein said organizational hierarchy description is provided by an organization modelling object having: one or more than one data dimension reference; one or more than one user identifier defining intended recipients; and a mapping between each data dimension reference and one or more than one user identifier.
4. The system according to claim 1, further including a consolidator for, upon completion of user interaction, extracting data from each delegated subplan not found in its superior subplans and returning that extracted data to its original dataset.
5. The system according to claim 1, further including a background server process to improve performance when generating a large number of datasource-based subplans.
6. The system according to claim 1, wherein one or more of said subplan definitions is a proposal to aid in a planning process.
7. A method of representing very large datasets, comprising the steps of: (i) constructing a delegation modelling object by: a) defining a master dataset; b) mapping each data dimension to one or more than one user identifier; c) defining relationships between said master dataset and said data dimension-to-user mappings; and d) deriving a subplan definition from each data dimension-to-user mapping;
(ii) filtering data from said subplan definitions in accordance with a predetermined data size limitation in advance of executing said delegation modelling object; and
(iii) executing said delegation modelling object to extract and generate subplans.
8. The method according to claim 7, wherein said mapping step is described by the step of providing an organization hierarchy description.
9. The method according to claim 8, wherein said organization hierarchy description is provided by the step of constructing an organization modelling object by: referencing one or more than one data dimension; defining intended recipients with one or more than one user identifier; and mapping each data dimension reference to one or more than one user identifier.
10. The method according to claim 7, further including the step of consolidating data from said delegated subplans upon completion of user interaction.
11. The method according to claim 10, wherein said consolidation step includes the steps of: extracting data from each delegated subplan not found in its superior subplans; and returning said extracted data to its original dataset.
12. The method according to claim 7, further including the step of including a background server process to improve performance when generating a large number of datasource-based subplans.
13. A computer program product for a very large dataset representation method, the computer program product comprising: a computer readable medium for storing machine-executable instructions for use in the execution in a computer of the very large dataset representation method, the method including the steps of: constructing a delegation modelling object by: defining a master dataset; mapping each data dimension to one or more than one user identifier; defining relationships between said master dataset and said data dimension-to-user mappings; and deriving a subplan definition from each data dimension-to-user mapping; filtering data from said subplan definitions in accordance with a predetermined data size limitation in advance of executing said delegation modelling object; and executing said delegation modelling object to extract and generate subplans.

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CA2004/000973 WO2006002505A1 (en) 2004-07-02 2004-07-02 Very large dataset representation system and method
EP04737912A EP1782271A1 (en) 2004-07-02 2004-07-02 Very large dataset representation system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CA2004/000973 WO2006002505A1 (en) 2004-07-02 2004-07-02 Very large dataset representation system and method

Publications (1)

Publication Number Publication Date
WO2006002505A1 true WO2006002505A1 (en) 2006-01-12

Family

ID=34958041

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2004/000973 WO2006002505A1 (en) 2004-07-02 2004-07-02 Very large dataset representation system and method

Country Status (2)

Country Link
EP (1) EP1782271A1 (en)
WO (1) WO2006002505A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5758355A (en) * 1996-08-07 1998-05-26 Aurum Software, Inc. Synchronization of server database with client database using distribution tables
US5870746A (en) * 1995-10-12 1999-02-09 Ncr Corporation System and method for segmenting a database based upon data attributes
WO2000042553A2 (en) * 1999-01-15 2000-07-20 Harmony Software, Inc. Method and apparatus for processing business information from multiple enterprises
CA2361176A1 (en) * 2001-11-02 2003-05-02 Cognos Incorporated Improvements to computer-based business planning processes
US6581068B1 (en) * 1999-12-01 2003-06-17 Cartesis, S.A. System and method for instant consolidation, enrichment, delegation and reporting in a multidimensional database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
2001, PONNIAH P, DATA WAREHOUSING FUNDAMENTALS: A COMPREHENSIVE GUIDE FOR IT PROFESSIONALS, XP002317683 *

Also Published As

Publication number Publication date
EP1782271A1 (en) 2007-05-09

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

WWE Wipo information: entry into national phase

Ref document number: 2004737912

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office

Ref document number: 2004737912

Country of ref document: EP