EP1782271A1 - Method and system for representing very large datasets - Google Patents

Method and system for representing very large datasets

Info

Publication number
EP1782271A1
Authority
EP
European Patent Office
Prior art keywords
data
subplans
subplan
delegation
dataset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP04737912A
Other languages
German (de)
English (en)
Inventor
Marc Desbiens
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
Cognos Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2004-07-02
Filing date
2004-07-02
Publication date
2007-05-09

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management

Definitions

  • the present invention relates generally to electronic databases, and more particularly to the dimensional modelling of a very large dataset.
  • a data warehouse contains collections of related data known as datasets.
  • When these datasets are relatively small, such as when a data warehouse has been recently implemented, users can easily access and work with complete datasets directly on their personal computer systems.
  • the present invention is directed to a very large dataset representation system and method.
  • the system includes a delegation modelling object and a subplan manager for filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object.
  • the delegation modelling object includes a master dataset definition, one or more than one data dimension-to-user mapping, a target organization definition defining relationships between the master dataset definition and the data dimension-to-user mappings, and a subplan definition derived from each data dimension-to-user mapping.
  • the system further includes an organizational hierarchy description of the data dimension-to-user mappings provided by an organization modelling object having one or more than one data dimension reference, one or more than one user identifier defining intended recipients, and a mapping between each data dimension reference and one or more than one user identifier.
  • the system further includes a consolidator for, upon completion of user interaction, extracting data from each delegated subplan not found in its superior subplans and returning that extracted data to its original dataset.
  • the method includes the steps of constructing a delegation modelling object, filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object, and executing the delegation modelling object to extract and generate subplans.
  • the delegation modelling object is constructed by defining a master dataset, mapping each data dimension to one or more than one user identifier, defining relationships between the master dataset and the data dimension-to-user mappings, and deriving a subplan definition from each data dimension-to-user mapping.
  • the method further includes the step of describing an organization hierarchy of the data dimension-to-user mappings by constructing an organization modelling object through referencing one or more than one data dimension, defining intended recipients with one or more than one user identifier, and mapping each data dimension reference to one or more than one user identifier.
  • the method further includes the step of consolidating data from the delegated subplans upon completion of user interaction by extracting data from each delegated subplan not found in its superior subplans, and returning the extracted data to its original dataset.
  • the system provides the ability to delegate from data sources directly, and to directly create data source plans, thereby providing a manageable solution for queries that generate very large datasets, datasets that have heretofore proved difficult to manage.
  • the system further enables a plan manager to update and maintain a data warehouse application in a consistent manner.
  • FIG. 1 is an overview of a very large dataset representation system in accordance with an embodiment of the present invention
  • FIG. 2 shows the system including an organization modelling object
  • FIG. 3 shows the system including an organization modelling object, consolidator, and background server process
  • FIG. 4 is an overview of a very large dataset representation method in accordance with an embodiment of the present invention
  • FIG. 5 shows the method including the step of constructing an organization modelling object
  • FIG. 6 shows the method including the step of constructing an organization modelling object, providing a background server process, and consolidating data from delegated subplans
  • FIG. 7 illustrates region dimensions
  • FIG. 8 illustrates an organization modelling object with defined associations
  • FIG. 9 illustrates subplan definitions for an organization modelling object
  • FIG. 10 illustrates a budget plan
  • FIG. 11 illustrates the filtering of subplans in a subplan manager.
  • Embodiments of the present invention are directed to a very large dataset representation system 10 and method 100.
  • the system 10 includes a delegation modelling object 12 and a subplan manager 14 for filtering data from subplan definitions 22 in accordance with a predetermined data size limitation in advance of executing the delegation modelling object 12.
  • the delegation modelling object 12 includes a master dataset definition 16, one or more than one data dimension-to-user mapping 18, a target organization definition 20 defining relationships between the master dataset definition 16 and the data dimension-to-user mappings 18, and a subplan definition 22 derived from each data dimension-to-user mapping 18.
  • the data dimension-to-user mappings 18 are provided by an organization modelling object 24 having one or more than one data dimension reference 26, one or more than one user identifier 28 defining intended recipients, and a mapping 18 between each data dimension reference 26 and one or more than one user identifier 28, as illustrated in FIG. 2.
  • the system 10 further includes a consolidator 30 for, upon completion of user interaction, extracting data from each delegated subplan 22a not found in its superior subplans 22a and returning that extracted data to its original dataset, as illustrated in FIG. 3.
  • the method 100 includes the steps of constructing a delegation modelling object 102, filtering data from subplan definitions in accordance with a predetermined data size limitation in advance of executing the delegation modelling object 104, and executing the delegation modelling object to extract and generate subplans 106.
  • the delegation modelling object is constructed by defining a master dataset 108, mapping each data dimension to one or more than one user identifier 110, defining relationships between the master dataset and the data dimension-to-user mappings 112, and deriving a subplan definition from each data dimension-to-user mapping 114.
  • the step of mapping each data dimension to one or more than one user identifier 110 is provided by the step of constructing an organization modelling object 116 by referencing one or more than one data dimension 118, defining intended recipients with one or more than one user identifier 120, and mapping each data dimension reference to one or more than one user identifier 122, as illustrated in FIG. 5.
  • the method 100 further includes the step of consolidating data from the delegated subplans upon completion of user interaction 124 by extracting data from each delegated subplan not found in its superior subplans 126, and returning the extracted data to its original dataset 128, as illustrated in FIG. 6.
  • a "cube" as defined herein is a data-modelling object created either manually or automatically from data sources by a planning modeller.
  • the term cube is often used in the art to describe, in a tangible manner, a conceptual understanding of multi-dimensional data structures, whereby data values can be perceived as being stored in the cells of a multi-dimensional array.
  • a "plan" as defined herein is a guide to providing a "snapshot" of a cube and is created by a database modeller and delivered to the manager of a plan. Unlike cubes, plan dimensions are not modifiable by users. By intention, only plan owners or managers can modify plans.
  • a "subplan" 22a as defined herein is a read-only portion of a plan distributed to user classes based upon a specified organization. Subplans 22a are generated by a delegation process that will be defined below.
  • a "proposal" 36 as defined herein is a modifiable version of a subplan definition 22 created by a subplan owner to aid in a planning process.
  • An "organization" 24 as defined herein is a first-class business-modelling object that defines the relationship between dimensional data and user/role identifiers as defined by a business application's security model.
  • An organization modelling object 24 defines the contents of each in a series of subplans and their hierarchical relationship to one another, defines the contents of each subset of data to be extracted, and associates each data subset with a user who will receive and manage that data subset.
  • a "delegation" 12 as defined herein is a first-class business-modelling object that associates a dataset with an organization modelling object 24, and manages the workflow and scheduling around the delivery of subsets of data.
  • Delegation modelling objects 12 provide a formal definition of this process by defining a master dataset and associating the organization hierarchy by which specific datasets or subplans 22a will be generated from the master dataset.
  • a delegation modelling object 12 automates the creation and delivery of subplans 22a and keeps track of changes to subplans 22a over time.
  • a delegation modelling object 12 also provides control to shut down, as well as clean up, an entire delegation process.
  • Delegation modelling objects 12 are described in detail in Applicant's co-pending United States application for patent, titled “Delegation Modelling Object as a First-Class Business Modelling Object, and Method and System for Providing Same” filed February 19, 2003, the teachings of which are hereby incorporated by reference in their entirety.
  • a “dataset” as defined herein is a set of related source data to be used by a delegation modelling object 12 in data extraction and consolidation processes.
  • a dataset should therefore contain elements of the dimensionality referenced in an organization modelling object 24.
  • a single dataset can be the source of more than one distinct delegation 12.
  • Subplan filtering as defined herein describes a process by which each subplan definition 22 is filtered for distribution down to a maximum data size, while respecting the hierarchy as defined by its organization modelling object 24 in order that each user can work with that data on their individual computer systems.
  • Very large dataset delegations 12 are reusable definitions that provide data extraction methods based on business organizational rules, workflow management, and subplan filtering while respecting the organizational integrity defined by the organization 24, as well as consolidation back into an original dataset.
  • a delegation modelling object 12 contains a reference to an organization modelling object 24 in order to define how a master dataset is to be broken out and delivered.
  • a delegation 12 provides a relationship between dimensional data and management roles provided by a data dimension-to-user mapping 18 in order to establish areas of responsibility.
  • each generated subset of data represents the subplan 22a of a larger data subplan 22a generated at a higher level of a management hierarchy.
  • the hierarchy of those subplans 22a is defined in an organization modelling object 24, and since the top-level plan in a so-called "regular" delegation contains the entire dataset, no consolidation back to the original dataset is required.
  • each subplan 22a delegated to a user will roll back up the chain of delegated subplans 22a to a "top-level" plan.
  • because each higher-level or "superior" subplan 22a will contain all the data from each of its subordinate subplans 22a, higher-level subplans 22a will become increasingly large, with high-level subplans 22a in larger organizations ultimately becoming unmanageable.
  • the very large dataset representation system 10 enables a plan manager to define, based upon an organization modelling object 24, a delegation modelling object 12 for a very large dataset. This creates a very large dataset delegation 12 of multiple subplans 22a that can then be individually filtered for specific size restrictions.
  • the system 10 enables a plan manager to filter the definition 22 of each subplan prior to the execution of the delegation modelling object 12 precluding any need for higher-level subplans 22a to contain all the data contained in their subordinate subplans 22a.
  • a subsequent consolidator 30 process will then extract data not found at higher levels from each delegated subplan 22a, and return that data to its original dataset.
  • the very large dataset representation system 10 associates or "maps" the hierarchical structure of an organization 24 to an external source of data, such as a data warehouse, in order to define a set of related subplans 22a.
  • data is extracted directly from an external source and delivered to the computer systems of individual users, with each subplan 22a generated on an individual basis having been filtered in accordance with data size limits.
  • ABC Co. has a budget-related dataset in its data warehouse that it wishes to distribute to each of ABC Co.'s regional managers.
  • This budget dataset contains a master dimension 26 that includes the category dimensions 26 "Account Measures", "Territories", "Vendor Segments" and "Years".
  • in addition to the category dimensions 26, the budget dataset further contains the region dimensions 26 "United States", "Brazil" and "Canada", all subordinated to an "Americas" region dimension 26.
  • the budget dataset also contains a measures dimension 26 as illustrated in TABLE 1.
  • a budget manager for ABC Co. would advantageously create a new organization modelling object 24 that would better define these associations.
  • This newly created organization modelling object 24 defines the four subplan definitions 22 illustrated in FIG. 9. If delegated for a large organization, it can be seen by one of skill in the art that the provided organization modelling object 24 would likely define a hierarchy of subplans 22a that would all easily exceed the maximum subplan 22a data size for each user, based on current computing capacity common in most organizations at the user level.
  • the system 10 can be leveraged to distribute and subsequently consolidate the ABC Co. budget.
  • the "Budget Plan" illustrated in FIG. 10 has been pre-filtered to contain only a summary of each region. If executed, the subplan definitions 22 shown with "not available" icons 32 would all have exceeded the maximum subplan 22a data size.
  • a plan manager is able to define a "deliverable" subplan 22a for each user.
  • the use of a very large dataset in combination with the delegation process has allowed the plan manager to create an organization modelling object 24 based on the region dimension 26, and subsequently assign different region members 26 to each district user class 28, or area of responsibility.
  • the plan manager is then able to create a delegation modelling object 12 for that budget plan and, using the subplan manager 14, edit each subplan definition 22 in that delegation modelling object 12, selecting only those measures deemed necessary to meet the size restrictions of a particular application, as illustrated in FIG. 11.
  • the delegation modelling object 12 is then executed in order to extract and generate deliverable subplans 22a to each designated user.
  • a subsequent consolidation process 30 then reintegrates all data subsets 22a directly back into their original source dataset.
  • Each data subset 22a generated by a very large dataset delegation 12 is a part of that delegation's workflow.
  • Modified subplans 22a are returned "up" an organization's management chain, where managers can then accept or reject subordinate subplans 22a returned to them by subordinate users 28.
  • the management workflow process then culminates in the reconstitution of all accepted subplans 22a back into their original dataset.
  • Data from each subplan 22a not found in its respective higher level or "superior" subplan 22a is extracted and consolidated directly back into the original source dataset. In this manner, the manager of a plan can have firm control over which subplans 22a will, and which subplans 22a will not be used for a specific process.
  • Subplan definitions 22 can also be updated whenever a data warehouse reporting system is likewise updated. If so desired, the system 10 can be independent of a delegation process 12, enabling a plan manager to initiate an update of a data warehouse reporting system at any point along the process.
  • the system 10 can further include a background server process 34 for improved performance when generating a large number of datasource-based subplans 22a from a plan delegation process 12, as illustrated in FIG. 3.
  • the method 100 can further include the step of providing a background server process 130 for improved performance when generating a large number of datasource-based subplans from a plan delegation process, as illustrated in FIG. 6.
  • the system 10 initiates a process by which very large sets of data, typically those datasets greater than about five million cells, external to a data warehouse solution can be imported into that process as a much more manageable set of related planning data subsets 22a.
  • the system 10 provides the ability to delegate directly from data sources, and to directly create data source plans, thereby providing a manageable solution for queries that generate very large datasets, datasets that have heretofore proved difficult to manage.
  • the system 10 further enables a plan manager to update and maintain a data warehouse application in a consistent manner.
  • the system 10 enables the smooth extraction, management, and consolidation of very large datasets.
  • Any hardware, software, or combination of hardware and software having the above-described functions may implement the very large dataset representation system 10 and method 100 according to the present invention.
  • the software code, either in its entirety or in part, may be in the form of a computer program product such as a computer-readable memory having the model and/or method stored therein.
  • a computer data signal representation of that software code may be embedded in a carrier wave for transmission via communications network infrastructure.
  • Such a computer program product and a computer data signal are also within the scope of the present invention, as well as the hardware, software and combination thereof.
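
The delegation, subplan definition, and size-filtering steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the class and method names (`SubplanDefinition`, `Delegation`, `size`, `oversized`, `execute`), the measure names, the user names, the cell values, and the two-cell limit are all hypothetical, and a dataset is reduced to a dictionary keyed by (region, measure).

```python
from dataclasses import dataclass, field
from typing import Optional

Cell = tuple[str, str]  # (region member, measure); hypothetical cell key


@dataclass
class SubplanDefinition:
    """Hypothetical stand-in for a subplan definition 22."""
    region: str                      # data dimension reference 26
    owner: str                       # user identifier 28
    measures: set[str]               # measures selected for this subplan
    superior: Optional[str] = None   # region of the superior subplan, if any


@dataclass
class Delegation:
    """Hypothetical stand-in for a delegation modelling object 12."""
    master: dict[Cell, float]        # master dataset 16
    definitions: dict[str, SubplanDefinition] = field(default_factory=dict)

    def size(self, region: str) -> int:
        """Count the master cells that a definition would extract."""
        d = self.definitions[region]
        return sum(1 for (r, m) in self.master
                   if r == region and m in d.measures)

    def oversized(self, max_cells: int) -> list[str]:
        """Regions whose definitions still exceed the size limit."""
        return sorted(r for r in self.definitions if self.size(r) > max_cells)

    def execute(self, max_cells: int) -> dict[str, dict[Cell, float]]:
        """Extract one subplan per definition, but only once all of them fit."""
        too_big = self.oversized(max_cells)
        if too_big:
            raise ValueError(f"filter these definitions first: {too_big}")
        return {
            region: {(r, m): v for (r, m), v in self.master.items()
                     if r == region and m in d.measures}
            for region, d in self.definitions.items()
        }


# Invented sample data: a summary cell for "Americas" and detail per country.
master = {
    ("Americas", "Total"): 600.0,
    ("Canada", "Revenue"): 100.0, ("Canada", "Cost"): 40.0,
    ("Canada", "Headcount"): 7.0,
    ("Brazil", "Revenue"): 200.0, ("Brazil", "Cost"): 90.0,
    ("Brazil", "Headcount"): 12.0,
}
detail = {"Revenue", "Cost", "Headcount"}
delegation = Delegation(master)
delegation.definitions = {
    "Americas": SubplanDefinition("Americas", "vp", {"Total"}),
    "Canada": SubplanDefinition("Canada", "mgr_ca", set(detail), "Americas"),
    "Brazil": SubplanDefinition("Brazil", "mgr_br", set(detail), "Americas"),
}

# Filtering happens before execution: find oversized definitions, trim them.
before = delegation.oversized(max_cells=2)
for region in before:
    delegation.definitions[region].measures.discard("Headcount")
subplans = delegation.execute(max_cells=2)
```

In this sketch the superior "Americas" subplan carries only a summary cell, so it never has to hold the detail of its subordinates, which is the property the description attributes to filtering definitions ahead of execution.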

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a very large dataset representation system and method that enables a plan manager to define, based upon an organization modelling object, a delegation modelling object for a very large dataset. A very large dataset delegation made up of multiple subplans is created; the subplans it contains can then be individually filtered for specific size restrictions. This embodiment enables a plan manager to filter the definition of each subplan prior to the execution of the delegation modelling object, precluding any need for higher-level subplans to contain all the data contained in their subordinate subplans. This embodiment enables subplans to contain greater levels of detail that are not contained in their superior subplans, detail that will only be summarized in higher-level subplans. A subsequent consolidation process will then extract from each delegated subplan the data not found at higher levels and return that data to its original dataset.
EP04737912A 2004-07-02 2004-07-02 Method and system for representing very large datasets Withdrawn EP1782271A1 (fr)
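
The consolidation step summarized in the abstract can be sketched as follows. This is a hedged illustration only: the function name `consolidate`, the dictionary layout, and the sample values are assumptions, and the sketch takes "data not found at higher levels" to mean cells whose keys appear in no superior subplan.

```python
def consolidate(original, subplans, superior):
    """Write back the cells of each subplan that no superior subplan holds.

    original: dict mapping cell key -> value (updated in place and returned)
    subplans: dict mapping subplan name -> dict of cell key -> value
    superior: dict mapping subplan name -> superior subplan name, or None
    """
    for name, cells in subplans.items():
        # Collect every cell key held anywhere up the chain of superiors.
        held_above = set()
        parent = superior.get(name)
        while parent is not None:
            held_above.update(subplans[parent])
            parent = superior.get(parent)
        # Only data absent from all superior subplans is returned.
        for key, value in cells.items():
            if key not in held_above:
                original[key] = value
    return original


# Invented sample data: the "Canada" subplan holds a copy of a summary cell
# that its superior also holds, plus one detail cell of its own.
original = {("Americas", "Total"): 600.0, ("Canada", "Revenue"): 100.0}
subplans = {
    "Americas": {("Americas", "Total"): 650.0},
    "Canada": {("Americas", "Total"): 999.0, ("Canada", "Revenue"): 120.0},
}
superior = {"Americas": None, "Canada": "Americas"}
result = consolidate(original, subplans, superior)
```

Here the detail cell edited in "Canada" is written back, while its copy of the summary cell is ignored because the superior "Americas" subplan already holds that cell.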

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CA2004/000973 WO2006002505A1 (fr) 2004-07-02 2004-07-02 Method and system for representing very large datasets

Publications (1)

Publication Number Publication Date
EP1782271A1 (fr) 2007-05-09

Family

ID=34958041

Family Applications (1)

Application Number Title Priority Date Filing Date
EP04737912A Withdrawn EP1782271A1 (fr) Method and system for representing very large datasets 2004-07-02 2004-07-02

Country Status (2)

Country Link
EP (1) EP1782271A1 (fr)
WO (1) WO2006002505A1 (fr)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5870746A (en) * 1995-10-12 1999-02-09 Ncr Corporation System and method for segmenting a database based upon data attributes
US5758355A (en) * 1996-08-07 1998-05-26 Aurum Software, Inc. Synchronization of server database with client database using distribution tables
DE60006377D1 (de) * 1999-01-15 2003-12-11 Harmony Software Inc Verfahren und apparat zum verarbeiten von geschäftsinformationen aus mehreren unternehmen
FR2806183B1 (fr) * 1999-12-01 2006-09-01 Cartesis S A Dispositif et procede pour la consolidation instantanee, l'enrichissement et le "reporting" ou remontee d'information dans une base de donnees multidimensionnelle
CA2361176A1 (fr) * 2001-11-02 2003-05-02 Cognos Incorporated Ameliorations pour operations de planification commerciale informatisees

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006002505A1 *

Also Published As

Publication number Publication date
WO2006002505A1 (fr) 2006-01-12

Similar Documents

Publication Publication Date Title
US8819783B2 (en) Efficient data structures for multi-dimensional security
US11734293B2 (en) System and method for client-side calculation in a multidimensional database environment
Kuhlmann et al. Role mining-revealing business roles for security administration using data mining technology
Bäumer et al. Framework development for large systems
US20220138226A1 (en) System and method for sandboxing support in a multidimensional database environment
CN104731791A (zh) 一种市场销售分析数据集市系统
US7401090B2 (en) Computer-based business planning processes
CN100543745C (zh) 基于数据属性的数据处理系统和方法
US20110071871A1 (en) Common semantic model of management of a supply chain
US7333995B2 (en) Very large dataset representation system and method
US8577841B2 (en) Enablement of quasi time dependency in organizational hierarchies
US10528522B1 (en) Metadata-based data valuation
Bhansali Strategic data warehousing: achieving alignment with business
WO2006002505A1 (fr) Method and system for representing very large datasets
CA2472926A1 (fr) System and method for representing a very large dataset
Steinhoff The social reconfiguration of artificial intelligence: Utility and feasibility
Rosenthal et al. First-class views: a key to user-centered computing
Firestone et al. Knowledge base management systems and the knowledge warehouse: a" Strawman
Simonin et al. A data warehouse logical design method based on the alignment with business processes
Prakash et al. The Development Process
Sarkar et al. Implementation of graph semantic based multidimensional data model: An object relational approach
Chang et al. Dynamic Data Mart for Business Intelligence
Carbone et al. Intelligent Mediation in Active Knowledge Mining: Goals and General Description
AbdelSalam An integrated engineering-computation framework for collaborative engineering: An application in project management
Woon et al. Developing Applications in Corporate Finance: An Object-Oriented Database Management Approach

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070201

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR

DAX Request for extension of the european patent (deleted)
17Q First examination report despatched

Effective date: 20090226

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20090909