WO2015027831A1 - Multidimensional data processing method and device - Google Patents

Multidimensional data processing method and device

Info

Publication number
WO2015027831A1
Authority
WO
WIPO (PCT)
Prior art keywords
attributes
attribute
recursive
layer
information
Prior art date
Application number
PCT/CN2014/084506
Other languages
French (fr)
Inventor
Hao Li
Lei Wu
Weiji ZENG
Fuhan CAI
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2015027831A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising

Definitions

  • The present invention relates to the technical field of computers, and more particularly to a multidimensional data processing method and device.
  • Multidimensional data analysis is widely used in a variety of data analytics platforms.
  • Multidimensional data analysis is part of the OLAP (On-Line Analytical Processing) technique and is, in fact, its core.
  • Multidimensional data analysis observes and analyzes how measures vary along selected important dimensions, so that the measures of interest stand out.
  • age is set as a condition for sifting the account data.
  • age is a dimensional attribute, while the number of purchasers is measure data. FIG. 1 shows that only one dimensional attribute, age, is revealed in the data analysis result of the shopping platform. If any further dimensional data, e.g.
  • a general sifting process first obtains data associated with a specified dimensional attribute from the entire account database, and then screens the obtained data according to the other dimensional attributes in the dimensional attribute set one by one. Finally, measure data complying with all the conditions of the dimensional attributes in the dimensional attribute set is obtained.
  • An embodiment of the present invention provides a method and device for processing multidimensional data. They address the computational complexity of repeatedly screening a variety of measure data out of the entire account database when many dimensions and dimensional attributes are involved.
  • a multidimensional data processing method comprising:
  • the recursive topologic structure including a set of attributes and recursive paths among the set of attributes;
  • a multidimensional data processing device comprising:
  • an acquisition unit acquiring information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes from a database;
  • the acquisition unit further determines a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes;
  • a generation unit generating at least one top-layer set of attributes according to the finest one of the attributes in each of the dimensions;
  • the generation unit further generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, the recursive topologic structure including a set of attributes and recursive paths among the set of attributes;
  • a recursion unit receiving a search request, and recursing measure data corresponding to an attribute set associated with the search request according to the recursive paths and specified measure data corresponding to a previously specified attribute set.
  • The embodiments according to the present invention provide a multidimensional data processing method and device. A recursive topologic structure is generated according to the top-layer sets of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. When a search request is received, measure data corresponding to an attribute set associated with the search request can therefore be obtained by recursion according to the recursive paths and specified measure data corresponding to a previously specified attribute set.
  • In contrast, according to the prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated.
  • In the present invention, measure data corresponding to unknown attribute sets can be obtained according to the recursive topologic structure and specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
  • FIG. 1 is a displayed frame showing a data analysis interface according to the prior art.
  • FIG. 2 is a flowchart illustrating a multidimensional data processing method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart illustrating a multidimensional data processing method according to another embodiment of the present invention.
  • FIG. 4 is a schematic diagram illustrating the mapping relationship between fields in the account data and measures and dimensions according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram illustrating the data change in a multidimensional data processing method according to an embodiment of the present invention.
  • FIG. 6 is a recursive topologic diagram according to an embodiment of the present invention.
  • FIG. 7 is another recursive topologic diagram according to an embodiment of the present invention.
  • FIG. 8 is a schematic block diagram of a multidimensional data processing device according to an embodiment of the present invention.
  • FIG. 9 is a schematic block diagram of a multidimensional data processing device according to another embodiment of the present invention.
  • a multidimensional data processing method comprises the following steps.
  • Step 101: information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes are acquired from a database.
  • the step further obtains account data of a data business, which is day-to-day account data recorded in a database when the data business, e.g. a website, an application program or an online game, is used.
  • Each dimension in the information of dimensions indicates a selected view for data analysis, and for example, can be regions or ages of users of an application program at issue.
  • the information of attributes in each of the dimensions includes attributes of different sizes in the same dimension, e.g. day, week, month or year in a time dimension.
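  • the information of layer correlation of the attributes includes various kinds of layer correlation of the attributes, e.g. 7 days being equal to 1 week, 12 months being equal to 1 year in a time dimension, or several city regions being included in one province region which is further included in a country in a region dimension.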
  • Step 102: a finest one of the attributes in each of the dimensions is determined according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes.
  • the finest attribute is the attribute of the smallest size among the attributes in the same dimension.
  • for example, if the attributes in the time dimension include year, month, day and hour, then the attribute "hour" is the finest one of the attributes in the time dimension.
  • Step 103: at least one top-layer set of attributes is generated according to the finest one of the attributes in each of the dimensions.
  • the one or more finest attributes constitute the at least one top-layer set of attributes.
  • the top-layer set of attributes is directly acquired from the account data.
  • for example, if the acquired finest attributes are city, age and primary source, the top-layer set of attributes is constituted by city, age and primary source.
  • the primary source belongs to a source dimension, and the source dimension may include a primary source and a secondary source, wherein the primary source may be a real data source, e.g. a real website, and the secondary source may be a set of websites.
  • the websites belong to a social network.
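Steps 102 and 103 can be illustrated with a minimal Python sketch. The `LAYERS` table and both function names are assumptions made for illustration; the attribute names are taken from the examples in the text, and the region dimension is simplified to three layers.

```python
# Hypothetical layer information: each dimension lists its attributes
# from finest to coarsest (attribute names follow the examples in the text).
LAYERS = {
    "region": ["city", "province", "country"],
    "age": ["age", "age division"],
    "source": ["primary source", "secondary source"],
}


def finest_attributes(layers):
    """Return the finest attribute of every dimension (Step 102)."""
    return {dimension: attrs[0] for dimension, attrs in layers.items()}


def top_layer_set(layers):
    """Combine the finest attributes into one top-layer set (Step 103)."""
    return frozenset(finest_attributes(layers).values())
```

With this table, `top_layer_set(LAYERS)` contains city, age and primary source, matching the example above.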
  • Step 104: a recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes.
  • roll-up of a set of attributes is generally conducted to generate a next-layer set of attributes.
  • the roll-up operation may be performed in two ways. The first is to remove one of the attributes in the set of attributes so as to obtain the next-layer set of attributes. For example, in a set of attributes containing city, age and primary source, the primary source may be removed in the roll-up operation to form the next-layer set of attributes containing city and age. The other way is to enlarge the size of an attribute in the set of attributes according to the layer correlation of the attributes.
  • for example, the attribute "city" in the set of attributes may be replaced by "province" for roll-up, thereby enlarging the size of the attribute and generating another set of attributes containing province, age and primary source.
  • when roll-up can go no further, a general attribute is formed, which places no limitation on the set of attributes.
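The two roll-up operations can be sketched as follows. This is an illustrative sketch, not the patented implementation: the `COARSER` mapping and the function name `roll_up` are assumptions, and the general attribute is represented by the empty set.

```python
# Hypothetical layer correlation: each attribute maps to the next coarser
# attribute in its dimension; attributes without an entry cannot be enlarged.
COARSER = {"city": "province", "province": "country"}


def roll_up(attrs, coarser=COARSER):
    """Return every next-layer set reachable by a single roll-up."""
    attrs = frozenset(attrs)
    next_layer = set()
    for attr in attrs:
        # Way 1: remove one attribute from the set.
        next_layer.add(attrs - {attr})
        # Way 2: replace the attribute with its coarser counterpart.
        if attr in coarser:
            next_layer.add((attrs - {attr}) | {coarser[attr]})
    return next_layer
```

Calling `roll_up({"city", "age", "primary source"})` yields four sets, including {city, age} (primary source removed) and {province, age, primary source} (city enlarged to province).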
  • Step 105: a search request is received, and measure data corresponding to an attribute set associated with the search request is recursed according to the recursive paths and specified measure data corresponding to a previously specified attribute set.
  • measure data corresponding to a part of the sets of attributes in the recursive topologic structure is acquired first.
  • the search request is accompanied by an attribute set to be analyzed.
  • measure data corresponding to the attribute set associated with the search request can be recursed according to the recursive paths and the measure data corresponding to the previously acquired attribute set. For example, the registration number data corresponding to an attribute set containing city, age and primary source is previously acquired.
  • the next-layer set of attributes is an attribute set containing city and age, so the registration number data corresponding to the next-layer set of attributes containing city and age can be acquired from it.
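For a measure that can simply be accumulated, recursing from a finer attribute set to its next-layer set amounts to summing over the attribute that is rolled away. The sketch below assumes made-up registration counts and hypothetical names (`recurse_measure`, `registrations`, `site-a`, `site-b`).

```python
from collections import defaultdict


def recurse_measure(measure, attrs, keep):
    """Sum a measure over the attributes that are rolled away.

    measure: dict mapping a tuple of attribute values to a count
    attrs:   attribute names, in the same order as the tuple entries
    keep:    attribute names of the next-layer set
    """
    positions = [attrs.index(name) for name in keep]
    result = defaultdict(int)
    for values, count in measure.items():
        result[tuple(values[i] for i in positions)] += count
    return dict(result)


# Made-up registration counts per (city, age, primary source).
registrations = {
    ("Shenzhen", 20, "site-a"): 3,
    ("Shenzhen", 20, "site-b"): 2,
    ("Guangzhou", 30, "site-a"): 4,
}
by_city_age = recurse_measure(
    registrations, ["city", "age", "primary source"], ["city", "age"]
)
```

Here `by_city_age` is derived entirely from the previously acquired counts, without going back to the account data.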
  • the multidimensional data processing method provided according to embodiments of the present invention can be executed by a multidimensional data processing device.
  • the multidimensional data processing device can be implemented in, but is not limited to, electronic equipment such as a computer or an internet server.
  • a recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. Therefore, when a search request is received, measure data corresponding to an attribute set associated with the search request can be recursed according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
  • In contrast, according to the prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated.
  • In the present invention, measure data corresponding to unknown attribute sets can be obtained according to the recursive topologic structure and specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
  • the embodiment of a multidimensional data processing method provided according to the present invention includes the following steps.
  • Step 201: account data of a data business, information of dimensions in the data business, information of attributes in each of the dimensions and information of layer correlation of the attributes are acquired.
  • the account data of a data business is day-to-day account data recorded in a database when the data business, e.g. a website, an application program or an online game, is used.
  • the account data may contain a user identity, a registration website, and an operation being performed.
  • Each dimension in the information of dimensions indicates a selected view for data analysis, and for example, can be regions or ages of users of an application program at issue.
  • the information of attributes in each of the dimensions includes attributes of different sizes in the same dimension, e.g. day, week, month or year in a time dimension.
  • the information of layer correlation of the attributes includes various kinds of layer correlation of the attributes, e.g. 7 days being equal to 1 week, 12 months being equal to 1 year in a time dimension, or several city regions being included in one province region which is further included in a country in a region dimension.
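At the level of attribute values, a layer correlation such as "several city regions being included in one province region" can be held as a lookup table and used to re-key counts. The table, the counts and the function name `coarsen` below are illustrative assumptions.

```python
from collections import Counter

# Illustrative value-level layer correlation (city -> province).
CITY_TO_PROVINCE = {
    "Shenzhen": "Guangdong",
    "Guangzhou": "Guangdong",
    "Wuhan": "Hubei",
}


def coarsen(counts, mapping):
    """Re-key counts from a finer attribute to the coarser one it maps to."""
    coarse = Counter()
    for value, count in counts.items():
        coarse[mapping[value]] += count
    return dict(coarse)


by_city = {"Shenzhen": 5, "Guangzhou": 4, "Wuhan": 1}  # made-up counts
by_province = coarsen(by_city, CITY_TO_PROVINCE)
```

The same pattern would apply to a time dimension, for example mapping each day to its week.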
  • a specified field included in the user identity may correspond to the dimension of region or the dimension of age.
  • a specified field included in the website may correspond to the dimension of source. Examples are not limited to the above.
  • FIG. 4 schematically illustrates an example of mapping relationship between fields in the account data and measures and dimensions.
  • the user identity corresponds to the dimension of region and the dimension of age.
  • the dimension of region includes attributes of city, city level, province, and country.
  • the dimension of age includes age and age division.
  • the website corresponds to the dimension of source.
  • the dimension of source includes primary source and secondary source, wherein the primary source may be real data source, e.g. a real website, and the secondary source may be a set of the websites.
  • the websites belong to a social network.
  • Step 202: a finest one of the attributes in each of the dimensions is determined according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes.
  • the finest attribute is the attribute of the smallest size among the attributes in the same dimension.
  • for example, if the attributes in the time dimension include year, month, day and hour, then the attribute "hour" is the finest one of the attributes in the time dimension.
  • Step 203: at least one top-layer set of attributes is generated according to the finest one of the attributes in each of the dimensions.
  • the one or more finest attributes constitute the at least one top-layer set of attributes.
  • the top-layer set of attributes is directly acquired from the account data.
  • for example, if the acquired finest attributes are city, age and primary source, the top-layer set of attributes is constituted by city, age and primary source.
  • Step 204: each set of attributes is traversed, and whether the set of attributes to be rolled up is a general attribute is determined.
  • Step 205 is executed if the set of attributes is determined to be a general attribute.
  • Step 206 is executed if the set of attributes is determined not to be a general attribute.
  • the set of attributes being traversed is not limited to the top-layer set of attributes. As Step 201 through Step 204 are performed, a variety of sets of attributes are generated in sequence. Therefore, it is necessary to traverse the sets of attributes so as to gradually form a recursive topologic structure.
  • a general attribute is formed when, with the stepwise roll-up of the set of attributes, the number of attributes in the set becomes smaller and smaller and the attribute size becomes larger and larger, until finally no limitation remains on the set of attributes. For example, in a case where the registration number of a certain website is to be obtained, the registration numbers in each city and at each age are obtained if the set of attributes contains the attributes city and age. As a result of roll-up, the final general attribute has neither a dimension of region nor a dimension of age. Accordingly, the registration number is the total registration number of the website.
  • Step 205: it is determined that no further roll-up of the set of attributes is available, and the process goes back to Step 204.
  • if the set of attributes is a general attribute, none of the attributes in the set can be removed and the attribute size in the set cannot be enlarged, so no roll-up can be done.
  • Step 206: each attribute in the set of attributes to be rolled up is traversed.
  • Step 207: whether a measure corresponding to the set of attributes needs to undergo universal duplication removal is determined.
  • If it is determined that the measure corresponding to the set of attributes needs to undergo universal duplication removal, execute Step 208.
  • the measure corresponding to the set of attributes is a measure analyzed according to the requirement of the set of attributes, and in general can be acquired by analyzing the account data. For example, as illustrated in FIG. 5, the identities of users and the websites visited by each of the users are recorded in the account data. Therefore, by analyzing the account data, the measure to be obtained may be the registration count or the registration user number. It is to be noted that some measures cannot be obtained by simply summing the items included in the account data. For example, for the measure of registration user number, the same user ID may register more than once, yet it may add only one to the registration user number. Universal duplication removal is therefore necessary.
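The difference between a directly accumulated measure and one that needs universal duplication removal can be shown on a few records. The records and the names `registration_count` and `registered_users` are made up for illustration; the two user IDs are the ones used as examples in the text.

```python
# Made-up account records: (user ID, registration website).
records = [
    (250708, "site-a"),
    (250708, "site-b"),  # the same user ID registers a second time
    (347516, "site-a"),
]


def registration_count(rows):
    """Every record counts once, so a plain count is enough."""
    return len(rows)


def registered_users(rows):
    """Each user ID counts once, so duplicates are removed before counting."""
    return len({user_id for user_id, _ in rows})
```

On these records the registration count is 3 while the registration user number is 2, which is why per-website user numbers cannot simply be added up.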
  • Step 208: whether an attribute in the set of attributes complies with a recursive condition is determined.
  • an attribute complies with the recursive condition when the element of account data on which the measure to undergo universal duplication removal relies has one and only one attribute value for that attribute.
  • the element of account data on which the above-mentioned measure of registration user number relies is the user ID.
  • the registration user number can be determined.
  • if the attribute is city, a user ID can only correspond to one city.
  • the user ID 250708 specifically corresponds to the city "Shenzhen".
  • the user ID 347516 specifically corresponds to the city "Guangzhou".
  • the attribute of city therefore complies with the recursive condition.
  • if the element of account data on which the measure to undergo universal duplication removal relies does not have a unique attribute value for the attribute, the recursive condition is not complied with.
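A check of the recursive condition might look like the following sketch. The row layout, the field names and the function name are assumptions; only the two user IDs and their cities come from the text.

```python
def complies_with_recursive_condition(rows, key, attr):
    """Return True if every `key` value has exactly one value of `attr`."""
    seen = {}
    for row in rows:
        # setdefault stores the first value seen and returns the stored one,
        # so any later, different value reveals a violation.
        if seen.setdefault(row[key], row[attr]) != row[attr]:
            return False
    return True


rows = [
    {"user_id": 250708, "city": "Shenzhen", "primary_source": "site-a"},
    {"user_id": 250708, "city": "Shenzhen", "primary_source": "site-b"},
    {"user_id": 347516, "city": "Guangzhou", "primary_source": "site-a"},
]
```

In this made-up data, city satisfies the condition for the user ID, while primary source does not, because user 250708 appears under two sources.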
  • Step 209: whether a father attribute among the attributes has a child attribute is determined according to the layer correlation of the attributes.
  • If it is determined that the father attribute has a child attribute, execute Step 211.
  • If it is determined that the father attribute has no child attribute, execute Step 212.
  • a country includes a plurality of provinces, and a province includes a plurality of cities. Therefore, "city" is a father attribute, "province" is a son (child) attribute of the attribute "city", and "country" is a son attribute of the attribute "province". Since "country" is the coarsest of the attributes in the layer correlation of attributes, the attribute "country" has no son attribute.
  • Step 210: it is determined that no roll-up of the set of attributes is performed according to the attribute.
  • Step 211: a first strategy is determined as the roll-up strategy.
  • Step 212: a second strategy is determined as the roll-up strategy.
  • the second strategy is adopted to delete the father attribute, and the next-layer set of attributes is generated with the other attributes in the set of attributes. Subsequently, execute Step 214.
  • Step 213: the father attribute is replaced by the son attribute, and the son attribute and the other attributes in the set of attributes form the next-layer set of attributes. After Step 213, go back to Step 204.
  • Step 214: the father attribute is deleted, and the other attributes in the set of attributes form the next-layer set of attributes. After Step 214, go back to Step 204.
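The decision made in Steps 207 through 214 for a single attribute can be summarized in a small sketch. The function names, the string labels for the strategies and the `child_of` mapping are assumptions; "child" is used as in the text, meaning the attribute that replaces the father attribute during roll-up.

```python
def roll_up_strategy(attr, needs_dedup, complies, child_of):
    """Pick the roll-up strategy for one attribute of a set.

    Returns "none", "replace" (first strategy) or "delete" (second strategy).
    """
    # Steps 207-208: a measure needing duplication removal only allows
    # roll-up along attributes that comply with the recursive condition.
    if needs_dedup and not complies(attr):
        return "none"  # Step 210
    # Step 209: look for a child attribute in the layer correlation.
    if attr in child_of:
        return "replace"  # Steps 211 and 213
    return "delete"  # Steps 212 and 214


def apply_strategy(attrs, attr, strategy, child_of):
    """Build the next-layer set, or return None when no roll-up is done."""
    rest = frozenset(attrs) - {attr}
    if strategy == "replace":
        return rest | {child_of[attr]}
    if strategy == "delete":
        return rest
    return None
```

For example, with `child_of = {"city": "province"}`, the attribute city is replaced by province, while age, which has no child attribute here, is deleted.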
  • the top-layer set of attributes consists of attributes of city, age and primary source.
  • the registration numbers of the three attributes are acquired. After the first roll-up, the attribute of primary source is deleted, and the attributes of city and age form a new set of attributes. The registration numbers of the attributes of city and age are acquired. After the second roll-up, the attribute "city" is replaced by the attribute "province", and the attributes of province and age form a new set of attributes. The registration numbers of the attributes of province and age are acquired.
  • a recursive topologic structure is finally acquired.
  • the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes.
  • a recursive topologic structure can be generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes at a plurality of nodes respectively corresponding to the at least one top-layer set of attributes.
  • measure data corresponding to an attribute set associated with the search request is recursed according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
  • FIG. 6 schematically exemplifies a recursive topologic scheme, which can be generated by directly accumulating measures.
  • the accumulative measures are registration numbers.
  • the generated top-layer set of attributes in the recursive topologic structure is determined based on identities and registration websites of users.
  • the dimension to be analyzed can be a dimension of region, including city (hereinafter, A1) and province (hereinafter, A2), and/or a dimension of age, including age number (hereinafter, B1), and/or a dimension of source, including primary source (hereinafter, C1).
  • the top-layer set of attributes, for example, is A1B1C1.
  • sets of attributes, e.g. A2B1C1, A1B1, A1C1 and B1C1, can be acquired.
  • the set A2B1C1 is further rolled up to acquire sets B1C1, A2B1 and A2C1.
  • sets A2B1, A1 and B1 can be acquired.
  • sets A2C1, A1 and C1 can be acquired.
  • sets B1 and C1 can be acquired. Afterwards, the further roll-up of the set A2B1 or A2C1 results in A2, and the roll-up of A1 results in A2 or the general attribute ALL.
  • the roll-up of B1 results in ALL, and the roll-up of C1 results in ALL.
  • the subsequent roll-up of A2 also results in ALL. Accordingly, if measure data of some sets of attributes are acquired previously, for example the measure data corresponding to A1B1, the measure data corresponding to A1 and B1 can be directly recursed from the measure data of A1B1.
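The topology described for FIG. 6 can be reproduced with a breadth-first traversal that applies both roll-up operations to every set. This is a sketch under assumptions: the names `build_topology`, `label` and `COARSER` are hypothetical, and the general attribute ALL is modeled as the empty set.

```python
from collections import deque

# Labels follow FIG. 6: A1 = city, A2 = province, B1 = age, C1 = primary source.
COARSER = {"A1": "A2"}


def build_topology(top, coarser=COARSER):
    """Return {attribute set: sets reachable by one roll-up}, starting at `top`."""
    paths = {}
    queue = deque([frozenset(top)])
    while queue:
        current = queue.popleft()
        if current in paths:
            continue  # this set was already expanded
        targets = set()
        for attr in current:
            targets.add(current - {attr})  # remove the attribute
            if attr in coarser:  # enlarge the attribute
                targets.add((current - {attr}) | {coarser[attr]})
        paths[current] = targets
        queue.extend(targets)
    return paths


def label(attrs):
    """Render a set in the figure's notation; the empty set is ALL."""
    return "".join(sorted(attrs)) or "ALL"
```

Starting from A1B1C1, the first layer produced is A2B1C1, A1B1, A1C1 and B1C1, and every path eventually ends at ALL, in line with the description above.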
  • FIG. 7 schematically exemplifies a recursive topologic scheme, which is generated by accumulating measures in need of universal duplication removal.
  • the generated top-layer set of attributes in the recursive topologic structure is determined based on identities and registration websites of users.
  • the dimension to be analyzed can be a dimension of region, including city (hereinafter, A1) and province (hereinafter, A2), and/or a dimension of age, including age number (hereinafter, B1), and/or a dimension of source, including primary source (hereinafter, C1).
  • the registration numbers acquired in association with the primary source cannot be simply accumulated, because the same user ID may be used for registration at websites belonging to different attributes of primary source.
  • the attribute C1 cannot be used to generate the recursive topologic structure.
  • the top-layer set of attributes is A1B1C1.
  • sets of attributes, e.g. A2B1C1, A1C1 and B1C1, can be acquired.
  • the set A2B1C1 is further rolled up to acquire sets B1C1 and A2C1.
  • the attribute set A1B1 needs to be acquired from the account data.
  • sets A2C1 and C1 can be acquired.
  • from the set B1C1, the set C1 can be acquired.
  • the further roll-up of the set A2B1 results in A2.
  • the roll-up of A1 results in A2 or the general attribute ALL.
  • the roll-up of B1 results in ALL.
  • the roll-up of C1 results in ALL.
  • the subsequent roll-up of A2 also results in ALL.
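For a measure that needs universal duplication removal, roll-up is only allowed along attributes that comply with the recursive condition. The sketch below assumes that C1 is the only attribute failing the condition; the names `RECURSIVE_OK` and `roll_up_dedup` are hypothetical. It blocks every roll-up along C1, which is stricter than the figure description for the single-attribute set C1.

```python
COARSER = {"A1": "A2"}
# Assumption: A1, A2 and B1 comply with the recursive condition; C1 does not.
RECURSIVE_OK = {"A1", "A2", "B1"}


def roll_up_dedup(attrs, coarser=COARSER, ok=RECURSIVE_OK):
    """Return the next-layer sets reachable without rolling up along C1."""
    attrs = frozenset(attrs)
    targets = set()
    for attr in attrs & ok:
        targets.add(attrs - {attr})  # remove a compliant attribute
        if attr in coarser:  # enlarge a compliant attribute
            targets.add((attrs - {attr}) | {coarser[attr]})
    return targets
```

From A1B1C1 this gives A2B1C1, A1C1 and B1C1 but not A1B1, which is why A1B1 has to be acquired from the account data as a second top-layer set.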
  • A1B1C1 and A1B1 are two top-layer sets of attributes, and recursive topologic structures of A1B1C1 and A1B1 are generated at two nodes, respectively. The two recursive topologic structures are separately maintained at the two nodes. If the attributes contained in the top-layer set of attributes, the same recursive topologic sub-tree may also be split at a plurality of nodes for maintenance.
  • the multidimensional data processing method provided according to this other embodiment of the present invention can be executed by a multidimensional data processing device.
  • the multidimensional data processing device can be operated in, but is not limited to, electronic equipment such as a computer or a network server.
  • the recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. Therefore, when a search request is received, measure data corresponding to an attribute set associated with the search request can be recursed according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
  • In contrast, according to the prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated.
  • In the present invention, measure data corresponding to unknown attribute sets can be obtained according to the recursive topologic structure and specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
  • An acquisition unit 31 acquires information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes from a data business.
  • the acquisition unit 31 further determines a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes.
  • a generation unit 32 generates at least one top-layer set of attributes according to the finest one of the attributes in each of the dimensions.
  • the generation unit 32 further generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes.
  • the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes.
  • a recursion unit 33 receives a search request, and recurses measure data corresponding to an attribute set associated with the search request according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
  • the generation unit 32 includes:
  • a discrimination module 321 determining whether the set of attributes are general attributes or not.
  • a generation module 322 conducting roll-up of the set of attributes to generate a next-layer set of attributes if the set of attributes are not general attributes.
  • the generation module 322 includes:
  • a traverse sub-module 3221 traversing each attribute in each set of attributes;
  • an acquisition sub-module 3222 acquiring each condition to be complied with for each attribute;
  • a determination sub-module 3223 determining a roll-up strategy of the set of attributes according to each condition to be complied with for each attribute;
  • a generation sub-module 3224 conducting roll-up of the set of attributes to generate the next-layer set of attributes according to the roll-up strategy.
  • the acquisition sub-module 3222 further determines whether the measure corresponding to the set of attributes needs to undergo universal duplication removal.
  • the acquisition sub-module 3222 further determines whether an attribute in the set of attributes complies with a recursive condition or not if it is determined that the measure corresponding to the set of attributes needs to undergo universal duplication removal.
  • the determination sub-module 3223 further determines not to conduct roll-up of the set of attributes according to the attributes if it is determined that no attribute in the set of attributes complies with the recursive condition.
  • the acquisition sub-module 3222 further determines whether a father attribute among the attributes has a child attribute according to the layer correlation of the attributes if it is determined no measure needs to undergo universal duplication removal or it is determined an attribute in the set of attributes complies with the recursive condition.
  • the determination sub-module 3223 further determines a first strategy as the roll-up strategy if it is determined that a father attribute has a child attribute, wherein according to the first strategy, the child attribute replaces the father attribute and, along with the other attributes in the set of attributes, constitutes the next-layer set of attributes; and determines a second strategy as the roll-up strategy if it is determined that the father attribute has no child attribute, wherein according to the second strategy, the father attribute is deleted and the next-layer set of attributes is generated with the other attributes in the set of attributes.
  • the generation unit 32 generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes at a plurality of nodes respectively corresponding to the at least one top-layer set of attributes.
  • in the multidimensional data processing device provided according to an embodiment of the present invention, a recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. Therefore, when a search request is received, measure data corresponding to an attribute set associated with the search request can be recursed according to the recursive paths and a specified measure data corresponding to a previously specified attribute set. In contrast, according to prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated. In the present invention, measure data corresponding to unknown attribute sets can be realized according to the recursive topological structure and a specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
  • the invention can be implemented with software and essential hardware. Hardware may also be used to implement the invention even though the combination of software and hardware would be preferred. Accordingly, the technical solution according to the present invention as well as the contribution relative to the prior art is basically the software part.
  • the software product of computer is stored in an accessible storage medium, e.g. floppy disc, hard disc or optical disc of a computer.
  • a plurality of commands are used to have a computer device, e.g. personal computer, server or network equipment, execute the methods according to the embodiments of the present invention.

Abstract

Multidimensional data processing method and device are used in computers to simplify computation without repetitive acquisition of attribute measure data in a variety of dimensions from the entire account data. The method for multidimensional data analysis includes realizing information of dimensions, attributes in the dimensions and layer correlation of the attributes; realizing a finest attribute in each dimension according to the above-mentioned information; generating a top-layer set of attributes according to the finest attribute in each of the dimensions; generating a recursive topologic structure according to the top-layer set of attributes and layer correlation of the attributes, the recursive topologic structure including a set of attributes and recursive paths among the set of attributes; and receiving a search request, and recursing measure data corresponding to an attribute set associated with the search request according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.

Description

MULTIDIMENSIONAL DATA PROCESSING METHOD AND DEVICE
FIELD OF THE INVENTION
[0001] The present invention relates to a technical field of computers, and more particularly to multidimensional data processing method and device.
BACKGROUND OF THE INVENTION
[0002] With current technical development of internet and computers, multidimensional data analysis is widely used in a variety of data analytics platforms. Multidimensional data analysis is a part of the OLAP (On-Line Analytical Processing) technique and as a matter of fact, is the core technique. Multidimensional data analysis functions for observing and parsing variations of measures in order to signify some of the measures according to selected important dimensions.
[0003] Nowadays, there are a lot of websites, e.g. shopping or self-analysis platforms, adopting multidimensional data analysis. For example, as shown in FIG. 1, for analyzing numbers of purchasers in different age divisions buying specified goods on a shopping platform, age is set as a condition for sifting the account data. In this example, age is a dimensional attribute while a number of purchasers is measure data. From FIG. 1, it is shown that only one dimensional attribute concerning age is revealed in the data analysis result of the shopping platform. If any further dimensional data, e.g. numbers of purchasers in each age in each city region, numbers of purchasers in each age in each province region and/or numbers of purchasers in each age division in each province region, are to be revealed for analysis involving multidimensional attributes, the entire account database must be sifted to realize each and every measure data in a dimensional attribute set. A general sifting process is performed by first obtaining data associated with a specified dimensional attribute from the entire account database, then screening the obtained data according to the other dimensional attributes in the dimensional attribute set one by one. Finally, the measure data complying with all the conditions of the dimensional attributes in the dimensional attribute set is obtained.
[0004] Therefore, if there are many dimensions and dimensional attributes involved for repetitively screening out a variety of measure data from the entire account database, the computation would be very complicated.
SUMMARY OF THE INVENTION
[0005] An embodiment of the present invention provides method and device for processing multidimensional data to solve the complicated computing problem in the case that a lot of dimensions and dimensional attributes are involved for repetitively screening out a variety of measure data from the entire account database.
[0006] The above object can be achieved by adopting the following technical solutions according to the present invention.
[0007] A multidimensional data processing method, comprising:
realizing information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes from a database;
realizing a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes;
generating at least one top-layer set of attributes according to the finest one of the attributes in each of the dimensions;
generating a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, the recursive topologic structure including a set of attributes and recursive paths among the set of attributes; and
receiving a search request, and recursing measure data corresponding to an attribute set associated with the search request according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
[0008] A multidimensional data processing device, comprising:
an acquisition unit realizing information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes from a database;
wherein the acquisition unit further realizes a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes;
a generation unit generating at least one top-layer set of attributes according to the finest one of the attributes in each of the dimensions;
wherein the generation unit further generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, the recursive topologic structure including a set of attributes and recursive paths among the set of attributes; and
a recursion unit receiving a search request, and recursing measure data corresponding to an attribute set associated with the search request according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
[0009] The embodiments according to the present invention provide multidimensional data processing method and device. Since a recursive topological structure is generated according to the top-layer sets of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes, when receiving a search request, measure data corresponding to an attribute set associated with the search request can be realized by way of recursion according to the recursive paths and a specified measure data corresponding to a previously specified attribute set. According to prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated. On the other hand, according to the present invention, measure data corresponding to unknown attribute sets can be realized according to the recursive topological structure and a specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] In order to have the technical solutions according to the present invention or the prior art understood in a better way, drawings required for subsequently describing the embodiments of the present invention or prior art are briefly described herein. It is known to those ordinary skilled in the art that the drawings described as follows are only for illustrating embodiments or examples of the present invention, and associated drawings can be realized accordingly without creative efforts, in which:
[0011] FIG. 1 is a displayed frame showing a data analysis interface according to prior art;
[0012] FIG. 2 is a flowchart illustrating a multidimensional data processing method according to an embodiment of the present invention;
[0013] FIG. 3 is a flowchart illustrating a multidimensional data processing method according to another embodiment of the present invention;
[0014] FIG. 4 is a scheme illustrating mapping relationship between fields in the account data and measures and dimensions according to an embodiment of the present invention;
[0015] FIG. 5 is a scheme illustrating the data change in a multidimensional data processing method according to an embodiment of the present invention;
[0016] FIG. 6 is a recursive topologic scheme according to an embodiment of the present invention;
[0017] FIG. 7 is another recursive topologic scheme according to an embodiment of the present invention;
[0018] FIG. 8 is a schematic block diagram of a multidimensional data processing device according to an embodiment of the present invention; and
[0019] FIG. 9 is a schematic block diagram of a multidimensional data processing device according to another embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0020] Hereinafter, with reference to the drawings, the technical solutions according to embodiments of the present invention will be described in a clear and complete way. However, it is to be noted that the embodiments are only described as examples instead of covering all the possible embodiments. Based on the following embodiments, those ordinary in the art may realize alternative embodiments and examples without creative efforts, which are within the scope of the present invention.
[0021] In order to make the advantages of the technical solutions according to the present invention understood in a better way, descriptions of embodiments according to the present invention are given as follows with reference to the drawings.
[0022] As shown in FIG. 2, a multidimensional data processing method according to an embodiment of the present invention comprises the following steps.
[0023] In Step 101, information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes are realized from a database.
[0024] The step further obtains account data of a data business, which is day-to-day account data recorded in a database when the data business, e.g. a website, an application program or an online game, is used. Each dimension in the information of dimensions indicates a selected view for data analysis, and for example, can be regions or ages of users of an application program at issue. The information of attributes in each of the dimensions includes attributes of different sizes in the same dimension, e.g. day, week, month or year in a time dimension. The information of layer correlation of the attributes includes various kinds of layer correlation of the attributes, e.g. 7 days being equal to 1 week, 12 months being equal to 1 year in a time dimension, or several city regions being included in one province region which is further included in a country in a region dimension.
[0025] In Step 102, a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes is realized.
[0026] The finest attribute is the attribute of the smallest size among the information of attributes in the same dimension. For example, in a time dimension, the attributes included in the information of attributes include year, month, day and hour, and then the attribute "hour" is the finest one of the attributes in the time dimension.
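As a minimal sketch of Step 102, assume the layer correlation of one dimension is available as a mapping from each attribute to the next coarser attribute. The mapping `TIME_LAYERS` and the function name `finest_attribute` are illustrative and not part of the embodiment. Under that assumption, the finest attribute is the one that no other attribute rolls up into:

```python
# Hypothetical layer correlation for a time dimension: each attribute maps to
# the next coarser attribute (None marks the coarsest attribute).
TIME_LAYERS = {"hour": "day", "day": "month", "month": "year", "year": None}


def finest_attribute(layers):
    """Return the attribute that no other attribute in the dimension rolls up into."""
    coarser = {parent for parent in layers.values() if parent is not None}
    candidates = [attr for attr in layers if attr not in coarser]
    if len(candidates) != 1:
        raise ValueError("expected exactly one finest attribute in the dimension")
    return candidates[0]


print(finest_attribute(TIME_LAYERS))
```

For the time dimension of the example above, the function selects "hour".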
[0027] In Step 103, at least one top-layer set of attributes is generated according to the finest one of the attributes in each of the dimensions.
[0028] After one or more finest attributes are acquired in Step 102, the one or more finest attributes constitute the at least one top-layer set of attributes. The top-layer set of attributes is directly acquired from the account data. For example, the acquired finest attributes are city, age and primary source, and thus the top-layer set of attributes is constituted by city, age and primary source. The primary source belongs to a source dimension, and the source dimension may include a primary source and a secondary source, wherein the primary source may be real data source, e.g. a real website. Then the secondary source may be a set of the websites. For example, the websites belong to a social network.
[0029] In Step 104, a recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes.
[0030] For multidimensional data processing, roll-up of a set of attributes is generally conducted to generate a next-layer set of attributes. The roll-up operation may be performed in two ways. The first one is to remove one of the attributes in the set of attributes so as to obtain the next-layer set of attributes. For example, in a set of attributes containing city, age and primary source, the primary source may be removed in the roll-up operation to form the next-layer set of attributes containing city and age. The other way is to enlarge the size of the attributes in the set of attributes according to the layer correlation of the attributes. For example, in a set of attributes containing city, age and primary source, since city belongs to a region dimension which further includes attributes of country and province, the attribute "city" in the set of attributes may be replaced by "province" for roll-up, thereby enlarging the size of attributes and generating another set of attributes containing province, age and primary source. After roll-up operations in either of the above-mentioned ways are repeated, a general attribute is eventually formed, which places no limitation on the set of attributes.
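The two ways of rolling up can be sketched as follows. This is an illustration only: the function names and the mapping `REGION_COARSER` are assumptions introduced here, and a set of attributes is represented as a plain tuple of attribute names.

```python
# Hypothetical layer correlation of the region dimension (finer -> coarser).
REGION_COARSER = {"city": "province", "province": "country"}


def roll_up_by_removal(attrs, attr):
    """First way: remove one attribute from the set of attributes."""
    return tuple(a for a in attrs if a != attr)


def roll_up_by_enlarging(attrs, attr, coarser):
    """Second way: replace one attribute by the next coarser attribute."""
    if attr not in coarser:
        raise KeyError(f"no coarser attribute known for {attr!r}")
    return tuple(coarser[a] if a == attr else a for a in attrs)


top = ("city", "age", "primary source")
print(roll_up_by_removal(top, "primary source"))
print(roll_up_by_enlarging(top, "city", REGION_COARSER))
```

The first call yields the set containing city and age, and the second yields the set containing province, age and primary source, matching the two examples in the paragraph above.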
[0031] In Step 105, a search request is received, and measure data corresponding to an attribute set associated with the search request is recursed according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
[0032] After the recursive topological structure is built, measure data corresponding to a part of the set of attributes in the recursive topological structure is acquired first. The search request is accompanied by an attribute set to be analyzed. After the search request is received, measure data corresponding to the attribute set associated with the search request can be recursed according to the recursive paths and the measure data corresponding to the previously acquired attribute set. For example, data of a registration number corresponding to an attribute set containing city, age and primary source is previously acquired. Then according to the recursive paths, it is realized that the next-layer set of attributes is an attribute set containing city and age, so data of a registration number corresponding to the next-layer set of attributes containing city and age is acquired.
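A minimal sketch of this recursion for a measure that can be accumulated directly, such as a registration count, is given below. The counts, the source names "siteA" and "siteB", and the function name `recurse_by_removal` are invented for illustration. Measure data is assumed to be stored as a mapping from a tuple of attribute values to a count.

```python
from collections import defaultdict


def recurse_by_removal(attrs, data, removed):
    """Derive the measure data of the next-layer set obtained by removing one attribute."""
    i = attrs.index(removed)
    rolled = defaultdict(int)
    for key, count in data.items():
        # Drop the value of the removed attribute and accumulate the counts.
        rolled[key[:i] + key[i + 1:]] += count
    return attrs[:i] + attrs[i + 1:], dict(rolled)


attrs = ("city", "age", "primary source")
# Invented registration counts, previously acquired for the finer attribute set.
data = {
    ("Shenzhen", 25, "siteA"): 3,
    ("Shenzhen", 25, "siteB"): 2,
    ("Guangzhou", 30, "siteA"): 4,
}
next_attrs, next_data = recurse_by_removal(attrs, data, "primary source")
print(next_attrs, next_data)
```

The next-layer data is computed from the previously acquired data alone, without reading the account data again. Plain summation of this kind is only valid for measures that can be accumulated directly.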
[0033] It is to be noted that the host to execute the multidimensional data processing method provided according to embodiments of the present invention can be a multidimensional data processing device. The multidimensional data processing device can be implemented in, but not limited to, electronic equipment such as a computer or an internet server.
[0034] In the multidimensional data processing method provided according to embodiments of the present invention, a recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. Therefore, when a search request is received, measure data corresponding to an attribute set associated with the search request can be recursed according to the recursive paths and a specified measure data corresponding to a previously specified attribute set. In contrast, according to prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated. In the present invention, measure data corresponding to unknown attribute sets can be realized according to the recursive topological structure and a specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
[0035] Hereinafter, another embodiment will be described in more detail with reference to FIG. 3. The embodiment of a multidimensional data processing method provided according to the present invention includes the following steps.
[0036] In Step 201, account data of a data business, information of dimensions in the data business, information of attributes in each of the dimensions and information of layer correlation of the attributes are acquired.
[0037] The account data of a data business is day-to-day account data recorded in a database when the data business, e.g. a website, an application program or an online game, is used. The account data may contain a user identity, a registration website, and an operation being performed. Each dimension in the information of dimensions indicates a selected view for data analysis, and for example, can be regions or ages of users of an application program at issue. The information of attributes in each of the dimensions includes attributes of different sizes in the same dimension, e.g. day, week, month or year in a time dimension. The information of layer correlation of the attributes includes various kinds of layer correlation of the attributes, e.g. 7 days being equal to 1 week, 12 months being equal to 1 year in a time dimension, or several city regions being included in one province region which is further included in a country in a region dimension.
[0038] After the account data is acquired, it is necessary to establish mapping relationship between fields in the account data and measures and dimensions. For example, in the account data, a specified field included in the user identity may correspond to the dimension of region or the dimension of age. In another example, a specified field included in the website may correspond to the dimension of source. Examples are not limited to the above. FIG. 4 schematically illustrates an example of mapping relationship between fields in the account data and measures and dimensions. As shown, the user identity corresponds to the dimension of region and the dimension of age. The dimension of region includes attributes of city, city level, province, and country. The dimension of age includes age and age division. Furthermore, the website corresponds to the dimension of source. The dimension of source includes primary source and secondary source, wherein the primary source may be real data source, e.g. a real website, and the secondary source may be a set of the websites. For example, the websites belong to a social network.
[0039] In Step 202, a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes is realized.
[0040] The finest attribute is the attribute of the smallest size among the information of attributes in the same dimension. For example, in a time dimension, the attributes included in the information of attributes include year, month, day and hour, and then the attribute "hour" is the finest one of the attributes in the time dimension.
[0041] In Step 203, at least one top-layer set of attributes is generated according to the finest one of the attributes in each of the dimensions.
[0042] After one or more finest attributes are acquired in Step 202, the one or more finest attributes constitute the at least one top-layer set of attributes. The top-layer set of attributes is directly acquired from the account data. For example, the acquired finest attributes are city, age and primary source, and thus the top-layer set of attributes is constituted by city, age and primary source.
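Steps 201 through 203 can be sketched with a small catalogue of dimensions. The dictionary names are illustrative. Each attribute list is assumed to be ordered from finest to coarsest, and the position of "city level" within the region dimension is itself an assumption, since the description does not state it.

```python
# Dimensions and their attributes as listed for FIG. 4, assumed to be ordered
# from finest to coarsest within each dimension.
DIMENSIONS = {
    "region": ["city", "city level", "province", "country"],
    "age": ["age", "age division"],
    "source": ["primary source", "secondary source"],
}

# Mapping between fields in the account data and dimensions, as in FIG. 4.
FIELD_TO_DIMENSIONS = {
    "user identity": ["region", "age"],
    "website": ["source"],
}


def top_layer_set(dimensions):
    """Combine the finest attribute of every dimension into the top-layer set."""
    return tuple(attrs[0] for attrs in dimensions.values())


print(top_layer_set(DIMENSIONS))
```

With this catalogue the top-layer set consists of city, age and primary source, as in the example above.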
[0043] In Step 204, each set of attributes is traversed, and whether the set of attributes to be rolled up are general attributes or not is determined.
[0044] Step 205 is executed if the set of attributes are determined to be general attributes.
[0045] Step 206 is executed if the set of attributes are determined to be non-general attributes.
[0046] The set of attributes being traversed is not limited to the top-layer set of attributes. After Step 201 through Step 204 is performed, a variety of sets of attributes can be sequentially generated. Therefore, it is necessary to traverse the sets of attributes so as to gradually form a recursive topologic structure. A general attribute is formed as a number of the attributes contained in the set of attributes is getting less and less and the attribute size becomes larger and larger with the stepwise roll up of the set of attributes, and finally no limitation for a set of attributes exists. For example, in a case that a registration number of a certain website is to be realized, the data of registration numbers in each city and at each age are realized if the set of attributes contains attributes of city and age. As a result of roll up, there will be no dimension of region and dimension of age in the final general attribute. Accordingly, the registration number will be the total registration number of the website.
[0047] In Step 205, go back to Step 204 if it is determined that no further roll-up of the set of attributes is available.
[0048] If it is determined that the set of attributes are general attributes, none of the attributes in the set of attributes can be removed, and the attribute size in the set of attributes cannot be enlarged. Then, no roll-up can be done.
[0049] In Step 206, each attribute in the set of attributes to be rolled up is traversed.
[0050] In Step 207, whether a measure corresponding to the set of attributes needs to undergo universal duplication removal is determined.
[0051] If it is determined that the measure corresponding to the set of attributes needs to undergo universal duplication removal, execute Step 208.
[0052] If it is determined that the measure corresponding to the set of attributes does not need to undergo universal duplication removal, execute Step 209.
[0053] The measure corresponding to the set of attributes is a measure analyzed according to the requirement of the set of attributes, and in general, can be acquired by analyzing the account data. For example, as illustrated in FIG. 5, identities of users and the websites visited by each of the users are recorded in the account data. Therefore, by analyzing the account data, the measure to be realized may be the registration count or registration user number. It is to be noted that some measure cannot be obtained by simply summing the items included in the account data. For example, for the measure of registration user number, it is probable that more than one time of registration is conducted by the same user ID. Since it is allowable to accumulate only one to the registration user number, it is necessary to undergo universal duplication removal.
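The difference between the two measures can be sketched as follows. The user IDs are those mentioned for FIG. 5, while the website names and the function names are invented for illustration.

```python
# Account data records as (user ID, website) pairs; website names are invented.
records = [
    ("250708", "siteA"),
    ("250708", "siteB"),
    ("347516", "siteA"),
]


def registration_count(rows):
    """Every registration is counted, so the items can simply be summed."""
    return len(rows)


def registration_user_number(rows):
    """Each user ID is counted once, which requires duplication removal."""
    return len({user_id for user_id, _ in rows})


print(registration_count(records), registration_user_number(records))
```

The user 250708 registers twice and therefore contributes two to the registration count but only one to the registration user number.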
[0054] In Step 208, whether an attribute in the set of attributes complies with a recursive condition is determined.
[0055] If it is determined that the attribute complies with the recursive condition, execute Step 209.
[0056] If it is determined that the attribute does not comply with the recursive condition, execute Step 210.
[0057] In practice, that the attribute complies with the recursive condition means the element of account data, on which the measure to undergo universal duplication removal relies, has one and only one attribute value regarding the attribute. For example, the element of account data, on which the above-mentioned measure of registration user number relies, is the user ID. In other words, by analyzing identities of users appearing in the account data, the registration user number can be determined. If the attribute is city, a user ID can only correspond to one city. For example, as shown in FIG. 5, the user ID 250708 specifically corresponds to the city "Shenzhen", and the user ID 347516 specifically corresponds to the city "Guangzhou", so the attribute of city complies with the recursive condition. Otherwise, if the element of account data, on which the measure to undergo universal duplication removal relies, does not have the only attribute value regarding the attribute, the recursive condition is not complied with.
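One possible way to test the recursive condition against sample account data is sketched below. The record layout, the field names and the source names are assumptions made for illustration. The check simply verifies that each user ID is associated with a single value of the attribute.

```python
def complies_with_recursive_condition(records, key_field, attribute):
    """Return True if every key value maps to exactly one value of the attribute."""
    seen = {}
    for record in records:
        key, value = record[key_field], record[attribute]
        # setdefault stores the first value seen for the key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Invented account data; user IDs and cities follow the FIG. 5 example.
records = [
    {"user_id": "250708", "city": "Shenzhen", "primary_source": "siteA"},
    {"user_id": "250708", "city": "Shenzhen", "primary_source": "siteB"},
    {"user_id": "347516", "city": "Guangzhou", "primary_source": "siteA"},
]

print(complies_with_recursive_condition(records, "user_id", "city"))
print(complies_with_recursive_condition(records, "user_id", "primary_source"))
```

In this sample the attribute of city complies with the condition, whereas the primary source does not, because the user 250708 appears under two different sources.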
[0058] In Step 209, whether a father attribute among the attributes has a child attribute is determined according to the layer correlation of the attributes.
[0059] If it is determined that the father attribute has a child attribute, execute Step 211.
[0060] If it is determined that the father attribute has no child attribute, execute Step 212.
[0061] For example, in the region dimension, a country includes a plurality of provinces, and a province includes a plurality of cities. Therefore, "city" is a father attribute, "province" is a son attribute of the attribute "city", and "country" is a son attribute of the attribute "province". If the father attribute is "country", which is the coarsest one of the attributes in the layer correlation of attributes, the father attribute has no son attribute.
[0062] In Step 210, it is determined that no roll up of the set of attributes is performed according to the attributes.
[0063] Go back to Step 206 after Step 210.
[0064] Herein, if an attribute in the set of attributes is determined not to comply with the recursive condition, no roll up will be performed according to the attribute. This is because if the element of account data, on which the measure to undergo universal duplication removal relies, does not have the only attribute value regarding the attribute, the removal of the attribute to form the next-layer set of attributes may cause the measure data corresponding to the next-layer set of attributes to be unable to undergo the duplication removal and accumulation.
[0065] In Step 211, a first strategy is determined as the roll-up strategy.
[0066] The first strategy is adopted to replace the father attribute with the son attribute, and the son attribute constitutes, along with the other attributes in the set of attributes, the next-layer set of attributes. Subsequently, execute Step 213.
[0067] In Step 212, a second strategy is determined as the roll-up strategy.
[0068] The second strategy is adopted to delete the father attribute, and the next-layer set of attributes is generated with the other attributes in the set of attributes. Subsequently, execute Step 214.
[0069] In Step 213, the father attribute is replaced by the son attribute, and the son attribute and the other attributes in the set of attributes form the next-layer set of attributes. After Step 213, go back to Step 204.
[0070] In Step 214, the father attribute is deleted, and the other attributes in the set of attributes form the next-layer set of attributes. After Step 214, go back to Step 204.
[0071] For example, as shown in FIG. 5, the top-layer set of attributes consists of attributes of city, age and primary source. The registration numbers of the three attributes are acquired. After the first roll up, the attribute of primary source is deleted, and the attributes of city and age form a new set of attributes. The registration numbers of the attributes of city and age are acquired. After the second roll up, the attribute "city" is replaced by the attribute "province", and the attributes of province and age form a new set of attributes. The registration numbers of the attributes of province and age are acquired.
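The decision flow of Steps 207 through 214 for a single attribute can be sketched as below. This follows the steps literally and is an illustration rather than the implementation of the embodiment. The function name, the mapping `SON_OF` and the `complies` callback are assumptions introduced here.

```python
# Hypothetical layer correlation: father attribute -> son (coarser) attribute.
SON_OF = {"city": "province"}


def roll_up_for_attribute(attrs, attr, son_of, needs_dedup, complies):
    """Return the next-layer set obtained by rolling up on attr, or None."""
    # Steps 207, 208 and 210: a measure needing duplication removal is rolled
    # up only on attributes that comply with the recursive condition.
    if needs_dedup and not complies(attr):
        return None
    # Step 209: check whether the father attribute has a son attribute.
    if attr in son_of:
        # Steps 211 and 213 (first strategy): the son replaces the father.
        return tuple(son_of[a] if a == attr else a for a in attrs)
    # Steps 212 and 214 (second strategy): the father attribute is deleted.
    return tuple(a for a in attrs if a != attr)


top = ("city", "age", "primary source")
first = roll_up_for_attribute(top, "primary source", SON_OF, False, lambda a: True)
second = roll_up_for_attribute(first, "city", SON_OF, False, lambda a: True)
print(first, second)
```

The two calls reproduce the example above: the primary source is deleted first, and the attribute "city" is then replaced by "province".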
[0072] By way of Steps 201-214, a recursive topologic structure is finally acquired. The recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. In practice, a recursive topologic structure can be generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes at a plurality of nodes respectively corresponding to the at least one top-layer set of attributes.
[0073] After a subsequent search request is received, measure data corresponding to an attribute set associated with the search request is recursed according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
[0074] FIG. 6 schematically exemplifies a recursive topologic scheme, which can be generated by directly accumulating measures. The accumulative measures are registration numbers. The generated top-layer set of attributes in the recursive topologic structure is determined based on identities and registration websites of users. The dimension to be analyzed, for example, can be a dimension of region, including city (hereinafter, A1) and province (hereinafter, A2), and/or a dimension of age, including age number (hereinafter, B1), and/or a dimension of source, including primary source (hereinafter, C1). During the generation of the recursive topologic structure, if the same next-layer set of attributes is acquired from different sets of attributes, the next-layer set of attributes needs to undergo duplication removal. In practice, the top-layer set of attributes, for example, is A1B1C1. By way of a roll up operation, sets of attributes, e.g. A2B1C1, A1B1, A1C1 and B1C1, can be acquired. Then the set A2B1C1 is further rolled up to acquire sets B1C1, A2B1 and A2C1. Likewise, by further rolling up the set A1B1, sets A2B1, A1 and B1 can be acquired. By rolling up the set A1C1, sets A2C1, A1 and C1 can be acquired. By rolling up the set B1C1, sets B1 and C1 can be acquired. Afterwards, the further roll up of the set A2B1 or A2C1 results in A2, and the roll up of A1 results in A2 or general attribute ALL. The roll up of B1 results in ALL, and the roll up of C1 results in ALL. The subsequent roll up of A2 also results in ALL. Accordingly, if measure data of some sets of attributes are acquired previously, for example the measure data corresponding to A1B1 is previously acquired, the measure data corresponding to A1 and B1 can be directly recursed from the measure data of A1B1.
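The generation of the structure enumerated above can be sketched as a breadth-first traversal with duplication removal. The sketch assumes, as the enumeration for FIG. 6 suggests, that an attribute may be either removed or replaced by its coarser attribute in one roll up. The names `COARSER`, `roll_up_once` and `build_topology` are illustrative, and the general attribute ALL is represented by the empty set.

```python
from collections import deque

# Layer correlation used in the example: city (A1) rolls up to province (A2).
COARSER = {"A1": "A2"}


def roll_up_once(attrs):
    """Return every next-layer set reachable from attrs by one roll up."""
    results = set()
    for attr in attrs:
        rest = attrs - {attr}
        results.add(rest)  # remove the attribute
        if attr in COARSER:
            results.add(rest | {COARSER[attr]})  # enlarge the attribute
    return results


def build_topology(top):
    """Map each set of attributes to its next-layer sets (the recursive paths)."""
    paths = {}
    queue = deque([top])
    while queue:
        node = queue.popleft()
        if node in paths:
            continue  # duplication removal: the set was already generated
        paths[node] = roll_up_once(node)
        queue.extend(paths[node])
    return paths


topology = build_topology(frozenset({"A1", "B1", "C1"}))
print(len(topology))
```

Under these assumptions the traversal produces the twelve sets named in the paragraph above, from A1B1C1 down to ALL, and each set is stored once even when it is reached along several recursive paths.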
[0075] FIG. 7 schematically exemplifies a recursive topologic scheme, which is generated by accumulating measures in need of universal duplication removal. The generated top-layer set of attributes in the recursive topologic structure is determined based on identities and registration websites of users. The dimension to be analyzed, for example, can be a dimension of region, including city (hereinafter, A1) and province (hereinafter, A2), and/or a dimension of age, including age number (hereinafter, B1), and/or a dimension of source, including primary source (hereinafter, C1). The registration numbers associated with the primary source cannot be simply accumulated because the same user ID may be used for registration of websites belonging to different attributes of primary source. In other words, a user ID under a specified primary source attribute may also correspond to other primary source attributes. Therefore, the attribute C1 cannot be used to generate the recursive topologic structure. During the generation of the recursive topologic structure, if the same next-layer set of attributes is acquired from different sets of attributes, the next-layer set of attributes needs to undergo duplication removal. In practice, the top-layer set of attributes, for example, is A1B1C1. By way of a roll up operation, sets of attributes, e.g. A2B1C1, A1C1 and B1C1, can be acquired. Then the set A2B1C1 is further rolled up to acquire sets B1C1 and A2C1. Since the attribute C1 cannot be used to generate the recursive topologic structure, the set of attributes A1B1 needs to be acquired from the account data. By rolling up the set A1C1, sets A2C1 and C1 can be acquired. By rolling up the set B1C1, set C1 can be acquired. Afterwards, the further roll up of the set A2B1 results in A2, and the roll up of A1 results in A2 or general attribute ALL. The roll up of B1 results in ALL, and the roll up of C1 results in ALL. The subsequent roll up of A2 also results in ALL. Accordingly, if measure data of some sets of attributes are acquired previously, for example the measure data corresponding to A1B1 is previously acquired, the measure data corresponding to A1 and B1 can be directly recursed from the measure data of A1B1. As shown in FIG. 7, A1B1C1 and A1B1 are two top-layer sets of attributes, and recursive topologic structures of A1B1C1 and A1B1 are generated at two nodes, respectively. The two recursive topologic structures are separately maintained at the two nodes. Depending on the attributes contained in the top-layer set of attributes, the same recursive topologic sub-tree may also be split across a plurality of nodes for maintenance.
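The reason why a measure in need of universal duplication removal cannot be accumulated across the primary source may be illustrated with a minimal Python sketch. The records, the city name and the source names below are invented for illustration, and `distinct_users` is a hypothetical helper rather than part of the disclosed method.

```python
# Hypothetical registration records: (user_id, city, primary_source).
registrations = [
    ("u1", "Shenzhen", "search"),
    ("u1", "Shenzhen", "portal"),  # the same user registers via a second source
    ("u2", "Shenzhen", "search"),
]

def distinct_users(records, key):
    """Count distinct user IDs per group, where key(record) gives the group."""
    groups = {}
    for record in records:
        groups.setdefault(key(record), set()).add(record[0])
    return {group: len(users) for group, users in groups.items()}

by_city_source = distinct_users(registrations, key=lambda r: (r[1], r[2]))
by_city = distinct_users(registrations, key=lambda r: r[1])

# Summing the per-source counts over the primary source counts u1 twice,
# so the sum differs from the deduplicated count for the city.
summed = sum(n for (city, _), n in by_city_source.items() if city == "Shenzhen")
print(summed, by_city["Shenzhen"])
```

Because the summed value exceeds the true distinct count, the set without the primary source has to be computed from the account data instead of being recursed from a set that contains it.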
[0076] It is to be noted that a main body executing the multidimensional data processing method provided according to another embodiment of the present invention can be a multidimensional data processing device. The multidimensional data processing device can be operated in, but is not limited to, electronic equipment such as a computer or a network server.
[0077] In the multidimensional data processing method provided according to another embodiment of the present invention, a recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. Therefore, when a search request is received, measure data corresponding to an attribute set associated with the search request can be recursed according to the recursive paths and specified measure data corresponding to a previously specified attribute set. In contrast, according to the prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated. In the present invention, measure data corresponding to unknown attribute sets can be obtained according to the recursive topologic structure and specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
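As a minimal sketch of how measure data might be recursed along a recursive path, the following Python example aggregates previously computed, directly accumulable measure data for the attribute set A1B1 onto the coarser sets A1, B1 and A2. The figures, the city-to-province mapping and the function name `recurse` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical layer correlation for the region dimension: city -> province.
CITY_TO_PROVINCE = {
    "Shenzhen": "Guangdong",
    "Guangzhou": "Guangdong",
    "Wuhan": "Hubei",
}

# Previously computed measure data for the attribute set A1B1 (city, age).
measure_a1b1 = {
    ("Shenzhen", 20): 5,
    ("Shenzhen", 30): 7,
    ("Guangzhou", 20): 3,
    ("Wuhan", 30): 4,
}

def recurse(measure, project):
    """Aggregate an additive measure onto a coarser attribute set.

    project maps each key of the finer set to its key in the coarser set.
    """
    result = defaultdict(int)
    for key, value in measure.items():
        result[project(key)] += value
    return dict(result)

measure_a1 = recurse(measure_a1b1, lambda k: k[0])      # A1B1 -> A1
measure_b1 = recurse(measure_a1b1, lambda k: k[1])      # A1B1 -> B1
measure_a2 = recurse(measure_a1, CITY_TO_PROVINCE.get)  # A1 -> A2
print(measure_a1, measure_b1, measure_a2)
```

None of the three derived results touches the account database; each one is computed from measure data that is already available, which is the simplification the paragraph above describes.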
[0078] A multidimensional data processing device provided according to an embodiment of the present invention is shown in FIG. 8. The device corresponds to the embodiment of the multidimensional data processing method illustrated in FIG. 2 and FIG. 3, and includes the following units.
[0079] An acquisition unit 31 acquires information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes from a data business.
[0080] The acquisition unit 31 further realizes a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes.
[0081] A generation unit 32 generates at least one top-layer set of attributes according to the finest one of the attributes in each of the dimensions.
[0082] The generation unit 32 further generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes. The recursive topologic structure includes a set of attributes and recursive paths among the set of attributes.
[0083] A recursion unit 33 receives a search request, and recurses measure data corresponding to an attribute set associated with the search request according to the recursive paths and specified measure data corresponding to a previously specified attribute set.
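The work of the acquisition unit and the first step of the generation unit can be sketched as follows, purely as an illustration. The sketch assumes that the layer correlation of each dimension is given as (finer, coarser) pairs forming a single chain; the structure `DIMENSIONS` and the functions `finest_attribute` and `top_layer_set` are hypothetical names.

```python
# Hypothetical metadata: for each dimension, its attributes and the layer
# correlation expressed as (finer, coarser) pairs.
DIMENSIONS = {
    "region": {"attributes": ["A2", "A1"], "layers": [("A1", "A2")]},
    "age": {"attributes": ["B1"], "layers": []},
    "source": {"attributes": ["C1"], "layers": []},
}

def finest_attribute(dimension):
    """Return the attribute that no other attribute rolls up into."""
    coarser = {c for _, c in dimension["layers"]}
    finest = [a for a in dimension["attributes"] if a not in coarser]
    if len(finest) != 1:
        raise ValueError("expected exactly one finest attribute per dimension")
    return finest[0]

def top_layer_set(dimensions):
    """Combine the finest attribute of every dimension into one set."""
    return frozenset(finest_attribute(d) for d in dimensions.values())

top = top_layer_set(DIMENSIONS)
print(sorted(top))
```

A single top-layer set is produced here; where a measure needs universal duplication removal, additional top-layer sets may be required, as in the FIG. 7 example.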
[0084] In practice, referring to FIG. 9, the generation unit 32 includes:
a discrimination module 321 determining whether the set of attributes are general attributes or not; and
a generation module 322 conducting roll-up of the set of attributes to generate a next-layer set of attributes if the set of attributes are not general attributes.
[0085] In practice, referring to FIG. 9, the generation module 322 includes:
a traverse sub-module 3221 traversing each attribute in each set of attributes;
an acquisition sub-module 3222 realizing each condition to be complied with for the each attribute;
a determination sub-module 3223 determining a roll-up strategy of the set of attributes according to the each condition to be complied with for the each attribute; and
a generation sub-module 3224 conducting roll-up of the set of attributes to generate the next-layer set of attributes according to the roll-up strategy.
[0086] In practice, the acquisition sub-module 3222 further determines whether the measure corresponding to the set of attributes needs to undergo universal duplication removal.
[0087] Furthermore, as shown in FIG. 9, the acquisition sub-module 3222 further determines whether an attribute in the set of attributes complies with a recursive condition or not if it is determined that the measure corresponding to the set of attributes needs to undergo universal duplication removal.
[0088] In practice, referring to FIG. 9, the determination sub-module 3223 further determines not to conduct roll-up of the set of attributes according to the attributes if it is determined that no attribute in the set of attributes complies with the recursive condition.
[0089] Furthermore, referring to FIG. 9, the acquisition sub-module 3222 further determines whether a father attribute among the attributes has a child attribute according to the layer correlation of the attributes if it is determined no measure needs to undergo universal duplication removal or it is determined an attribute in the set of attributes complies with the recursive condition.
[0090] Furthermore, referring to FIG. 9, the determination sub-module 3223 further determines a first strategy as the roll-up strategy if it is determined that a father attribute has a child attribute, wherein according to the first strategy, the child attribute replaces the father attribute to constitute, along with the other attributes in the set of attributes, the next-layer set of attributes, and determines a second strategy as the roll-up strategy if it is determined that the father attribute has no child attribute, wherein according to the second strategy, the father attribute is deleted, and the next-layer set of attributes is generated with the other attributes in the set of attributes.
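The condition checks and the two roll-up strategies described for the sub-modules can be sketched in Python as follows. This is a simplified reading rather than the disclosed implementation: the recursive condition is modelled, as an assumption, by membership in a set `recursive_ok`, and the mapping `coarser` gives the child attribute that replaces a father attribute. All names are hypothetical.

```python
def next_layer_sets(attr_set, coarser, needs_dedup=False, recursive_ok=frozenset()):
    """Apply the roll-up strategy to each attribute of attr_set.

    coarser maps a father attribute to the child attribute that replaces it.
    needs_dedup states whether the measure needs universal duplication removal.
    recursive_ok holds the attributes assumed to meet the recursive condition.
    """
    attr_set = frozenset(attr_set)
    results = set()
    for attr in attr_set:
        # With a deduplicated measure, only attributes meeting the recursive
        # condition may be rolled up.
        if needs_dedup and attr not in recursive_ok:
            continue
        rest = attr_set - {attr}
        if attr in coarser:
            results.add(rest | {coarser[attr]})  # first strategy: replace
        else:
            results.add(rest)                    # second strategy: delete
    return results

plain = next_layer_sets({"A1", "B1", "C1"}, {"A1": "A2"})
dedup = next_layer_sets({"A1", "B1", "C1"}, {"A1": "A2"},
                        needs_dedup=True, recursive_ok={"A1", "B1"})
blocked = next_layer_sets({"A1", "B1", "C1"}, {"A1": "A2"}, needs_dedup=True)
print(len(plain), len(dedup), len(blocked))
```

When no attribute meets the recursive condition, the function returns an empty result, which corresponds to the decision not to conduct roll-up of the set of attributes.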
[0091] In practice, referring to FIG. 9, the generation unit 32 generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes at a plurality of nodes respectively corresponding to the at least one top-layer set of attributes.
[0092] In the multidimensional data processing device provided according to an embodiment of the present invention, a recursive topologic structure is generated according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, wherein the recursive topologic structure includes a set of attributes and recursive paths among the set of attributes. Therefore, when a search request is received, measure data corresponding to an attribute set associated with the search request can be recursed according to the recursive paths and specified measure data corresponding to a previously specified attribute set. In contrast, according to the prior art, when measure data corresponding to an attribute set is required, it is necessary to acquire measure data associated with a variety of dimensional attribute sets from the account database, so the computation is relatively complicated. In the present invention, measure data corresponding to unknown attribute sets can be obtained according to the recursive topologic structure and specified measure data corresponding to a previously specified attribute set, thereby simplifying the computation.
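One possible way for a recursion unit to choose which previously specified attribute set to start from is sketched below. The sketch searches the recursive paths backwards from the requested set until it meets a set whose measure data is already available. The `PATHS` table is a hypothetical topology for a top-layer set A1B1, and `find_source` is an illustrative name, not a function defined in the disclosure.

```python
from collections import deque

# Hypothetical recursive paths for a top-layer set A1B1: each attribute set
# maps to the sets that can be derived from it by one roll-up.
PATHS = {
    "A1B1": ["A2B1", "A1", "B1"],
    "A2B1": ["A2", "B1"],
    "A1": ["A2", "ALL"],
    "B1": ["ALL"],
    "A2": ["ALL"],
    "ALL": [],
}

def find_source(requested, cached, paths=PATHS):
    """Return the shortest roll-up route from a cached set to requested.

    The route is a list starting at the cached set and ending at requested;
    None is returned when no cached set can reach the requested set.
    """
    # Reverse the edges so the search can walk from the request upwards.
    parents = {}
    for node, children in paths.items():
        for child in children:
            parents.setdefault(child, []).append(node)
    queue = deque([[requested]])
    seen = {requested}
    while queue:
        route = queue.popleft()
        if route[0] in cached:
            return route
        for parent in parents.get(route[0], []):
            if parent not in seen:
                seen.add(parent)
                queue.append([parent] + route)
    return None

print(find_source("A2", {"A1B1"}))
print(find_source("A2", {"A1B1", "A1"}))
```

A breadth-first search is used so that the nearest available set is preferred, which keeps the number of roll-up steps, and therefore the aggregation work, small.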
[0093] According to the above descriptions of embodiments, those skilled in the art are able to clearly understand that the invention can be implemented with software and essential hardware. Hardware alone may also be used to implement the invention, although the combination of software and hardware would be preferred. Accordingly, the technical solution according to the present invention, or the part contributing over the prior art, can essentially be embodied as a software product. The computer software product is stored in an accessible storage medium, e.g. a floppy disc, hard disc or optical disc of a computer, and includes a plurality of commands that cause a computer device, e.g. a personal computer, server or network equipment, to execute the methods according to the embodiments of the present invention.
[0094] The foregoing description of the embodiments has been provided for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure. Individual elements or features of a particular embodiment are generally not limited to that particular embodiment, but, where applicable, are interchangeable and can be used in a selected embodiment, even if not specifically shown or described. The same may also be varied in many ways. Such variations are not to be regarded as a departure from the disclosure, and all such modifications are intended to be included within the scope of the disclosure.

Claims

WHAT IS CLAIMED IS:
1. A multidimensional data processing method, comprising:
realizing information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes from a database;
realizing a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes;
generating at least one top-layer set of attributes according to the finest one of the attributes in each of the dimensions;
generating a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, the recursive topologic structure including a set of attributes and recursive paths among the set of attributes; and
receiving a search request, and recursing measure data corresponding to an attribute set associated with the search request according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
2. The multidimensional data processing method according to claim 1, wherein generating a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes includes:
determining whether the set of attributes are general attributes or not; and
conducting roll-up of the set of attributes to generate a next-layer set of attributes if the set of attributes are not general attributes.
3. The multidimensional data processing method according to claim 2, wherein conducting roll-up of the set of attributes to generate a next-layer set of attributes includes:
traversing each attribute in each set of attributes;
realizing each condition to be complied with for the each attribute;
determining a roll-up strategy of the set of attributes according to the each condition to be complied with for the each attribute; and
conducting roll-up of the set of attributes to generate the next-layer set of attributes according to the roll-up strategy.
4. The multidimensional data processing method according to claim 3, wherein realizing each condition to be complied with for the each attribute includes:
determining whether a measure corresponding to the set of attributes needs to undergo universal duplication removal.
5. The multidimensional data processing method according to claim 4, wherein realizing a condition to be complied with for the each attribute includes:
determining whether an attribute in the set of attributes complies with a recursive condition if it is determined the measure corresponding to the set of attributes needs to undergo universal duplication removal.
6. The multidimensional data processing method according to claim 5, wherein determining a roll-up strategy of the set of attributes according to the condition to be complied with for the each attribute includes:
determining not to conduct roll-up of the set of attributes if it is determined no attribute in the set of attributes complies with the recursive condition.
7. The multidimensional data processing method according to claim 5, wherein realizing each condition to be complied with for the each attribute includes:
determining whether a father attribute among the attributes has a child attribute according to the layer correlation of the attributes if it is determined no measure needs to undergo universal duplication removal or it is determined an attribute in the set of attributes complies with the recursive condition.
8. The multidimensional data processing method according to claim 6, wherein determining a roll-up strategy of the set of attributes according to the condition to be complied with for the each attribute further includes:
determining a first strategy as the roll-up strategy if it is determined a father attribute has a child attribute, wherein according to the first strategy, the child attribute replaces the father attribute to constitute, along with the other attributes in the set of attributes, the next-layer set of attributes; and
determining a second strategy as the roll-up strategy if it is determined the father attribute has no child attribute, wherein according to the second strategy, the father attribute is deleted, and generating the next-layer set of attributes with the other attributes in the set of attributes.
9. The multidimensional data processing method according to any of claims 1-8, wherein generating a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes includes:
generating a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes at a plurality of nodes respectively corresponding to the at least one top-layer set of attributes.
10. A multidimensional data processing device, comprising:
an acquisition unit realizing information of dimensions, information of attributes in each of the dimensions and information of layer correlation of the attributes from a database;
wherein the acquisition unit further realizes a finest one of the attributes in each of the dimensions according to the information of dimensions, information of attributes in each of the dimensions and information of layers of the attributes;
a generation unit generating at least one top-layer set of attributes according to the finest one of the attributes in each of the dimensions;
wherein the generation unit further generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes, the recursive topologic structure including a set of attributes and recursive paths among the set of attributes; and
a recursion unit receiving a search request, and recursing measure data corresponding to an attribute set associated with the search request according to the recursive paths and a specified measure data corresponding to a previously specified attribute set.
11. The multidimensional data processing device according to claim 10, wherein the generation unit includes:
a discrimination module determining whether the set of attributes are general attributes or not; and
a generation module conducting roll-up of the set of attributes to generate a next-layer set of attributes if the set of attributes are not general attributes.
12. The multidimensional data processing device according to claim 11, wherein the generation module includes:
a traverse sub-module traversing each attribute in each set of attributes;
an acquisition sub-module realizing each condition to be complied with for the each attribute;
a determination sub-module determining a roll-up strategy of the set of attributes according to the each condition to be complied with for the each attribute; and
a generation sub-module conducting roll-up of the set of attributes to generate the next-layer set of attributes according to the roll-up strategy.
13. The multidimensional data processing device according to claim 12, wherein the acquisition sub-module determines whether a measure corresponding to the set of attributes needs to undergo universal duplication removal.
14. The multidimensional data processing device according to claim 13, wherein the acquisition sub-module further determines whether an attribute in the set of attributes complies with a recursive condition if it is determined the measure corresponding to the set of attributes needs to undergo universal duplication removal.
15. The multidimensional data processing device according to claim 14, wherein the determination sub-module determines not to conduct roll-up of the set of attributes if it is determined no attribute in the set of attributes complies with the recursive condition.
16. The multidimensional data processing device according to claim 14, wherein the acquisition sub-module further determines whether a father attribute among the attributes has a child attribute according to the layer correlation of the attributes if it is determined no measure needs to undergo universal duplication removal or it is determined an attribute in the set of attributes complies with the recursive condition.
17. The multidimensional data processing device according to claim 15, wherein the determination sub-module further determines a first strategy as the roll-up strategy if it is determined a father attribute has a child attribute, wherein according to the first strategy, the child attribute replaces the father attribute to constitute, along with the other attributes in the set of attributes, the next-layer set of attributes, and determines a second strategy as the roll-up strategy if it is determined the father attribute has no child attribute, wherein according to the second strategy, the father attribute is deleted, and generating the next-layer set of attributes with the other attributes in the set of attributes.
18. The multidimensional data processing device according to any of claims 10-17, wherein the generation unit generates a recursive topologic structure according to the at least one top-layer set of attributes and the information of layer correlation of the attributes at a plurality of nodes respectively corresponding to the at least one top-layer set of attributes.
PCT/CN2014/084506 2013-08-26 2014-08-15 Multidimensional data processing method and device WO2015027831A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310376349.0 2013-08-26
CN201310376349.0A CN104424231B (en) 2013-08-26 2013-08-26 The processing method and processing device of multidimensional data

Publications (1)

Publication Number Publication Date
WO2015027831A1 true WO2015027831A1 (en) 2015-03-05

Family

ID=52585550

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/084506 WO2015027831A1 (en) 2013-08-26 2014-08-15 Multidimensional data processing method and device

Country Status (2)

Country Link
CN (1) CN104424231B (en)
WO (1) WO2015027831A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106445934A (en) * 2015-08-04 2017-02-22 北京奇虎科技有限公司 Data processing method and apparatus
CN106557498A (en) * 2015-09-25 2017-04-05 北京国双科技有限公司 Date storage method and device and data query method and apparatus
CN106157088A (en) * 2016-04-28 2016-11-23 美信网络技术有限公司 A kind of method recording Information Communication
CN107025542B (en) * 2016-10-27 2020-12-29 创新先进技术有限公司 Method and apparatus for providing integration capability of channel combination
CN110019425A (en) * 2017-08-22 2019-07-16 北京京东尚科信息技术有限公司 A kind of method and apparatus that data are shown
CN107527070B (en) * 2017-08-25 2020-03-24 南京小睿软件有限公司 Identification method of dimension data and index data, storage medium and server
CN107562893A (en) * 2017-09-06 2018-01-09 叶进蓉 A kind of multi-dimensional data duplicate removal method and system being used in network log file
CN110601866B (en) * 2018-06-13 2023-01-24 阿里巴巴集团控股有限公司 Flow analysis system, data acquisition device, data processing device and method
CN109710610B (en) * 2018-12-17 2020-12-01 北京三快在线科技有限公司 Data processing method and device and computing equipment
CN109739940A (en) * 2018-12-29 2019-05-10 东软集团股份有限公司 On-line analytical processing method, apparatus, storage medium and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5940818A (en) * 1997-06-30 1999-08-17 International Business Machines Corporation Attribute-based access for multi-dimensional databases
US20040015507A1 (en) * 2002-07-19 2004-01-22 Amir Netz System and method for analytically modeling data organized according to related attributes
US20070271227A1 (en) * 2006-05-16 2007-11-22 Business Objects, S.A. Apparatus and method for recursively rationalizing data source queries
US20120005228A1 (en) * 2010-06-30 2012-01-05 Himanshu Singh Method and system for navigating and displaying multi dimensional data
WO2013032911A1 (en) * 2011-08-26 2013-03-07 Hewlett-Packard Development Company, L.P. Multidimension clusters for data partitioning
CN102982103A (en) * 2012-11-06 2013-03-20 东南大学 On-line analytical processing (OLAP) massive multidimensional data dimension storage method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6408300B1 (en) * 1999-07-23 2002-06-18 International Business Machines Corporation Multidimensional indexing structure for use with linear optimization queries
CN101853283B (en) * 2010-05-21 2012-01-04 南京邮电大学 Construction method for multidimensional data-oriented semantic indexing peer-to-peer network
CN102467559A (en) * 2010-11-19 2012-05-23 金蝶软件(中国)有限公司 Multilevel and multidimensional method and device for analyzing data attributes
CN102663117B (en) * 2012-04-18 2013-11-20 中国人民大学 OLAP (On Line Analytical Processing) inquiry processing method facing database and Hadoop mixing platform

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451142A (en) * 2016-05-31 2017-12-08 北京京东尚科信息技术有限公司 The method and apparatus and its management system of data are write and inquired about in database
CN107451142B (en) * 2016-05-31 2022-05-27 北京京东尚科信息技术有限公司 Method and apparatus for writing and querying data in database, management system and computer-readable storage medium thereof
CN113761036A (en) * 2021-09-07 2021-12-07 国网福建省电力有限公司经济技术研究院 Power grid statistics professional index multidimensional association table query display method

Also Published As

Publication number Publication date
CN104424231B (en) 2019-07-16
CN104424231A (en) 2015-03-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 14839339; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established. Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16.06.2016)
122 Ep: pct application non-entry in european phase. Ref document number: 14839339; Country of ref document: EP; Kind code of ref document: A1