CN106528787B - query method and device based on multidimensional analysis of mass data - Google Patents


Publication number
CN106528787B
Authority
CN
China
Prior art keywords
dimension
data
name
information
subcube
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610985200.6A
Other languages
Chinese (zh)
Other versions
CN106528787A (en)
Inventor
翟东波
宋少峰
任永强
江志鹏
周盛
董亚卫
潘柏宇
王冀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING GAODE YUNTU TECHNOLOGY Co.,Ltd.
Original Assignee
Youku Network Technology Beijing Co Ltd
Application filed by Youku Network Technology Beijing Co Ltd
Priority to CN201610985200.6A
Publication of CN106528787A
Application granted
Publication of CN106528787B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Abstract

The application discloses a query method and a query device for multidimensional analysis of mass data. The method comprises: receiving a query request that is sent by a user and carries dimension information to be queried; querying, according to the dimension information, data corresponding to the dimension information in a pre-established subcube table; when the data corresponding to the dimension information is found, returning the data to the user; and when the data corresponding to the dimension information is not found, querying the data in a pre-established cube table, returning the data to the user, and collecting the dimension names contained in the dimension information as a dimension combination, wherein the subcube table is composed of some of the columns of the cube table. With this method, the subcube table has fewer rows than the cube table, and a user's query is first run against the pre-established subcube table, so query efficiency can be effectively improved. In addition, subcube tables are built only for the dimension combinations of some dimensions rather than for every possible dimension combination, so the amount of calculation is effectively reduced.

Description

Query method and device based on multidimensional analysis of mass data
Technical Field
The present application relates to the field of computer technologies, and in particular to a query method and device based on multidimensional analysis of mass data.
Background
With the continuous development of computer technology, more and more enterprises have begun to use On-line Analytical Processing (OLAP) to provide multi-angle data support for enterprise decision makers: data is queried along different dimensions, and the queried data is analyzed to make corresponding decisions.
At present, enterprise users can use online analytical processing to store collected multidimensional data in a data warehouse in a fixed storage format, and can then quickly and flexibly run complex queries over large volumes of data according to the dimensions they need, with the query results provided to them in an intuitive and understandable form.
In the prior art, online analytical processing mainly stores and queries multidimensional data in the form of a relational database, so the multidimensional data is mainly stored as a cube table (a cube table is a two-dimensional table) in which each dimension is a column. For example, assuming that only dimension A and dimension B exist, the cube table is as shown in Table 1:
Dimension A    Dimension B    Fact value
A1             B1             1
A1             B2             1
A2             B1             1
A2             B2             1
TABLE 1
In addition, online analytical processing can also store and query multidimensional data in a form based on multidimensional data organization. In this form, the database must combine all the dimensions when storing the multidimensional data, and the combined data is stored in Key-Value form. For example, assume that the values of all dimensions are as shown in Table 2:
Dimension A    Dimension B
A1             B1
A2             B2
TABLE 2
The database combines the dimensions (i.e., AB, A, B) and stores the combinations in Key-Value (K-V) form; the keys are shown in Table 3:
A1,B2
A1,B1
A2,B1
A2,B2
A1
A2
B1
B2
TABLE 3
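The enumeration behind Table 3 can be sketched as follows. This is a minimal illustration, assuming the dimension values of Table 2; the function name `enumerate_keys` and the comma-joined key format are hypothetical choices, not something the patent specifies.

```python
from itertools import combinations, product

# Dimension values assumed from Table 2.
dimension_values = {"A": ["A1", "A2"], "B": ["B1", "B2"]}


def enumerate_keys(dimension_values):
    """Return one key per value tuple of every non-empty dimension combination."""
    names = sorted(dimension_values)
    keys = []
    # Largest combinations first, matching the order AB, A, B used in the text.
    for size in range(len(names), 0, -1):
        for combo in combinations(names, size):
            for values in product(*(dimension_values[n] for n in combo)):
                keys.append(",".join(values))
    return keys


keys = enumerate_keys(dimension_values)
print(len(keys))  # 8 keys, the same set as Table 3
```

With n dimensions there are 2^n - 1 non-empty dimension combinations, so the number of keys that must be computed in advance grows quickly as dimensions are added.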
However, when online analytical processing stores and queries multidimensional data in the form of a relational database, a large number of dimensions and a large data volume mean that the cube table contains a very large number of rows. During a query, the rows of the cube table must be filtered one by one according to the queried dimension information, and the filtered rows must then be aggregated, which results in low query efficiency.
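The row-by-row filter and aggregation described above can be sketched as follows. The rows are taken from Table 1; the dictionary layout, the `fact` field name, and the function `query_cube` are assumptions made for illustration only.

```python
# Cube table rows assumed from Table 1.
cube_rows = [
    {"A": "A1", "B": "B1", "fact": 1},
    {"A": "A1", "B": "B2", "fact": 1},
    {"A": "A2", "B": "B1", "fact": 1},
    {"A": "A2", "B": "B2", "fact": 1},
]


def query_cube(rows, conditions):
    """Scan every row, keep the rows matching all conditions, and sum their fact values."""
    total = 0
    scanned = 0
    for row in rows:
        scanned += 1
        if all(row[name] == value for name, value in conditions.items()):
            total += row["fact"]
    return total, scanned


total, scanned = query_cube(cube_rows, {"A": "A1"})
print(total, scanned)  # 2 4
```

Every query touches every row, so the cost of a single query grows with the number of rows in the cube table.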
Disclosure of Invention
The embodiments of the application provide a query method and a query device for multidimensional analysis of mass data, which are used to solve two problems in the prior art: when online analytical processing stores and queries multidimensional data in the form of a relational database, query efficiency is low; and when online analytical processing stores and queries multidimensional data in a form based on multidimensional data organization, all dimension combinations must be exhausted in advance, so the amount of calculation is huge.
An embodiment of the application provides a query method for multidimensional analysis of mass data, which comprises the following steps:
receiving a query request that is sent by a user and carries dimension information to be queried, wherein the dimension information comprises a dimension name and a dimension value;
querying, according to the dimension information, data corresponding to the dimension information in a pre-established subcube table;
when the data corresponding to the dimension information is found, returning the data to the user; and when the data corresponding to the dimension information is not found, querying the data corresponding to the dimension information in a pre-established cube table, returning the data to the user, and collecting the dimension names contained in the dimension information as a dimension combination, wherein the subcube table is composed of some of the columns of the cube table.
Preferably, query information of a user carrying a dimension name to be queried is obtained in advance, data corresponding to the dimension name is queried according to the dimension name, and a subcube table is established according to the dimension name and the data corresponding to the dimension name.
Preferably, when the data in the pre-established cube table is updated, the method further comprises: acquiring the updated data in the pre-established cube table, performing dimension reduction on the acquired data, and updating the pre-established subcube table with the dimension-reduced data.
Preferably, for any piece of acquired data, each subcube table containing at least one dimension name corresponding to the data is determined; for any determined subcube table, the dimensions of the data are reduced to be consistent with the dimensions of the subcube table, and data having the same dimension values as the data is searched for in the subcube table; when such data is found, the data is merged with it, and when no such data is found, the data is added directly to the subcube table.
Preferably, the method further comprises: within a specific time, grouping identical collected dimension combinations into one group, and counting the number of times the dimension combination in each group was collected; when the number of times a dimension combination was collected exceeds a preset threshold, creating a new subcube table, querying the pre-established cube table for the data corresponding to the dimension names contained in the dimension combination, merging that data, and adding it to the newly created subcube table.
An embodiment of the present application provides a query device for multidimensional analysis of mass data, including:
a receiving module, configured to receive a query request that is sent by a user and carries dimension information to be queried, where the dimension information includes a dimension name and a dimension value;
a query module, configured to query, according to the dimension information, data corresponding to the dimension information in a pre-established subcube table;
a data return module, configured to return the data to the user when the data corresponding to the dimension information is found; and, when the data corresponding to the dimension information is not found, to query the data corresponding to the dimension information in a pre-established cube table, return the data to the user, and collect the dimension names contained in the dimension information as a dimension combination, where the subcube table is composed of some of the columns of the cube table.
Preferably, the device further comprises: a pre-establishing module, configured to obtain in advance query information of a user carrying a dimension name to be queried, query data corresponding to the dimension name according to the dimension name, and establish a subcube table according to the dimension name and the data corresponding to the dimension name.
Preferably, the device further comprises: a first updating module, configured to, when the data in the pre-established cube table is updated, acquire the updated data in the pre-established cube table, perform dimension reduction on the acquired data, and update the pre-established subcube table with the dimension-reduced data.
Preferably, the first updating module is specifically configured to: for any piece of acquired data, determine each subcube table containing at least one dimension name corresponding to the data; for any determined subcube table, reduce the dimensions of the data to be consistent with the dimensions of the subcube table, and search the subcube table for data having the same dimension values as the data; when such data is found, merge the data with it, and when no such data is found, add the data directly to the subcube table.
Preferably, the device further comprises: a second updating module, configured to, within a specific time, group identical collected dimension combinations into one group and count the number of times the dimension combination in each group was collected; and, when the number of times a dimension combination was collected exceeds a preset threshold, create a new subcube table, query the pre-established cube table for the data corresponding to the dimension names contained in the dimension combination, merge that data, and add the merged data to the newly created subcube table.
The embodiments of the application provide a query method and a query device for multidimensional analysis of mass data. The method first receives a query request that is sent by a user and carries dimension information to be queried, wherein the dimension information comprises a dimension name and a dimension value. It then queries, according to the dimension information, data corresponding to the dimension information in a pre-established subcube table. When the data is found, it is returned to the user; when the data is not found, the data is queried in the pre-established cube table and returned to the user, and the dimension names contained in the dimension information are collected as a dimension combination, wherein the subcube table is composed of some of the columns of the cube table. Because the subcube table is composed of some of the columns of the cube table, it has fewer rows than the cube table; a user's query is first run against the pre-established subcube table, so query efficiency can be effectively improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a schematic diagram of a query process for multidimensional analysis of mass data according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a query device for multidimensional analysis of mass data according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 shows a query process for multidimensional analysis of mass data provided in an embodiment of the present application, which specifically includes the following steps:
S101: Receive a query request that is sent by a user and carries dimension information to be queried.
In practical applications, an enterprise user can use online analytical processing to store collected multidimensional data in a data warehouse in a fixed storage format, and can then quickly and flexibly run complex queries over large volumes of data according to the dimensions the enterprise user needs, with the query results provided in an intuitive and understandable form.
Before querying the needed dimensions, the enterprise user needs to store the collected multidimensional data in the data warehouse.
Furthermore, the present application aims to reduce the number of rows of a table by reducing its number of columns: some columns of the cube table are extracted according to the actual requirements of the user, and a separate table is created from them. Because this separate table has fewer columns than the original cube table, rows that agree on the remaining columns can be merged, so the table also has fewer rows than the cube table, and queries against it are faster. Therefore, in the present application, some columns can be extracted from the pre-stored cube table according to actual requirements, and a table can be created from the extracted columns.
It should be noted that, to distinguish the original cube table from a table created from columns extracted according to actual requirements, the present application defines the latter as a subcube table; that is, a subcube table is composed of some of the columns of the cube table. For the same cube table, the actual requirements of the user usually include several different dimension combinations, and a subcube table is created for each of these dimension combinations from the data contained in the cube table. In other words, several subcube tables with different dimension combinations can be created from the same cube table.
Further, the present application provides a method for pre-establishing a subcube table, which is specifically as follows:
Query information of a user carrying a dimension name to be queried is obtained in advance, data corresponding to the dimension name is queried according to the dimension name, and a subcube table is established according to the dimension name and the data corresponding to the dimension name.
Here, the data corresponding to the dimension name includes the dimension values and the fact values corresponding to the dimension values. The dimension name to be queried may be a single dimension name or a combination of several dimension names; how many dimension names it contains is determined according to the actual requirements of the user.
In addition, when a subcube table is established according to the dimension name and the data corresponding to the dimension name, a subcube table containing a fact value column and the dimension names is created first, with each column corresponding to one dimension name. Fact values that have the same dimensions (that is, the same dimension names and the same dimension values) are then merged, and the merged fact value and the corresponding dimension values are filled into the subcube table. For example, if fact value 1 corresponds to the dimensions province (Beijing) and quarter (first quarter), and fact value 2 also corresponds to the dimensions province (Beijing) and quarter (first quarter), the two fact values can be merged: the merged fact value is 3 (i.e., 1 + 2), and it is filled into the fact value column of the row for Beijing and the first quarter. If the dimensions of fact value 2 differ from those of fact value 1 in either a dimension name or a dimension value, the two fact values cannot be merged.
For example, to explain the present application simply and clearly, assume that the cube table in the data warehouse is as shown in Table 4:
Dimension A    Dimension B    Dimension C    Fact value
A1             B1             C1             1
A1             B2             C2             1
A2             B1             C1             1
A2             B2             C2             1
TABLE 4
Suppose that user A needs to establish a subcube table according to actual requirements. The data warehouse obtains the query information of user A carrying dimension A (i.e., the dimension name) to be queried, and queries the data corresponding to dimension A according to dimension A, as shown in Table 5:
Dimension A    Fact value
A1             1
A1             1
A2             1
A2             1
TABLE 5
A subcube table containing a fact value column and dimension A (i.e., the dimension name) is created, with each column corresponding to one dimension name. The fact values that have the same dimension (that is, the same dimension name and the same dimension value) are merged, and the merged fact values and the corresponding dimension values are filled into the subcube table, as shown in Table 6:
Dimension A    Fact value
A1             2
A2             2
TABLE 6
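The construction of Table 6 from Table 4 amounts to projecting the cube table onto the chosen dimension names and summing the fact values of rows that share the same dimension values. A minimal sketch follows; the row layout, the `fact` field name, and the function `build_subcube` are hypothetical, and summation is assumed as the merge operation because the text merges fact values 1 and 2 into 3.

```python
from collections import defaultdict

# Cube table rows assumed from Table 4.
cube_table_4 = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
]


def build_subcube(cube_rows, dimension_names):
    """Keep only dimension_names and merge (sum) fact values of identical keys."""
    merged = defaultdict(int)
    for row in cube_rows:
        key = tuple(row[name] for name in dimension_names)
        merged[key] += row["fact"]
    return dict(merged)


subcube_a = build_subcube(cube_table_4, ["A"])
print(subcube_a)  # {('A1',): 2, ('A2',): 2}
```

The four rows of Table 4 collapse to the two rows of Table 6, which is why a lookup in the subcube table touches fewer rows than a scan of the cube table.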
Further, after the subcube table has been created, the user can send a query request carrying the dimension information to be queried to the data warehouse through a terminal, in order to query the data corresponding to the required dimension.
It should be noted that, during a query, it is necessary to know in which row and column the queried data lies; therefore, the dimension information to be queried includes a dimension name and a dimension value.
In addition, the dimension information to be queried may contain a single dimension name, such as dimension A, or a combination of several dimension names, such as dimension A and dimension B; the specific number of dimension names is determined according to the actual requirements of the user.
Continuing the above example, user A needs the data corresponding to dimension A = A1. User A therefore sends a query request carrying dimension A = A1 (i.e., the dimension information) to the data warehouse through the terminal, in order to query the data corresponding to the required dimension.
S102: Query, according to the dimension information, the data corresponding to the dimension information in a pre-established subcube table.
After receiving the query request that is sent by the user and carries the dimension information to be queried, the data warehouse queries the pre-established subcube table directly, according to the dimension name and dimension value contained in the dimension information, to find the data corresponding to the dimension information.
It should be noted that the data corresponding to the dimension information may include the fact value.
Continuing the above example, after receiving the query request sent by user A, the data warehouse queries the pre-established subcube table 6 directly according to dimension A = A1 contained in the dimension information, and finds that the data corresponding to the dimension information is a fact value of 2.
S103: When the data corresponding to the dimension information is found, return the data to the user; when the data corresponding to the dimension information is not found, query the data corresponding to the dimension information in a pre-established cube table, return the data to the user, and collect the dimension names contained in the dimension information as a dimension combination.
When the data corresponding to the dimension information is found in a pre-established subcube table, the data is returned to the user.
Continuing the above example, the data found for the dimension information, a fact value of 2, is returned to user A.
However, when a subcube table is pre-established, the dimension names to be queried are determined according to the actual requirements of the user, and the user determines those requirements only from experience or historical data. Because of the limits of experience and historical data, in practical applications the dimensions a user needs in a query may not be in any pre-established subcube table. In that case, when the data corresponding to the dimension information is not found, it can only be queried in the pre-established cube table and then returned to the user.
Further, in practical applications, even though the data warehouse does not find the data corresponding to the dimension information in the pre-established subcube tables, the query indicates that subsequent users are likely to query the same dimension information. Moreover, although the current user queried only one dimension value under the dimension name, the query also indicates that users are likely to query other dimension values under that dimension name later. The dimension names contained in the dimension information are therefore collected as a dimension combination.
For example, assume that user A in the above example queries the data corresponding to dimension B = B1 according to actual requirements. The data warehouse does not find the data corresponding to the dimension information in the pre-established subcube table 6, so it queries the pre-established cube table 4, obtains the data corresponding to the dimension information (that is, a fact value of 2), returns the data to user A, and collects dimension B as a dimension combination.
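Steps S101 to S103 can be sketched as a lookup with a fallback. The sketch below assumes the data of Table 4 and Table 6; the function `query`, the registry `subcubes` keyed by dimension names, and the list `collected_combinations` are hypothetical names, and treating "no subcube table has exactly the queried dimension names, or the dimension value is absent from it" as a miss is an assumption rather than something the text spells out.

```python
# Cube table rows assumed from Table 4.
cube_table_4 = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
]

# Pre-established subcube tables, keyed by their dimension names (Table 6).
subcubes = {("A",): {("A1",): 2, ("A2",): 2}}

# Dimension combinations collected after a subcube miss.
collected_combinations = []


def query(dimension_info):
    """Return (fact value, source), trying the subcube tables before the cube table."""
    names = tuple(sorted(dimension_info))
    values = tuple(dimension_info[name] for name in names)
    subcube = subcubes.get(names)
    if subcube is not None and values in subcube:
        return subcube[values], "subcube"
    # Miss: fall back to scanning the cube table and record the combination.
    total = sum(
        row["fact"]
        for row in cube_table_4
        if all(row[name] == value for name, value in dimension_info.items())
    )
    collected_combinations.append(names)
    return total, "cube"


hit = query({"A": "A1"})
miss = query({"B": "B1"})
print(hit, miss, collected_combinations)
# (2, 'subcube') (2, 'cube') [('B',)]
```

Only the dimension names are recorded on a miss, not the dimension value, which matches the observation that a user who queried B1 may later query other values of dimension B.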
Further, judging after every collection whether a subcube table needs to be established according to the dimension names contained in the collected dimension combination would waste considerable computer resources. Therefore, in the present application, this judgment is made once every certain period of time.
Further, the specific process of determining whether to establish a subcube table according to the dimension names contained in the collected dimension combinations is as follows:
Within a specific time, identical collected dimension combinations are grouped into one group, and the number of times the dimension combination in each group was collected is counted. When the number of times a dimension combination was collected exceeds a preset threshold, a new subcube table is created, the pre-established cube table is queried for the data corresponding to the dimension names contained in the dimension combination, and that data is merged and added to the newly created subcube table, wherein the specific time is consistent with the aforementioned certain time.
For example, assuming that the specific time is one day, the collected dimension combinations containing the dimension names are shown in table 7:
Dimension B
Dimension B
Dimension A and dimension B
Dimension C
Dimension B
Dimension B and dimension C
TABLE 7
Identical dimension combinations are grouped into one group, and the number of times the dimension combination in each group was collected is counted, as shown in Table 8:
Dimension combination          Number of times
Dimension B                    4
Dimension A and dimension B    1
Dimension C                    1
Dimension B and dimension C    1
TABLE 8
Assuming that the preset threshold is 3, the data warehouse determines that the number of times the dimension combination containing dimension B was collected (4) exceeds the preset threshold (3). It therefore creates a new subcube table, queries the pre-established cube table 4 for the data corresponding to dimension B, merges the data that has the same value of dimension B, and adds the merged data to the newly created subcube table, as shown in Table 9:
Dimension B    Fact value
B1             2
B2             2
TABLE 9
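The grouping, counting, and threshold check can be sketched as follows. The collected log below is hypothetical and was chosen only so that its per-combination counts match Table 8; the names `create_hot_subcubes`, `build_subcube`, and `THRESHOLD` are likewise assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Cube table rows assumed from Table 4.
cube_table_4 = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
]

# Hypothetical one-day log whose counts reproduce Table 8.
collected = [("B",), ("B",), ("A", "B"), ("C",), ("B",), ("B", "C"), ("B",)]
THRESHOLD = 3  # preset threshold assumed from the example


def build_subcube(cube_rows, dimension_names):
    """Keep only dimension_names and merge (sum) fact values of identical keys."""
    merged = defaultdict(int)
    for row in cube_rows:
        merged[tuple(row[name] for name in dimension_names)] += row["fact"]
    return dict(merged)


def create_hot_subcubes(collected, cube_rows, threshold):
    """Build a subcube table for each combination collected more than threshold times."""
    counts = Counter(collected)
    return {
        names: build_subcube(cube_rows, names)
        for names, times in counts.items()
        if times > threshold
    }


new_subcubes = create_hot_subcubes(collected, cube_table_4, THRESHOLD)
print(new_subcubes)  # {('B',): {('B1',): 2, ('B2',): 2}}
```

Running the check once per period over the whole log, rather than after every miss, keeps the cost of deciding which subcube tables to build low.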
With this method, because the subcube table is composed of some of the columns of the cube table, the subcube table has fewer rows than the cube table; a user's query is first run against the pre-established subcube table, so query efficiency can be effectively improved.
In practical applications, the data stored in advance in the cube table of the data warehouse may be updated. Because a subcube table is established from the data in the cube table, when the data in the cube table is updated the data in the subcube table must change as well. Therefore, in the present application, when the data in the cube table is updated, the data in the subcube tables also needs to be updated.
The present application provides a specific way of updating the data in a pre-established subcube table, as follows: acquire the updated data in the pre-established cube table, perform dimension reduction on the acquired data, and update the pre-established subcube table with the dimension-reduced data.
In addition, when dimension reduction is performed on the acquired data, for any piece of acquired data, each subcube table containing at least one dimension name corresponding to the data can be determined; for any determined subcube table, the dimensions of the data are reduced to be consistent with the dimensions of the subcube table, and data having the same dimension values as the data is searched for in the subcube table. When such data is found, the data is merged with it; when no such data is found, the data is added directly to the subcube table.
For example, assume that the data warehouse stores Table 4, Table 6, and Table 9, and that user A adds a row of data to Table 4, as shown in Table 10:
Dimension A    Dimension B    Dimension C    Fact value
A1             B1             C1             1
A1             B2             C2             1
A2             B1             C1             1
A2             B2             C2             1
A1             B3             C1             1
TABLE 10
The data warehouse acquires the updated data (the newly added row) from Table 10 and, for the acquired data, determines the subcube tables containing at least one dimension name corresponding to the data, namely subcube table 6 and subcube table 9.
For subcube table 6, the dimensions of the data are reduced to be consistent with the dimensions of subcube table 6; that is, the dimension-reduced data contains only dimension A. Data with the same dimension value (A1) is found in subcube table 6, so the data is merged with it, as shown in Table 11:
Dimension A    Fact value
A1             3
A2             2
TABLE 11
For the subube table 9, the dimension corresponding to the data is reduced to be consistent with the dimension corresponding to the subube table 9, that is, the dimension corresponding to the data after the dimension reduction only includes the dimension B, and the data with the same dimension value as the data is not found in the subube table 9, so that the data is directly added in the subube table 9, specifically as in table 12:
Dimension B    Fact value
B1             2
B2             2
B3             1
Table 12
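The update walked through in Tables 10 to 12 can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the in-memory dictionary layout, the function name `apply_update`, and the choice of summation as the merge operation are all assumptions (summation is consistent with the fact values shown in Tables 11 and 12). The sketch also assumes that a subcube is updated only when the new row carries every dimension of that subcube.

```python
def apply_update(row, subcubes):
    """Fold one new cube row into each matching subcube (hypothetical helper).

    row      -- dict of dimension name -> value, plus a "fact" entry
    subcubes -- dict mapping a tuple of dimension names to a table, where a
                table maps a tuple of dimension values to a fact value
    """
    for dims, table in subcubes.items():
        # Assumption: skip subcubes with a dimension the row does not carry.
        if not all(d in row for d in dims):
            continue
        # Dimension reduction: keep only the subcube's dimensions.
        key = tuple(row[d] for d in dims)
        if key in table:
            # Same dimension values found: merge (assumed to be a sum).
            table[key] += row["fact"]
        else:
            # Not found: add the reduced row directly.
            table[key] = row["fact"]


# Subcube Table 6 (dimension A) and subcube Table 9 (dimension B)
# as they stand before the new row of Table 10 arrives.
subcube_a = {("A1",): 2, ("A2",): 2}
subcube_b = {("B1",): 2, ("B2",): 2}
subcubes = {("A",): subcube_a, ("B",): subcube_b}

new_row = {"A": "A1", "B": "B3", "C": "C1", "fact": 1}
apply_update(new_row, subcubes)

print(subcube_a)  # contents correspond to Table 11
print(subcube_b)  # contents correspond to Table 12
```

Keying each table by a tuple of dimension values makes the "search, then merge or add" step a single dictionary lookup.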
Based on the same idea, the embodiment of the present application further provides a query device for multidimensional analysis of mass data.
As shown in Fig. 2, a query device for multidimensional analysis of mass data provided in an embodiment of the present application includes:
A receiving module 201, configured to receive a query request sent by a user and carrying dimension information to be queried, where the dimension information includes: dimension name and dimension value;
A query module 202, configured to query, according to the dimension information, data corresponding to the dimension information in a pre-established subcube table;
A data returning module 203, configured to return the data to the user when data corresponding to the dimension information is found; and, when no data corresponding to the dimension information is found, to query the data corresponding to the dimension information in a pre-established cube table, return the data to the user, and collect the dimension names contained in the dimension information as a dimension combination, wherein the subcube table is composed of some of the columns of the cube table.
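The cooperation of the three modules above can be illustrated with a short Python sketch. It is written under stated assumptions rather than taken from the patent: the function name `query`, the list used to collect missed dimension combinations, the sorting of dimension names into a canonical order, and the summation of fact values on the cube fallback path are all hypothetical choices made for illustration.

```python
def query(dim_info, subcubes, cube_rows, missed_combinations):
    """Answer a query from a subcube if possible, else from the cube.

    dim_info            -- dict of dimension name -> dimension value
    subcubes            -- dict: tuple of dimension names -> {value tuple: fact}
    cube_rows           -- list of dicts, one per row of the cube table
    missed_combinations -- list that collects dimension combinations that
                           could not be served from a subcube
    """
    # Assumption: sort names so that a combination has one canonical form.
    names = tuple(sorted(dim_info))
    key = tuple(dim_info[n] for n in names)

    table = subcubes.get(names)
    if table is not None and key in table:
        return table[key]  # served from the subcube table

    # Fallback: scan the cube table and aggregate the matching rows.
    total = sum(
        r["fact"]
        for r in cube_rows
        if all(r.get(n) == v for n, v in dim_info.items())
    )
    missed_combinations.append(names)  # remember the combination
    return total


cube_rows = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A1", "B": "B3", "C": "C1", "fact": 1},
]
subcubes = {("A",): {("A1",): 3, ("A2",): 2}}
missed = []

hit = query({"A": "A1"}, subcubes, cube_rows, missed)   # subcube path
miss = query({"B": "B1"}, subcubes, cube_rows, missed)  # cube fallback path
print(hit, miss, missed)
```

The collected combinations are the input for deciding later whether a new subcube table is worth creating.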
The device further comprises:
A pre-establishing module 204, configured to obtain in advance query information of a user carrying a dimension name to be queried, query the data corresponding to the dimension name according to the dimension name, and establish a subcube table according to the dimension name and the data corresponding to the dimension name.
The device further comprises:
The first updating module 205 is configured to, when data in the pre-established cube table is updated, obtain the updated data in the pre-established cube table, perform dimension reduction on the obtained data, and update the dimension-reduced data to the pre-established subcube table.
The first updating module 205 is specifically configured to: for each piece of acquired data, determine every subcube table that contains at least one dimension name corresponding to the data; for each subcube table so determined, reduce the dimensions of the data to match the dimensions of that subcube table, and search the subcube table for data with the same dimension values as the reduced data; when such data is found, merge the two; and when it is not found, add the reduced data directly to the subcube table.
The device further comprises:
A second updating module 206, configured to: within a specific time period, group identical dimension combinations into one group; count the number of times the dimension combination in each group was collected; and, when that number exceeds a preset threshold, create a new subcube table, query the pre-established cube table for the data corresponding to the dimension names contained in the dimension combination, merge the data that have the same dimension values, and add the merged data to the newly created subcube table.
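The behaviour described for the second updating module can be sketched as follows. This is a minimal illustration under assumptions: the function names `build_subcube` and `create_hot_subcubes`, the representation of one time window as a plain list of collected combinations, and the use of summation to merge rows are hypothetical and are not specified by the source.

```python
from collections import Counter, defaultdict


def build_subcube(cube_rows, names):
    """Aggregate the cube rows over the given dimension names (assumed sum)."""
    table = defaultdict(int)
    for r in cube_rows:
        table[tuple(r[n] for n in names)] += r["fact"]
    return dict(table)


def create_hot_subcubes(collected, cube_rows, subcubes, threshold):
    """Create a subcube for each combination collected more than `threshold` times.

    collected -- dimension combinations gathered during one time window
    """
    counts = Counter(collected)  # groups identical combinations and counts them
    for names, times in counts.items():
        if times > threshold and names not in subcubes:
            subcubes[names] = build_subcube(cube_rows, names)
    return subcubes


cube_rows = [
    {"A": "A1", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A1", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A2", "B": "B1", "C": "C1", "fact": 1},
    {"A": "A2", "B": "B2", "C": "C2", "fact": 1},
    {"A": "A1", "B": "B3", "C": "C1", "fact": 1},
]
collected = [("B",), ("B",), ("B",), ("A", "C")]
subcubes = create_hot_subcubes(collected, cube_rows, {}, threshold=2)
print(subcubes)
```

Here the combination `("B",)` was collected three times, which exceeds the assumed threshold of 2, so a subcube is built for it, while `("A", "C")` was collected only once and is left to the cube table.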
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (8)

1. A query method for multidimensional analysis of mass data is characterized by comprising the following steps:
receiving a query request which is sent by a user and carries dimension information to be queried, wherein the dimension information comprises: dimension name and dimension value;
querying, according to the dimension information, data corresponding to the dimension information in a pre-established subcube table;
when data corresponding to the dimension information is found, returning the data to the user; when no data corresponding to the dimension information is found, querying the data corresponding to the dimension information in a pre-established cube table, returning the data to the user, collecting the dimension names contained in the dimension information as a dimension combination, and judging, according to the dimension names contained in the collected dimension combination, whether a new subcube table needs to be created, wherein the subcube table is composed of some of the columns of the cube table;
wherein judging whether a new subcube table needs to be created according to the dimension names contained in the collected dimension combination comprises the following steps:
grouping identical dimension combinations into one group within a specific time period;
counting the number of times the dimension combination in each group was collected;
when the number of times the dimension combination was collected exceeds a preset threshold, creating a new subcube table, querying the pre-established cube table for the data corresponding to the dimension names contained in the dimension combination, merging the data that have the same dimension values, and adding the merged data to the newly created subcube table.
2. The method of claim 1, wherein pre-establishing a subcube table specifically comprises:
acquiring in advance query information of a user carrying a dimension name to be queried;
querying data corresponding to the dimension name according to the dimension name;
and establishing a subcube table according to the dimension name and the data corresponding to the dimension name.
3. The method of claim 2, wherein when the data in the pre-established cube table is updated, the method further comprises:
Acquiring updated data in a pre-established cube table;
Performing dimensionality reduction on the acquired data;
and updating the data after the dimensionality reduction processing to a pre-established subcube table.
4. The method according to claim 3, wherein performing the dimension reduction processing on the acquired data specifically includes:
for each piece of acquired data, determining a subcube table containing at least one dimension name corresponding to the data;
for each determined subcube table, reducing the dimensions of the data to be consistent with the dimensions of the subcube table, and searching the subcube table for data with the same dimension values as the data; when data with the same dimension values is found, merging the data; and when data with the same dimension values is not found, adding the data directly to the subcube table.
5. A query device for multidimensional analysis of mass data, characterized by comprising:
a receiving module, configured to receive a query request sent by a user and carrying dimension information to be queried, where the dimension information includes: dimension name and dimension value;
a query module, configured to query, according to the dimension information, data corresponding to the dimension information in a pre-established subcube table;
a data returning module, configured to return the data to the user when data corresponding to the dimension information is found; and, when no data corresponding to the dimension information is found, to query the data corresponding to the dimension information in a pre-established cube table, return the data to the user, collect the dimension names contained in the dimension information as a dimension combination, and judge, according to the dimension names contained in the collected dimension combination, whether a new subcube table needs to be created, wherein the subcube table is composed of some of the columns of the cube table;
The device further comprises:
a second updating module, configured to: within a specific time period, group identical dimension combinations into one group; count the number of times the dimension combination in each group was collected; and, when that number exceeds a preset threshold, create a new subcube table, query the pre-established cube table for the data corresponding to the dimension names contained in the dimension combination, merge the data that have the same dimension values, and add the merged data to the newly created subcube table.
6. The device of claim 5, wherein the device further comprises:
a pre-establishing module, configured to acquire in advance query information of a user carrying a dimension name to be queried, query data corresponding to the dimension name according to the dimension name, and establish a subcube table according to the dimension name and the data corresponding to the dimension name.
7. The device of claim 6, wherein the device further comprises:
a first updating module, configured to, when the data in the pre-established cube table is updated, acquire the updated data in the pre-established cube table, perform dimensionality reduction on the acquired data, and update the data subjected to dimensionality reduction to the pre-established subcube table.
8. The device of claim 7, wherein the first updating module is specifically configured to: for each piece of acquired data, determine a subcube table containing at least one dimension name corresponding to the data; for each determined subcube table, reduce the dimensions of the data to be consistent with the dimensions of the subcube table, and search the subcube table for data with the same dimension values as the data; when data with the same dimension values is found, merge the data; and when data with the same dimension values is not found, add the data directly to the subcube table.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610985200.6A CN106528787B (en) 2016-11-09 2016-11-09 query method and device based on multidimensional analysis of mass data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610985200.6A CN106528787B (en) 2016-11-09 2016-11-09 query method and device based on multidimensional analysis of mass data

Publications (2)

Publication Number Publication Date
CN106528787A CN106528787A (en) 2017-03-22
CN106528787B true CN106528787B (en) 2019-12-17

Family

ID=58350619

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610985200.6A Active CN106528787B (en) 2016-11-09 2016-11-09 query method and device based on multidimensional analysis of mass data

Country Status (1)

Country Link
CN (1) CN106528787B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932257B (en) * 2017-05-25 2021-10-08 北京国双科技有限公司 Multi-dimensional data query method and device
CN110019186A (en) * 2017-09-07 2019-07-16 北京国双科技有限公司 The method and device of data storage
CN108280046A (en) * 2017-11-30 2018-07-13 深圳市科列技术股份有限公司 A kind of method, battery data server and the user terminal of battery data processing
CN110209686A (en) * 2018-02-22 2019-09-06 北京嘀嘀无限科技发展有限公司 Storage, querying method and the device of data
CN108363819B (en) * 2018-03-23 2021-04-13 联想(北京)有限公司 Query engine matching method, device, server group and readable storage medium
CN108829795A (en) * 2018-06-04 2018-11-16 北京奇艺世纪科技有限公司 Data query method and device
CN108830015A (en) * 2018-07-03 2018-11-16 北京华大九天软件有限公司 A method of utilizing unit performance trend in graphical display analytical unit library
CN110334122A (en) * 2019-07-11 2019-10-15 江苏曲速教育科技有限公司 The query analysis method and system of educational data
CN110837511B (en) * 2019-11-15 2022-08-23 金蝶软件(中国)有限公司 Data processing method, system and related equipment
CN112000747B (en) * 2020-07-08 2022-11-18 苏宁云计算有限公司 Data multidimensional analysis method, device and system
CN112948441B (en) * 2021-03-26 2023-09-29 浪潮通用软件有限公司 Multi-dimensional data collection method and equipment for financial data
CN113393190B (en) * 2021-06-10 2023-12-05 北京京东振世信息技术有限公司 Warehouse information processing method and device, electronic equipment and readable medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334795A (en) * 2008-08-07 2008-12-31 金蝶软件(中国)有限公司 Data storage method and device
CN102023977A (en) * 2009-09-21 2011-04-20 陈俊 Data filtering method and data filtering system and application thereof
CN103605651A (en) * 2013-08-28 2014-02-26 杭州顺网科技股份有限公司 Data processing showing method based on on-line analytical processing (OLAP) multi-dimensional analysis
CN105224534A (en) * 2014-05-29 2016-01-06 腾讯科技(深圳)有限公司 A kind of method and device of asking response

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8185487B2 (en) * 2001-02-12 2012-05-22 Facebook, Inc. System, process and software arrangement for providing multidimensional recommendations/suggestions

Also Published As

Publication number Publication date
CN106528787A (en) 2017-03-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100080 A 5 C, block A, China International Steel Plaza, 8 Haidian Avenue, Haidian District, Beijing.

Applicant after: Youku network technology (Beijing) Co., Ltd.

Address before: 100080 A 5 C, block A, China International Steel Plaza, 8 Haidian Avenue, Haidian District, Beijing.

Applicant before: 1Verge Inc.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200710

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee after: Alibaba (China) Co.,Ltd.

Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C

Patentee before: Youku network technology (Beijing) Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20210303

Address after: Room 715, 7-storey, 7-storey, No. 10 Furong Street, Chaoyang District, Beijing, 100102

Patentee after: BEIJING GAODE YUNTU TECHNOLOGY Co.,Ltd.

Address before: 310052 room 508, 5th floor, building 4, No. 699 Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee before: Alibaba (China) Co.,Ltd.