Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The data query method provided by the embodiment of the application is based on a ROLAP model of a relational database, wherein the data for query belong to analytical data, a corresponding server acquires the data from the relational database, and the server is associated with the relational database.
In one mode of the application, the server receives a query request from a user, and obtains corresponding data from the relational database and feeds the data back to the user.
In another mode, the server provides data query services in the form of a service platform for use by different users (either individual users or enterprise users). Different users may, of course, also provide their own data to the service platform for other users to query. This does not constitute a limitation of the present application.
The data query method provided in the embodiment of the present application, as shown in fig. 1, includes the following steps:
S101, receiving a data query request.
When a user needs to query data, the user can send a corresponding query request. In practical applications, the query request includes the data to be queried by the user and/or information related to that data (such as dimensions). Examples of specific data include income data for the year 201X or the company's sales data for the last half year; examples of dimension information include commodity category or production capacity in a certain area.
In one mode of the embodiment of the present application, the dimension is similar to a key and may include information such as a user name, a gender, a user type, a sales amount, a region, and the like, which are used to divide different types of data.
It should be noted that, in an actual application scenario, the corresponding server is associated with a database and provides a data query function. On the user side, the user may access the query function provided by the server through a browser or through a client. This does not constitute a limitation of the present application.
S102, determining the identification information of the data to be queried corresponding to the data query request.
In this embodiment of the present application, the identification information of the data to be queried includes either a dimension to be queried together with its corresponding metric value, or a dimension alone. The metric value may be regarded as a specific data value. In other words, a user may query for a specific data value in a certain dimension, for example the dimension "date" with the metric value "October", or may query for a single dimension, for example the dimension "product number". This does not constitute a limitation of the present application.
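The two forms of identification information described above can be illustrated with a minimal sketch. The class name `Identification`, the function `parse_identification`, and the dictionary layout of the request are hypothetical and are not taken from the application; they only show one way a server might represent a dimension with or without a metric value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identification:
    """Identification information of the data to be queried (illustrative)."""
    dimension: str                      # e.g. "date" or "product number"
    metric_value: Optional[str] = None  # e.g. "October"; None for a dimension-only query


def parse_identification(request: dict) -> Identification:
    # Assumed request layout: {"dimension": ..., "metric_value": ...}
    dimension = request.get("dimension")
    if not dimension:
        raise ValueError("a query request must name a dimension")
    return Identification(dimension, request.get("metric_value"))


# A dimension together with a metric value, and a dimension alone.
with_value = parse_identification({"dimension": "date", "metric_value": "October"})
dimension_only = parse_identification({"dimension": "product number"})
```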
When the server receives a data query request sent by a user, the server determines the identification information of the data to be queried corresponding to the data query request, so as to query the data according to the identification information.
Of course, in practical applications, the user may input the corresponding query request through an interface such as a browser or a client interface. This does not constitute a limitation of the present application.
S103, determining the query path corresponding to the identification information in a pre-established multidimensional data relation model according to the determined identification information.
A user issuing a complex query typically needs as many query results as possible in order to reflect the state of an event fully and clearly, and the data in a single multidimensional data set (e.g., a Cube) may be unable to provide such comprehensive results. Therefore, in the embodiment of the present application, the server may pre-establish a corresponding multidimensional data relationship model. The relationship between the multidimensional data sets in this model includes a parent-child relationship; that is, the multidimensional data relationship model includes a parent multidimensional data set (hereinafter referred to as a main data set) and child multidimensional data sets (hereinafter referred to as sub data sets). Because the model comprises a plurality of mutually associated multidimensional data sets, comprehensive data can be provided to the user from multiple angles, so as to meet the user's complex query requirements.
Based on this, the query path mentioned in the embodiment of the present application refers to a relationship chain between data sets in the multidimensional data relationship model, where those data sets have an association relationship with the data corresponding to the identification information.
Through the query path, different data having an association relationship with the data to be queried by the user can be queried, and specifically, the following step S104 is performed.
S104, performing data query according to the query path and the identification information to generate a query result.
According to the query path, at least one multi-dimensional data set can be determined, and the data corresponding to the identification information is queried in the determined multi-dimensional data set, so that the multi-dimensional data which are associated with each other and have an association relation with the identification information in the multi-dimensional data set are obtained.
For example, suppose the user queries a company's sales data for the whole year, and suppose that in the database a data table recording one kind of sales data (i.e., multidimensional data set A) and a data table recording another kind of sales data (i.e., multidimensional data set B) are associated with each other and have a parent-child relationship. A relationship chain (i.e., a query path) from multidimensional data set A to multidimensional data set B can then be determined, and the user finally obtains both kinds of sales data, associated with each other.
Obviously, such query results can provide the user with associated data more comprehensively, so that the user can obtain sufficient data to perform subsequent operations (data analysis, data decision, etc.).
Through the above steps, after the server receives a data query request sent by a user, the server determines the dimension or metric of the data to be queried according to the data query request, and accordingly determines, in the corresponding database, a pre-established data relation model that matches the dimension or metric. Because the data relation model comprises a plurality of mutually associated multidimensional data sets, the various data having an association relationship with the dimension or metric can be determined across those data sets. Unlike the prior art, this method does not require the user to know the specific data table or field in which the data to be queried is located, nor to define the format of a data table personally. The user only needs to provide the identification information of the data to be queried, and the server queries the pre-established data relation model accordingly, which simplifies user operation and reduces the tedious operations of existing query methods.
In practical application, the process of the server for establishing the multidimensional data relation model in advance specifically comprises the following steps: pre-selecting a fact table in a database, taking the selected fact table as a main data set, determining each attribute information in the main data set, and determining each associated data table corresponding to each attribute information in the database according to each attribute information, wherein each associated data table comprises a fact table and/or a dimension table; and taking each determined associated data table as a sub data set, and establishing a data relationship between each sub data set and the main data set to form the multi-dimensional data relationship model.
It should be noted that, in the multidimensional data relationship model according to the embodiment of the present application, the main data set is not formed by the sub data sets, but the main data set and the sub data sets have a dependency relationship therebetween, and this does not constitute a limitation of the present application.
In an actual application scenario, the fact table serving as the main data set may be formed according to a data instruction provided by a user: the server summarizes the corresponding data according to the user's data instruction to generate the fact table, and determines the fact table as the main data set. Of course, this does not constitute a limitation of the present application.
The dimension table may be a data table used for storing dimensions in a database, and the dimension table usually has no specific metric value and only includes different dimensions, for example: a dimension table with the date as the dimension can contain further subdivided sub-dimensions of day, month, year and the like.
After the main data set is determined, the server determines each associated data table (i.e., the sub data set) according to each attribute information (i.e., the dimension of the data) in the main data set. Here, the associated data table in the embodiment of the present application is a data table having an associated relationship with a fact table as a main data set.
Determining, according to each attribute information, each associated data table corresponding to that attribute information in the database specifically includes: for each attribute information, determining each data table in the database that contains the attribute information, and taking each determined data table as an associated data table.
It can be seen that, for each data table in the database, if some attribute information (i.e. dimension) in the fact table is included, the data table may be determined as an associated data table having an association relationship with the fact table.
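The rule above (a data table is an associated data table if it contains some attribute of the fact table) can be sketched as follows. The function `find_associated_tables`, the representation of the database schema as a mapping from table name to a set of column names, and the sample table contents are assumptions made purely for illustration.

```python
from typing import Dict, List, Set


def find_associated_tables(fact_table: str,
                           schema: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each attribute of the fact table to the other tables that contain it."""
    associated: Dict[str, List[str]] = {}
    for attribute in sorted(schema[fact_table]):
        tables = sorted(
            name for name, columns in schema.items()
            if name != fact_table and attribute in columns
        )
        if tables:  # attributes found in no other table contribute nothing
            associated[attribute] = tables
    return associated


# Hypothetical schema: table name -> set of column (attribute) names.
schema = {
    "A": {"order number", "customer number", "product number", "quantity"},
    "B1": {"customer number", "customer name"},
    "B2": {"product number", "product name"},
    "C": {"warehouse number"},  # shares no attribute with A
}
associated = find_associated_tables("A", schema)
```

In this sketch, table C is not associated with fact table A because the two share no attribute.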
For example: as shown in FIG. 2, a fact table (i.e., a master data set) in a database and associated data tables having an association relationship with the fact table are displayed.
The data table A in fig. 2 is a fact table (hereinafter referred to as fact table A; fact table A includes both different dimensions and the metric values under each dimension, and the metric values shown serve only as examples and do not limit the present application), and fact table A may be used as a main data set. It can be seen that fact table A contains a plurality of dimensions (i.e., the above-mentioned attribute information): order number, salesman number, customer number, product number, date identification, area name, quantity, and total price. According to these dimensions, the associated data tables containing them, such as the data tables B1-B6 shown in fig. 2, can be determined in the database. Each of the data tables B1-B6 contains a certain dimension of fact table A, so the data tables B1-B6 can be determined to be associated data tables having an association relationship with fact table A.
Of course, unlike the prior art, in the embodiment of the present application an associated data table may be either a dimension table or a fact table. In other words, the associated data tables B1 to B6 shown in fig. 2 may include not only dimensions of fact table A but also specific metric values under different dimensions (the specific metric values in the associated data tables B1 to B6 are not shown in fig. 2, and this does not limit the present application).
In practical applications, different sub data sets (i.e., associated data tables) may have sub data sets of their own, and in such a case the server may determine a complete data relationship model. Specifically, establishing a data relationship between each sub data set and the main data set includes: determining each associated data table containing the attribute information as a first-level sub data set associated with the main data set;
and determining each further level of sub data sets of the main data set as follows: for each sub data set at a given level, determining the sub-attribute information contained in that sub data set, determining each associated data table containing the sub-attribute information, and taking each such associated data table as a next-level sub data set of that sub data set, until no further associated data table can be determined according to the sub-attribute information.
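The level-by-level construction described above can be sketched as a breadth-first traversal. This is a simplified illustration, not the application's implementation: the function `build_model` and the sample schema are hypothetical, and the sketch assumes that each data table is attached only once, to the first data set with which it shares an attribute, so that the resulting model is a tree.

```python
from typing import Dict, List, Set


def build_model(main: str, schema: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return a parent -> children mapping rooted at the main data set."""
    children: Dict[str, List[str]] = {}
    visited = {main}
    current_level = [main]
    while current_level:  # stops when no further associated table is found
        next_level: List[str] = []
        for parent in current_level:
            children[parent] = []
            for name in sorted(schema):
                # A table sharing at least one attribute with the parent
                # becomes a next-level sub data set of that parent.
                if name not in visited and schema[parent] & schema[name]:
                    visited.add(name)
                    children[parent].append(name)
                    next_level.append(name)
        current_level = next_level
    return children


# Hypothetical schema: table name -> set of attribute names.
schema = {
    "main": {"customer id", "product id", "amount"},
    "sub1": {"customer id", "region id"},
    "sub2": {"region id", "region name"},
    "sub4": {"product id", "category id"},
    "sub5": {"category id", "category name"},
    "unrelated": {"warehouse id"},
}
model = build_model("main", schema)
```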
For example: as shown in fig. 3, the complete data relationship model determined for the server includes a main data set and 5 sub data sets, and the sub data set 1 itself has two subsets: subdata sets 2 and 3; the sub data set 4 itself has a subset: the sub data set 5.
On the basis of determining the complete data relation model, the data required by the user can be queried. If the identification information falls into a plurality of data sets, determining a query path to which the identification information belongs, specifically comprising: determining each data set in which the identification information falls, determining a common superior data set of each data set in which the identification information falls in the multi-dimensional data relation model, determining each path from each data set in which the identification information falls to the common superior data set, and taking each determined path as a query path to which the identification information belongs.
Still taking fig. 3 as an example, assume that the dimensions to be queried by the user are stored in sub data sets 2 and 3, respectively. When the server queries data, it first determines the sub data sets in which the data is located, i.e., sub data sets 2 and 3, and then traces back to the common higher-level data set of the two sub data sets, i.e., sub data set 1 in fig. 3. Therefore, the query paths of the data to be queried by the user at this time can be determined as: sub data set 1-sub data set 2, and sub data set 1-sub data set 3.
In another example, assuming that the dimension to be queried by the user falls in the sub data set 2 and the sub data set 5, when the server queries data, the sub data set 2 and the sub data set 5 are first determined, and are traced back to the upper layer relationship until being traced back to the common upper layer data set, as can be seen from fig. 3, the common upper layer data set of the sub data set 2 and the sub data set 5 is the main data set. Therefore, the query path of this time is: main data set-sub data set 1-sub data set 2, and main data set-sub data set 4-sub data set 5.
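The two examples above can be reproduced with a short sketch that finds the common higher-level data set and the path from it to each data set in which the identification information falls. The parent mapping mirrors the structure described for fig. 3; the names `PARENT`, `chain_to_root`, and `query_paths` are hypothetical.

```python
from typing import Dict, List, Optional

# Parent of each data set, following the structure described for fig. 3.
PARENT: Dict[str, Optional[str]] = {
    "main": None,
    "sub1": "main", "sub4": "main",
    "sub2": "sub1", "sub3": "sub1",
    "sub5": "sub4",
}


def chain_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain root, ..., node for the given data set."""
    chain = [node]
    while parent[chain[-1]] is not None:
        chain.append(parent[chain[-1]])
    return list(reversed(chain))


def query_paths(hits: List[str], parent: Dict[str, Optional[str]]) -> List[List[str]]:
    """Paths from the common higher-level data set down to each hit data set."""
    chains = [chain_to_root(hit, parent) for hit in hits]
    common = chains[0][0]
    # Walk down from the root for as long as all chains agree.
    for level in zip(*chains):
        if len(set(level)) != 1:
            break
        common = level[0]
    return [chain[chain.index(common):] for chain in chains]


paths_2_3 = query_paths(["sub2", "sub3"], PARENT)
paths_2_5 = query_paths(["sub2", "sub5"], PARENT)
```

For sub data sets 2 and 3 the common higher-level data set is sub data set 1, while for sub data sets 2 and 5 it is the main data set, matching the two examples in the text.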
Based on this, after the query path is determined, performing data query according to the query path and the identification information specifically includes: determining each data set corresponding to the query path according to the query path; and querying, in the determined data sets, the data corresponding to the identification information. The queried data is thus data having an association relationship.
Continuing with the above example, with reference to the data relationship model shown in fig. 3, assume that the dimensions M and N to be queried by the user fall into sub data set 2 and sub data set 5, respectively. The query paths determined by the server are then: main data set-sub data set 1-sub data set 2, and main data set-sub data set 4-sub data set 5. Data related to the dimension M is obtained from the main data set, sub data set 1, and sub data set 2, respectively, and data related to the dimension N is obtained from the main data set, sub data set 4, and sub data set 5, respectively. In this way, the associated data of the dimensions M and N is eventually queried.
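As a rough illustration of this example, the following sketch collects, for each dimension, the rows that carry that dimension from every data set on its query path. The in-memory representation of data sets as lists of dictionaries, the sample rows, and the function `query_along_paths` are assumptions for illustration only.

```python
from typing import Dict, List


def query_along_paths(paths: Dict[str, List[str]],
                      data_sets: Dict[str, List[dict]]) -> Dict[str, Dict[str, List[dict]]]:
    """For each dimension, gather matching rows from every data set on its path."""
    result: Dict[str, Dict[str, List[dict]]] = {}
    for dimension, path in paths.items():
        result[dimension] = {
            name: [row for row in data_sets[name] if dimension in row]
            for name in path
        }
    return result


# Hypothetical contents of the data sets in the model.
data_sets = {
    "main": [{"M": 1, "N": 2, "total": 10}],
    "sub1": [{"M": 1, "group": "a"}],
    "sub2": [{"M": 1, "label": "m-one"}],
    "sub4": [{"N": 2, "group": "b"}],
    "sub5": [{"N": 2, "label": "n-two"}],
}
paths = {
    "M": ["main", "sub1", "sub2"],
    "N": ["main", "sub4", "sub5"],
}
result = query_along_paths(paths, data_sets)
```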
Of course, in one mode of the embodiment of the present application, the server generates a corresponding Structured Query Language (SQL) statement according to the determined query path, and completes the query of the data required by the user according to the identification information.
In practical applications, different users may want the queried data results displayed in different ways. For example, some users want the queried sales data displayed in columns and the time data displayed in rows.
Based on this, in order for the user to obtain the data result in the desired form, the method further comprises: receiving a display instruction sent by a user, and converting, according to the display instruction, the data contained in the query result into the data format corresponding to the display instruction.
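One possible way to turn a query path into an SQL statement is sketched below. The application does not specify how the SQL is generated; the table names, the `JOIN_KEYS` mapping, and the function `build_sql` are all hypothetical, and the sketch assumes that each parent/child link on the path is joined on a single shared column. Table and column names are interpolated directly into the statement, so they are assumed to come from the server's own model metadata rather than from user input. The statement is run against an in-memory SQLite database only to show that it is well-formed.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical join key for each parent/child link in the model.
JOIN_KEYS: Dict[Tuple[str, str], str] = {
    ("orders", "customers"): "customer_id",
    ("customers", "regions"): "region_id",
}


def build_sql(path: List[str], columns: List[str]) -> str:
    """Join the data sets on the path from the top data set downwards."""
    sql = f"SELECT {', '.join(columns)} FROM {path[0]}"
    for parent, child in zip(path, path[1:]):
        key = JOIN_KEYS[(parent, child)]
        sql += f" JOIN {child} ON {parent}.{key} = {child}.{key}"
    return sql


sql = build_sql(["orders", "customers", "regions"],
                ["regions.region_name", "orders.total_price"])

# Tiny in-memory database used only to exercise the generated statement.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total_price REAL);
CREATE TABLE customers (customer_id INTEGER, region_id INTEGER);
CREATE TABLE regions (region_id INTEGER, region_name TEXT);
INSERT INTO orders VALUES (1, 10, 99.5);
INSERT INTO customers VALUES (10, 7);
INSERT INTO regions VALUES (7, 'North');
""")
rows = conn.execute(sql).fetchall()
conn.close()
```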
The data format corresponding to the display instruction comprises at least one of a graphical presentation format and a multi-dimensional list format. The graphical presentation format includes, but is not limited to, pie charts, bar charts, line charts, cube charts, and other views having a graphical structure.
That is to say, in the embodiment of the present application, the user may define the display mode of the required data result and send the corresponding definition settings (that is, the display instruction) to the server, so that the server processes the queried data according to the display mode required by the user and finally returns it to the user.
In summary, in the embodiment of the present application, the association relationship between the multidimensional data sets avoids the prior-art approach of performing physical modeling on data in a database (for example, constructing a specific data table format), so that a user does not need to know how specific data is stored or in which data table or field it is stored. On the one hand, this lowers the threshold for users; on the other hand, it greatly improves flexibility in application scenarios such as data interconnection. In addition, because the associated multidimensional data sets in the data relation model are duplicate copies, subsequent changes to metadata configuration such as dimensions or metrics do not affect one another, and the stability of the logical model is maintained.
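As an illustration of converting a query result into a multi-dimensional list format, the sketch below pivots flat result rows so that one dimension runs down the rows and another across the columns. The function `to_multidimensional_list`, its parameters, and the sample rows are hypothetical; a real display instruction could carry other settings.

```python
from typing import Any, Dict, List


def to_multidimensional_list(rows: List[Dict[str, Any]],
                             row_dim: str, col_dim: str, value: str) -> List[List[Any]]:
    """Pivot flat query rows into a header row plus one row per row_dim value."""
    row_keys = sorted({r[row_dim] for r in rows})
    col_keys = sorted({r[col_dim] for r in rows})
    cells = {(r[row_dim], r[col_dim]): r[value] for r in rows}
    table: List[List[Any]] = [[row_dim] + col_keys]
    for rk in row_keys:
        # Missing combinations are left as None.
        table.append([rk] + [cells.get((rk, ck)) for ck in col_keys])
    return table


# Hypothetical query result: time data in rows, products in columns.
rows = [
    {"month": "2020-01", "product": "P1", "quantity": 3},
    {"month": "2020-01", "product": "P2", "quantity": 5},
    {"month": "2020-02", "product": "P1", "quantity": 4},
]
table = to_multidimensional_list(rows, "month", "product", "quantity")
```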
Based on the same idea, the embodiment of the present application further provides a data query device, as shown in fig. 4.
In fig. 4, the data querying device includes: a receiving module 401, a determining module 402, a query path module 403, and a query processing module 404, wherein,
the receiving module 401 is configured to receive a data query request.
The determining module 402 is configured to determine identification information of the data to be queried corresponding to the data query request.
The query path module 403 is configured to determine, according to the determined identification information, a query path corresponding to the identification information in a pre-established multidimensional data relationship model.
The query processing module 404 is configured to perform data query according to the query path and the identification information, and generate a query result.
In an embodiment of the present application, the apparatus further includes: the data relation module is used for pre-selecting a fact table in a database, taking the selected fact table as a main data set, determining each attribute information in the main data set, and determining each associated data table corresponding to each attribute information in the database according to each attribute information, wherein each associated data table comprises a fact table and/or a dimension table; and taking each determined associated data table as a sub data set, and establishing a data relationship between each sub data set and the main data set to form the multi-dimensional data relationship model.
Further, the data relationship module is specifically configured to determine, for each attribute information, each data table including the attribute information in the database, and use each determined data table as an associated data table.
Based on this, the data relationship module is specifically configured to determine each associated data table containing the attribute information as a first-level sub data set associated with the main data set, and to determine each further level of sub data sets of the main data set by performing the following operations:
for each sub data set at a given level, determining the sub-attribute information contained in that sub data set, determining each associated data table containing the sub-attribute information, and taking each such associated data table as a next-level sub data set of that sub data set, until no further associated data table can be determined according to the sub-attribute information.
In an embodiment, if the identification information falls into multiple data sets, the query path module 403 is specifically configured to determine each data set into which the identification information falls, determine, in the multidimensional data relationship model, a common higher-level data set of the data sets into which the identification information falls, determine each path from each data set into which the identification information falls to the common higher-level data set, and take each determined path as a query path to which the identification information belongs.
The query processing module 404 is specifically configured to determine, according to the query path, each data set corresponding to the query path, and query, in each determined data set, data corresponding to the identification information according to the identification information.
Furthermore, the apparatus further comprises: a data format processing module 405, configured to receive a display instruction sent by a user, perform format conversion on data included in the query result according to the display instruction, and convert the data into a data format corresponding to the display instruction, where the data format corresponding to the display instruction includes: at least one of a graphical presentation format, a multi-dimensional list format.
In this embodiment of the application, the identification information of the data to be queried includes either a dimension to be queried together with its corresponding metric value, or a dimension alone.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.