CN114996369A - Method and device for constructing data warehouse index library - Google Patents

Method and device for constructing data warehouse index library Download PDF

Info

Publication number
CN114996369A
CN114996369A CN202210695782.XA CN202210695782A CN114996369A CN 114996369 A CN114996369 A CN 114996369A CN 202210695782 A CN202210695782 A CN 202210695782A CN 114996369 A CN114996369 A CN 114996369A
Authority
CN
China
Prior art keywords
index
dimension
attribute
data warehouse
fact table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210695782.XA
Other languages
Chinese (zh)
Inventor
吴振伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Miaoyi Biotechnology Co ltd
Original Assignee
Shanghai Miaoyi Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Miaoyi Biotechnology Co ltd filed Critical Shanghai Miaoyi Biotechnology Co ltd
Priority to CN202210695782.XA priority Critical patent/CN114996369A/en
Publication of CN114996369A publication Critical patent/CN114996369A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Abstract

The invention relates to a method and a device for constructing a data warehouse index library, wherein the method comprises the following steps: defining the dimensionality of the data warehouse, and setting a dimensionality master table of the dimensionality and a fact table associated with the dimensionality master table; defining an atomic index and setting a fact table associated with the atomic index; defining a modifier; configuring an index library of a data warehouse by using a graph database, wherein the configuration at least comprises the following steps: dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between dimension main table and fact table of dimension, and incidence relation between atom index and fact table. By the method and the device, the problem that the definition of the indexes in the data warehouse is not clear and definite enough in the related technology is solved, the indexes of the data warehouse are managed in a standardized mode, and repeated development and definition ambiguity of the indexes are reduced.

Description

Method and device for constructing data warehouse index library
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a device for constructing a data warehouse index library, computer equipment and a computer readable storage medium.
Background
At present, platforms for data analysis and data application of data warehouses built by enterprises are very common, data can be conveniently fetched and analyzed through the platforms, how the data of the bottom layer are processed is not required to be concerned, and the data analysis efficiency is improved. However, the definition of the index in the data warehouse is not clear enough, and the same index may be repeatedly developed, resulting in waste of resources, and also causing the occurrence of different definitions of the same index. And the calculation logic of the index can be known only by inquiring codes by developers, but the developers all define the index by the experience and the oral phase, so that the development after the addition of new people in the later period is not facilitated.
At present, no effective solution is provided for the problem that the definition of indexes in a data warehouse in the related art is not clear and definite enough.
Disclosure of Invention
The present application aims to overcome the defects in the prior art, and provides a method, an apparatus, a computer device and a computer-readable storage medium for constructing a data warehouse index library, so as to at least solve the problem that the definition of indexes in a data warehouse in the related art is not clear and definite.
In order to achieve the purpose, the technical scheme adopted by the application is as follows:
in a first aspect, an embodiment of the present application provides a method for constructing a database index library, including:
defining the dimensionality of a data warehouse, and setting a dimensionality main table of the dimensionality and a fact table associated with the dimensionality main table;
defining an atom index and setting a fact table associated with the atom index;
defining a modifier;
configuring an index library of a data warehouse by using a graph database, wherein the configuration at least comprises the following steps: dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between dimension main table and fact table of dimension, and incidence relation between atom index and fact table.
In some of these embodiments, configuring the metrics repository of the data repository with the graph database includes:
and setting a dimension node and an attribute thereof, an atom index node and an attribute thereof, a modifier node and an attribute thereof, a fact table node and an attribute thereof, and a relationship node and an attribute thereof by using a graph database, wherein the relationship node and the attribute thereof are used for indicating the association relationship between a dimension main table of the dimension and a fact table and the association relationship between an atom index and the fact table.
In some of these embodiments, further comprising:
determining a target fact table;
and acquiring a target dimension and a target atom index associated with the target fact table according to the relation node and the attribute thereof.
In some embodiments, the method further comprises:
generating a derivative index, wherein the derivative index is determined by a dimension, an atomic index and a modifier;
and taking the number from the data warehouse by using the derived index.
In some embodiments, the method further comprises:
determining a target fact table, and determining dimensions, an atomic index and a polymerization mode;
and aggregating the atomic indexes under the dimensionalities based on the determined dimensionalities, the atomic indexes and the aggregation mode to generate a target data table.
In some of these embodiments, the dimension master table includes at least: the method comprises the following steps of (1) dimension master table primary keys, dimension master table value keys and dimension master table attributes; the fact table includes at least: a foreign key and a dimension redundancy value, wherein the upper foreign key is consistent with the dimension master table primary key.
In a second aspect, an embodiment of the present application provides an apparatus for constructing a database of data warehouse indexes, including:
the first definition unit is used for defining the dimensionality of the data warehouse and setting a dimensionality master table of the dimensionality and a fact table associated with the dimensionality master table;
the second definition unit is used for defining the atomic index and setting a fact table associated with the atomic index;
a third defining unit for defining modifiers;
a configuration unit, configured to configure an index library of a data warehouse by using a graph database, where the configuration at least includes: the method comprises the following steps of dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between a dimension main table and a fact table of the dimension, and incidence relation between an atom index and the fact table.
In some of these embodiments, the configuration unit comprises:
the setting module is used for setting the dimension nodes and the attributes thereof, the atom index nodes and the attributes thereof, the modifier nodes and the attributes thereof, the fact table nodes and the attributes thereof, and the relationship nodes and the attributes thereof by using the graph database, wherein the relationship nodes and the attributes thereof are used for indicating the association relationship between the dimension main table and the fact table of the dimension and the association relationship between the atom index and the fact table.
In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the processor implements the method for constructing the database index library according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method for constructing a database index library as described in the first aspect.
By adopting the technical scheme, compared with the prior art, the method for constructing the index database of the data warehouse provided by the embodiment of the application defines the dimensionality of the data warehouse and sets the dimensionality master table and the fact table related to the dimensionality master table; defining an atomic index and setting a fact table associated with the atomic index; defining a modifier; the index database of the data warehouse is configured by using the graph database, so that the problem that the definition of the indexes in the data warehouse is not clear and definite enough in the related technology is solved, the indexes of the data warehouse are managed in a standardized manner, the repeated development and definition ambiguity of the indexes are reduced, and the convenience and the accuracy of using the data by a demander can be effectively improved.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a block diagram of a mobile terminal according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of building a data warehouse index library according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an index library configured using a graph database according to an embodiment of the present application;
FIG. 4 is a schematic diagram of data retrieval from a data warehouse using an index library according to an embodiment of the present application;
FIG. 5 is a block diagram of an apparatus for constructing a database index library according to an embodiment of the present application;
fig. 6 is a hardware structure diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless otherwise defined, technical or scientific terms referred to herein should have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The use of the terms "a" and "an" and "the" and similar referents in the context of describing the invention (including a single reference) are to be construed in a non-limiting sense as indicating either the singular or the plural. The present application is directed to the use of the terms "including," "comprising," "having," and any variations thereof, which are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Reference to "connected," "coupled," and the like in this application is not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "and/or" describes an association relationship of associated objects, meaning that three relationships may exist, for example, "A and/or B" may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. Reference herein to the terms "first," "second," "third," and the like, are merely to distinguish similar objects and do not denote a particular ordering for the objects.
The embodiment provides a mobile terminal. Fig. 1 is a block diagram of a mobile terminal according to an embodiment of the present application. As shown in fig. 1, the mobile terminal includes: a Radio Frequency (RF) circuit 110, a memory 120, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a wireless fidelity (WiFi) module 170, a processor 180, and a power supply 190. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 1 is not intended to be limiting of mobile terminals and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each constituent element of the mobile terminal in detail with reference to fig. 1:
the RF circuit 110 may be used for receiving and transmitting signals during a message transmission or a call, and in particular, may be used for processing the downlink message of the base station after receiving the downlink message; in addition, the data for designing uplink is transmitted to the base station. In general, the RF circuit includes, but is not limited to, an antenna, at least one Amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 110 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.
The memory 120 may be used to store software programs and modules, and the processor 180 executes various functional applications and data processing of the mobile terminal by operating the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile terminal, and the like. Further, the memory 120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 130 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the input unit 130 may include a touch panel 131 and other input devices 132. The touch panel 131, also referred to as a touch screen, may collect touch operations of a user on or near the touch panel 131 (e.g., operations of the user on or near the touch panel 131 using any suitable object or accessory such as a finger or a stylus pen), and drive the corresponding connection device according to a preset program. Alternatively, the touch panel 131 may include two parts, i.e., a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 180, and can receive and execute commands sent by the processor 180. In addition, the touch panel 131 may be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 130 may include other input devices 132 in addition to the touch panel 131. In particular, other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 140 may be used to display information input by a user or information provided to the user and various menus of the mobile terminal. The Display unit 140 may include a Display panel 141, and optionally, the Display panel 141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 131 can cover the display panel 141, and when the touch panel 131 detects a touch operation on or near the touch panel 131, the touch operation is transmitted to the processor 180 to determine the type of the touch event, and then the processor 180 provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although the touch panel 131 and the display panel 141 are shown in fig. 1 as two separate components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 131 and the display panel 141 may be integrated to implement the input and output functions of the mobile terminal.
The mobile terminal may also include at least one sensor 150, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel 141 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel 141 and/or the backlight when the mobile terminal is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), detect the magnitude and direction of gravity when stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration) for recognizing the attitude of the mobile terminal, and related functions (such as pedometer and tapping) for vibration recognition; other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile terminal, are not described herein again.
A speaker 161 and a microphone 162 in the audio circuit 160 may provide an audio interface between the user and the mobile terminal. The audio circuit 160 may transmit the electrical signal converted from the received audio data to the speaker 161, and convert the electrical signal into a sound signal for output by the speaker 161; on the other hand, the microphone 162 converts the collected sound signal into an electrical signal, which is received by the audio circuit 160 and converted into audio data, which is then processed by the audio data output processor 180 and then transmitted to, for example, another mobile terminal via the RF circuit 110, or the audio data is output to the memory 120 for further processing.
WiFi belongs to a short-distance wireless transmission technology, and the mobile terminal can help a user to send and receive e-mails, browse webpages, access streaming media and the like through the WiFi module 170, and provides wireless broadband internet access for the user. Although fig. 1 shows the WiFi module 170, it is understood that it does not belong to the essential components of the mobile terminal, and it can be omitted or replaced with other short-range wireless transmission modules, such as Zigbee module or WAPI module, etc., as required within the scope not changing the essence of the invention.
The processor 180 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 120 and calling data stored in the memory 120, thereby performing overall monitoring of the mobile terminal. Alternatively, processor 180 may include one or more processing units; preferably, the processor 180 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 180.
The mobile terminal also includes a power supply 190 (e.g., a battery) for powering the various components, which may preferably be logically coupled to the processor 180 via a power management system that may be configured to manage charging, discharging, and power consumption.
Although not shown, the mobile terminal may further include a camera, a bluetooth module, and the like, which will not be described herein.
In this embodiment, the processor 180 is configured to:
defining the dimensionality of a data warehouse, and setting a dimensionality main table of the dimensionality and a fact table associated with the dimensionality main table;
defining an atomic index and setting a fact table associated with the atomic index;
defining a modifier;
configuring an index library of a data warehouse by using a graph database, wherein the configuration at least comprises the following steps: the method comprises the following steps of dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between a dimension main table and a fact table of the dimension, and incidence relation between an atom index and the fact table.
In some of these embodiments, the processor 180 is further configured to:
and setting a dimension node and an attribute thereof, an atom index node and an attribute thereof, a modifier node and an attribute thereof, a fact table node and an attribute thereof, and a relationship node and an attribute thereof by using a graph database, wherein the relationship node and the attribute thereof are used for indicating the association relationship between a dimension main table of the dimension and a fact table and the association relationship between an atom index and the fact table.
In some of these embodiments, the processor 180 is further configured to:
determining a target fact table;
and acquiring a target dimension and a target atom index associated with the target fact table according to the relationship node and the attribute thereof.
In some of these embodiments, the processor 180 is further configured to:
generating a derivative index, wherein the derivative index is determined by a dimension, an atomic index and a modifier;
and taking the number from the data warehouse by using the derived index.
In some of these embodiments, the processor 180 is further configured to:
determining a target fact table, and determining dimensions, an atomic index and a polymerization mode;
and aggregating the atom indexes under the dimensionality based on the determined dimensionality, the atom indexes and the aggregation mode to generate a target data table.
The embodiment provides a method for constructing a data warehouse index library. Fig. 2 is a flowchart of a method for constructing a database index library according to an embodiment of the present application, where the flowchart includes the following steps, as shown in fig. 2:
step S201, defining the dimensionality of a data warehouse, and setting a dimensionality main table of the dimensionality and a fact table related to the dimensionality main table;
step S202, defining an atom index and setting a fact table associated with the atom index;
step S203, defining modifiers;
step S204, a database is used for configuring an index library of a data warehouse, wherein the configuration at least comprises the following steps: the method comprises the following steps of dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between a dimension main table and a fact table of the dimension, and incidence relation between an atom index and the fact table.
Through the steps, the problem that the definition of the indexes in the data warehouse is not clear and definite in the related technology is solved, the indexes of the data warehouse are managed in a standardized mode, repeated development and definition ambiguity of the indexes are reduced, and convenience and accuracy of using data by a demander can be effectively improved.
In some of these embodiments, configuring the metrics repository of the data repository with the graph database includes:
and setting a dimension node and an attribute thereof, an atom index node and an attribute thereof, a modifier node and an attribute thereof, a fact table node and an attribute thereof, a relation node and an attribute thereof, and a relation node and an attribute thereof by using a graph database, wherein the relation node and the attribute thereof are used for indicating the incidence relation between a dimension main table and a fact table of the dimension, and the incidence relation between an atom index and the fact table.
In some of these embodiments, the method further comprises:
determining a target fact table;
and acquiring a target dimension and a target atom index associated with the target fact table according to the relationship node and the attribute thereof.
It should be noted that the graph database map is composed of two elements: nodes and relationships. Each node represents an entity (person, place, thing, category or other data), and dimensions, fact tables, etc. can be represented in the data warehouse, and each relationship represents the association of two nodes. The general structure can be used for modeling various scenes, the relation between a fact table and a dimension can be flexibly defined in a data warehouse, and various attributes such as dependency attributes, parent-child attributes, related attributes and the like can be flexibly defined in relation.
Unlike other databases, relationships dominate graph databases. The data model of the graph database is also simpler and more expressive than relational or other NoSQL databases. Relational databases are not very friendly to the processing of relationships, and even if configuration is completed, performance may be degraded if the relationships are complex when configuration is obtained; the graph database only needs to configure the nodes, the attributes and the relationships among the nodes, so that the speed in the aspect of relationship processing is very high, and the graph database is more flexible and convenient.
Background configuration can be conveniently completed by using a graph database, the background configuration comprises setting atomic indexes and attributes thereof, setting dimension nodes and attributes thereof, setting modifier nodes and attributes thereof, setting fact table nodes and attributes thereof, and setting relationship nodes and attributes thereof, the work is mainly completed by data warehouse developers, after configuration is completed, a foreground user can directly call a graph database to call out the dimensions, the atomic indexes, modifiers and the like of the previously configured relationships after selecting a certain fact table, the selection can also be completed by the foreground user, the configuration can be more conveniently and flexibly completed by the user, and the specific process is shown in FIG. 3. The configuration of the derived index can also be set and generated in a background database.
In some of these embodiments, the dimension master table includes at least: the method comprises the following steps of (1) dimension master table primary keys, dimension master table value keys and dimension master table attributes; the fact table includes at least: a foreign key and a dimension redundancy value, wherein the upper foreign key is consistent with the dimension master table primary key.
In some embodiments, after defining the dimensions, the atomic index, the modifiers, the fact table, the method further comprises:
generating a derivative index, wherein the derivative index is determined by a dimension, an atomic index and a modifier;
and taking the number from the data warehouse by using the derived index.
In some of these embodiments, the method further comprises:
determining a target fact table, and determining dimensions, an atomic index and a polymerization mode;
and aggregating the atom indexes under the dimensionality based on the determined dimensionality, the atom indexes and the aggregation mode to generate a target data table.
In an actual application scenario, a process of taking numbers from a data warehouse by using an index library can be as shown in fig. 4, and based on a data warehouse and a subject domain determined by the data warehouse, the index library can be constructed, and dimensions, atomic indexes, modifiers and fact tables can be defined and an association relationship between the dimension, the atomic indexes, the modifiers and the fact tables can be set in the index library; then generating a derivative index based on the dimension, the atom index and the modifier; and finally, using the derived indexes to obtain the data from the data warehouse.
The following points need to be explained:
1. management of dimensions
The dimension is the angle from which things are observed, such as some stores pay amounts in the last 120 days, a store is a dimension. In SQL it is typically part of a group by. After the data warehouse defines the dimension, a dimension main table and an associated fact table of the dimension need to be set.
a. Dimension master table
There is typically one dimension master table for each dimension. The dimension master table mainly comprises the following contents:
dimension master key: typically, the ID is a store ID or the like.
Dimension master table value key: typically a name such as a shop name.
Dimension Main Table Attribute: and other auxiliary information such as the store leader and the like.
b. Dimension-related fact table
The dimension needs to associate a fact table containing indexes, and the fact table generally has two dimensions of fields:
an external key: the general field names are consistent with the primary keys of the dimension primary tables and serve as the foreign keys of the fact tables for associating the fact tables with the dimension tables.
Dimension redundancy value: a part of commonly used dimension attribute values are generally redundant on the fact table, so that the method is convenient to directly obtain and does not need to associate a dimension main table. Such as the store address, the store leader, etc.
2. Management of atomic indexes
The atomic index and the measurement have the same meaning, are based on the measurement under a certain business event behavior, are indexes which can not be split again in the business definition, and have definite business meanings, such as payment amount. The atomic index describes what is actually a type of index, such as order payment amount, payment amount of orders, order placing amount, PV, UV, and the like. But only one atomic index is not directly accessible, such as what is the number of visitors? This question is not answered. Because this problem does not specify information such as a specific dimension (number of visitors in the store dimension, or number of visitors in the merchandise dimension) and modifiers (number of visitors in the last 1 day, or last 90 days).
a. Correlation mode
An atomic index needs to be associated with a specific fact table, one atomic index may be associated with one or more fact tables, and the corresponding dimension of each fact table may be different.
b. Polymerization mode
The aggregation modes which can be used by different fact tables can also be different, and the aggregation modes can be configured according to actual situations and can be aggregated according to different dimensions. The fact table may be selected first, and corresponding dimension fields, atomic indexes, and aggregation modes (summation, counting, averaging, etc.) may be selected according to different fact tables, where the aggregation mode is for the atomic indexes, and finally, the corresponding table is generated.
For example: the method comprises the steps of providing a fact table of store sales, wherein dimensions comprise store names, responsible persons, sales dates and the like, measurement values comprise sales, the dimensions can be selected to be set to be 60 days for the store names and modifiers using the sales dates, atomic indexes select the sales, aggregation modes select and sum, the generated table is a list of the store names and the list of the sales of 60 days, and the sales of 60 days in each store can be conveniently inquired through the table.
3. Management of modifiers
Modifiers are some specific value of a dimension. Corresponding to the where filter condition in SQL. The approximate SQL statement is as follows:
condition of select sum (atomic index) from table where modifier
The dimension and the modifier have certain mutual substitution functions. There is no strict theorem on when to wait for a dimension and when to use modifiers. Just the problem that can be solved with dimensions is to try to use dimensions, not modifiers, because dimensions are a more general form. The service party can filter the dimensionality in the mysql library according to the requirement of the service party.
4. Management of derived metrics
The dimension and atom indexes are designed by standing at the angle of the data warehouse and the BI and conform to the star model of the data warehouse. There are multiple indexes on each fact table, and each fact table contains external keys of multiple dimension tables. But the index that the business side is more concerned about is the index that has the meaning of actual business and can directly fetch data. For example, the order payment amount of the shop for about 1 day is a derivative index, and the derivative index can be directly displayed on the product to be seen by the merchant. But this index cannot be directly drawn from the unified middle level of the data warehouse (the data warehouse typically provides a large wide table because there is no existing fact field). The index requirement of a bridge connection data warehouse middle layer and a business side is needed, so that the derived index is obtained.
The algorithm for deriving the index is as follows: the derived index is dimension + atomic index + modifier. When dimensions, atomic indexes and modifiers are all determined, a derivative index can be uniquely determined, and specific numerical values are given at the same time. The shop is a dimension in the order payment amount of the shop in the last 1 day, the shop is a modifier of a time type in the last 1 day, and the payment amount is an atomic index. The business side makes each derived index defined by selecting three metadata of dimension, atomic index and modifier, and the uniqueness of the index can be ensured compared with the use of name to distinguish different indexes. If the 2 derived indices are different, their components must be distinguished, either by different dimensions, or by different atomic indices, or by different modifiers.
By means of the method for creating the derived indexes, because the dimension and the atomic indexes are marked in the information index library, the index library can generate the SQL (aggregated by the data warehouse middle layer) for data acquisition, the development of the indexes is simplified, the trouble caused by the lack of experience or the inexperience of the data warehouse middle layer in the data development classmates is avoided, and the generated SQL (the data classmates can be modified based on the SQL and sometimes needs SQL optimization) is also used as the technical caliber of the derived indexes. When the derived index is needed in the subsequent item, the index can be searched in the index library, and the part of the access logic is checked and used. Repeated development of indexes is avoided (if a new derivative index is directly created without retrieval, the three elements uniquely determine one derivative index, and repeated creation can be detected), and ambiguity is eliminated.
The index library gives a globally unique and accurate definition to each index, and indexes can be inquired, developed and used more quickly and conveniently by a demander through the index library. The construction of the index library can effectively improve the convenience and accuracy of the data used by the demander. Centralized and standardized management can be realized in the index management process, and repeated development and definition ambiguity of the indexes are reduced.
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The present embodiment provides a device for constructing a database index library, where the device is used to implement the foregoing embodiments and preferred embodiments, and the description of the device that has been already made is omitted. As used hereinafter, the terms "module," "unit," "subunit," and the like may implement a combination of software and/or hardware for a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 5 is a block diagram of a construction apparatus of a data warehouse index library according to an embodiment of the present application, and as shown in fig. 5, the apparatus includes:
a first defining unit 51, configured to define a dimension of the data warehouse, and set a dimension master table of the dimension and a fact table associated therewith;
a second defining unit 52, configured to define an atomic index and set a fact table associated therewith;
a third defining unit 53, configured to define modifiers;
a configuration unit 54, configured to configure an index library of the data warehouse by using the graph database, where the configuration at least includes: dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between dimension main table and fact table of dimension, and incidence relation between atom index and fact table.
In some of these embodiments, the configuration unit 54 includes:
the setting module is used for setting the dimension nodes and the attributes thereof, the atom index nodes and the attributes thereof, the modifier nodes and the attributes thereof, the fact table nodes and the attributes thereof, and the relationship nodes and the attributes thereof by using the graph database, wherein the relationship nodes and the attributes thereof are used for indicating the association relationship between the dimension main table and the fact table of the dimension and the association relationship between the atom index and the fact table.
In some of these embodiments, the apparatus further comprises:
a first determination unit for determining a target fact table;
and the acquisition unit is used for acquiring the target dimension and the target atom index associated with the target fact table according to the relationship node and the attribute thereof.
In some of these embodiments, the apparatus further comprises:
the device comprises a first generation unit, a second generation unit and a third generation unit, wherein the first generation unit is used for generating a derivative index, and the derivative index is determined by dimension, an atom index and a modifier;
and the access unit is used for accessing from the data warehouse by using the derived indexes.
In some of these embodiments, the apparatus further comprises:
the second determining unit is used for determining the target fact table and determining the dimension, the atomic index and the aggregation mode;
and the second generation unit is used for aggregating the atomic indexes under the dimensionality based on the determined dimensionality, the atomic indexes and the aggregation mode to generate a target data table.
In some of these embodiments, the dimension master table includes at least: a dimension master table primary key, a dimension master table value key and a dimension master table attribute; the fact table includes at least: an outer key, a dimension redundancy value, wherein the upper outer key is consistent with the dimension primary table primary key.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
An embodiment provides a computer device. The method for constructing the database index database in combination with the embodiment of the application can be realized by computer equipment. Fig. 6 is a hardware structure diagram of a computer device according to an embodiment of the present application.
The computer device may comprise a processor 61 and a memory 62 in which computer program instructions are stored.
Specifically, the processor 61 may include a Central Processing Unit (CPU), or A Specific Integrated Circuit (ASIC), or may be configured to implement one or more Integrated circuits of the embodiments of the present Application.
Memory 62 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, memory 62 may include a Hard Disk Drive (Hard Disk Drive, abbreviated HDD), a floppy Disk Drive, a Solid State Drive (SSD), flash memory, an optical Disk, a magneto-optical Disk, tape, or a Universal Serial Bus (USB) Drive or a combination of two or more of these. Memory 62 may include removable or non-removable (or fixed) media, where appropriate. The memory 62 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 62 is a Non-Volatile (Non-Volatile) memory. In particular embodiments, Memory 62 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically rewritable ROM (EAROM), or FLASH Memory (FLASH), or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random-Access Memory (FPMDRAM), an Extended Data Out Dynamic Random Access Memory (EDODRAM), a Synchronous Dynamic Random Access Memory (SDRAM), and the like.
The memory 62 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions executed by the processor 61.
The processor 61 reads and executes the computer program instructions stored in the memory 62 to implement the method for constructing the database index library in any of the above embodiments.
In some of these embodiments, the computer device may also include a communication interface 63 and a bus 60. As shown in fig. 6, the processor 61, the memory 62, and the communication interface 63 are connected via a bus 60 to complete communication therebetween.
The communication interface 63 is used for implementing communication between modules, devices, units and/or apparatuses in the embodiments of the present application. The communication interface 63 may also enable communication with other components such as: the data communication is carried out among external equipment, image/data acquisition equipment, a database, external storage, an image/data processing workstation and the like.
Bus 60 comprises hardware, software, or both coupling the components of the computer device to each other. Bus 60 includes, but is not limited to, at least one of the following: data Bus (Data Bus), Address Bus (Address Bus), Control Bus (Control Bus), Expansion Bus (Expansion Bus), and Local Bus (Local Bus). By way of example, and not limitation, Bus 60 may include an Accelerated Graphics Port (AGP) or other Graphics Bus, an Enhanced Industry Standard Architecture (EISA) Bus, a Front-Side Bus (FSB), a Hyper Transport (HT) Interconnect, an ISA (ISA) Bus, an InfiniBand (InfiniBand) Interconnect, a Low Pin Count (LPC) Bus, a memory Bus, a microchannel Architecture (MCA) Bus, a PCI (Peripheral Component Interconnect) Bus, a PCI-Express (PCI-X) Bus, a Serial Advanced Technology Attachment (SATA) Bus, a Video Electronics Bus (audio Electronics Association), abbreviated VLB) bus or other suitable bus or a combination of two or more of these. Bus 60 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the present application, any suitable buses or interconnects are contemplated by the present application.
In addition, in combination with the method for constructing the database index library in the foregoing embodiment, the embodiment of the present application may provide a computer-readable storage medium to implement the method. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement the method of constructing a database of data warehouse indices of any of the above embodiments.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent application shall be subject to the appended claims.

Claims (10)

1. A method for constructing a database index database is characterized by comprising the following steps:
defining the dimensionality of a data warehouse, and setting a dimensionality main table of the dimensionality and a fact table associated with the dimensionality main table;
defining an atomic index and setting a fact table associated with the atomic index;
defining a modifier;
configuring an index library of a data warehouse by using a graph database, wherein the configuration at least comprises the following steps: dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between dimension main table and fact table of dimension, and incidence relation between atom index and fact table.
2. The method of constructing a database index library of a data warehouse as claimed in claim 1, wherein the configuring of the database index library of the data warehouse using the database comprises:
and setting a dimension node and an attribute thereof, an atom index node and an attribute thereof, a modifier node and an attribute thereof, a fact table node and an attribute thereof, a relation node and an attribute thereof, and a relation node and an attribute thereof by using a graph database, wherein the relation node and the attribute thereof are used for indicating the incidence relation between a dimension main table and a fact table of the dimension, and the incidence relation between an atom index and the fact table.
3. The method of building a data warehouse index library of claim 2, further comprising:
determining a target fact table;
and acquiring a target dimension and a target atom index associated with the target fact table according to the relationship node and the attribute thereof.
4. The method of building a data warehouse index library of claim 1, further comprising:
generating a derivative index, wherein the derivative index is determined by a dimension, an atomic index and a modifier;
and taking the number from the data warehouse by using the derived index.
5. The method of building a data warehouse index library of claim 1, further comprising:
determining a target fact table, and determining dimensions, an atomic index and a polymerization mode;
and aggregating the atomic indexes under the dimensionalities based on the determined dimensionalities, the atomic indexes and the aggregation mode to generate a target data table.
6. The method of building a data warehouse index library according to any of claims 1 to 5, wherein the dimension master table includes at least: the method comprises the following steps of (1) dimension master table primary keys, dimension master table value keys and dimension master table attributes; the fact table includes at least: an outer key, a dimension redundancy value, wherein the upper outer key is consistent with the dimension primary table primary key.
7. An apparatus for constructing a database of data warehouse index libraries, comprising:
the first definition unit is used for defining the dimensionality of the data warehouse and setting a dimensionality master table of the dimensionality and a fact table associated with the dimensionality master table;
the second definition unit is used for defining the atomic index and setting a fact table associated with the atomic index;
a third defining unit for defining modifiers;
a configuration unit, configured to configure an index library of a data warehouse by using a graph database, where the configuration at least includes: the method comprises the following steps of dimension attribute, atom index attribute, modifier attribute, fact table attribute, incidence relation between a dimension main table and a fact table of the dimension, and incidence relation between an atom index and the fact table.
8. The apparatus of claim 7, wherein the configuration unit comprises:
the setting module is used for setting the dimension nodes and the attributes thereof, the atom index nodes and the attributes thereof, the modifier nodes and the attributes thereof, the fact table nodes and the attributes thereof, and the relationship nodes and the attributes thereof by using the graph database, wherein the relationship nodes and the attributes thereof are used for indicating the association relationship between the dimension main table and the fact table of the dimension and the association relationship between the atom index and the fact table.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements a method of building a database of data warehouse indicators as claimed in any one of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of constructing a data warehouse index library according to any one of claims 1 to 6.
CN202210695782.XA 2022-06-20 2022-06-20 Method and device for constructing data warehouse index library Pending CN114996369A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210695782.XA CN114996369A (en) 2022-06-20 2022-06-20 Method and device for constructing data warehouse index library

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210695782.XA CN114996369A (en) 2022-06-20 2022-06-20 Method and device for constructing data warehouse index library

Publications (1)

Publication Number Publication Date
CN114996369A true CN114996369A (en) 2022-09-02

Family

ID=83035357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210695782.XA Pending CN114996369A (en) 2022-06-20 2022-06-20 Method and device for constructing data warehouse index library

Country Status (1)

Country Link
CN (1) CN114996369A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116861455A (en) * 2023-06-25 2023-10-10 上海数禾信息科技有限公司 Event data processing method, system, electronic device and storage medium
CN116861455B (en) * 2023-06-25 2024-04-26 上海数禾信息科技有限公司 Event data processing method, system, electronic device and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116861455A (en) * 2023-06-25 2023-10-10 上海数禾信息科技有限公司 Event data processing method, system, electronic device and storage medium
CN116861455B (en) * 2023-06-25 2024-04-26 上海数禾信息科技有限公司 Event data processing method, system, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN107464162B (en) Commodity association method and device and computer-readable storage medium
CN108541310B (en) Method and device for displaying candidate words and graphical user interface
CN107741937A (en) A kind of data query method and device
CN111125269B (en) Data management method, blood relationship display method and related device
CN104112213A (en) Method and apparatus of recommendation information
CN110019825B (en) Method and device for analyzing data semantics
CN107025225A (en) A kind of parallel execution method and apparatus of terminal database
CN111078556B (en) Application testing method and device
CN107463686A (en) A kind of method and device of calculating network public sentiment temperature
CN104516886A (en) Method, mobile terminal and server for displaying data analysis result
CN105335653A (en) Abnormal data detection method and apparatus
CN112395086B (en) Resource allocation method and device, electronic equipment and storage medium
CN111966491B (en) Method for counting occupied memory and terminal equipment
CN111027854A (en) Comprehensive portrait index generation method based on enterprise big data and related equipment
CN110597793A (en) Data management method and device, electronic equipment and computer readable storage medium
CN112540996A (en) Service data verification method and device, electronic equipment and storage medium
CN113392150A (en) Data table display method, device, equipment and medium based on service domain
CN109246233A (en) Data processing method, device, equipment and storage medium based on on-line monitoring
WO2021007757A1 (en) User identification method and related product
CN108616637B (en) Information processing method and device, electronic equipment and computer readable storage medium
CN110599158A (en) Virtual card combination method, virtual card combination device and terminal equipment
CN106777383B (en) File sorting method and intelligent terminal
CN114996369A (en) Method and device for constructing data warehouse index library
CN114840565A (en) Sampling query method, device, electronic equipment and computer readable storage medium
CN115686506A (en) Data display method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination