CN109376148B - Data processing method and device for slow change dimension table and electronic equipment - Google Patents

Info

Publication number
CN109376148B
Authority
CN
China
Prior art keywords
data
time
data partition
partition
target service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810962478.0A
Other languages
Chinese (zh)
Other versions
CN109376148A (en)
Inventor
崔晓晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN201810962478.0A
Publication of CN109376148A
Application granted
Publication of CN109376148B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure belongs to the technical field of data processing and relates to a data processing method and device for a slowly changing dimension table, a computer-readable storage medium, and electronic equipment. The method comprises the following steps: creating, according to a preset time range, a first data partition corresponding to the time range, wherein the first data partition stores the value that a primary key of the slowly changing dimension table held before it was updated within the time range, and the primary key is updated no more than once within the time range; creating a second data partition, wherein the second data partition stores the latest updated value of the primary key as of the current query time; extracting the latest value of the primary key from the first data partitions before a target service time to obtain the historical full data of the target service time; and extracting the value of the primary key from the second data partition to obtain the full data of the current query time. The method saves a large amount of storage space and still allows the full data of any given day to be reproduced.

Description

Data processing method and device for slow change dimension table and electronic equipment
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data processing method for a slowly-changing dimension table, a data processing apparatus for a slowly-changing dimension table, a computer storage medium, and an electronic device.
Background
With economic development and social progress, terminal devices such as computers and tablet computers have become the main tools of daily life and work. Each operation performed on a terminal device is stored as data in a data warehouse on the device, and this data can be used to analyze user behavior or the causes of program faults.
A major requirement of a data warehouse is to retain historical data, while a dimension table is generally required to hold only one record per ID so that data association stays efficient. These two requirements are difficult to reconcile: if an ID can correspond to only one record, the table can hold only the latest attribute value of that ID and cannot record how the value changed. A common industry practice is to extract the full dimension table every day and place it in that day's partition, so that history is preserved as a series of daily full extracts. However, when the dimension table is large, for example e-commerce commodity data or billions of user records, this scheme consumes a large amount of storage space.
Therefore, there is a need in the art for a data processing method and apparatus for slowly changing dimension tables.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure aims to provide a data processing method of a slowly-varying dimension table, a data processing device of a slowly-varying dimension table, a computer storage medium and an electronic device, so that a large amount of storage space is saved at least to some extent, and a full-scale data snapshot of service time can be reproduced.
According to one aspect of the present disclosure, there is provided a data processing method of a slowly varying dimensional table, comprising:
creating, according to a preset time range, a first data partition corresponding to the time range, wherein the first data partition is used for storing the value that a primary key of the slowly changing dimension table held before it was updated within the time range, and the primary key is updated no more than once within the time range;
creating a second data partition, wherein the second data partition is used for storing updated latest values of the current query time corresponding to the primary key;
extracting the latest value of the primary key in the first data partition before the target service time to obtain historical full data of the target service time; and extracting the value of the primary key in the second data partition to acquire the full data of the current query time.
In an exemplary embodiment of the present disclosure, the number of the primary keys is a plurality; the creating a first data partition corresponding to a time range according to the preset time range, where the first data partition is used to store a value before updating a primary key of the slowly-varying dimension table in the time range, and includes:
when the values of some of the primary keys are not updated, the positions in the first data partition corresponding to the primary keys that were not updated are empty.
In an exemplary embodiment of the disclosure, a plurality of data storage areas are respectively disposed in the first data partition and the second data partition, and the data storage areas are used for storing data corresponding to different primary keys.
In an exemplary embodiment of the present disclosure, the extracting the latest value of the primary key in the first data partition before the target service time to obtain historical full data of the target service time includes:
acquiring the target service time;
and extracting, according to the target service time, all the first data partitions before the target service time; acquiring, for each primary key, the data in the first data partitions whose update time is closest to the target service time; and determining that data as the historical full data of the target service time.
In an exemplary embodiment of the present disclosure, the method further comprises:
when a primary key is deleted, setting the position corresponding to the primary key in the second data partition to empty.
In an exemplary embodiment of the present disclosure, the number of the primary keys is a plurality; the method further comprises the steps of:
when none of the primary keys is updated within the time range, no first data partition is created for that time range.
According to an aspect of the present disclosure, there is provided a data processing apparatus for a slowly varying dimensional table, comprising:
a first data partition creation module, configured to create, according to a preset time range, a first data partition corresponding to the time range, wherein the first data partition is used for storing the value that a primary key of the slowly changing dimension table held before it was updated within the time range, and the primary key is updated no more than once within the time range;
a second data partition creation module, configured to create a second data partition, wherein the second data partition is used for storing the latest value of the primary key as of the current query time;
a full data acquisition module, configured to extract the latest value of the primary key in the first data partition before the target service time to obtain historical full data of the target service time, and to extract the value of the primary key in the second data partition to obtain the full data of the current query time.
In an exemplary embodiment of the present disclosure, the full-volume data acquisition module includes:
a target service time acquisition unit, configured to acquire the target service time;
and a historical full data acquisition unit, configured to extract, according to the target service time, all the first data partitions before the target service time, to acquire, for each primary key, the data in the first data partitions whose update time is closest to the target service time, and to determine that data as the historical full data of the target service time.
According to one aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data processing method of the slowly varying dimensional table of any one of the above.
According to one aspect of the present disclosure, there is provided an electronic device including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the data processing method of the slowly varying dimensional table of any of the above via execution of the executable instructions.
According to the data processing method of the slowly changing dimension table of the present disclosure, a first data partition is created according to a preset time range and a second data partition is also created, wherein the first data partition stores the value that a primary key of the slowly changing dimension table held before it was updated within the time range, and the second data partition stores the value of the primary key after the update. Historical full data of a target service time is obtained by extracting the latest value of the primary key from the first data partitions before the target service time, and full data of the current query time is obtained by extracting the updated value of the primary key from the second data partition. On the one hand, the method saves a large amount of storage space; on the other hand, it can reproduce the full data snapshot corresponding to a service time.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1 schematically illustrates a flow chart of a method of data processing for a slowly varying dimension table;
FIG. 2 schematically illustrates an exemplary diagram of an application scenario for a data processing method for a slowly varying dimension table;
FIG. 3 schematically illustrates a block diagram of a data partition;
FIG. 4 schematically shows a schematic diagram of a structure for reproducing full-volume data;
FIG. 5 schematically shows a schematic diagram of a data processing apparatus of a slowly varying dimension table;
FIG. 6 schematically illustrates an example block diagram of an electronic device for implementing a data processing method for a slowly varying dimensional table;
FIG. 7 schematically illustrates a computer readable storage medium for implementing a data processing method for a slowly varying dimensional table.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present disclosure. One skilled in the relevant art will recognize, however, that the aspects of the disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
In this exemplary embodiment, a data processing method of a slowly changing dimension table is provided first. The method may be executed on a server, a server cluster, a cloud server, or the like; those skilled in the art may also execute the method of the present disclosure on other platforms as required, which is not specifically limited in this exemplary embodiment. Referring to FIG. 1, the data processing method of the slowly changing dimension table may include the following steps:
S110, creating, according to a preset time range, a first data partition corresponding to the time range, wherein the first data partition is used for storing the value that a primary key of the slowly changing dimension table held before it was updated within the time range, and the primary key is updated no more than once within the time range;
S120, creating a second data partition, wherein the second data partition is used for storing the latest value of the primary key as of the current query time;
S130, extracting the latest value of the primary key in the first data partition before the target service time to obtain historical full data of the target service time; and extracting the value of the primary key in the second data partition to acquire the full data of the current query time.
In the data processing method of the slowly changing dimension table, the first data partition and the second data partition are created to store, respectively, the value that a primary key of the slowly changing dimension table held before it was updated within the preset time range and the latest value of the primary key. The historical full data of the target service time is obtained by extracting the latest value of the primary key from the first data partitions before the target service time, and the full data of the current query time is obtained by extracting the value of the primary key from the second data partition. On the one hand, the method saves a large amount of storage space; on the other hand, it makes it convenient to obtain the full data corresponding to a service time.
Next, each step in the data processing method of the above-described slow change dimension table in the present exemplary embodiment will be explained and described in detail with reference to fig. 2.
In step S110, a first data partition corresponding to a preset time range is created according to the time range, where the first data partition is used to store a value before the primary key of the slowly-varying dimension table is updated in the time range, and the number of times that the primary key is updated in the time range is not more than one.
In an exemplary embodiment of the present disclosure, a fact table for storing actual real-time data may first be created in the server 201 or the terminal 202, the primary key and foreign keys of the fact table are defined, and records of the fact data are inserted into the fact table. A plurality of dimension tables for storing data that describes the attributes from different angles are created at the same time, the primary key of each dimension table is defined, and records of the attribute data are inserted into the dimension tables. Finally, the fact table is associated with the dimension tables through the foreign keys of the fact table and the primary keys of the dimension tables. As will be appreciated by those skilled in the art, each dimension table has only one primary key.
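The association between a fact table and a dimension table can be illustrated with a small sketch. It uses Python's standard `sqlite3` module purely for demonstration; the table names `fact_sales` and `dim_product`, their columns, and the sample rows are hypothetical and do not come from this disclosure.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (            -- hypothetical dimension table
    product_id INTEGER PRIMARY KEY,   -- its single primary key
    category   TEXT                   -- a descriptive attribute
);
CREATE TABLE fact_sales (             -- hypothetical fact table
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),  -- foreign key
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'toys');
INSERT INTO fact_sales VALUES (10, 1, 9.5), (11, 2, 20.0), (12, 1, 5.0);
""")

# Associate fact records with the dimension through foreign key = primary key.
rows = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(rows)
```

Because the join matches each fact record against exactly one dimension record, keeping one record per key in the dimension table is what keeps this association cheap.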
In exemplary embodiments of the present disclosure, because data in a slowly changing dimension table is updated slowly, the data may be updated once a day, once every half month, and so on. Thus, the first data partition may be created according to different time ranges, such as daily, weekly, or monthly, provided that the primary key is updated no more than once within the chosen time range. For example, when the first data partition is created by day, it covers 00:00:00-23:59:59 of that day. The first data partition comprises a plurality of data storage areas, wherein each data storage area corresponds to the primary key of one slowly changing dimension table and is used for storing the value that the primary key held before it was updated within the corresponding time range.
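The mapping from an update time to the first data partition that covers it can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `partition_id` and the weekly and monthly identifier formats are hypothetical; only daily numbering of the form 20180401 appears in this description.

```python
from datetime import datetime


def partition_id(ts: datetime, granularity: str = "day") -> str:
    """Return the identifier of the first data partition whose time range covers ts."""
    if granularity == "day":
        # Every instant from 00:00:00 to 23:59:59 of a day maps to the same id.
        return ts.strftime("%Y%m%d")
    if granularity == "week":
        # Assumed format: ISO year and ISO week number.
        iso = ts.isocalendar()
        return f"{iso[0]}W{iso[1]:02d}"
    if granularity == "month":
        # Assumed format: year and month.
        return ts.strftime("%Y%m")
    raise ValueError(f"unsupported granularity: {granularity}")


print(partition_id(datetime(2018, 4, 1, 0, 0, 0)))
print(partition_id(datetime(2018, 4, 1, 23, 59, 59)))
```

The granularity would be chosen so that no primary key changes more than once inside a single range, since each partition has room for only one pre-update value per key.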
In step S120, a second data partition is created, where the second data partition is used to store the latest value of the current query time corresponding to the primary key.
In an exemplary embodiment of the present disclosure, a second data partition for storing the updated latest value of the primary key may be additionally created. For example, the dimension table associated with the fact table F is a first dimension table D1, a second dimension table D2, and a third dimension table D3, the primary key of the first dimension table D1 is a, the primary key of the second dimension table D2 is B, and the primary key of the third dimension table D3 is C, and accordingly, the first data partition and the second data partition respectively include three data storage areas, which respectively correspond to the primary keys A, B and C, and when one or more of A, B, C is updated at a certain time, the value before updating the primary key is stored in the corresponding data storage area in the first data partition, and the updated value is stored in the corresponding data storage area in the second data partition.
In an exemplary embodiment of the present disclosure, if a primary key is not updated, a data storage area corresponding to the primary key in a first data partition of a corresponding time range is set to be empty, so that storage space can be saved; further, if all the primary keys are not updated, then the first data partition is not created for the time frame, that is, only when the primary key is updated, the first data partition is created.
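The storage effect of leaving unchanged keys empty can be made concrete with a rough count of stored values. The sketch below compares daily full extraction (every key copied into every day's partition) with the two-partition scheme (one current value per key plus one pre-update value per update). The function names and the workload figures are hypothetical and chosen only for illustration.

```python
def full_extract_cells(num_keys: int, num_ranges: int) -> int:
    """Values stored when every key is copied into every time-range partition."""
    return num_keys * num_ranges


def two_partition_cells(num_keys: int, num_updates: int) -> int:
    """Values stored by the scheme: one current value per key in the second
    data partition, plus one pre-update value per update in the first ones."""
    return num_keys + num_updates


# Hypothetical workload: one million keys, one year of daily ranges,
# fifty thousand updates in total over that year.
keys, days, updates = 1_000_000, 365, 50_000
daily_full = full_extract_cells(keys, days)
two_partition = two_partition_cells(keys, updates)
print(daily_full, two_partition)
```

Under these assumed numbers the scheme stores 1,050,000 values instead of 365,000,000; the gap grows with the number of time ranges and shrinks as updates become more frequent.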
Fig. 3 shows a schematic diagram of the data partitions. As shown in Fig. 3, first data partitions are created with a time range of one day, no first data partition is created for a day on which no primary key is updated, and each first data partition and the second data partition (New-DB) include three data storage areas. Assume that the three primary keys are A, B, and C, and that the data corresponding to A, B, and C in the second data partition are a1, b1, and c1. If the data corresponding to primary key A changes on April 1, 2018, the new value a2 is stored at the position corresponding to primary key A in the second data partition, and the previous value a1 is stored at the position corresponding to primary key A in the first data partition numbered 20180401. If the data corresponding to primary key B changes on May 1, 2018, the new value b2 is stored at the position corresponding to primary key B in the second data partition, and the previous value b1 is stored at the position corresponding to primary key B in the first data partition numbered 20180501. If the data corresponding to primary key C changes on May 25, 2018, the new value c2 is stored at the position corresponding to primary key C in the second data partition, and the previous value c1 is stored at the position corresponding to primary key C in the first data partition numbered 20180525.
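The three updates of the Fig. 3 example can be replayed in a short sketch. The in-memory dictionaries and the function name `record_update` are assumptions made for illustration; the disclosure does not prescribe a particular storage structure.

```python
from typing import Dict


def record_update(first: Dict[str, Dict[str, str]], second: Dict[str, str],
                  key: str, new_value: str, part_id: str) -> None:
    """Move the current value of `key` into first data partition `part_id`,
    then store `new_value` in the second data partition."""
    if key in first.get(part_id, {}):
        # The scheme allows at most one update per key per time range.
        raise ValueError(f"{key} was already updated in range {part_id}")
    if key in second:
        # The partition is created lazily, only when a key actually changes.
        first.setdefault(part_id, {})[key] = second[key]
    second[key] = new_value


# Initial state of the second data partition (New-DB) in the example.
second = {"A": "a1", "B": "b1", "C": "c1"}
first: Dict[str, Dict[str, str]] = {}

record_update(first, second, "A", "a2", "20180401")
record_update(first, second, "B", "b2", "20180501")
record_update(first, second, "C", "c2", "20180525")

print(first)
print(second)
```

Each first data partition ends up holding a single pre-update value, while the slots for the two keys that did not change on that day are simply absent.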
In step S130, the latest value of the primary key in the first data partition before the target service time is extracted, so as to obtain historical full-scale data of the target service time; and simultaneously extracting the value of the primary key in the second data partition to acquire the full data of the current query time.
In an exemplary embodiment of the present disclosure, in order to extract an appropriate amount of data and analyze it, when data processing is completed, the latest value of the primary key in the first data partition before the target business time may be extracted to obtain the historical full data of the target business time; and extracting the value of the primary key in the second data partition to acquire the full data of the current query time.
In an exemplary embodiment of the present disclosure, since the latest updated value of each primary key is stored in the second data partition, the latest full data can be obtained by extracting the values in the second data partition. To obtain the full data corresponding to a target service time, all the first data partitions before the target service time are extracted, and for each primary key the data whose update time is closest to the target service time is taken from those partitions; together, these values form the historical full data corresponding to the target service time.
Fig. 4 shows a schematic diagram of reproducing full data. As shown in Fig. 4, the first data partitions for the time range 20180101-20180525 are stored in a database. To obtain the historical full data of May 1, 2018, the data closest in time to May 1, 2018 is extracted for each primary key. For example, if the first data partition closest to May 1, 2018 that holds primary key A is the partition of March 15, 2018, the corresponding data a2 is the latest value of primary key A. Similarly, the latest data corresponding to primary key B is b3 and the latest data corresponding to primary key C is c3, so the historical full data of May 1, 2018 is a2b3c3.
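The lookup described for Fig. 4 can be sketched as a scan over the first data partitions in date order. This is a minimal sketch, not a definitive implementation: the function name `historical_snapshot` is hypothetical, a partition dated on the target day itself is assumed to count as "before" the target, and every partition entry below other than a2 for key A on 20180315 is invented so that the result matches the a2b3c3 outcome stated above.

```python
from typing import Dict


def historical_snapshot(first: Dict[str, Dict[str, str]],
                        target_id: str) -> Dict[str, str]:
    """For each primary key, take the value from the latest first data
    partition whose date is not after `target_id` (format YYYYMMDD)."""
    snapshot: Dict[str, str] = {}
    # YYYYMMDD strings sort chronologically, so later partitions overwrite
    # earlier ones and each key keeps the value closest to the target.
    for part_id in sorted(first):
        if part_id > target_id:  # assumption: the target day is included
            break
        snapshot.update(first[part_id])
    return snapshot


# Hypothetical partition contents consistent with the Fig. 4 description.
first = {
    "20180101": {"A": "a1", "B": "b1", "C": "c1"},
    "20180210": {"B": "b2"},
    "20180315": {"A": "a2", "C": "c2"},
    "20180420": {"B": "b3"},
    "20180428": {"C": "c3"},
    "20180525": {"A": "a3"},
}

snapshot = historical_snapshot(first, "20180501")
print(snapshot)
```

The full data of the current query time needs no scan at all: it is a direct read of the second data partition.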
In an exemplary embodiment of the present disclosure, in the ETL stage, the data may be newly added data, modified data, and deleted data, and when the data is newly added or modified data, the newly added or modified data is stored in a data storage area corresponding to each primary key in the second data partition, and the data of each primary key stored in the second data partition before is stored in a data storage area corresponding to each primary key in the first data partition corresponding to a time range in which the data changes; when the data is deleted data, the data stored in the second data partition is stored in the first data partition when the data is deleted, and the corresponding data storage area of the second data partition is set to be empty.
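The handling of newly added, modified, and deleted data in one ETL run can be sketched as follows. How changes are detected is not stated in this description, so the sketch assumes the run receives the current source rows as a dictionary and compares them with the second data partition. The function name `run_load` is hypothetical, `None` stands in for an empty storage area, and source values are assumed never to be `None`.

```python
from typing import Dict, Optional


def run_load(first: Dict[str, Dict[str, str]],
             second: Dict[str, Optional[str]],
             source: Dict[str, str], part_id: str) -> None:
    """Apply one ETL run for time range `part_id` (assumed: one run per range)."""
    if part_id in first:
        raise ValueError(f"time range {part_id} has already been loaded")
    # Keys whose stored value differs from the source were modified or deleted;
    # their previous values go to the first data partition of this range.
    archived = {
        key: old for key, old in second.items()
        if old is not None and source.get(key) != old
    }
    for key in archived:
        if key not in source:
            second[key] = None  # deleted: the storage area is set to empty
    second.update(source)       # added and modified keys get their new values
    if archived:
        first[part_id] = archived  # no partition is created when nothing changed


second: Dict[str, Optional[str]] = {"A": "a1", "B": "b1", "C": "c1"}
first: Dict[str, Dict[str, str]] = {}

# Hypothetical source extract: A modified, B unchanged, C deleted, D added.
source = {"A": "a2", "B": "b1", "D": "d1"}
run_load(first, second, source, "20180401")
print(first)
print(second)
```

A second run with an unchanged source leaves the second data partition as it is and creates no first data partition for that range.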
The data processing method of the slowly changing dimension table greatly reduces the storage space required, preserves the complete history of the data, and allows the full data corresponding to any service time to be obtained for data analysis.
The present disclosure also provides a data processing apparatus for slowly changing dimension tables. Fig. 5 shows a schematic diagram of a structure of a data processing apparatus of a slow-varying dimension table, which may include a first data partition creation module 510, a second data partition creation module 520, and a full-volume data acquisition module 530, as shown in fig. 5. Wherein:
a first data partition creating module 510, configured to create a first data partition corresponding to a time range according to a preset time range, where the first data partition is used to store a value before a primary key of the slowly-changing dimension table is updated in the time range, and the number of times that the primary key is updated in the time range is not more than one;
a second data partition creation module 520 for creating a second data partition, the second data partition is used for storing the latest value of the current query time corresponding to the primary key;
a full data obtaining module 530, configured to extract the latest value of the primary key in the first data partition before the target service time, so as to obtain historical full data of the target service time; and extracting the value of the primary key in the second data partition to acquire the full data of the current query time.
The specific details of each module in the data processing device of the slowly-varying dimension table are described in detail in the data processing method of the corresponding slowly-varying dimension table, so that the details are not repeated here.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, various aspects of the disclosure may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to such an embodiment of the present disclosure is described below with reference to fig. 6. The electronic device 600 shown in fig. 6 is merely an example and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 6, the electronic device 600 is in the form of a general purpose computing device. Components of electronic device 600 may include, but are not limited to: the at least one processing unit 610, the at least one memory unit 620, and a bus 630 that connects the various system components, including the memory unit 620 and the processing unit 610.
The storage unit 620 stores program code that is executable by the processing unit 610, such that the processing unit 610 performs the steps according to various exemplary embodiments of the present disclosure described in the above "exemplary methods" section of the present specification. For example, the processing unit 610 may perform step S110 as shown in FIG. 1: creating, according to a preset time range, a first data partition corresponding to the time range, wherein the first data partition is used for storing the value that a primary key of the slowly changing dimension table held before it was updated within the time range, and the primary key is updated no more than once within the time range; step S120: creating a second data partition, wherein the second data partition is used for storing the latest value of the primary key as of the current query time; step S130: extracting the latest value of the primary key in the first data partition before the target service time to obtain historical full data of the target service time; and extracting the value of the primary key in the second data partition to acquire the full data of the current query time.
The storage unit 620 may include readable media in the form of volatile storage units, such as Random Access Memory (RAM) 6201 and/or cache memory unit 6202, and may further include Read Only Memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, and a processing unit or local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 1100 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. As shown, network adapter 660 communicates with other modules of electronic device 600 over bus 630. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible implementations, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the disclosure as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
Referring to fig. 7, a program product 700 for implementing the above-described method according to an embodiment of the present disclosure is described, which may employ a portable compact disc read-only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
Furthermore, the above-described figures are only schematic illustrations of processes included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.

Claims (6)

1. A data processing method for a slowly changing dimension table, comprising:
creating, according to a preset time range, a first data partition which corresponds to the time range and comprises a plurality of data storage areas, wherein each data storage area in the first data partition is used for storing the pre-update value of a different primary key of the slowly changing dimension table within the time range, and each primary key is updated no more than once within the time range; when there are a plurality of first data partitions, the time ranges corresponding to the first data partitions have the same length and do not intersect; when no primary key is updated within a time range, no first data partition is created for that time range;
creating a second data partition, wherein the number of data storage areas in the second data partition is the same as the number of data storage areas in the first data partition, and the second data partition is used for storing the latest value of the primary key as of the current query time;
acquiring a target service time, extracting all the first data partitions before the target service time according to the target service time, acquiring, from the first data partitions, the data which corresponds to each primary key and has the update time closest to the target service time, and determining the data which corresponds to each primary key and has the update time closest to the target service time as the historical full data of the target service time; and extracting the values of all the primary keys in the second data partition to obtain the full data of the current query time.
2. The data processing method for a slowly changing dimension table according to claim 1, wherein there are a plurality of primary keys;
the creating, according to a preset time range, of a first data partition which corresponds to the time range and comprises a plurality of data storage areas, wherein each data storage area in the first data partition is used for storing the pre-update value of a primary key of the slowly changing dimension table within the time range, comprises:
when the values of some of the primary keys are not updated, leaving empty the positions in the first data partition that correspond to the primary keys which are not updated.
3. The data processing method for a slowly changing dimension table according to claim 1, further comprising:
when a primary key is deleted, setting the position corresponding to that primary key in the second data partition to null.
4. A data processing apparatus for a slowly changing dimension table, comprising:
a first data partition creation module, configured to create, according to a preset time range, a first data partition which corresponds to the time range and comprises a plurality of data storage areas, wherein each data storage area in the first data partition is used for storing the pre-update value of a different primary key of the slowly changing dimension table within the time range, and each primary key is updated no more than once within the time range; when there are a plurality of first data partitions, the time ranges corresponding to the first data partitions have the same length and do not intersect; when no primary key is updated within a time range, no first data partition is created for that time range;
a second data partition creation module, configured to create a second data partition, wherein the number of data storage areas in the second data partition is the same as the number of data storage areas in the first data partition, and the second data partition is used for storing the latest updated value of the primary key as of the current query time;
a full data acquisition module, configured to acquire a target service time, extract all the first data partitions before the target service time according to the target service time, acquire, from the first data partitions, the data which corresponds to each primary key and has the update time closest to the target service time, and determine the data which corresponds to each primary key and has the update time closest to the target service time as the historical full data of the target service time; and to extract the values of all the primary keys in the second data partition to obtain the full data of the current query time.
5. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the data processing method for a slowly changing dimension table according to any one of claims 1-3.
6. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the data processing method for a slowly changing dimension table according to any one of claims 1-3 via execution of the executable instructions.
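As a purely illustrative aid to reading claims 1 to 3, the slot alignment between the two kinds of partition might look like the sketch below. The key list `PRIMARY_KEYS`, the lookup table `SLOT_OF`, the helper names, and the use of Python lists with `None` as the "empty" or "null" marker are all assumptions of this example and are not taken from the claims.

```python
# Hypothetical fixed ordering of primary keys; each key owns one slot
# (one "data storage area") at the same index in every partition.
PRIMARY_KEYS = ["k1", "k2", "k3"]
SLOT_OF = {key: index for index, key in enumerate(PRIMARY_KEYS)}


def new_partition():
    """Create a partition with one empty (None) slot per primary key."""
    return [None] * len(PRIMARY_KEYS)


def archive_pre_update_value(first_partition, key, old_value):
    """Claim 2: only updated keys receive a value; other slots stay empty."""
    first_partition[SLOT_OF[key]] = old_value


def delete_key(second_partition, key):
    """Claim 3: a deleted key keeps its slot, but the slot is set to null."""
    second_partition[SLOT_OF[key]] = None


second_partition = ["a1", "b0", "c0"]  # latest values of k1, k2, k3
first_partition = new_partition()      # history for a single time range
archive_pre_update_value(first_partition, "k1", "a0")  # only k1 was updated
delete_key(second_partition, "k3")                      # k3 was deleted
```

Keeping the same number of slots in both partitions means a key can be located by position alone, which is one plausible reason for the equal-count requirement in claim 1.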
CN201810962478.0A 2018-08-22 2018-08-22 Data processing method and device for slow change dimension table and electronic equipment Active CN109376148B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810962478.0A CN109376148B (en) 2018-08-22 2018-08-22 Data processing method and device for slow change dimension table and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810962478.0A CN109376148B (en) 2018-08-22 2018-08-22 Data processing method and device for slow change dimension table and electronic equipment

Publications (2)

Publication Number Publication Date
CN109376148A CN109376148A (en) 2019-02-22
CN109376148B true CN109376148B (en) 2023-07-18

Family

ID=65404468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810962478.0A Active CN109376148B (en) 2018-08-22 2018-08-22 Data processing method and device for slow change dimension table and electronic equipment

Country Status (1)

Country Link
CN (1) CN109376148B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274253A (en) * 2020-01-10 2020-06-12 北京奇艺世纪科技有限公司 Generation method and device of full-scale partition view, storage medium and electronic device
CN112306999B (en) * 2020-10-19 2024-08-23 亚信科技(中国)有限公司 Data auditing method, device, electronic equipment and computer readable storage medium
CN113779053A (en) * 2021-01-26 2021-12-10 北京沃东天骏信息技术有限公司 Method and device for acquiring primary key

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8260822B1 (en) * 2008-08-12 2012-09-04 United Services Automobile Association (Usaa) Systems and methods for storing and querying slowly changing dimensions
CN107861989A (en) * 2017-10-17 2018-03-30 平安科技(深圳)有限公司 Partitioned storage method, apparatus, computer equipment and the storage medium of data

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2008201035A1 (en) * 2007-04-13 2008-10-30 Acei Ab A partition management system
CN102141963B (en) * 2010-01-28 2016-06-08 阿里巴巴集团控股有限公司 A kind of data analysing method and equipment
CN103577474B (en) * 2012-08-03 2017-06-09 阿里巴巴集团控股有限公司 The update method and system of a kind of database
CN106709269B (en) * 2017-03-13 2018-08-07 山东众阳软件有限公司 A kind of creation method and system in medical treatment big data warehouse
CN112579692B (en) * 2019-09-29 2023-05-05 杭州海康威视数字技术股份有限公司 Data synchronization method, device, system, equipment and storage medium

Also Published As

Publication number Publication date
CN109376148A (en) 2019-02-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant