CN116401325B - Data processing method and device based on data warehouse model - Google Patents

Data processing method and device based on data warehouse model

Info

Publication number
CN116401325B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310545863.6A
Other languages
Chinese (zh)
Other versions
CN116401325A (en)
Inventor
彭友斌
袁俊飞
陈凯旋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Feishi Digital Technology Co ltd
Original Assignee
Guangzhou Feishi Digital Technology Co ltd
Application filed by Guangzhou Feishi Digital Technology Co ltd
Priority to CN202310545863.6A
Publication of CN116401325A
Application granted
Publication of CN116401325B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models

Abstract

The application provides a data processing method and device based on a data warehouse model, wherein the data processing method based on the data warehouse model comprises the following steps: obtaining a logic table to be processed; obtaining a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer; and inputting the logic table to be processed into the data warehouse model to obtain output data. After the logic table is submitted, the scheduling tasks are generated automatically as a black box, data production is automated, and data processing efficiency is greatly improved.

Description

Data processing method and device based on data warehouse model
Technical Field
The application relates to the technical field of data processing, in particular to a data processing method and device based on a data warehouse model.
Background
In data development, standardized modeling is used to build a logical data model, and layering the model design allows data to be managed and collected in a unified way. Under the traditional development mode of hand-written SQL, however, the calculation definitions of statistical indexes are often inconsistent and development efficiency is low. For example, in day-to-day data development a developer who is unfamiliar with the model or the business line may develop an SQL task for an index that already exists, so that the calculation definitions diverge and different reports show different results. Traditional SQL development also requires developers to have strong SQL skills in order to avoid data anomalies and make the most of database performance, and it requires continuous debugging.
That is, data processing in the prior art is inefficient.
Disclosure of Invention
The application provides a data processing method and device based on a data warehouse model, and aims to solve the problem of inefficient data processing in the prior art.
In a first aspect, the present application provides a data processing method based on a data warehouse model, the data processing method based on the data warehouse model including:
obtaining a logic table to be processed;
obtaining a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer;
and inputting the logic table to be processed into a data warehouse model to obtain output data.
Optionally, the inputting the logic table to be processed into a data warehouse model to obtain output data includes:
inputting the logic table to be processed into an ODS original data layer;
acquiring service related data from a data source by using the ODS original data layer, wherein the service related data comprises service system data, log data of service operations, log data generated by machine operation, and external data acquired by web crawlers or in other ways;
and inputting the business related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data.
Optionally, the inputting the service related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data includes:
inputting business related data into a DWD detail data layer;
cleaning the service related data by using the DWD detail data layer to obtain cleaned service related data, wherein in the DWD detail data layer a corresponding SQL task is configured according to the configured table structure, and the table and the timed scheduling task that produces the table data are created automatically;
and inputting the cleaned business related data into a DWS summary data layer and an ADS application data layer to obtain output data.
Optionally, the inputting the cleaned service related data into the DWS summary data layer and the ADS application data layer to obtain output data includes:
inputting the cleaned business related data into a DWS summary data layer;
generating, by the DWS summary data layer, a corresponding data table and a scheduling task to carry the data according to the configured derivative indexes and composite indexes;
and generating, by the ADS application data layer, a corresponding data table and a scheduling task to carry the application-layer data, with the configured indexes used as dimensions and index fields, to obtain output data.
Optionally, the acquiring service related data from the data source by using the ODS original data layer includes:
configuring a Spark or DataX synchronization task in the data synchronization task of the ODS original data layer, and automatically creating the corresponding target table and timed synchronization task;
obtaining service related data from the data source based on the corresponding target table and timed synchronization task.
Optionally, the data processing method based on the data warehouse model includes:
creating an atomic index, a derivative index and a composite index.
Optionally, the creating the atomic index, the derivative index and the composite index includes:
adding an atomic index by selecting a detail data table, configuring calculation logic, and saving and publishing; adding a derivative index by selecting an atomic index, selecting a summary data table, entering a business limit, and saving and publishing; and adding a composite index by selecting a derivative index, writing the index calculation logic, and saving and publishing.
In a second aspect, the present application provides a data processing apparatus based on a data warehouse model, the data processing apparatus based on a data warehouse model comprising:
the first acquisition unit is used for acquiring a logic table to be processed;
the second acquisition unit is used for acquiring a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer;
and the input unit is used for inputting the logic table to be processed into the data warehouse model to obtain output data.
Optionally, the input unit is configured to:
inputting the logic table to be processed into an ODS original data layer;
acquiring service related data from a data source by using the ODS original data layer, wherein the service related data comprises service system data, log data of service operations, log data generated by machine operation, and external data acquired by web crawlers or in other ways;
and inputting the business related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data.
Optionally, the input unit is configured to:
inputting business related data into a DWD detail data layer;
cleaning the service related data by using the DWD detail data layer to obtain cleaned service related data, wherein in the DWD detail data layer a corresponding SQL task is configured according to the configured table structure, and the table and the timed scheduling task that produces the table data are created automatically;
and inputting the cleaned business related data into a DWS summary data layer and an ADS application data layer to obtain output data.
Optionally, the input unit is configured to:
inputting the cleaned business related data into a DWS summary data layer;
generating, by the DWS summary data layer, a corresponding data table and a scheduling task to carry the data according to the configured derivative indexes and composite indexes;
and generating, by the ADS application data layer, a corresponding data table and a scheduling task to carry the application-layer data, with the configured indexes used as dimensions and index fields, to obtain output data.
Optionally, the input unit is configured to:
configuring a Spark or DataX synchronization task in the data synchronization task of the ODS original data layer, and automatically creating the corresponding target table and timed synchronization task;
obtaining service related data from the data source based on the corresponding target table and timed synchronization task.
Optionally, the input unit is configured to:
creating an atomic index, a derivative index and a composite index.
Optionally, the input unit is configured to:
adding an atomic index by selecting a detail data table, configuring calculation logic, and saving and publishing; adding a derivative index by selecting an atomic index, selecting a summary data table, entering a business limit, and saving and publishing; and adding a composite index by selecting a derivative index, writing the index calculation logic, and saving and publishing.
In a third aspect, the present application provides an electronic device, including:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and are configured to be executed by the processor to implement the data warehouse model-based data processing method of any of the first aspects.
In a fourth aspect, the present application provides a computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps in the data warehouse model-based data processing method of any one of the first aspects.
The application provides a data processing method and device based on a data warehouse model, wherein the data processing method based on the data warehouse model comprises the following steps: obtaining a logic table to be processed; obtaining a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer; and inputting the logic table to be processed into the data warehouse model to obtain output data. After the logic table is submitted, the scheduling tasks are generated automatically as a black box, data production is automated, and data processing efficiency is greatly improved.
Further, in index management, indexes are divided into atomic indexes, derivative indexes and composite indexes. An atomic index is an abstraction of an index's statistical definition and concrete algorithm; it is formed by setting calculation logic on submitted facts or dimensions. A derivative index is derived from an atomic index and delimits the business scope over which the atomic index is counted. A composite index is computed from different derivative indexes. During index development, the developer only needs to select attribute options or associate models in a concise development management interface; apart from a few expressions for operations between index data, for which some SQL code is filled in manually, the SQL that produces the data is generated by the system back end.
Furthermore, once the model has been configured online, the scheduling tasks of the corresponding models and the dependencies among those tasks can be generated with one click. This greatly improves data development efficiency and lowers the technical cost of data development. Developed indexes can also be reused across different business development tasks, which keeps the final results consistent.
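As a rough illustration of how generated scheduling tasks and their dependencies could be ordered, the following sketch uses Python's standard graphlib module. The task names and the dependency map are hypothetical and are not taken from the application; the application does not state which scheduler or ordering algorithm is used.

```python
from graphlib import TopologicalSorter

# Hypothetical task names, one per layer table.
# Each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "ods_orders_sync": set(),
    "dwd_order_detail": {"ods_orders_sync"},
    "dws_order_daily": {"dwd_order_detail"},
    "ads_sales_report": {"dws_order_daily"},
}


def execution_order(dependencies):
    """Return task names in an order that runs every upstream task first."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
print(order)
```

A scheduler built this way would only need the dependency map as input, which is consistent with the idea that tasks and their relationships are produced from the model configuration rather than written by hand.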
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of a data processing system based on a data warehouse model according to an embodiment of the present application;
FIG. 2 is a flow diagram of one embodiment of a data warehouse model-based data processing method provided in an embodiment of the present application;
FIG. 3 is a flow diagram of creating atomic, derivative, and composite metrics in one embodiment of a data warehouse model-based data processing method provided in embodiments of the present application;
FIG. 4 is a schematic diagram of one embodiment of a data processing apparatus based on a data warehouse model provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an embodiment of an electronic device provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In the description of the present application, it should be understood that the terms "center," "longitudinal," "transverse," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate an orientation or positional relationship based on that shown in the drawings, merely for convenience of description and to simplify the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be configured and operated in a particular orientation, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more features. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In this application, the term "exemplary" is used to mean "serving as an example, instance, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes have not been shown in detail to avoid obscuring the description of the present application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
The embodiment of the application provides a data processing method and device based on a data warehouse model, and the method and device are respectively described in detail below.
Referring to FIG. 1, FIG. 1 is a schematic view of a scenario of a data processing system based on a data warehouse model according to an embodiment of the present application, where the data processing system based on a data warehouse model may include an electronic device 100, and a data processing apparatus based on a data warehouse model is integrated in the electronic device 100.
In the embodiment of the present application, the electronic device 100 may be a general-purpose computer device or a special-purpose computer device. In a specific implementation, the electronic device 100 may be a desktop computer, a notebook computer, a network server, a palm computer (Personal Digital Assistant, PDA), a mobile phone, a tablet computer, a wireless terminal device, a communication device, an embedded device, a smart television, etc., and the embodiment does not limit the type of the electronic device 100.
It will be appreciated by those skilled in the art that the application environment shown in FIG. 1 is only one application scenario of the present application and does not limit the application scenarios of the present application. Other application environments may include more or fewer electronic devices than shown in FIG. 1; for example, only one electronic device is shown in FIG. 1, but the data warehouse model-based data processing system may also include one or more other electronic devices capable of processing data, which is not limited herein.
In addition, as shown in FIG. 1, the data warehouse model-based data processing system may also include a memory 200 for storing data.
It should be noted that the scenario of the data warehouse model-based data processing system shown in FIG. 1 is merely an example. The system and scenario described in the embodiments of the present application are intended to explain the technical solutions of the embodiments more clearly and do not limit them. As one of ordinary skill in the art will appreciate, as data warehouse model-based data processing systems evolve and new service scenarios appear, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
First, in an embodiment of the present application, a data processing method based on a data warehouse model is provided, where the data processing method based on the data warehouse model includes: obtaining a logic table to be processed; obtaining a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer; and inputting the logic table to be processed into a data warehouse model to obtain output data.
As shown in FIG. 2, FIG. 2 is a flowchart of one embodiment of a data processing method based on a data warehouse model provided in an embodiment of the present application, where the data processing method based on a data warehouse model includes the following steps S201 to S203:
S201, obtaining a logic table to be processed.
S202, acquiring a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer.
In the embodiment of the application, in the data modeling layering, the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer.
The ODS original data layer is used to obtain service-related data from a data source. Not only the data generated in business databases but all enterprise-related data should be collected into the ODS original data layer, including business system data, log data of business operations, log data generated by machine operation, and external data acquired by web crawlers or in other ways.
The DWD detail data layer provides dimensions and fact data. A dimension is a perspective from which things are observed: a descriptive attribute used to filter and classify the facts involved in a business process event. Fact data is what is obtained by observing things: the measurements related to a business process event. The DWS summary data layer precomputes statistical values by dimension, so that calculation is more efficient and ad hoc requirements can be answered much faster without repeatedly reading and processing the ODS original data layer and the DWD detail data layer; it normally serves the ADS application data layer. The ADS application data layer is a personalized data assembly layer prepared for specific business requirements once the dimension tables, fact tables and indexes have been built; apart from business-specific personalized labels, which must be processed separately, it reuses the dimension tables, fact tables and indexes as far as possible.
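The division of work among the four layers can be sketched with a small in-memory example. This is only an illustration under assumed data: the order records, field names and the helper functions to_dwd, to_dws and to_ads are hypothetical and do not come from the application, which operates on warehouse tables rather than Python lists.

```python
from collections import defaultdict

# ODS: raw rows as they arrive from the source (hypothetical order records).
ods_rows = [
    {"order_id": "1", "day": "2023-05-01", "amount": "10.5"},
    {"order_id": "2", "day": "2023-05-01", "amount": None},  # dirty row
    {"order_id": "3", "day": "2023-05-02", "amount": "4.5"},
]


def to_dwd(rows):
    """DWD: drop rows without an amount and cast fields into typed fact records."""
    return [
        {"order_id": int(r["order_id"]), "day": r["day"], "amount": float(r["amount"])}
        for r in rows
        if r["amount"] is not None
    ]


def to_dws(facts):
    """DWS: pre-aggregate the facts by the `day` dimension."""
    totals = defaultdict(float)
    for fact in facts:
        totals[fact["day"]] += fact["amount"]
    return dict(totals)


def to_ads(summary):
    """ADS: assemble report rows for one specific business need."""
    return [{"day": day, "total_amount": total} for day, total in sorted(summary.items())]


report = to_ads(to_dws(to_dwd(ods_rows)))
print(report)
```

Note that the ADS step reads only the DWS summary, which mirrors the statement that the upper layers do not need to go back to the ODS and DWD layers for each request.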
S203, inputting the logic table to be processed into a data warehouse model to obtain output data.
In this embodiment of the present application, inputting the logic table to be processed into the data warehouse model to obtain output data includes: inputting the logic table to be processed into the ODS original data layer; acquiring service related data from a data source by using the ODS original data layer, wherein the service related data comprises service system data, log data of service operations, log data generated by machine operation, and external data acquired by web crawlers or in other ways; and inputting the business related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data.
Specifically, obtaining service related data from a data source by using the ODS original data layer includes: configuring a Spark or DataX synchronization task in the data synchronization task of the ODS original data layer, and automatically creating the corresponding target table and timed synchronization task; and obtaining service related data from the data source based on the corresponding target table and timed synchronization task.
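One way to picture the automatic creation of a target table from a synchronization task is the sketch below. The SyncTask structure, its fields, the ods_ table prefix and the cron expression are all assumptions made for illustration; the application does not describe the configuration format, and no Spark or DataX API is called here.

```python
from dataclasses import dataclass


@dataclass
class SyncTask:
    """Hypothetical description of one ODS synchronization task."""
    engine: str        # "spark" or "datax", the two engines named in the text
    source_table: str  # table in the source system
    columns: dict      # column name -> SQL type
    cron: str          # schedule of the timed synchronization task


def target_table_name(task):
    """Derive the ODS target table name (the ods_ prefix is an assumed convention)."""
    return f"ods_{task.source_table}"


def target_table_ddl(task):
    """Build the DDL statement that creates the target table for the task."""
    if task.engine not in ("spark", "datax"):
        raise ValueError(f"unsupported engine: {task.engine}")
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in task.columns.items())
    return f"CREATE TABLE IF NOT EXISTS {target_table_name(task)} ({cols})"


task = SyncTask(
    engine="datax",
    source_table="orders",
    columns={"order_id": "BIGINT", "amount": "DECIMAL(10,2)"},
    cron="0 1 * * *",
)
ddl = target_table_ddl(task)
print(ddl)
```

In such a design the same task description would drive both the table creation and the registration of the timed job, so the developer only fills in the configuration.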
In this embodiment of the present application, inputting service-related data into the DWD detail data layer, the DWS summary data layer, and the ADS application data layer to obtain output data includes: inputting business related data into the DWD detail data layer; cleaning the service related data by using the DWD detail data layer to obtain cleaned service related data, wherein in the DWD detail data layer a corresponding SQL task is configured according to the configured table structure, and the table and the timed scheduling task that produces the table data are created automatically; and inputting the cleaned business related data into the DWS summary data layer and the ADS application data layer to obtain output data.
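The idea of deriving a cleaning SQL task from a configured table structure can be sketched as follows. The function dwd_sql_task, the table names and the NOT NULL cleaning rule are hypothetical; SQLite from the Python standard library is used here only so that the generated statement can be executed, and it stands in for whatever engine the warehouse actually uses.

```python
import sqlite3


def dwd_sql_task(source, target, columns, not_null):
    """Build an INSERT ... SELECT that copies `columns` and drops rows with NULLs in `not_null`."""
    select_list = ", ".join(columns)
    sql = f"INSERT INTO {target} ({select_list}) SELECT {select_list} FROM {source}"
    where = " AND ".join(f"{col} IS NOT NULL" for col in not_null)
    return f"{sql} WHERE {where}" if where else sql


# In-memory database with a raw ODS table and an empty DWD table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ods_orders (order_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE dwd_order_detail (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO ods_orders VALUES (?, ?)",
    [(1, 10.5), (2, None), (3, 4.5)],
)

# Generate the cleaning statement from the configuration and run it.
sql = dwd_sql_task("ods_orders", "dwd_order_detail", ["order_id", "amount"], ["amount"])
conn.execute(sql)
cleaned = conn.execute(
    "SELECT order_id, amount FROM dwd_order_detail ORDER BY order_id"
).fetchall()
print(cleaned)
```

A timed scheduling task would then simply re-run the generated statement on each scheduled interval.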
In this embodiment of the present application, inputting the cleaned service-related data into the DWS summary data layer and the ADS application data layer to obtain output data includes: inputting the cleaned business related data into the DWS summary data layer; generating, by the DWS summary data layer, a corresponding data table and a scheduling task to carry the data according to the configured derivative indexes and composite indexes; and generating, by the ADS application data layer, a corresponding data table and a scheduling task to carry the application-layer data, with the configured indexes used as dimensions and index fields, to obtain output data.
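The relationship between a DWS summary table and an ADS result can be illustrated with a short SQLite session. The table names, columns and sample rows are hypothetical, and SQLite is only a stand-in for the real warehouse engine; the SQL here is written by hand for illustration, whereas in the described system it would be generated from the index configuration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dwd_order_detail (day TEXT, shop TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO dwd_order_detail VALUES (?, ?, ?)",
    [("2023-05-01", "a", 10.0), ("2023-05-01", "a", 5.0), ("2023-05-01", "b", 2.0)],
)

# DWS: one summary table per dimension set, one column per configured index.
conn.execute(
    "CREATE TABLE dws_order_by_day_shop AS "
    "SELECT day, shop, SUM(amount) AS pay_amount, COUNT(*) AS order_cnt "
    "FROM dwd_order_detail GROUP BY day, shop"
)

# ADS: pick dimensions and index fields from the summary for one report.
ads_rows = conn.execute(
    "SELECT shop, pay_amount, pay_amount / order_cnt AS avg_order_amount "
    "FROM dws_order_by_day_shop ORDER BY shop"
).fetchall()
print(ads_rows)
```

The ADS query never touches the detail table, which is the reuse the summary layer is meant to provide.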
In the embodiment of the application, the data processing method based on the data warehouse model comprises the following steps: creating an atomic index, a derivative index and a composite index.
An atomic index is a measure based on a business process; as the name implies, it cannot be split further. In the system, the corresponding fact table field and calculation function are selected for it, and a corresponding SQL fragment is generated. A derivative index is an atomic index plus a statistical period, a business limit and statistical dimensions, and a corresponding summary table is generated for it; if a new derivative index has the same dimensions as existing ones, its index field is generated in the same summary table. A composite index is formed by applying logical operations to one or more derivative indexes of the same dimensions, and its data is also generated in the same summary table.
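The three index types and the SQL fragments generated from them might be modeled roughly as below. All class names, fields and the conditional-aggregation style are assumptions for illustration; the statistical period is omitted for brevity, and the application does not disclose the actual fragment format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicIndex:
    name: str
    fact_field: str  # field of the fact (detail) table
    function: str    # aggregate function, e.g. SUM or COUNT

    def sql(self):
        return f"{self.function}({self.fact_field})"


@dataclass(frozen=True)
class DerivedIndex:
    name: str
    atomic: AtomicIndex
    business_limit: str  # filter that delimits the business scope
    dimensions: tuple    # statistical dimensions

    def sql(self):
        # Conditional aggregation lets several derived indexes share one summary query.
        a = self.atomic
        return f"{a.function}(CASE WHEN {self.business_limit} THEN {a.fact_field} END) AS {self.name}"


@dataclass(frozen=True)
class CompositeIndex:
    name: str
    expression: str  # arithmetic over derived-index names of the same dimensions

    def sql(self):
        return f"{self.expression} AS {self.name}"


def summary_sql(detail_table, derived):
    """Build one summary query for derived indexes that share the same dimensions."""
    dims = {d.dimensions for d in derived}
    if len(dims) != 1:
        raise ValueError("derived indexes in one summary table must share dimensions")
    dim_list = ", ".join(dims.pop())
    fields = ", ".join(d.sql() for d in derived)
    return f"SELECT {dim_list}, {fields} FROM {detail_table} GROUP BY {dim_list}"


amount = AtomicIndex("order_amount", "amount", "SUM")
paid = DerivedIndex("paid_amount", amount, "status = 'paid'", ("day",))
refunded = DerivedIndex("refund_amount", amount, "status = 'refunded'", ("day",))
rate = CompositeIndex("refund_rate", "refund_amount / paid_amount")

sql = summary_sql("dwd_order_detail", [paid, refunded])
print(sql)
print(rate.sql())
```

Because both derived indexes share the dimension tuple, they land in the same query and therefore the same summary table, and the composite index can then be expressed over the two generated columns.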
In index management, indexes are divided into atomic indexes, derivative indexes and composite indexes. An atomic index is an abstraction of an index's statistical definition and concrete algorithm; it is formed by setting calculation logic on submitted facts or dimensions. A derivative index delimits the business scope over which an atomic index is counted. A composite index is computed from different derivative indexes. During index development, the developer only needs to select attribute options or associate models in a concise development management interface; apart from a few expressions for operations between index data, for which some SQL code is filled in manually, the SQL that produces the data is generated by the system back end.
As shown in FIG. 3, in the embodiment of the present application, creating an atomic index, a derivative index, and a composite index includes: adding an atomic index by selecting a detail data table, configuring calculation logic, and saving and publishing; adding a derivative index by selecting an atomic index, selecting a summary data table, entering a business limit, and saving and publishing; and adding a composite index by selecting a derivative index, writing the index calculation logic, and saving and publishing. Selecting the summary data table includes selecting a table field and selecting the associated statistical dimensions.
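The ordering implied by this flow, in which a derivative index builds on a published atomic index and a composite index builds on published derivative indexes, can be sketched as a small registry. The IndexRegistry class and its validation rules are hypothetical and only one possible reading of FIG. 3; the application does not say how publication is enforced.

```python
# Kind of parent each index kind must be built on (None means no parent).
REQUIRED_PARENT = {"atomic": None, "derived": "atomic", "composite": "derived"}


class IndexRegistry:
    """Hypothetical registry that only publishes an index once its parents are published."""

    def __init__(self):
        self._published = {}  # index name -> kind

    def publish(self, name, kind, parents=()):
        parent_kind = REQUIRED_PARENT[kind]  # raises KeyError for an unknown kind
        if parent_kind is None:
            if parents:
                raise ValueError(f"{name}: an atomic index takes no parent index")
        else:
            if not parents:
                raise ValueError(f"{name}: select at least one {parent_kind} index")
            for parent in parents:
                if self._published.get(parent) != parent_kind:
                    raise ValueError(f"{name}: {parent} is not a published {parent_kind} index")
        self._published[name] = kind

    def kind_of(self, name):
        return self._published[name]


registry = IndexRegistry()
registry.publish("order_amount", "atomic")
registry.publish("paid_amount", "derived", parents=["order_amount"])
registry.publish("refund_amount", "derived", parents=["order_amount"])
registry.publish("refund_rate", "composite", parents=["refund_amount", "paid_amount"])
print(registry.kind_of("refund_rate"))
```

Checking parents at publish time is one simple way to keep index definitions reusable and consistent across development tasks.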
A mechanism that integrates traditional IT architectures and all kinds of data, old and new, connects isolated data islands, accumulates data assets, quickly forms data service capability, and supports enterprise business decisions and fine-grained operations is called a data middle platform. The data middle platform treats data assets as an independent basic element, and data that has become an asset is fed into the process of creating business value as a production input, continuously driving enterprise development.
In this application, business generates data and data serves the business, forming a closed loop. The data processing procedure, together with data quality feedback from the ODS original data layer, ensures the accuracy, completeness, timeliness, validity and consistency of the data.
In order to better implement the data processing method based on the data warehouse model in the embodiment of the present application, a data processing device based on the data warehouse model is further provided on the basis of that method. The data processing device based on the data warehouse model is integrated in an electronic device and, as shown in FIG. 4, includes:
a first obtaining unit 301, configured to obtain a logic table to be processed;
a second obtaining unit 302, configured to obtain a data warehouse model, where the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer, and an ADS application data layer;
and the input unit 303 is used for inputting the logic table to be processed into the data warehouse model to obtain output data.
Optionally, the input unit is configured to:
inputting the logic table to be processed into an ODS original data layer;
acquiring service related data from a data source by using the ODS original data layer, wherein the service related data comprises service system data, log data of service operations, log data generated by machine operation, and external data acquired by web crawlers or in other ways;
and inputting the business related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data.
Optionally, the input unit is configured to:
inputting business related data into a DWD detail data layer;
cleaning the service related data by using the DWD detail data layer to obtain cleaned service related data, wherein in the DWD detail data layer a corresponding SQL task is configured according to the configured table structure, and the table and the timed scheduling task that produces the table data are created automatically;
and inputting the cleaned business related data into a DWS summary data layer and an ADS application data layer to obtain output data.
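One possible reading of the DWD step is that a configured table structure drives both the table creation and the cleaning SQL task. The sketch below illustrates that reading only. It uses Python's built-in `sqlite3` as a stand-in engine, and the table names `ods_orders` and `dwd_orders`, the cleaning rules (drop rows with missing required fields, remove exact duplicates), and the cron string are all assumptions.

```python
import sqlite3

# Hypothetical configured table structure for one DWD table.
dwd_config = {
    "table": "dwd_orders",
    "source": "ods_orders",
    "columns": {"order_id": "INTEGER", "customer": "TEXT", "amount": "REAL"},
    "not_null": ["order_id", "amount"],  # rows missing these fields are dropped
    "schedule": "0 2 * * *",             # assumed cron expression: daily at 02:00
}


def build_sql_task(cfg: dict) -> dict:
    """Derive the DDL and the cleaning statement from the table structure."""
    cols = ", ".join(f"{name} {typ}" for name, typ in cfg["columns"].items())
    names = ", ".join(cfg["columns"])
    where = " AND ".join(f"{c} IS NOT NULL" for c in cfg["not_null"])
    return {
        "create": f"CREATE TABLE IF NOT EXISTS {cfg['table']} ({cols})",
        "load": (f"INSERT INTO {cfg['table']} ({names}) "
                 f"SELECT DISTINCT {names} FROM {cfg['source']} WHERE {where}"),
        "schedule": cfg["schedule"],
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ods_orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO ods_orders VALUES (?, ?, ?)",
    [(1, "a", 10.0), (1, "a", 10.0), (2, "b", None), (3, "c", 7.5)],
)

task = build_sql_task(dwd_config)
conn.execute(task["create"])  # the DWD table is created from the configuration
conn.execute(task["load"])    # a scheduler would run this on task["schedule"]

cleaned = conn.execute(
    "SELECT order_id, customer, amount FROM dwd_orders ORDER BY order_id"
).fetchall()
print(cleaned)
```

In a real deployment the `schedule` value would be handed to whatever scheduler is in use; here it is only carried along as data.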
Optionally, the input unit is configured to:
inputting the cleaned business related data into a DWS summary data layer;
generating, by using the DWS summary data layer and according to the configured derivative indexes and composite indexes, a corresponding data table and scheduling task to carry the data;
and generating, by using the ADS application data layer and according to the multiple indexes configured as dimensions and index fields, a corresponding data table and scheduling task to carry the application-layer data, thereby obtaining output data.
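The DWS and ADS steps can be illustrated with two derived tables. This is a sketch under assumptions, again using `sqlite3` as a stand-in engine: the table names, the `region` dimension, and the three example indexes (`order_cnt`, `amount_sum`, `avg_order_amount`) are hypothetical and do not come from the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dwd_orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO dwd_orders VALUES (?, ?, ?)",
    [(1, "north", 10.0), (2, "north", 30.0), (3, "south", 5.0)],
)

# DWS: a summary table that carries the configured derivative indexes and a
# composite index computed from them, grouped by one dimension.
conn.execute("""
    CREATE TABLE dws_orders_by_region AS
    SELECT region,
           COUNT(*)               AS order_cnt,        -- derivative index (assumed)
           SUM(amount)            AS amount_sum,       -- derivative index (assumed)
           SUM(amount) / COUNT(*) AS avg_order_amount  -- composite index (assumed)
    FROM dwd_orders
    GROUP BY region
""")

# ADS: an application-facing table whose fields are the dimension plus the
# indexes selected for this particular report.
conn.execute("""
    CREATE TABLE ads_region_report AS
    SELECT region, amount_sum, avg_order_amount
    FROM dws_orders_by_region
""")

report = conn.execute(
    "SELECT region, amount_sum, avg_order_amount "
    "FROM ads_region_report ORDER BY region"
).fetchall()
print(report)
```

Each `CREATE TABLE ... AS SELECT` statement plays the role of a task that a scheduler would rerun to refresh the corresponding table.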
Optionally, the input unit is configured to:
configuring a Spark or DataX synchronization task in the data synchronization task of the ODS original data layer, and automatically creating a corresponding target table and a timed synchronization task;
acquiring business related data from the data source based on the corresponding target table and the timed synchronization task.
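The idea that one synchronization task yields both a target table and a timed job can be sketched as follows. The dictionary layout, the `ods_` prefix, the column types, and the cron expression are assumptions made for illustration; the sketch does not reproduce the actual job formats of Spark or DataX.

```python
import json

# Hypothetical description of one synchronization task.
sync_task = {
    "engine": "datax",  # or "spark"
    "source": {
        "table": "orders",
        "columns": {"order_id": "BIGINT", "customer": "STRING", "amount": "DOUBLE"},
    },
    "target_prefix": "ods_",
    "cron": "30 1 * * *",  # assumed schedule: daily at 01:30
}


def plan_sync(task: dict) -> dict:
    """Derive the target table DDL and the timed job entry from the task."""
    src = task["source"]
    target = task["target_prefix"] + src["table"]
    cols = ", ".join(f"{name} {typ}" for name, typ in src["columns"].items())
    return {
        "target_table": target,
        "ddl": f"CREATE TABLE IF NOT EXISTS {target} ({cols})",
        "job": {
            "engine": task["engine"],
            "cron": task["cron"],
            "read": src["table"],
            "write": target,
            "columns": list(src["columns"]),
        },
    }


plan = plan_sync(sync_task)
print(plan["ddl"])
print(json.dumps(plan["job"], indent=2))
```

Deriving the DDL and the job entry from the same task description keeps the target table and the synchronization job consistent with each other.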
Optionally, the input unit is configured to:
creating an atomic index, a derivative index and a composite index.
Optionally, the input unit is configured to:
adding an atomic index: selecting a detail data table, configuring the calculation logic, then saving and publishing; adding a derivative index: selecting an atomic index, selecting a summary data table, entering the business limits, then saving and publishing; and adding a composite index: selecting a derivative index, writing the index calculation logic, then saving and publishing.
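The three creation flows above build on one another: a derivative index refers to an atomic index, and a composite index refers to a derivative index. A minimal sketch of that dependency chain follows. The class names, the `IndexRegistry` helper, the generated SQL shape, and the sample index names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AtomicIndex:
    name: str
    detail_table: str  # the selected detail data table
    logic: str         # the configured calculation logic


@dataclass(frozen=True)
class DerivedIndex:
    name: str
    atomic: str                           # name of the selected atomic index
    summary_table: str                    # the selected summary data table
    business_limit: Optional[str] = None  # e.g. a filter condition


@dataclass(frozen=True)
class CompositeIndex:
    name: str
    derived: str     # name of the selected derivative index
    expression: str  # index calculation logic written by the user


class IndexRegistry:
    """Stores published indexes and checks that references resolve."""

    def __init__(self) -> None:
        self.published: Dict[str, object] = {}

    def save_and_publish(self, index) -> None:
        if isinstance(index, DerivedIndex) and index.atomic not in self.published:
            raise KeyError(f"unknown atomic index: {index.atomic}")
        if isinstance(index, CompositeIndex) and index.derived not in self.published:
            raise KeyError(f"unknown derivative index: {index.derived}")
        self.published[index.name] = index

    def derived_sql(self, name: str) -> str:
        """Build an illustrative load statement for a derivative index."""
        d = self.published[name]
        a = self.published[d.atomic]
        where = f" WHERE {d.business_limit}" if d.business_limit else ""
        return (f"INSERT INTO {d.summary_table} ({d.name}) "
                f"SELECT {a.logic} FROM {a.detail_table}{where}")


registry = IndexRegistry()
registry.save_and_publish(AtomicIndex("pay_amount", "dwd_orders", "SUM(amount)"))
registry.save_and_publish(DerivedIndex("pay_amount_north", "pay_amount",
                                       "dws_orders_north", "region = 'north'"))
registry.save_and_publish(CompositeIndex("pay_amount_north_k", "pay_amount_north",
                                         "pay_amount_north / 1000.0"))
sql = registry.derived_sql("pay_amount_north")
print(sql)
```

Checking references at publish time means an index cannot be released before the index it depends on.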
The embodiment of the application also provides an electronic device, which integrates any one of the data processing devices based on the data warehouse model, and the electronic device comprises:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to perform the steps of the data processing method based on the data warehouse model according to any of the method embodiments described above.
Fig. 5 shows a schematic structural diagram of an electronic device according to an embodiment of the present application. Specifically:
The electronic device may include a processor 601 having one or more processing cores, a memory 602 comprising one or more computer-readable storage media, a power supply 603, an input unit 604, and other components. Those skilled in the art will appreciate that the structure shown in the figure does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or arrange the components differently. Wherein:
The processor 601 is the control center of the electronic device. It connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 602 and calling the data stored in the memory 602, thereby monitoring the electronic device as a whole. Optionally, the processor 601 may include one or more processing cores. The processor 601 may be a central processing unit (CPU), or another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor. Preferably, the processor 601 may integrate an application processor, which primarily handles the operating system, physical interfaces, application programs, and the like, with a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor may also not be integrated into the processor 601.
The memory 602 may be used to store software programs and modules, and the processor 601 may execute various functional applications and data processing by executing the software programs and modules stored in the memory 602. The memory 602 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data created according to the use of the electronic device, etc. In addition, the memory 602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 602 may also include a memory controller to provide access to the memory 602 by the processor 601.
The electronic device further comprises a power supply 603 for supplying power to the various components. Preferably, the power supply 603 may be logically connected to the processor 601 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system. The power supply 603 may also include any one or more components such as a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
The electronic device may further comprise an input unit 604, which input unit 604 may be used for receiving input digital or character information and for generating keyboard, mouse, joystick, optical or trackball signal inputs in connection with physical settings and function control.
Although not shown, the electronic device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 601 in the electronic device loads executable files corresponding to the processes of one or more application programs into the memory 602 according to the following instructions, and the processor 601 executes the application programs stored in the memory 602, so as to implement various functions as follows:
obtaining a logic table to be processed; obtaining a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer; and inputting the logic table to be processed into a data warehouse model to obtain output data.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like. A computer program is stored on the storage medium and is loaded by a processor to perform the steps of any of the data processing methods based on the data warehouse model provided by the embodiments of the present application. For example, the computer program, when loaded by the processor, may perform the following steps:
obtaining a logic table to be processed; obtaining a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer; and inputting the logic table to be processed into a data warehouse model to obtain output data.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For portions that are not described in detail in one embodiment, refer to the detailed descriptions of the other embodiments above, which are not repeated here.
In a specific implementation, each unit or structure may be implemented as an independent entity, or the units or structures may be combined arbitrarily and implemented as one entity or several entities. For the implementation of each unit or structure, refer to the foregoing method embodiments, which are not repeated here.
For the specific implementation of each of the above operations, refer to the previous embodiments, which are not repeated here.
The data processing method and device based on a data warehouse model provided in the embodiments of the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of the above embodiments is intended only to aid understanding of the method and core idea of the present application. Meanwhile, those skilled in the art may, following the ideas of the present application, make changes to the specific embodiments and the scope of application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (4)

1. A data processing method based on a data warehouse model, the data processing method based on the data warehouse model comprising: obtaining a logic table to be processed; obtaining a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer; inputting the logic table to be processed into a data warehouse model to obtain output data;
the step of inputting the logic table to be processed into the data warehouse model to obtain output data comprises: inputting the logic table to be processed into the ODS original data layer; acquiring business related data from a data source by using the ODS original data layer, wherein the business related data comprises business system data, business operation log data, machine-generated operation log data, and external data obtained by web crawlers or by other means; inputting the business related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data;
the step of inputting the business related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data comprises: inputting the business related data into the DWD detail data layer; cleaning the business related data by using the DWD detail data layer to obtain cleaned business related data, wherein, in the DWD detail data layer, a corresponding SQL task is configured according to the configured table structure, and the table and a timed scheduling task that produces the table data are created automatically; inputting the cleaned business related data into the DWS summary data layer and the ADS application data layer to obtain output data;
the step of inputting the cleaned business related data into the DWS summary data layer and the ADS application data layer to obtain output data comprises: inputting the cleaned business related data into the DWS summary data layer; generating, by using the DWS summary data layer and according to the configured derivative indexes and composite indexes, a corresponding data table and scheduling task to carry the data; generating, by using the ADS application data layer and according to the multiple indexes configured as dimensions and index fields, a corresponding data table and scheduling task to carry the application-layer data, thereby obtaining output data;
the step of acquiring business related data from a data source by using the ODS original data layer comprises: configuring a Spark or DataX synchronization task in the data synchronization task of the ODS original data layer, and automatically creating a corresponding target table and a timed synchronization task; acquiring business related data from the data source based on the corresponding target table and the timed synchronization task;
the data processing method based on the data warehouse model comprises the following steps: creating an atomic index, a derivative index and a composite index;
the step of creating the atomic index, the derivative index and the composite index comprises: adding an atomic index, selecting a detail data table, configuring the calculation logic, and saving and publishing the added atomic index; adding a derivative index, selecting an atomic index, selecting a summary data table, entering the business limits, and saving and publishing the added derivative index; and adding a composite index, selecting a derivative index, writing the index calculation logic, and saving and publishing the added composite index.
2. A data processing apparatus based on a data warehouse model, the data processing apparatus based on a data warehouse model comprising: the first acquisition unit is used for acquiring a logic table to be processed;
the second acquisition unit is used for acquiring a data warehouse model, wherein the data warehouse model is divided into an ODS original data layer, a DWD detail data layer, a DWS summary data layer and an ADS application data layer;
the input unit is used for inputting the logic table to be processed into a data warehouse model to obtain output data, and specifically comprises the following steps:
inputting the logic table to be processed into an ODS original data layer;
acquiring business related data from a data source by using the ODS original data layer, wherein the business related data comprises business system data, business operation log data, machine-generated operation log data, and external data obtained by web crawlers or by other means; wherein the acquiring business related data from the data source by using the ODS original data layer comprises: configuring a Spark or DataX synchronization task in the data synchronization task of the ODS original data layer, and automatically creating a corresponding target table and a timed synchronization task; acquiring business related data from the data source based on the corresponding target table and the timed synchronization task;
inputting the business related data into the DWD detail data layer, the DWS summary data layer and the ADS application data layer to obtain output data, which specifically comprises: inputting the business related data into the DWD detail data layer; cleaning the business related data by using the DWD detail data layer to obtain cleaned business related data, wherein, in the DWD detail data layer, a corresponding SQL task is configured according to the configured table structure, and the table and a timed scheduling task that produces the table data are created automatically; inputting the cleaned business related data into the DWS summary data layer and the ADS application data layer to obtain output data;
the step of inputting the cleaned business related data into the DWS summary data layer and the ADS application data layer to obtain output data comprises: inputting the cleaned business related data into the DWS summary data layer; generating, by using the DWS summary data layer and according to the configured derivative indexes and composite indexes, a corresponding data table and scheduling task to carry the data; generating, by using the ADS application data layer and according to the multiple indexes configured as dimensions and index fields, a corresponding data table and scheduling task to carry the application-layer data, thereby obtaining output data;
the input unit is further configured to create an atomic index, a derivative index and a composite index; the creating the atomic index, the derivative index and the composite index comprises: adding an atomic index, selecting a detail data table, configuring the calculation logic, and saving and publishing the added atomic index; adding a derivative index, selecting an atomic index, selecting a summary data table, entering the business limits, and saving and publishing the added derivative index; and adding a composite index, selecting a derivative index, writing the index calculation logic, and saving and publishing the added composite index.
3. An electronic device, the electronic device comprising: one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the data warehouse model-based data processing method of claim 1.
4. A computer-readable storage medium, on which a computer program is stored, the computer program being loaded by a processor for performing the steps of the data warehouse model-based data processing method as claimed in claim 1.
CN202310545863.6A 2023-05-15 2023-05-15 Data processing method and device based on data warehouse model Active CN116401325B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310545863.6A CN116401325B (en) 2023-05-15 2023-05-15 Data processing method and device based on data warehouse model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310545863.6A CN116401325B (en) 2023-05-15 2023-05-15 Data processing method and device based on data warehouse model

Publications (2)

Publication Number Publication Date
CN116401325A CN116401325A (en) 2023-07-07
CN116401325B true CN116401325B (en) 2024-03-05

Family

ID=87020000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310545863.6A Active CN116401325B (en) 2023-05-15 2023-05-15 Data processing method and device based on data warehouse model

Country Status (1)

Country Link
CN (1) CN116401325B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008253A (en) * 2018-10-08 2020-04-14 阿里巴巴集团控股有限公司 Data model generation method, data warehouse generation device and electronic equipment
CN111460045A (en) * 2020-03-02 2020-07-28 心医国际数字医疗系统(大连)有限公司 Modeling method, model, computer device and storage medium for data warehouse construction
CN111475528A (en) * 2020-03-23 2020-07-31 深圳市酷开网络科技有限公司 OTT-based data warehouse construction method, equipment and storage medium
CN112434115A (en) * 2020-11-23 2021-03-02 京东数字科技控股股份有限公司 Data processing method and device, electronic equipment and readable storage medium
CN113742325A (en) * 2021-08-09 2021-12-03 广州市易工品科技有限公司 Data warehouse construction method, device and system, electronic equipment and storage medium
CN114218218A (en) * 2021-12-16 2022-03-22 新奥数能科技有限公司 Data processing method, device and equipment based on data warehouse and storage medium
CN115525724A (en) * 2022-09-30 2022-12-27 阿里巴巴(中国)有限公司 Modeling method and system applied to data warehouse and electronic equipment
CN116089431A (en) * 2023-01-12 2023-05-09 中移信息技术有限公司 Data processing method and device of data warehouse, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN116401325A (en) 2023-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant