CN117785984A - Data extraction method, device, electronic equipment and storage medium - Google Patents

Data extraction method, device, electronic equipment and storage medium

Info

Publication number
CN117785984A
Authority
CN
China
Prior art keywords
data
target
dimension
tables
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410217192.5A
Other languages
Chinese (zh)
Inventor
吴华夫
徐晓兰
王在祥
唐忠文
张佳锐
周学毓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Smart Software Co ltd
Original Assignee
Guangzhou Smart Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Smart Software Co ltd
Priority to CN202410217192.5A
Publication of CN117785984A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to the technical field of data processing and provides a data extraction method, a device, electronic equipment and a storage medium. The method comprises the following steps: in response to a data model creation operation on an interactive interface, displaying a data preparation interface on the interactive interface, the data preparation interface comprising a plurality of fact tables and dimension tables having association relations; in response to a trigger operation that selects a target fact table from the plurality of fact tables and a target dimension table from the plurality of dimension tables, generating a plurality of star models according to the target fact table and the target dimension table; and in response to a data extraction trigger operation on the data preparation interface, extracting corresponding service data from a service database according to each star model to generate a plurality of service data tables. Because the service data are extracted from the service database on the basis of a plurality of star models to obtain a plurality of service data tables, the storage space occupied by the data is reduced.

Description

Data extraction method, device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a data extraction method, a data extraction device, electronic equipment and a storage medium.
Background
With the advent of the data age, every industry relies more and more on data for analysis and decision-making. Data is usually stored in a service database. Because the service database often cannot respond in time to analysis operations on massive data, the service data must first be extracted from the service database into a cache library for analysis.
In the related art, several database tables having association relations are extracted from the service database, and their data are merged into one large wide table. Since every row of the wide table must contain all fields of all the merged database tables, the fields that a given row does not have must be filled with padding data. The merged wide table therefore contains a large amount of padding data and occupies a large amount of storage space.
Disclosure of Invention
The embodiments of the application provide a data extraction method, a device, electronic equipment and a storage medium that can reduce the storage space occupied by data. The technical scheme is as follows:
In a first aspect, an embodiment of the present application provides a data extraction method, including the steps of:
in response to a data model creation operation on an interactive interface, displaying a data preparation interface on the interactive interface, the data preparation interface comprising a plurality of fact tables and dimension tables having association relations;
in response to a trigger operation that selects a target fact table from the plurality of fact tables and a target dimension table from the plurality of dimension tables, generating a plurality of star models according to the target fact table and the target dimension table;
and in response to a data extraction trigger operation on the data preparation interface, extracting corresponding service data from a service database according to each star model to generate a plurality of service data tables.
In a second aspect, an embodiment of the present application provides a data extraction apparatus, including:
a data preparation interface display module, configured to display a data preparation interface on an interactive interface in response to a data model creation operation on the interactive interface, the data preparation interface comprising a plurality of fact tables and dimension tables having association relations;
a star model generation module, configured to generate a plurality of star models according to a target fact table and a target dimension table in response to a trigger operation that selects the target fact table from the plurality of fact tables and the target dimension table from the plurality of dimension tables;
and a service data table generation module, configured to extract corresponding service data from a service database according to each star model and generate a plurality of service data tables in response to a data extraction trigger operation on the data preparation interface.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor, a memory and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method of the first aspect when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method of the first aspect.
In the embodiments of the application, a data preparation interface is displayed on the interactive interface in response to a data model creation operation on the interactive interface, the data preparation interface comprising a plurality of fact tables and dimension tables having association relations; a plurality of star models are generated according to a target fact table and a target dimension table in response to a trigger operation that selects the target fact table from the plurality of fact tables and the target dimension table from the plurality of dimension tables; and, in response to a data extraction trigger operation on the data preparation interface, corresponding service data are extracted from a service database according to each star model to generate a plurality of service data tables. The service data extracted on the basis of each star model are independent of one another, so the service data tables need not be merged; compared with the prior art, this eliminates a large amount of padding data and reduces the storage space occupied by the data. In addition, because data are extracted in parallel on the basis of each star model, each service data table holds a smaller volume of service data, which further reduces the storage space.
For a better understanding and implementation, the technical solutions of the present application are described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a flow chart of a data extraction method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a data extraction device according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one piece of information of a given type from another. For example, a first message may also be referred to as a second message, and similarly, a second message may also be referred to as a first message, without departing from the scope of the present application. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination".
The data extraction method can be applied to the data extraction and data query scenarios of a business intelligence (BI) system. In the course of implementing the present invention, the inventors found the following. In the prior art, several database tables having association relations are extracted from a service database, and their data are merged into one large wide table. Since every row of the wide table must contain all fields of all the merged database tables, the fields that a given row does not have must be filled with padding data. The merged wide table therefore contains a large amount of padding data and occupies a large amount of storage space. For example, database table A includes product category, sales and purchase amount fields, and database table B includes product category, date, order amount and customer fields. The wide table merged from A and B then includes product category, sales, purchase amount, date, order amount and customer fields. A row of the wide table that comes from database table A holds one data record of A plus padding data for the date, order amount and customer fields; a row that comes from database table B holds one data record of B plus padding data for the sales and purchase amount fields.
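The padding overhead in the example of database tables A and B can be made concrete with a small sketch. The row values below are hypothetical, and `merge_into_wide_table` is an illustrative helper rather than anything named in the application; it simply stacks the rows of both tables and fills absent fields with `None`.

```python
# Hypothetical rows for database tables A and B (values are made up for illustration).
table_a = [
    {"product_category": "food", "sales": 120, "purchase_amount": 80},
    {"product_category": "toys", "sales": 60, "purchase_amount": 45},
]
table_b = [
    {"product_category": "food", "date": "2022-01-13", "order_amount": 8, "customer": "Li Si"},
]


def merge_into_wide_table(*tables):
    """Stack the rows of all tables into one wide table, padding absent fields with None."""
    all_fields = []
    for table in tables:
        for row in table:
            for field in row:
                if field not in all_fields:
                    all_fields.append(field)
    return [{field: row.get(field) for field in all_fields} for table in tables for row in table]


wide_table = merge_into_wide_table(table_a, table_b)

# Count how many cells of the wide table are padding rather than real data.
total_cells = sum(len(row) for row in wide_table)
padded_cells = sum(value is None for row in wide_table for value in row.values())
print(f"{padded_cells} of {total_cells} cells are padding")
```

With these three rows, every row from table A needs three padded fields and the row from table B needs two, which is the storage cost that keeping the tables separate avoids.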
In the present application, by contrast, the service data extracted on the basis of each star model are mutually independent, so the service data tables need not be merged. Moreover, because data are extracted in parallel on the basis of each star model, each service data table holds a smaller volume of service data, which also reduces the storage space occupied by the data.
The data extraction method provided in the embodiments of the present application may be performed by a data extraction device, where the data extraction device may be implemented by software and/or hardware, and the data extraction device may be configured by two or more physical entities or may be configured by one physical entity. The data extraction device can be any electronic device provided with data processing software, and the electronic device can be intelligent devices such as a computer, a mobile phone or a tablet.
Referring to fig. 1, fig. 1 is a flowchart of a data extraction method according to a first embodiment of the present application, where the method includes the following steps:
S10: In response to a data model creation operation on the interactive interface, display a data preparation interface on the interactive interface; the data preparation interface comprises a plurality of fact tables and dimension tables having association relations.
The interactive interface may be a main interface of an application program or a main interface of a plug-in, which is not limited herein.
The data preparation interface displays a plurality of fact tables and a plurality of dimension tables, and an association relationship is provided between one fact table and at least one dimension table. Specifically, foreign keys in one fact table are consistent with primary keys of at least one dimension table.
In the embodiment of the application, the user can perform the data model creation operation on the interactive interface. Specifically, the interactive interface displays a data model control, the user clicks the data model control, and the interactive interface jumps to the data preparation interface. The data preparation interface includes a fact table selection area for displaying the created plurality of fact tables, a dimension table selection area for displaying the created plurality of dimension tables, and a data model display area for displaying the data model to be generated. Each fact table includes a plurality of foreign keys and a plurality of metric fields, and each dimension table may include a primary key and a plurality of dimension fields.
S20: in response to a trigger operation that selects at least one target fact table from among the plurality of fact tables and at least one target dimension table from among the plurality of dimension tables, a plurality of star models are generated from the at least one target fact table and the at least one target dimension table.
In the embodiment of the application, the data model is composed of a plurality of star models. The user may select at least one target fact table in the fact table selection area and at least one target dimension table in the dimension table selection area. With each target fact table as a center, the target dimension tables that have an association relation with that target fact table are mounted on it, yielding a plurality of star models.
Specifically, the user selects one target fact table in the fact table selection area of the data preparation interface and at least one target dimension table in the dimension table selection area of the data preparation interface. With the target fact table as the center, the target dimension tables that have an association relation with it are mounted on it to obtain one star model. Another star model is generated by repeating the same operation. For example, the user selects the contract fact table and the three dimension tables of contract dimension, industry dimension and occurrence date dimension, and associates them to construct one star model. The user then selects the cost fact table together with the contract dimension and the income type dimension to construct another star model.
S30: In response to a data extraction trigger operation on the data preparation interface, extract corresponding service data from a service database according to each star model to generate a plurality of service data tables.
The service database is a database for storing service data. Specifically, a plurality of database tables are stored in the service database, and each database table records a plurality of pieces of service data.
Each star model corresponds to one service data table. Because each star model comprises only one target fact table and several target dimension tables, the service data table has relatively few fields and is a small wide table.
In the embodiment of the application, the data preparation interface displays a data extraction control. When the user clicks the data extraction control, a data query statement is generated from the fields contained in each star model, and the corresponding service data are extracted from the service database according to that statement to generate the corresponding service data table. Since each service data table has few fields, its data volume is also small.
By applying the embodiments of the application, a data preparation interface is displayed on the interactive interface in response to a data model creation operation on the interactive interface, the data preparation interface comprising a plurality of fact tables and dimension tables having association relations; a plurality of star models are generated according to a target fact table and a target dimension table in response to a trigger operation that selects the target fact table from the plurality of fact tables and the target dimension table from the plurality of dimension tables; and, in response to a data extraction trigger operation on the data preparation interface, corresponding service data are extracted from a service database according to each star model to generate a plurality of service data tables. The service data extracted on the basis of each star model are independent of one another, so the service data tables need not be merged; compared with the prior art, this eliminates a large amount of padding data and reduces the storage space occupied by the data. In addition, because data are extracted in parallel on the basis of each star model, each service data table holds a smaller volume of service data, which further reduces the storage space.
In an alternative embodiment, steps S1 to S4 are included before step S10, specifically as follows:
S1: Acquire service data associated with a preset service theme.
The service data may include raw data related to the preset service theme or service data that has already undergone data processing. The preset service theme includes, but is not limited to, financial service, retail service, telecommunication service, and the like.
In the embodiment of the application, a plurality of database tables associated with a preset service theme can be acquired from a service database, and each database table contains a plurality of service data.
S2: a number of dimension fields and a number of metric fields are determined from the business data.
In the embodiment of the application, each database table records detailed information of the service data. Taking a sales database table as an example, one sales record in the table is: date: 2022-01-13; sales director: Li Ming; sales manager: Wang Hong; salesperson: Zhang San; customer: Li Si; product category: food; product subcategory: fruit; product: banana; quantity: 5; amount: 8. Here the dimension fields are date, sales director, sales manager, salesperson, customer, product category, product subcategory and product, and the metric fields are quantity and amount.
S3: performing dimension classification and granularity classification on the plurality of dimension fields, and determining each dimension field with the minimum granularity under each dimension category; and creating a plurality of fact tables according to each dimension field with the minimum granularity and a plurality of measurement fields.
In the embodiment of the present application, since each database table includes different dimension fields and metric fields, the dimension fields may be first classified to obtain a plurality of dimension categories. And then carrying out granularity classification on the dimension fields in the same dimension category, and determining the dimension field with the minimum granularity. For example, three dimension fields of a sales director, a sales manager and a sales person can be divided into the same dimension category, and the sales person is the dimension field with the smallest granularity. The three dimension fields of the product category, the product sub-category and the product are divided into the same dimension category, and the product is the dimension field with the minimum granularity.
Each minimum-granularity dimension field and the metric fields are then assembled into a fact table. For example, one fact table includes the four dimension fields date, salesperson, customer and product, and the two metric fields quantity and amount.
S4: Create, from the dimension fields of the dimension category to which each minimum-granularity dimension field belongs, the dimension tables that have an association relation with the fact tables.
In the embodiment of the application, the dimension tables are built after the fact tables have been determined. For example, a personnel dimension table is built from the three dimension fields sales director, sales manager and salesperson, and a product dimension table is built from the three dimension fields product category, product subcategory and product.
By extracting dimension fields and metric fields from the business data, dimension classification and granularity classification of the dimension fields, fact tables and dimension tables associated with the fact tables may be automatically and quickly generated.
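Steps S2 to S4 can be sketched as follows for the sales record example. This is a minimal illustration under stated assumptions: the dimension classification is supplied by hand, each category lists its fields from coarsest to finest granularity (the application does not say how that ordering is obtained), and the snake_case names and the `build_fact_and_dimension_tables` helper are hypothetical.

```python
# Hypothetical classification of the dimension fields of the sales record.
# Assumption: within each category, fields are ordered from coarsest to finest granularity.
dimension_categories = {
    "date": ["date"],
    "personnel": ["sales_director", "sales_manager", "salesperson"],
    "customer": ["customer"],
    "product": ["product_category", "product_subcategory", "product"],
}
metric_fields = ["quantity", "amount"]


def build_fact_and_dimension_tables(categories, metrics):
    """Return (fact table schema, dimension table schemas) as plain dicts."""
    # S3: the finest-grained field of every category goes into the fact table.
    finest = {name: fields[-1] for name, fields in categories.items()}
    fact_table = {"foreign_keys": list(finest.values()), "metrics": list(metrics)}
    # S4: each category yields a dimension table keyed by its finest-grained field.
    dimension_tables = {
        name: {"primary_key": finest[name], "fields": list(fields)}
        for name, fields in categories.items()
    }
    return fact_table, dimension_tables


fact_table, dimension_tables = build_fact_and_dimension_tables(dimension_categories, metric_fields)
print(fact_table)
print(dimension_tables["personnel"])
```

For simplicity the sketch also emits single-field dimension tables (date, customer); whether such trivial tables are created in practice is not stated in the text.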
In an alternative embodiment, the step of generating a plurality of star models in step S20 according to at least one target fact table and at least one target dimension table includes:
S21: Identify the foreign keys in each target fact table and the primary key of each target dimension table.
In the embodiment of the application, the target fact table comprises at least one foreign key, and the target dimension table comprises a primary key. For example, if the target fact table includes the four dimension fields date, salesperson, customer and product and the two metric fields quantity and amount, then date, salesperson, customer and product are the foreign keys of the target fact table. If the target dimension table includes the three dimension fields sales director, sales manager and salesperson, then salesperson is the primary key of the target dimension table.
S22: If there is a target dimension table whose primary key is consistent with a foreign key of the target fact table, mount that target dimension table on the target fact table to generate a star model.
In the embodiment of the application, the primary key of each target dimension table is matched against the foreign keys of each target fact table, and the target dimension tables that match are mounted on the corresponding target fact tables to generate the star models.
By automatically identifying foreign keys in each target fact table and primary keys of each target dimension table, a star model corresponding to each target fact table may be automatically generated.
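The key matching of steps S21 and S22 might look like the sketch below. The class names, the `generate_star_models` function and the `cost_fact` table are hypothetical, and "consistent" keys are assumed to mean identical field names, which the application does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class FactTable:
    name: str
    foreign_keys: list
    metrics: list


@dataclass
class DimensionTable:
    name: str
    primary_key: str
    fields: list


@dataclass
class StarModel:
    fact: FactTable
    dimensions: list = field(default_factory=list)


def generate_star_models(fact_tables, dimension_tables):
    """Build one star model per fact table by matching foreign keys to primary keys."""
    models = []
    for fact in fact_tables:
        # S22: mount every dimension table whose primary key is a foreign key of this fact table.
        mounted = [dim for dim in dimension_tables if dim.primary_key in fact.foreign_keys]
        models.append(StarModel(fact=fact, dimensions=mounted))
    return models


sales_fact = FactTable("sales_fact", ["date", "salesperson", "customer", "product"], ["quantity", "amount"])
cost_fact = FactTable("cost_fact", ["date", "product"], ["cost"])  # hypothetical second fact table
personnel_dim = DimensionTable("personnel_dim", "salesperson", ["sales_director", "sales_manager", "salesperson"])
product_dim = DimensionTable("product_dim", "product", ["product_category", "product_subcategory", "product"])

star_models = generate_star_models([sales_fact, cost_fact], [personnel_dim, product_dim])
for model in star_models:
    print(model.fact.name, "->", [dim.name for dim in model.dimensions])
```

A dimension table such as `product_dim` can be mounted on several fact tables, so each star model stays self-contained.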
In an alternative embodiment, after the step of generating a plurality of star models according to at least one target fact table and at least one target dimension table in step S20, steps S201 to S203 are included, which specifically include the following steps:
S201: Obtain at least one first target fact table in response to a field addition or deletion operation or a field attribute modification operation on the at least one target fact table.
The field addition or deletion operation adds a field to, or deletes a field from, the target fact table; the field may be a foreign key, a dimension field or a metric field of the target fact table.
The field attribute modification operation includes modifying the data type or the data format of a field of the target fact table. For example, the data type of the amount metric field is changed from floating point to integer.
In the embodiment of the application, fields of one or more target fact tables may be added or deleted, or their attributes modified, according to service requirements; each target fact table changed in this way becomes a first target fact table. For example, the target fact table includes the four dimension fields date, salesperson, customer and product and the two metric fields quantity and amount. Deleting the salesperson field yields a first target fact table that includes the three dimension fields date, customer and product and the two metric fields quantity and amount.
S202: Determine a first target dimension table corresponding to the at least one first target fact table.
In the embodiment of the application, according to the foreign key of the first target fact table, a corresponding first target dimension table is determined. Specifically, the foreign key of the first target fact table is identified, the primary key of the target dimension table is matched with the foreign key of the first target fact table, and the target dimension table with the primary key consistent with the foreign key of the first target fact table is determined to be the first target dimension table. For example, the target dimension table corresponding to the target fact table is a sales person dimension table and a product dimension table, and the first target dimension table corresponding to the first target fact table is a product dimension table.
S203: regenerating a plurality of star models according to at least one first target fact table and a first target dimension table.
In the embodiment of the application, a first target dimension table is mounted on a first target fact table with an association relation with the first target dimension table, and a new star model is generated.
By performing field addition or deletion operations and/or field attribute modification operations on one or more target fact tables, only the corresponding star models are regenerated; the star models of target fact tables whose fields and field attributes are unchanged need not be regenerated. Since only part of the star models in the data model are regenerated according to service requirements, the generation efficiency of the data model is improved.
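The partial regeneration of steps S201 to S203 can be sketched with plain dictionaries. The table names and the helpers `mount_dimensions` and `regenerate_modified` are hypothetical, and key consistency is again assumed to mean identical field names.

```python
def mount_dimensions(fact, dimension_tables):
    """Return the names of dimension tables whose primary key is a foreign key of the fact table."""
    return [name for name, dim in dimension_tables.items() if dim["primary_key"] in fact["foreign_keys"]]


def regenerate_modified(star_models, fact_tables, dimension_tables, modified):
    """Rebuild star models only for the fact tables named in `modified`."""
    updated = dict(star_models)  # star models of untouched fact tables are carried over as-is
    for name in modified:
        updated[name] = mount_dimensions(fact_tables[name], dimension_tables)
    return updated


dimension_tables = {
    "personnel_dim": {"primary_key": "salesperson"},
    "product_dim": {"primary_key": "product"},
}
fact_tables = {
    "sales_fact": {"foreign_keys": ["date", "salesperson", "customer", "product"]},
    "cost_fact": {"foreign_keys": ["date", "product"]},
}
star_models = {name: mount_dimensions(fact, dimension_tables) for name, fact in fact_tables.items()}

# S201: delete the salesperson field from one target fact table.
fact_tables["sales_fact"]["foreign_keys"].remove("salesperson")
# S202 and S203: only the modified fact table gets a new star model.
new_star_models = regenerate_modified(star_models, fact_tables, dimension_tables, ["sales_fact"])
print(new_star_models)
```

The star model of `cost_fact` is reused unchanged, which is the efficiency gain the paragraph above describes.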
In an alternative embodiment, step S30 includes steps S31-S32, as follows:
S31: In response to the data extraction trigger operation on the data preparation interface, determine, for each star model, all foreign keys and all metric fields in the target fact table and all primary keys and all dimension fields in the target dimension tables.
In the embodiment of the application, when the data extraction trigger operation is detected, all foreign keys and all metric fields in the target fact table and all primary keys and all dimension fields in the target dimension tables of each star model are automatically identified.
S32: Extract corresponding service data from the service database according to all the foreign keys, all the metric fields, all the primary keys and all the dimension fields, and generate a plurality of service data tables.
In the embodiment of the application, a data query statement is automatically generated for each star model from all foreign keys and all metric fields in its target fact table and all primary keys and all dimension fields in its target dimension tables. The service data corresponding to the star model are then extracted from the service database according to the data query statement to obtain the service data table corresponding to that star model.
According to the fields contained in the target fact table and the target dimension table in each star model, the database tables in the service database can be extracted into a plurality of independent service data tables, and compared with the database tables, the service data tables have fewer fields and no service data redundancy.
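One plausible reading of steps S31 and S32 is that the data query statement joins the fact table to its dimension tables and the result is stored as the service data table. The sketch below follows that reading with an in-memory SQLite database standing in for the service database; the table names, column names and `build_extraction_query` are hypothetical, and foreign keys are assumed to share their name with the matching primary key.

```python
import sqlite3


def build_extraction_query(fact_name, foreign_keys, metrics, dimensions):
    """Compose a SELECT joining a fact table to its dimension tables.

    `dimensions` maps a dimension table name to (primary_key, [dimension fields]).
    Identifiers are interpolated directly, so they must come from trusted metadata.
    """
    columns = [f"{fact_name}.{col}" for col in foreign_keys + metrics]
    joins = []
    for dim_name, (primary_key, dim_fields) in dimensions.items():
        # The primary key is already selected through the fact table's foreign key.
        columns += [f"{dim_name}.{col}" for col in dim_fields if col != primary_key]
        joins.append(f"LEFT JOIN {dim_name} ON {fact_name}.{primary_key} = {dim_name}.{primary_key}")
    return f"SELECT {', '.join(columns)} FROM {fact_name} " + " ".join(joins)


conn = sqlite3.connect(":memory:")  # stand-in for the service database
conn.executescript("""
    CREATE TABLE sales_fact (product TEXT, quantity INTEGER, amount REAL);
    CREATE TABLE product_dim (product TEXT PRIMARY KEY, product_category TEXT, product_subcategory TEXT);
    INSERT INTO sales_fact VALUES ('banana', 5, 8.0);
    INSERT INTO product_dim VALUES ('banana', 'food', 'fruit');
""")

query = build_extraction_query(
    "sales_fact", ["product"], ["quantity", "amount"],
    {"product_dim": ("product", ["product", "product_category", "product_subcategory"])},
)
# Materialize the result as the service data table of this star model.
conn.execute(f"CREATE TABLE sales_service_data AS {query}")
rows = conn.execute("SELECT * FROM sales_service_data").fetchall()
print(rows)
```

Running one such query per star model, possibly in parallel, would yield one independent service data table per model.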
In an alternative embodiment, the data extraction method includes steps S40 to S80, which are specifically as follows:
S40: Store the plurality of service data tables in a cache library.
The cache library includes, but is not limited to, databases such as SmartbiMpp, Presto + Hive, Vertica and Star Ring.
In the embodiment of the application, after generating a plurality of service data tables, the plurality of service data tables are stored in a SmartbiMpp database.
S50: a data query request is received.
In this embodiment of the present application, a data query request sent by a client is received, where the data query request includes a data query parameter. Specifically, when the client needs to generate a spreadsheet or preview data, the user triggers the client to send a data query request.
S60: Determine data query parameters according to the data query request.
In the embodiment of the application, the data query request is parsed to obtain the data query parameters, which describe the data the user wants to query. For example, if the user wants to query the sales of a certain product, the data query parameters include two parameters, namely the product name and the sales.
S70: Determine target service data from the plurality of service data tables in the cache library according to the data query parameters.
In the embodiment of the application, the data query parameters are matched against the fields in the service data tables, and the service data corresponding to the matched fields are taken as the target service data.
S80: Display the target service data on a data query interface.
In the embodiment of the application, the client displays a data query interface on which the user can view the queried service data. After the target service data are obtained, they are sent to the client and displayed on the data query interface of the client.
By querying the target service data from the plurality of service data tables stored in the cache library, the data query efficiency is improved due to the small data volume of each service data table.
In an alternative embodiment, the data query parameters include a query dimension field and a query metric field, and step S70 includes steps S71 to S72, which are specifically as follows:
S71: matching, from the plurality of service data tables, a plurality of target service data tables whose fields are consistent with the query dimension field and the query metric field.
In the embodiment of the application, the query dimension field and the query metric field are matched with the fields in each service data table, and if fields consistent with the query dimension field and the query metric field exist in a service data table, the data corresponding to those fields in the service data table is used as target service data.
Considering that the query dimension field and the query metric field included in the data query parameter may appear in different service data tables, the field of each service data table is respectively matched with the query dimension field and the query metric field, so as to determine a plurality of target service data tables with the fields consistent with the query dimension field and the query metric field.
S72: combining the service data of the plurality of target service data tables to obtain the target service data.
In the embodiment of the present application, if the number of target service data tables is at least 2, the service data corresponding to the query dimension field and the query metric field is extracted from each target service data table, and the extracted service data is combined to obtain the target service data.
By performing cross-table query on a plurality of target service data tables, the target service data can be automatically and quickly obtained.
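Steps S71 and S72 can be illustrated with the sketch below. The application does not state how rows from different target tables are combined, so this example assumes that the target tables share a key column (`product_id`, a hypothetical name) with one row per key, and merges rows on that key. The function names are likewise hypothetical.

```python
def select_target_tables(tables, query_fields):
    """S71: keep the tables that contain at least one queried field."""
    wanted = set(query_fields)
    return {
        name: rows
        for name, rows in tables.items()
        if rows and wanted & set(rows[0])
    }


def merge_on_key(target_tables, query_fields, join_key):
    """S72: combine rows across the target tables by `join_key`.

    Assumes every target table carries `join_key` and has one row per key;
    only the queried fields are kept in the merged rows.
    """
    merged = {}
    for rows in target_tables.values():
        for row in rows:
            entry = merged.setdefault(row[join_key], {})
            for field in query_fields:
                if field in row:
                    entry[field] = row[field]
    return [merged[key] for key in sorted(merged)]


tables = {
    "product_info": [
        {"product_id": 1, "product_name": "Tea"},
        {"product_id": 2, "product_name": "Coffee"},
    ],
    "product_sales": [
        {"product_id": 1, "sales": 100},
        {"product_id": 2, "sales": 250},
    ],
    "warehouse": [{"warehouse_id": 9, "city": "Guangzhou"}],
}
query_fields = ["product_name", "sales"]
targets = select_target_tables(tables, query_fields)
target_data = merge_on_key(targets, query_fields, join_key="product_id")
```

Here the dimension field comes from `product_info` and the metric field from `product_sales`, while `warehouse` is skipped because it holds neither queried field.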
In an alternative embodiment, step S70 includes steps S701 to S702, which are specifically as follows:
S701: aggregating the plurality of service data tables in the cache library according to a preset aggregation rule to obtain a plurality of aggregation tables.
The preset aggregation rules can be set manually according to requirements. Specifically, the aggregation rules include, but are not limited to, grouping, averaging, maximum, minimum, summing, counting, and distinct counting.
In the embodiment of the application, each service data table may be aggregated according to different aggregation rules to obtain a plurality of aggregation tables. Specifically, according to the aggregation function corresponding to an aggregation rule, some of the fields are selected from a service data table and aggregated to obtain an aggregation result, and the aggregation result is used as a field of the aggregation table. For example, to calculate the monthly sales volume of a product, the daily date, product category, and sales quantity fields in the product sales service data table are aggregated. To calculate product sales by area, the shipping area, product category, unit price, and sales quantity fields in the product sales service data table are aggregated.
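The two aggregation examples can be sketched in plain Python as follows. The row layout, the field names (`date`, `area`, `category`, `unit_price`, `quantity`) and the sample values are assumptions for illustration; both functions apply a group-and-sum rule.

```python
from collections import defaultdict

# Hypothetical rows of a product sales service data table.
sales_rows = [
    {"date": "2024-01-05", "area": "North", "category": "Tea", "unit_price": 10.0, "quantity": 3},
    {"date": "2024-01-20", "area": "South", "category": "Tea", "unit_price": 10.0, "quantity": 2},
    {"date": "2024-02-02", "area": "North", "category": "Tea", "unit_price": 10.0, "quantity": 4},
]


def monthly_sales_volume(rows):
    """Group by (month, category) and sum the sales quantity."""
    totals = defaultdict(int)
    for row in rows:
        month = row["date"][:7]  # "YYYY-MM" prefix of the daily date
        totals[(month, row["category"])] += row["quantity"]
    return [
        {"month": month, "category": category, "sales_volume": volume}
        for (month, category), volume in sorted(totals.items())
    ]


def area_sales_amount(rows):
    """Group by (area, category) and sum unit price times quantity."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["area"], row["category"])] += row["unit_price"] * row["quantity"]
    return [
        {"area": area, "category": category, "sales_amount": amount}
        for (area, category), amount in sorted(totals.items())
    ]


monthly_table = monthly_sales_volume(sales_rows)
area_table = area_sales_amount(sales_rows)
```

Each returned list plays the role of one aggregation table, with the aggregation result stored as a field (`sales_volume` or `sales_amount`).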
S702: determining target service data from the plurality of aggregation tables in the cache library according to the data query parameters.
In the embodiment of the application, the data query parameters are matched with the fields in the aggregation table, and if the fields consistent with the data query parameters exist in the aggregation table, the data corresponding to the fields in the aggregation table are used as the target service data.
Aggregating on the basis of the extracted small wide tables to obtain the aggregation tables is equivalent to performing a secondary extraction on the service data in the service database, so the data volume can be further reduced. Obtaining the target service data from the aggregation tables according to the data query parameters reduces the amount of calculation in the data query process and improves the data query efficiency.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a data extraction device provided in the present application. The data extraction device 4 provided in the embodiment of the present application includes:
a data preparation interface display module 41 for displaying a data preparation interface on the interactive interface in response to a data model creation operation on the interactive interface; the data preparation interface comprises a plurality of fact tables with association relations and dimension tables;
a star model generation module 42, configured to generate a plurality of star models according to the target fact table and the target dimension table in response to a trigger operation of selecting the target fact table from the plurality of fact tables and selecting the target dimension table from the plurality of dimension tables;
the service data table generating module 43 is configured to respond to the data extraction triggering operation on the data preparation interface, extract corresponding service data from the service database according to each star model, and generate a plurality of service data tables.
By applying the embodiment of the application, a data preparation interface is displayed on the interactive interface in response to a data model creation operation on the interactive interface, the data preparation interface comprising a plurality of fact tables and dimension tables having association relations; in response to a trigger operation of selecting a target fact table from the plurality of fact tables and selecting a target dimension table from the plurality of dimension tables, a plurality of star models are generated according to the target fact table and the target dimension table; and in response to a data extraction trigger operation on the data preparation interface, corresponding service data is extracted from a service database according to each star model to generate a plurality of service data tables. The service data extracted based on each star model are independent of each other, so the service data tables do not need to be merged; compared with the prior art, this avoids a large amount of supplementary data and reduces the storage space occupied by the data. In addition, because data is extracted in parallel based on each star model, the service data volume of each service data table is smaller, which further reduces the storage space.
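The per-star-model extraction can be pictured with the following sketch, which uses Python's standard `sqlite3` module as a stand-in for the service database. It rests on several assumptions that are not taken from the application: a star model is described by a plain dictionary, each dimension table is joined to the fact table on a foreign key that equals the dimension table's primary key, and the table and column names are invented. Star models are processed one after another here for simplicity, rather than in parallel.

```python
import sqlite3


def build_extraction_sql(star):
    """Build a SELECT joining one fact table to its dimension tables.

    Table and column names come from trusted model metadata, not user input.
    """
    fact = star["fact"]
    columns = [f'{fact["name"]}.{m}' for m in fact["measures"]]
    joins = []
    for dim in star["dimensions"]:
        columns += [f'{dim["name"]}.{f}' for f in dim["fields"]]
        joins.append(
            f'JOIN {dim["name"]} ON {fact["name"]}.{dim["foreign_key"]} '
            f'= {dim["name"]}.{dim["primary_key"]}'
        )
    return f'SELECT {", ".join(columns)} FROM {fact["name"]} ' + " ".join(joins)


def extract_service_tables(conn, star_models):
    """Produce one independent service data table (list of rows) per star model."""
    return {
        star["name"]: conn.execute(build_extraction_sql(star)).fetchall()
        for star in star_models
    }


# Hypothetical service database with one fact table and one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales (product_id INTEGER, quantity INTEGER);
INSERT INTO dim_product VALUES (1, 'Tea'), (2, 'Coffee');
INSERT INTO fact_sales VALUES (1, 3), (2, 5), (1, 4);
""")

star_models = [{
    "name": "sales_by_product",
    "fact": {"name": "fact_sales", "measures": ["quantity"]},
    "dimensions": [{
        "name": "dim_product",
        "foreign_key": "product_id",
        "primary_key": "product_id",
        "fields": ["product_name"],
    }],
}]
service_tables = extract_service_tables(conn, star_models)
```

Because each star model yields its own result set, no merge across star models is needed, which is the property the paragraph above relies on.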
Fig. 3 is a schematic structural diagram of an electronic device provided in the present application. As shown in fig. 3, the electronic device 21 may include: a processor 210, a memory 211, and a computer program 212 stored in the memory 211 and executable on the processor 210, for example: a data extraction program; the processor 210, when executing the computer program 212, implements the steps of the embodiments described above.
The processor 210 may include one or more processing cores. The processor 210 connects the various parts of the electronic device 21 by using various interfaces and lines, and performs the various functions of the electronic device 21 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 211 and by invoking data in the memory 211. Alternatively, the processor 210 may be implemented in at least one hardware form among digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 210 may integrate one or a combination of several of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing the content to be displayed by the touch display screen; the modem is used to handle wireless communications. It will be appreciated that the modem may also not be integrated into the processor 210 and may instead be implemented by a separate chip.
The memory 211 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Optionally, the memory 211 includes a non-transitory computer-readable storage medium. The memory 211 may be used to store instructions, programs, code sets, or instruction sets. The memory 211 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as touch instructions), instructions for implementing the above-described method embodiments, and the like; the data storage area may store the data referred to in the above method embodiments. Optionally, the memory 211 may also be at least one storage device located remotely from the aforementioned processor 210.
The embodiment of the present application further provides a computer storage medium, where a plurality of instructions may be stored, where the instructions are adapted to be loaded and executed by a processor, and the specific implementation procedure may refer to the specific description of the foregoing embodiment, and details are not repeated herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may also be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium, and when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like.
The present invention is not limited to the above-described embodiments. Modifications and variations that do not depart from the spirit and scope of the present invention are intended to be covered, provided that they fall within the scope of the claims and their equivalents.

Claims (11)

1. A data extraction method, comprising the steps of:
responsive to a data model creation operation on an interactive interface, displaying a data preparation interface on the interactive interface; the data preparation interface comprises a plurality of fact tables with association relations and dimension tables;
generating a plurality of star models according to at least one target fact table and at least one target dimension table in response to a trigger operation of selecting and determining at least one target fact table from a plurality of fact tables and selecting and determining at least one target dimension table from a plurality of dimension tables;
and responding to data extraction triggering operation on the data preparation interface, and extracting corresponding service data from a service database according to each star model to generate a plurality of service data tables.
2. The data extraction method according to claim 1, wherein:
the step of displaying the data preparation interface in response to the data model creation operation on the interactive interface is preceded by the step of:
acquiring service data associated with a preset service theme;
determining a plurality of dimension fields and a plurality of measurement fields from the service data;
performing dimension classification and granularity classification on a plurality of dimension fields, and determining each dimension field with the minimum granularity under each dimension category; creating a plurality of fact tables according to each dimension field with the minimum granularity and a plurality of measurement fields;
and creating each dimension table with an association relation with a plurality of fact tables according to the dimension field of the same dimension category to which each dimension field with the minimum granularity belongs.
3. The data extraction method according to claim 1, wherein:
the step of generating a plurality of star models according to at least one target fact table and at least one target dimension table comprises the following steps:
identifying a foreign key in each target fact table and a primary key of each target dimension table;
and if a target dimension table whose primary key is consistent with the foreign key of the target fact table exists, mounting the target dimension table on the target fact table to generate a star model.
4. The data extraction method according to claim 1, wherein:
after the step of generating a plurality of star models according to at least one target fact table and at least one target dimension table, the method comprises the following steps:
obtaining at least one first target fact table in response to a field adding/deleting operation or a field attribute modifying operation on at least one of the target fact tables;
determining a first target dimension table corresponding to at least one first target fact table;
regenerating a plurality of star models according to at least one first target fact table and the first target dimension table.
5. The data extraction method according to claim 1, wherein:
the step of responding to the data extraction triggering operation on the data preparation interface, extracting corresponding service data from a service database according to each star model, and generating a plurality of service data tables comprises the following steps:
responding to data extraction triggering operation on the data preparation interface, and determining, in each star model, all foreign keys and all measurement fields in the target fact table and all primary keys and all dimension fields in the target dimension table;
and extracting corresponding service data from a service database according to the all foreign keys, the all measurement fields, the all primary keys and the all dimension fields, and generating a plurality of service data tables.
6. The data extraction method according to any one of claims 1 to 5, characterized by further comprising:
storing a plurality of said business data tables in a cache library;
receiving a data query request;
determining data query parameters according to the data query request;
determining target business data from a plurality of business data tables in the cache library according to the data query parameters;
and displaying the target business data on a data query interface.
7. The data extraction method according to claim 6, wherein:
the data query parameters comprise a query dimension field and a query metric field;
the step of determining target service data from a plurality of service data tables in the cache library according to the data query parameters comprises the following steps:
matching a plurality of target business data tables with fields consistent with the query dimension fields and the query measurement fields from a plurality of business data tables;
and combining the service data of the plurality of target service data tables to obtain target service data.
8. The data extraction method according to claim 6, wherein:
the step of determining target service data from a plurality of service data tables in the cache library according to the data query parameters comprises the following steps:
according to a preset aggregation rule, aggregating a plurality of service data tables in the cache library to obtain a plurality of aggregation tables;
and determining target service data from a plurality of aggregation tables according to the data query parameters.
9. A data extraction apparatus, comprising:
the data preparation interface display module is used for responding to the data model creation operation on the interactive interface and displaying the data preparation interface on the interactive interface; the data preparation interface comprises a plurality of fact tables with association relations and dimension tables;
the star model generation module is used for responding to the trigger operation of selecting a determined target fact table from a plurality of fact tables and selecting a determined target dimension table from a plurality of dimension tables, and generating a plurality of star models according to the target fact table and the target dimension table;
and the service data table generating module is used for responding to the data extraction triggering operation on the data preparation interface, extracting corresponding service data from the service database according to each star model and generating a plurality of service data tables.
10. An electronic device, comprising: a processor, a memory and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 8 when the computer program is executed.
11. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 8.
CN202410217192.5A 2024-02-28 2024-02-28 Data extraction method, device, electronic equipment and storage medium Pending CN117785984A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410217192.5A CN117785984A (en) 2024-02-28 2024-02-28 Data extraction method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410217192.5A CN117785984A (en) 2024-02-28 2024-02-28 Data extraction method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117785984A true CN117785984A (en) 2024-03-29

Family

ID=90383770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410217192.5A Pending CN117785984A (en) 2024-02-28 2024-02-28 Data extraction method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117785984A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991183A (en) * 2017-03-27 2017-07-28 福建数林信息科技有限公司 A kind of business intelligence ETL method for packing and system
CN113553341A (en) * 2021-07-27 2021-10-26 咪咕文化科技有限公司 Multidimensional data analysis method, multidimensional data analysis device, multidimensional data analysis equipment and computer readable storage medium
CN113704365A (en) * 2021-08-24 2021-11-26 北京明略昭辉科技有限公司 Method, system, device and storage medium for intelligently dividing data subjects
CN116089518A (en) * 2023-04-07 2023-05-09 广州思迈特软件有限公司 Data model extraction method and system, terminal and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination