CN113360494A - Wide table data generation method, wide table data updating method and related devices - Google Patents


Info

Publication number
CN113360494A
Authority
CN
China
Prior art keywords
data
wide
dimension
tables
dimension data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010148063.7A
Other languages
Chinese (zh)
Other versions
CN113360494B (en)
Inventor
吴帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202010148063.7A
Publication of CN113360494A
Application granted
Publication of CN113360494B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for generating wide table data, a method for updating wide table data, and related devices, and relates to the field of computer technology. In one embodiment, the method comprises: obtaining source tables from the data tables whose dimensions are not dynamically updated, and obtaining dimension data tables from the data tables whose dimensions are dynamically updated; generating a corresponding summary table from the data of the source tables according to a configured first correspondence between source tables and summary tables; and generating corresponding wide table data from the summary tables and the dimension data of the dimension data tables according to a configured second correspondence among the summary tables, the dimension data of the dimension data tables, and the wide tables. When the generated wide table data is updated, the data processing script does not need to be modified and the calculation does not need to cover all theme table data. This overcomes the drawbacks of heavy workload, high cost and high risk, reduces repeated operations, greatly reduces the amount of data that must be recalculated, shortens the overall calculation time, and reduces the waste of server resources.

Description

Wide table data generation method, wide table data updating method and related devices
Technical Field
The invention relates to the field of computer technology, and in particular to a method for generating wide table data, a method for updating wide table data, and related devices.
Background
Wide table data is generated along certain dimensions and archived as part of data processing on a big data platform. When certain dimension data changes, the archived historical data has to be updated. A common approach today is to modify the data processing script (i.e., the wide table data generation script), for example by changing the statistical time range or the partitions, and then rerun it to update the historical data. Rerunning data by modifying the data processing script is laborious, costly and risky; for statistical dimension data that changes frequently, the whole procedure must be repeated after every change. The historical data to be rerun may go back several years, so a rerun task takes a long time to execute, and execution restarts every time the data processing script is modified. Every rerun has to calculate over all the data of the theme tables (i.e., the data tables used to generate the wide table), and each theme table holds a huge amount of data, so frequent reruns of historical data waste server resources.
In the process of implementing the invention, the inventor found that the prior art has at least the following problem:
With existing schemes for generating and updating wide table data, when certain dimension data changes, the data processing script must be modified in order to update the generated wide table data. The task is heavy, costly and risky; the calculation has to cover all theme table data; there is too much repeated work; the amount of recalculated data is huge; the overall calculation time is long; and server resources are wasted.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method for generating wide table data, a method for updating wide table data, and related devices, so that when the generated wide table data is updated, the data processing script does not need to be modified and the calculation does not need to cover all theme table data. This overcomes the drawbacks of heavy workload, high cost and high risk, reduces repeated operations, greatly reduces the amount of data that must be recalculated, shortens the overall calculation time, and reduces the waste of server resources.
To achieve the above object, according to an aspect of an embodiment of the present invention, a method for generating wide table data is provided.
A method for generating wide table data comprises: obtaining source tables from those data tables whose dimensions are not dynamically updated, and obtaining dimension data tables from those data tables whose dimensions are dynamically updated; generating a corresponding summary table from the data of the source tables according to a configured first correspondence between source tables and summary tables; and generating corresponding wide table data from the summary tables and the dimension data of the dimension data tables according to a configured second correspondence among the summary tables, the dimension data of the dimension data tables, and the wide tables.
Optionally, the method further includes configuring the first correspondence and the second correspondence in advance, where: configuring the first correspondence includes configuring a summary table, the source tables required to generate the summary table, the fields to be extracted from each source table, and the primary key of the dimension data; and configuring the second correspondence includes configuring a wide table, the summary tables required to generate the wide table, the fields to be extracted from each summary table, and the primary key of the dimension data corresponding to the summary tables.
Optionally, the data of the source tables grows dynamically, and the summary table includes one or more partition tables; generating a corresponding summary table from the data of the source tables according to the configured first correspondence between source tables and summary tables includes: periodically extracting data from the newly added data of each source table according to the configured fields to be extracted from each source table, where in each period one partition table of the summary table is generated from the data extracted from the newly added data.
Optionally, configuring the second correspondence further includes configuring dynamic partition information for each summary table; generating corresponding wide table data from the summary tables and the dimension data of the dimension data tables according to the configured second correspondence includes: determining the partition tables to be used for each summary table according to the configured dynamic partition information; and extracting data from the partition tables to be used for each summary table according to the configured fields to be extracted from each summary table, and summarizing the data extracted from the partition tables according to the dimension data in the dimension data table to generate the corresponding wide table data.
According to another aspect of the embodiments of the present invention, a method for updating wide table data is provided.
A method for updating wide table data, where the wide table data is generated by the method for generating wide table data according to an embodiment of the present invention, includes: when the dimension data of the dimension data table is updated, updating the corresponding wide table data from the summary tables and the updated dimension data of the dimension data table, according to the configured second correspondence among the summary tables, the dimension data of the dimension data tables, and the wide tables.
Optionally, updating the corresponding wide table data from the summary tables and the updated dimension data of the dimension data table according to the configured second correspondence includes: determining the dependency relationship between the summary tables and the wide tables according to the second correspondence, where the dependency is a single dependency if the data of a wide table is generated from one summary table, and a multiple dependency if the data of a wide table is generated from a plurality of summary tables; for the summary tables on which wide tables singly depend, performing the summary operations in parallel according to the updated dimension data of the dimension data table to obtain the updated data of each such wide table; and for the summary tables on which wide tables multiply depend, grouping the summary tables corresponding to each wide table according to the minimum calculation granularity, deduplicating all the resulting groups, performing a first summarization in parallel on the summary tables of each deduplicated group according to the updated dimension data of the dimension data table, storing each first summarization result in the cache table corresponding to its group, obtaining for each wide table the cache tables of the groups in which its summary tables are located, and performing a second summarization in parallel on the cache tables corresponding to each wide table to obtain the updated data of each such wide table.
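The two-stage update described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the table names, the in-memory rows, and the helper names (`summarize`, `merge`, `update_wide_tables`) are hypothetical, and because this passage does not say how groups are derived from the minimum calculation granularity, the grouping is simply supplied as input.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical second correspondence: wide table -> groups of summary tables.
# The grouping by "minimum calculation granularity" is assumed to be given.
WIDE_TABLE_GROUPS = {
    "s1": [("H1",)],                # single dependency: one summary table
    "s2": [("H2", "H3"), ("H4",)],  # multiple dependency
    "s3": [("H2", "H3"), ("H5",)],  # shares the group (H2, H3) with s2
}

# Hypothetical summary tables: rows of (dimension primary key, measure).
SUMMARY_ROWS = {
    "H1": [("d1", 10), ("d2", 5)],
    "H2": [("d1", 1)],
    "H3": [("d2", 2)],
    "H4": [("d1", 3)],
    "H5": [("d2", 4)],
}


def summarize(group, dim):
    """First summarization: total the measures of one group by dimension label."""
    totals = {}
    for table in group:
        for key, value in SUMMARY_ROWS[table]:
            label = dim[key]
            totals[label] = totals.get(label, 0) + value
    return totals


def merge(partials):
    """Second summarization: add up the cached partial results."""
    merged = {}
    for partial in partials:
        for label, value in partial.items():
            merged[label] = merged.get(label, 0) + value
    return merged


def is_single(groups):
    """A wide table has a single dependency if it uses exactly one summary table."""
    return sum(len(g) for g in groups) == 1


def update_wide_tables(dim, workers=4):
    single = {w: g for w, g in WIDE_TABLE_GROUPS.items() if is_single(g)}
    multi = {w: g for w, g in WIDE_TABLE_GROUPS.items() if not is_single(g)}
    # Deduplicate groups so that a shared group is summarized only once.
    unique_groups = sorted({g for groups in multi.values() for g in groups})
    with ThreadPoolExecutor(max_workers=workers) as pool:
        single_jobs = {w: pool.submit(summarize, g[0], dim) for w, g in single.items()}
        cache_jobs = {g: pool.submit(summarize, g, dim) for g in unique_groups}
        # The cache dictionary stands in for the per-group cache tables.
        cache = {g: job.result() for g, job in cache_jobs.items()}
        merge_jobs = {
            w: pool.submit(merge, [cache[g] for g in groups])
            for w, groups in multi.items()
        }
        result = {w: job.result() for w, job in single_jobs.items()}
        result.update({w: job.result() for w, job in merge_jobs.items()})
    return result, unique_groups
```

In this sketch the group `("H2", "H3")` is needed by both `s2` and `s3` but is summarized once, which is the saving that deduplication and cache tables are meant to provide.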
According to still another aspect of the embodiments of the present invention, there is provided a wide table data generating apparatus.
An apparatus for generating wide table data comprises: a data table extraction module, configured to obtain source tables from those data tables whose dimensions are not dynamically updated, and to obtain dimension data tables from those data tables whose dimensions are dynamically updated; a summary table generation module, configured to generate a corresponding summary table from the data of the source tables according to a configured first correspondence between source tables and summary tables; and a wide table data generation module, configured to generate corresponding wide table data from the summary tables and the dimension data of the dimension data tables according to a configured second correspondence among the summary tables, the dimension data of the dimension data tables, and the wide tables.
Optionally, the apparatus further includes a configuration module, configured to pre-configure the first correspondence and the second correspondence, where: configuring the first correspondence includes configuring a summary table, the source tables required to generate the summary table, the fields to be extracted from each source table, and the primary key of the dimension data; and configuring the second correspondence includes configuring a wide table, the summary tables required to generate the wide table, the fields to be extracted from each summary table, and the primary key of the dimension data corresponding to the summary tables.
Optionally, the data of the source tables grows dynamically, and the summary table includes one or more partition tables; the summary table generation module is further configured to: periodically extract data from the newly added data of each source table according to the configured fields to be extracted from each source table, where in each period one partition table of the summary table is generated from the data extracted from the newly added data.
Optionally, the configuration module is further configured so that configuring the second correspondence also includes configuring dynamic partition information for each summary table; the wide table data generation module is further configured to: determine the partition tables to be used for each summary table according to the configured dynamic partition information; and extract data from the partition tables to be used for each summary table according to the configured fields to be extracted from each summary table, and summarize the data extracted from the partition tables according to the dimension data in the dimension data table to generate the corresponding wide table data.
According to another aspect of the embodiments of the present invention, there is provided an apparatus for updating wide table data.
An apparatus for updating wide table data, where the wide table data is generated by the apparatus for generating wide table data according to an embodiment of the present invention, includes a wide table data updating module configured to: when the dimension data of the dimension data table is updated, update the corresponding wide table data from the summary tables and the updated dimension data of the dimension data table, according to the configured second correspondence among the summary tables, the dimension data of the dimension data tables, and the wide tables.
Optionally, the wide table data updating module is further configured to: determine the dependency relationship between the summary tables and the wide tables according to the second correspondence, where the dependency is a single dependency if the data of a wide table is generated from one summary table, and a multiple dependency if the data of a wide table is generated from a plurality of summary tables; for the summary tables on which wide tables singly depend, perform the summary operations in parallel according to the updated dimension data of the dimension data table to obtain the updated data of each such wide table; and for the summary tables on which wide tables multiply depend, group the summary tables corresponding to each wide table according to the minimum calculation granularity, deduplicate all the resulting groups, perform a first summarization in parallel on the summary tables of each deduplicated group according to the updated dimension data of the dimension data table, store each first summarization result in the cache table corresponding to its group, obtain for each wide table the cache tables of the groups in which its summary tables are located, and perform a second summarization in parallel on the cache tables corresponding to each wide table to obtain the updated data of each such wide table.
According to yet another aspect of an embodiment of the present invention, an electronic device is provided.
An electronic device, comprising: one or more processors; a memory for storing one or more programs, which when executed by the one or more processors, cause the one or more processors to implement a method for generating wide table data or a method for updating wide table data provided by an embodiment of the present invention.
According to yet another aspect of an embodiment of the present invention, a computer-readable medium is provided.
A computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements a method for generating wide table data or a method for updating wide table data provided by an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: source tables are obtained from the data tables whose dimensions are not dynamically updated, and dimension data tables are obtained from the data tables whose dimensions are dynamically updated; a corresponding summary table is generated from the data of the source tables according to a configured first correspondence between source tables and summary tables; and corresponding wide table data is generated from the summary tables and the dimension data of the dimension data tables according to a configured second correspondence among the summary tables, the dimension data of the dimension data tables, and the wide tables. When the dimension data is updated, the corresponding wide table data is updated from the summary tables and the updated dimension data of the dimension data table. With the embodiment of the invention, updating wide table data requires neither modifying the data processing script nor calculating over all theme table data. This overcomes the drawbacks of heavy workload, high cost and high risk, reduces repeated operations, greatly reduces the amount of data that must be recalculated, shortens the overall calculation time, and reduces the waste of server resources.
Further effects of the above optional implementations will be described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting it. In the drawings:
FIG. 1 is a schematic diagram of the main steps of a method for generating wide table data according to one embodiment of the present invention;
FIG. 2 is a logical architecture diagram of wide table data generation, according to one embodiment of the present invention;
FIG. 3 is a diagram illustrating the main steps of a method for updating wide table data according to an embodiment of the present invention;
FIGS. 4(a) and 4(b) are single and multiple dependency diagrams, respectively, according to embodiments of the present invention;
FIG. 5 is a schematic diagram of a wide table data update flow according to one embodiment of the invention;
FIG. 6 is a diagram illustrating the multitasking parallel execution of wide table data updates according to one embodiment of the present invention;
FIG. 7 is a schematic diagram of the main blocks of a wide table data generation apparatus according to one embodiment of the present invention;
FIG. 8 is a schematic diagram of the main blocks of an apparatus for updating wide table data according to an embodiment of the present invention;
FIG. 9 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 10 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is a schematic diagram of the main steps of a method for generating wide table data according to an embodiment of the present invention.
As shown in FIG. 1, the method for generating wide table data according to an embodiment of the present invention mainly includes the following steps S101 to S103.
Step S101: obtain source tables from those data tables whose dimensions are not dynamically updated, and obtain dimension data tables from those data tables whose dimensions are dynamically updated.
Step S102: generate a corresponding summary table from the data of the source tables according to the configured first correspondence between source tables and summary tables.
Step S103: generate corresponding wide table data from the summary tables and the dimension data of the dimension data tables according to the configured second correspondence among the summary tables, the dimension data of the dimension data tables, and the wide tables.
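Steps S101 to S103 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patent's implementation: tables are plain in-memory dictionaries, the `dynamic_dimension` flag, the table names and the field names (`emp_id`, `pv`) are hypothetical, and summation stands in for whatever summary calculation is configured.

```python
from collections import defaultdict

# Hypothetical data tables; "dynamic_dimension" marks tables whose dimension changes.
TABLES = {
    "clicks": {
        "dynamic_dimension": False,
        "rows": [
            {"emp_id": "e1", "pv": 3},
            {"emp_id": "e2", "pv": 4},
            {"emp_id": "e1", "pv": 5},
        ],
    },
    "org": {
        "dynamic_dimension": True,
        "rows": {"e1": "dept_a", "e2": "dept_b"},  # primary key -> dimension value
    },
}


def split_tables(tables):
    """S101: separate source tables from dimension data tables."""
    sources = {n: t for n, t in tables.items() if not t["dynamic_dimension"]}
    dimensions = {n: t for n, t in tables.items() if t["dynamic_dimension"]}
    return sources, dimensions


def build_summary(source_rows, key_field, value_field):
    """S102: summarize source rows by the primary key of the dimension data."""
    totals = defaultdict(int)
    for row in source_rows:
        totals[row[key_field]] += row[value_field]
    return dict(totals)


def build_wide(summary, dimension_rows):
    """S103: combine the summary table with the current dimension data."""
    wide = defaultdict(int)
    for key, total in summary.items():
        wide[dimension_rows[key]] += total
    return dict(wide)


sources, dimensions = split_tables(TABLES)
summary = build_summary(sources["clicks"]["rows"], "emp_id", "pv")
wide = build_wide(summary, dimensions["org"]["rows"])
```

The point of the split is that `build_wide` never reads the source rows: it only needs the much smaller summary table and the dimension data table.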
In the embodiment of the present invention, a source table is a data table needed to generate a summary table. A data table may also be called a theme table, and a source table may also be called a source theme table. Taking the e-commerce industry as an example, the data in a data table includes records such as user click behavior and page PV (Page View, i.e., the number of page views) and UV (Unique Visitor, i.e., the number of unique visitors), and the daily volume of such a data table may reach the billions.
A dimension data table is obtained from a data table whose dimensions are dynamically updated, and the dimension data it holds is dynamically updated. For example, for a company's organizational structure, the corresponding dimension data is updated as departments and personnel change.
The summary table is a table obtained by summarizing and calculating data of the source table.
The embodiment of the present invention further includes pre-configuring the first correspondence and the second correspondence, where:
Configuring the first correspondence includes configuring a summary table, the source tables required to generate the summary table, the fields to be extracted from each source table, and the primary key of the dimension data. Preferably, the configuration also includes the association fields between the source tables and the dimension data table in which the primary key of the dimension data is located.
Configuring the second correspondence includes configuring a wide table, the summary tables required to generate the wide table, the fields to be extracted from each summary table, and the primary key of the dimension data corresponding to the summary tables. The primary key of the dimension data also serves as the association field between the summary tables. Preferably, the configuration also includes the number of tasks executed in parallel, which determines the concurrency of the tasks that calculate the wide table data.
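As a rough illustration of what the two correspondences might hold, the following Python sketch models them as data classes and adds a small consistency check. The class names, field names and the `validate` helper are hypothetical and are not taken from the patent; the patent itself stores this information in a configuration file table.

```python
from dataclasses import dataclass


@dataclass
class SummaryConfig:
    """First correspondence: how one summary table is built from source tables."""
    summary_table: str
    source_fields: dict          # source table -> fields to extract
    dimension_key: str           # primary key of the dimension data
    join_field: str = ""         # optional association field between source tables
    dimension_table: str = ""    # optional table holding the dimension primary key


@dataclass
class WideConfig:
    """Second correspondence: how one wide table is built from summary tables."""
    wide_table: str
    summary_fields: dict         # summary table -> fields to extract
    dimension_key: str           # also the association field between summary tables
    parallel_tasks: int = 1      # optional concurrency of wide table calculation


def validate(summary_configs, wide_config):
    """Return a list of problems found when cross-checking the two correspondences."""
    problems = []
    known = {c.summary_table: c for c in summary_configs}
    for name in wide_config.summary_fields:
        cfg = known.get(name)
        if cfg is None:
            problems.append(f"{name}: no first correspondence configured")
        elif cfg.dimension_key != wide_config.dimension_key:
            problems.append(f"{name}: dimension primary key mismatch")
    if wide_config.parallel_tasks < 1:
        problems.append("parallel_tasks must be at least 1")
    return problems


h1 = SummaryConfig("H1", {"theme_a": ["emp_id", "pv"]}, "emp_id", dimension_table="w1")
s1 = WideConfig("s1", {"H1": ["pv"]}, "emp_id", parallel_tasks=2)
```

Checking that every summary table named in the second correspondence shares the same dimension primary key reflects the statement above that this key is also the association field between the summary tables.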
In one embodiment, the data of the source table grows dynamically. Dynamic growth means that the dimension stays unchanged while the data keeps increasing over time. For example, for the sales data of a certain commodity, each period (for example, each month) adds a new batch of sales data.
The summary table may include one or more partition tables.
Generating a corresponding summary table from the data of the source tables according to the configured first correspondence between source tables and summary tables may specifically include: periodically extracting data from the newly added data of each source table according to the configured fields to be extracted from each source table, where in each period one partition table of the summary table is generated from the data extracted from the newly added data.
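The periodic, incremental generation of partition tables can be sketched as follows. The `SummaryTable` class, its method name and the sample fields are hypothetical, and the sketch only performs field extraction; any aggregation applied while building a partition is left out for brevity.

```python
class SummaryTable:
    """A summary table that gains one partition table per period (illustrative only)."""

    def __init__(self, fields_by_source):
        # Configured fields to extract, keyed by source table name.
        self.fields_by_source = fields_by_source
        # Period label -> rows of the partition table generated in that period.
        self.partitions = {}

    def run_period(self, period, new_rows_by_source):
        """Extract the configured fields from this period's newly added rows only."""
        rows = []
        for source, fields in self.fields_by_source.items():
            for row in new_rows_by_source.get(source, []):
                rows.append({f: row[f] for f in fields})
        self.partitions[period] = rows
        return rows


table = SummaryTable({"sales": ["sku", "amount"]})
table.run_period("2020-01", {"sales": [{"sku": "a", "amount": 3, "note": "x"}]})
table.run_period("2020-02", {"sales": [{"sku": "a", "amount": 5, "note": "y"}]})
```

Each call touches only the rows added in that period, so earlier partitions are never rebuilt when new source data arrives.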
In one embodiment, configuring the second correspondence further includes configuring dynamic partition information for each summary table. The dynamic partition information of a summary table indicates the latest N partition tables of the summary table that need to be used. For example, a configuration of $4 means that the data of the latest 4 partition tables is summarized to generate the corresponding wide table data.
Generating corresponding wide table data from the summary tables and the dimension data of the dimension data tables according to the configured second correspondence may specifically include: determining the partition tables to be used for each summary table according to the configured dynamic partition information; and extracting data from the partition tables to be used for each summary table according to the configured fields to be extracted from each summary table, and summarizing the data extracted from the partition tables according to the dimension data in the dimension data table to generate the corresponding wide table data.
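A minimal sketch of how a dynamic partition setting such as `$4` might be interpreted is given below. The function names are hypothetical, partition names are assumed to sort chronologically (for example `2020-01`), and summation stands in for the configured summary calculation.

```python
import re


def parse_dynamic_partition(spec):
    """Read N from a dynamic partition setting of the form '$N'."""
    match = re.fullmatch(r"\$(\d+)", spec)
    if match is None or int(match.group(1)) < 1:
        raise ValueError(f"invalid dynamic partition setting: {spec!r}")
    return int(match.group(1))


def latest_partitions(partitions, spec):
    """Names of the latest N partition tables (names assumed to sort by time)."""
    n = parse_dynamic_partition(spec)
    return sorted(partitions)[-n:]


def build_wide(partitions, spec, fields, dimension_rows, key_field):
    """Summarize the selected partitions by the current dimension data."""
    wide = {}
    for name in latest_partitions(partitions, spec):
        for row in partitions[name]:
            label = dimension_rows[row[key_field]]
            totals = wide.setdefault(label, {f: 0 for f in fields})
            for f in fields:
                totals[f] += row[f]
    return wide


# Five monthly partition tables of one hypothetical summary table.
PARTITIONS = {f"2020-0{i}": [{"emp_id": "e1", "pv": i}] for i in range(1, 6)}
DIMENSION = {"e1": "dept_a"}
wide = build_wide(PARTITIONS, "$4", ["pv"], DIMENSION, "emp_id")
```

With `$4`, the oldest partition (`2020-01`) is skipped and only the four most recent partition tables contribute to the wide table data.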
With the method for generating wide table data provided by the embodiment of the present invention, subsequent updates of the wide table data require neither modifying the data processing script nor calculating over all theme table data. This overcomes the drawbacks of heavy workload, high cost and high risk, reduces repeated operations, greatly reduces the amount of data that must be recalculated, shortens the overall calculation time, and reduces the waste of server resources.
The method for generating wide table data according to the embodiment of the present invention is described in further detail below.
The amount of data in a data warehouse is typically very large and may be updated or added to periodically, for example once a day. According to business requirements, theme table data needs to be processed and summarized into wide tables. Taking the e-commerce industry as an example, theme table data such as user click behavior records and page PV and UV records may amount to billions of records per day.
For some theme tables the dimensions are not dynamically updated and only their data is periodically added to, while for others the dimensions are dynamically updated, such as a company's organizational structure or the classification of goods. When the dimensions of the theme tables used to generate the wide table data do not change, the corresponding wide table also periodically gains a batch of data as the data of some theme tables periodically grows, and these batches are generally distinguished by partition.
In the prior art, wide table data is generated by performing summary calculation on every theme table used to generate it. For example, when wide table 1 needs to be generated from theme tables A, B, C and D, summary calculation is performed on theme tables A to D to obtain wide table 1. Suppose theme table D has a dynamically updated dimension. If theme table D is updated, wide tables calculated afterwards are unaffected, but the wide tables that have already been processed and archived need to be updated. Under the prior art scheme, they must be completely recalculated following the original processing logic, which consumes a great deal of labor and time.
An embodiment of the present invention provides a logical architecture for generating wide table data; FIG. 2 is a schematic diagram of this architecture. As shown in FIG. 2, the data of the changeable (dynamically updated) calculation dimensions used in generating wide table data is handled through configuration management. During calculation, dimension data tables are generated from the theme tables that carry the changeable dimensions (theme tables D and Z). Summary tables are calculated from the data whose dimensions do not change; for example, summary table H1 is generated from theme tables A, B and C, and summary table H2 is generated from theme tables X and Y. By extracting the theme tables with dynamically changing dimensions through configuration management, the generation of wide table data is optimized: an intermediate layer of dimension data tables and summary tables is produced, and the wide table data is calculated from the summary tables and dimension data tables in this intermediate layer. For example, in FIG. 2, wide table s1 is generated from summary table H1 (H1 may be one or more summary tables) and dimension data table w1, and wide tables s2 and s3 are generated from summary table H2 (H2 may be one or more summary tables) and dimension data table w2. With this logical architecture, when wide table data is later updated, recalculation is needed only in the intermediate layer, and the theme table data does not have to be reprocessed.
Each summary table is bound to its dimension data table by a primary key; for example, the primary key information of the summary table may include the primary key information of the corresponding dimension data table. The dimension data table keeps only the latest version of the data. If the historical data of a wide table (wide table data generated in the past) needs to be updated later, only the dimension data table and the historical summary table (the summary table generated in the past) need to be combined and summarized, so the amount of data to be calculated is reduced by several orders of magnitude compared with the prior art.
In the embodiment of the invention, the correspondence between the source tables and the summary table is configured through a configuration file table. A source table (the source theme table in Table 1) is a theme table that is needed to generate the summary table. Configuring this correspondence specifically includes configuring the summary table, each source table required to generate it, the fields to be extracted from each source table, the associated fields among the source tables, the dimension data table, and the primary key of the dimension data. An example of the configuration information in the configuration file table is shown in Table 1, where the target table refers to the summary table to be generated.
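The primary-key binding described above can be sketched as follows. This is only an illustrative sketch, assuming in-memory dictionaries stand in for the tables; the names `summary_rows`, `dimension_table`, `build_wide_table` and all field names are hypothetical and do not come from the original. It shows a wide table being recomputed from a historical summary table and the latest dimension data, without going back to the theme tables.

```python
from collections import defaultdict

def build_wide_table(summary_rows, dimension_table):
    """Aggregate summary rows by the dimension value found via the shared primary key."""
    wide = defaultdict(int)
    for row in summary_rows:
        # Each summary row carries the primary key of its dimension record.
        category = dimension_table[row["dim_key"]]["category"]
        wide[category] += row["amount"]
    return dict(wide)

# Historical summary table (hypothetical data); it is not recomputed from theme tables.
summary_rows = [
    {"dim_key": "sku1", "amount": 10},
    {"dim_key": "sku2", "amount": 5},
    {"dim_key": "sku3", "amount": 7},
]

# Dimension data table holding only the latest version of the dimension data.
dimension_table = {
    "sku1": {"category": "books"},
    "sku2": {"category": "books"},
    "sku3": {"category": "toys"},
}

wide_before = build_wide_table(summary_rows, dimension_table)

# The dimension changes: sku2 is reassigned to another category.
dimension_table["sku2"] = {"category": "toys"}

# Only the summary table and the updated dimension table are needed to refresh the wide table.
wide_after = build_wide_table(summary_rows, dimension_table)
```

In this sketch the refresh cost depends only on the size of the summary table, which is the property the embodiment relies on.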
TABLE 1
Figure BDA0002401460750000101
All the theme tables, the required fields in each theme table, the fields that associate the theme tables, the name of the generated target table, the dimension data table to be processed, and the primary key of the dimension data table are configured through the configuration file table. These are then combined with a preset table creation statement template: only the personalized parts (the statements that are not already in the template) are filled into the template, and the corresponding summary table is generated through periodic calculation.
Because each summary table and dimension data table is prepared periodically by the above method, the wide table is also calculated periodically, based on the configuration information and on monitoring whether the summary tables have been completed.
The correspondence among the summary tables, the dimension data table, and the wide table is also configured in the configuration file table. Specifically, this includes configuring the wide table, the summary tables required to generate the wide table, the fields to be extracted from each summary table, and the associated fields among the summary tables, where the associated fields are the primary keys of the dimension data corresponding to each summary table. The configuration file table also holds the periodic update information, the dynamic partition information, and the number of parallel execution tasks; the number of parallel execution tasks determines how many wide table calculation tasks run concurrently.
As shown in Table 2, the source summary table in Table 2 is a configured summary table, and the target table is a configured wide table. When wide table data is calculated, the partition tables of a summary table for multiple periods may need to be used. The dynamic partition information field in the configuration file table is therefore expressed with a wildcard that determines how many partition tables take part in the calculation; for example, "$4" indicates that the calculation task uses the latest 4 partition tables. Finally, the personalized parts are computed and filled into the template, using the table creation statement template together with a pre-written data query module.
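The template-filling step above might look like the following sketch. Table 1 is only available as an image, so the configuration keys (`target_table`, `source_tables`, `fields`, `join_field`), the template text, and the function name `render_summary_statement` are all assumptions made for illustration, and the dimension data table and its primary key are omitted for brevity.

```python
from string import Template

# Hypothetical preset table creation statement template; only the
# placeholders are filled from the configuration file table.
CREATE_TEMPLATE = Template(
    "CREATE TABLE $target AS SELECT $fields FROM $first_source $joins"
)

def render_summary_statement(config):
    """Fill the personalized parts of the template from one configuration record."""
    sources = config["source_tables"]
    first = sources[0]
    # Qualify every extracted field with the source table it comes from.
    fields = ", ".join(
        f"{table}.{field}" for table in sources for field in config["fields"][table]
    )
    # Join every further source table to the first one on the associated field.
    joins = " ".join(
        f"JOIN {table} ON {first}.{config['join_field']} = {table}.{config['join_field']}"
        for table in sources[1:]
    )
    return CREATE_TEMPLATE.substitute(
        target=config["target_table"], fields=fields, first_source=first, joins=joins
    ).strip()

# Hypothetical configuration record for one summary table.
config = {
    "target_table": "summary_h1",
    "source_tables": ["theme_a", "theme_b"],
    "fields": {"theme_a": ["order_id", "amount"], "theme_b": ["sku_id"]},
    "join_field": "order_id",
}
statement = render_summary_statement(config)
```

Keeping the fixed parts in the template means that adding a new summary table only requires a new configuration record, not a new processing script.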
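As an illustration of the dynamic partition wildcard, the sketch below resolves a value such as "$4" into the latest four partition tables. The function name `resolve_partitions` and the assumption that partition names sort chronologically (for example ISO dates) are hypothetical and not stated in the original.

```python
import re

def resolve_partitions(dynamic_partition, available_partitions):
    """Return the latest N partitions for a wildcard such as '$4'."""
    match = re.fullmatch(r"\$(\d+)", dynamic_partition)
    if match is None:
        raise ValueError(f"unsupported dynamic partition value: {dynamic_partition!r}")
    count = int(match.group(1))
    # Partition names are assumed to sort chronologically (e.g. ISO dates).
    return sorted(available_partitions)[-count:] if count > 0 else []

partitions = ["2020-03-01", "2020-03-02", "2020-03-03", "2020-03-04", "2020-03-05"]
latest = resolve_partitions("$4", partitions)
```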
TABLE 2
Figure BDA0002401460750000111
FIG. 3 is a diagram illustrating the main steps of a method for updating wide table data according to an embodiment of the present invention.
As shown in fig. 3, the method for updating the wide table data according to an embodiment of the present invention includes steps S301 to S304. Steps S301 to S303 are the same as steps S101 to S103, and are not described again.
Step S304: when the dimension data of the dimension data table is updated, updating the corresponding wide table data based on the summary table and the updated dimension data of the dimension data table, according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
In an embodiment, updating the corresponding wide table data according to the configured second correspondence and based on the summary table and the updated dimension data of the dimension data table may specifically include: determining a dependency relationship between the summary tables and the wide tables according to the second correspondence, where the dependency relationship is a single dependency if a wide table is generated from one summary table and a multiple dependency if a wide table is generated from a plurality of summary tables; for the wide tables with a single dependency, performing the summary operations on their summary tables in parallel according to the updated dimension data of the dimension data table to obtain the updated data of each wide table; and for the wide tables with a multiple dependency, grouping the summary tables corresponding to each wide table according to the minimum calculation granularity, deduplicating all the resulting groups, performing a first summarization on the deduplicated groups in parallel according to the updated dimension data of the dimension data table, storing each first summarization result in the cache table corresponding to its group, obtaining, for each wide table, the cache tables of the groups in which its summary tables are located, and performing a second summarization on the cache tables of each wide table in parallel to obtain the updated data of each wide table.
The summary operation of two summary tables may be regarded as the minimum calculation granularity. For example, if summary table 1 and summary table 2 are summarized together to generate certain wide table data (in this embodiment the wide table data is not generated directly; a cache table is generated first), then summary table 1 and summary table 2 may be regarded as one group.
In one embodiment, each summary table includes a plurality of partition tables, and each group includes at least two partition tables. For example, if the configured dynamic partition information indicates that the latest 2 partition tables are used, then summarizing summary table 1 and summary table 2 means taking the latest two partition tables of each, placing these 4 partition tables in one group, performing the summary operation on them, and generating the cache table corresponding to the group.
With the method for updating wide table data according to the embodiment of the invention, the data processing script does not need to be modified and the calculation does not need to run over all the theme table data. This overcomes the drawbacks of heavy workload, high cost, and high risk, reduces repeated operations, greatly reduces the amount of data that has to be recalculated, shortens the overall calculation time, and reduces the waste of server resources.
The method for updating wide table data according to the embodiment of the present invention is described in further detail below.
When the dimension data of a configured dimension data table changes, the historical data of the wide table needs to be backfilled (that is, updated). A rerun data script may be generated automatically to update the wide table data that has already been generated. In the embodiment of the invention, the update only involves the dimension data table and the summary tables of the wide table; the theme tables do not need to be recalculated, so repeated operations and the amount of data to be recalculated are greatly reduced.
If the backfill reaches far back in time, then, to improve calculation efficiency and avoid an excessively long execution time, the embodiment of the invention reads the configuration information in the configuration file table when updating the historical data of the wide tables and determines the dependency relationship between the summary tables and the wide tables, which is either a single dependency or a multiple dependency. A single dependency means that one wide table depends on one summary table, that is, the wide table data is generated by summarizing one summary table according to the dimension data of the dimension data table; for example, as shown in fig. 4(a), wide table s1 is obtained by summarizing one summary table w1 according to the dimension data w in the dimension data table. A multiple dependency means that one wide table depends on a plurality of summary tables, that is, the wide table data is generated by summarizing the plurality of summary tables according to the dimension data of the dimension data table; for example, as shown in fig. 4(b), wide table s2 and wide table s3 are obtained by summarizing a plurality of summary tables w2 according to the dimension data w in the dimension data table.
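The single versus multiple distinction can be derived directly from the configured correspondence, as in the following sketch. The dictionary layout and the names `second_correspondence` and `classify_dependencies` are assumptions made for illustration only.

```python
def classify_dependencies(second_correspondence):
    """Label each wide table 'single' or 'multiple' by how many summary tables it uses."""
    return {
        wide: "single" if len(summaries) == 1 else "multiple"
        for wide, summaries in second_correspondence.items()
    }

# Hypothetical configuration: wide table name -> summary tables it is generated from.
second_correspondence = {
    "wide_s1": ["summary_1"],
    "wide_s2": ["summary_2", "summary_3"],
    "wide_s3": ["summary_2", "summary_3", "summary_4", "summary_5"],
}
dependencies = classify_dependencies(second_correspondence)
```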
For a single dependency, the summary tables and the wide tables of each dimension are in one-to-one correspondence, and this does not change as the number of summary tables grows, so in each period the calculation tasks are processed simultaneously and the data of each wide table is obtained through parallel calculation. For a multiple dependency, one wide table depends on a plurality of summary tables, and each summary table may be used by several wide tables. To avoid calculating the same data repeatedly, embodiments of the present invention may adopt an approach similar to the merge sort algorithm: the summary tables corresponding to each wide table are grouped according to the minimum calculation granularity; all the resulting groups are deduplicated; the summary tables of each deduplicated group are summarized according to the latest dimension data; each summarized result is stored in the cache table corresponding to its group; for each wide table, the cache tables of the groups in which its summary tables are located are obtained; and the cache tables of each wide table are summarized in parallel to obtain the updated data of each wide table.
According to the embodiment of the invention, the rerun data script is generated automatically according to the task type (single dependency or multiple dependency), and multiple tasks are executed concurrently, which optimizes the backfill of the historical wide table data and shortens the task execution time.
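The grouping, deduplication, and two-stage summarization for multiple dependencies can be sketched as below. This is a simplified, sequential illustration under several assumptions: each summary table is reduced to a single number, the updated dimension data is reduced to a single multiplier called `weight`, groups are formed from consecutive pairs, and all names are hypothetical. The parallel execution described in the text is left out.

```python
def group_pairs(summaries):
    """Split a wide table's summary tables into groups of two (the minimum granularity)."""
    return [tuple(summaries[i:i + 2]) for i in range(0, len(summaries), 2)]

def update_wide_tables(correspondence, summary_data, weight):
    """Two-stage summarization: one cache table per distinct group, then one pass per wide table."""
    groups_by_wide = {w: group_pairs(s) for w, s in correspondence.items()}
    # Deduplicate groups shared by several wide tables so each is summarized only once.
    distinct_groups = {g for groups in groups_by_wide.values() for g in groups}
    # First summarization: apply the latest dimension data (here a simple multiplier).
    cache = {g: sum(summary_data[name] for name in g) * weight for g in distinct_groups}
    # Second summarization: combine the cache tables each wide table depends on.
    wide = {w: sum(cache[g] for g in groups) for w, groups in groups_by_wide.items()}
    return wide, cache

# Hypothetical configuration: both wide tables share the group (s3, s4).
correspondence = {
    "wide_1": ["s1", "s2", "s3", "s4"],
    "wide_2": ["s3", "s4", "s5", "s6"],
}
summary_data = {"s1": 1, "s2": 2, "s3": 3, "s4": 4, "s5": 5, "s6": 6}
wide, cache = update_wide_tables(correspondence, summary_data, weight=2)
```

Here four group references collapse into three cache tables, so the shared group is summarized once and reused by both wide tables.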
The updating process of the wide table data according to one embodiment of the present invention is shown in fig. 5. When wide table data is updated, the configuration file information (the configuration information in the configuration file table) is read first, and the dependency relationship between the summary tables and the wide tables is determined from it. If the dependency relationship is a single dependency, the number of backfill tasks (the number of parallel execution tasks) is specified and the backfill is executed as multiple tasks in parallel. If the dependency relationship is a multiple dependency, the number of backfill tasks is specified, each cache table is generated, and the backfill is then executed as multiple tasks in parallel. Each task computes the data of one wide table.
FIG. 6 is a diagram illustrating the multitasking parallel execution of wide table data updates according to one embodiment of the present invention. As shown in FIG. 6, wide table 1 is generated from summary tables 1 to 4, wide table 2 from summary tables 2 to 5, wide table 3 from summary tables 3 to 6, and so on; only some of the summary tables and wide tables are shown in FIG. 6. If each wide table were generated directly from its summary tables, the amount of calculation would be N × M (where N is the total number of summary tables and M is the number of partitions required by each wide table). Instead, the summary tables corresponding to each wide table are grouped according to the minimum calculation granularity, that is, two summary tables per group. According to FIG. 6, summary tables 1 and 2 are summarized to obtain cache table 1, summary tables 3 and 4 to obtain cache table 2, summary tables 5 and 6 to obtain cache table 3, and so on, until every cache table has been generated and the cache table calculation is complete; only cache tables 1 to 7 are shown in FIG. 6. Cache table 1 and cache table 2 are then calculated to generate wide table 1; cache table 5 (obtained by summarizing summary tables 2 and 3) and cache table 6 (obtained by summarizing summary tables 4 and 5) are calculated to generate wide table 2; the other wide tables are calculated in the same way until all wide tables have been processed. During the generation of the cache tables and wide tables, the calculation tasks can be executed in parallel; by trading space for time, repeated calculation is reduced and the overall calculation time is shortened.
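The multi-task parallel backfill can be sketched with a bounded worker pool, where the configured number of parallel execution tasks caps the concurrency. The function `backfill_wide_table` is a hypothetical placeholder for the real recomputation, and the use of threads is an assumption; the original does not say how the tasks are scheduled.

```python
from concurrent.futures import ThreadPoolExecutor

def backfill_wide_table(name, summaries, dimension_version):
    """Placeholder for the real recomputation of one wide table."""
    return f"{name} rebuilt from {len(summaries)} summary table(s) with dimension {dimension_version}"

def run_backfill(correspondence, dimension_version, parallel_tasks):
    """Run one backfill task per wide table, at most `parallel_tasks` at a time."""
    with ThreadPoolExecutor(max_workers=parallel_tasks) as pool:
        futures = {
            name: pool.submit(backfill_wide_table, name, summaries, dimension_version)
            for name, summaries in correspondence.items()
        }
        # Collect results; result() re-raises any exception from a failed task.
        return {name: future.result() for name, future in futures.items()}

results = run_backfill(
    {"wide_s1": ["summary_1"], "wide_s2": ["summary_2", "summary_3"]},
    dimension_version="v2",
    parallel_tasks=2,
)
```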
Fig. 7 is a schematic diagram of main blocks of a wide table data generation apparatus according to an embodiment of the present invention.
The apparatus 700 for generating wide table data according to an embodiment of the present invention mainly includes: a data table extraction module 701, a summary table generation module 702, and a wide table data generation module 703.
The data table extraction module 701 is configured to obtain a source table according to the data table with the non-dynamic updated dimensionality in each data table, and obtain a dimensionality data table according to the data table with the dynamic updated dimensionality in each data table.
The summary table generating module 702 is configured to generate a corresponding summary table according to the first corresponding relationship between the configured source table and the summary table and according to the data of the source table.
The wide table data generating module 703 is configured to generate corresponding wide table data based on the summary table and the dimension data of the dimension data table, according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
The apparatus 700 for generating wide table data may further include a configuration module configured to pre-configure the first correspondence and the second correspondence, wherein: configuring the first correspondence includes configuring the summary table, each source table required to generate the summary table, the fields to be extracted from each source table, and the primary key of the dimension data. Configuring the second correspondence includes configuring the wide table, the summary tables required to generate the wide table, the fields to be extracted from each summary table, and the primary key of the dimension data corresponding to each summary table.
In one embodiment, the data of the source table may be dynamically populated and the summary table includes one or more partition tables.
The summary table generation module 702 may be specifically configured to: and periodically extracting data from the newly added data of each source table according to the fields to be extracted of each configured source table, wherein each period generates a partition table of the summary table according to the data extracted from the newly added data.
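The periodic behaviour of the summary table generation module can be illustrated as follows: each period keeps only the configured fields of the newly added source rows and stores them as one new partition table. The data layout and the names `build_partition` and `run_period` are hypothetical, and a single source table is assumed for brevity.

```python
def build_partition(new_rows, fields):
    """Keep only the configured fields of the rows added in this period."""
    return [{field: row[field] for field in fields} for row in new_rows]

def run_period(summary_table, period, new_rows, fields):
    """Add one partition table, keyed by the period, to the summary table."""
    summary_table[period] = build_partition(new_rows, fields)
    return summary_table

summary_table = {}  # partition name -> partition table
fields = ["order_id", "amount"]  # hypothetical configured fields to extract
run_period(summary_table, "2020-03-01", [{"order_id": 1, "amount": 10, "note": "x"}], fields)
run_period(summary_table, "2020-03-02", [{"order_id": 2, "amount": 5, "note": "y"}], fields)
```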
The configuration module may be further configured such that configuring the second correspondence also includes configuring the dynamic partition information of each summary table.
The wide table data generation module 703 may be specifically configured to: determining partition tables required to be used by each summary table according to the configured dynamic partition information; and according to the configured fields to be extracted from each summary table, extracting data from the partition tables to be used by each summary table, and summarizing the data extracted from each partition table according to the dimension data in the dimension data table to generate corresponding wide table data.
Fig. 8 is a schematic diagram of main blocks of an apparatus for updating wide table data according to an embodiment of the present invention.
The apparatus 800 for updating wide table data according to an embodiment of the present invention mainly includes a data table extraction module 801, a summary table generation module 802, a wide table data generation module 803, and a wide table data updating module 804.
The data table extracting module 801, the summary table generating module 802, and the wide table data generating module 803 have the same corresponding functions as the data table extracting module 701, the summary table generating module 702, and the wide table data generating module 703, respectively, and are not described herein again.
The wide table data update module 804 is configured to: when the dimension data of the dimension data table is updated, update the corresponding wide table data based on the summary table and the updated dimension data of the dimension data table, according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
The wide table data update module 804 may be specifically configured to: determine a dependency relationship between the summary tables and the wide tables according to the second correspondence, where the dependency relationship is a single dependency if a wide table is generated from one summary table and a multiple dependency if a wide table is generated from a plurality of summary tables;
for the wide tables with a single dependency, perform the summary operations on their summary tables in parallel according to the updated dimension data of the dimension data table to obtain the updated data of each wide table;
for the wide tables with a multiple dependency, group the summary tables corresponding to each wide table according to the minimum calculation granularity, deduplicate all the resulting groups, perform a first summarization on the deduplicated groups in parallel according to the updated dimension data of the dimension data table, store each first summarization result in the cache table corresponding to its group, obtain, for each wide table, the cache tables of the groups in which its summary tables are located, and perform a second summarization on the cache tables of each wide table in parallel to obtain the updated data of each wide table.
In addition, in the embodiment of the present invention, the detailed implementation contents of the wide table data generation device and the wide table data updating device are already described in detail in the above-described wide table data generation method and wide table data updating method, respectively, and therefore, the repeated contents are not described again here.
Fig. 9 shows an exemplary system architecture 900 to which a method for generating and a method for updating wide table data or a device for generating and a device for updating wide table data according to an embodiment of the present invention can be applied.
As shown in fig. 9, the system architecture 900 may include terminal devices 901, 902, 903, a network 904, and a server 905. The network 904 is the medium used to provide communication links between the terminal devices 901, 902, 903 and the server 905. The network 904 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 901, 902, 903 to interact with the server 905 over the network 904 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 901, 902, 903, such as (by way of example only) shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software.
The terminal devices 901, 902, 903 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 905 may be a server providing various services, such as (by way of example only) a background management server that supports shopping websites browsed by users using the terminal devices 901, 902, 903. The background management server may analyze and otherwise process received data such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.
It should be noted that the method for generating the wide table data and the method for updating the wide table data provided by the embodiment of the present invention are generally executed by the server 905, and accordingly, the device for generating the wide table data and the device for updating the wide table data are generally disposed in the server 905.
It should be understood that the number of terminal devices, networks, and servers in fig. 9 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 10, a block diagram of a computer system 1000 suitable for implementing a terminal device or server of an embodiment of the present application is shown. The terminal device or the server shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 10, the computer system 1000 includes a Central Processing Unit (CPU) 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data necessary for the operation of the system 1000 are also stored. The CPU 1001, ROM 1002, and RAM 1003 are connected to each other via a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.
The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1010 as necessary, so that a computer program read out therefrom is installed into the storage section 1008 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication part 1009 and/or installed from the removable medium 1011. The computer program executes the above-described functions defined in the system of the present application when executed by the Central Processing Unit (CPU) 1001.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprises a data table extraction module, a summary table generation module and a wide table data generation module. For example, the data table extraction module may also be described as a "module for obtaining a source table according to a data table with dimensions not dynamically updated in each data table, and obtaining a dimension data table according to a data table with dimensions dynamically updated in each data table".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform: obtaining a source table according to a data table with non-dynamically updated dimensions among the data tables, and obtaining a dimension data table according to a data table with dynamically updated dimensions among the data tables; generating a corresponding summary table based on the data of the source table, according to the configured first correspondence between the source table and the summary table; and generating corresponding wide table data based on the summary table and the dimension data of the dimension data table, according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
Or to perform: obtaining a source table according to a data table with non-dynamically updated dimensions among the data tables, and obtaining a dimension data table according to a data table with dynamically updated dimensions among the data tables; generating a corresponding summary table based on the data of the source table, according to the configured first correspondence between the source table and the summary table; generating corresponding wide table data based on the summary table and the dimension data of the dimension data table, according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table; and, when the dimension data of the dimension data table is updated, updating the corresponding wide table data based on the summary table and the updated dimension data of the dimension data table, according to the configured second correspondence.
According to the technical scheme of the embodiment of the invention, the source table is obtained according to the data tables with non-dynamically updated dimensions, the dimension data table is obtained according to the data tables with dynamically updated dimensions, the corresponding summary table is generated from the data of the source table, and the corresponding wide table data is generated from the summary table and the dimension data of the dimension data table. When the dimension data of the dimension data table is updated, the corresponding wide table data is updated from the summary table and the updated dimension data of the dimension data table. With the embodiment of the invention, the data processing script does not need to be modified when wide table data is updated and the calculation does not need to run over all the theme table data. This overcomes the drawbacks of heavy workload, high cost, and high risk, reduces repeated operations, greatly reduces the amount of data to be recalculated, shortens the overall calculation time, and reduces the waste of server resources.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for generating wide table data, comprising:
obtaining a source table from those data tables whose dimensions are not dynamically updated, and obtaining a dimension data table from those data tables whose dimensions are dynamically updated;
generating a corresponding summary table from the data of the source table according to a configured first correspondence between the source table and the summary table;
and generating corresponding wide table data from the summary table and the dimension data of the dimension data table according to a configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
2. The method of claim 1, further comprising pre-configuring the first correspondence and the second correspondence, wherein:
configuring the first correspondence includes: configuring the summary table, the source tables required to generate the summary table, the fields to be extracted from each source table, and the primary key of the dimension data;
configuring the second correspondence includes: configuring the wide table, the summary tables required to generate the wide table, the fields to be extracted from each summary table, and the primary key of the dimension data corresponding to each summary table.
3. The method of claim 2, wherein the data of the source table grows dynamically, and the summary table comprises one or more partition tables;
generating the corresponding summary table from the data of the source table according to the configured first correspondence between the source table and the summary table comprises:
periodically extracting data from the newly added data of each source table according to the configured fields to be extracted from each source table, wherein in each period one partition table of the summary table is generated from the data extracted from the newly added data.
4. The method of claim 2, wherein configuring the second correspondence further comprises configuring dynamic partition information for the summary tables;
generating the corresponding wide table data from the summary table and the dimension data of the dimension data table according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table comprises:
determining the partition tables to be used for each summary table according to the configured dynamic partition information;
and extracting data from the partition tables to be used for each summary table according to the configured fields to be extracted from each summary table, and summarizing the data extracted from the partition tables according to the dimension data in the dimension data table to generate the corresponding wide table data.
5. A method for updating wide-table data generated according to any one of claims 1 to 4, comprising:
when the dimension data of the dimension data table is updated, updating the corresponding wide table data from the summary table and the updated dimension data of the dimension data table according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
6. The method of claim 5, wherein updating the corresponding wide table data from the summary table and the updated dimension data of the dimension data table according to the configured second correspondence comprises:
determining a dependency relationship between the summary tables and the wide tables according to the second correspondence, wherein if the data of a wide table is generated from one summary table, the dependency relationship is a single dependency, and if the data of a wide table is generated from a plurality of summary tables, the dependency relationship is a multiple dependency;
for the summary tables on which wide tables singly depend, performing the summarization operations in parallel according to the updated dimension data of the dimension data table, to obtain the updated data of each such wide table;
for the summary tables on which wide tables multiply depend: grouping the summary tables corresponding to each wide table at the minimum calculation granularity; deduplicating all resulting groups; performing, in parallel, a first summarization on the summary tables of each deduplicated group according to the updated dimension data of the dimension data table; storing each first summarization result in the cache table corresponding to its group; obtaining, for each wide table, the cache tables of the groups that contain its summary tables; and performing, in parallel, a second summarization on the cache tables corresponding to each wide table, to obtain the updated data of each wide table.
7. An apparatus for generating wide table data, comprising:
a data table extraction module, configured to obtain a source table from those data tables whose dimensions are not dynamically updated, and to obtain a dimension data table from those data tables whose dimensions are dynamically updated;
a summary table generation module, configured to generate a corresponding summary table from the data of the source table according to a configured first correspondence between the source table and the summary table;
and a wide table data generation module, configured to generate corresponding wide table data from the summary table and the dimension data of the dimension data table according to a configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
8. An apparatus for updating wide table data generated according to claim 7, comprising a wide table data updating module configured to:
when the dimension data of the dimension data table is updated, update the corresponding wide table data from the summary table and the updated dimension data of the dimension data table according to the configured second correspondence among the summary table, the dimension data of the dimension data table, and the wide table.
9. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method recited in any of claims 1-6.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-6.
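
The two-stage update for multiple dependencies recited in claim 6 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the summary tables `s1` to `s4`, the wide tables `wide_x` and `wide_y`, the dimension values, and the grouping in `wide_groups` are hypothetical. How groups are derived at the "minimum calculation granularity" is not specified here, so the grouping is simply taken as given, and in-memory dictionaries stand in for cache tables.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical summary tables: dimension primary key -> measure.
summaries = {
    "s1": {"A": 17, "B": 5},
    "s2": {"A": 1, "B": 2},
    "s3": {"B": 4},
    "s4": {"A": 3},
}

# Hypothetical updated dimension data: primary key -> dimension value.
dim = {"A": "phones", "B": "books"}

# Assumed grouping of the summary tables behind each multi-dependency wide table.
wide_groups = {
    "wide_x": [("s1", "s2"), ("s3",)],
    "wide_y": [("s1", "s2"), ("s4",)],
}


def first_summarize(group):
    """Roll one group of summary tables up by the updated dimension data."""
    cache = Counter()
    for name in group:
        for key, value in summaries[name].items():
            cache[dim[key]] += value
    return group, dict(cache)


def second_summarize(groups, cache_tables):
    """Merge the cache tables of the groups that one wide table depends on."""
    wide = Counter()
    for group in groups:
        wide.update(cache_tables[group])
    return dict(wide)


# Deduplicate groups across wide tables so that a shared group is computed once.
unique_groups = sorted({g for groups in wide_groups.values() for g in groups})

with ThreadPoolExecutor() as pool:
    # First summarization: one cache table per deduplicated group, in parallel.
    cache_tables = dict(pool.map(first_summarize, unique_groups))
    # Second summarization: one merge per wide table, in parallel.
    wide_tables = dict(
        zip(
            wide_groups,
            pool.map(lambda gs: second_summarize(gs, cache_tables), wide_groups.values()),
        )
    )
```

Here the group `("s1", "s2")` is shared by both wide tables, so deduplication means it is summarized once and its cache table is reused in both second summarizations.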
CN202010148063.7A 2020-03-05 2020-03-05 Wide-table data generation method, updating method and related device Active CN113360494B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010148063.7A CN113360494B (en) 2020-03-05 2020-03-05 Wide-table data generation method, updating method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010148063.7A CN113360494B (en) 2020-03-05 2020-03-05 Wide-table data generation method, updating method and related device

Publications (2)

Publication Number Publication Date
CN113360494A (en) 2021-09-07
CN113360494B (en) 2024-04-05

Family

ID=77523784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010148063.7A Active CN113360494B (en) 2020-03-05 2020-03-05 Wide-table data generation method, updating method and related device

Country Status (1)

Country Link
CN (1) CN113360494B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130086460A1 (en) * 2011-10-04 2013-04-04 Microsoft Corporation Automatic Relationship Detection for Reporting on Spreadsheet Data
CN109189835A (en) * 2018-08-21 2019-01-11 北京京东尚科信息技术有限公司 The method and apparatus of the wide table of data are generated in real time

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
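YU DU et al.: "Broadband rotary hybrid generator for wide-flow-rate fluid energy harvesting and bubble power generation", ENERGY CONVERSION AND MANAGEMENT *
Zhang Yawen; Liu Chunxia; Dang Weichao; Bai Shangwang: "Research on multi-tenant indexing based on a multi-wide-table schema for SaaS applications", Computer Applications and Software, no. 07 *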

Also Published As

Publication number Publication date
CN113360494B (en) 2024-04-05

Similar Documents

Publication Publication Date Title
CN109189835B (en) Method and device for generating data wide table in real time
US10521404B2 (en) Data transformations with metadata
US10372723B2 (en) Efficient query processing using histograms in a columnar database
US8380680B2 (en) Piecemeal list prefetch
CN107480205B (en) Method and device for partitioning data
CN111125064A (en) Method and device for generating database mode definition statement
CN112597126A (en) Data migration method and device
CN109960212B (en) Task sending method and device
CN112925859A (en) Data storage method and device
CN111753019A (en) Data partitioning method and device applied to data warehouse
CN108985805B (en) Method and device for selectively executing push task
CN113360494B (en) Wide-table data generation method, updating method and related device
CN110858199A (en) Document data distributed computing method and device
CN110781238B (en) Client view caching method and device based on combination of Redis and Hbase
CN113760861A (en) Data migration method and device
CN113704242A (en) Data processing method and device
CN113326680A (en) Method and device for generating table
CN112817930A (en) Data migration method and device
Lee et al. On a hadoop-based analytics service system
CN110019162B (en) Method and device for realizing attribute normalization
CN110727672A (en) Data mapping relation query method and device, electronic equipment and readable medium
CN113778501B (en) Code task processing method and device
CN109446183B (en) Global anti-duplication method and device
CN113448940A (en) Method and device for expanding database
CN112215249A (en) Hierarchical classification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant