CA3176450A1

CA3176450A1 - Method and apparatus for implementing incremental data consistency

Info

Publication number: CA3176450A1
Application number: CA3176450A
Authority: CA
Inventors: Hu Peng; Shangqiang FU; Yang Liu; Qian Sun
Original assignee: 10353744 Canada Ltd
Current assignee: 10353744 Canada Ltd
Priority date: 2019-03-28
Filing date: 2019-09-29
Publication date: 2020-10-01
Also published as: CN110046168A; CN110046168B; WO2020192064A1

Abstract

An incremental data consistency implementation method and device, relating to the technical field of data warehouses. The method comprises: initializing all data of data tables having an association relationship in a service system, and loading the data into a first database to generate a plurality of total data tables (S1); synchronizing real-time data of the data tables to the plurality of total data tables and to a plurality of incremental data tables of a second database on the basis of logs of a service database (S2); extracting all service unique identifiers in the plurality of incremental data tables, and merging said identifiers in the second database to generate an incremental identifier merged table (S3); and querying the plurality of total data tables according to the incremental identifier merged table to obtain the service data related to it, and correspondingly writing the service data into a consistent incremental data table of the second database (S4). The process has essentially no impact on the normal operation of the service database, and its consumption of database resources is relatively low.

Description

METHOD AND APPARATUS FOR IMPLEMENTING INCREMENTAL DATA CONSISTENCY

BACKGROUND OF THE INVENTION

Technical Field

[0001] The present invention relates to the technical field of data warehouses, and more particularly to a method and an apparatus for implementing incremental data consistency.

Description of Related Art

[0002] For constructing an operational data store (ODS) responsibility relationship data table, a big-data warehouse has to construct a consistent incremental data table, so as to ensure consistency of incremental data across different data tables that are associated with one another. Taking retail orders as an example, a consistent incremental data table covering the order header table and each order sub-table ensures that a changed order number exists in the incremental data table of every order table. This prevents the situation in which a changed order number is present in some tables but absent from others, which breaks the association between incremental table data.
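The inconsistency described above can be made concrete with a small sketch. The table names, field names, and rows below are hypothetical and are not taken from the disclosure; the sketch only shows how a changed order number can appear in one incremental table and be absent from the others.

```python
# Hypothetical incremental tables for one day: table name -> changed rows.
incremental = {
    "order_header":  [{"order_no": "A001", "status": "PAID"}],
    "order_product": [{"order_no": "A002", "sku": "SKU-9", "qty": 3}],
    "order_payment": [{"order_no": "A001", "amount": 25.0}],
}


def order_numbers(rows):
    """Return the set of order numbers that appear in a list of rows."""
    return {row["order_no"] for row in rows}


# Every order number that changed in at least one table.
all_changed = set().union(*(order_numbers(rows) for rows in incremental.values()))

# For each table, the changed order numbers that its own increment lacks.
missing = {
    table: sorted(all_changed - order_numbers(rows))
    for table, rows in incremental.items()
}

for table, gaps in missing.items():
    print(table, "is missing", gaps)
```

With this sample data, the header and payment increments lack order A002 and the product increment lacks order A001, so a join across the raw increments would silently drop rows.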

[0003] In the prior art, consistency of incremental data is typically realized using solutions described below.

[0004] The first method incrementally acquires the data of the order header table and sub-tables from the business system and feeds them to the incremental data tables of the big-data platform, then uses Hive/Spark to generate the corresponding full data tables, derives the complete set of changed order numbers from the incremental data tables and uses it to scan each full table by brute force, and finally generates a consistent incremental data table of the tables.

[0005] An alternative approach incrementally acquires the order number data from the order header table and sub-tables of the business system and feeds them to an order number change staging table of the business system, then, using database indexes, acquires from the business system the business data of the header table and each sub-table corresponding to the order numbers in the staging table, and puts the acquired business data into a consistent incremental data table of a data warehouse.

Date Reçue/Date Received 2022-09-22

[0006] While the foregoing two solutions are easy to implement, they have some defects and shortcomings.

[0007] The former prior-art solution requires reading the full order data in its entirety to generate a full data table in a Hive system. Assuming that there are originally 10 billion orders and that the number of orders increases by 2 million every day, each update of 2 million orders requires reading and writing the data of all 10 billion orders, and generating the consistent incremental data table requires reading the full data table once more. The process as a whole requires two full reads and one full write of the full data table, which consumes a huge amount of big-data platform resources and is inefficient.
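As a rough back-of-the-envelope check of the figures above, the following sketch counts rows touched per daily update. It assumes one row per order and ignores the growth of the full table itself; both are simplifications introduced here and are not stated in the disclosure.

```python
FULL_ORDERS = 10_000_000_000   # orders already in the full table (figure from the text)
DAILY_CHANGES = 2_000_000      # orders added per day (figure from the text)

# Prior-art method 1: two full reads plus one full write per daily update.
rows_read = 2 * FULL_ORDERS
rows_written = 1 * FULL_ORDERS
rows_touched = rows_read + rows_written

# Rows touched for each order that actually changed.
amplification = rows_touched / DAILY_CHANGES

print(f"{rows_touched:,} rows touched, {amplification:,.0f} per changed order")
# → 30,000,000,000 rows touched, 15,000 per changed order
```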

[0008] The latter known solution requires the business system to create an order number change staging table and to grant write access to it. The business system table data have to be read twice, and the reads rely on the data table indexes of the business system. The entire process is highly dependent on the business system, and extraction can cause the database to be locked. In particular, during large-scale promotional events the business system may be degraded, which directly prevents data extraction and in turn suspends big-data computing, so that analysis data cannot be generated as scheduled.

[0009] In addition, a Hive-based data warehouse does not support index lookups by order number and is not suitable for tracing back orders. Business analysis scenarios such as after-sales service need to associate the order data corresponding to the business, and the time span of such order data varies from one month to more than one year. Since a Hive table is basically incapable of indexing, such business analysis is relatively difficult to implement on the basis of a Hive table.

SUMMARY OF THE INVENTION

[0010] For addressing the issues of the prior art, embodiments of the present invention provide a method and an apparatus for implementing incremental data consistency.

[0011] The present invention adopts the following technical schemes.

[0012] In one aspect, the present invention provides a method for implementing incremental data consistency, comprising:

[0013] initializing all data of data tables having an association relationship in a business system, and loading the data to a first database so as to generate plural full data tables;

[0014] based on database logs of the business system, synchronizing real-time data of each said data table to the plural full data tables and to incremental data tables of a second database;

[0015] extracting all business unique identities in the plural incremental data tables and merging them in the second database to generate an incremental identity merged table; and

[0016] according to the incremental identity merged table, making a query to find out business data associated with the incremental identity merged table from the plural full data tables, and correspondingly writing the business data into the consistent incremental data table of the second database.

[0017] In a preferred implementation, the step of synchronizing, based on database logs of the business system, real-time data of each said data table to the plural full data tables and to incremental data tables of a second database comprises:

[0018] analyzing the database logs of the business system to get the real-time data of each said data table, and synchronizing the real-time data to a real-time data stream;

[0019] landing the data in the real-time data stream onto the plural full data tables; and

[0020] writing the data in the real-time data stream into the plural incremental data tables.

[0021] In a preferred implementation, the first database is a KV (key-value) database, and the second database is a Hive database.

[0022] In a preferred implementation, the step of making a query, according to the incremental identity merged table, to find out business data associated with the incremental identity merged table from the plural full data tables comprises:

[0023] for every order number in the incremental identity merged table, enquiring for the business data matching the order number from the plural full data tables, respectively, through an SQL enquiry interface, so as to obtain the enquiry result.

[0024] In a preferred implementation, the method further comprises:

[0025] receiving a data backward enquiry instruction, enquiring the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and returning data backward enquiry results.

[0026] In another aspect, the present invention provides an apparatus for implementing incremental data consistency, comprising:

[0027] an initializing module, for initializing all data of data tables having an association relationship in a business system, and loading the data to a first database so as to generate plural full data tables;

[0028] a real-time synchronizing module, for synchronizing, based on database logs of the business system, real-time data of each said data table to the plural full data tables and to incremental data tables of a second database;

[0029] an identity merging module, for extracting all business unique identities in the plural incremental data tables and merging them in the second database to generate an incremental identity merged table;

[0030] an enquiring module, for enquiring, according to the incremental identity merged table, to find out the business data associated with the incremental identity merged table from the plural full data tables; and

[0031] a writing module, for correspondingly writing the business data associated with the incremental identity merged table into the consistent incremental data table of the second database.

[0032] In a preferred implementation, the real-time synchronizing module is specifically for:

[0033] analyzing the database logs of the business system to get the real-time data of each said data table, and synchronizing the real-time data to a real-time data stream;

[0034] landing the data in the real-time data stream onto the plural full data tables; and

[0035] writing the data in the real-time data stream into the plural incremental data tables.

[0036] In a preferred implementation, the first database is a KV (key-value) database, and the second database is a Hive database.

[0037] In a preferred implementation, the enquiring module is specifically for:

[0038] as to every order number in the incremental identity merged table, enquiring for the business data matching the order number from the plural full data tables, respectively, through an SQL enquiry interface.

[0039] In a preferred implementation, the enquiring module is further for:

[0040] receiving a data backward enquiry instruction, enquiring the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and returning data backward enquiry results.

[0041] The present invention provides a method and an apparatus for implementing incremental data consistency, which use the database log to synchronize the real-time data of each data table in the business database to the data warehouse. Unlike the known solution, which reads the table data of the business system through a specially created order number change staging table and is therefore highly dependent on the data table indexes of the business system, the present invention basically does not interfere with the normal operation of the business database when collecting data from it, and requires only a single full read to enquire the business data related to the incremental identity merged table from the plural incremental data tables, thus consuming fewer database resources. Besides, the consistent incremental data table obtained by writing the enquiry results ensures consistent incremental data across data tables. In addition, since the consistent incremental data table supports analysis based on incremental data, all order-related analyses of the current day's data can be accomplished by retrieving only the current day's data in each table, without retrieving data in the history zone, so the consumption of database resources is smaller.

BRIEF DESCRIPTION OF THE DRAWINGS

[0042] To better illustrate the technical schemes as disclosed in the embodiments of the present invention, accompanying drawings referred in the description of the embodiments below are introduced briefly. It is apparent that the accompanying drawings as recited in the following description merely provide a part of possible embodiments of the present invention, and people of ordinary skill in the art would be able to obtain more drawings according to those provided herein without paying creative efforts, wherein:

[0043] FIG. 1 shows a flowchart of a method for implementing incremental data consistency of the present invention;

[0044] FIG. 2 shows an implementation flowchart of incremental order data consistency in an operational data store (ODS) according to the present invention; and

[0045] FIG. 3 shows a block diagram of an apparatus for implementing incremental data consistency of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0046] To make the foregoing objectives, features, and advantages of the present invention clearer and more understandable, the following description will be directed to some embodiments as depicted in the accompanying drawings to detail the technical schemes disclosed in these embodiments. It is, however, to be understood that the embodiments referred herein are only a part of all possible embodiments and thus not exhaustive. Based on the embodiments of the present invention, all the other embodiments can be conceived without creative labor by people of ordinary skill in the art, and all these and other embodiments shall be encompassed in the scope of the present invention.

[0047] Unless specified otherwise in the context, the terms "comprising", "including", and the like as used throughout the disclosure and the appended claims should be construed with inclusive meaning but not exclusive or exhaustive meaning. In other words, these terms are intended to mean "including but not limited to".

[0048] It is to be understood that, in the description of the present invention, the terms "first", "second", and so on are merely descriptive and shall not be understood as indicating or implying relative importance. Additionally, unless otherwise stated, in the description of the present invention, the term "plural" means two or more than two.

[0049] Embodiment 1

[0050] The embodiment of the present invention provides a method for implementing incremental data consistency, which is applicable to data warehouses (e.g., an operational data store, ODS), as shown in FIG. 1. The method comprises the following steps.

[0051] S1: initializing all data of data tables having an association relationship in a business system, and loading the data to a first database so as to generate plural full data tables.

[0052] In the present embodiment, data tables having an association relationship may have a one-to-one or one-to-multiple relationship. Among the data tables having a one-to-multiple relationship, one data table may be the parent table, while the others are each a child table. For example, in a scenario involving retail transaction orders, the order header table is the parent table, and the order product table, the order payment table, and the order expansion table are all child tables.

[0053] Specifically, based on an ETL tool, all data of the data tables are extracted from the business database corresponding to the business system. The data are then cleaned and converted before being loaded into a first database to form plural full data tables corresponding to respective data tables.

[0054] For example, all the data of each of the order header table, the order product table, the order payment table, and the order expansion table in the business database are loaded into the first database to generate full data tables corresponding to the order header table, the order product table, the order payment table, and the order expansion table, respectively.

[0055] Therein, the first database may be a KV (key-value) database. A key-value database is a database that stores data as key-value pairs. Both storage of and access to its data are keyed, so that a value can be found rapidly using its corresponding key, and the database supports efficient reading and writing by external systems. A representative key-value database is Redis.
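Step S1 can be sketched as follows, under stated assumptions: a plain Python dictionary stands in for the key-value database, the key layout `(table name, order number)` is a choice made for this illustration, and `clean` is a toy placeholder for the cleaning and conversion step. None of these details are specified by the disclosure.

```python
from collections import defaultdict

# Hypothetical snapshot extracted from the business database by an ETL job.
business_tables = {
    "order_header":  [{"order_no": "A001", "status": "NEW "},
                      {"order_no": "A002", "status": "PAID"}],
    "order_product": [{"order_no": "A001", "sku": "SKU-1", "qty": 1},
                      {"order_no": "A001", "sku": "SKU-2", "qty": 2},
                      {"order_no": "A002", "sku": "SKU-9", "qty": 1}],
}


def clean(row):
    """Toy cleaning step: strip surrounding whitespace from string fields."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def initialize_full_tables(tables):
    """Load every row into a store keyed by (table name, order number)."""
    store = defaultdict(list)  # in-memory stand-in for the key-value database
    for table, rows in tables.items():
        for row in rows:
            row = clean(row)
            store[(table, row["order_no"])].append(row)
    return store


full_store = initialize_full_tables(business_tables)
print(len(full_store[("order_product", "A001")]))  # → 2
```

Keeping a list per key accommodates the one-to-multiple case, where a child table holds several rows for the same order number.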

[0056] S2: based on the database log of the business system, synchronizing real-time data of each said data table to the plural full data tables and to plural incremental data tables of a second database.

[0057] Therein, the real-time data are data newly added or changed with respect to each data table.

[0058] Therein, the second database is a Hive database. Hive is a data warehouse tool based on Hadoop that can map structured data files onto database tables while providing simple SQL enquiry functions. It can convert an SQL statement into a MapReduce task for execution. It is advantageous because the learning cost is low, and simple MapReduce statistics can be produced quickly through SQL-like statements without developing a dedicated MapReduce application, which makes it well suited to statistical analyses in data warehouses.

[0059] Specifically, this involves parsing the database log of the business system to obtain the real-time data of each data table, and synchronizing the real-time data to a real-time data stream;

[0060] landing the data in the real-time data stream into the plural full data tables; and

[0061] writing data in the real-time stream into plural incremental data tables of the second database.

[0062] Therein, the database log records information of operations made to the business database. The database log may specifically be a Binlog, which can be analyzed regularly using a Binlog analyzer.

[0063] In the present embodiment, the database log may be acquired when the database log has been updated. Therein, the update includes addition, deletion, or modification made to any field in the data tables of the business database.

[0064] It is to be noted that, the step of landing the data of the real-time data stream into the plural full data tables and the step of writing the data of the real-time data stream into the plural incremental data tables may be conducted in any sequence, without limitation. In the present embodiment of the present invention, the two steps are preferably conducted at the same time.
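A minimal sketch of step S2 follows. The event format (`table`, `op`, `row_id`, `row`) is hypothetical and does not correspond to the output of any particular Binlog parser; the dictionaries stand in for the first and second databases. Each event is applied to the full table and appended to the incremental table in the same call, mirroring the preference for performing both writes together.

```python
from collections import defaultdict

# Hypothetical change events, as a log parser might emit them.
events = [
    {"table": "order_header", "op": "insert", "row_id": 1,
     "row": {"order_no": "A003", "status": "NEW"}},
    {"table": "order_payment", "op": "insert", "row_id": 7,
     "row": {"order_no": "A003", "amount": 40.0}},
    {"table": "order_header", "op": "update", "row_id": 1,
     "row": {"order_no": "A003", "status": "PAID"}},
]

full_tables = defaultdict(dict)         # first database: table -> {row_id: row}
incremental_tables = defaultdict(list)  # second database: table -> [changed rows]


def apply_event(event):
    """Apply one change event to the full table and record it as an increment."""
    table, row_id = event["table"], event["row_id"]
    if event["op"] == "delete":
        full_tables[table].pop(row_id, None)
    else:
        # Insert or update: the full table keeps only the latest version.
        full_tables[table][row_id] = event["row"]
    incremental_tables[table].append({"op": event["op"], **event["row"]})


for event in events:
    apply_event(event)

print(full_tables["order_header"][1]["status"])  # → PAID
print(len(incremental_tables["order_header"]))   # → 2
```

The full table ends up with the latest state of each row, while the incremental table retains every change seen during the period.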

[0065] S3: extracting all business unique identities in the plural incremental data tables, and merging them in the second database to generate an incremental identity merged table.

[0066] Therein, the business unique identity may exclusively mark one business record in the database table. In an order-related scenario, the business unique identity is the order number.

[0067] Specifically, this is about extracting all the business unique identities from the plural incremental data tables, and merging and de-duplicating the business unique identities, so as to generate the incremental identity merged table.

[0068] In the present embodiment, all the business unique identities may be merged into a set, with the repeated business unique identities removed. The business unique identities after de-duplication form the incremental identity merged table, which is stored in the Hive database.
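The merge and de-duplication of step S3 amounts to a set union over the identity column of every incremental table. The sketch below uses hypothetical table and column names; in a SQL engine the same result would roughly correspond to a distinct union of the order number columns, although the disclosure does not give the statement used.

```python
# Hypothetical incremental tables; only the identity column is shown.
incremental_tables = {
    "order_header":  [{"order_no": "A001"}, {"order_no": "A003"}, {"order_no": "A001"}],
    "order_product": [{"order_no": "A002"}],
    "order_payment": [{"order_no": "A003"}],
}


def merge_identities(tables, key="order_no"):
    """Union the identity column of every incremental table, dropping duplicates."""
    merged = set()
    for rows in tables.values():
        merged.update(row[key] for row in rows)
    return sorted(merged)  # sorted only to make the output deterministic


merged_ids = merge_identities(incremental_tables)
print(merged_ids)  # → ['A001', 'A002', 'A003']
```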

[0069] S4: according to the incremental identity merged table, acquiring business data associated with the incremental identity merged table from the plural full data tables by means of enquiry, and correspondingly writing them into a consistent incremental data table of the second database.

[0070] Specifically, the process may comprise:

[0071] for every order number in the incremental identity merged table, through the SQL enquiry interface, enquiring for business data matching the order number from plural full data tables.

[0072] In a practical implementation, an SQL enquiry interface may be developed to integrate KV database enquiries into SQL, which makes development easier and achieves real-time association between the Hive database and the KV database by means of SQL.

[0073] Since the Hive database and the KV database can be associated through SQL, the full data tables in the KV database can support fast retrieval based on order numbers, thereby providing the function of data index retrieval without increasing the burden on the Hadoop platform or the business system.
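Step S4 can be sketched as a keyed lookup per merged order number. The nested dictionaries below are an in-memory stand-in for the key-value full tables, and the function name is hypothetical; the SQL enquiry interface itself is not modelled.

```python
# Stand-in for the key-value full tables: table -> {order_no: [rows]}.
full_tables = {
    "order_header":  {"A001": [{"order_no": "A001", "status": "PAID"}],
                      "A002": [{"order_no": "A002", "status": "NEW"}]},
    "order_product": {"A001": [{"order_no": "A001", "sku": "SKU-1"}],
                      "A002": [{"order_no": "A002", "sku": "SKU-9"}]},
    "order_payment": {"A001": [{"order_no": "A001", "amount": 25.0}]},
}

merged_ids = ["A001", "A002"]  # incremental identity merged table


def build_consistent_increment(full, ids):
    """For each table, collect the full-table rows of every merged order number."""
    consistent = {}
    for table, by_order in full.items():
        rows = []
        for order_no in ids:
            rows.extend(by_order.get(order_no, []))  # keyed lookup, no table scan
        consistent[table] = rows
    return consistent


consistent = build_consistent_increment(full_tables, merged_ids)
for table, rows in consistent.items():
    print(table, [r["order_no"] for r in rows])
```

In this sample, order A002 has no payment row in the full table, so it is absent from the payment increment because no such data exists, not because a change was missed.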

[0074] Further, in addition to the foregoing steps, the method disclosed in the embodiment of the present invention may further comprise:

[0075] based on the consistent incremental data table in the second database, performing analyses of indicators, dimensions, and attributes related to the business subject, wherein the business subject may be ordering, payment or product return/exchange.

[0076] Since data analysis of the consistent incremental data table supports analysis based on incremental data, all order-related analyses for data on the current day can be easily accomplished by retrieving the data of the current day in each table, such as wide tables with respect to ordering, payment, or product return/exchange, without the need of retrieving data in the history zone, so the consumption to database resources is smaller.

[0077] Further, in addition to the foregoing steps, the method disclosed in the embodiment of the present invention may further comprise:

[0078] receiving a data backward enquiry instruction, enquiring the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and returning data backward enquiry results.

[0079] As an example, consider handling a customer complaint. If the complaint concerns an order placed a long time ago, retrieval from a Hive table over a long time span can be inefficient. Instead, a backward enquiry made to the KV database through the SQL enquiry interface can efficiently search for the customer's order information throughout the full data tables. This effectively serves after-sales business scenarios that require retrieving past order data and acquiring associated orders for dimensional analyses, with improved retrieval performance and reduced consumption of database resources.
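The difference between scanning a long time span and a keyed backward enquiry can be illustrated with a toy comparison. The day-partitioned dictionary is only a loose stand-in for a Hive table, the second dictionary stands in for the key-value full table, and all names and rows are hypothetical.

```python
# Hypothetical day-partitioned table holding order headers.
partitions = {
    "2018-01-05": [{"order_no": "A001", "status": "DONE"}],
    "2018-06-17": [{"order_no": "A002", "status": "DONE"}],
    "2019-03-01": [{"order_no": "A003", "status": "PAID"}],
}

# Key-value full table holding the same rows, keyed by order number.
kv_full_table = {row["order_no"]: row for rows in partitions.values() for row in rows}


def scan_lookup(order_no):
    """Scan the partitions in order until the order is found; count rows examined."""
    examined = 0
    for rows in partitions.values():
        for row in rows:
            examined += 1
            if row["order_no"] == order_no:
                return row, examined
    return None, examined


def keyed_lookup(order_no):
    """Fetch the order directly by its key."""
    return kv_full_table.get(order_no)


row, examined = scan_lookup("A003")
print(examined)                     # → 3
print(keyed_lookup("A003") == row)  # → True
```

The scan cost grows with the time span covered, whereas the keyed lookup touches a single entry regardless of how old the order is.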

[0080] The following description is directed to an order scenario for further explaining the method for implementing incremental data consistency of Embodiment 1 of the present invention. FIG. 2 shows a process for implementing incremental order data consistency for an operational data store (ODS). The process comprises:

[0081] Step 1: initializing all data in the parent table and its child tables of the business system, and loading the data into a KV database to form plural full data tables;

[0082] Step 2: synchronizing the data from the business system to a data stream through a database log in a real-time manner;

[0083] Step 3: landing the data in the real-time data stream into the incremental data table of the Hive database;

[0084] Step 4: writing data of the real-time data stream into the full data table of the KV database correspondingly;

[0085] Step 5: merging and de-duplicating all the order numbers extracted from the incremental data tables and writing them into the incremental order number merged table of the Hive database; and

[0086] Step 6: enquiring and calling the data of each full data table according to the incremental order number merged table through the SQL enquiry interface, and writing the enquiry results into the consistent incremental data table of the Hive database.

[0087] With the foregoing steps, the consistent incremental data table in the Hive database of the operational data store (ODS) and the full data table in the KV database are eventually generated.

[0088] The method for implementing incremental data consistency disclosed in the present invention uses the database log to synchronize the real-time data of each data table in the business database to the data warehouse. Unlike the known solution, which reads the table data of the business system through a specially created order number change staging table and is therefore highly dependent on the data table indexes of the business system, the present invention basically does not interfere with the normal operation of the business database when collecting data from it, and requires only a single full read to enquire the business data related to the incremental identity merged table from the plural incremental data tables, thus consuming fewer database resources. Besides, the consistent incremental data table obtained by writing the enquiry results ensures consistent incremental data across data tables. In addition, since the consistent incremental data table supports analysis based on incremental data, all order-related analyses of the current day's data can be accomplished by retrieving only the current day's data in each table, without retrieving data in the history zone, so the consumption of database resources is smaller.

[0089] Embodiment 2

[0090] The embodiment of the present invention provides an apparatus for implementing incremental data consistency. As shown in FIG. 3, the apparatus comprises the components detailed below.

[0091] An initializing module 31 is for initializing all data of data tables having an association relationship in a business system, and loading the data to a first database so as to generate plural full data tables.

[0092] A real-time synchronizing module 32 is for, based on logs of business databases, synchronizing real-time data of each said data table to plural full data tables and to plural incremental data tables of a second database.

[0093] An identity merging module 33 is for extracting all business unique identities in the plural incremental data tables, and merging them in the second database to generate an incremental identity merged table.

[0094] An enquiring module 34 is for, according to the incremental identity merged table, enquiring to find out the business data associated with the incremental identity merged table from the plural full data tables.

[0095] A writing module 35 is for correspondingly writing the business data associated with the incremental identity merged table into the consistent incremental data table of the second database.
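The division into modules 31 to 35 can be pictured as one class with one method per module. This is a toy in-memory model only: the class name, method names, and data layout are illustrative, and `synchronize` simply appends rows rather than handling updates or deletions.

```python
from collections import defaultdict


class IncrementalConsistencyApparatus:
    """Toy in-memory model; each method mirrors one of modules 31 to 35."""

    def __init__(self):
        # First database: table -> order_no -> rows.
        self.full = defaultdict(lambda: defaultdict(list))
        # Second database: table -> changed rows.
        self.incremental = defaultdict(list)
        self.merged_ids = []   # incremental identity merged table
        self.consistent = {}   # consistent incremental data tables

    def initialize(self, tables):            # initializing module 31
        for table, rows in tables.items():
            for row in rows:
                self.full[table][row["order_no"]].append(row)

    def synchronize(self, table, row):       # real-time synchronizing module 32
        self.full[table][row["order_no"]].append(row)
        self.incremental[table].append(row)

    def merge_identities(self):              # identity merging module 33
        ids = {r["order_no"] for rows in self.incremental.values() for r in rows}
        self.merged_ids = sorted(ids)

    def enquire(self):                       # enquiring module 34
        return {table: [r for i in self.merged_ids for r in by_no.get(i, [])]
                for table, by_no in self.full.items()}

    def write(self, result):                 # writing module 35
        self.consistent = result


app = IncrementalConsistencyApparatus()
app.initialize({"order_header": [{"order_no": "A001", "status": "NEW"}],
                "order_product": [{"order_no": "A001", "sku": "SKU-1"}]})
app.synchronize("order_product", {"order_no": "A001", "sku": "SKU-2"})
app.merge_identities()
app.write(app.enquire())
print(sorted(app.consistent))  # → ['order_header', 'order_product']
```

Although only the product table changed in this run, the header row of order A001 is also written to the consistent increment, which is the cross-table consistency the apparatus is meant to provide.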

[0096] Further, the real-time synchronizing module 32 is specifically for:

[0097] analyzing the database logs of the business system to get the real-time data of each said data table, and synchronizing the real-time data to a real-time data stream;

[0098] landing the data in the real-time data stream onto the plural full data tables; and

[0099] writing the data in the real-time data stream into the plural incremental data tables.

[0100] Further, the first database is a KV (key-value) database, and the second database is a Hive database.

[0101] Further, the enquiring module 34 is specifically for:

[0102] as to every order number in the incremental identity merged table, enquiring for the business data matching the order number from the plural full data tables, respectively, through an SQL enquiry interface.

[0103] Further, the enquiring module 34 is further for:

[0104] receiving a data backward enquiry instruction, enquiring the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and returning data backward enquiry results.

[0105] The apparatus for implementing incremental data consistency of the present invention uses the database log to synchronize the real-time data of each data table in the business database to the data warehouse. Unlike the known solution, which reads the table data of the business system through a specially created order number change staging table and is therefore highly dependent on the data table indexes of the business system, the present invention basically does not interfere with the normal operation of the business database when collecting data from it, and requires only a single full read to enquire the business data related to the incremental identity merged table from the plural incremental data tables, thus consuming fewer database resources. Besides, the consistent incremental data table obtained by writing the enquiry results ensures consistent incremental data across data tables. In addition, since the consistent incremental data table supports analysis based on incremental data, all order-related analyses of the current day's data can be accomplished by retrieving only the current day's data in each table, without retrieving data in the history zone, so the consumption of database resources is smaller.

[0106] All the alternative technical schemes described above may be combined in any manner to form more alternative embodiments of the present invention and no enumeration is made herein.

[0107] It is to be noted that the division of work among the foregoing functional modules of the apparatus for implementing incremental data consistency of the present embodiment, when implementing the method for implementing incremental data consistency, is merely exemplary. In practical implementations, the work may be divided differently among functional modules. In other words, the internal architecture of the apparatus for implementing incremental data consistency may be reconfigured with different functional modules to perform all or a part of the functions described previously. In addition, since the apparatus for implementing incremental data consistency of the present embodiment and the method for implementing incremental data consistency disclosed in the previous embodiment stem from the same conception, the details of its implementation can be learned from the description of the method in the previous embodiment, and no repetition is made herein.

[0108] As will be appreciated by people of ordinary skill in the art, implementation of all or a part of the steps of the method of the present invention as described previously may be realized by hardware components, or by having a program instruct related hardware components. The program may be stored in a computer-readable storage medium, wherein the abovementioned storage medium may be a ROM, a magnetic disk, an optical disk or the like.

[0109] The preferred embodiments of the present invention described previously are not intended to limit the present invention. Any modification, equivalent replacement, and improvement made under the spirit and principle of the present invention shall be included in the scope of the present invention.

Claims

1. A device for implementing incremental data consistency, the device comprising:
an initializing module configured to initialize all data of a plurality of data tables having an association relationship in a business system, and load the data to a first database so as to generate plural full data tables;
a real-time synchronizing module configured to synchronize real-time data of each said data table to the plural full data tables and to incremental data tables of a second database, based on database logs of the business system;
an identity merging module configured to extract all business unique identities in the plural incremental data tables and merge them in the second database to generate an incremental identity merged table;
an enquiring module configured to make a query to find out business data associated with the incremental identity merged table from the plural full data tables according to the incremental identity merged table; and
a writing module configured to correspondingly write the business data into a consistent incremental data table of the second database.

2. The device of claim 1 wherein the data of the data tables are extracted from a business database corresponding to the business system.

3. The device of claim 2, wherein, to synchronize real-time data of each data table to the plural full data tables and to the incremental data tables of the second database, the real-time synchronizing module is further configured to:
analyze the database logs of the business system to get the real-time data of each said data table, and synchronize the real-time data to a real-time data stream;
land the data in the real-time data stream onto the plural full data tables;
and write the data in the real-time data stream into the plural incremental data tables.
Date recue/Date received 2023-09-26

4. The device of claim 3 wherein the database log records information of the operations made to the business database.

5. The device of any one of claims 3 or 4 wherein the database log is a Binlog, wherein the Binlog is analyzed by a Binlog analyzer.

6. The device of any one of claims 1 to 5 wherein the data from the plurality of data tables is cleaned and converted before being loaded to the first database.

7. The device of any one of claims 1 to 6 wherein the plurality of data tables having an association relationship have a one-to-one relationship.

8. The device of any one of claims 1 to 6 wherein the plurality of data tables having an association relationship have a one-to-multiple relationship.

9. The device of claim 8 wherein one of the plurality of data tables is a parent table and the remaining data tables of the plurality of data tables are child tables.

10. The device of any one of claims 1 to 9 wherein new data is added and existing data is changed in real-time.

11. The device of any one of claims 1 to 10, wherein the first database is a key-value (KV) database and the second database is a Hive database.

12. The device of claim 11 wherein the key-value database stores data by key-value pairs, and storage of and access to its data are both conducted using key-value pairs.

13. The device of claim 10 or 11 wherein the key-value database is Redis.

14. The device of any one of claims 1 to 13 wherein each business unique identity marks one business record in the database table.

15. The device of any one of claims 1 to 14 wherein the business unique identities are merged into a set and any duplicated business unique identities are removed, wherein the merged and unduplicated business unique identities form the incremental identity merged table.

16. The device of claim 11 wherein, to make a query to find out business data associated with the incremental identity merged table from the plural full data tables, the enquiring module is further configured to:
enquire for the business data matching the business unique identifier from the plural full data tables, respectively, through an SQL enquiry interface.

17. The device of claim 16 wherein the SQL enquiry interface integrates key-value database enquiries into an SQL, to achieve real-time association between the key-value database and the Hive database.

18. The device of claim 16 wherein the enquiring module is further configured to:
receive a data backward enquiry instruction, enquire the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and return data backward enquiry results.

19. The device of any one of claims 1 to 18 wherein based on the incremental data table in the second database the device is further configured to perform analyses of indicators, dimensions, and attributes related to the business subject, wherein the business subject may be ordering, payment or product return/exchange.

20. The device of claim 19 wherein the business is a retail business and a business scenario involves an order and the device is configured to:
initialize all data in the parent table and the child tables of the business system, and load the data into the KV database to form plural full data tables;
synchronize the data from the business system to a data stream through the database log in a real-time manner;
land the data in the real-time data stream into the incremental data table of the Hive database;
write data of the real-time data stream into the full data table of the KV database correspondingly;
merge and de-duplicate all the order numbers extracted from the incremental data tables and write them into the incremental order number merged table of the Hive database; and
enquire and call the data of each full data table according to the incremental order number merged table through the SQL enquiry interface, and write the enquiry results into the consistent incremental data table of the Hive database.

21. The device of claim 20 wherein the parent table is an order header table and the child tables include an order product table, an order payment table, and an order expansion table.

22. The device of claim 20 wherein the business unique identity is an order number.

23. A system for implementing incremental data consistency, the system comprising:
an initializing module configured to initialize all data of a plurality of data tables having an association relationship in a business system, and load the data to a first database so as to generate plural full data tables;
a real-time synchronizing module configured to synchronize real-time data of each said data table to the plural full data tables and to incremental data tables of a second database, based on database logs of the business system;
an identity merging module configured to extract all business unique identities in the plural incremental data tables and merge them in the second database to generate an incremental identity merged table;
an enquiring module configured to make a query to find out business data associated with the incremental identity merged table from the plural full data tables according to the incremental identity merged table; and
a writing module configured to correspondingly write the business data into a consistent incremental data table of the second database.

24. The system of claim 23 wherein the data of the data tables are extracted from a business database corresponding to the business system.

25. The system of claim 24, wherein, to synchronize real-time data of each data table to the plural full data tables and to the incremental data tables of the second database, the real-time synchronizing module is further configured to:
analyze the database logs of the business system to get the real-time data of each said data table, and synchronize the real-time data to a real-time data stream;
land the data in the real-time data stream onto the plural full data tables;
and write the data in the real-time data stream into the plural incremental data tables.

26. The system of claim 25 wherein the database log records information of the operations made to the business database.

27. The system of any one of claims 25 or 26 wherein the database log is a Binlog, wherein the Binlog is analyzed by a Binlog analyzer.

28. The system of any one of claims 23 to 27 wherein the data from the plurality of data tables is cleaned and converted before being loaded to the first database.

29. The system of any one of claims 23 to 28 wherein the plurality of data tables having an association relationship have a one-to-one relationship.

30. The system of any one of claims 23 to 28 wherein the plurality of data tables having an association relationship have a one-to-multiple relationship.

31. The system of claim 30 wherein one of the plurality of data tables is a parent table and the remaining data tables of the plurality of data tables are child tables.

32. The system of any one of claims 23 to 31 wherein new data is added and existing data is changed in real-time.

33. The system of any one of claims 23 to 32, wherein the first database is a key-value (KV) database and the second database is a Hive database.

34. The system of claim 33 wherein the key-value database stores data by key-value pairs, and storage of and access to its data are both conducted using key-value pairs.

35. The system of claim 32 or 33 wherein the key-value database is Redis.

36. The system of any one of claims 23 to 35 wherein each business unique identity marks one business record in the database table.

37. The system of any one of claims 23 to 36 wherein the business unique identities are merged into a set and any duplicated business unique identities are removed, wherein the merged and unduplicated business unique identities form the incremental identity merged table.

38. The system of claim 33 wherein, to make a query to find out business data associated with the incremental identity merged table from the plural full data tables, the enquiring module is further configured to:
enquire for the business data matching the business unique identifier from the plural full data tables, respectively, through an SQL enquiry interface.

39. The system of claim 38 wherein the SQL enquiry interface integrates key-value database enquiries into an SQL, to achieve real-time association between the key-value database and the Hive database.

40. The system of claim 38 wherein the enquiring module is further configured to:
receive a data backward enquiry instruction, enquire the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and return data backward enquiry results.

41. The system of any one of claims 23 to 40 wherein based on the incremental data table in the second database the system is further configured to perform analyses of indicators, dimensions, and attributes related to the business subject, wherein the business subject may be ordering, payment or product return/exchange.

42. The system of claim 41 wherein the business is a retail business and a business scenario involves an order and the system is configured to:
initialize all data in the parent table and the child tables of the business system, and load the data into the KV database to form plural full data tables;
synchronize the data from the business system to a data stream through the database log in a real-time manner;
land the data in the real-time data stream into the incremental data table of the Hive database;
write data of the real-time data stream into the full data table of the KV database correspondingly;
merge and de-duplicate all the order numbers extracted from the incremental data tables and write them into the incremental order number merged table of the Hive database; and
enquire and call the data of each full data table according to the incremental order number merged table through the SQL enquiry interface, and write the enquiry results into the consistent incremental data table of the Hive database.

43. The system of claim 42 wherein the parent table is an order header table and the child tables include an order product table, an order payment table, and an order expansion table.

44. The system of claim 42 wherein the business unique identity is an order number.

45. A computer equipment comprising:
a computer readable physical memory;
a processor communicatively coupled to the memory; and
a computer program stored on the memory and operable on the processor, wherein the processor executes the computer program configured to:
initialize all data of a plurality of data tables having an association relationship in a business system, and load the data to a first database so as to generate plural full data tables;
synchronize real-time data of each said data table to the plural full data tables and to incremental data tables of a second database, based on database logs of the business system;
extract all business unique identities in the plural incremental data tables and merge them in the second database to generate an incremental identity merged table; and
make a query to find out business data associated with the incremental identity merged table from the plural full data tables according to the incremental identity merged table, and correspondingly write the business data into a consistent incremental data table of the second database.

46. The equipment of claim 45 wherein the data of the data tables are extracted from a business database corresponding to the business system.

47. The equipment of claim 46, wherein to synchronize real-time data of each data table to the plural full data tables and to incremental data tables of a second database the program is further configured to:
analyze the database logs of the business system to get the real-time data of each said data table, and synchronize the real-time data to a real-time data stream;
land the data in the real-time data stream onto the plural full data tables;
and write the data in the real-time data stream into the plural incremental data tables.

48. The equipment of claim 47 wherein the database log records information of the operations made to the business database.

49. The equipment of any one of claims 47 or 48 wherein the database log is a Binlog, wherein the Binlog is analyzed by a Binlog analyzer.

50. The equipment of any one of claims 45 to 49 wherein the data from the plurality of data tables is cleaned and converted before being loaded to the first database.

51. The equipment of any one of claims 45 to 50 wherein the plurality of data tables having an association relationship have a one-to-one relationship.

52. The equipment of any one of claims 45 to 50 wherein the plurality of data tables having an association relationship have a one-to-multiple relationship.

53. The equipment of claim 52 wherein one of the plurality of data tables is a parent table and the remaining data tables of the plurality of data tables are child tables.

54. The equipment of any one of claims 45 to 53 wherein new data is added and existing data is changed in real-time.

55. The equipment of any one of claims 45 to 54, wherein the first database is a key-value (KV) database and the second database is a Hive database.

56. The equipment of claim 55 wherein the key-value database stores data by key-value pairs, and storage of and access to its data are both conducted using key-value pairs.

57. The equipment of claim 54 or 55 wherein the key-value database is Redis.

58. The equipment of any one of claims 45 to 57 wherein each business unique identity marks one business record in the database table.

59. The equipment of any one of claims 45 to 58 wherein the business unique identities are merged into a set and any duplicated business unique identities are removed, wherein the merged and unduplicated business unique identities form the incremental identity merged table.

60. The equipment of claim 55 wherein to make a query to find out business data associated with the incremental identity merged table from the plural full data tables the program is further configured to:
enquire for the business data matching the business unique identifier from the plural full data tables, respectively, through an SQL enquiry interface.

61. The equipment of claim 60 wherein the SQL enquiry interface integrates key-value database enquiries into an SQL, to achieve real-time association between the key-value database and the Hive database.

62. The equipment of claim 60 wherein the program is further configured to:
receive a data backward enquiry instruction, enquire the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and return data backward enquiry results.

63. The equipment of any one of claims 45 to 62 wherein the program is further configured to:
based on the incremental data table in the second database, perform analyses of indicators, dimensions, and attributes related to the business subject, wherein the business subject may be ordering, payment or product return/exchange.

64. The equipment of claim 63 wherein the business is a retail business and a business scenario involves an order, and the program is further configured to:
initialize all data in the parent table and the child tables of the business system, and load the data into the KV database to form plural full data tables;
synchronize the data from the business system to a data stream through the database log in a real-time manner;
land the data in the real-time data stream into the incremental data table of the Hive database;
write data of the real-time data stream into the full data table of the KV database correspondingly;
merge and de-duplicate all the order numbers extracted from the incremental data tables and write them into the incremental order number merged table of the Hive database; and
enquire and call the data of each full data table according to the incremental order number merged table through the SQL enquiry interface, and write the enquiry results into the consistent incremental data table of the Hive database.

65. The equipment of claim 64 wherein the parent table is an order header table and the child tables include an order product table, an order payment table, and an order expansion table.

66. The equipment of claim 64 wherein the business unique identity is an order number.

67. A method for implementing incremental data consistency, the method comprising:
initializing all data of a plurality of data tables having an association relationship in a business system, and loading the data to a first database so as to generate plural full data tables;
synchronizing real-time data of each said data table to the plural full data tables and to incremental data tables of a second database, based on database logs of the business system;
extracting all business unique identities in the plural incremental data tables and merging them in the second database to generate an incremental identity merged table; and
making a query to find out business data associated with the incremental identity merged table from the plural full data tables according to the incremental identity merged table, and correspondingly writing the business data into a consistent incremental data table of the second database.

68. The method of claim 67 wherein the data of the data tables are extracted from a business database corresponding to the business system.

69. The method of claim 68, wherein synchronizing real-time data of each data table to the plural full data tables and to incremental data tables of a second database further comprises:
analyzing the database logs of the business system to get the real-time data of each said data table, and synchronizing the real-time data to a real-time data stream;
landing the data in the real-time data stream onto the plural full data tables; and writing the data in the real-time data stream into the plural incremental data tables.

70. The method of claim 69 wherein the database log records information of the operations made to the business database.

71. The method of any one of claims 69 or 70 wherein the database log is a Binlog, wherein the Binlog is analyzed by a Binlog analyzer.

72. The method of any one of claims 67 to 71 wherein the data from the plurality of data tables is cleaned and converted before being loaded to the first database.

73. The method of any one of claims 67 to 72 wherein the plurality of data tables having an association relationship have a one-to-one relationship.

74. The method of any one of claims 67 to 72 wherein the plurality of data tables having an association relationship have a one-to-multiple relationship.

75. The method of claim 74 wherein one of the plurality of data tables is a parent table and the remaining data tables of the plurality of data tables are child tables.

76. The method of any one of claims 67 to 75 wherein new data is added and existing data is changed in real-time.

77. The method of any one of claims 67 to 76, wherein the first database is a key-value (KV) database and the second database is a Hive database.

78. The method of claim 77 wherein the key-value database stores data by key-value pairs, and storage of and access to its data are both conducted using key-value pairs.

79. The method of claim 76 or 77 wherein the key-value database is Redis.

80. The method of any one of claims 67 to 79 wherein each business unique identity marks one business record in the database table.

81. The method of any one of claims 67 to 80 wherein the business unique identities are merged into a set and any duplicated business unique identities are removed, wherein the merged and unduplicated business unique identities form the incremental identity merged table.

82. The method of claim 77 wherein making a query to find out business data associated with the incremental identity merged table from the plural full data tables further comprises:
enquiring for the business data matching the business unique identifier from the plural full data tables, respectively, through an SQL enquiry interface.

83. The method of claim 82 wherein the SQL enquiry interface integrates key-value database enquiries into an SQL, to achieve real-time association between the key-value database and the Hive database.

84. The method of claim 82 further comprising:
receiving a data backward enquiry instruction, enquiring the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and returning data backward enquiry results.

85. The method of any one of claims 67 to 84 further comprising:
based on the incremental data table in the second database, performing analyses of indicators, dimensions, and attributes related to the business subject, wherein the business subject may be ordering, payment or product return/exchange.

86. The method of claim 85 wherein the business is a retail business and a business scenario involves an order, further comprising:
initializing all data in the parent table and the child tables of the business system, and loading the data into the KV database to form plural full data tables;
synchronizing the data from the business system to a data stream through the database log in a real-time manner;
landing the data in the real-time data stream into the incremental data table of the Hive database;
writing data of the real-time data stream into the full data table of the KV database correspondingly;
merging and de-duplicating all the order numbers extracted from the incremental data tables and writing them into the incremental order number merged table of the Hive database; and
enquiring and calling the data of each full data table according to the incremental order number merged table through the SQL enquiry interface, and writing the enquiry results into the consistent incremental data table of the Hive database.

87. The method of claim 86 wherein the parent table is an order header table and the child tables include an order product table, an order payment table, and an order expansion table.

88. The method of claim 86 wherein the business unique identity is an order number.

89. A computer readable physical memory having stored thereon a computer program executed by a computer configured to:
initialize all data of a plurality of data tables having an association relationship in a business system, and load the data to a first database so as to generate plural full data tables;
synchronize real-time data of each said data table to the plural full data tables and to incremental data tables of a second database, based on database logs of the business system;
extract all business unique identities in the plural incremental data tables and merge them in the second database to generate an incremental identity merged table; and
make a query to find out business data associated with the incremental identity merged table from the plural full data tables according to the incremental identity merged table, and correspondingly write the business data into a consistent incremental data table of the second database.

90. The memory of claim 89 wherein the data of the data tables are extracted from a business database corresponding to the business system.

91. The memory of claim 90, wherein to synchronize real-time data of each data table to the plural full data tables and to incremental data tables of a second database the program is further configured to:
analyze the database logs of the business system to get the real-time data of each said data table, and synchronize the real-time data to a real-time data stream;
land the data in the real-time data stream onto the plural full data tables;
and write the data in the real-time data stream into the plural incremental data tables.

92. The memory of claim 91 wherein the database log records information of the operations made to the business database.

93. The memory of any one of claims 91 or 92 wherein the database log is a Binlog, wherein the Binlog is analyzed by a Binlog analyzer.

94. The memory of any one of claims 89 to 93 wherein the data from the plurality of data tables is cleaned and converted before being loaded to the first database.

95. The memory of any one of claims 89 to 94 wherein the plurality of data tables having an association relationship have a one-to-one relationship.

96. The memory of any one of claims 89 to 94 wherein the plurality of data tables having an association relationship have a one-to-multiple relationship.

97. The memory of claim 96 wherein one of the plurality of data tables is a parent table and the remaining data tables of the plurality of data tables are child tables.

98. The memory of any one of claims 89 to 97 wherein new data is added and existing data is changed in real-time.

99. The memory of any one of claims 89 to 98, wherein the first database is a key-value (KV) database and the second database is a Hive database.

100. The memory of claim 99 wherein the key-value database stores data by key-value pairs, and storage of and access to its data are both conducted using key-value pairs.

101. The memory of claim 98 or 99 wherein the key-value database is Redis.

102. The memory of any one of claims 89 to 101 wherein each business unique identity marks one business record in the database table.

103. The memory of any one of claims 89 to 102 wherein the business unique identities are merged into a set and any duplicated business unique identities are removed, wherein the merged and unduplicated business unique identities form the incremental identity merged table.

104. The memory of claim 99 wherein to make a query to find out business data associated with the incremental identity merged table from the plural full data tables the program is further configured to:
enquire for the business data matching the business unique identifier from the plural full data tables, respectively, through an SQL enquiry interface.

105. The memory of claim 104 wherein the SQL enquiry interface integrates key-value database enquiries into an SQL, to achieve real-time association between the key-value database and the Hive database.

106. The memory of claim 104 wherein the program is further configured to:
receive a data backward enquiry instruction, enquire the business data associated with the data backward enquiry instruction from the first database through the SQL enquiry interface, and return data backward enquiry results.

107. The memory of any one of claims 89 to 106 wherein the program is further configured to:
based on the incremental data table in the second database, perform analyses of indicators, dimensions, and attributes related to the business subject, wherein the business subject may be ordering, payment or product return/exchange.

108. The memory of claim 107 wherein the business is a retail business and a business scenario involves an order, and the program is further configured to:
initialize all data in the parent table and the child tables of the business system, and load the data into the KV database to form plural full data tables;
synchronize the data from the business system to a data stream through the database log in a real-time manner;
land the data in the real-time data stream into the incremental data table of the Hive database;
write data of the real-time data stream into the full data table of the KV database correspondingly;
merge and de-duplicate all the order numbers extracted from the incremental data tables and write them into the incremental order number merged table of the Hive database; and
enquire and call the data of each full data table according to the incremental order number merged table through the SQL enquiry interface, and write the enquiry results into the consistent incremental data table of the Hive database.

109. The memory of claim 108 wherein the parent table is an order header table and the child tables include an order product table, an order payment table, and an order expansion table.

110. The memory of claim 108 wherein the business unique identity is an order number.
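
By way of non-limiting illustration only, and forming no part of the claims, the order scenario recited above (initialize, synchronize, merge identities, enquire and write) may be sketched as follows. The sketch assumes that in-memory Python dictionaries and lists stand in for the KV database and the Hive tables, and that change events have already been parsed from the database log. The names TABLES, order_no, row_id, initialize, synchronize, merge_identities, build_consistent_increment and demo are hypothetical and do not appear in the original disclosure.

```python
from collections import defaultdict

# Hypothetical table names following the order scenario (parent and child tables).
TABLES = ("order_header", "order_product", "order_payment")


def initialize(business_rows):
    """Load all existing rows into per-table full tables keyed by order number."""
    full = {t: defaultdict(dict) for t in TABLES}
    for table, row in business_rows:
        full[table][row["order_no"]][row["row_id"]] = row
    return full


def synchronize(events, full, incremental):
    """Apply each log-derived change to the full tables and the incremental tables."""
    for table, row in events:
        full[table][row["order_no"]][row["row_id"]] = row  # upsert into KV stand-in
        incremental[table].append(row)                      # append to the increment


def merge_identities(incremental):
    """Extract every order number from the increments, merged and de-duplicated."""
    return sorted({row["order_no"] for rows in incremental.values() for row in rows})


def build_consistent_increment(merged_ids, full):
    """For each merged order number, read the related rows back from every full table."""
    result = []
    for order_no in merged_ids:
        record = {"order_no": order_no}
        for table in TABLES:
            rows = full[table].get(order_no, {})
            record[table] = [rows[k] for k in sorted(rows)]
        result.append(record)
    return result


def demo():
    # Existing data: one order with a header row and a product row.
    full = initialize([
        ("order_header", {"order_no": "A1", "row_id": 1, "status": "created"}),
        ("order_product", {"order_no": "A1", "row_id": 1, "sku": "X"}),
    ])
    incremental = {t: [] for t in TABLES}
    # Real-time changes: a payment arrives and the header is updated.
    synchronize([
        ("order_payment", {"order_no": "A1", "row_id": 1, "amount": 10}),
        ("order_header", {"order_no": "A1", "row_id": 1, "status": "paid"}),
    ], full, incremental)
    ids = merge_identities(incremental)
    return ids, build_consistent_increment(ids, full)
```

In this sketch only the header and payment rows change, yet the consistent increment for the order also carries the unchanged product row, because the final step reads every full table by the merged order number rather than relying on the incremental tables alone.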
