WO2019100638A1

WO2019100638A1 - Data synchronization method, device and equipment, and storage medium

Info

Publication number: WO2019100638A1
Application number: PCT/CN2018/082270
Authority: WO
Inventors: 付军
Original assignee: 平安科技（深圳）有限公司
Priority date: 2017-11-22
Filing date: 2018-04-09
Publication date: 2019-05-31
Also published as: CN107967316A

Abstract

Disclosed in the present application are a data synchronization method, device and equipment, and a readable storage medium, the method comprising: acquiring inputted source library information that at least comprises source library name, source library table name and source library table type, and parsing metadata comprised in a source library corresponding to the source library information so as to obtain a source table structure; generating, according to the source table structure, a first target library table for establishing temporary storage data in a target library and a table creation script for establishing a second target library table which stores the same data as the source library in the target library; and acquiring the table type of the first target library table, and correspondingly generating a synchronization script for synchronizing metadata from the source library to the target library by means of the first target library table and the second target library table in sequence.

Description

Data synchronization method, apparatus, device, and storage medium

This application claims priority to Chinese Patent Application No. 201711175125.8, filed with the Chinese Patent Office on November 21, 2017 and entitled "A Data Synchronization Method, Apparatus, and Computer Readable Storage Medium", the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of data synchronization technologies, and in particular, to a data synchronization method, apparatus, device, and storage medium.

Background

Currently, when synchronizing the data of an enterprise database (that is, synchronizing data from the source library to the target library through the synchronization tool), the following development is required:

1. According to the source library table structure information, establish a corresponding table in the target hive database;

2. Develop a synchronization script program based on the synchronization tool used;

The table creation scripts and synchronization scripts in the above process are all written manually by developers. The development process is complicated, inefficient, and error-prone, which greatly reduces the efficiency of data synchronization.

Therefore, the prior art has yet to be improved and developed.

Summary

In view of the above deficiencies of the prior art, the purpose of the present application is to provide a data synchronization method, apparatus, device, and storage medium, intended to solve the problem that, in the prior art, the table creation script and the synchronization script used in data synchronization are developed manually, so that the development process is complicated, inefficient, and error-prone, which greatly reduces the efficiency of data synchronization.

In order to achieve the above objectives, the present application adopts the following technical solutions:

A data synchronization method includes the following steps:

Obtaining source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain a source table structure;

Generating, according to the source table structure, a table creation script for establishing in the target library a first target library table that temporarily stores data and a second target library table that stores the same data as the source library;

Obtaining a table type of the first target library table, and correspondingly generating a synchronization script for synchronizing the metadata from the source library through the first target library table and the second target library table to the target library.

Optionally, the table type of the first target library table is one of an incremental table, a flow table, or a full table; and the table type of the second target library table is one of an incremental table, a flow table, or a full table.

Optionally, the step of obtaining source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain a source table structure, includes:

Obtaining source library information that includes the source library name, the source library table name, the source library table type, the source table update field, the source table deduplication field, and the target library name;

Obtaining metadata from a source library corresponding to the source library name in the source library information;

The metadata information table of the obtained metadata is parsed, and the source table structure is obtained according to the metadata information table.

Optionally, the step of obtaining the table type of the first target library table and correspondingly generating a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence includes:

Obtaining the table type of the first target library table, and determining whether it is an incremental table, a flow table, or a full table;

When the table type of the first target library table is an incremental table, correspondingly generating a first sqoop synchronization script and hive program; the first sqoop synchronization script and hive program are used to synchronize the metadata from the source library into a specified partition of the first target library table, and then to deduplicate the metadata in the first target library table according to the source table deduplication field and store it in the second target library table.

Optionally, the step of obtaining the table type of the first target library table and correspondingly generating a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence also includes:

When the table type of the first target library table is a flow table, correspondingly generating a second sqoop synchronization script and hive program; the second sqoop synchronization script and hive program are used to synchronize the metadata from the source library into a specified partition of the first target library table, and then to store the metadata in the first target library table in the second target library table.

When the table type of the first target library table is a full table, correspondingly generating a third sqoop synchronization script and hive program; the third sqoop synchronization script and hive program are used to synchronize the metadata from the source library to the first target library table, and then to store the metadata in the first target library table in the second target library table.

Optionally, the execution periods of the first sqoop synchronization script and hive program, the second sqoop synchronization script and hive program, and the third sqoop synchronization script and hive program are all 24 hours.

Optionally, the metadata information table corresponding to the metadata includes at least a table owner, a table name, a table comment, a column name, a column comment, and a column order.

A data synchronization apparatus, the data synchronization apparatus comprising:

An input parsing module, configured to obtain source library information that includes at least a source library name, a source library table name, and a source library table type, and to parse the metadata contained in the source library corresponding to the source library information to obtain a source table structure;

A table creation script generation module, configured to generate, according to the source table structure, a table creation script for establishing in the target library a first target library table that temporarily stores data and a second target library table that stores the same data as the source library;

A synchronization script generation module, configured to obtain the table type of the first target library table, and to correspondingly generate a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence.

A data synchronization device includes: a processor, a memory, a communication bus; and the memory stores a computer readable program executable by the processor;

The communication bus implements connection communication between the processor and the memory;

The processor implements the steps in the data synchronization method described above when the computer readable program is executed.

A storage medium, wherein the storage medium stores one or more programs, the one or more programs being executable by one or more processors to implement the steps in the data synchronization method described above.

Compared with the prior art, the present application automatically generates a table creation script and a synchronization script by configuring a limited number of entries, so that the data synchronization operation is simplified, the development efficiency is improved, and human error is reduced.

DRAWINGS

FIG. 1 is a flowchart of a data synchronization method provided by the present application.

FIG. 2 is a flowchart of step S100 in the data synchronization method provided by the present application.

FIG. 3 is a flowchart of step S300 in the data synchronization method provided by the present application.

FIG. 4 is a schematic diagram of an operating environment of a preferred embodiment of a data synchronization device provided by the present application.

FIG. 5 is a functional block diagram of a preferred embodiment of the data synchronization program of the present application.

FIG. 6 is a structural block diagram of a data synchronization system provided by the present application.

Detailed description

In the prior art, the table creation script and the synchronization script used in data synchronization are developed manually; the development process is complicated, inefficient, and error-prone, which greatly reduces the efficiency of data synchronization. In view of this, the purpose of the present application is to provide a data synchronization method, apparatus, device, and storage medium that automatically generate the table creation script and the synchronization script from a limited number of configured entries, thereby simplifying the data synchronization operation, improving development efficiency, and reducing human error.

Referring to FIG. 1, the data synchronization method provided by the present application includes the following steps:

Step S100: Obtain source library information that includes at least a source library name, a source library table name, and a source library table type, and parse the metadata contained in the source library corresponding to the source library information to obtain a source table structure.

In this embodiment, when metadata needs to be synchronized from the source library to the target library, it is only necessary to enter, in the interface of the synchronization tool (an application designed according to the data synchronization method), source library information that includes the source library name, the source library table name, the source library table type, the source table update field, the source table deduplication field, and the target library name. After the source library information is entered, the table creation script and the synchronization script are generated automatically, which keeps the data synchronization operation simple.

The source library in which the metadata is initially stored is an Oracle database (Oracle Database, also known as Oracle RDBMS or simply Oracle, is a relational database management system from Oracle Corporation), a MySQL database (MySQL is an open-source small relational database management system), or a PostgreSQL database (PostgreSQL is a free object-relational database server); the target library is a hive database (hive is a data warehouse tool based on Hadoop). These are common, easy-to-operate database management systems and tools, which facilitates the analysis and processing of data in this embodiment.

Step S200: Generate, according to the source table structure, a table creation script for establishing in the target library a first target library table that temporarily stores data and a second target library table that stores the same data as the source library.

In this embodiment, after the source library information is entered, the source table structure is automatically parsed according to the entered source library information, and the table creation script is automatically generated according to the source table structure and the preset table creation rules. To make the process of automatically generating the table creation script clearer, the preset table creation rules are described below.

The rules for source library tables are as follows. The source tables in the source library are divided into three types: the source table incremental table (also called the source library table incremental table), the source table flow table (also called the source library table flow table), and the source table full table (also called the source library table full table). In a source table incremental table, data is continuously updated and added, and historical data from before the current day may be updated on the current day; in a source table flow table, data is continuously added, and historical data from before the current day is not updated on the current day; a source table full table holds a small amount of data, as with some configuration tables or dimension tables.

The rules for creating tables in the target library are as follows: for each source table, a first target library table (the src table, that is, the source table) and a second target library table (the ods table, that is, the Operational Data Store table) are created in the target library. The src table serves as a temporary table and is one of a src incremental table (partitioned by day), a src flow table (partitioned by day), or a src full table (no partitioning). The ods table holds data consistent with the source table and is one of an ods incremental table (no partitioning; obtained by deduplicating all the data in the src incremental table), an ods flow table (partitioned by day, no deduplication), or an ods full table (no partitioning, no deduplication).
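The partitioning rules above can be sketched as a small generator of hive table creation statements. This is a minimal illustration under stated assumptions, not the implementation of the present application: the function name `build_create_table_scripts`, the `_src`/`_ods` suffixes (borrowed from the example table names n_bas_pc_sop_src and n_bas_pc_sop_ods used in this description), the partition column name `dt`, and typing every column as STRING are all hypothetical choices.

```python
from typing import List, Tuple

# Partitioning rules from the description: (src partitioned?, ods partitioned?)
PARTITION_RULES = {
    "incremental": (True, False),  # src by day, ods not partitioned
    "flow": (True, True),          # src by day, ods by day
    "full": (False, False),        # neither table partitioned
}


def build_create_table_scripts(target_db: str, table: str, table_type: str,
                               columns: List[Tuple[str, str]]) -> List[str]:
    """Return HiveQL CREATE TABLE statements for the src and ods tables.

    columns is a list of (column_name, column_comment) pairs. Every column is
    typed STRING purely to keep the sketch short (an assumption).
    """
    src_partitioned, ods_partitioned = PARTITION_RULES[table_type]
    cols = ",\n".join(
        f"  `{name}` STRING COMMENT '{comment}'" for name, comment in columns
    )
    scripts = []
    for suffix, partitioned in (("src", src_partitioned), ("ods", ods_partitioned)):
        ddl = f"CREATE TABLE IF NOT EXISTS {target_db}.{table}_{suffix} (\n{cols}\n)"
        if partitioned:
            ddl += "\nPARTITIONED BY (dt STRING)"  # hypothetical day-partition column
        scripts.append(ddl + ";")
    return scripts
```

Calling it with table type "incremental" yields a partitioned src table and an unpartitioned ods table, matching the rules stated above.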

With the above table creation rules preset, it is enough to enter, in the interface of the synchronization tool, the source library information that includes the source library name, the source library table name, the source library table type, the source table update field, the source table deduplication field, and the target library name, which simplifies operation.

Step S300: Obtain a table type of the first target library table, and generate a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table.

In this embodiment, the table type of the first target library table is one of an incremental table, a flow table, or a full table, and so is the table type of the second target library table; both are the same as the source library table type. In this way, the table types of the first target library table and the second target library table are obtained from the source library table type in the source library information.

In an embodiment, as shown in FIG. 2, the step S100 specifically includes:

Step S101: Obtain source library information that includes the source library name, the source library table name, the source library table type, the source table update field, the source table deduplication field, and the target library name.

Specifically, for example, when metadata needs to be synchronized from the table n_bas_pc_sop of the nesop database (nesop is the source library name; n_bas_pc_sop is the name of a source library table in that source library, and the data it contains is the metadata) to the hive database pa_nesop (pa_nesop is the name of the hive-type target library), the following information can be entered in the interface of the synchronization tool:

Source library name: nesop

Source library table name: n_bas_pc_sop

Source library table type: incremental table

Source table update field: update_date

Source table deduplication field: id

Target library name: pa_nesop.

In the entire configuration process, it is only necessary to complete the input of the above source library information, which greatly facilitates the user.
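The six entries above can be held in a small configuration record. The following is only a sketch: the class name `SourceLibraryInfo`, its field names, and the three type labels "incremental", "flow", and "full" are hypothetical names introduced for illustration and do not come from the present application.

```python
from dataclasses import dataclass

# Hypothetical labels for the three source library table types.
VALID_TABLE_TYPES = {"incremental", "flow", "full"}


@dataclass(frozen=True)
class SourceLibraryInfo:
    """The entries a user types into the synchronization tool."""
    source_library: str   # source library name
    source_table: str     # source library table name
    table_type: str       # source library table type
    update_field: str     # source table update field
    dedup_field: str      # source table deduplication field
    target_library: str   # target library name

    def __post_init__(self):
        # Reject a table type the generator would not know how to handle.
        if self.table_type not in VALID_TABLE_TYPES:
            raise ValueError(f"unknown table type: {self.table_type!r}")


# The example values from this embodiment.
info = SourceLibraryInfo(
    source_library="nesop",
    source_table="n_bas_pc_sop",
    table_type="incremental",
    update_field="update_date",
    dedup_field="id",
    target_library="pa_nesop",
)
```

Validating the table type at construction time means a mistyped entry fails before any script is generated.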

Step S102: Obtain metadata from a source library corresponding to the source library name in the source library information.

The metadata mainly consists of information from the front-end business system, such as the identity and deposit information of the customers of a banking system. It is described by a metadata information table made up of the table owner, table name, table comment, column name, column comment, column order, and other parts.

Step S103: Parse the metadata information table of the acquired metadata, and obtain the source table structure according to the metadata information table.

The table structure defines the fields, types, primary keys, foreign keys, and indexes of a table; these basic attributes form the table structure of the database. The source table structure is the table structure of the source table in the source library. In a business system, a change to the table structure is mainly reflected in a change to the column names, such as adding or modifying a customer's gender information column or income status information column, so this embodiment mainly compares the table owner and column name entries of the metadata information table.
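Step S103 can be pictured as grouping the rows of the metadata information table into one structure per table. The sketch below assumes the rows have already been fetched as dictionaries; the key names (`owner`, `table_name`, `table_comment`, `column_name`, `column_comment`, `column_order`) mirror the parts listed in this description but are hypothetical, as is the function name `parse_table_structure`.

```python
from typing import Dict, List


def parse_table_structure(rows: List[Dict], owner: str, table_name: str) -> Dict:
    """Collect one table's columns, in column order, from metadata rows."""
    matching = [
        r for r in rows
        if r["owner"] == owner and r["table_name"] == table_name
    ]
    if not matching:
        raise LookupError(f"no metadata for {owner}.{table_name}")
    # Restore the column order recorded in the metadata information table.
    matching.sort(key=lambda r: r["column_order"])
    return {
        "owner": owner,
        "table_name": table_name,
        "table_comment": matching[0]["table_comment"],
        "columns": [(r["column_name"], r["column_comment"]) for r in matching],
    }
```

The resulting `columns` list is the kind of input a table creation script generator would need.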

In an embodiment, as shown in FIG. 3, the step S300 includes:

Step S301: Obtain the table type of the first target library table, and determine whether it is an incremental table, a flow table, or a full table;

Step S302: When the table type of the first target library table is an incremental table, correspondingly generate a first sqoop synchronization script and hive program; the first sqoop synchronization script and hive program are used to synchronize the metadata from the source library into a specified partition of the first target library table, and then to deduplicate the metadata in the first target library table according to the source table deduplication field and store it in the second target library table.

According to the table creation script, the first target library table (n_bas_pc_sop_src) and the second target library table (n_bas_pc_sop_ods) are created automatically. When the first target library table (n_bas_pc_sop_src) is determined to be an incremental table (an incremental table is a partitioned table, partitioned by day), the program automatically generates the first sqoop synchronization script and hive program, which synchronize the data from the source library into a new partition of the first target library table (n_bas_pc_sop_src) and then deduplicate all the data in the first target library table (n_bas_pc_sop_src) according to the source table deduplication field (id) and store it in the second target library table (n_bas_pc_sop_ods). The execution period of the first sqoop synchronization script and hive program is 24 h (that is, once a day).
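One way the hive program for the incremental case could look is a HiveQL statement that keeps a single row per deduplication key. The sketch below generates such a statement. Keeping the row with the latest value of the update field is an assumption (the description only states that deduplication uses the deduplication field), and the function name `build_dedup_hql` and the `_src`/`_ods` naming are illustrative.

```python
from typing import List


def build_dedup_hql(target_db: str, table: str, columns: List[str],
                    dedup_field: str, update_field: str) -> str:
    """Build HiveQL that rewrites the ods table from the deduplicated src table."""
    col_list = ", ".join(columns)
    return (
        f"INSERT OVERWRITE TABLE {target_db}.{table}_ods\n"
        f"SELECT {col_list}\n"
        f"FROM (\n"
        f"  SELECT {col_list},\n"
        # Rank rows per key; rank 1 is the most recently updated row (assumption).
        f"         ROW_NUMBER() OVER (PARTITION BY {dedup_field} "
        f"ORDER BY {update_field} DESC) AS rn\n"
        f"  FROM {target_db}.{table}_src\n"
        f") ranked\n"
        f"WHERE rn = 1;"
    )


hql = build_dedup_hql("pa_nesop", "n_bas_pc_sop", ["id", "update_date"],
                      dedup_field="id", update_field="update_date")
```

Listing the columns explicitly keeps the helper column `rn` and any partition column of the src table out of the ods table.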

In order to understand the sqoop synchronization script more clearly, the following is explained by a specific embodiment. For example, execute the sqoop command to import data from mysql into hive. The instructions are as follows:

sqoop import --connect jdbc:mysql://10.1.11.78:3306/video --table base_event --username root --password 123456 -m 1 --hive-import --hive-database video --hive-table base_event --hive-overwrite --fields-terminated-by "\t" --lines-terminated-by "\n" --as-textfile

Here:

sqoop import: execute the sqoop import instruction;

--connect jdbc:mysql://hostname:port/database: the address, port number, and database name of the database to connect to;

--table base_event: the database table to be operated on;

--username root: the username for connecting to the database;

--password 123456: the password for connecting to the database;

-m 1: the number of map tasks to start;

--hive-import: import the data into hive;

[--create-hive-table]: if the imported table does not exist in hive, sqoop automatically creates it in hive; if the table already exists, adding this option causes the command to report an error;

--hive-database XXX: the hive database into which the database table is imported;

--hive-table base_XXX: the hive table into which the database table is imported;

--hive-overwrite: if the hive table already contains data, adding this option overwrites the original data;

--fields-terminated-by "\t": the separator between fields in the files that hive stores in hdfs;

--lines-terminated-by "\n": the separator between lines in the files that hive stores in hdfs;

--as-textfile: the format of the files that hive stores in hdfs, here plain text.

That is, the synchronization script can be automatically generated according to the source library information entered and the preset table construction rules.
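As an illustration of how such a command might be assembled from the entered information, the sketch below builds the argument list for the example import shown above. It uses only the sqoop options explained in this description. The function name `build_sqoop_import` and its parameters are hypothetical, and passing the password on the command line simply mirrors the example.

```python
import shlex
from typing import List


def build_sqoop_import(jdbc_url: str, source_table: str, username: str,
                       password: str, hive_db: str, hive_table: str,
                       num_mappers: int = 1) -> List[str]:
    """Return the sqoop import command as an argument list."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", source_table,
        "--username", username,
        "--password", password,
        "-m", str(num_mappers),
        "--hive-import",
        "--hive-database", hive_db,
        "--hive-table", hive_table,
        "--hive-overwrite",
        # The delimiters are passed as backslash escapes for sqoop to interpret.
        "--fields-terminated-by", "\\t",
        "--lines-terminated-by", "\\n",
        "--as-textfile",
    ]


args = build_sqoop_import("jdbc:mysql://10.1.11.78:3306/video", "base_event",
                          "root", "123456", "video", "base_event")
# Quote each argument so the line can be written into a shell script.
command_line = " ".join(shlex.quote(a) for a in args)
```

Building a list first and quoting it at the end avoids hand-assembled strings, which is where errors such as missing spaces between options tend to creep in.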

In an embodiment, as shown in FIG. 3, the step S300 further includes:

Step S303: When the table type of the first target library table is a flow table, correspondingly generate a second sqoop synchronization script and hive program; the second sqoop synchronization script and hive program are used to synchronize the metadata from the source library into a specified partition of the first target library table, and then to store the metadata in the first target library table in the second target library table.

When the first target library table (n_bas_pc_sop_src) is determined to be a flow table (a flow table is a partitioned table, partitioned by day), the program automatically generates the second sqoop synchronization script and hive program, which each day synchronize the data from the source library into a new partition of the first target library table (n_bas_pc_sop_src) and then store all the data in the first target library table (n_bas_pc_sop_src) directly in the second target library table (n_bas_pc_sop_ods) without deduplication. The execution period of the second sqoop synchronization script and hive program is 24 h (that is, once a day).

In an embodiment, as shown in FIG. 3, the step S300 further includes:

Step S304: When the table type of the first target library table is a full table, correspondingly generate a third sqoop synchronization script and hive program; the third sqoop synchronization script and hive program are used to synchronize the metadata from the source library to the first target library table, and then to store the metadata in the first target library table in the second target library table.

When the first target library table (n_bas_pc_sop_src) is a full table (a full table is a non-partitioned table), the program automatically generates the third sqoop synchronization script and hive program, which each day synchronize the full table data of the source library directly to the first target library table (n_bas_pc_sop_src) without deduplication and then store all the data in the first target library table (n_bas_pc_sop_src) directly in the second target library table (n_bas_pc_sop_ods). The execution period of the third sqoop synchronization script and hive program is 24 h (that is, once a day).
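For the flow table and full table cases, which need no deduplication, the hive program reduces to a plain copy from the src table to the ods table. The sketch below shows one possible shape. The function name `build_copy_hql`, the partition column `dt`, and the choice to copy only the current day's partition for a flow table are assumptions made for illustration.

```python
from typing import List, Optional


def build_copy_hql(target_db: str, table: str, table_type: str,
                   columns: List[str], day: Optional[str] = None) -> str:
    """Build HiveQL that copies src data to the ods table without deduplication."""
    col_list = ", ".join(columns)
    src = f"{target_db}.{table}_src"
    ods = f"{target_db}.{table}_ods"
    if table_type == "flow":
        if day is None:
            raise ValueError("a flow table needs the day partition to copy")
        # Both tables are partitioned by day, so copy partition to partition.
        return (
            f"INSERT OVERWRITE TABLE {ods} PARTITION (dt='{day}')\n"
            f"SELECT {col_list} FROM {src} WHERE dt='{day}';"
        )
    if table_type == "full":
        # Neither table is partitioned, so the whole table is replaced.
        return f"INSERT OVERWRITE TABLE {ods}\nSELECT {col_list} FROM {src};"
    raise ValueError(f"no plain copy for table type: {table_type!r}")
```

The incremental case is deliberately excluded here because it requires deduplication before the data reaches the ods table.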

Please continue to refer to FIG. 4, which is a schematic diagram of the internal structure of a computer device in an embodiment. The computer device may be a terminal or a server; the terminal may be a communication device such as a smartphone, a tablet computer, a notebook computer, or a desktop computer, and the server may be a standalone server or a server cluster consisting of multiple servers. Referring to FIG. 4, the computer device includes a processor, a non-volatile storage medium, an internal memory, and a network interface connected by a system bus. The non-volatile storage medium of the computer device can store an operating system and a computer readable program that, when executed, causes the processor to perform a data synchronization method. The processor of the computer device provides computing and control capabilities to support the operation of the entire computer device. The internal memory can store a computer readable program that, when executed by the processor, causes the processor to perform a data synchronization method. The network interface of the computer device is used for network communication, such as sending assigned tasks. It will be understood by those skilled in the art that the structure shown in FIG. 4 does not constitute a limitation on the computer device to which the present application is applied, and a specific computer device may include more or fewer components than those shown in the figure, combine some components, or have a different component arrangement.

The application also provides a data synchronization device including a processor 10, a memory 20, and a display 30. FIG. 4 shows only some of the components of the data synchronization device; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.

The memory 20 may, in some embodiments, be an internal storage unit of the data synchronization device, such as a hard disk or memory of the data synchronization device. In other embodiments, the memory 20 may be an external storage device of the data synchronization device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) fitted to the data synchronization device. Further, the memory 20 may include both an internal storage unit and an external storage device of the data synchronization device. The memory 20 is configured to store application software installed on the data synchronization device and various types of data, such as the program code installed on the data synchronization device. The memory 20 can also be used to temporarily store data that has been output or is about to be output. In one embodiment, a data synchronization program 40 is stored on the memory 20, and the data synchronization program 40 can be executed by the processor 10 to implement the data synchronization method of the various embodiments of the present application.

The processor 10 is configured to run program code or process data stored in the memory 20, for example to execute the data synchronization method. The display 30 is used to display information processed in the data synchronization device and to display a visual user interface, such as an assignment information interface, an authentication report interface, and the like. The components 10-30 of the data synchronization device communicate with one another via a system bus.

In an embodiment, the various steps of the data synchronization method described above are implemented when the processor 10 executes the data synchronization program 40 in the memory 20.

Please refer to FIG. 5, which is a functional block diagram of a data synchronization apparatus for implementing the data synchronization method of the present application. In this embodiment, the data synchronization apparatus can be divided into an input parsing module 31, a table creation script generation module 32, and a synchronization script generation module 33:

The input parsing module 31 is configured to obtain source library information that includes at least a source library name, a source library table name, and a source library table type, and to parse the metadata contained in the source library corresponding to the source library information to obtain a source table structure;

The table creation script generation module 32 is configured to generate, according to the source table structure, a table creation script for establishing in the target library a first target library table that temporarily stores data and a second target library table that stores the same data as the source library;

The synchronization script generating module 33 is configured to obtain a table type of the first target library table, and correspondingly generate a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table.

Optionally, the step of obtaining the table type of the first target library table and correspondingly generating a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table includes steps S301 and S302 described above.

Optionally, the step of obtaining the table type of the first target library table and correspondingly generating a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table also includes steps S303 and S304 described above.

Based on the above data synchronization method, apparatus, and device, the present application further provides a data synchronization system. Referring to FIG. 6, the system includes a plurality of source databases 110, a target database 120, and a data synchronization device 130.

The metadata of the plurality of source databases 110 are processed by the data synchronization device 130 and uploaded to the target database 120 by the automatically generated table creation script and synchronization script.

Based on the above data synchronization method, apparatus, and device, the present application also provides a storage medium accordingly. The storage medium stores one or more programs that can be executed by one or more processors to implement the various steps of the data synchronization method described above.

A person skilled in the art can understand that all or part of the processes of the above embodiments can be completed by a computer program instructing related hardware, and that the computer program can be stored in a non-volatile computer readable storage medium. When executed, the computer program may include the flow of the method embodiments described above. The computer readable storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (Read-Only Memory, ROM).

In summary, the present application automatically generates a table creation script and a synchronization script by configuring a limited number of entries, thereby simplifying data synchronization operations, improving development efficiency, and reducing human error.
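To make the idea of a limited number of configured entries concrete, the sketch below shows how a synchronization plan might be derived from them for each table type. It is an illustration under stated assumptions, not the generator of the present application: the `SyncConfig` field names, the `_stg` suffix, the `dt` partition column, and the choice of `INSERT INTO` versus `INSERT OVERWRITE` are all hypothetical. Only standard Sqoop 1 import options (`--connect`, `--table`, `--hive-import`, `--hive-table`, `--hive-partition-key`, `--hive-partition-value`) and standard HiveQL are used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SyncConfig:
    """The configured entries; field names are illustrative."""
    source_db: str
    source_table: str
    table_type: str      # "incremental", "flow", or "full"
    update_field: str
    dedup_field: str
    target_db: str


def build_sqoop_args(cfg: SyncConfig, jdbc_url: str, dt: str) -> List[str]:
    """Sqoop 1 import into the first (staging) target table."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", cfg.source_table,
        "--hive-import",
        "--hive-table", f"{cfg.target_db}.{cfg.source_table}_stg",
    ]
    if cfg.table_type in ("incremental", "flow"):
        # Incremental and flow tables load into a specified partition.
        args += ["--hive-partition-key", "dt", "--hive-partition-value", dt]
    return args


def build_hive_statement(cfg: SyncConfig, columns: List[str], dt: str) -> str:
    """HiveQL that moves staged rows into the second target table."""
    stg = f"{cfg.target_db}.{cfg.source_table}_stg"
    final = f"{cfg.target_db}.{cfg.source_table}"
    cols = ", ".join(columns)
    if cfg.table_type == "incremental":
        # Keep one row per deduplication key, preferring the latest update.
        return (
            f"INSERT INTO TABLE {final}\n"
            f"SELECT {cols} FROM (\n"
            f"  SELECT {cols}, ROW_NUMBER() OVER (\n"
            f"    PARTITION BY {cfg.dedup_field} ORDER BY {cfg.update_field} DESC\n"
            f"  ) AS rn\n"
            f"  FROM {stg} WHERE dt = '{dt}'\n"
            f") t WHERE rn = 1"
        )
    if cfg.table_type == "flow":
        # Flow tables are copied from the partition without deduplication.
        return f"INSERT INTO TABLE {final}\nSELECT {cols} FROM {stg} WHERE dt = '{dt}'"
    if cfg.table_type == "full":
        # Full tables replace the previous contents (an assumption).
        return f"INSERT OVERWRITE TABLE {final}\nSELECT {cols} FROM {stg}"
    raise ValueError(f"unknown table type: {cfg.table_type}")
```

The two functions are kept separate so that the import command and the Hive statement can be written to separate script files and scheduled together, for example once every 24 hours.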

Those skilled in the art can also understand that all or part of the processes of the above embodiments may be implemented by a computer program instructing related hardware (such as a processor or a controller), and that the program may be stored in a computer readable storage medium. The program, when executed, may include the flows of the method embodiments described above. The computer readable storage medium may be a memory, a magnetic disk, an optical disk, or the like.

It should be understood that the present application is not limited to the above examples. Those skilled in the art can make modifications and changes in light of the above description, and all such modifications and changes fall within the scope of the appended claims.

Claims

A data synchronization method, comprising the following steps:

Obtaining input source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain a source table structure;

Generating, according to the source table structure, a table creation script for establishing, in a target library, a first target library table for temporarily storing data and a second target library table for storing the same data as the source library;

Obtaining a table type of the first target library table, and correspondingly generating a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence.
The data synchronization method according to claim 1, wherein the table type of the first target library table is one of an incremental table, a flow table, or a full table; and the table type of the second target library table is one of an incremental table, a flow table, or a full table.
The data synchronization method according to claim 2, wherein the step of obtaining input source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain the source table structure comprises:

Obtaining source library information that includes the source library name, the source library table name, the source library table type, a source table update field, a source table deduplication field, and a target library name;

Obtaining metadata from the source library corresponding to the source library name in the source library information;

Parsing the metadata information table of the obtained metadata, and obtaining the source table structure according to the metadata information table.
The data synchronization method according to claim 3, wherein the step of obtaining the table type of the first target library table and correspondingly generating the synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence comprises:

Obtaining the table type of the first target library table, and determining whether the table type of the first target library table is an incremental table, a flow table, or a full table;

When the table type of the first target library table is an incremental table, correspondingly generating a first sqoop synchronization script and hive program, the first sqoop synchronization script and hive program being used to synchronize the metadata from the source library into a specified partition of the first target library table, deduplicate the metadata in the first target library table according to the source table deduplication field, and then store the deduplicated metadata in the second target library table.
The data synchronization method according to claim 4, wherein the step of obtaining the table type of the first target library table and correspondingly generating the synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence further comprises:

When the table type of the first target library table is a flow table, correspondingly generating a second sqoop synchronization script and hive program, the second sqoop synchronization script and hive program being used to synchronize the metadata from the source library into a specified partition of the first target library table and then store the metadata in the first target library table into the second target library table.
The data synchronization method according to claim 5, wherein the step of obtaining the table type of the first target library table and correspondingly generating the synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence further comprises:

When the table type of the first target library table is a full table, correspondingly generating a third sqoop synchronization script and hive program, the third sqoop synchronization script and hive program being used to synchronize the metadata from the source library into the first target library table and then store the metadata in the first target library table into the second target library table.
The data synchronization method according to claim 6, wherein the execution period of each of the first sqoop synchronization script and hive program, the second sqoop synchronization script and hive program, and the third sqoop synchronization script and hive program is 24 hours.
The data synchronization method according to claim 1, wherein the metadata information table corresponding to the metadata includes at least a table owner, a table name, a table comment, a column name, a column comment, and a column order.
A data synchronization apparatus, comprising:

an input parsing module, configured to obtain input source library information that includes at least a source library name, a source library table name, and a source library table type, and to parse the metadata contained in the source library corresponding to the source library information to obtain a source table structure;

a table creation script module, configured to generate, according to the source table structure, a table creation script for establishing, in a target library, a first target library table for temporarily storing data and a second target library table for storing the same data as the source library;

a synchronization script generation module, configured to obtain a table type of the first target library table, and correspondingly generate a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence.
The data synchronization apparatus according to claim 9, wherein the table type of the first target library table is one of an incremental table, a flow table, or a full table; and the table type of the second target library table is one of an incremental table, a flow table, or a full table.
The data synchronization apparatus according to claim 10, wherein obtaining the input source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain the source table structure comprises:

Obtaining source library information that includes the source library name, the source library table name, the source library table type, a source table update field, a source table deduplication field, and a target library name;

Obtaining metadata from the source library corresponding to the source library name in the source library information;

Parsing the metadata information table of the obtained metadata, and obtaining the source table structure according to the metadata information table.
The data synchronization apparatus according to claim 11, wherein obtaining the table type of the first target library table and correspondingly generating the synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence comprises:

Obtaining the table type of the first target library table, and determining whether the table type of the first target library table is an incremental table, a flow table, or a full table;

When the table type of the first target library table is an incremental table, correspondingly generating a first sqoop synchronization script and hive program, the first sqoop synchronization script and hive program being used to synchronize the metadata from the source library into a specified partition of the first target library table, deduplicate the metadata in the first target library table according to the source table deduplication field, and then store the deduplicated metadata in the second target library table.
A data synchronization device, comprising a processor, a memory, and a communication bus, the memory storing a computer readable program executable by the processor;

The communication bus implements connection and communication between the processor and the memory;

When the processor executes the computer readable program, the following steps are implemented:

Obtaining input source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain a source table structure;

Generating, according to the source table structure, a table creation script for establishing, in a target library, a first target library table for temporarily storing data and a second target library table for storing the same data as the source library;

Obtaining a table type of the first target library table, and correspondingly generating a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence.
The data synchronization device according to claim 13, wherein the table type of the first target library table is one of an incremental table, a flow table, or a full table; and the table type of the second target library table is one of an incremental table, a flow table, or a full table.
The data synchronization device according to claim 14, wherein the step of obtaining input source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain the source table structure comprises:

Obtaining source library information that includes the source library name, the source library table name, the source library table type, a source table update field, a source table deduplication field, and a target library name;

Obtaining metadata from the source library corresponding to the source library name in the source library information;

Parsing the metadata information table of the obtained metadata, and obtaining the source table structure according to the metadata information table.
The data synchronization device according to claim 15, wherein the step of obtaining the table type of the first target library table and correspondingly generating the synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence comprises:

Obtaining the table type of the first target library table, and determining whether the table type of the first target library table is an incremental table, a flow table, or a full table;

When the table type of the first target library table is an incremental table, correspondingly generating a first sqoop synchronization script and hive program, the first sqoop synchronization script and hive program being used to synchronize the metadata from the source library into a specified partition of the first target library table, deduplicate the metadata in the first target library table according to the source table deduplication field, and then store the deduplicated metadata in the second target library table.
A storage medium, wherein the storage medium stores one or more programs, the one or more programs being executable by one or more processors to implement the following steps:

Obtaining input source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain a source table structure;

Generating, according to the source table structure, a table creation script for establishing, in a target library, a first target library table for temporarily storing data and a second target library table for storing the same data as the source library;

Obtaining a table type of the first target library table, and correspondingly generating a synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence.
The storage medium of claim 17, wherein the table type of the first target library table is one of an incremental table, a flow table, or a full table; and the table type of the second target library table is one of an incremental table, a flow table, or a full table.
The storage medium of claim 18, wherein the step of obtaining input source library information that includes at least a source library name, a source library table name, and a source library table type, and parsing the metadata contained in the source library corresponding to the source library information to obtain the source table structure comprises:

Obtaining source library information that includes the source library name, the source library table name, the source library table type, a source table update field, a source table deduplication field, and a target library name;

Obtaining metadata from the source library corresponding to the source library name in the source library information;

Parsing the metadata information table of the obtained metadata, and obtaining the source table structure according to the metadata information table.
The storage medium of claim 18, wherein the step of obtaining the table type of the first target library table and correspondingly generating the synchronization script for synchronizing the metadata from the source library to the target library through the first target library table and the second target library table in sequence comprises:

Obtaining the table type of the first target library table, and determining whether the table type of the first target library table is an incremental table, a flow table, or a full table;

When the table type of the first target library table is an incremental table, correspondingly generating a first sqoop synchronization script and hive program, the first sqoop synchronization script and hive program being used to synchronize the metadata from the source library into a specified partition of the first target library table, deduplicate the metadata in the first target library table according to the source table deduplication field, and then store the deduplicated metadata in the second target library table.