WO2022127866A1

WO2022127866A1 - Data processing method and apparatus, and electronic device and storage medium

Info

Publication number: WO2022127866A1
Application number: PCT/CN2021/138821
Authority: WO
Inventors: 买建华; 刘志文; 付裕; 黄健; 许振华; 李从兵
Original assignee: 中兴通讯股份有限公司
Priority date: 2020-12-17
Filing date: 2021-12-16
Publication date: 2022-06-23
Also published as: CN114647659A

Abstract

A data processing method and apparatus, and an electronic device and a storage medium, which relate to the field of databases. The method comprises: acquiring a logical transaction log of a first node (101); acquiring SQL statements according to the logical transaction log (102); and merging SQL statements that satisfy a preset condition so as to generate a merged SQL statement, such that a second node plays back the merged SQL statement concurrently (103).

Description

Data processing method and apparatus, electronic device, and storage medium

Cross reference

This application is based on, and claims priority to, Chinese patent application No. 202011498751.2, filed on December 17, 2020, the entire content of which is hereby incorporated by reference.

Technical field

The embodiments of the present application relate to the field of databases, and in particular, to a data processing method, apparatus, electronic device, and storage medium.

Background

During data redistribution, the data of an old node is distributed to the corresponding new nodes according to configured distribution rules. While this is in progress, services continue to commit data on the old node, so the data written during this period must also be appended to the new nodes; this is referred to as catching up on incremental data. In the incremental catch-up stage late in data redistribution, the database cluster parses the logs of the old node to obtain Structured Query Language (SQL) statements, and plays back the parsed SQL statements on the new nodes of the distributed database cluster to apply the incremental data.

However, during the incremental catch-up stage late in data redistribution, service concurrency is high and the write load is heavy. If SQL playback is too slow, incremental catch-up cannot keep pace with service writes, and the catch-up fails.

SUMMARY OF THE INVENTION

An embodiment of the present application provides a data processing method, including: acquiring a logical transaction log of a first node; acquiring SQL statements according to the logical transaction log; and merging the SQL statements that satisfy a preset condition to generate a merged SQL statement, so that a second node plays back the merged SQL statement concurrently.

An embodiment of the present application also provides a data processing apparatus, including: a log acquisition module, configured to acquire a logical transaction log of a first node; an SQL statement acquisition module, configured to acquire SQL statements according to the logical transaction log; and an SQL statement merging module, configured to merge the SQL statements that satisfy a preset condition to generate a merged SQL statement, so that a second node plays back the merged SQL statement concurrently.

An embodiment of the present application further provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the data processing method described above.

Embodiments of the present application further provide a computer-readable storage medium storing a computer program, and when the computer program is executed by a processor, the foregoing data processing method is implemented.

Description of the drawings

FIG. 1 is a flowchart of a data processing method according to a first embodiment of the present application;

FIG. 2 is a flowchart of a data processing method according to a second embodiment of the present application;

FIG. 3 is a schematic diagram of incremental data catch-up according to the second embodiment of the present application;

FIG. 4 is a schematic diagram of a data processing apparatus according to a third embodiment of the present application;

FIG. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present application.

Detailed description

In order to make the objectives, technical solutions and advantages of the embodiments of the present application more clear, each embodiment of the present application will be described in detail below with reference to the accompanying drawings. However, those of ordinary skill in the art can understand that, in each embodiment of the present application, many technical details are provided for the reader to better understand the present application. However, even without these technical details and various changes and modifications based on the following embodiments, the technical solutions claimed in the present application can be realized. The following divisions of the various embodiments are for the convenience of description, and should not constitute any limitation on the specific implementation of the present application, and the various embodiments may be combined with each other and referred to each other on the premise of not contradicting each other.

The first embodiment of the present application relates to a data processing method, which can be applied to electronic devices such as servers. This embodiment includes: acquiring the logical transaction log of a first node; acquiring SQL statements according to the logical transaction log; and merging the SQL statements that satisfy a preset condition to generate a merged SQL statement, so that a second node plays back the merged SQL statement concurrently. By merging SQL statements, this embodiment reduces the number of SQL statements output to the new node during incremental catch-up, and therefore the number of SQL statements run on the database node. Merging also reduces the coupling between SQL statements, which increases SQL playback efficiency and improves the success rate of incremental catch-up.

In one example, the cluster manager receives a redistribution request from an upper-layer service and performs a full export of the data to be redistributed. The exported data is split into files according to the number of lines per split file and the number of files per distribution batch configured in the cluster manager. For each split file, the distribution field and its column data are verified, a calculation object is constructed from the verified distribution field and column data, the distribution algorithm computes the destination node, and the data is written for that node. When the number of split files reaches the distribution batch size, the split files are sent, so that the data of the old node is imported into each configured new node.
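The splitting and routing described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the distribution algorithm, so a CRC32-modulo rule stands in for it, and the names `split_rows`, `destination_node`, `route_file`, and `lines_per_file` are hypothetical.

```python
import zlib
from typing import Dict, Iterator, List, Sequence


def split_rows(rows: Sequence[dict], lines_per_file: int) -> Iterator[List[dict]]:
    """Yield consecutive chunks of the full export; each chunk stands in for one split file."""
    for start in range(0, len(rows), lines_per_file):
        yield list(rows[start:start + lines_per_file])


def destination_node(distribution_value: object, nodes: Sequence[str]) -> str:
    """Assumed distribution algorithm: CRC32 of the distribution field value, modulo node count."""
    digest = zlib.crc32(str(distribution_value).encode("utf-8"))
    return nodes[digest % len(nodes)]


def route_file(split_file: List[dict], field: str, nodes: Sequence[str]) -> Dict[str, List[dict]]:
    """Verify the distribution field of each row, then group the rows by destination node."""
    routed: Dict[str, List[dict]] = {node: [] for node in nodes}
    for row in split_file:
        if row.get(field) is None:
            raise ValueError(f"row lacks distribution field {field!r}: {row}")
        routed[destination_node(row[field], nodes)].append(row)
    return routed


rows = [{"id": i, "doc": f"d{i}"} for i in range(10)]
nodes = ["new-node-1", "new-node-2"]
files = list(split_rows(rows, lines_per_file=4))
routed = [route_file(split_file, "id", nodes) for split_file in files]
```

With ten exported rows and four lines per file, the sketch produces three split files, and every row ends up assigned to exactly one new node.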

The cluster manager receives cluster-related requests from upper-layer services, such as the data redistribution request above, manages the distributed cluster, coordinates the database (DB) status reports of the resource managers, and instructs the resource managers to execute commands such as switchover, backup, and redistribution. The resource manager is usually the upper-level agent of a database: a local database monitoring program that performs complex operations on the database in response to upper-level requests. In this embodiment, its main functions are to respond to redistribution requests from the cluster manager, execute the redistribution process, split SQL statements according to the distribution rules during incremental catch-up, and connect to the DB to play back the SQL. Both the first node and the second node are database nodes, which are the basic nodes for storing data.

During the data redistribution described above, services continue to commit data on the old node, so the data written during this period must be appended to the new node. This embodiment provides a data processing method for this incremental data that improves the efficiency of incremental catch-up.

The flowchart of the data processing method of this embodiment is shown in FIG. 1 .

Step 101: Obtain the logical transaction log of the first node.

Exemplarily, the resource manager obtains the logical transaction log (binlog) of the first node.

Step 102: Obtain the SQL statement according to the logical transaction log.

Exemplarily, the binlog may be parsed by an SQL combiner. The SQL combiner first caches each SQL statement parsed from the binlog in the memory of the server. The SQL combiner can be understood as a process: each database node under the resource manager corresponds to one SQL combiner, that is, the SQL combiner is the process that parses the logical transaction log of the first node and merges SQL statements during incremental catch-up for the first node.

Step 103: Merge the SQL statements that satisfy a preset condition to generate a merged SQL statement, so that the second node plays back the merged SQL statement concurrently.

In one example, the SQL statements that satisfy the preset condition are SQL statements that operate on the same primary key of the same database table. In this implementation, SQL statements that operate on the same primary key of the same database table are merged. Because a primary key identifies one row of data, the merged statements all operate on different rows, that is, the row operations of the SQL statements no longer depend on one another. This further reduces the strong coupling between the original SQL statements, facilitates batch concurrent playback, shortens playback time, and improves the performance of incremental catch-up.

Exemplarily, each time a new SQL statement is parsed from the logical transaction log, the SQL combiner first caches it in the memory of the server and checks whether the memory holds another SQL statement that operates on the same primary key of the same database table as the newly parsed statement. If so, the two SQL statements are merged and the merged SQL statement is kept in memory, so that the next parsed SQL statement can be merged against the statements stored in memory. For example, suppose the memory holds SQL statements s1, s2, and s3, and the statement currently parsed from the logical transaction log is s4. s4 is put into memory, and s1, s2, and s3 are checked for a statement that can be merged with s4. If s4 can be merged with s1, the two are merged to generate s5, which is put into memory in their place. The memory then holds s2, s3, and s5.
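A minimal sketch of this caching step is shown below. It assumes the cache is keyed by (table, primary key value), which the text implies but does not state; `cache_statement` and `label_merge` are hypothetical names, and the merge function only labels the result instead of building real SQL.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, int]  # (qualified table name, primary key value)


def cache_statement(cache: Dict[Key, str], key: Key, stmt: str,
                    merge: Callable[[str, str], str]) -> None:
    """Cache a newly parsed statement, merging it with any cached
    statement that targets the same table and primary key."""
    if key in cache:
        cache[key] = merge(cache[key], stmt)
    else:
        cache[key] = stmt


def label_merge(old: str, new: str) -> str:
    """Stand-in merge that records which statements were combined."""
    return f"merge({old},{new})"


# Mirror of the s1..s5 example: s4 targets the same row as s1.
cache: Dict[Key, str] = {}
parsed = [(("db1.tb1", 1), "s1"), (("db1.tb1", 2), "s2"),
          (("db1.tb2", 1), "s3"), (("db1.tb1", 1), "s4")]
for key, stmt in parsed:
    cache_statement(cache, key, stmt, label_merge)
```

After the loop the cache holds three entries: s2, s3, and the merge of s1 and s4, which plays the role of s5.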

In one example, the execution times in the logical transaction log of the SQL statements that satisfy the preset condition are determined, and those SQL statements are merged according to the determined execution times to obtain the merged SQL statement. For example, for SQL statements that operate on the same primary key of the same table, if the first SQL statement executes earlier and the second executes later, the merged SQL statement has the same effect as executing the first and second SQL statements in chronological order. In this implementation, statements are merged according to their execution times during incremental catch-up, which reduces the dependence of the SQL statements on execution order: statements that originally had to run in sequence can now run in any order. Removing this ordering relationship further reduces the strong coupling between SQL statements, facilitates parallel playback, and creates room to improve SQL playback efficiency.

In one example, the verb and field values of the merged SQL statement are determined according to the SQL statements that satisfy the preset condition and their execution times, and the merged SQL statement is generated from the determined verb and field values. In this implementation, the SQL statements are merged according to their verbs and execution times, so that the merged SQL statement has the same effect as executing the original SQL statements in order of execution time. The verbs of SQL statements include INSERT, UPDATE, and DELETE.

Exemplarily, suppose the two SQL statements that satisfy the preset condition are a first SQL statement and a second SQL statement, and the first SQL statement executes earlier in the logical transaction log than the second. Then:

If the verb of the first SQL statement is INSERT and the verb of the second SQL statement is UPDATE, the verb of the merged SQL statement is INSERT;

If the verb of the first SQL statement is UPDATE and the verb of the second SQL statement is UPDATE, the verb of the merged SQL statement is UPDATE;

If the verb of the first SQL statement is UPDATE and the verb of the second SQL statement is DELETE, the verb of the merged SQL statement is DELETE;

If the verb of the first SQL statement is DELETE and the verb of the second SQL statement is INSERT, the verb of the merged SQL statement is UPDATE.

It should be noted that, if the verb of the first SQL statement is INSERT and the verb of the second SQL statement is DELETE, merging the first SQL statement and the second SQL statement produces no SQL statement.
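The verb rules above can be written as a lookup table. The sketch below is one possible encoding, not the applicant's code; `MERGED_VERB` and `merged_verb` are hypothetical names, and pairs the text does not list are rejected instead of guessed.

```python
from typing import Dict, Optional, Tuple

# (verb of the earlier statement, verb of the later statement) -> merged verb.
# None means the pair cancels out and no statement is generated.
MERGED_VERB: Dict[Tuple[str, str], Optional[str]] = {
    ("INSERT", "UPDATE"): "INSERT",
    ("UPDATE", "UPDATE"): "UPDATE",
    ("UPDATE", "DELETE"): "DELETE",
    ("DELETE", "INSERT"): "UPDATE",
    ("INSERT", "DELETE"): None,
}


def merged_verb(first: str, second: str) -> Optional[str]:
    """Return the verb of the merged statement for two statements on the same row."""
    pair = (first.upper(), second.upper())
    if pair not in MERGED_VERB:
        raise ValueError(f"pair not covered by the listed rules: {pair}")
    return MERGED_VERB[pair]
```

For example, `merged_verb("DELETE", "INSERT")` returns `"UPDATE"`, and `merged_verb("INSERT", "DELETE")` returns `None`.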

Take table tb1 in database db1 as an example. db1.tb1 has only three fields, a, b, and c, all of type int (integer), and a is the primary key. In each case below, the first SQL statement executes earlier than the second SQL statement.

The first SQL statement: INSERT INTO db1.tb1 VALUES(1,2,3), that is, insert a row with a=1, b=2, c=3 into db1.tb1; the second SQL statement: UPDATE db1.tb1 SET a=4,b=5,c=6 WHERE a=1, that is, update the values of a, b, and c in the row with primary key a=1 in db1.tb1 to a=4, b=5, c=6. The merged SQL statement retains the latest data: INSERT INTO db1.tb1 VALUES(4,5,6), that is, insert a row with a=4, b=5, c=6 into db1.tb1.

The first SQL statement: UPDATE db1.tb1 SET a=4,b=5,c=6 WHERE a=1, that is, update the values of a, b, and c in the row with primary key a=1 in db1.tb1 to a=4, b=5, c=6; the second SQL statement: UPDATE db1.tb1 SET a=7,b=8,c=9 WHERE a=4, that is, update the values of a, b, and c in the row with primary key a=4 in db1.tb1 to a=7, b=8, c=9. The merged SQL statement: UPDATE db1.tb1 SET a=7,b=8,c=9 WHERE a=1, that is, update the values of a, b, and c in the row with primary key a=1 in db1.tb1 to a=7, b=8, c=9. The merged SQL statement keeps the column value from the before image of the first SQL statement, that is, a=1 after WHERE, and the column values from the after image of the second SQL statement, that is, a=7, b=8, and c=9 after SET.

The first SQL statement: UPDATE db1.tb1 SET a=4,b=5,c=6 WHERE a=1, that is, update the values of a, b, and c in the row with primary key a=1 in db1.tb1 to a=4, b=5, c=6; the second SQL statement: DELETE FROM db1.tb1 WHERE a=1, that is, delete the row with primary key a=1 from db1.tb1. The merged SQL statement: DELETE FROM db1.tb1 WHERE a=1, that is, delete the row with primary key a=1 from db1.tb1. The value a=1 after WHERE in the merged SQL statement is the value after WHERE in the first SQL statement, that is, the merged SQL statement takes its value from the before image of the first SQL statement.

The first SQL statement: DELETE FROM db1.tb1 WHERE a=1, that is, delete the row with primary key a=1 from db1.tb1; the second SQL statement: INSERT INTO db1.tb1 VALUES(4,5,6), that is, insert a row with a=4, b=5, c=6 into db1.tb1. The merged SQL statement: UPDATE db1.tb1 SET a=4,b=5,c=6 WHERE a=1. The before image of the merged SQL statement comes from the first SQL statement, and the after image comes from the second SQL statement.

It should be noted that, if the first SQL statement is INSERT INTO db1.tb1 VALUES(1,2,3), that is, insert a row with a=1, b=2, c=3 into db1.tb1, and the second SQL statement is DELETE FROM db1.tb1 WHERE a=1, that is, delete the row with primary key a=1 from db1.tb1, no SQL statement is generated after merging.
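One way to read the worked examples is that the merged statement keeps the before image of the first statement and the after image of the second, and that the merged verb then follows from which of the two images is present. The sketch below encodes that reading. It is an interpretation of the examples, not the applicant's implementation: the `Change` class and the `merge` and `render` helpers are hypothetical, values are interpolated directly into SQL text only for illustration, and only the five verb pairs discussed in the text are considered.

```python
from dataclasses import dataclass
from typing import Dict, Optional

Row = Dict[str, int]


@dataclass
class Change:
    before: Optional[Row]  # row image before the change; None for INSERT
    after: Optional[Row]   # row image after the change; None for DELETE


def merge(first: Change, second: Change) -> Change:
    """Keep the earliest before image and the latest after image."""
    return Change(before=first.before, after=second.after)


def render(change: Change, table: str = "db1.tb1", pk: str = "a") -> Optional[str]:
    """Render a change as SQL; the verb follows from which images are present."""
    if change.before is None and change.after is None:
        return None  # INSERT followed by DELETE: nothing to play back
    if change.before is None:
        values = ",".join(str(v) for v in change.after.values())
        return f"INSERT INTO {table} VALUES({values})"
    where = f"WHERE {pk}={change.before[pk]}"
    if change.after is None:
        return f"DELETE FROM {table} {where}"
    assignments = ",".join(f"{col}={val}" for col, val in change.after.items())
    return f"UPDATE {table} SET {assignments} {where}"


insert_1 = Change(None, {"a": 1, "b": 2, "c": 3})
update_1 = Change({"a": 1}, {"a": 4, "b": 5, "c": 6})
update_2 = Change({"a": 4}, {"a": 7, "b": 8, "c": 9})
delete_1 = Change({"a": 1}, None)
insert_4 = Change(None, {"a": 4, "b": 5, "c": 6})
```

Under these assumptions, `render(merge(update_1, update_2))` yields `UPDATE db1.tb1 SET a=7,b=8,c=9 WHERE a=1`, and `render(merge(insert_1, delete_1))` yields `None`, matching the cases described above.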

Through the above step 103, the SQL statements that meet the preset conditions are combined to generate the combined SQL statement, so that the second node can concurrently play back the combined SQL statement.

It can be understood that this embodiment takes, as an example, the case where the SQL statements that satisfy the preset condition operate on the same primary key of the same database table. In practical applications, the SQL statements that satisfy the preset condition may also operate on different primary keys of the same database table, for example, SQL statements that operate on the same field. Exemplarily, if the first SQL statement changes the value of field A in the first row of the table to 2 and the second SQL statement changes the value of field A in the second row to 3, the two can be merged into one SQL statement that changes field A to 2 in the first row and to 3 in the second row. Turning two SQL statements into one reduces the number of SQL statements output to the new node, that is, the second node, and improves playback efficiency. In addition, because the two SQL statements operate on the same field (column) of the table, merging them reduces the coupling between SQL statements at the field level and facilitates concurrent execution.
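The text does not say what form such a merged statement takes. One plausible form, shown here purely as an assumption, is a single UPDATE with a CASE expression over the primary key. The function name, the table name `t`, and the key column `id` are hypothetical, and values are interpolated as integers only for illustration; real code would use parameter binding.

```python
from typing import Dict


def merge_field_updates(table: str, pk: str, field: str, new_values: Dict[int, int]) -> str:
    """Fold per-row updates of one field into a single UPDATE using CASE.

    new_values maps primary key value -> new value of the field.
    """
    if not new_values:
        raise ValueError("nothing to merge")
    cases = " ".join(f"WHEN {key} THEN {value}" for key, value in new_values.items())
    keys = ",".join(str(key) for key in new_values)
    return f"UPDATE {table} SET {field} = CASE {pk} {cases} END WHERE {pk} IN ({keys})"


# Row 1 gets A=2 and row 2 gets A=3, as in the example above.
merged = merge_field_updates("t", "id", "A", {1: 2, 2: 3})
```

Here `merged` is `UPDATE t SET A = CASE id WHEN 1 THEN 2 WHEN 2 THEN 3 END WHERE id IN (1,2)`.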

In this embodiment, the logical transaction log of the first node is acquired, SQL statements are acquired according to the logical transaction log, and the SQL statements that satisfy the preset condition are merged to generate a merged SQL statement, so that the second node plays back the merged SQL statement concurrently. This reduces the number of SQL statements output, and hence the number of SQL statements played back by the database node, which speeds up incremental catch-up. In addition, merging SQL statements reduces the coupling between them, which increases SQL playback efficiency and improves the success rate of incremental catch-up.

The second embodiment of the present application relates to a data processing method. This embodiment is substantially the same as the first embodiment, except that, after the SQL statements that satisfy the preset condition are merged to generate the merged SQL statement, the method further includes: obtaining a hash value according to the database table and primary key value in the merged SQL statement; and determining, according to the hash value, a hash bucket for storing the merged SQL statement.

A flowchart of the data processing method according to the second embodiment of the present application is shown in FIG. 2 .

Step 201: Obtain the logical transaction log of the first node.

Step 202: Obtain the SQL statements according to the logical transaction log.

Step 203: Combine the SQL statements that meet the preset conditions to generate a combined SQL statement.

Steps 201 to 203 are substantially the same as steps 101 to 103 of the first embodiment of the present application, and details are not repeated here.

Step 204: Determine a hash bucket for storing the combined SQL statement according to the database table and the primary key value in the combined SQL statement.

In one example, a hash value is obtained according to the database table and primary key value in the merged SQL statement, and a hash bucket for storing the merged SQL statement is determined according to the hash value. Storing the SQL statements in hash buckets keeps the number of SQL statements in each hash bucket file as uniform as possible, which improves playback efficiency.

Exemplarily, the database table and primary key value in the merged SQL statement are input into a hash function, the hash value is obtained from the hash function, and the hash bucket is determined according to the hash value.
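A minimal sketch of bucket selection follows. The hash function and the bucket count are not specified in the text; CRC32 and four buckets are assumptions chosen for illustration, and `bucket_index` is a hypothetical name.

```python
import zlib
from typing import List, Tuple


def bucket_index(table: str, primary_key: object, bucket_count: int) -> int:
    """Map (database table, primary key value) to a hash bucket index."""
    # A NUL separator keeps ("tb1", 12) distinct from ("tb11", 2).
    key = f"{table}\x00{primary_key}".encode("utf-8")
    return zlib.crc32(key) % bucket_count


# (table, primary key value, merged SQL text)
merged: List[Tuple[str, int, str]] = [
    ("db1.tb1", 1, "INSERT INTO db1.tb1 VALUES(4,5,6)"),
    ("db1.tb1", 2, "DELETE FROM db1.tb1 WHERE a=2"),
    ("db1.tb2", 1, "UPDATE db1.tb2 SET b=8 WHERE a=1"),
]
buckets: List[List[str]] = [[] for _ in range(4)]
for table, pk, sql in merged:
    buckets[bucket_index(table, pk, len(buckets))].append(sql)
```

Because the merge step leaves at most one statement per (table, primary key), the statements inside a bucket touch different rows, which is what later allows them to be played back concurrently.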

Step 205: Determine a second node that plays back the combined SQL statement according to the combined SQL statement in the hash bucket.

Step 206: Split the hash bucket file according to the determined second node to obtain SQL files. The hash bucket file is a file that includes all the merged SQL statements in the hash bucket. In this implementation, the statements in the hash bucket file that are destined for the same node are gathered into one SQL file, which reduces the number of times SQL statements are sent and further improves playback efficiency.

Step 207: Send each SQL file to the determined second node. The merged SQL statements in one SQL file are all to be played back on the same second node.
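Steps 205 and 206 can be sketched as a grouping operation. In the sketch below the distribution rule is passed in as a function because the text does not define it; `split_bucket_file`, `node_for`, and the modulo rule used in the example are assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Statement = Tuple[int, str]  # (distribution key value, merged SQL text)


def split_bucket_file(bucket: List[Statement],
                      node_for: Callable[[int], str]) -> Dict[str, List[str]]:
    """Group the statements of one hash bucket file by destination node.

    Each value of the returned mapping stands in for one SQL file.
    """
    sql_files: Dict[str, List[str]] = defaultdict(list)
    for dist_key, sql in bucket:
        sql_files[node_for(dist_key)].append(sql)
    return dict(sql_files)


nodes = ["new-node-1", "new-node-2"]


def node_for(dist_key: int) -> str:
    """Assumed distribution rule: key modulo node count."""
    return nodes[dist_key % len(nodes)]


bucket = [(1, "DELETE FROM db1.tb1 WHERE a=1"),
          (2, "INSERT INTO db1.tb1 VALUES(2,5,6)"),
          (3, "UPDATE db1.tb1 SET b=8 WHERE a=3")]
sql_files = split_bucket_file(bucket, node_for)
```

With this rule, keys 1 and 3 go to `new-node-2` and key 2 goes to `new-node-1`, so one bucket file yields two SQL files and each node receives a single file.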

In one example, the concurrent playback includes: concurrent playback across the SQL files, and concurrent playback of the SQL statements within each SQL file. In this implementation, concurrent playback improves playback efficiency and increases the success rate of incremental catch-up.

Exemplarily, the resource manager splits the hash bucket file generated by the SQL combiner according to the distribution rules: it calculates the second node corresponding to each SQL statement in the hash bucket file and obtains SQL files, each of which is to be played back on a single node. It then connects to the remote DB for playback. During playback, different SQL files are played back concurrently, and the SQL statements within one SQL file are also played back concurrently.
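The two levels of concurrency can be illustrated with thread pools. This is a sketch only: the text does not say how the resource manager or the DB schedules work, the database call is replaced by a caller-supplied function, and `replay_file`, `replay_all`, and `fake_execute` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def replay_file(statements: List[str], execute: Callable[[str], None],
                workers: int = 4) -> int:
    """Play back all statements of one SQL file concurrently; return the count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(execute, statements))  # list() re-raises worker exceptions
    return len(statements)


def replay_all(sql_files: Dict[str, List[str]],
               execute_on: Callable[[str, str], None]) -> Dict[str, int]:
    """Play back different SQL files concurrently, one task per destination node."""
    def run(node: str) -> int:
        return replay_file(sql_files[node], lambda sql: execute_on(node, sql))

    with ThreadPoolExecutor(max_workers=max(1, len(sql_files))) as pool:
        return dict(zip(sql_files, pool.map(run, sql_files)))


# Stand-in for a database connection: record what would have been executed.
executed: List[Tuple[str, str]] = []
lock = threading.Lock()


def fake_execute(node: str, sql: str) -> None:
    with lock:
        executed.append((node, sql))


result = replay_all({"n1": ["s1", "s2"], "n2": ["s3"]}, fake_execute)
```

Running the statements of a file in arbitrary order is safe here only because the earlier merge step leaves at most one statement per row in each file.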

After the playback is completed, the resource manager returns the playback result to the cluster manager. After the cluster manager receives the reply, it starts a new round of incremental catch-up, and continues until the time taken by a round of incremental catch-up is less than the incremental catch-up threshold.
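The round-based termination rule can be sketched as a loop. The `max_rounds` safety limit and the injectable `clock` are additions for illustration and testing, not part of the described method; `catch_up` and `fake_round` are hypothetical names.

```python
import time
from typing import Callable


def catch_up(run_round: Callable[[], None], threshold_seconds: float,
             max_rounds: int = 100,
             clock: Callable[[], float] = time.monotonic) -> int:
    """Repeat catch-up rounds until one finishes faster than the threshold.

    Returns the number of rounds that were run.
    """
    for round_number in range(1, max_rounds + 1):
        started = clock()
        run_round()
        if clock() - started < threshold_seconds:
            return round_number
    raise RuntimeError("catch-up did not converge within max_rounds")


# Simulated rounds that take 5.0, 3.0, and 0.5 seconds on a fake clock.
durations = iter([5.0, 3.0, 0.5])
now = [0.0]


def fake_round() -> None:
    now[0] += next(durations)


rounds = catch_up(fake_round, threshold_seconds=1.0, clock=lambda: now[0])
```

In the simulated run, the third round is the first to finish in under one second, so `rounds` is 3. Each round shortens because it only has to play back the writes that arrived during the previous round.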

A schematic diagram of incremental catch-up in this embodiment is shown in FIG. 3, in which DBA denotes the database administrator (Database Administrator).

The cluster manager receives the redistribution request and performs a full export of the data to be redistributed. The fully exported files are split according to the number of lines per file and the distribution batch size configured in the cluster manager. The distribution field and its column data in each resulting file are verified, a calculation object is constructed from the verified distribution field and column data, the distribution algorithm computes the destination node, and the data is written to the shard data cache of the corresponding node. When the number of split files reaches the sending batch size, the split files are sent, and the data of the old node is imported into each configured new node. The cluster manager then initiates the incremental catch-up process. It first queries and records the current logical transaction log position of each new node, and then sends an incremental catch-up request to the resource manager of the old node. The resource manager scans the logical transaction log from the backup position and notifies the SQL combiner to parse the logical transaction log and merge SQL statements. From the merged SQL statements, a hash bucket file is generated, which organizes the data using hash buckets as the data structure; the hash bucket file is the redo SQL file shown in FIG. 3. The resource manager splits the SQL statements in the hash bucket file according to the distribution key, that is, each hash bucket file is split into multiple SQL files, and each SQL file is transferred to one node. The resource manager connects to the DB database node remotely. The DB plays back different SQL files concurrently, executes the SQL statements within each SQL file concurrently, and returns the execution result when execution completes.

Take a document management system backed by a MySQL distributed cluster as an example. In this system, assume that documents are stored in tables by document type. If the number of documents of a certain type grows, the database table for that type must hold a large amount of data. The distributed database cluster can then perform a redistribution operation, splitting the document data according to the distribution rules and storing it on the corresponding nodes. After this step is completed, the incremental catch-up operation is performed. The cluster manager sends an incremental catch-up request to the resource manager corresponding to each old database table. After receiving the request, the resource manager obtains the logical transaction log, that is, the binlog file of the MySQL database. The SQL combiner process parses the binlog file and merges the parsed SQL statements, and the merged SQL statements are organized into hash bucket files. The resource manager splits each hash bucket file into multiple SQL files and transfers the newly generated SQL files to the corresponding new nodes, which play back the SQL files concurrently and return the execution results.

It is worth mentioning that, during binlog parsing in this embodiment, records of the same database table and the same primary key, that is, the SQL statements that operate on the same primary key of the same database table, are merged, which reduces the number of SQL statements output to the hash buckets. The hash buckets keep the number of SQL statements in each bucket as uniform as possible, which makes subsequent playback more efficient. Furthermore, SQL merging allows concurrency along three dimensions during playback, which improves playback efficiency: 1. after merging, SQL statements for different primary keys of the same database table within one hash bucket are played back concurrently; 2. after merging, SQL statements for different database tables within one hash bucket are played back concurrently; 3. SQL statements in different hash buckets are played back concurrently. SQL merging thus breaks up the dependencies among the SQL statements: the statements are no longer related to one another and can be played back concurrently in batches, which shortens playback time and significantly improves the performance of incremental data catch-up.

The steps of the above methods are divided only for clarity of description. In implementation, steps may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, they fall within the protection scope of this patent. Adding insignificant modifications to the algorithm or process, or introducing insignificant designs, without changing the core design of the algorithm and process, also falls within the protection scope of this patent.

The third embodiment of the present application relates to a data processing apparatus, including: a log acquisition module 401, configured to acquire a logical transaction log of a first node; an SQL statement acquisition module 402, configured to acquire SQL statements according to the logical transaction log; and an SQL statement merging module 403, configured to merge the SQL statements that satisfy a preset condition to generate a merged SQL statement, so that a second node plays back the merged SQL statement concurrently.

In one example, the SQL statements that satisfy the preset condition in the SQL statement merging module 403 are SQL statements that operate on the same primary key of the same database table.

In one example, the SQL statement merging module 403 is further configured to determine the execution times in the logical transaction log of the SQL statements that satisfy the preset condition, and to merge the SQL statements that satisfy the preset condition according to the determined execution times to obtain the merged SQL statement.

In one example, the SQL statement merging module 403 is further configured to determine the verb and field values of the merged SQL statement according to the SQL statements that satisfy the preset condition and the execution times, and to generate the merged SQL statement according to the determined verb and field values.

In one example, the SQL statement merging module 403 is further configured to: obtain a hash value according to the database table and primary key value in the merged SQL statement; determine, according to the hash value, a hash bucket for storing the merged SQL statement; determine, according to the merged SQL statement in the hash bucket, a second node that plays back the merged SQL statement; and send the merged SQL statement to the second node.

In one example, the SQL statement merging module 403 is further configured to split the hash bucket file according to the determined second node to obtain SQL files, and to send each SQL file to the determined second node; wherein the hash bucket file is a file that includes all the merged SQL statements in the hash bucket, and the merged SQL statements in one SQL file are all to be played back on the same second node.

In one example, the concurrent playback in the SQL statement merging module 403 includes: concurrent playback across the SQL files, and concurrent playback of the SQL statements within each SQL file.

It is not difficult to see that this embodiment is an apparatus embodiment corresponding to the first embodiment, and this embodiment can be implemented in cooperation with the first embodiment. The relevant technical details mentioned in the first embodiment remain valid in this embodiment and are not repeated here. Correspondingly, the relevant technical details mentioned in this embodiment can also be applied to the first embodiment.

It is worth mentioning that each module involved in this embodiment is a logical module. In practical applications, a logical unit may be a physical unit, a part of a physical unit, or a combination of multiple physical units. In addition, in order to highlight the innovative part of the present application, this embodiment does not introduce units that are not closely related to solving the technical problem raised by the present application, but this does not mean that there are no other units in this embodiment.

The fourth embodiment of the present application relates to an electronic device, as shown in FIG. 5, comprising at least one processor 501; and a memory 502 communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the above-mentioned data processing method.

The memory and the processor are connected by a bus. The bus may include any number of interconnected buses and bridges, and connects one or more processors and various circuits of the memory. The bus may also connect together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein. The bus interface provides the interface between the bus and the transceiver. A transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other devices over a transmission medium. Data processed by the processor is transmitted over a wireless medium through an antenna, and the antenna also receives data and passes it to the processor.

The processor is responsible for managing the bus and general processing, and can also provide various functions, including timing, peripheral interface, voltage regulation, power management, and other control functions. The memory, in turn, may be used to store data used by the processor in performing operations.

The fifth embodiment of the present application relates to a computer-readable storage medium storing a computer program. The above method embodiments are implemented when the computer program is executed by the processor.

That is, those skilled in the art can understand that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage media include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Those of ordinary skill in the art can understand that the above-mentioned embodiments are specific examples for realizing the present application, and that in practical applications various changes can be made in form and details without departing from the spirit and scope of the present application.

Claims

1. A data processing method, comprising:

acquiring a logical transaction log of a first node;

acquiring SQL statements according to the logical transaction log; and

merging the SQL statements that satisfy a preset condition to generate a merged SQL statement, so that a second node plays back the merged SQL statement concurrently.

2. The data processing method according to claim 1, wherein the SQL statements that satisfy the preset condition are SQL statements that operate on the same primary key in the same database table.

3. The data processing method according to claim 2, wherein merging the SQL statements that satisfy the preset condition to generate the merged SQL statement comprises:

determining the execution time, in the logical transaction log, of the SQL statements that satisfy the preset condition; and

merging the SQL statements that satisfy the preset condition according to the determined execution time to obtain the merged SQL statement.

4. The data processing method according to claim 3, wherein merging the SQL statements that satisfy the preset condition according to the determined execution time comprises:

determining a verb and a field value of the merged SQL statement according to the SQL statements that satisfy the preset condition and the execution time; and

generating the merged SQL statement according to the determined verb and field value.

5. The data processing method according to any one of claims 1 to 4, wherein after merging the SQL statements that satisfy the preset condition to generate the merged SQL statement, the method further comprises:

obtaining a hash value according to the database table and the primary key value in the merged SQL statement;

determining, according to the hash value, a hash bucket for storing the merged SQL statement;

determining, according to the merged SQL statement in the hash bucket, the second node that plays back the merged SQL statement; and

sending the merged SQL statement to the second node.

6. The data processing method according to claim 5, wherein sending the merged SQL statement to the second node comprises:

splitting a hash bucket file according to the determined second node to obtain an SQL file, and sending the SQL file to the determined second node;

wherein the hash bucket file is a file including all the merged SQL statements in the hash bucket, and the merged SQL statements in the SQL file are played back on the same second node.

7. The data processing method according to claim 6, wherein the concurrent playback comprises: concurrent playback across the SQL files and concurrent playback of the SQL statements within each SQL file.

8. A data processing apparatus, comprising:

a log acquisition module, configured to acquire a logical transaction log of a first node;

an SQL statement acquisition module, configured to acquire SQL statements according to the logical transaction log; and

an SQL statement merging module, configured to merge the SQL statements that satisfy a preset condition to generate a merged SQL statement, so that a second node plays back the merged SQL statement concurrently.

9. An electronic device, comprising:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the data processing method according to any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, wherein when the computer program is executed by a processor, the data processing method according to any one of claims 1 to 7 is implemented.