CN114090332A

CN114090332A - Data processing method and device

Info

Publication number: CN114090332A
Application number: CN202111200242.1A
Authority: CN
Inventors: 吴迪; 傅宇; 丁顺杰; 袁诚; 马占峰; 杨鼎; 楼江航; 吴学强
Original assignee: Alibaba China Co Ltd; Alibaba Cloud Computing Ltd
Current assignee: Alibaba China Co Ltd; Alibaba Cloud Computing Ltd
Priority date: 2021-10-14
Filing date: 2021-10-14
Publication date: 2022-02-25
Also published as: WO2023061265A1

Abstract

The present specification provides a data processing method and apparatus, wherein the data processing method includes: receiving a data recovery instruction, wherein the data recovery instruction carries data recovery time; determining backup data and log files of each data node in the distributed database based on the data recovery instruction; determining target backup data and target log data from the backup data and the log files of each data node based on the data recovery time; and performing data recovery on each data node based on the target log data and the target backup data of each data node.

Description

Data processing method and device

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a data processing method.

Background

With the development of internet and database technologies, backup and recovery, a capability that guarantees data security, has become relatively mature for stand-alone databases. For a distributed database, however, how to implement backup recovery to any point in time while ensuring data consistency across the nodes of the distributed database after recovery has become an urgent problem to be solved.

Disclosure of Invention

In view of this, the embodiments of the present specification provide a data processing method. The present specification also relates to a data processing apparatus, a computing device, a computer-readable storage medium, and a computer program to solve the technical problems of the prior art.

According to a first aspect of embodiments herein, there is provided a data processing method including:

receiving a data recovery instruction, wherein the data recovery instruction carries data recovery time;

determining backup data and log files of each data node in the distributed database based on the data recovery instruction;

determining target backup data and target log data from the backup data and the log files of each data node based on the data recovery time;

and performing data recovery on each data node based on the target log data and the target backup data of each data node.

According to a second aspect of embodiments herein, there is provided a data processing apparatus comprising:

a receiving module configured to receive a data recovery instruction, wherein the data recovery instruction carries data recovery time;

a first determining module configured to determine backup data and log files of each data node in the distributed database based on the data recovery instruction;

a second determining module configured to determine target backup data and target log data from the backup data and the log file of each data node based on the data recovery time;

and a data recovery module configured to perform data recovery on each data node based on the target log data and the target backup data of each data node.

According to a third aspect of embodiments herein, there is provided a computing device comprising:

a memory and a processor;

the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions, which when executed by the processor, implement the steps of any of the data processing methods.

According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of any of the data processing methods.

According to a fifth aspect of embodiments herein, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of any of the data processing methods.

The data processing method provided by the present specification includes: receiving a data recovery instruction, wherein the data recovery instruction carries data recovery time; determining backup data and log files of each data node in the distributed database based on the data recovery instruction; determining target backup data and target log data from the backup data and the log files of each data node based on the data recovery time; and performing data recovery on each data node based on the target log data and the target backup data of each data node.

Specifically, the method determines target backup data and target log data corresponding to data recovery time from backup data and log files of each data node in a distributed database based on a data recovery instruction carrying the data recovery time; and data recovery is carried out on each data node based on the target log data and the target backup data, so that backup recovery of the distributed database at any time point is realized, and the target log data and the target backup data correspond to the data recovery time, so that the data consistency of the distributed database after backup recovery is ensured.
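As an illustration only, the four steps summarized above can be sketched in Python. The names `NodeArchive` and `recover`, the in-memory key/value representation of backup data, and the use of plain wall-clock commit times are assumptions made for this sketch and are not taken from the specification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass
class NodeArchive:
    """Hypothetical per-node archive: full backups plus a backed-up log."""
    backups: Dict[datetime, Dict[str, int]]   # backup time -> copy of the data
    log: List[Tuple[datetime, str, int]]      # (commit time, key, new value)


def recover(archives: Dict[str, NodeArchive],
            recovery_time: datetime) -> Dict[str, Dict[str, int]]:
    """Restore every data node to recovery_time from its own archive."""
    restored = {}
    for node, archive in archives.items():
        # Target backup data: latest full backup not later than recovery_time.
        backup_time = max(t for t in archive.backups if t <= recovery_time)
        state = dict(archive.backups[backup_time])
        # Target log data: entries after the backup, up to recovery_time.
        for commit_time, key, value in archive.log:
            if backup_time < commit_time <= recovery_time:
                state[key] = value
        restored[node] = state
    return restored


day = datetime(2021, 7, 25)
archives = {
    "DN1": NodeArchive({day: {"A": 100}},
                       [(datetime(2021, 7, 25, 9, 0, 0), "A", 130),
                        (datetime(2021, 7, 25, 17, 0, 0), "A", 90)]),
    "DN2": NodeArchive({day: {"C": 100}},
                       [(datetime(2021, 7, 25, 9, 0, 0), "C", 70)]),
}
result = recover(archives, datetime(2021, 7, 25, 16, 14, 21))
print(result)
```

In this sketch the 17:00:00 change on DN1 is left out because it is later than the requested recovery time, and a node with no full backup at or before the recovery time would raise an error.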

Drawings

FIG. 1 is a diagram illustrating a distributed database for backup recovery according to an embodiment of the present disclosure;

fig. 2 is a processing flow diagram of a data processing method applied in a scenario of performing backup recovery on a distributed database according to an embodiment of the present specification;

FIG. 3 is a flow chart of a data processing method provided in an embodiment of the present specification;

fig. 4 is a schematic diagram illustrating determining a TSO for a distributed transaction in a data processing method according to an embodiment of the present specification;

fig. 5 is a schematic diagram of a log file in a data processing method provided in an embodiment of the present specification;

fig. 6 is a schematic diagram illustrating log file clipping based on a check termination time in a data processing method according to an embodiment of the present specification;

fig. 7 is a schematic diagram of candidate log data after rollback in a data processing method according to an embodiment of the present specification;

fig. 8 is a schematic diagram illustrating a recovery time being converted into a timestamp in a data processing method according to an embodiment of the present specification;

fig. 9 is a processing flow diagram of a data processing method applied in a scenario of performing backup recovery on a distributed database according to an embodiment of the present specification;

fig. 10 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure;

fig. 11 is a block diagram of a computing device according to an embodiment of the present disclosure.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.

The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.

First, the noun terms to which one or more embodiments of the present specification relate are explained.

X-Paxos protocol: a consensus protocol for consistency.

XtraBackup: an online hot backup tool.

Crash Recovery: the fault recovery flow.

application: a calling function.

ACID: an abbreviation for the four basic properties required for the correct execution of database transactions: atomicity (or indivisibility), consistency, isolation (or independence), and durability.

pos information: position information.

GMS: metadata management service.

Backup and recovery is a necessary capability for a database to guarantee data security. For a stand-alone database, backup and recovery technology is relatively mature and mainly comprises the following three schemes:

Logical backup scheme: backup at the database object level, where the backed-up content is database objects such as tables, indexes and stored procedures, for example MySQL mysqldump.

Physical backup scheme: backup at the database file level, where the backed-up content is the database files on the operating system, for example MySQL XtraBackup.

Snapshot backup scheme: a fully usable copy of a specified data set is obtained based on snapshot technology; the snapshot can then either be kept only on the local machine or be backed up across machines, for example with the file system Veritas File System, the volume manager Linux LVM, or a storage subsystem.

However, for a distributed database, backup and recovery of data faces many challenges, mainly arbitrary point-in-time recovery (PITR) and global consistency.

Point-in-time recovery (PITR) is the first challenge. PITR refers to the ability to restore a database, based on a backup set, to any past point in time (accurate to the second). Databases typically implement PITR through full plus incremental physical backup of a single machine, such as XtraBackup + Binlog for MySQL.

For a distributed database, reads and writes involve multiple data nodes and distributed transactions exist, so point-in-time recovery must guarantee not only the data integrity of each single node but also data consistency among the nodes. Referring to fig. 1, fig. 1 is a schematic diagram of a distributed database for backup recovery according to an embodiment of the present disclosure, in which an example of global data consistency is illustrated by a transfer test. The user account balance table is distributed across two data nodes (DN1, DN2), and at some moment account C transfers 30 dollars to account A and account D transfers 20 dollars to account C through distributed transactions. If the data nodes are restored to that moment, then, because point-in-time recovery is only accurate to the second, the restored data may be in a state in which accounts A and C have completed the transfer operation while accounts B and D have not (e.g., recovery 1); the balance data is then inconsistent, which is unacceptable to the user. The recovered data must therefore be either entirely in the pre-transfer state or entirely in the post-transfer state (e.g., recovery 2).

Meanwhile, backup is a high-frequency operation and maintenance task for a database; performing a backup once a day is generally recommended to guarantee data security. Because backup is so frequent, the backup process is required to be as lossless as possible to the running database. How to back up a database losslessly while ensuring data consistency is therefore another challenge.

Distributed databases store much larger amounts of data than stand-alone databases, typically tens or even hundreds of TBs. In the face of such huge data volume, how to perform fast data backup and recovery becomes a problem that needs to be solved urgently. Meanwhile, with the increase of the data volume, how to ensure that the speed of backup recovery can also linearly increase is a problem to be solved, so that the time consumption of backup recovery is relatively stable. For example, when the data size of the distributed database is 10TB, the backup of the database needs 1 hour, and if the data size is increased to 100TB, the backup time at this time still needs to be guaranteed to be about 1 hour, rather than being increased to 10 hours.

Based on this, referring to fig. 2, fig. 2 is a processing flow chart of a data processing method applied in a scenario of performing backup recovery of a distributed database according to an embodiment of the present specification; fig. 2 illustrates how each data node (DN) of the distributed database performs backup recovery to any time point in the data processing method provided in this specification. First, each data node performs a scheduled full backup of its data through XtraBackup and an incremental backup of its log. When the data node needs to be restored to a certain time point A (accurate to the second), the full backup set closest to time point A, which contains the backup data, is first found and used for data restoration. For example, where the data node backs up its data once a day and time point A is 2021-07-25 16:14:21, the full backup set corresponding to time point A may be the full backup set (backup data) of the data node for 2021-07-25.

Then, all the binlog files covering the period from the start time of the full backup set, 2021-07-25 00:00:00, to the recovery time point, 2021-07-25 16:14:21, such as binlog-06 … binlog-12 in fig. 2, are obtained from the backed-up log files of each data node. By applying the events (transactions) recorded in these binlog files through the Crash Recovery flow of MySQL, the data of the distributed database can be restored to the specified time point A.

In the backup recovery process, the first and last of these log files, namely binlog-06 and binlog-12, need to be processed separately.

binlog-06: during a full backup, data is still being written to the data node normally, so the pos information of the log file corresponding to the backup time is recorded and stored in the backup set after the full backup completes. When the log files are applied, binlog-06 must be applied starting from the pos location recorded in the pos information of the full backup set, and the part before that pos location must be discarded.

binlog-12: since the recovery time point is arbitrary, the last data change that needs to be recovered may lie anywhere in the log file. The last log file, binlog-12, is therefore trimmed to remove the transactions in it that are later than the recovery time point.
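The head and tail handling described above can be sketched as follows. This is a minimal sketch under stated assumptions: `Event` and `trim_logs` are hypothetical names, each event is reduced to a position and a commit time, `binlog-07` merely stands in for the files elided between binlog-06 and binlog-12, and treating both boundaries as inclusive is an assumption rather than something the specification states.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    pos: int                 # position of the event inside its log file
    commit_time: datetime    # time at which the transaction committed
    name: str                # label used only for this illustration


def trim_logs(files: Dict[str, List[Event]], start_pos: int,
              recovery_time: datetime) -> List[Event]:
    """Return the events to apply, given log files in their written order."""
    kept: List[Event] = []
    last = len(files) - 1
    for index, events in enumerate(files.values()):
        if index == 0:
            # Head file: discard everything before the recorded pos.
            events = [e for e in events if e.pos >= start_pos]
        if index == last:
            # Tail file: discard transactions later than the recovery time.
            events = [e for e in events if e.commit_time <= recovery_time]
        kept.extend(events)
    return kept


files = {
    "binlog-06": [Event(40, datetime(2021, 7, 24, 23, 59, 50), "t1"),
                  Event(50, datetime(2021, 7, 25, 0, 0, 5), "t2")],
    "binlog-07": [Event(4, datetime(2021, 7, 25, 9, 0, 0), "t3")],
    "binlog-12": [Event(4, datetime(2021, 7, 25, 16, 14, 20), "t4"),
                  Event(90, datetime(2021, 7, 25, 16, 14, 30), "t5")],
}
kept = trim_logs(files, start_pos=50,
                 recovery_time=datetime(2021, 7, 25, 16, 14, 21))
print([e.name for e in kept])
```

Files between the head and the tail are applied unchanged; only the two boundary files need filtering, which keeps the trimming cost independent of the number of intermediate files.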

In the present specification, a data processing method is provided, and the present specification relates to a data processing apparatus, a computing device, a computer-readable storage medium, and a computer program, which are described in detail one by one in the following embodiments.

Referring to fig. 3, fig. 3 is a flowchart illustrating a data processing method according to an embodiment of the present specification, which specifically includes the following steps:

step 302: and receiving a data recovery instruction, wherein the data recovery instruction carries data recovery time.

The data recovery instruction can be understood as an instruction, received by the distributed database, that instructs the distributed database to perform backup recovery. The data recovery time can be understood as the time point to which the distributed database needs to be restored in point-in-time recovery (PITR); the time point can be set according to the actual application scenario, such as 2021-07-25 16:14:21.

In practical application, before the distributed database receives a data recovery instruction, the database needs to be backed up without affecting the operation of the distributed database, and the speed of backup recovery needs to increase linearly as the data volume increases. This is realized in the following manner.

Before receiving the data recovery instruction, the method further includes:

generating processing data based on each data node in a distributed database, determining information to be recorded corresponding to the processing data and processing time corresponding to the information to be recorded, and adding the information to be recorded and the corresponding processing time to a processing log;

determining a backup node of each data node in the distributed database, and synchronizing the processing data and the processing log of each data node to the backup node;

based on the backup node of each data node in the distributed database, the processing data in each data node and the processing log corresponding to the processing data are backed up, and the backup data and the log file of each data node are generated.

The processing data can be understood as data generated in the operation process of each data node in a distributed database; the information to be recorded can be understood as a transaction generated by the data node; in practical applications, data can be generated during operations such as adding and modifying data by a data node, and when the operations are completed, specific contents of the operations are recorded in a log file as transactions (events).

The processing time may be understood as a timestamp, for example a TSO (Timestamp Oracle) timestamp; correspondingly, determining the processing time corresponding to the information to be recorded may be understood as stamping each transaction of the data node with a timestamp.

The processing log may be understood as a log file, i.e. binlog, that records transactions of data nodes.

In the data processing method provided by the present specification, each data node consists of three MySQL nodes running the X-Paxos protocol, including a primary node and a backup node (Follower node), where the primary node is used to perform data processing and the backup node is used to synchronize the data of the primary node.

Backup data may be understood as a backup of data in a data node; log data may be understood as a backup of a log file corresponding to a data node.

Specifically, before the distributed database receives a data recovery instruction, data and a log of each data node under the distributed database need to be backed up, and the backup process includes:

in a distributed database, generating processing data based on each data node in the distributed database, and determining information to be recorded corresponding to the processing data; and marking the information to be recorded with the corresponding processing time, and adding the information to be recorded and the processing time into the processing log.

Determining a backup node corresponding to each data node in the distributed database, and synchronizing the processing data and the processing log in each data node to the backup node; and based on the backup node of each data node, backing up the processing data in each data node and the processing log corresponding to the processing data, thereby generating backup data and a log file of each data node.

For example, taking the data processing method applied to the scenario of performing backup recovery on the distributed database as an example, further description is made on the backup data and the log data for generating each data node.

Before the distributed database receives a data recovery instruction, the running data and the log files in the data nodes need to be backed up. In this embodiment of the present specification, different backup manners may be set for the distributed database according to an actual application scenario, for example, xtrabackup is used to perform a timed full backup on data of data nodes every day, and a log file of each data node is subjected to an incremental backup.

In practical applications, performing incremental backup on a log file can be understood as performing incremental backup on the log data (recorded transactions) in the log file. During the incremental backup of the log data, the location information of the log data covered by each backup, that is, the pos information of the data node, is recorded. For example, when a log file is backed up on 2/3/2021, it is recorded that the current log backup covers the 1001st through the 1500th records of the log file, which determines both the number of log records backed up this time and their location.
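The bookkeeping described above can be sketched as follows. The class `IncrementalLogBackup`, its fields, and the labels `day-1` and `day-2` are hypothetical; the sketch assumes a log that is an append-only list of records numbered from 1.

```python
from typing import List, Tuple


class IncrementalLogBackup:
    """Backs up only the records appended since the previous run."""

    def __init__(self) -> None:
        self.next_record = 1                              # 1-based, first record not yet backed up
        self.pos_history: List[Tuple[str, int, int]] = []  # (label, first, last)

    def run(self, label: str, log: List[str]) -> List[str]:
        first, last = self.next_record, len(log)
        if last < first:
            return []                                     # nothing new since the last backup
        chunk = log[first - 1:last]
        self.pos_history.append((label, first, last))     # the recorded pos information
        self.next_record = last + 1
        return chunk


log = [f"txn-{i}" for i in range(1, 1001)]
backup = IncrementalLogBackup()
backup.run("day-1", log)                                  # covers records 1..1000
log.extend(f"txn-{i}" for i in range(1001, 1501))
chunk = backup.run("day-2", log)                          # covers records 1001..1500
print(backup.pos_history[-1], len(chunk))
```

Recording the first and last record of every run is what later allows the head of a log file to be trimmed at an exact position instead of by timestamp alone.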

Specifically, regarding the implementation of TSO distributed transactions in a distributed database, see fig. 4, which is a schematic diagram illustrating the determination of a TSO for a distributed transaction in a data processing method provided in an embodiment of the present specification. As shown in fig. 4, in order to provide externally consistent reads and guarantee distributed transaction capability at the SI isolation level, a TSO timestamp scheme is adopted and a CTS (commit timestamp) extension is added to each data node. Based on the CTS extension, two sequence numbers (timestamps) are defined for each distributed transaction: the sequence number at transaction start, snapshot_seq (used for snapshot reads), and the sequence number at transaction commit, commit_seq. In the start and commit phases of a transaction, the GMS uniformly allocates globally unique and monotonically increasing sequence numbers (TSO timestamps) to both the transaction start event (Prepare event) and the transaction commit event (Commit event). The data version currently readable is judged by the commit timestamp (CTS) of the data, that is, snapshot.cts > data.cts in the figure, which guarantees the snapshot read capability.
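The visibility rule above can be illustrated with a small sketch. `TimestampOracle` is a hypothetical single-process stand-in for the GMS, and `visible` simply encodes the comparison snapshot.cts > data.cts; neither name comes from the specification.

```python
import itertools


class TimestampOracle:
    """Stand-in for the GMS: hands out unique, strictly increasing numbers."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def next(self) -> int:
        return next(self._counter)


def visible(snapshot_seq: int, data_cts: int) -> bool:
    """A data version is readable if it committed before the snapshot was taken."""
    return snapshot_seq > data_cts


tso = TimestampOracle()
early_reader = tso.next()      # snapshot_seq of a reader that starts first
writer_commit = tso.next()     # commit_seq of a writer that commits afterwards
late_reader = tso.next()       # snapshot_seq of a reader that starts after the commit

print(visible(early_reader, writer_commit))   # the early reader must not see the write
print(visible(late_reader, writer_commit))    # the late reader sees it
```

Because every sequence number comes from one allocator, the same comparison gives the same answer on every data node, which is what makes the read a consistent snapshot across nodes.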

In the process of adding, deleting and other operations, each data node of the distributed database generates operation data and distributed transactions corresponding to the operation data, and adds corresponding TSO time stamps to each distributed transaction through the GMS.

The distributed transaction and the corresponding TSO timestamp are added to a log file: in addition to the XA Start, XA Prepare and XA Commit events, a CTS event is attached to identify the start time and commit time of the transaction, each of which is a specific TSO timestamp.

After the operating data and the corresponding log files are determined, the backup node of each data node is determined in the distributed database, the operating data and the corresponding log files of each data node are synchronously copied to the backup node, and the operating data and the log files are backed up on the backup node through XtraBackup, so that the backup data and the backup log files of each data node are obtained.

In this embodiment of the present description, the backup operation is mainly performed on the Follower node of each data node, and since the Follower node does not carry business traffic, the impact of the backup process on the operation of the data node is basically negligible. Moreover, because backup and recovery are performed by each data node individually, and no heavy cooperative operation is needed between the data nodes (the main cooperation is issuing and synchronizing the recovery time point, which is lightweight), when the distributed database is expanded from 10TB (10 DNs) to 100TB (100 DNs), the overall backup and recovery time remains close to that of a single DN and does not grow rapidly with the data volume.
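The scaling argument above can be checked with simple arithmetic. In the sketch below, the figure of 1 TB per DN and a throughput of 1 TB per hour per DN are illustrative assumptions chosen to match the 10TB (10 DNs) and 100TB (100 DNs) example; `parallel_hours` and `serial_hours` are hypothetical helper names.

```python
from typing import Sequence


def parallel_hours(tb_per_dn: Sequence[float], tb_per_hour: float) -> float:
    """Each DN backs itself up independently, so the slowest DN sets the time."""
    return max(size / tb_per_hour for size in tb_per_dn)


def serial_hours(tb_per_dn: Sequence[float], tb_per_hour: float) -> float:
    """For comparison: one worker backing up every DN in turn."""
    return sum(tb_per_dn) / tb_per_hour


small = [1.0] * 10     # 10 DNs, 10 TB in total
large = [1.0] * 100    # 100 DNs, 100 TB in total

print(parallel_hours(small, 1.0), parallel_hours(large, 1.0))
print(serial_hours(small, 1.0), serial_hours(large, 1.0))
```

Under these assumptions the per-node approach stays at about one hour for both cluster sizes, whereas a serial approach would grow from 10 hours to 100 hours.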

Step 304: and determining backup data and log files of each data node in the distributed database based on the data recovery instruction.

Following the above example, determining backup data and log data for each data node in the distributed database based on the data restore instruction is further described.

Under the condition that the distributed database receives a data recovery instruction, determining backup data and backup log files corresponding to each data node in the distributed database based on the data recovery instruction.

Step 306: and determining target backup data and target log data from the backup data and the log files of each data node based on the data recovery time.

The target backup data may be understood as backup data corresponding to the data recovery time, and the target log data may be understood as log data (distributed transaction) in the backup log file corresponding to the data recovery time, which accurately corresponds to the data recovery time.

In practical application, the determining, based on the data recovery time, target backup data and target log data from the backup data and the log data of each data node further includes a first step and a second step.

Step one, determining the backup data corresponding to the data recovery time in each data node as the target backup data of each data node.

Following the above example, the determination of the target backup data of each data node based on the data recovery time is further described. After all the backup data corresponding to each data node is determined based on the data recovery instruction, and with a data recovery time of, for example, 2021-07-25 16:14:21, the backup data corresponding to the data recovery time, for example the 2021-07-25 backup data of each data node, is acquired from the backup data and taken as the target backup data.

Step two, processing the log file of each data node based on the data recovery time to obtain target log data of each data node.

In practical application, the processing the log file of each data node based on the data recovery time to obtain the target log data of each data node includes:

determining an initial log file corresponding to the data recovery time from the log file of each data node;

and cutting the initial log file based on the data recovery time to obtain target log data corresponding to each data node.

The initial log file may be understood as a log file corresponding to the data recovery time in the backup log file.

Following the above example, the determination of the target log data of each data node based on the data recovery time is further described. After all the log data corresponding to each data node is determined based on the data recovery instruction, and with a data recovery time of, for example, 2021-07-25 16:14:21, an initial log file corresponding to the data recovery time is acquired from all the log files; for example, there may be a single log file containing the transactions of each data node that occurred between 2021-07-25 00:00:00 and 2021-07-25 16:14:21. The initial log file is then cut based on the data recovery time to obtain a file containing only the transactions of each data node that occurred in the period from 2021-07-25 00:00:00 to 2021-07-25 16:14:21.

In practical application, after the initial log file is cut according to the data recovery time, a log file containing only the transactions of each data node in the period from 2021-07-25 00:00:00 to 2021-07-25 16:14:21 is obtained, and the data node is backed up and recovered based on this log file.

In another case, if there are a plurality of initial log files corresponding to the data recovery time, the head log file and the tail log file in the plurality of log files need to be trimmed, so as to obtain the target log data corresponding to each data node, as shown in the following.

The processing the log file of each data node based on the data recovery time to obtain the target log data of each data node includes:

determining a log file to be processed corresponding to the data recovery time from the log file of each data node, and acquiring log backup information of the log file;

cutting the log file to be processed based on the log backup information to obtain an initial log file of each data node;

The log files to be processed may be understood as a plurality of backup log files corresponding to the data recovery time, and the log backup information may be understood as pos information of each data node.

Following the above example, the cutting of the log data based on the log backup information and the data recovery time to obtain the target log data corresponding to each data node is further explained.

After all the log data corresponding to each data node is determined based on the data recovery instruction, and with a data recovery time of, for example, 2021-07-25 16:14:21, a plurality of log files corresponding to the data recovery time are obtained from all the log files, namely the log files containing the transactions of each data node that occurred between 2021-07-25 00:00:00 and 2021-07-25 16:14:21; for example, three log files: log file A, log file B and log file C.

The pos information of the backup log files is obtained, and based on the pos information, the position in log file A of the transaction corresponding to 2021-07-25 00:00:00 for each data node, that is, the position at which the first backed-up log record of 2021-07-25 was written, is determined, for example line 50. Based on the pos information, the log data (transactions) in log file A before line 50 is cut off, thereby obtaining the initial log file.

After the head log file has been trimmed, the tail log file is trimmed based on the data recovery time 2021-07-25 16:14:21: the log records in log file C that are later than 2021-07-25 16:14:21 are cut off, so that log files containing only the transactions of each data node that occurred between 2021-07-25 00:00:00 and 2021-07-25 16:14:21 are obtained.

In a specific implementation process, the cutting the log file to be processed based on the log backup information to obtain an initial log file of each data node includes:

determining target backup information corresponding to the data recovery time from the log backup information, wherein the log backup information comprises log backup records of each time point;

and determining data to be cut from the log file to be processed based on the target backup information, and cutting the data to be cut to obtain an initial log file of each data node.

The log backup information may be understood as a location where the log data is backed up each time during the log backup process of each data node, that is, pos information of each data node.

The target backup information may be understood as the pos information corresponding to the data recovery time in the log backup information; for example, where the data recovery time is 2021-07-25 16:14:21, the target backup information may be understood as the log records backed up by each data node on 2021-07-25, starting from line 50 of log A and ending at line 70 of log E.

According to the above example, the trimming data is trimmed based on the target backup information, and the initial log data of each data node is obtained.

After determining the log data, pos information corresponding to the data recovery time 2021-07-2516:14:21, such as the pos information of 2021-07-25, is determined from the pos information of each data node from the 50 th line of the log a to the 70 th line of the log E.

And determining the log records before the 50 th line in the log A, namely the head log file, as data to be cut based on the pos information, and cutting the data to be cut so as to obtain an initial log file of each data node.

In the embodiment of the present specification, target backup information corresponding to data recovery time is determined from log backup information, and data to be clipped determined based on the target backup information is clipped, so as to obtain an initial log file of each data node, which is convenient for obtaining target log data of each data node based on the initial log file subsequently.
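The head-clipping step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each log file is held in memory as a list of records and that the pos information is encoded as a pair of file name and 1-based line number; the names `trim_head`, `log_files` and `pos` are hypothetical and do not come from the specification.

```python
from typing import Dict, List, Tuple


def trim_head(log_files: Dict[str, List[str]],
              pos: Tuple[str, int]) -> Dict[str, List[str]]:
    """Drop every record located before `pos` in the head log file.

    `pos` is a hypothetical encoding of the pos information:
    (name of the head log file, 1-based line number of the first record to keep).
    """
    head_name, first_line = pos
    trimmed = dict(log_files)  # shallow copy; the other files are left untouched
    trimmed[head_name] = log_files[head_name][first_line - 1:]
    return trimmed


# Hypothetical data: log file A has 60 records, log file B has 2.
logs = {"A": [f"A-{n}" for n in range(1, 61)], "B": ["B-1", "B-2"]}
initial = trim_head(logs, ("A", 50))
print(initial["A"][0], len(initial["A"]))  # prints: A-50 11
```

In this sketch the records before line 50 of log file A are discarded, while line 50 itself and all later files are kept, which mirrors the example in the text.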

In a specific implementation process, before the cutting the initial log file based on the data recovery time, the method further includes:

calculating to obtain verification termination time based on the data recovery time and preset transaction processing time;

correspondingly, the cutting the initial log file based on the data recovery time to obtain the target log data corresponding to each data node includes:

and cutting the initial log file based on the data recovery time and the verification termination time to obtain target log data corresponding to each data node.

The preset transaction processing time may be set according to an actual application scenario, for example, 60 seconds.

The following uses the above example to further explain the obtaining of the target log data corresponding to each data node by cutting the initial log data based on the data recovery time and the verification termination time.

The clipping logic of the log file seems simple: the data recovery time is compared with the timestamp (TSO) of each transaction in the log, and the records whose TSO is later than the data recovery time are removed. However, although transactions acquire their TSOs in order, the TSOs that land in the log file of each data node at actual commit are not strictly ordered, owing to factors such as differing network delays among the nodes. Referring to fig. 5, fig. 5 is a schematic diagram of a log file in a data processing method provided in an embodiment of the present specification; as shown in fig. 5, the commit TSO (100) of transaction 2 is smaller than that of transaction 3, but in the log the commit event of transaction 2 appears after the commit event of transaction 3. This out-of-order problem raises the question of when to terminate the clipping process.

Conventional clipping logic traverses the log records one by one and terminates the flow as soon as the first value greater than the TSO being compared occurs.

However, this method is no longer applicable to logs ordered as described above. For example, when the TSO of the time point to be recovered is 100 and the scan reaches event C3, the TSO of that event is found to be 101, which is already greater than 100; but if the clipping logic terminates at this point, the committed event C2 will be missed.

To determine the clipping termination condition in such out-of-order situations, it is considered here that binlog disorder occurs only within a short time window, that is, it does not exceed the timeout for transaction commit. If a transaction has not committed when this time is exceeded, it is rolled back and will not commit. Based on this condition, a new variable, the check termination time stop_tso, is introduced:

stop_tso = recovery time point + transaction timeout time (default 60 s) + delta

where delta is an amplification factor that defaults to 60 s.
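As a worked illustration of the formula, the sketch below computes the check termination time on wall-clock values using the defaults stated in the text (a 60 s transaction timeout and a 60 s delta). The function name `check_termination_time` and its parameter names are hypothetical, and working on wall-clock time rather than on encoded TSO values is an assumption made for readability.

```python
from datetime import datetime, timedelta


def check_termination_time(recovery_time: datetime,
                           txn_timeout: timedelta = timedelta(seconds=60),
                           delta: timedelta = timedelta(seconds=60)) -> datetime:
    """Check termination time = recovery time point + transaction timeout + delta."""
    return recovery_time + txn_timeout + delta


recovery = datetime(2021, 7, 25, 16, 14, 21)
stop = check_termination_time(recovery)
print(stop)  # prints: 2021-07-25 16:16:21
```

With the default values, scanning continues for two minutes of log time past the recovery time point before clipping terminates.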

With stop_tso, the clipping flow can be terminated when the first event greater than this value occurs. For example, referring to fig. 6, fig. 6 is a schematic diagram illustrating log file clipping based on a check termination time in a data processing method provided in an embodiment of the present specification; the TSO of the recovery time point is 100, and stop_tso is calculated to be 103. According to this condition, the corresponding log records can be removed; the effect after removal is shown in fig. 6, where all three events P3, C3 and C1 need to be removed.

Specifically, the cutting the initial log file based on the data recovery time and the verification termination time to obtain the target log data corresponding to each data node includes:

acquiring the position information and the processing time of each piece of log data in the initial log file, and comparing the data recovery time with the processing time of each piece of log data based on the position information;

judging whether the processing time of each piece of log data is less than or equal to the data recovery time,

if yes, determining the log data with the processing time less than or equal to the data recovery time as candidate log data, and determining target log data corresponding to each data node based on the candidate log data,

if not, the processing is finished under the condition that the processing time is more than or equal to the verification termination time.

The location information may be understood as a sort location of each log data in the log file, and the sort location is arranged based on a time of writing to the log file.

For the plurality of backup log files (log file A, log file B and log file C) determined for the recovery time point based on the pos information, after the data to be cut in the head log file A has been cut, the position information of the log data (transactions) recorded in log file A, log file B and log file C and the TSO timestamp of each transaction are determined.

Starting with the transaction at the first position, the TSO timestamp of each transaction is compared with the recovery time point according to the position information. If the TSO timestamp of the transaction is less than or equal to the recovery time point, the transaction is taken as candidate log data; if the TSO timestamp of the transaction is greater than the recovery time point, the transaction is skipped and the TSO timestamp of the next transaction is compared with the recovery time point, until a transaction whose TSO timestamp is greater than or equal to the check termination time stop_tso is encountered.

And after the verification is finished, determining target log data corresponding to each data node based on all the candidate log data.

In practical application, once the TSO timestamp of a transaction is greater than or equal to the check termination time stop_tso, all candidate log data can be used as the target log data corresponding to each data node.

In the embodiment of the present specification, candidate log data is determined by comparing the data recovery time with the processing time based on the acquired position information, and target log data corresponding to each data node is determined based on the candidate log data. This avoids log records being cut by mistake because log data is written out of order during the cutting of the log file, and further ensures the data consistency of each data node when the distributed database is backed up and restored.
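The scan described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `LogEvent` type, the `select_candidates` function and the sample log are hypothetical, and TSO values are plain integers as in the example of figs. 5 and 6.

```python
from typing import Iterable, List, NamedTuple


class LogEvent(NamedTuple):
    name: str  # hypothetical label, e.g. "C2" for the commit of transaction 2
    tso: int   # processing time (TSO) recorded for the event


def select_candidates(events: Iterable[LogEvent],
                      recovery_tso: int,
                      stop_tso: int) -> List[LogEvent]:
    """Scan events in log-write order and collect the candidate log data."""
    candidates: List[LogEvent] = []
    for ev in events:
        if ev.tso <= recovery_tso:
            candidates.append(ev)  # at or before the recovery time: keep
        elif ev.tso >= stop_tso:
            break                  # reached the check termination time: stop
        # otherwise the event is later than the recovery time but still inside
        # the out-of-order window, so it is skipped and the scan continues
    return candidates


# Hypothetical log in which C2 (TSO 100) is written after C3 (TSO 101).
log = [LogEvent("C4", 99), LogEvent("C3", 101), LogEvent("C2", 100),
       LogEvent("C1", 102), LogEvent("C5", 104)]
kept = select_candidates(log, recovery_tso=100, stop_tso=103)
print([ev.name for ev in kept])  # prints: ['C4', 'C2']
```

A scan that stopped at the first event greater than 100 would end at C3 and miss C2; continuing until stop_tso is reached is what allows the out-of-order commit to be retained.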

Referring to fig. 6, after the log file has been cut and the data node has finished applying it, the data has reached a consistent state; however, although the C1 event of transaction 1 has been removed, the P1 event is still applied. Because the transaction is not committed, the corresponding data change is not visible in the recovered data, but in a distributed system such a transaction is a hanging transaction. Hanging transactions need to be rolled back through XA Rollback (a rollback method) so that they do not affect the execution of subsequent transactions. The specific process is as follows.

The determining the target log data corresponding to each data node based on the candidate log data comprises:

s1, judging whether the data type of the ith data in the candidate log data is the target type, if so, putting the ith data into a candidate data set as candidate data,

if not, deleting the candidate data associated with the ith data from the candidate data set, wherein the initial value of i is 1;

s2, increasing i by 1, and continuing to execute the step S1 until i is larger than the number of the candidate log data;

and S3, deleting the candidate log data corresponding to the candidate data in the candidate data set in the candidate log data to obtain the target log data corresponding to each data node.

Continuing the above example, the determination of the target log data corresponding to each data node based on the candidate log data is further described. After the candidate log data is obtained, it is determined whether the data type of the 1st data in the candidate log data is the target type, for example, a transaction of the Prepare type.

If yes, writing the transaction into a candidate data set as candidate data; if not, the 1 st data is not a transaction of the Prepare type but a transaction of a Commit type or a Rollback type, and the corresponding transaction of the Prepare type is found in the candidate data set and is deleted from the set.

And continuously judging whether the data type of the remaining data in the candidate log data is the log data of the target type or not, and executing the same operation as the 1 st data in the candidate log data until all the candidate log data are judged.

And performing rollback processing on candidate log data corresponding to the candidate data in the candidate data set in the candidate log data, and taking the candidate log data subjected to rollback operation as target log data corresponding to each data node.

Referring to fig. 7, fig. 7 is a schematic diagram of candidate log data after rollback in a data processing method according to an embodiment of the present specification; as can be seen, after it is determined that P1 is a hanging event, a rollback operation is performed on the event.

In this embodiment of the present specification, it is determined whether a data type of an ith data in the candidate log data is a target type, if yes, the ith data is placed in the candidate data set to serve as the candidate data, if not, the candidate data associated with the ith data in the candidate data set is deleted, and when i is greater than the number of the candidate log data, the candidate log data corresponding to the candidate data in the candidate data set in the candidate log data is deleted, so as to obtain target log data corresponding to each data node. The problem of hanging transactions in the distributed database is avoided, and the execution of subsequent transactions in the distributed database is not influenced.
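Steps S1 to S3 can be sketched as a single pass over the candidate log data. The sketch assumes, for illustration only, that the target type is Prepare, that each event carries a transaction identifier, and that a Commit or Rollback event is associated with a Prepare event through that identifier; `TxnEvent`, `remove_hanging` and the field names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class TxnEvent(NamedTuple):
    kind: str  # "prepare", "commit" or "rollback"
    xid: str   # hypothetical transaction identifier


def remove_hanging(candidates: List[TxnEvent]) -> Tuple[List[TxnEvent], List[str]]:
    """Return (target log data, identifiers of hanging transactions)."""
    pending: Dict[str, TxnEvent] = {}  # the candidate data set of S1
    for ev in candidates:              # S1 and S2: visit every candidate in order
        if ev.kind == "prepare":
            pending[ev.xid] = ev       # target type: put into the candidate data set
        else:
            pending.pop(ev.xid, None)  # outcome seen: drop the associated Prepare
    # S3: whatever is left in the set never committed or rolled back
    target = [ev for ev in candidates
              if not (ev.kind == "prepare" and ev.xid in pending)]
    return target, list(pending)


# Hypothetical candidate log: transaction 1 was prepared but its commit was cut.
events = [TxnEvent("prepare", "1"), TxnEvent("prepare", "2"), TxnEvent("commit", "2")]
target, hanging = remove_hanging(events)
print([f"{e.kind}:{e.xid}" for e in target], hanging)
# prints: ['prepare:2', 'commit:2'] ['1']
```

The returned identifiers are the transactions that a caller would roll back, for example through XA Rollback as mentioned in the text; issuing the rollback itself is outside the scope of this sketch.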

In this embodiment of the present specification, since the TSO of each transaction is recorded in the log file, when recovery to any globally consistent time point is required, the time point to be recovered (e.g., 2021-07-25 16:14:21) is first converted into a corresponding TSO timestamp; the log file to be applied to each data node is then clipped based on the TSO timestamp, and the transaction events whose transaction commit sequence number (snapshot_seq) is greater than the TSO timestamp are removed, which ensures that the recovered data contains no inconsistency caused by partially committed transactions. Specifically, the recovery time point is converted into the TSO timestamp as follows.

The cutting the log data of each data node based on the data recovery time to obtain the target log data of each data node comprises:

converting the data recovery time into a target recovery time, wherein the target recovery time is consistent with the format of the processing time;

and cutting the log file of each data node based on the target recovery time to obtain the target log data of each data node.

The following continues the above example to explain how the data recovery time is converted into the target recovery time and how the log file of each data node is cut based on the target recovery time.

Referring to fig. 8, fig. 8 is a schematic diagram illustrating how a recovery time is converted into a timestamp in a data processing method according to an embodiment of the present specification. The format of the TSO timestamp is shown in fig. 8: the TSO timestamp is a 64-bit number formed by combining a physical timestamp and a logical timestamp. Considering that PITR (point-in-time recovery) needs to be accurate only to the order of seconds, the timestamp of the recovery time point (e.g., 2021-07-25 16:14:21) is shifted left directly into the upper 42 bits, and the remaining lower 22 bits are all filled with 0, so that the corresponding TSO timestamp is obtained.

And cutting the log file of each data node based on the TSO timestamp corresponding to the recovery time point, thereby obtaining the target log data of each data node.
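The conversion can be sketched as a single bit shift. The text mentions both the figure 42 and the figure 22 for a 64-bit value; this sketch assumes a layout in which the physical timestamp occupies the upper 42 bits and the lower 22 bits are zero-filled, so the shift amount is 22. It further assumes that the physical part is expressed in milliseconds since the Unix epoch and that the recovery time is given in UTC. None of these assumptions is confirmed by the specification, and `recovery_time_to_tso` and `LOW_BITS` are hypothetical names.

```python
from datetime import datetime, timezone

LOW_BITS = 22  # assumption: 64-bit TSO = 42-bit physical part + 22 low bits


def recovery_time_to_tso(recovery_time: datetime) -> int:
    """Place the physical time in the high bits and zero-fill the low bits."""
    physical_ms = int(recovery_time.timestamp() * 1000)  # assumption: milliseconds
    return physical_ms << LOW_BITS


# The time zone is an assumption; the specification does not state one.
recovery = datetime(2021, 7, 25, 16, 14, 21, tzinfo=timezone.utc)
tso = recovery_time_to_tso(recovery)

# Comparing a transaction's TSO against this value then reduces to an integer comparison.
print(tso & ((1 << LOW_BITS) - 1))  # prints: 0
```

Because the low bits are zero, the resulting value is the smallest TSO that can belong to the chosen second under this assumed layout.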

Step 308: and performing data recovery on each data node based on the target log data and the target backup data of each data node.

Continuing the above example, after the target log data and the target backup data corresponding to each data node are determined, the transactions in the target log data are applied on top of the target backup data through the crash recovery flow of the distributed database, so that the data of each data node in the distributed database is restored to the specified time point.

In the data processing method provided in the embodiment of the present specification, based on a data recovery instruction carrying data recovery time, target backup data and target log data corresponding to the data recovery time are determined from backup data and log files of each data node in a distributed database; and data recovery is carried out on each data node based on the target log data and the target backup data, so that backup recovery of the distributed database at any time point is realized, and the target log data and the target backup data correspond to the data recovery time, so that the data consistency of the distributed database after backup recovery is ensured.

The following description further describes the data processing method by taking an application of the data processing method provided in this specification in a scenario of performing backup recovery on a distributed database as an example, with reference to fig. 9. Fig. 9 shows a processing flow chart of a data processing method applied in a scenario of performing backup recovery on a distributed database according to an embodiment of the present specification, and specifically includes the following steps:

step 902: receiving a data recovery instruction aiming at the distributed database, wherein the data recovery instruction carries a recovery time point.

Step 904: and acquiring a full backup set corresponding to each data node and a recovery time point in the distributed database, and acquiring a backup log file corresponding to each data node from the starting time of the full backup set to the recovery time point.

Step 906: and determining data to be cut from the head log file of the backup log file based on pos information of the full backup set, and removing the data to be cut.

The data to be cut is a log record located before pos information in the header log file.

Step 908: the recovery time is converted to a TSO timestamp, and a verification termination time is calculated based on the TSO timestamp and the transaction timeout time.

The transaction timeout time may be set according to an actual application scenario, for example, 60 seconds, and is not specifically limited in this specification.

Step 910: and determining out-of-order log records in the backup log file based on the TSO time stamp corresponding to the recovery time and the verification termination time, and removing the out-of-order log records.

Step 912: and determining the hanging transactions in the backup log file, and removing the hanging transactions to obtain the cut backup log file.

Step 914: according to the Crash Recovery flow of the distributed database, backup Recovery is carried out on each data node based on a full backup set and the cut backup log file, and the data of each data node is recovered to a specified time point.
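Step 904 above can be illustrated with a small selection sketch for a single data node. It assumes that the full backup set corresponding to a recovery time point is the most recent one started at or before that point, and that each backup log file is described by the times of its first and last records; both are assumptions made for illustration, and `FullBackup`, `LogFile` and `select_backup_and_logs` are hypothetical names.

```python
from datetime import datetime
from typing import List, NamedTuple, Tuple


class FullBackup(NamedTuple):
    name: str
    start: datetime  # starting time of the full backup set


class LogFile(NamedTuple):
    name: str
    first: datetime  # time of the first record in the file
    last: datetime   # time of the last record in the file


def select_backup_and_logs(backups: List[FullBackup], logs: List[LogFile],
                           recovery: datetime) -> Tuple[FullBackup, List[LogFile]]:
    """Pick the full backup set and the log files needed for one data node."""
    eligible = [b for b in backups if b.start <= recovery]
    if not eligible:
        raise ValueError("no full backup set at or before the recovery time point")
    backup = max(eligible, key=lambda b: b.start)
    # keep every log file that overlaps [backup start, recovery time point]
    needed = [f for f in logs if f.last >= backup.start and f.first <= recovery]
    return backup, needed


day = lambda d, h=0, m=0, s=0: datetime(2021, 7, d, h, m, s)
backups = [FullBackup("full-0724", day(24)), FullBackup("full-0725", day(25)),
           FullBackup("full-0726", day(26))]
logs = [LogFile("A", day(24, 20), day(25, 6)), LogFile("B", day(25, 6), day(25, 12)),
        LogFile("C", day(25, 12), day(25, 18)), LogFile("D", day(25, 18), day(26))]
backup, needed = select_backup_and_logs(backups, logs, day(25, 16, 14, 21))
print(backup.name, [f.name for f in needed])  # prints: full-0725 ['A', 'B', 'C']
```

In this hypothetical data set the head log file A still contains records written before the full backup started and the tail log file C contains records later than the recovery time point, which is why the clipping of steps 906 to 912 is needed afterwards.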

In the data processing method provided by the present specification, based on a data recovery instruction carrying data recovery time, target backup data and target log data corresponding to the data recovery time are determined from backup data and log files of each data node in a distributed database; and data recovery is carried out on each data node based on the target log data and the target backup data, so that backup recovery of the distributed database at any time point is realized, and the target log data and the target backup data correspond to the data recovery time, so that the data consistency of the distributed database after backup recovery is ensured.

Corresponding to the above method embodiment, this specification further provides an embodiment of a data processing apparatus, and fig. 10 shows a schematic structural diagram of a data processing apparatus provided in an embodiment of this specification. As shown in fig. 10, the apparatus includes:

a receiving module 1002, configured to receive a data recovery instruction, where the data recovery instruction carries data recovery time;

a first determining module 1004 configured to determine backup data and a log file of each data node in the distributed database based on the data recovery instruction;

a second determining module 1006, configured to determine target backup data and target log data from the backup data and the log file of each data node based on the data recovery time;

a data recovery module 1008 configured to perform data recovery on each data node based on the target log data and the target backup data of each data node.

Optionally, the second determining module 1006 is further configured to:

determining the backup data corresponding to the data recovery time in each data node as target backup data of each data node;

and processing the log file of each data node based on the data recovery time to obtain target log data of each data node.

Optionally, the second determining module 1006 is further configured to:

and in the case that the processing time of log data is later than the data recovery time, cutting all the log data whose processing time is later than the data recovery time to obtain target log data corresponding to each data node.

Optionally, the second determining module 1006 is further configured to:

Optionally, the data processing apparatus further includes a backup module configured to:

The data processing apparatus provided in the embodiment of the present specification determines, based on a data recovery instruction carrying data recovery time, target backup data and target log data corresponding to the data recovery time from backup data and log files of each data node in a distributed database; and data recovery is carried out on each data node based on the target log data and the target backup data, so that backup recovery of the distributed database at any time point is realized, and the target log data and the target backup data correspond to the data recovery time, so that the data consistency of the distributed database after backup recovery is ensured.

The above is a schematic configuration of a data processing apparatus of the present embodiment. It should be noted that the technical solution of the data processing apparatus and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the data processing apparatus can be referred to the description of the technical solution of the data processing method.

FIG. 11 illustrates a block diagram of a computing device 1100 provided in accordance with one embodiment of the present description. The components of the computing device 1100 include, but are not limited to, memory 1110 and a processor 1120. The processor 1120 is coupled to the memory 1110 via a bus 1130 and the database 1150 is used to store data.

The computing device 1100 also includes an access device 1140, the access device 1140 enabling the computing device 1100 to communicate via one or more networks 1060. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. The access device 1140 may include one or more of any type of network interface, e.g., a Network Interface Card (NIC), wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.

In one embodiment of the present description, the above-described components of computing device 1100, as well as other components not shown in FIG. 11, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 11 is for purposes of example only and is not limiting as to the scope of the present description. Those skilled in the art may add or replace other components as desired.

Computing device 1100 can be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 1100 can also be a mobile or stationary server.

Wherein the processor 1120 is configured to execute computer-executable instructions that, when executed by the processor 1120, perform the steps of any of the data processing methods.

The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the data processing method.

An embodiment of the present specification also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of any of the data processing methods.

The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the data processing method, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the data processing method.

An embodiment of the present specification also provides a computer program, wherein when the computer program is executed in a computer, the computer is caused to execute the steps of any of the data processing methods.

The above is an illustrative scheme of a computer program of the present embodiment. It should be noted that the technical solution of the computer program and the technical solution of the data processing method described above belong to the same concept, and details that are not described in detail in the technical solution of the computer program can be referred to the description of the technical solution of the data processing method described above.

The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, etc. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present disclosure is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present disclosure. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for this description.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the specification and its practical application, to thereby enable others skilled in the art to best understand the specification and its practical application. The specification is limited only by the claims and their full scope and equivalents.

Claims

1. A method of data processing, comprising:

2. The data processing method according to claim 1, wherein the determining target backup data and target log data from the backup data and the log file of each data node based on the data recovery time comprises:

3. The data processing method according to claim 2, wherein the processing the log file of each data node based on the data recovery time to obtain the target log data of each data node comprises:

4. The data processing method according to claim 2, wherein the processing the log file of each data node based on the data recovery time to obtain the target log data of each data node comprises:

5. The data processing method of claim 3 or 4, prior to the pruning the initial log file based on the data recovery time, further comprising:

6. The data processing method according to claim 4, wherein the cutting the to-be-processed log file based on the log backup information to obtain an initial log file of each data node comprises:

7. The data processing method according to claim 3 or 4, wherein the clipping the initial log file based on the data recovery time to obtain target log data corresponding to each data node comprises:

8. The data processing method according to claim 5, wherein the cutting the initial log file based on the data recovery time and the verification termination time to obtain target log data corresponding to each data node comprises:

9. The data processing method of claim 8, wherein determining target log data corresponding to the each data node based on the candidate log data comprises:

10. The data processing method of claim 7, wherein the clipping the initial log file of each data node based on the data recovery time to obtain the target log data of each data node comprises:

11. The data processing method of claim 1, further comprising, before receiving the data recovery instruction:

12. A data processing apparatus comprising:

13. A computing device, comprising:

a memory and a processor;

the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions, which when executed by the processor, implement the steps of the data processing method of any one of claims 1 to 11.

14. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the data processing method of any one of claims 1 to 11.