WO2022151593A1

WO2022151593A1 - Data recovery method, apparatus and device, medium and program product

Info

Publication number: WO2022151593A1
Application number: PCT/CN2021/084393
Authority: WO
Inventors: 阿瓦鲁卡纳卡库马尔; 库马尔潘卡吉; 智伟
Original assignee: 华为云计算技术有限公司
Priority date: 2021-01-13
Filing date: 2021-03-31
Publication date: 2022-07-21
Also published as: CN115087966A

Abstract

A data recovery method, apparatus and device, a medium and a program product. After a service crash processing flow is started, a WAL created by a first RS node in a distributed database before a fault occurs is acquired, the WAL comprising data processing records of a plurality of regions to be subjected to recovery, which regions belong to the first RS node; and then the data processing records, which belong to each region to be subjected to recovery, in an acquired WAL file are written in a recovery file, wherein different data recording areas in the recovery file record the data processing records belonging to different regions to be subjected to recovery, so that the recovery file having the data processing records can be further transmitted to a file system (300) for persistent storage. Therefore, the number of files needing to be transmitted to the file system (300) during a data recovery process can be effectively reduced, such that IO for transmitting files to the file system (300) can be reduced, resource consumption can be decreased, and the data recovery efficiency of a distributed database can be improved.

Description

A data recovery method, device, equipment, medium and program product

This application claims the priority of the Indian patent application with the application number 202131001638 and the invention titled "DISTRIBUTED DATABASE SYSTEM RECOVERY MECHANISM OPTIMIZATION METHOD AND SYSTEM" filed with the Indian Patent Office on January 13, 2021, the entire contents of which are incorporated herein by reference.

Technical field

The embodiments of the present application relate to the technical field of databases, and in particular, to a data recovery method, apparatus, device, medium, and program product.

Background

A distributed database, such as an HBase database, usually includes a master node and at least one partition server (Region Server, RS) node. The master node allocates to each RS node the partitions (regions) that the RS node is responsible for, and the number of partitions allocated to each RS node can be one or more; each RS node executes the corresponding data write process according to the write requests it receives. When the RS node writes data, it can first write the data write request into the write-ahead log (WAL) and, after that write succeeds, insert the data to be written into the memory of the RS node. When the amount of data in the memory of the RS node reaches a preset threshold, the RS node can persistently store the data in the memory.

When the master node detects an RS node failure, for example because the RS node is powered off or restarted for maintenance, the master node restores the unpersisted data of the RS node according to the WAL. However, a large number of files are generated in the process of restoring data, so that a large number of interactions take place between the RS node and the file system when the generated files are transferred from the RS node to the file system for persistent storage, resulting in low data recovery efficiency and high resource consumption of the RS node.

SUMMARY OF THE INVENTION

In view of this, the embodiments of the present application provide a data recovery method, so as to improve the data recovery efficiency of the RS node and reduce resource consumption. The present application also provides corresponding apparatuses, devices, computer-readable storage media, and computer program products.

In a first aspect, an embodiment of the present application provides a data recovery method. Specifically, after the service crash processing flow is started, the WAL file created before the failure by a first RS node in the distributed database is acquired, where the WAL file includes data processing records of multiple partitions to be recovered that belong to the first RS node. The data processing records belonging to each partition to be recovered in the acquired WAL file are then written into a recovery file, where different data recording areas in the recovery file record the data processing records of different partitions to be recovered, so that the recovery file containing the data processing records can be further transferred to the file system for persistent storage.

Since the data processing records of each partition to be recovered are written into the corresponding data recording areas of the same recovery file, when data recovery is performed based on one (or more) WAL files, only one recovery file needs to be transferred to the file system, without performing multiple separate transfers for multiple files. In this way, the number of files that need to be transferred to the file system 300 in the data recovery process can be effectively reduced, thereby reducing the IO for transferring files to the file system, reducing resource consumption, and increasing the scalability of the distributed database. At the same time, the data recovery efficiency of the distributed database is also improved because the number of interactions with the file system is reduced.

In a possible implementation manner, the data recovery method for the multiple partitions to be recovered of the first RS node may be performed by the first RS node after failure recovery; that is, after the first RS node fails and resumes operation, it may perform the data recovery process itself. Alternatively, the data recovery method may be performed by another device in the distributed database. The other device may be a second RS node in the distributed database, where the second RS node represents any RS node that is not faulty; or the other device may be a specific device in the distributed database for performing fault data recovery, which may be pre-configured in the distributed database by a technician.

In a possible implementation manner, the recovery file includes index information, where the index information is used to indicate the position offset in the recovery file of the data recording area corresponding to each partition to be recovered. In this way, when the data in a partition to be recovered is recovered based on the data processing records stored in the recovery file, the position of the data processing records belonging to that partition can be determined in the recovery file according to the index information, so that the data processing records belonging to the partition to be recovered can be located in the recovery file.

In a possible implementation manner, the index information is specifically used to indicate the position offsets in the recovery file of sub-files corresponding to the multiple partitions to be recovered, where the sub-files are used to store data processing records of the partitions to be recovered, and different sub-files are used to store different data processing records. In this way, the index information can be used to determine the sub-files belonging to each partition to be recovered in the recovery file, so that the data processing records in the sub-files corresponding to each partition can be used to recover the data of that partition. Although a large number of sub-files are generated during the data recovery process, once multiple sub-files are packaged into one recovery file, the number of files transferred to the file system is reduced, thereby reducing the IO for transferring files from the first RS node to the file system, reducing resource consumption, and increasing the scalability of the distributed database.

In a possible implementation, when the recovery file is used to store the data processing records, it is not necessary to generate a sub-file to store the data processing records belonging to the partition to be recovered, and the data processing records may be directly stored in the recovery file. In this way, the number of files to be created during data recovery based on the WAL file can be reduced, thereby reducing the process of creating, moving, and deleting files, and improving data recovery efficiency.

In a possible implementation manner, before the recovery file is transferred to the file system, the recovery file can be stored in a cache, so that the recovery file stored in the cache can be used to provide services, such as reading and writing data, for clients accessing the distributed database. In this way, when the distributed database provides services, the recovery file can be obtained without reading the file system remotely, thereby reducing the consumption of remote bandwidth.

In a possible implementation, after the recovery file is transferred to the file system, the recovery file stored in the cache is emptied. In this way, the cache resources are released, and the long-term occupation of the cache resources during the data recovery process is avoided as much as possible.

In a possible implementation manner, during the data recovery process, the data recovery instruction sent by the master node may be obtained first, and the data recovery process is implemented under the instruction of the data recovery instruction. Wherein, the data recovery instruction is used to instruct to perform data recovery on a plurality of partitions to be recovered in the first RS node. For example, after detecting that the first RS node is faulty, the master node may issue the data recovery instruction to the first RS node.

In a possible implementation manner, the format of the recovery file is an archive file format, such as the "*.har" format or the "*.tar" format.
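As a rough illustration of packaging per-partition sub-files into one archive-format recovery file, the following sketch uses Python's standard `tarfile` module. The sub-file names and payloads are hypothetical and chosen only for illustration; the application does not prescribe how the archive is produced.

```python
import io
import tarfile


def pack_sub_files(sub_files):
    """Pack {sub-file name: bytes} into one in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in sorted(sub_files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_sub_file(archive_bytes, name):
    """Read one sub-file back out of the archive by name."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        return archive.extractfile(name).read()


# Hypothetical sub-files, one per partition to be recovered.
sub_files = {
    "partition-1.records": b"put k1=v1\nput k2=v2\n",
    "partition-2.records": b"put k9=v9\n",
}
recovery_file = pack_sub_files(sub_files)
partition_2_records = read_sub_file(recovery_file, "partition-2.records")
```

With this arrangement a single archive is handed to the file system, while each partition's records remain individually addressable by member name.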

In a second aspect, the present application provides a data recovery apparatus, including: an acquisition module, configured to acquire, after the service crash processing flow (SCP) is started, the write-ahead log (WAL) file created before the failure by a first partition server (RS) node in the distributed database, where the WAL file includes data processing records of multiple partitions to be recovered in the first RS node; a writing module, configured to write the data processing records of each partition to be recovered in the WAL file into a recovery file, where different data recording areas in the recovery file record the data processing records of different partitions to be recovered; and a transmission module, configured to transmit the recovery file to a file system for persistent storage.

In a possible implementation manner, the apparatus is applied to the first RS node after failure recovery, or to another device in the distributed database, where the other device includes a second RS node in the distributed database that is not faulty, or a specific device in the distributed database for performing fault data recovery.

In a possible implementation manner, the recovery file includes index information, where the index information is used to indicate the position offset of the data recording area corresponding to each partition to be recovered in the recovery file.

In a possible implementation manner, the index information is specifically used to indicate the position offsets in the recovery file of sub-files corresponding to the multiple partitions to be recovered, where the sub-files are used to store the data processing records of the partitions to be recovered, and different sub-files are used to store different data processing records.

In a possible implementation manner, the apparatus further includes: a storage module, configured to store the recovery file in a cache before the recovery file is transmitted to the file system; and a service module, configured to use the recovery file stored in the cache to provide services to clients of the distributed database.

In a possible implementation manner, the apparatus further includes: a data clearing module, configured to clear the restored file stored in the cache after transferring the restored file to the file system.

In a possible implementation manner, the acquisition module is further configured to acquire a data recovery instruction sent by the master node, where the data recovery instruction is used to instruct that data recovery be performed on multiple partitions to be recovered in the first RS node.

In a third aspect, the present application provides a computing device including a processor, a memory and a display. The processor and the memory communicate with each other. The processor is configured to execute the instructions stored in the memory to cause the computing device to perform the data recovery method as in the first aspect or any one of the implementations of the first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium storing instructions which, when run on a computing device, cause the computing device to perform the data recovery method described in the first aspect or any implementation manner of the first aspect.

In a fifth aspect, the present application provides a computer program product containing instructions which, when run on a computing device, enable the computing device to execute the data recovery method described in the first aspect or any implementation manner of the first aspect.

On the basis of the implementation manners provided by the above aspects, the present application may further combine to provide more implementation manners.

Description of drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings that are used in the description of the embodiments. Obviously, the drawings in the following description show only some embodiments described in the present application, and those skilled in the art can obtain other drawings from these drawings.

FIG. 1 is a schematic diagram of the architecture of an exemplary distributed database of the application;

FIG. 2 is a schematic diagram of splitting data processing records belonging to different partitions in the WAL file;

FIG. 3 is a schematic diagram of the relationship between the MTTR of the distributed database 100 and the number of RS nodes;

FIG. 4 is a schematic diagram of storing different data processing records in the WAL file in different data recording areas in the recovery file;

FIG. 5 is a schematic flowchart of a data recovery method according to an embodiment of the present application;

FIG. 6 is a schematic diagram of the corresponding relationship between the partition to be recovered and the data recording area in the recovery file;

FIG. 7 is a schematic flowchart of another data recovery method provided by an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a data recovery apparatus according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a hardware structure of a computing device according to an embodiment of the present application.

Detailed description

The solutions in the embodiments provided in this application will be described below with reference to the accompanying drawings in this application.

The terms "first", "second" and the like in the description and claims of the present application and the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that the terms used in this way can be interchanged under appropriate circumstances, and this is only a distinguishing manner adopted when describing objects with the same attributes in the embodiments of the present application.

FIG. 1 is a schematic diagram of the architecture of an exemplary distributed database 100. The distributed database 100 includes a master node 101, an RS node 102 and an RS node 103. The master node 101 is used to divide the data stored and managed by the distributed database 100 into multiple partitions, where each partition includes one or more pieces of data, and the data belonging to different partitions are usually different. As an implementation example of partitioning, when each piece of data is stored and managed in the distributed database 100, part of the content of the piece of data may be used as its primary key, which uniquely identifies that piece of data in the distributed database 100, so that the master node 101 can divide the possible value range of the primary key into intervals, with each interval corresponding to a partition. For example, assuming that the value range of the primary key in the distributed database 100 is [0, 1000000], the master node 101 can divide the value range into 100 intervals, namely [0, 10000), [10000, 20000), ..., [980000, 990000), [990000, 1000000]. Each partition can be used to store 10,000 pieces of data, and correspondingly, based on the 100 partitions, the distributed database 100 can store and manage 1 million pieces of data. The master node 101 is also used to allocate partitions to the RS node 102 and the RS node 103, and the partitions allocated to each RS node can be maintained through a management table created by the master node 101. The RS node 102 and the RS node 103 execute data read and write services belonging to different partitions. As shown in FIG. 1, the RS node 102 executes the data read and write services belonging to partitions 1 to N, while the RS node 103 executes the data read and write services belonging to partitions N+1 to M. It is worth noting that FIG. 1 uses a distributed database 100 with one master node 101 and two RS nodes as an example for illustration. In other possible distributed databases 100, the number of master nodes 101 and RS nodes may be any value, which is not limited in this application.
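The interval division described above can be sketched as follows. This is only a minimal illustration, assuming equal-width intervals whose last interval is closed on the right; the function names are hypothetical and not part of the application.

```python
from bisect import bisect_right


def build_boundaries(low, high, count):
    """Return the lower bounds of `count` equal-width intervals over [low, high]."""
    width = (high - low) // count
    return [low + i * width for i in range(count)]


def partition_for_key(boundaries, high, key):
    """Return the 0-based index of the interval that contains `key`."""
    if key < boundaries[0] or key > high:
        raise ValueError(f"primary key {key} is outside the key range")
    # bisect_right counts the lower bounds that are <= key.
    return bisect_right(boundaries, key) - 1


# Primary keys in [0, 1000000] split into 100 intervals, as in the example.
boundaries = build_boundaries(0, 1_000_000, 100)
first = partition_for_key(boundaries, 1_000_000, 9_999)
second = partition_for_key(boundaries, 1_000_000, 10_000)
last = partition_for_key(boundaries, 1_000_000, 1_000_000)
```

Here key 9,999 falls in the first interval, key 10,000 in the second, and the upper bound 1,000,000 in the last (closed) interval.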

The master node 101, the RS node 102 and the RS node 103 can all be implemented by hardware or software. As some examples, when the master node 101 and each RS node are implemented by hardware, both the master node 101 and the multiple RS nodes may be physical servers in the distributed database 100. That is, during actual deployment, at least one server in the distributed database 100 may be configured as the master node 101, and other servers in the distributed database 100 may be configured as RS nodes. Alternatively, when the master node 101 and each RS node are implemented by software, the master node 101 and the multiple RS nodes may be processes or virtual machines running on one or more devices (e.g., servers).

In practical applications, the distributed database 100 can be used as a local resource to provide local data read and write services to clients accessing the distributed database 100 through the master node 101 , the RS node 102 and the RS node 103 . Alternatively, the distributed database 100 can also be deployed in the cloud. In this case, the master node 101 , the RS node 102 and the RS node 103 can provide cloud services for reading and writing data to clients accessing the cloud.

The distributed database 100 may be connected to the client 200 and the file system 300 respectively, for example through a communication protocol such as HyperText Transfer Protocol (HTTP). Assuming that the client 200 needs to modify data or write new data to the RS node 102, the client 200 can send a data write request to the RS node 102, where the data write request carries the data to be written into the distributed database 100 (hereinafter referred to as data to be written) and the corresponding data processing operation (such as a write operation or a modification operation). After receiving the data write request, the RS node 102 may first generate a corresponding data processing record based on the data to be written and the data processing operation in the data write request, and write the data processing record into a pre-created WAL file. After determining that the WAL file is successfully written, the RS node 102 persistently stores the WAL file in the file system 300; for example, the file system 300 may store WAL files using a data structure such as a log-structured merge tree (LSM tree). At the same time, the RS node 102 inserts the data to be written in the data processing record into the memory 1021 of the RS node 102. For example, the RS node 102 can first determine the primary key corresponding to the data processing record, and determine which partition the data processing record belongs to according to the partition interval in which the value of the primary key falls, so that the RS node 102 can insert the data to be written in the data processing record into the storage area corresponding to that partition in the memory 1021. The RS node 102 can then report to the client 200 that the data was written successfully.
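The ordering of the write path (log first, then memory, then acknowledge) can be sketched as follows. This is a simplified illustration under stated assumptions: a local file stands in for the WAL that the text persists to the file system 300, records are JSON lines, and partitions are fixed-width key intervals. The class and field names are hypothetical.

```python
import json
import os
import tempfile


class RegionServerSketch:
    """Toy RS node: append to the WAL first, then update memory."""

    def __init__(self, wal_path, interval_width=10_000):
        self.wal_path = wal_path
        self.interval_width = interval_width  # assumed width of each key interval
        self.memory = {}                      # partition index -> {primary key: data}

    def write(self, key, data, operation="write"):
        record = {"key": key, "operation": operation, "data": data}
        # 1. Append the data processing record to the WAL and force it to disk.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps(record) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # 2. Only after the WAL append succeeded, insert the data into the
        #    memory area of the partition whose interval contains the key.
        partition = key // self.interval_width
        self.memory.setdefault(partition, {})[key] = data
        # 3. Report success to the caller (the client in the text).
        return True


workdir = tempfile.mkdtemp()
node = RegionServerSketch(os.path.join(workdir, "rs102.wal"))
node.write(12_345, "hello")
node.write(12_345, "hello again", operation="modify")
```

Because every acknowledged write is already in the log, the in-memory state can be rebuilt from the WAL if the node's memory is lost.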

Normally, the RS node 102 writes data into the memory 1021 for one or more clients 200, so that the amount of data temporarily stored in the memory 1021 increases continuously. When the amount of data in the memory 1021 reaches a preset threshold, the RS node 102 can persistently store the data in the memory 1021 to the file system 300; specifically, the data can be written to the file system 300 in the form of a file. As some examples, the file system 300 may be a distributed file system (DFS), a Hadoop distributed file system (HDFS), etc., which is not limited in this embodiment.
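The threshold-triggered persistence can be sketched as below. This is only an illustration: the threshold is counted in entries rather than bytes, and the `persist` callable is a hypothetical stand-in for writing a file to the file system 300.

```python
class MemoryStoreSketch:
    """Buffer writes in memory and persist them once a threshold is reached."""

    def __init__(self, flush_threshold, persist):
        self.flush_threshold = flush_threshold  # assumed: number of buffered entries
        self.persist = persist                  # callable that stores one snapshot
        self.entries = {}

    def put(self, key, data):
        self.entries[key] = data
        if len(self.entries) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.entries:
            # Persist a key-ordered snapshot, then release the memory.
            self.persist(dict(sorted(self.entries.items())))
            self.entries = {}


flushed_files = []  # stands in for files written to the file system
store = MemoryStoreSketch(flush_threshold=3, persist=flushed_files.append)
for key in (5, 1, 3, 9):
    store.put(key, f"row-{key}")
```

After the third write the buffer is flushed as one file, and the fourth write starts a new buffer.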

Further, the RS node 102 is also configured with a partition storage file (region store file) for each partition. After persistently storing the data in the file system 300, the RS node 102 can add the files in which the data of each partition is stored in the file system 300 to the partition storage file corresponding to that partition; specifically, the file name corresponding to each piece of data in the partition is added under the directory of the partition storage file. Then, the RS node 102 may merge and delete the files included in each partition storage file, so as to eliminate old versions of data in the partition storage file. For example, if the client 200 requests the RS node 102 to write data A at time T1, and requests the RS node 102 to replace data A with data B at time T2 (T1<T2), the partition storage file may include file a corresponding to data A and file b corresponding to data B. When the RS node 102 merges the partition storage files, it may merge file a and file b into file b, that is, only file b corresponding to the new version, data B, is retained.
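The merge step in this example amounts to keeping, for every key, only the newest version. A minimal sketch follows, assuming each stored file maps a key to a (timestamp, data) pair; this representation is hypothetical and chosen only to mirror the T1/T2 example.

```python
def merge_store_files(store_files):
    """Merge files of {key: (timestamp, data)}, keeping the newest version per key."""
    merged = {}
    for store_file in store_files:
        for key, (timestamp, data) in store_file.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, data)
    return merged


# File a holds data A written at T1 = 1; file b holds data B written at T2 = 2.
file_a = {"row-1": (1, "A")}
file_b = {"row-1": (2, "B"), "row-2": (2, "C")}
merged = merge_store_files([file_a, file_b])
```

The old version A is dropped, and only the data written at T2 survives for `row-1`.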

In practical application, the RS nodes in the distributed database 100 will inevitably suffer failures such as power failure or maintenance restart, which cause the data in the memory of the RS node to be lost. Therefore, when the master node 101 detects that the RS node 102 has failed, it usually starts a process called the server crash procedure (SCP) to perform data recovery for the RS node 102, specifically to recover the data lost in the memory 1021 of the RS node 102. The SCP process identifies all the WAL files belonging to the failed RS node 102 in the file system 300, and designates the normally running RS node 103 to split the data processing records in each WAL file. As shown in FIG. 2, the RS node 103 splits out the data processing records in the WAL file that belong to different partitions of the RS node 102, and then creates a separate recovery file for each partition to save the data processing records in the WAL file that belong to that partition. The data in each partition is then recovered by using the data processing records in the recovery file (for example, by replaying the data processing operations), and corresponding services can continue to be provided to the client 200 based on the recovered data. Then, the RS node 103 persistently stores the recovery files corresponding to each partition to the file system 300 one by one. Specifically, when transferring each recovery file, the RS node 103 first sends a notification message to the file system 300 to notify it that a file transfer is required; after receiving the feedback from the file system 300, the RS node 103 transfers the recovery file to the file system 300; finally, after determining that the recovery file has been successfully transferred, the RS node 103 sends a close notification to the file system 300 to end the transfer of the recovery file.
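The conventional split described above can be sketched as follows: one recovery file per partition, each transferred with its own notify, transfer and close steps. The stub file system, the record layout and the file naming are all assumptions made for illustration.

```python
from collections import defaultdict


class FileSystemStub:
    """Stand-in for the file system 300 that counts interaction steps."""

    def __init__(self):
        self.files = {}
        self.interactions = 0

    def store(self, name, records):
        self.interactions += 1          # notification that a transfer is required
        self.files[name] = list(records)
        self.interactions += 1          # the transfer itself
        self.interactions += 1          # close notification


def split_wal_per_partition(wal_name, wal_records, file_system):
    """Write one recovery file per partition found in the WAL records."""
    by_partition = defaultdict(list)
    for record in wal_records:
        by_partition[record["partition"]].append(record)
    for partition, records in sorted(by_partition.items()):
        file_system.store(f"{wal_name}.partition-{partition}", records)
    return len(by_partition)


wal_records = [
    {"partition": 1, "key": 10, "data": "a"},
    {"partition": 2, "key": 20, "data": "b"},
    {"partition": 1, "key": 11, "data": "c"},
    {"partition": 3, "key": 30, "data": "d"},
]
file_system = FileSystemStub()
created = split_wal_per_partition("wal-0001", wal_records, file_system)
```

One WAL file covering three partitions already produces three recovery files and nine interaction steps with the file system.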

Normally, the RS node 103 generates N recovery files for each WAL file to record the data belonging to the N partitions in the WAL file. If the number of WAL files belonging to the faulty RS node 102 is M, then when data recovery is performed for the RS node 102, the RS node 103 needs to create (M*N) recovery files. In this way, the RS node 103 needs to perform the above-mentioned file transfer process sequentially for (M*N) recovery files. In practical application, if the number of RS nodes that need data recovery is P, then in the process of performing data recovery for the P RS nodes, the total number of recovery files that the distributed database 100 needs to generate and transmit to the file system 300 is (M*N*P).

Because a large number of recovery files need to be transmitted to the file system in the data recovery process, and the RS node 103 needs to perform the above-mentioned notification, transmission, and closing steps for each recovery file, the input/output (IO) with which the RS node 103 transmits files to the file system 300 is relatively large, and the system calls and resource consumption of the file system 300 are relatively high. At the same time, the large number of interactions between the RS node 103 and the file system 300 also leads to low data recovery efficiency of the distributed database 100. Moreover, as the scale of the RS nodes in the distributed database 100 increases, the mean time to recover (MTTR) of the distributed database 100 increases (for example, approximately exponentially), which limits the scalability of the distributed database 100. For example, in an actual test, the data recovery test results for 100 RS nodes show that the relationship between the MTTR of the distributed database 100 and the number of RS nodes included in the distributed database 100 is as shown in FIG. 3: the MTTR grows approximately exponentially, and the scalability of the distributed database 100 is low.

To this end, the embodiments of the present application provide a data recovery method, so as to improve data recovery efficiency and reduce resource consumption. Specifically, after the SCP process is started (that is, after data recovery is started), the RS node 103 first obtains the WAL file created by the RS node 102 before the failure occurred, where the WAL file includes data processing records of N partitions to be recovered that belong to the RS node 102; each partition to be recovered is a partition whose data in the memory 1021 of the failed RS node 102 was lost. Then, the RS node 103 may write the data processing records of the respective partitions to be recovered in each WAL file into a recovery file, where the recovery file includes a plurality of data recording areas, and different data recording areas record the data processing records of different partitions to be recovered, as shown in FIG. 4. Finally, the RS node 103 persistently stores the recovery file, which integrates the data processing records in one or more WAL files, to the file system 300. Correspondingly, the data of each partition lost in the memory 1021 of the RS node 102 is the data to be written in the data processing records belonging to that partition in each recovery file.
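One way such a recovery file could be laid out is sketched below: the data recording areas are written back to back, followed by an index of offsets and a fixed-size footer pointing at the index. This byte layout (JSON areas, JSON index, 8-byte big-endian footer) is purely an assumption for illustration; the application only requires that different areas hold the records of different partitions and, optionally, that index information gives their position offsets.

```python
import io
import json
import struct
from collections import defaultdict


def build_recovery_file(wal_records):
    """Pack the WAL records of all partitions into one recovery file (bytes)."""
    by_partition = defaultdict(list)
    for record in wal_records:
        by_partition[record["partition"]].append(record)

    body = io.BytesIO()
    index = {}
    for partition in sorted(by_partition):
        area = json.dumps(by_partition[partition]).encode("utf-8")
        # Record where this partition's data recording area starts.
        index[str(partition)] = {"offset": body.tell(), "length": len(area)}
        body.write(area)

    index_offset = body.tell()
    body.write(json.dumps(index).encode("utf-8"))
    body.write(struct.pack(">Q", index_offset))  # footer: where the index starts
    return body.getvalue()


def read_partition(recovery_file, partition):
    """Use the index to read only one partition's data processing records."""
    (index_offset,) = struct.unpack(">Q", recovery_file[-8:])
    index = json.loads(recovery_file[index_offset:-8])
    entry = index[str(partition)]
    start = entry["offset"]
    return json.loads(recovery_file[start:start + entry["length"]])


wal_records = [
    {"partition": 1, "key": 10, "data": "a"},
    {"partition": 2, "key": 20, "data": "b"},
    {"partition": 1, "key": 11, "data": "c"},
]
recovery_file = build_recovery_file(wal_records)
partition_1 = read_partition(recovery_file, 1)
```

A single object is transferred to the file system, and a reader recovering partition 1 seeks directly to its area instead of scanning the records of other partitions.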

Since the data processing records of each partition to be recovered are written into the corresponding data recording areas of the same recovery file, when data recovery is performed based on one (or more) WAL files, only one recovery file needs to be transferred to the file system, without performing multiple transfers for multiple files. In this way, when the number of WAL files belonging to the faulty RS node 102 is M, the number of recovery files to be transmitted in the process of data recovery for the RS node 102 can be reduced from (M*N) to M (or to a value less than M). This effectively reduces the number of files that need to be transferred to the file system 300 in the data recovery process, thereby reducing the IO with which the RS node 103 transfers files to the file system 300, reducing resource consumption, and increasing the scalability of the distributed database 100. At the same time, the data recovery efficiency of the distributed database 100 is also improved because the number of interactions between the RS node 103 and the file system 300 is reduced.
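As a worked example of the reduction from (M*N) to M, the small calculation below uses hypothetical values of M = 20 WAL files and N = 50 partitions; these numbers are illustrative only and are not measurements from the application.

```python
def files_to_transfer(wal_files, partitions, one_file_per_partition):
    """Count recovery files sent to the file system for one failed RS node."""
    files_per_wal = partitions if one_file_per_partition else 1
    return wal_files * files_per_wal


M, N = 20, 50  # hypothetical: 20 WAL files, 50 partitions to be recovered
separate_files = files_to_transfer(M, N, one_file_per_partition=True)
combined_files = files_to_transfer(M, N, one_file_per_partition=False)
reduction_factor = separate_files // combined_files
```

Under these assumed values, 1,000 separate transfers shrink to 20, a reduction by a factor of N.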

Next, various non-limiting specific implementations of data recovery are described in detail.

FIG. 5 is a schematic flowchart of a data recovery method in an embodiment of the present application. This method can be applied to any RS node that can operate normally in the distributed database 100 shown in FIG. 1, including the RS node 103 and the RS node 102. For example, when the RS node 102 resumes operation after restarting, the RS node 102 can itself recover the data lost in the memory 1021 according to the WAL file. Alternatively, the method may be applied to a device that is separately configured in the distributed database 100 specifically for performing fault data recovery. This embodiment does not limit this. For ease of description, the following description takes the data recovery method performed by the RS node 103 as an example. The data recovery method shown in FIG. 5 may specifically include:

S501: The master node 101 determines that the RS node 102 is faulty, and instructs the RS node 103 to perform data recovery.

As an implementation example, each RS node in the distributed database 100 may periodically send a heartbeat message to the master node 101. In this way, when the master node 101 normally receives the heartbeat message sent by the RS node 102, it can determine that the RS node 102 is not faulty; and when the master node 101 does not receive the heartbeat message from the RS node 102, it can determine that the RS node 102 is faulty.
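Heartbeat-based detection of this kind can be sketched as follows. The timeout value and the use of explicit timestamps are assumptions made for illustration; the text above only states that a missing heartbeat indicates a fault.

```python
class HeartbeatMonitorSketch:
    """Master-side bookkeeping of the last heartbeat seen from each RS node."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds  # assumed tolerance before declaring a fault
        self.last_seen = {}

    def record_heartbeat(self, node_id, now):
        self.last_seen[node_id] = now

    def faulty_nodes(self, now):
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node_id
            for node_id, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitorSketch(timeout_seconds=10)
monitor.record_heartbeat("rs-102", now=0)
monitor.record_heartbeat("rs-103", now=0)
monitor.record_heartbeat("rs-103", now=8)   # rs-102 stays silent after t = 0
faulty = monitor.faulty_nodes(now=12)
```

At time 12 only `rs-102` has been silent for longer than the assumed 10-second timeout, so it is the node reported as faulty.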

In other possible fault detection methods, when the RS node 102 in the distributed database 100 fails without losing its ability to communicate with the master node 101 (for example, when the RS node restarts abnormally), the RS node 102 can send a failure notification to the master node 101 to notify the master node 101 of its failure. In this embodiment, the specific manner in which the master node 101 detects the faulty RS node is not limited.

In practical application, when the master node 101 determines that an RS node is faulty, the master node 101 can start the SCP process, identify the failed RS node 102, and instruct another RS node that has not failed, that is, the RS node 103 in FIG. 1, to perform the corresponding data recovery procedure. Exemplarily, the master node 101 may send a data recovery instruction to the RS node 103, so as to instruct the RS node 103 to perform data recovery on the multiple partitions to be recovered in the RS node 102.

In practical applications, when the distributed database 100 contains normally operating RS nodes other than the RS node 103, the master node 101 may also instruct one of those RS nodes to perform the data recovery process. Exemplarily, when there are a large number of non-faulty RS nodes in the distributed database 100, the master node 101 may instruct the RS node with the least load to perform the data recovery process according to the load of each non-faulty RS node.
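As a minimal sketch of the load-based selection described above, the following function picks the non-faulty RS node with the smallest load. The function name, the representation of load as a single number, and the tie-breaking by node identifier are assumptions for illustration only.

```python
from typing import Dict, Set


def pick_recovery_node(loads: Dict[str, float], faulty: Set[str]) -> str:
    """Return the least-loaded RS node that is not in the faulty set.

    `loads` maps a node identifier to an abstract load figure (assumed metric).
    Ties are broken by node identifier so the result is deterministic.
    """
    candidates = {node: load for node, load in loads.items() if node not in faulty}
    if not candidates:
        raise ValueError("no non-faulty RS node is available")
    return min(candidates, key=lambda node: (candidates[node], node))
```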

S502: The RS node 103 obtains a WAL file created by the RS node 102 before the failure occurs, where the WAL file includes data processing records belonging to multiple partitions to be restored in the RS node 102.

Exemplarily, when instructing the RS node 103 to perform data recovery, the master node 101 may send the identification of the failed RS node 102, for example, the identity (ID) or factory serial number of the RS node 102, to the RS node 103 to indicate for which RS node data recovery is to be performed. The RS node 103 can then obtain the WAL file belonging to the faulty RS node 102 by accessing the file system 300. Since the WAL file was created by the RS node 102 before the failure occurred and records the data written to each partition in the RS node 102, the RS node 103 can use the obtained WAL file to perform data recovery on each partition of the RS node 102 (for ease of distinction, a partition in the RS node 102 is hereinafter referred to as a partition to be recovered). The number of WAL files belonging to the RS node 102 may be one or more.

As an example of obtaining the WAL file, when storing data for each RS node, the file system 300 may create a folder for each RS node, and add the WAL files created by an RS node to the folder corresponding to that RS node. In this way, when the RS node 103 obtains the WAL file created by the RS node 102, it can access the folder corresponding to the RS node 102 in the file system 300 to obtain the required WAL file. Of course, the RS node 103 may also obtain the WAL file in other ways. For example, when the WAL files created by each RS node are stored in the file system 300, the file name of each WAL file may include the identifier of the RS node (such as the name of the RS node), so that the WAL files carrying the identifier of the RS node 102 can be found in the file system 300. In this embodiment, the specific implementation manner for the RS node 103 to acquire the WAL file is not limited.
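The two lookup strategies described above can be sketched as follows. A local directory stands in for the file system 300, and the file naming scheme `<node_id>.<sequence>.wal` is a hypothetical convention chosen only to make the second strategy concrete.

```python
from pathlib import Path
from typing import List


def wal_files_in_node_folder(fs_root: Path, node_id: str) -> List[Path]:
    """Strategy 1: one folder per RS node; every file in it is a WAL file."""
    folder = fs_root / node_id
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir() if p.is_file())


def wal_files_by_name(fs_root: Path, node_id: str) -> List[Path]:
    """Strategy 2: the node identifier is embedded in each WAL file name.

    Assumes the hypothetical naming scheme "<node_id>.<sequence>.wal".
    """
    prefix = node_id + "."
    return sorted(
        p for p in fs_root.rglob("*.wal")
        if p.is_file() and p.name.startswith(prefix)
    )
```

Matching on the identifier followed by a separator, rather than on a bare substring, avoids confusing one node identifier with a longer identifier that merely begins with the same characters.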

S503: The RS node 103 writes the data processing records belonging to each to-be-restored partition in the WAL file into the restoration file, wherein different data recording areas in the restoration file record data processing records belonging to different to-be-restored partitions.

As an example, after obtaining the WAL file, the RS node 103 can determine according to the WAL file whether data loss has occurred in the memory 1021 of the RS node 102, and, when it is determined that data has been lost, recover the lost data of the memory 1021 of the RS node 102 according to the WAL file. For example, the RS node 103 can determine whether the acquired WAL file contains a data processing record without a persistent mark. If so, this indicates that the data to be written in that data processing record had not yet been persisted from the memory 1021 to the file system 300, so the RS node 103 can restore the data of the memory 1021 based on the data processing records without persistent marks. For a data processing record with a persistent mark, by contrast, the data to be written in the record has already been persisted from the memory 1021 to the file system 300, and the RS node 103 does not need to restore that data to the memory 1021.
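The distinction between records with and without a persistent mark can be sketched as a simple filter. The `WalRecord` structure and its field names are hypothetical; the embodiment does not define the on-disk layout of a data processing record or of the persistent mark.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical in-memory view of one data processing record."""
    partition: str
    key: str
    value: str
    persisted: bool  # True if the record carries a persistent mark


def records_to_replay(records: Iterable[WalRecord]) -> List[WalRecord]:
    """Keep only records whose data was never persisted to the file system."""
    return [rec for rec in records if not rec.persisted]


def data_was_lost(records: Iterable[WalRecord]) -> bool:
    """Data loss is assumed whenever at least one record lacks the mark."""
    return any(not rec.persisted for rec in records)
```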

Further, the WAL file obtained by the RS node 103 may contain some data processing records that are irrelevant to the data held in the memory 1021 before the failure of the RS node 102. For example, the data to be written in some data processing records is an old version, while the data in the memory 1021 of the RS node 102 before the failure is a new version; or the data to be written in some data processing records belongs to partitions of the RS node 102 that have been deleted. Since these data processing records contribute little to data recovery, the RS node 103 can also filter the acquired WAL files to reduce the number of WAL files involved in the data recovery calculation, thereby reducing the amount of computation required in the data recovery process and reducing resource consumption. Exemplarily, when filtering old-version data processing records, the RS node 103 can obtain the data processing records with the same primary key in the WAL file, determine the old-version records according to the timestamp of each record, filter out the old-version records, and retain the new-version record. When filtering the data of deleted partitions, the RS node 103 can check whether the partition identifier (such as the partition name) in each data processing record in the WAL file matches the partition identifier of any partition currently to be restored. If the partition identifier of a data processing record does not match the partition identifier of any such partition, it is usually the identifier of a partition that has been deleted from the RS node 102, and the RS node 103 determines to filter out that data processing record.
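The two filtering rules described above (drop records of deleted partitions, and keep only the newest record per primary key) can be sketched as follows. The `Record` fields, the use of an integer timestamp, and the rule that a later record wins on equal timestamps are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Record(NamedTuple):
    """Hypothetical data processing record."""
    partition: str
    primary_key: str
    timestamp: int
    payload: str


def filter_records(records: Iterable[Record], live_partitions: Set[str]) -> List[Record]:
    """Drop records of deleted partitions and old versions of the same key."""
    newest: Dict[Tuple[str, str], Record] = {}
    for rec in records:
        # A partition identifier that matches no partition to be restored
        # is treated as belonging to a deleted partition.
        if rec.partition not in live_partitions:
            continue
        slot = (rec.partition, rec.primary_key)
        current = newest.get(slot)
        # Assumed tie-break: on equal timestamps the later record wins.
        if current is None or rec.timestamp >= current.timestamp:
            newest[slot] = rec
    return list(newest.values())
```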

Normally, the data processing records recorded in the WAL file may belong to different partitions to be restored of the RS node 102, and before the failure occurred the RS node 102 could provide the clients of different partitions with corresponding data read (and write) services based on the data belonging to those partitions. Correspondingly, when the RS node 103 performs data recovery, it can split the data processing records in the WAL file, and determine the data processing records belonging to each partition to be recovered.

In a possible implementation manner, the data processing records recorded in the WAL file may exist in the form of key-value (KV) pairs, and different key-value pairs may belong to different partitions to be restored. Exemplarily, the key in a key-value pair may indicate a to-be-restored partition, for example by carrying the identifier of the to-be-restored partition, and the value in the key-value pair is a data processing record belonging to that to-be-restored partition. Then, when the RS node 103 splits the data processing records of each WAL file, it can read each key-value pair in the WAL file and determine, according to the key, the to-be-restored partition to which the value belongs.
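The key-based splitting described above can be sketched as a grouping step. The function name and the choice of a plain string for the partition identifier and bytes for the record are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_by_partition(pairs: Iterable[Tuple[str, bytes]]) -> Dict[str, List[bytes]]:
    """Group WAL key-value pairs by the partition identifier in the key.

    Each pair is (partition_id, data_processing_record). The order of records
    within a partition is preserved, which matters for later replay.
    """
    grouped: Dict[str, List[bytes]] = defaultdict(list)
    for partition_id, record in pairs:
        grouped[partition_id].append(record)
    return dict(grouped)
```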

For the data processing records in the WAL file that belong to different partitions to be restored, the RS node 103 may store them in the same file (hereinafter referred to as the recovery file or restoration file). That is, when the RS node 103 splits the data of one or more WAL files, it creates only one recovery file to record the data processing records of the different partitions to be restored. The recovery file includes multiple non-overlapping data recording areas, each data recording area is used to record the data processing records belonging to one partition to be restored, and different data recording areas record the data processing records belonging to different partitions to be restored. The recovery file may include index information, and the index information may be used to indicate the position offset, in the recovery file, of the data recording area corresponding to each partition to be restored.

In a possible implementation manner, the position space between two adjacent position offsets indicated in the index information can be used as a data recording area for recording data processing records belonging to the same partition to be restored. The position offset may be, for example, the head address of the data recording area. For example, assuming that the RS node 102 includes 4 partitions to be restored, and the position space for recording data processing records in the recovery file spans logical addresses A to E, the correspondence between the partitions to be restored and the data recording areas in the recovery file can be as shown in FIG. 6: the data processing records belonging to the partition f1 to be restored are stored in the data recording area [logical address A, logical address B); the data processing records belonging to the partition f2 to be restored are stored in the data recording area [logical address B, logical address C); the data processing records belonging to the partition f3 to be restored are stored in the data recording area [logical address C, logical address D); and the data processing records belonging to the partition f4 to be restored are stored in the data recording area [logical address D, logical address E]. The index information included in the recovery file may include 4 key-value pairs, wherein the key in key-value pair 1 is the identifier of the partition f1 to be restored, the value in key-value pair 1 is the logical address A, and so on for the remaining key-value pairs. In this way, after the data processing records belonging to each to-be-restored partition obtained by splitting the WAL file are written into the data recording area corresponding to that partition in the recovery file, the to-be-restored partition to which a data processing record belongs can subsequently be determined from the data recording area in which the record is located, and the data processing records belonging to different to-be-restored partitions can be distinguished without creating a separate file for each to-be-restored partition. In this way, the number of files that the RS node 103 needs to create when performing data recovery based on the WAL file can be reduced, thereby reducing processes such as file creation, movement, and deletion, and improving data recovery efficiency.
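The layout described above, in which each partition owns the region between its own offset and the next offset in the index, can be sketched as follows. The 4-byte length prefix used to frame each record, the function names, and keeping the index in memory rather than serializing it into the file are assumptions for illustration; the embodiment does not fix a record framing or an index encoding.

```python
import struct
from typing import Dict, List, Tuple


def build_recovery_file(
    records_by_partition: Dict[str, List[bytes]],
) -> Tuple[bytes, Dict[str, int]]:
    """Lay out one data recording area per partition and index its offset."""
    body = bytearray()
    index: Dict[str, int] = {}
    for partition_id, records in records_by_partition.items():
        index[partition_id] = len(body)  # head address of this area
        for rec in records:
            # Assumed framing: big-endian 4-byte length, then the record bytes.
            body += struct.pack(">I", len(rec)) + rec
    return bytes(body), index


def read_partition(body: bytes, index: Dict[str, int], partition_id: str) -> List[bytes]:
    """Read back the records in the area [offset, next offset) of a partition."""
    ordered = list(index)  # dict order equals layout order
    i = ordered.index(partition_id)
    start = index[partition_id]
    end = index[ordered[i + 1]] if i + 1 < len(ordered) else len(body)
    records: List[bytes] = []
    pos = start
    while pos < end:
        (size,) = struct.unpack_from(">I", body, pos)
        pos += 4
        records.append(body[pos:pos + size])
        pos += size
    return records
```

Taking the end of an area from the next index entry in layout order, rather than from the next larger offset, keeps empty areas readable: a partition with no records simply has an area of length zero.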

In the above embodiment, each data recording area in the recovery file is used directly to store data processing records. In other possible embodiments, the RS node 103 can also create a separate sub-file for each partition to be restored, and the sub-file corresponding to each partition to be restored is used to record the data processing records in the WAL file that belong to that partition. In this way, for M WAL files and N partitions to be restored, the number of sub-files generated by the RS node 103 is (M*N). To reduce the large number of interactions that would be generated by transmitting so many sub-files to the file system 300, the RS node 103 can package a plurality of sub-files into one recovery file, so that when the RS node 103 transmits the plurality of sub-files to the file system 300, the file transfer process needs to be performed only once, thereby reducing the number of interactions between the RS node 103 and the file system 300; moreover, changes to existing solutions can be kept small, which improves the feasibility of implementing the solution. As an implementation example, for each WAL file, the RS node 103 may write the N created sub-files (corresponding to the N partitions to be restored) into the corresponding data recording areas in the recovery file. The index information may specifically be the position offset (that is, the data recording area) at which the sub-file corresponding to each to-be-restored partition is stored in the recovery file. The sub-file of each partition to be restored is used to record the data processing records belonging to that partition, and the data processing records recorded by different sub-files are different. Of course, in other examples, the N sub-files corresponding to each of multiple WAL files may all be written into the same recovery file, which is not limited in this embodiment.
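The packaging of several sub-files into one recovery file can be sketched with an in-memory tar archive. A tar archive is used here only as one possible container, and the function names and sub-file names are hypothetical; the embodiment does not prescribe a particular archive library.

```python
import io
import tarfile
from typing import Dict


def pack_subfiles(subfiles: Dict[str, bytes]) -> bytes:
    """Package per-partition sub-files into a single archive (one transfer)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in subfiles.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_subfile(archive: bytes, name: str) -> bytes:
    """Extract the sub-file of one partition from the packaged archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        member = tar.extractfile(name)  # raises KeyError if name is absent
        if member is None:
            raise KeyError(name)
        return member.read()
```

In this sketch the archive's own member table plays the role of the index information, since it records where each sub-file is located inside the packaged file.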

S504: The RS node 103 transmits the recovery file to the file system 300, so that the file system 300 can persistently store the recovery file.

The RS node 103 can perform split processing on a plurality of WAL files based on the above process, so that the data processing records belonging to each to-be-restored partition can be recovered, and the recovery files can be persistently stored in the file system 300. In this way, when the RS node 103 performs data recovery for the N partitions to be recovered based on the M WAL files, the number of files transferred to the file system 300 (or the number of file transfers) does not exceed M.
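The reduction in transferred files can be made concrete with a small calculation. The function below assumes the worst case stated above for the packaged scheme (one recovery file per WAL file) and one file per WAL file and partition pair for the unpackaged scheme; the function name and parameters are hypothetical.

```python
def files_to_transfer(num_wal_files: int, num_partitions: int, packed: bool) -> int:
    """Upper bound on the number of files sent to the file system.

    packed=False: one sub-file per (WAL file, partition) pair, i.e. M*N.
    packed=True:  at most one recovery file per WAL file, i.e. M.
    """
    if num_wal_files < 0 or num_partitions < 0:
        raise ValueError("counts must be non-negative")
    if packed:
        return num_wal_files
    return num_wal_files * num_partitions


print(files_to_transfer(3, 100, packed=False))  # 300
print(files_to_transfer(3, 100, packed=True))   # 3
```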

For the data lost in the memory 1021 that belongs to each to-be-restored partition, the RS node 103 can replay the data processing operations in the data processing records corresponding to that partition, and thereby recover the data that was lost from the memory 1021 of the RS node 102 when the failure occurred. In this way, the RS node 103 can provide the client 200 with services for reading, writing, or deleting data based on the recovered data belonging to the respective partitions to be restored, for example, queries on the written data. While the RS node 103 is performing data recovery, the RS node 103 may provide the client 200 with services such as writing data and deleting data.
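Replaying the data processing operations of one partition can be sketched as applying each operation, in order, to an in-memory map. The operation names `put` and `delete` and the tuple layout are assumptions for illustration; the embodiment does not enumerate the operation types stored in a data processing record.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical operation tuple: (operation, key, value or None).
Operation = Tuple[str, str, Optional[str]]


def replay(operations: Iterable[Operation]) -> Dict[str, str]:
    """Rebuild the in-memory state of one partition from its records."""
    memstore: Dict[str, str] = {}
    for op, key, value in operations:
        if op == "put":
            if value is None:
                raise ValueError("put requires a value")
            memstore[key] = value
        elif op == "delete":
            memstore.pop(key, None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {op}")
    return memstore
```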

Further, after the RS node 103 persistently stores the recovery files corresponding to the WAL files in the file system 300, it can bring the partitions to be recovered back online based on these recovery files. That is, the RS node 103 can again provide services such as reading and writing to clients of the distributed database 100 based on the partitions to be recovered, and inform the master node 101 that the partitions to be recovered are now among the normally running partitions, so that the master node 101 can manage them.

In one example, when the RS node 103 brings a target to-be-restored partition back online, it can read the recovery file corresponding to the target to-be-restored partition from the file system 300, and bring the partition back online based on the data processing records in the recovery file.

In another example, the RS node 103 may bring the partition to be restored back online based on a recovery file held in a cache. Specifically, the RS node 103 is configured with a cache, and after generating the corresponding recovery file based on the WAL file, the RS node 103 may store the recovery file in the cache of the RS node 103 before transmitting it to the file system 300. In this way, when the target partition to be recovered is brought back online, the RS node 103 can read the recovery file directly from the cache, and use the data processing records in the recovery file that belong to the target partition to be recovered to provide the client 200 with data reading services (including queries, modifications, and other operations that involve reading data). In this way, when the RS node 103 brings the target to-be-restored partition back online, it does not need to use remote calls to read the recovery file and its related information (such as file size and data length) from the distributed file system 300, which reduces system calls and the corresponding resource consumption.

Further, after the recovery file stored in the cache of the RS node 103 has been persistently stored in the file system 300, or after the RS node 103 has used the recovery file in the cache to bring the partition to be recovered back online, the RS node 103 can clear the cache, so as to release the cache resources of the RS node 103 and avoid, as far as possible, long-term occupation of those resources during the data recovery process. As some examples, the format of the recovery file may be an archive (archival) file format, such as the "*.har" format or the "*.tar" format. In this embodiment, the specific implementation of the archive file format is not limited.
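The cache lifecycle described above (store the recovery file in the cache, serve reads from it, then release it once it has been persisted) can be sketched as follows. The class `RecoveryFileCache`, the function `persist_and_release`, and the `upload` callback that stands in for the transfer to the file system 300 are all hypothetical names.

```python
from typing import Callable, Dict, Optional


class RecoveryFileCache:
    """Hypothetical in-memory cache of recovery files, keyed by file name."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def get(self, name: str) -> Optional[bytes]:
        return self._files.get(name)

    def evict(self, name: str) -> None:
        self._files.pop(name, None)

    def __len__(self) -> int:
        return len(self._files)


def persist_and_release(
    cache: RecoveryFileCache,
    name: str,
    upload: Callable[[str, bytes], None],
) -> None:
    """Send a cached recovery file to persistent storage, then free the cache.

    If `upload` raises, the cached copy is kept so the transfer can be retried.
    """
    data = cache.get(name)
    if data is None:
        raise KeyError(name)
    upload(name, data)
    cache.evict(name)
```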

In practical application, after the RS node 103 brings the target partition to be restored back online, it can also notify the master node 101 to update, in the management table, the partitions that are allocated to the RS node 103 and for which it can provide services, so that the master node 101 can further manage the partitions allocated to each RS node based on the updated management table. For example, if the master node 101 determines based on the updated management table that too many partitions are allocated to some RS nodes, some of the partitions on those RS nodes can be transferred to other RS nodes to balance the allocation of partitions across the RS nodes.

Further, after the RS node 103 persistently stores the recovery file in the distributed file system 300, it can also update the partition storage file corresponding to each partition to be recovered, specifically by merging and deleting files in the directory of the partition storage file, so as to remove old versions of the data stored in the partition.

It should be noted that, in the above-mentioned embodiments, the data recovery process performed by the RS node 103 that is not faulty is taken as an example for description. In other possible embodiments, data recovery for the RS node 102 may also be performed by a device independently configured in the distributed database 100. Alternatively, when the RS node 102 resumes operation after a failure, the master node 101 may also instruct the RS node 102 to perform data recovery. Hereinafter, another data recovery method provided by the embodiments of the present application is described, taking as an example the case in which the RS node 102, having resumed operation after a failure, recovers the data of its own partitions.

Referring to FIG. 7, it is a schematic flowchart of a data recovery method. The method is mainly applied to the RS node 102, and the method may specifically include:

S701: After determining that the RS node 102 is faulty, the master node 101 further determines whether the RS node 102 resumes operation within a preset time period.

S702: When it is determined that the RS node 102 resumes operation within the preset time period, the master node 101 sends a data recovery instruction to the RS node 102 to instruct the RS node 102 to perform data recovery on the multiple to-be-restored partitions.

For the specific implementation process by which the master node 101 detects the faulty RS node, reference may be made to the description of step S501 in the foregoing embodiment, which is not repeated here.

In this embodiment, after determining that the RS node 102 is faulty, the master node 101 can first wait to see whether the RS node 102 resumes normal operation within a preset time period (e.g., 3 minutes or 5 minutes) after the fault occurs. If the RS node 102 resumes operation within that period, the master node 101 can arrange for the RS node 102 to perform the failure recovery by itself; if the RS node 102 fails to resume operation in time, the master node 101 can arrange for another RS node (e.g., the RS node 103 in the aforementioned embodiment) to perform fault recovery for the partitions to be recovered in the RS node 102.
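The waiting policy described above can be sketched as a single decision function. The function name and parameters are hypothetical, and representing "did not resume" as `None` is an assumption made for illustration.

```python
from typing import Optional


def choose_recovery_executor(
    failed_node: str,
    resumed_after_s: Optional[float],
    wait_s: float,
    fallback_node: str,
) -> str:
    """Decide which RS node should perform data recovery.

    resumed_after_s is the time (in seconds after the fault) at which the
    failed node resumed operation, or None if it has not resumed.
    wait_s is the preset waiting period (for example 180 for 3 minutes).
    """
    if resumed_after_s is not None and resumed_after_s <= wait_s:
        return failed_node  # the node recovers its own partitions
    return fallback_node    # another non-faulty node takes over
```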

Wherein, when instructing the RS node 102 to restore data by itself, the master node 101 may send a data restoration instruction to the RS node 102 to instruct the RS node 102 to restore the data by itself.

S703: The RS node 102 obtains the WAL file created by the RS node 102 from the file system 300, where the WAL file includes data processing records belonging to multiple to-be-restored partitions in the RS node 102.

During specific implementation, the RS node 102 accesses the WAL folder corresponding to the RS node 102 in the file system 300 according to the received data recovery instruction, and reads from the WAL folder the multiple WAL files that have been created. Normally, the WAL files created by the RS node 102 in the process of writing data for the client 200 are added to the WAL folder corresponding to the RS node 102. Therefore, the RS node 102 can perform data recovery for its multiple partitions to be recovered based on these WAL files.

It is worth noting that, after the RS node 102 resumes operation, it can read the WAL file from the local file system 300, that is, it can obtain the WAL file without remote access to the file system 300. In this way, the remote network bandwidth occupied by the distributed database during data recovery can be effectively reduced.

Further, the RS node 102 can also filter the acquired WAL files to filter out some WAL files unrelated to the data written in the memory 1021 before the failure of the RS node 102, thereby reducing the number of WAL files involved in the data recovery calculation. In turn, the amount of computation that needs to be performed in the data recovery process can be reduced, and resource consumption can be reduced.

S704: The RS node 102 writes data processing records belonging to multiple partitions to be restored in the WAL file into the restoration file, wherein different data recording areas in the restoration file record data processing records belonging to different partitions to be restored.

In this embodiment, the RS node 102 can directly write the data processing records belonging to each partition to be restored in the WAL file into a pre-created recovery file. The recovery file includes a plurality of different data recording areas, each to-be-restored partition corresponds to at least one data recording area in the recovery file, and the data recording areas corresponding to different to-be-restored partitions do not overlap. Correspondingly, when the RS node 102 writes a data processing record belonging to a to-be-restored partition into the recovery file, it writes the record into the data recording area corresponding to that partition. In this way, the RS node 102 can subsequently determine the to-be-restored partition to which a data processing record belongs based on the position of the record in the recovery file. In practical application, the correspondence between the partitions to be restored and the data recording areas in the recovery file can be recorded by index information, and the index information can be integrated into the recovery file.

Alternatively, in other possible implementations, when splitting the data processing records in each WAL file, the RS node 102 may create a sub-file for each partition to be restored, and the sub-file corresponding to each partition to be restored is used to record the data processing records in the WAL file that belong to that partition. In this case, for M WAL files and N partitions to be restored, the RS node 102 may create (M*N) sub-files. For this relatively large number of sub-files, the RS node 102 may add multiple sub-files to the same recovery file. For example, the RS node 102 may package N sub-files into one recovery file, so that the (M*N) sub-files are packaged into M recovery files. Each recovery file includes a plurality of data recording areas and corresponding index information. The sub-file corresponding to a target to-be-restored partition may be added by the RS node 102 to the data recording area corresponding to that partition in the recovery file, and the one-to-one correspondence between the to-be-restored partitions and the data recording areas in the recovery file can be recorded through the index information in the recovery file. In this way, when the RS node 102 transmits the multiple sub-files to the file system 300, since they are packaged into one recovery file, the RS node 102 needs to perform the file transfer process only once, thereby reducing the number of interactions between the RS node 102 and the file system 300; moreover, changes to existing solutions can be kept small, and the feasibility of implementing the solution can be improved.

S705: The RS node 102 transmits the recovery file to the file system 300 for persistent storage, and realizes the re-online of the partition to be recovered.

Wherein, when the RS node 102 re-launches the target to-be-restored partition, the RS node 102 can read the recovery file corresponding to the target to-be-restored partition from the file system 300, and re-launch the to-be-restored partition based on the data processing records in the recovery file. Alternatively, when the RS node 102 performs data recovery, the generated recovery file may be stored in the cache, so that the RS node 102 may directly read the recovery file from the cache, and use the data recorded in the recovery file and belonging to the target partition to be recovered The processing record provides the client 200 with a data read service.

The data recovery method provided by the embodiments of the present application is described above with reference to FIGS. 1 to 7 . Next, the data recovery apparatus and computing device for performing the above data recovery provided by the embodiments of the present application are described with reference to the accompanying drawings.

FIG. 8 is a schematic structural diagram of a data recovery apparatus provided by the present application. The data recovery apparatus 800 can be applied to any node in the above-mentioned distributed database, such as the RS node 102 that resumes operation after a failure or the RS node 103 that has not failed, or to a device in the distributed database 100 that is dedicated to fault data recovery. The data recovery apparatus 800 includes:

The obtaining module 801 is configured to obtain, after the service crash processing (SCP) process is started, the write-ahead log (WAL) file created by the first partition server (RS) node in the distributed database before the failure occurs, where the WAL file includes the data processing records of multiple partitions to be restored that belong to the first RS node;

The writing module 802 is configured to write the data processing records of each partition to be recovered in the WAL file into the recovery file, wherein different data recording areas in the recovery file record the data processing records belonging to different partitions to be recovered;

The transmission module 803 is configured to transmit the recovery file to the file system for persistent storage.

In a possible implementation manner, the data recovery apparatus 800 is applied to the first RS node (such as the RS node 102 in FIG. 1) after failure recovery, or is applied to other equipment in the distributed database, wherein the other equipment includes a second RS node (such as the RS node 103 in FIG. 1) that is not faulty in the distributed database, or a device in the distributed database that is dedicated to fault data recovery.

In a possible implementation manner, the restoration file includes index information, where the index information is used to indicate the position offset of the data recording area corresponding to each partition to be restored in the restoration file.

In a possible implementation manner, the data recovery apparatus 800 further includes:

a storage module 804, configured to store the restored file in a cache before transmitting the restored file to the file system;

The service module 805 is configured to provide services for the clients of the distributed database by using the restored files stored in the cache.

In a possible implementation, the device further includes:

The data clearing module 806 is configured to clear the restored file stored in the cache after the restored file is transmitted to the file system.

In a possible implementation manner, the acquiring module 801 is further configured to acquire a data recovery instruction sent by the master node, where the data recovery instruction is used to instruct the execution of multiple to-be-recovered partitions in the first RS node. Data Recovery.

The data recovery apparatus 800 according to the embodiments of the present application may correspondingly execute the methods described in the embodiments of the present application, and the above-mentioned and other operations and/or functions of the modules of the data recovery apparatus 800 are respectively intended to implement the corresponding processes executed by the RS node 103 or the RS node 102 in the methods of FIG. 5 and FIG. 7, which are not repeated here for brevity.

FIG. 9 provides a computing device. As shown in FIG. 9, the computing device 900 may be, for example, the RS node 102 or the non-faulty RS node 103 in the foregoing embodiments, or a device in the distributed database 100 that is dedicated to fault data recovery, and the computing device 900 can be specifically used to implement the functions of the data recovery apparatus 800 in the embodiment shown in FIG. 8.

The computing device 900 includes a bus 901, a processor 902, and a memory 903. The processor 902 and the memory 903 communicate with each other through the bus 901.

The bus 901 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus or the like. The bus can be divided into address bus, data bus, control bus and so on. For ease of presentation, only one thick line is used in FIG. 9, but it does not mean that there is only one bus or one type of bus.

The processor 902 may be any one or more of processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).

The memory 903 may include volatile memory, such as random access memory (RAM). The memory 903 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).

The memory 903 stores executable program codes, and the processor 902 executes the executable program codes to execute the data recovery method performed by the RS node 102 or the RS node 103 to which the data recovery apparatus 800 is applied.

Embodiments of the present application also provide a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center that contains one or more available media. The usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media (eg, solid state drives), and the like. The computer-readable storage medium includes instructions that instruct a computing device to perform the data recovery method described above.

The embodiments of the present application also provide a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computing device, all or part of the processes or functions described in the embodiments of the present application are generated.

The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, or data center to another website, computer, or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave).

The computer program product may be a software installation package, which can be downloaded and executed on a computing device when the aforementioned data recovery method needs to be used.

The descriptions of the processes or structures corresponding to each of the above-mentioned drawings have their own emphasis, and for the parts that are not described in detail in a certain process or structure, reference may be made to the related descriptions of other processes or structures.

Claims

1. A data recovery method, characterized in that the method comprises:

after a service crash processing flow (SCP) is started, obtaining a write-ahead log (WAL) file created by a first partition server (RS) node in a distributed database before a failure occurs, wherein the WAL file includes data processing records of a plurality of partitions to be recovered that belong to the first RS node;

writing the data processing records of each partition to be recovered in the WAL file into a recovery file, wherein different data recording areas in the recovery file record the data processing records belonging to different partitions to be recovered; and

transferring the recovery file to a file system for persistent storage.
2. The method according to claim 1, wherein the method is executed by the first RS node after failure recovery, or by another device in the distributed database, wherein the other device includes a second RS node in the distributed database that has not failed, or a device in the distributed database dedicated to performing failure data recovery.
3. The method according to claim 1 or 2, wherein the recovery file includes index information, and the index information is used to indicate the position offset, in the recovery file, of the data recording area corresponding to each partition to be recovered.
4. The method according to claim 3, wherein the index information is specifically used to indicate the position offsets, in the recovery file, of sub-files respectively corresponding to the plurality of partitions to be recovered, wherein each sub-file is used to store data processing records of a partition to be recovered, and different sub-files are used to store different data processing records.
5. The method according to any one of claims 1 to 4, wherein the method further comprises:

before transferring the recovery file to the file system, storing the recovery file in a cache; and

serving clients of the distributed database by using the recovery file stored in the cache.
6. The method according to claim 5, wherein the method further comprises:

after the recovery file is transferred to the file system, clearing the recovery file stored in the cache.
7. The method according to any one of claims 1 to 6, wherein the method further comprises:

acquiring a data recovery instruction sent by a master node, wherein the data recovery instruction is used to instruct that data recovery be performed on the plurality of partitions to be recovered in the first RS node.
8. A data recovery apparatus, characterized in that the apparatus comprises:

an obtaining module, configured to obtain, after a service crash processing flow (SCP) is started, a write-ahead log (WAL) file created by a first partition server (RS) node in a distributed database before a failure occurs, wherein the WAL file includes data processing records of a plurality of partitions to be recovered that belong to the first RS node;

a writing module, configured to write the data processing records of each partition to be recovered in the WAL file into a recovery file, wherein different data recording areas in the recovery file record the data processing records belonging to different partitions to be recovered; and

a transmission module, configured to transfer the recovery file to a file system for persistent storage.
9. The apparatus according to claim 8, wherein the apparatus is applied to the first RS node after failure recovery, or to another device in the distributed database, wherein the other device includes a second RS node in the distributed database that has not failed, or a device in the distributed database dedicated to performing failure data recovery.
10. The apparatus according to claim 8 or 9, wherein the recovery file includes index information, and the index information is used to indicate the position offset, in the recovery file, of the data recording area corresponding to each partition to be recovered.
11. The apparatus according to claim 10, wherein the index information is specifically used to indicate the position offsets, in the recovery file, of sub-files respectively corresponding to the plurality of partitions to be recovered, wherein each sub-file is used to store data processing records of a partition to be recovered, and different sub-files are used to store different data processing records.
12. The apparatus according to any one of claims 8 to 11, wherein the apparatus further comprises:

a storage module, configured to store the recovery file in a cache before the recovery file is transferred to the file system; and

a service module, configured to serve clients of the distributed database by using the recovery file stored in the cache.
13. The apparatus according to claim 12, wherein the apparatus further comprises:

a data clearing module, configured to clear the recovery file stored in the cache after the recovery file is transferred to the file system.
14. The apparatus according to any one of claims 8 to 13, wherein the obtaining module is further configured to acquire a data recovery instruction sent by a master node, wherein the data recovery instruction is used to instruct that data recovery be performed on the plurality of partitions to be recovered in the first RS node.
15. A computing device, comprising a processor and a memory;

wherein the processor is configured to execute instructions stored in the memory to cause the computing device to perform the method of any one of claims 1 to 7.
16. A computer-readable storage medium comprising instructions which, when executed on a computing device, cause the computing device to perform the method of any one of claims 1 to 7.
17. A computer program product comprising instructions which, when run on a computing device, cause the computing device to perform the method of any one of claims 1 to 7.
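By way of illustration only, and not as part of the claims, the cache handling described in claims 5 and 6 might be sketched in Python as follows. The class name `RecoveryCache`, its method names, and the use of a plain dictionary to stand in for the file system are assumptions made for this example.

```python
class RecoveryCache:
    """Serve reads from a cached recovery file until it has been persisted."""

    def __init__(self, file_system):
        # file_system: a dict standing in for the persistent file system
        # (hypothetical; a real system would use a distributed file system).
        self.file_system = file_system
        self.cache = {}

    def stage(self, name, data):
        """Store the recovery file in the cache before it is transferred."""
        self.cache[name] = data

    def read(self, name):
        """Serve a client from the cache if possible, else from the file system."""
        if name in self.cache:
            return self.cache[name]
        return self.file_system[name]

    def persist(self, name):
        """Transfer the recovery file to the file system, then clear the cache."""
        self.file_system[name] = self.cache[name]
        del self.cache[name]
```

In this sketch, clients can be served as soon as the recovery file is staged, without waiting for the transfer to the file system to finish. The cached copy is released only after the persistent copy exists, so reads succeed at every point in the sequence.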