CN117472573A

CN117472573A - Data processing method, device and computer equipment

Info

Publication number: CN117472573A
Application number: CN202311451772.2A
Authority: CN
Inventors: 鲁伟; 范佳; 马立珂; 王子骏
Original assignee: Guangzhou Dingjia Computer Technology Co ltd
Current assignee: Guangzhou Dingjia Computer Technology Co ltd
Priority date: 2023-11-02
Filing date: 2023-11-02
Publication date: 2024-01-30

Abstract

The embodiment of the application provides a data processing method, a data processing device, computer equipment, a storage medium and a computer program product, and relates to the technical field of computers. The method comprises the following steps: responding to a data processing request triggered by a target user, and acquiring data to be processed contained in the data processing request; under the condition that the data to be processed is of a PVC type, acquiring a data processing strategy aiming at the data to be processed; distributing data to be processed to at least one processing node according to a data processing strategy; and processing the data to be processed on each processing node by using at least one processing node. The method can improve the speed of data processing, and further, can improve the efficiency of data processing.

Description

Data processing method, device and computer equipment

Technical Field

The present application relates to the field of computer technology, and in particular, to a data processing method, apparatus, computer device, storage medium, and computer program product.

Background

Data processing in a typical data processing system, such as a Kubernetes cluster, may include metadata processing and persistent data processing, where the persistent data is of the Persistent Volume Claim (PVC) type. Taking data backup as an example, a Kubernetes backup has two steps: 1. back up the metadata; 2. back up the persistent data (PVC type). The backed-up metadata may be saved to a remote store, while the persistent data is typically kept in distributed storage. In general, metadata can be read directly through the Kubernetes API, while persistent data falls into two cases: local volumes and network volumes. For network volumes in PVC-type persistent data, current data backup methods generally use a single backup node to execute the backup of the network volume data, which results in low backup speed and low efficiency.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a data processing method, apparatus, computer device, storage medium, and computer program product.

In a first aspect, the present application provides a data processing method applied to a first processing node in a data processing system. The method comprises the following steps:

responding to a data processing request triggered by a target user, and acquiring data to be processed contained in the data processing request;

acquiring a data processing strategy aiming at the data to be processed under the condition that the data to be processed is of a PVC type;

distributing the data to be processed to at least one processing node according to the data processing strategy;

and completing the processing of the data to be processed on each processing node by utilizing the at least one processing node.

In one embodiment, the data processing request comprises a data backup request; the data to be processed comprises data to be backed up; the data processing strategy comprises a data backup strategy; the processing node comprises a backup node; the first processing node is a first backup node, and the method includes: responding to the data backup request triggered by the target user, and acquiring the data to be backed up contained in the data backup request; acquiring the data backup strategy aiming at the data to be backed up under the condition that the data to be backed up is of a PVC type; distributing the data to be backed up to at least one backup node according to the data backup strategy; and backing up the data to be backed up on each backup node to an external memory corresponding to the data processing system by utilizing the at least one backup node.

In one embodiment, the data to be backed up is a local type of the PVC types; the at least one backup node comprises the first backup node; the distributing the data to be backed up to at least one backup node according to the data backup policy includes: and distributing the data to be backed up to the first backup node.

In one embodiment, the data to be backed up is a network volume type in the PVC type; the at least one backup node comprises the first backup node and at least one second backup node; the data backup strategy comprises a first data backup strategy; the first data backup strategy is distributed evenly; the distributing the data to be backed up to at least one backup node according to the data backup policy includes: and evenly distributing the data to be backed up of the network volume type to the first backup node and the at least one second backup node.

In one embodiment, the data backup policy includes a second data backup policy; the second data backup strategy is distributed proportionally based on the load capacity of each backup node; the distributing the data to be backed up to at least one backup node according to the data backup policy further includes: acquiring the data volume of the data to be backed up of the network volume type and the load capacity of each backup node; acquiring the distribution proportion of each backup node according to the data volume and the load capacity of each backup node; and distributing the data to be backed up of the network volume type to the first backup node and the at least one second backup node according to the distribution proportion.

In one embodiment, after the data to be backed up is distributed to at least one backup node according to the data backup policy, the method further includes: acquiring distributed backup data corresponding to each backup node according to the data backup strategy; creating a temporary pod corresponding to each backup node; and cloning the distributed backup data corresponding to each backup node, and mounting the cloned distributed backup data to the temporary pod corresponding to each backup node.

In one embodiment, the data processing request comprises a data recovery request; the data to be processed comprises data to be recovered; the data processing policy includes a data recovery policy; the processing node comprises a recovery node; the first processing node is a first recovery node, and the method includes: responding to the data recovery request triggered by the target user, and acquiring the data to be recovered contained in the data recovery request; acquiring the data recovery strategy aiming at the data to be recovered under the condition that the data to be recovered is of the PVC type; distributing the data to be recovered to at least one recovery node according to the data recovery strategy; and copying the data to be restored on the external memory corresponding to the data processing system to the newly created PVC resource by utilizing the at least one restoring node.

In one embodiment, after the distributing the data to be recovered to at least one recovery node according to the data recovery policy, the method further includes: acquiring allocation recovery data corresponding to each recovery node according to the data recovery strategy; creating temporary pod corresponding to each recovery node; and mounting the allocation recovery data corresponding to each recovery node to the temporary pod corresponding to each recovery node.

In a second aspect, the present application provides a data processing apparatus for use with a first processing node in a data processing system. The device comprises:

the first acquisition module is used for responding to a data processing request triggered by a target user and acquiring data to be processed contained in the data processing request;

the second acquisition module is used for acquiring a data processing strategy aiming at the data to be processed under the condition that the data to be processed is of a PVC type;

the distribution module is used for distributing the data to be processed to at least one processing node according to the data processing strategy;

and the processing module is used for completing the processing of the data to be processed on each processing node by utilizing the at least one processing node.

In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:

In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:

The data processing method, the device, the computer equipment, the storage medium and the computer program product can be applied to a first processing node in a data processing system. A target user can trigger a data processing request at a data processing client, and the first processing node can respond to the data processing request to acquire the data to be processed contained in the data processing request; then, the first processing node may check the type of the data to be processed, and obtain a data processing policy for the data to be processed when determining that the data to be processed is of a PVC type, where the data processing policy may include an identifier of the data to be processed, an allocation policy of the data to be processed, and the like; further, the first processing node may distribute the data to be processed to at least one processing node according to the data processing policy; thus, the first processing node may utilize the at least one processing node to complete processing of the data to be processed on each processing node. In the method provided by the embodiment of the application, the data to be processed can be distributed to at least one processing node for processing, so that the speed of data processing can be improved, and further, the efficiency of data processing can be improved.

Drawings

FIG. 1 is a block diagram of a data processing system according to one embodiment;

FIG. 2 is a flow chart of a data processing method according to an embodiment;

FIG. 3 is a flowchart of a data backup method according to an embodiment;

FIG. 4 is a flow chart of a data backup task for the data to be backed up according to an embodiment;

FIG. 5 is a flowchart of a data recovery method according to an embodiment;

FIG. 6 is a flow chart of a task for performing data recovery for the data to be recovered according to one embodiment;

FIG. 7 is a block diagram of a data processing apparatus according to one embodiment;

FIG. 8 is an internal structural diagram of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

The data processing method provided by the embodiment of the application can be applied to a data processing system shown in FIG. 1. The data processing system can be a Kubernetes cluster. Kubernetes is an open-source container orchestration platform for automatically deploying, scaling and managing containerized applications; it provides container orchestration tooling that helps developers and operation and maintenance teams manage containerized applications more easily. The Kubernetes cluster may include: Ding Jia Di Bei (backup), external memory (storage), agent pod, app pod, related pod, the Kubernetes API, and the like. The Ding Jia Di Bei (backup) is an HTTP service used for interacting with a target user, and the target user can initiate data backup and data recovery by interacting with the backup; the external memory (storage) may be an external memory corresponding to the Kubernetes cluster, and may be used to store backup data of the Kubernetes cluster; the agent is deployed in Kubernetes and is the service that actually executes data backup and data recovery, and each processing node in Kubernetes has an agent service; a pod is the smallest deployable unit of Kubernetes, contains one or more containers, and its containers share network and storage resources; the related pod handles creating, deleting, attaching and detaching PVCs; a Persistent Volume Claim (PVC) is an object in Kubernetes used to request persistent storage resources, and in a Kubernetes cluster a pod can request persistent storage through a PVC and mount it for use in the pod; the Kubernetes API is a Kubernetes system component; the app pod may be a temporary pod for mounting a newly created PVC.
In addition, the Kubernetes cluster includes a plurality of processing nodes, and there is a correspondence between the agent pod and the processing nodes, and in one possible implementation, the data backup and data recovery operations for the Kubernetes cluster may be performed with a first processing node of the plurality of processing nodes as an execution body.

In an exemplary embodiment, as shown in FIG. 2, a data processing method is provided. The method is described by taking its application to the first processing node in FIG. 1 as an example, and includes the following steps S202 to S208.

Wherein:

Step S202, responding to a data processing request triggered by a target user, and acquiring data to be processed contained in the data processing request.

The target user may be an operator of the data processing system, such as a system operator of a Kubernetes cluster. The target user can interact with the backup in the Kubernetes cluster to trigger a data processing request; the data processing request can contain an identifier of the data to be processed, the type of the data to be processed, a processing type and the like, and is used for requesting processing of the data to be processed. The data types of the data to be processed may include, but are not limited to, metadata types and persistent data types, and the persistent data types may include local storage volume types and network storage volume types. The backup can respond to the data processing request triggered by the target user by sending a processing task creation request for the data processing request to the first processing node; after receiving the processing task creation request, the first processing node can create a data processing task for the data processing request, and the data processing task may then be performed by the first processing node.

In one possible implementation, the data processing request may be a data backup request for data in the Kubernetes cluster, the data to be processed may be data to be backed up, and the data processing task may be a data backup task; in another possible implementation, the data processing request may be a data recovery request for data in the Kubernetes cluster, the data to be processed may be data to be recovered, and the data processing task may be a data recovery task.

Step S204, in the case that the data to be processed is PVC type, the data processing strategy for the data to be processed is obtained.

The PVC type may be a persistent data type, and in the case that the data to be processed is a persistent data type, a data processing policy for the data to be processed may be obtained, where the data processing policy may include a data identifier to be processed, a data type to be processed, and an allocation policy for the data to be processed, where the data identifier to be processed is used to mark and distinguish the data to be processed; the data type to be processed is used for representing the data type of the data to be processed; the allocation policy may be used to direct the allocation of the data to be processed to at least one processing node, which may be used to perform processing on the data to be processed.

Step S206, distributing the data to be processed to at least one processing node according to the data processing strategy.

Wherein the at least one processing node is operable to perform processing for the data to be processed. The at least one processing node may include the first processing node and at least one second processing node. The first processing node may be an execution body in the Kubernetes cluster that performs data processing and may interact with the backup; the at least one second processing node may be the other processing nodes in the Kubernetes cluster besides the first processing node, and may be configured to process the data to be processed allocated to each second processing node based on a processing instruction of the first processing node.

Step S208, the processing of the data to be processed on each processing node is completed by using at least one processing node.

In one possible implementation manner, in the case that the data processing request is a data backup request, the processing of the data to be processed on each processing node may be data backup of the corresponding data to be processed by each processing node; in another possible implementation manner, in the case that the data processing request is a data recovery request, the processing of the data to be processed on each processing node may be data recovery of the corresponding data to be processed by each processing node.

The method of this embodiment may be applied to a first processing node in a data processing system. A target user may trigger a data processing request at a data processing client, and the first processing node may respond to the data processing request to obtain the data to be processed included in the data processing request; then, the first processing node may check the type of the data to be processed, and obtain a data processing policy for the data to be processed when determining that the data to be processed is of a PVC type, where the data processing policy may include an identifier of the data to be processed, an allocation policy of the data to be processed, and the like; further, the first processing node may distribute the data to be processed to at least one processing node according to the data processing policy; thus, the first processing node may utilize the at least one processing node to complete processing of the data to be processed on each processing node. In the method provided by the embodiment of the application, the data to be processed can be distributed to at least one processing node for processing, so that the speed of data processing can be improved, and further, the efficiency of data processing can be improved.
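The four steps S202 to S208 can be illustrated with a minimal Python sketch. It is not the implementation of the application: the `ProcessingRequest` shape, the `handle_request` function, the `allocate` and `process` callables, and the choice to handle non-PVC data on the first node are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProcessingRequest:
    # Hypothetical request shape: the text only says a request may carry an
    # identifier, a data type and a processing type.
    data_id: str
    data_type: str          # e.g. "metadata" or "pvc"
    processing_type: str    # e.g. "backup" or "restore"
    items: List[str] = field(default_factory=list)


def handle_request(
    request: ProcessingRequest,
    nodes: List[str],
    allocate: Callable[[List[str], List[str]], Dict[str, List[str]]],
    process: Callable[[str, List[str], str], None],
) -> Dict[str, List[str]]:
    """Run on the first processing node; nodes[0] is that node."""
    # S202: obtain the data to be processed from the request.
    items = request.items
    if request.data_type == "pvc":
        # S204 + S206: PVC-type data is distributed by the allocation policy.
        plan = allocate(items, nodes)
    else:
        # Assumption: non-PVC data (e.g. metadata) stays on the first node.
        plan = {nodes[0]: items}
    # S208: every node processes the share assigned to it.
    for node, assigned in plan.items():
        process(node, assigned, request.processing_type)
    return plan
```

Passing the allocation policy in as a callable keeps the flow independent of whether an even or a load-based distribution is used.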

In an exemplary embodiment, as shown in FIG. 3, the data processing request in step S202 may include a data backup request, in which case the data processing method may include the following steps S302 to S308. Wherein:

Step S302, responding to a data backup request triggered by a target user, and acquiring data to be backed up contained in the data backup request.

Wherein the data processing request comprises a data backup request; the data to be processed comprises data to be backed up; the data processing policy includes a data backup policy; the processing node comprises a backup node; the first processing node is a first backup node. The target user may be an operator of the data processing system, such as a system operator of a Kubernetes cluster. The target user can interact with the backup in the Kubernetes cluster to trigger a data backup request; the data backup request can contain an identifier of the data to be backed up, the type of the data to be backed up, a backup type and the like, and is used for requesting backup of the data to be backed up. The data types of the data to be backed up may include, but are not limited to, metadata types and persistent data types, and the persistent data types may include local storage volume types and network storage volume types. The backup can respond to the data backup request triggered by the target user by sending a backup task creation request for the data backup request to the first backup node; after receiving the backup task creation request, the first backup node can create a data backup task for the data backup request, and the data backup task may then be performed by the first backup node.

In some possible implementations, as shown in fig. 4, the first backup node performing a data backup task for the data to be backed up may include the following steps:

1. Receiving a backup task creation request sent by the backup, wherein the backup task creation request is determined by the backup according to the data backup request of the target user.

2. Traversing all uniform resource locators (Uniform Resource Locator, URLs) in the Kubernetes API according to the backup task to obtain group information of the resources corresponding to the backup task, determining list information according to the group information, and determining resource information corresponding to the data to be backed up according to the list information.

3. Saving the resource information of the data to be backed up on the backup node, and then saving it to the external memory.

It should be noted that, the resource information of the data to be backed up is first stored in the backup node, and then the resource information of the data to be backed up is stored in the external memory.

Specifically, Kubernetes-based data backup includes three phases: creating a backup task, executing the backup by the first backup node, and environment restoration.

Creating a backup task: the target user creates a backup task through the data processing client, and the data processing client sends the backup task creation to the backup; the backup generates a backup task creation request accordingly and sends it to the first backup node; the first backup node feeds back a response to the backup, and the backup feeds back the response to the data processing client.

Executing the backup: the first backup node traverses the api and apis endpoints of Kubernetes to acquire the group information returned by Kubernetes; the first backup node traverses the acquired group information to acquire the resource list information returned by Kubernetes; the first backup node traverses the acquired resource list to acquire the resource information returned by Kubernetes. When the resource information of the data to be backed up is of the PVC type, a request to clone the PVC is sent to Kubernetes, and the PVC is saved to each backup node; each backup node sends a request to Kubernetes to start a temporary pod that mounts the cloned PVC; and each backup node saves the resource information corresponding to the temporary pod and to the cloned PVC mounted on it to the storage.

Environment restoration: each backup node sends a request to Kubernetes to delete the cloned PVC and the temporary pod.

It should be noted that the backup is performed mainly through HTTP interaction between each backup node and the Kubernetes API, traversing all URLs and reading the data to be backed up as required. In the case where the data to be backed up is of the PVC type and is based on a container storage interface (Container Storage Interface, CSI), the volume is cloned, a new pod is started to mount the cloned volume, the pod is scheduled to the same physical machine as the corresponding backup node pod (each backup node is given sufficient permissions), and each backup node copies the volume directory to the remote storage.
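The traversal of the api and apis endpoints can be sketched as follows. This is a minimal sketch under stated assumptions: `get_json` is a hypothetical callable that performs an authenticated HTTP GET against the Kubernetes API server and returns the decoded JSON body, and the function names are illustrative. The endpoint paths and response fields (`versions`, `groups`, `preferredVersion`, `groupVersion`, `resources`, `verbs`, `items`) follow the standard Kubernetes discovery and list responses.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

# Hypothetical transport: maps a URL path to the decoded JSON response.
GetJson = Callable[[str], Dict[str, Any]]


def discover_group_versions(get_json: GetJson) -> List[str]:
    """Return URL prefixes such as '/api/v1' and '/apis/apps/v1'."""
    # Core group: GET /api lists the versions of the legacy core API.
    prefixes = [f"/api/{version}" for version in get_json("/api").get("versions", [])]
    # Named groups: GET /apis lists every API group and its versions.
    for group in get_json("/apis").get("groups", []):
        preferred = group.get("preferredVersion") or group["versions"][0]
        prefixes.append(f"/apis/{preferred['groupVersion']}")
    return prefixes


def iter_resources(get_json: GetJson) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (kind, object) for every listable resource in every group."""
    for prefix in discover_group_versions(get_json):
        for resource in get_json(prefix).get("resources", []):
            # Skip subresources such as 'pods/status' and non-listable types.
            if "/" in resource["name"] or "list" not in resource.get("verbs", []):
                continue
            listing = get_json(f"{prefix}/{resource['name']}")
            for item in listing.get("items", []):
                yield resource["kind"], item
```

A caller could filter the yielded pairs for the kind `PersistentVolumeClaim` to decide which resources need the cloning and temporary pod handling described above.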

Step S304, under the condition that the data to be backed up is of a PVC type, a data backup strategy aiming at the data to be backed up is obtained.

The PVC type may be a persistent data type, and in the case that the data to be backed up is a persistent data type, a data backup policy for the data to be backed up may be obtained, where the data backup policy may include a data identifier to be backed up, a data type to be backed up, and an allocation policy for the data to be backed up, and the data identifier to be backed up is used to mark and distinguish the data to be backed up; the data type to be backed up is used for representing the data type of the data to be backed up; the allocation policy may be used to direct the allocation of the data to be backed up to at least one backup node, which may be used to perform a backup for the data to be backed up.

Step S306, distributing the data to be backed up to at least one backup node according to the data backup strategy.

Wherein the at least one backup node may be adapted to perform a backup for data to be backed up. The at least one backup node may include the first backup node and at least one second backup node, where the first backup node may be an execution body for performing data backup in the Kubernetes cluster and may interact with backup, and the at least one second backup node is another backup node in the Kubernetes cluster other than the first backup node and may be configured to backup data to be backed up allocated to each second backup node based on a backup instruction of the first backup node. When the data to be backed up is the local storage volume data in the persistent data, the data to be backed up can be distributed to the first backup node according to a data backup strategy; in the case where the data to be backed up is network storage volume data in persistent data, the data to be backed up may be distributed to at least one backup node according to a data backup policy.
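The choice of candidate backup nodes by volume type can be written as a small helper. This is only a sketch: the function name and the string labels "local" and "network" are assumptions, not identifiers from the application.

```python
from typing import List


def candidate_backup_nodes(volume_type: str, first_node: str, second_nodes: List[str]) -> List[str]:
    """Pick the backup nodes that may receive data of the given volume type."""
    if volume_type == "local":
        # Local storage volume data is backed up by the first backup node only.
        return [first_node]
    if volume_type == "network":
        # Network storage volume data may be spread over all backup nodes.
        return [first_node, *second_nodes]
    raise ValueError(f"unsupported volume type: {volume_type!r}")
```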

Step S308, using at least one backup node to backup the data to be backed up on each backup node to the external memory corresponding to the data processing system.

Wherein each backup node sends a request to Kubernetes to start a temporary pod that mounts the cloned PVC; and each backup node saves the resource information corresponding to the temporary pod and to the cloned PVC mounted on it to the storage.

The method of this embodiment can be applied to a first backup node in a data processing system. A target user can trigger a data backup request at a data processing client, and the first backup node can respond to the data backup request to acquire the data to be backed up contained in the data backup request; then, the first backup node may check the type of the data to be backed up, and obtain a data backup policy for the data to be backed up when determining that the data to be backed up is of a PVC type, where the data backup policy may include an identifier of the data to be backed up, an allocation policy of the data to be backed up, and the like; furthermore, the first backup node may distribute the data to be backed up to at least one backup node according to the data backup policy; therefore, the first backup node can utilize the at least one backup node to complete the backup of the data to be backed up on each backup node. According to the method provided by the embodiment of the application, the data to be backed up can be distributed to at least one backup node for backup, so that the speed of data backup can be improved, and further, the efficiency of data backup can be improved.

In an exemplary embodiment, step S306 may include:

and distributing the data to be backed up to the first backup node.

Wherein, in the case that the data to be backed up is the local storage volume data in the persistent data, the at least one backup node may include a first backup node, and the data to be backed up may be allocated to the first backup node according to a data backup policy. The local storage volume data corresponds to a local backup node, which in this embodiment may be the first backup node.

In another exemplary embodiment, step S306 may include:

and evenly distributing the data to be backed up of the network volume type to the first backup node and at least one second backup node.

In the case that the data to be backed up is network storage volume data in persistent data, the at least one backup node may include the first backup node and at least one second backup node, where the first backup node may be an execution body for executing data backup in a Kubernetes cluster and may interact with backup, and the at least one second backup node is another backup node in the Kubernetes cluster except for the first backup node and may be configured to backup the data to be backed up allocated to each second backup node based on a backup instruction of the first backup node. The allocation policy for the data to be backed up in the data backup policy may be an allocation manner specified by the target user in the data backup request, or may be a default allocation manner of the data processing system, for example, an allocation manner of the data to be backed up to the first backup node and at least one second backup node in an average manner.
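An even distribution can be sketched as a round-robin assignment. The text does not state the unit of distribution, so this sketch assumes whole volumes (named by hypothetical PVC names) are the unit; the function name is likewise illustrative.

```python
from typing import Dict, List


def distribute_evenly(volumes: List[str], nodes: List[str]) -> Dict[str, List[str]]:
    """Assign volumes round-robin so per-node counts differ by at most one."""
    if not nodes:
        raise ValueError("at least one backup node is required")
    plan: Dict[str, List[str]] = {node: [] for node in nodes}
    for index, volume in enumerate(volumes):
        plan[nodes[index % len(nodes)]].append(volume)
    return plan
```

For example, five volumes over two nodes give the first node three volumes and the second node two.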

In yet another exemplary embodiment, step S306 may include:

acquiring the data volume of data to be backed up of a network volume type and acquiring the load capacity of each backup node; acquiring the distribution proportion of each backup node according to the data volume and the load capacity of each backup node; and distributing the data to be backed up of the network volume type to the first backup node and at least one second backup node according to the distribution proportion.

The data volume can be used to represent the size of the data to be backed up and the space it requires; the load capacity may be used to represent the available operational storage space of each backup node; the weight of each backup node may be determined based on the load capacity of each backup node, and from it the allocation proportion of each backup node may be determined, which represents the share of the total data to be backed up that is allocated to that backup node.
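One way to turn load capacities into allocation proportions is to weight each node by its share of the total capacity. The sketch below is an assumption about how such a proportion might be computed, since the text does not give a formula; it works in whole bytes and uses the largest-remainder rule so that the shares add up exactly to the data volume.

```python
from typing import Dict


def allocation_by_capacity(total_bytes: int, capacity: Dict[str, int]) -> Dict[str, int]:
    """Split total_bytes across nodes in proportion to their capacity."""
    total_capacity = sum(capacity.values())
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")
    # Floor of each node's proportional share.
    shares = {node: total_bytes * cap // total_capacity for node, cap in capacity.items()}
    # Bytes lost to rounding go to the nodes with the largest remainders.
    leftover = total_bytes - sum(shares.values())
    by_remainder = sorted(
        capacity,
        key=lambda node: (total_bytes * capacity[node]) % total_capacity,
        reverse=True,
    )
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares
```

With a data volume of 100 bytes and capacities in the ratio 3:1, the two nodes receive 75 and 25 bytes respectively.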

In the method of the embodiment, the data to be backed up can be distributed to at least one backup node for backup, so that the speed of data backup can be improved, the efficiency of data backup can be improved, and the appropriate backup node can be selected based on different distribution modes, so that the efficiency of data backup is further improved.

In an exemplary embodiment, the application further includes a step for mounting the data to be backed up to each backup node, where the step specifically includes:

acquiring distributed backup data corresponding to each backup node according to a data backup strategy; creating temporary pod corresponding to each backup node; and cloning the distributed backup data corresponding to each backup node, and mounting the cloned distributed backup data to the temporary pod corresponding to each backup node.
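The clone and mount step can be illustrated by the two Kubernetes objects a backup node might submit. This is a sketch, not the manifests used by the application: the object names, the `busybox` image, the `/data` mount path and the function names are assumptions. The `dataSource` field of kind `PersistentVolumeClaim` is the standard way to request a CSI volume clone, and `nodeName` pins the temporary pod to the node that runs the backup agent.

```python
from typing import Any, Dict


def clone_pvc_manifest(source: str, namespace: str, storage_class: str,
                       size: str, access_mode: str = "ReadWriteOnce") -> Dict[str, Any]:
    """PVC that asks the CSI driver to clone an existing PVC in the same namespace."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"{source}-backup-clone", "namespace": namespace},
        "spec": {
            "storageClassName": storage_class,
            "dataSource": {"kind": "PersistentVolumeClaim", "name": source},
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


def temporary_pod_manifest(claim: str, namespace: str, node_name: str) -> Dict[str, Any]:
    """Temporary pod that mounts the cloned PVC on the given backup node."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"{claim}-mount", "namespace": namespace},
        "spec": {
            "nodeName": node_name,  # schedule next to the backup agent
            "restartPolicy": "Never",
            "containers": [{
                "name": "holder",
                "image": "busybox",  # placeholder image, an assumption
                "command": ["sleep", "3600"],
                "volumeMounts": [{"name": "data", "mountPath": "/data", "readOnly": True}],
            }],
            "volumes": [{"name": "data", "persistentVolumeClaim": {"claimName": claim}}],
        },
    }
```

Both objects would be deleted again in the environment restoration phase once the volume data has been copied to the external memory.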

In an exemplary embodiment, as shown in FIG. 5, the data processing request in step S202 may include a data recovery request, in which case the data processing method may include the following steps S502 to S508. Wherein:

Step S502, responding to a data recovery request triggered by a target user, and acquiring data to be recovered contained in the data recovery request.

Wherein the data processing request comprises a data recovery request; the data to be processed comprises data to be recovered; the data processing policy includes a data recovery policy; the processing node comprises a recovery node; the first processing node is a first recovery node. The target user may be an operator of the data processing system, such as a system operator of a Kubernetes cluster. The target user can interact with the backup in the Kubernetes cluster to trigger a data recovery request; the data recovery request can contain an identifier of the data to be recovered, the type of the data to be recovered, a recovery type and the like, and is used for requesting recovery of the data to be recovered. The data types of the data to be recovered may include, but are not limited to, metadata types and persistent data types, and the persistent data types may include local storage volume types and network storage volume types. The backup can respond to the data recovery request triggered by the target user by sending a recovery task creation request for the data recovery request to the first recovery node; after receiving the recovery task creation request, the first recovery node can create a data recovery task for the data recovery request, and the data recovery task may then be performed by the first recovery node.

In some possible implementations, as shown in fig. 6, the first recovery node performing a data recovery task for the data to be recovered may include the following steps:

1. Receiving a recovery task creation request sent by the backup, wherein the recovery task creation request is determined by the backup from the target user's data recovery request.

2. Reading the corresponding metadata backup information from the external memory according to the recovery task.

3. Performing resource ordering on the metadata backup information, sending resource creation requests to the Kubernetes API according to the resource ordering, and copying the second resource information corresponding to each resource creation request from the external memory to the Kubernetes API.

Specifically, Kubernetes-based data recovery includes three phases: creating a recovery task, executing recovery by the first recovery node, and environment recovery. In the recovery task creation phase, the target user creates a recovery task through the data processing client; the data processing client sends the recovery task to the backup; the backup creates a recovery request according to the recovery task and sends it to the first recovery node; and the first recovery node returns a response to the backup, which forwards the response to the data processing client. In the recovery execution phase, the first recovery node sends a request to the storage to read the metadata backup information according to the recovery task and, after the metadata is acquired, performs resource ordering on the metadata backup information; the first recovery node then sends requests to Kubernetes to create the resources in that order. When the data to be recovered is of the PVC type, the first recovery node sends a request to Kubernetes to start a temporary pod on which the cloned PVC is mounted, sends a request to the storage to copy the persistent volume data to be recovered from the storage to the newly created PVC, and finally deletes the temporary pod. It should be noted that recovery is essentially the reverse of backup and proceeds according to the resource order; when the resources to be restored include a PVC, the backed-up volume data on the storage side is restored to the PVC after the resources have been restored, which requires starting a temporary pod.
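The resource ordering step described above can be illustrated with a minimal sketch. The source only states that resources are created "in a certain order"; the priority list `RESTORE_PRIORITY` and the function name `order_resources` below are assumptions introduced for illustration, not the ordering used by the application.

```python
# Hypothetical restore priority: the source does not specify the actual order.
# Kinds that others depend on (namespaces, storage, claims) are placed first.
RESTORE_PRIORITY = [
    "Namespace",
    "StorageClass",
    "PersistentVolume",
    "PersistentVolumeClaim",
    "Secret",
    "ConfigMap",
    "ServiceAccount",
    "Deployment",
    "Service",
]


def order_resources(resources):
    """Return the backed-up resources sorted by restore priority.

    Each resource is a dict with at least a "kind" key. Kinds missing from
    RESTORE_PRIORITY are placed last, keeping their original relative order
    because sorted() is stable.
    """
    rank = {kind: index for index, kind in enumerate(RESTORE_PRIORITY)}
    unknown_rank = len(RESTORE_PRIORITY)
    return sorted(resources, key=lambda res: rank.get(res["kind"], unknown_rank))


backup_metadata = [
    {"kind": "Deployment", "name": "web"},
    {"kind": "PersistentVolumeClaim", "name": "web-data"},
    {"kind": "Namespace", "name": "prod"},
    {"kind": "CronJob", "name": "cleanup"},
]
ordered = order_resources(backup_metadata)
```

With such an ordering, the PVC is created before the workload that mounts it, so that the volume data can then be copied into the newly created PVC.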

Step S504, in the case that the data to be recovered is PVC type, obtaining the data recovery strategy for the data to be recovered.

The PVC type may be a persistent data type, and in the case that the data to be recovered is a persistent data type, a data recovery policy for the data to be recovered may be obtained, where the data recovery policy may include a data identifier to be recovered, a data type to be recovered, and an allocation policy for the data to be recovered, where the data identifier to be recovered is used to mark and distinguish the data to be recovered; the data type to be recovered is used for representing the data type of the data to be recovered; the allocation policy may be used to direct the allocation of the data to be restored to at least one restoration node, which may be used to perform restoration of the data to be restored.

Step S506, distributing the data to be recovered to at least one recovery node according to the data recovery strategy.

The at least one recovery node may be configured to recover the data to be recovered. The at least one recovery node may include the first recovery node and at least one second recovery node. The first recovery node may be the execution body in the Kubernetes cluster that performs data recovery and may interact with the backup. Each second recovery node may be a recovery node in the Kubernetes cluster other than the first recovery node, and may be configured to recover the data to be recovered that is allocated to it, based on a recovery instruction from the first recovery node. When the data to be recovered is local storage volume data in the persistent data, the data to be recovered can be distributed to the first recovery node according to the data recovery policy; when the data to be recovered is network storage volume data in the persistent data, the data to be recovered may be distributed to at least one recovery node according to the data recovery policy.
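The choice of recovery nodes by volume type can be sketched as follows. The function name `select_recovery_nodes` and the string labels `"local"` and `"network"` are hypothetical and introduced only for illustration; the source states only that local storage volume data goes to the first recovery node and network storage volume data may go to at least one recovery node.

```python
def select_recovery_nodes(volume_type, first_node, second_nodes):
    """Pick the recovery nodes that may receive data of the given volume type.

    volume_type is "local" or "network" (labels assumed for this sketch).
    Local volume data stays on the first recovery node; network volume data
    may be spread over the first node and all second recovery nodes.
    """
    if volume_type == "local":
        return [first_node]
    if volume_type == "network":
        return [first_node, *second_nodes]
    raise ValueError(f"unsupported volume type: {volume_type!r}")


local_targets = select_recovery_nodes("local", "node-1", ["node-2", "node-3"])
network_targets = select_recovery_nodes("network", "node-1", ["node-2", "node-3"])
```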

Step S508, copying the data to be restored on the external memory corresponding to the data processing system to the newly created PVC resource by utilizing the at least one restoring node.

In the case where the data to be restored is of a non-PVC type, the data to be restored is copied from the external memory into the Kubernetes API.

The method of this embodiment may be applied to a first recovery node in a data processing system. A target user may trigger a data recovery request at a data processing client, and the first recovery node may respond to the data recovery request by acquiring the data to be recovered contained in it. The first recovery node may then determine the type of the data to be recovered and, when the data to be recovered is determined to be of the PVC type, obtain a data recovery policy for the data to be recovered, where the data recovery policy may include an identifier of the data to be recovered, an allocation policy for the data to be recovered, and the like. Further, the first recovery node may distribute the data to be recovered to at least one recovery node according to the data recovery policy, so that the first recovery node can use the at least one recovery node to complete recovery of the data to be recovered on each recovery node. Because the data to be recovered can be distributed to at least one recovery node for recovery, the method can improve the speed of data recovery and, in turn, its efficiency.
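The speed benefit of spreading recovery over several nodes can be illustrated with a small simulation. This is only a sketch: the recovery nodes are modeled as worker threads, and `process_in_parallel` and the `process_volume` callback are hypothetical names that do not appear in the source.

```python
from concurrent.futures import ThreadPoolExecutor


def process_in_parallel(assignment, process_volume):
    """Process every assigned volume, with one worker per node.

    assignment maps a node name to the list of volumes allocated to it.
    process_volume(node, volume) stands in for the per-volume recovery work.
    Returns a dict mapping each node to the list of its results.
    """
    def run_node(node):
        # Each node handles its own volumes sequentially.
        return [process_volume(node, volume) for volume in assignment[node]]

    nodes = list(assignment)
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        results = list(pool.map(run_node, nodes))
    return dict(zip(nodes, results))


assignment = {"node-1": ["pvc-a", "pvc-b"], "node-2": ["pvc-c"]}
report = process_in_parallel(assignment, lambda node, volume: f"{node}:{volume}")
```

In this model the total elapsed time is bounded by the slowest node rather than by the sum of all volumes, which is the effect the embodiment attributes to using more than one recovery node.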

In an exemplary embodiment, the method further includes a step of mounting the data to be recovered to each recovery node, which specifically includes:

acquiring the allocated recovery data corresponding to each recovery node according to the data recovery strategy; creating a temporary pod corresponding to each recovery node; and mounting the allocated recovery data corresponding to each recovery node to the temporary pod corresponding to that recovery node.
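A temporary pod of this kind could be described by a manifest like the one built below. This is a sketch under stated assumptions: the pod name prefix, container name, image reference and mount path are placeholders invented for illustration, and pinning the pod to a node through `spec.nodeName` is one possible choice that the source does not prescribe. Only standard Kubernetes Pod fields are used.

```python
def temp_pod_manifest(node_name, pvc_name, namespace="default"):
    """Build a Pod manifest (as a dict) that mounts one PVC on one node.

    The image and names are placeholders; a real deployment would supply
    its own recovery agent image.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"restore-tmp-{node_name}", "namespace": namespace},
        "spec": {
            "nodeName": node_name,  # run the pod on the chosen recovery node
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "restore-agent",
                    "image": "example.invalid/restore-agent:latest",  # placeholder
                    "volumeMounts": [{"name": "data", "mountPath": "/restore"}],
                }
            ],
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": pvc_name}}
            ],
        },
    }


# One temporary pod per recovery node, each mounting the PVC allocated to it.
allocation = {"node-1": "pvc-a", "node-2": "pvc-b"}
manifests = [temp_pod_manifest(node, pvc) for node, pvc in allocation.items()]
```

Once the copy into the PVC has finished, the temporary pod can be deleted, as described for the recovery execution phase.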

It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the steps are not strictly limited to that order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiment of the application also provides a data processing device for realizing the above related data processing method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation of one or more embodiments of the data processing device provided below may refer to the limitation of the data processing method hereinabove, and will not be repeated herein.

In one embodiment, as shown in fig. 7, there is provided a data processing apparatus including: a first acquisition module 701, a second acquisition module 702, an allocation module 703 and a processing module 704, wherein:

a first obtaining module 701, configured to obtain data to be processed included in a data processing request triggered by a target user, in response to the data processing request;

a second obtaining module 702, configured to obtain a data processing policy for the data to be processed if the data to be processed is of a PVC type;

a distribution module 703, configured to distribute the data to be processed to at least one processing node according to the data processing policy;

and a processing module 704, configured to complete processing of data to be processed on each processing node by using the at least one processing node.

In addition, in one possible implementation, the data processing request includes a data backup request; the data to be processed comprises data to be backed up; the data processing strategy comprises a data backup strategy; the processing node comprises a backup node; the first processing node is a first backup node, and the first obtaining module 701 is further configured to obtain the data to be backed up included in the data backup request in response to the data backup request triggered by the target user; the second obtaining module 702 is further configured to obtain the data backup policy for the data to be backed up if the data to be backed up is of a PVC type; the allocation module 703 is further configured to allocate the data to be backed up to at least one backup node according to the data backup policy; and the processing module 704 is further configured to backup the data to be backed up on each backup node to an external memory corresponding to the data processing system by using the at least one backup node.

In one possible implementation, the data to be backed up is of the local volume type among the PVC types; the at least one backup node comprises the first backup node; the allocation module 703 is further configured to distribute the data to be backed up to the first backup node.

In another possible implementation, the data to be backed up is of the network volume type among the PVC types; the at least one backup node comprises the first backup node and at least one second backup node; the data backup strategy comprises a first data backup strategy; the first data backup strategy is even distribution; the allocation module 703 is further configured to evenly distribute the data to be backed up of the network volume type to the first backup node and the at least one second backup node.
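Even distribution can be sketched as a round-robin assignment of volumes to backup nodes. Treating the data to be backed up as a list of discrete volumes is an assumption of this sketch, and `distribute_evenly` is a hypothetical name; the source does not state the granularity at which the data is split.

```python
def distribute_evenly(volumes, nodes):
    """Assign volumes to nodes in round-robin order.

    The first entry of nodes plays the role of the first backup node and the
    rest are second backup nodes. Node counts differ by at most one volume.
    """
    if not nodes:
        raise ValueError("at least one backup node is required")
    assignment = {node: [] for node in nodes}
    for index, volume in enumerate(volumes):
        assignment[nodes[index % len(nodes)]].append(volume)
    return assignment


even = distribute_evenly(
    ["pvc-a", "pvc-b", "pvc-c", "pvc-d", "pvc-e"],
    ["node-1", "node-2"],
)
```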

In yet another possible implementation, the data backup policy includes a second data backup policy; the second data backup policy is proportional distribution based on the load capacity of each backup node; the allocation module 703 is further configured to: acquire the data volume of the data to be backed up of the network volume type and the load capacity of each backup node; acquire the distribution proportion of each backup node according to the data volume and the load capacity of each backup node; and distribute the data to be backed up of the network volume type to the first backup node and the at least one second backup node according to the distribution proportion.
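Proportional distribution by load capacity can be sketched as follows. The source does not define how load capacity is measured or how the proportion is computed; this sketch assumes that each node's capacity is a non-negative integer weight and that each node's share is its capacity divided by the total capacity. The name `allocate_by_capacity` is hypothetical.

```python
def allocate_by_capacity(total_units, capacities):
    """Split total_units of data across nodes in proportion to capacity.

    capacities maps node name to a non-negative integer weight (assumed).
    Integer division leaves a few units over; they are handed out one each
    to the nodes with the largest fractional remainder, so the shares always
    sum to total_units.
    """
    total_capacity = sum(capacities.values())
    if total_capacity <= 0:
        raise ValueError("total load capacity must be positive")
    shares = {
        node: total_units * capacity // total_capacity
        for node, capacity in capacities.items()
    }
    leftover = total_units - sum(shares.values())
    by_remainder = sorted(
        capacities,
        key=lambda node: (total_units * capacities[node]) % total_capacity,
        reverse=True,
    )
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares


# 100 GiB of network volume data over three nodes with capacities 3:1:1.
shares = allocate_by_capacity(100, {"node-1": 3, "node-2": 1, "node-3": 1})
```

Here the node with three times the capacity receives three times the data of each of the other two nodes.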

The allocation module 703 is further configured to: acquire the distributed backup data corresponding to each backup node according to the data backup strategy; create a temporary pod corresponding to each backup node; and clone the distributed backup data corresponding to each backup node, and mount the cloned distributed backup data to the temporary pod corresponding to each backup node.

In another possible implementation, the data processing request includes a data recovery request; the data to be processed comprises data to be recovered; the data processing policy includes a data recovery policy; the processing node comprises a recovery node; the first processing node is a first recovery node, and the first obtaining module 701 is further configured to obtain the data to be recovered included in the data recovery request in response to the data recovery request triggered by the target user; the second obtaining module 702 is further configured to obtain the data recovery policy for the data to be recovered if the data to be recovered is of the PVC type; the allocation module 703 is further configured to allocate the data to be recovered to at least one recovery node according to the data recovery policy; and the processing module 704 is further configured to copy, by using the at least one recovery node, the data to be recovered on an external memory corresponding to the data processing system to a newly created PVC resource.

The allocation module 703 is further configured to: acquire the allocated recovery data corresponding to each recovery node according to the data recovery strategy; create a temporary pod corresponding to each recovery node; and mount the allocated recovery data corresponding to each recovery node to the temporary pod corresponding to that recovery node.

Each of the modules in the above-described data processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 8. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing data processing related data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a data processing method.

It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.

In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

It should be noted that, user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the computer program may carry out the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric random access memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory may include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration and not limitation, RAM can take a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum-computing-based data processing logic units, and the like.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples represent only a few embodiments of the present application. They are described in relative detail but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims

1. A data processing method for a first processing node in a data processing system, the method comprising:

2. The method of claim 1, wherein the data processing request comprises a data backup request; the data to be processed comprises data to be backed up; the data processing strategy comprises a data backup strategy; the processing node comprises a backup node; the first processing node is a first backup node, and the method includes:

responding to the data backup request triggered by the target user, and acquiring the data to be backed up contained in the data backup request;

acquiring the data backup strategy aiming at the data to be backed up under the condition that the data to be backed up is of a PVC type;

distributing the data to be backed up to at least one backup node according to the data backup strategy;

and backing up the data to be backed up on each backup node to an external memory corresponding to the data processing system by utilizing the at least one backup node.

3. The method of claim 2, wherein the data to be backed up is of the local volume type among the PVC types; the at least one backup node comprises the first backup node;

the distributing the data to be backed up to at least one backup node according to the data backup policy includes:

distributing the data to be backed up to the first backup node.

4. The method of claim 2, wherein the data to be backed up is of the network volume type among the PVC types; the at least one backup node comprises the first backup node and at least one second backup node; the data backup strategy comprises a first data backup strategy; the first data backup strategy is even distribution;

the distributing the data to be backed up to at least one backup node according to the data backup policy includes:

evenly distributing the data to be backed up of the network volume type to the first backup node and the at least one second backup node.

5. The method of claim 4, wherein the data backup policy comprises a second data backup policy; the second data backup policy is proportional distribution based on the load capacity of each backup node;

the distributing the data to be backed up to at least one backup node according to the data backup policy further includes:

acquiring the data volume of the data to be backed up of the network volume type and the load capacity of each backup node;

acquiring the distribution proportion of each backup node according to the data volume and the load capacity of each backup node;

and distributing the data to be backed up of the network volume type to the first backup node and the at least one second backup node according to the distribution proportion.

6. The method of claim 2, wherein after said distributing said data to be backed up to at least one backup node in accordance with said data backup policy, further comprising:

acquiring distributed backup data corresponding to each backup node according to the data backup strategy;

creating temporary pod corresponding to each backup node;

and cloning the distributed backup data corresponding to each backup node, and mounting the cloned distributed backup data to the temporary pod corresponding to each backup node.

7. The method of claim 1, wherein the data processing request comprises a data recovery request; the data to be processed comprises data to be recovered; the data processing policy includes a data recovery policy; the processing node comprises a recovery node; the first processing node is a first recovery node, and the method includes:

responding to the data recovery request triggered by the target user, and acquiring the data to be recovered contained in the data recovery request;

acquiring the data recovery strategy aiming at the data to be recovered under the condition that the data to be recovered is of the PVC type;

distributing the data to be recovered to at least one recovery node according to the data recovery strategy;

and copying the data to be restored on the external memory corresponding to the data processing system to the newly created PVC resource by utilizing the at least one restoring node.

8. The method of claim 7, wherein after distributing the data to be restored to at least one restoration node according to the data restoration policy, further comprising:

acquiring allocation recovery data corresponding to each recovery node according to the data recovery strategy;

creating temporary pod corresponding to each recovery node;

and mounting the allocation recovery data corresponding to each recovery node to the temporary pod corresponding to each recovery node.

9. A data processing apparatus for use with a first processing node in a data processing system, the apparatus comprising:

10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1-8 when the computer program is executed.