WO2023207280A1 - Data recovery method and related apparatus - Google Patents

Data recovery method and related apparatus

Info

Publication number
WO2023207280A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
file
index information
snapshot
information
Prior art date
Application number
PCT/CN2023/077176
Other languages
French (fr)
Chinese (zh)
Inventor
黄恒
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023207280A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery

Definitions

  • The present application relates to the field of virtual machine storage technology, and in particular to a data recovery method and related apparatus.
  • IT: information technology
  • More and more enterprises are moving the IT infrastructure of their data centers to virtualized and cloud environments, using virtualization technology to improve the utilization of computing resources and to achieve an elastic computing system architecture.
  • A backup system is generally introduced to back up the virtual machine data to a third-party backup storage device.
  • the previously backed up data needs to be restored from the backup storage to the production environment.
  • Continuous data protection (CDP) technology is a common virtual machine backup technology.
  • CDP technology can record and save every input/output (IO) operation of a virtual machine. When the virtual machine system fails, it can be restored to any moment in the recent period, achieving an IO-level recovery point objective (RPO).
  • RPO: recovery point objective
  • PITR: point-in-time recovery
  • Embodiments of the present application provide a data recovery method that enables real-time browsing of target data at a target time without the need for constant attempts to recover data. Users can instantly view the target data content to ensure the accuracy of data recovery.
  • embodiments of the present application provide a data recovery method, which is applied to the first device, including:
  • file index information includes index information of global files in the first device
  • the first device sends the file index information to a second device, and the second device is used to perform data backup;
  • the file index information is updated, and the updated file index information includes updated index information of at least one file in the first device.
  • the file index information includes the file index information of each file in the global file.
  • The file index information of one of the files is, for example: "D:\aaa\bbb\cc.txt".
  • When the first device adds, deletes, or modifies at least one file, for example when an IO write occurs on the first device, the first device intercepts the IO write and sends the IO write data to the second device.
  • The second device performs CDP processing, writes the IO write data to the CDP data volume and log volume, and stores the IO write data as a backup.
  • the first device updates the file index information and sends the updated file index information to the second device, and the second device performs backup and storage.
  • After the first device generates the index information of the global files (the full file index), it monitors in real time whether a file IO event occurs on the first device, and generates in real time the index information of the files that are added, deleted, or modified (the incremental file index). Based on the full file index and the incremental file index, the second device (the backup device) can display the global file index information to the user so that the user can select the target file to be restored. Users can instantly view the target data content, which helps ensure the accuracy of data recovery.
  • the first device updates the file index information, including:
  • when at least one file is added, deleted or modified in the first device, first sub-index information is generated; the first sub-index information indicates the index information of the at least one file added, deleted or modified in the first device, and also includes the operation type of the at least one file, the operation type including one or more of the following: adding, deleting or modifying;
  • the file index information is updated according to the first sub-index information, and the updated file index information includes the first sub-index information.
  • When at least one file is added, deleted, or modified in the first device, first sub-index information is generated. The first sub-index information indicates the index information of the newly added, deleted, or modified file in the first device.
  • the first sub-index information also includes an operation type of the at least one file.
  • the operation type includes one or more of the following: adding, deleting, or modifying.
  • the first sub-index information is, for example, the file index information corresponding to "15:05:22". Specifically, at "15:05:22" the first device detects a new (or written) "cc.txt" file.
  • the file index (or address information) of the "cc.txt" file is "D:\aaa\bbb\cc.txt", and the operation type is "new" (or write).
  • the first device de-duplicates and merges the first sub-index information and the file index information (index information of the global file) to obtain updated file index information.
  • the first device may also obtain the index information of the global file according to instructions or periodically.
  • the method further includes: sending the updated file index information to the second device.
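The de-duplicate-and-merge step described above can be sketched as follows. This is only an illustration under stated assumptions: the `SubIndexRecord` type, the `apply_incremental` function, the operation names `"add"`, `"delete"`, `"modify"`, and the sample paths other than `cc.txt` are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubIndexRecord:
    """Hypothetical shape of one piece of first sub-index information."""
    time: str  # "HH:MM:SS", as in the "15:05:22" example
    path: str  # file index (address information)
    op: str    # operation type: "add", "delete" or "modify"


def apply_incremental(full_index, records):
    """Return the full index with the incremental records applied in time order."""
    updated = set(full_index)  # a set de-duplicates repeated paths
    for rec in sorted(records, key=lambda r: r.time):
        if rec.op == "delete":
            updated.discard(rec.path)
        elif rec.op in ("add", "modify"):
            updated.add(rec.path)
        else:
            raise ValueError(f"unknown operation type: {rec.op}")
    return updated


full = {r"D:\aaa\bbb\old.txt"}
recs = [
    SubIndexRecord("15:05:22", r"D:\aaa\bbb\cc.txt", "add"),
    SubIndexRecord("15:06:00", r"D:\aaa\bbb\cc.txt", "modify"),
    SubIndexRecord("15:07:10", r"D:\aaa\bbb\old.txt", "delete"),
]
merged = apply_incremental(full, recs)
```

Here `merged` holds a single entry for `cc.txt` even though two records refer to it, and the deleted file no longer appears.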
  • embodiments of the present application propose a data recovery method, applied to the second device, including:
  • target snapshot information is determined from a snapshot information set, the target snapshot information includes index information of the target file, the snapshot time of the target snapshot information is earlier than or equal to the target time, and the snapshot time of the target snapshot information is the snapshot time in the snapshot information set with the smallest difference from the target time, where the snapshot information set includes at least one piece of snapshot information;
  • the target proxy volume is mounted, the target proxy volume provides file input/output (IO) services, and the target proxy volume is used to obtain the target file;
  • the target file is restored based on the target proxy volume.
  • the third device sends a first restore request to the second device, and the first restore request is used to request to restore the target file at the target time.
  • the snapshot time of the target snapshot information is determined to be T2.
  • the second device determines the target snapshot information (T2).
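The selection rule for the target snapshot (the latest snapshot whose time is earlier than or equal to the target time) can be illustrated with a short sketch. The function name `select_target_snapshot` and the sample snapshot times are assumptions for illustration; times are kept as zero-padded "HH:MM:SS" strings, which compare correctly as plain strings within one day.

```python
from bisect import bisect_right


def select_target_snapshot(snapshot_times, target_time):
    """Return the latest snapshot time <= target_time, or None if there is none."""
    times = sorted(snapshot_times)
    pos = bisect_right(times, target_time)  # count of snapshots at or before the target
    if pos == 0:
        return None  # every snapshot is later than the target time
    return times[pos - 1]


snapshots = ["14:00:00", "15:00:00", "16:00:00"]  # hypothetical snapshot times
chosen = select_target_snapshot(snapshots, "15:47:25")
```

Because the snapshot must not be later than the target time, the nearest earlier snapshot is chosen even when a later one is closer in absolute terms.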
  • The second device creates the target proxy volume based on the target snapshot information. Specifically, it performs a linked clone (link clone) of the target snapshot information to obtain the target proxy volume (that is, a copy of the target snapshot information).
  • the target proxy volume serves as a proxy volume for data access (or file IO).
  • the second device mounts the target proxy volume through a mount server.
  • After the second device completes recovery of the target file at the target time, the second device sends the target file to the user.
  • The target file at the target time is sent to the user (the third device) as a file stream over Hypertext Transfer Protocol (HTTP), so that users can browse the target file instantly.
  • HTTP: Hypertext Transfer Protocol
  • The target data at the target time can be browsed in real time without repeated trial-and-error PITR (data recovery) attempts. Users can instantly view the target data content, which helps ensure the accuracy of data recovery.
  • before receiving the first recovery request, the method further includes: receiving a second recovery request, where the second recovery request is used to request recovery of the data at the target time;
  • the target file index information is determined, the recording time of the target file index information is earlier than or equal to the target time, and the recording time of the target file index information is the recording time in the file index information set with the smallest difference from the target time, where the file index information set includes at least one piece of file index information, and the file index information includes index information of global files in the first device;
  • the target file index information is restored, and the target file index information is used to determine the target file.
  • the user selects a target time through the third device, and the target time is used as the time when the user expects to restore the data.
  • the user sends a second recovery request to the second device through the third device, and the second recovery request is used to recover data at the target time.
  • the third device displays the data recovery menu on the browser page.
  • the user selects the target moment at which the data is to be recovered in the data recovery menu.
  • The second device determines the target file index information, which includes the index information of the global files recorded at 15:00:00 (also called the full file index) and the first sub-index information (also called the incremental file index) recorded from 15:00:00 up to, but excluding, 15:47:28.
  • the target file index information is obtained by merging the above-mentioned full file index and incremental file index.
  • the target file index information may indicate the file structure information of the global file. Therefore, the target file index information may be called the directory structure information of the entire file system.
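To show how a flat list of file index entries can act as directory structure information for browsing, the following sketch builds a nested tree from paths. The `build_tree` function, the nested-dictionary representation, and the second sample path are illustrative assumptions, not the format used in the application.

```python
from pathlib import PureWindowsPath


def build_tree(paths):
    """Build a nested dict from file paths; a value of None marks a file."""
    tree = {}
    for p in paths:
        parts = PureWindowsPath(p).parts  # e.g. ('D:\\', 'aaa', 'bbb', 'cc.txt')
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # descend, creating directories as needed
        node[parts[-1]] = None
    return tree


tree = build_tree([r"D:\aaa\bbb\cc.txt", r"D:\aaa\dd.txt"])
```

A browser-side menu could then walk `tree` level by level so that the user picks the target file without any data being restored first.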
  • restoring the target file according to the target proxy volume includes:
  • the data of the target proxy volume is rolled back to the target time to obtain the updated target proxy volume, where the index information indicated by the updated target proxy volume includes index information of the target file at the target time;
  • the target file is restored based on the updated target proxy volume.
  • the second device redoes the log file according to the target time, and then restores the target file at the target time from the memory of the second device. For example: extract the target file of the target time "15:47:25" from the memory of the second device.
  • the second device rolls back the data of the target proxy volume to the target time based on the target time. After rolling back to the target time, the updated target proxy volume is obtained.
  • the updated index information indicated by the target proxy volume includes index information of the target file at the target time.
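Bringing the proxy volume to the target time by redoing the log can be sketched as below. This is a simplified model under explicit assumptions: the volume is a mapping from block number to bytes, each log entry is a hypothetical `(time, block, data)` tuple, the snapshot is assumed to already contain every write up to and including its snapshot time, and `roll_to_target` is an illustrative name.

```python
def roll_to_target(snapshot_blocks, redo_log, snapshot_time, target_time):
    """Replay logged writes made after the snapshot and up to the target time."""
    volume = dict(snapshot_blocks)  # copy stands in for the cloned proxy volume
    for t, block, data in sorted(redo_log, key=lambda e: e[0]):
        if snapshot_time < t <= target_time:
            volume[block] = data  # redo the logged IO write
    return volume


snapshot = {0: b"A", 1: b"B"}          # state at the hypothetical snapshot time
log = [
    ("15:10:00", 1, b"B2"),
    ("15:47:25", 2, b"C"),
    ("15:50:00", 0, b"A9"),            # later than the target time, not replayed
]
restored = roll_to_target(snapshot, log, "15:00:00", "15:47:25")
```

The snapshot itself is left untouched, which mirrors the idea that the proxy volume is a clone used only for access.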
  • first snapshot information is obtained, where the first snapshot information is the snapshot information of the global file in the first device at the first moment.
  • the method further includes: obtaining second snapshot information, where the second snapshot information is the snapshot information of the global files in the first device at a second time; and updating the snapshot information set, where the updated snapshot information set includes the first snapshot information and the second snapshot information.
  • the second device obtains the snapshot information according to the user policy, and the snapshot information is the snapshot information of the global file in the first device at a certain moment.
  • the user policy includes but is not limited to: at a predefined time, the first device obtains file index information, and at the predefined time, the second device obtains snapshot information of the global file.
  • the first device periodically obtains the file index information, the second device periodically obtains the snapshot information of the global file, and the two acquisition periods are the same. For example, after obtaining the snapshot information, the second device writes the snapshot information (or snapshot data) to the CDP data volume.
  • the second device can mount the snapshot in the mount server. Then, scan the mounted snapshot (that is, scan the file system in the snapshot) to obtain the index information of the global file. The second device backs up and stores the index information of the global file obtained by scanning as file index information.
  • a possible implementation manner of the second aspect also includes:
  • the file index information is associated and saved with the snapshot information set.
  • the embodiment of the present application proposes a data recovery device, which is applied to the first device and includes:
  • a transceiver module configured to obtain file index information, where the file index information includes index information of global files in the first device;
  • a transceiver module also configured for the first device to send the file index information to a second device, and the second device is used to perform data backup;
  • a processing module configured to update the file index information, where the updated file index information includes updated index information of at least one file in the first device.
  • the processing module is also used to monitor whether at least one file is added, deleted or modified in the first device;
  • the processing module is also configured to generate first sub-index information when at least one file is added, deleted or modified in the first device, where the first sub-index information indicates the index information of the at least one file added, deleted or modified in the first device, and also includes the operation type of the at least one file, the operation type including one or more of the following: adding, deleting or modifying;
  • the processing module is further configured to update the file index information according to the first sub-index information, and the updated file index information includes the first sub-index information.
  • the transceiver module is also configured to send the updated file index information to the second device.
  • the embodiment of the present application provides a data recovery device for the second device, including:
  • a transceiver module configured to receive a first recovery request, where the first recovery request is used to request recovery of the target file at the target time;
  • a processing module configured to determine target snapshot information from a snapshot information set according to the first recovery request, where the target snapshot information includes index information of the target file, the snapshot time of the target snapshot information is earlier than or equal to the target time, and the snapshot time of the target snapshot information is the snapshot time in the snapshot information set with the smallest difference from the target time, where the snapshot information set includes at least one piece of snapshot information;
  • the processing module is also used to create a target proxy volume according to the target snapshot information
  • the processing module is also used to mount the target proxy volume, the target proxy volume provides file input and output IO services, and the target proxy volume is used to obtain the target file;
  • the processing module is also configured to restore the target file according to the target proxy volume.
  • the transceiver module is also configured to receive a second recovery request, where the second recovery request is used to request recovery of the data at the target time;
  • the processing module is also configured to determine the target file index information according to the second recovery request, where the recording time of the target file index information is earlier than or equal to the target time, and the recording time of the target file index information is the recording time in the file index information set with the smallest difference from the target time, the file index information set includes at least one piece of file index information, and the file index information includes index information of global files in the first device;
  • the processing module is also used to restore the target file index information, and the target file index information is used to determine the target file;
  • the processing module is also configured to redo the log file according to the target time and restore the target file at the target time;
  • the processing module is also configured to roll back the data of the target proxy volume to the target time according to the target time to obtain the updated target proxy volume, where the index information indicated by the updated target proxy volume includes index information of the target file at the target time;
  • the processing module is also configured to restore the target file according to the updated target proxy volume.
  • the transceiver module is also configured to obtain first snapshot information, where the first snapshot information is the snapshot information of the global file in the first device at the first moment.
  • the transceiver module is also configured to obtain second snapshot information, where the second snapshot information is the snapshot information of the global file in the first device at the second moment;
  • the processing module is also configured to update the snapshot information set.
  • the updated snapshot information set includes: the first snapshot information and the second snapshot information.
  • a transceiver module further configured to receive file index information from the first device, where the file index information includes index information of global files in the first device;
  • the processing module is also configured to associate and save the file index information with the snapshot information set.
  • embodiments of the present application provide a data recovery device, which has the function of implementing the method of the above-mentioned first aspect or any possible implementation of the first aspect.
  • This function can be implemented by hardware, or it can be implemented by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions, such as: transceiver module, processing module and storage module.
  • Embodiments of the present application provide a data recovery device.
  • the data recovery device includes at least one processor and a memory.
  • the memory stores computer instructions that can be run on the processor. When the computer instructions are executed by the processor, the processor executes the method described in the above first aspect or any possible implementation manner of the first aspect.
  • embodiments of the present application provide a data recovery device, which has the function of implementing the method of the above second aspect or any of the possible implementations of the second aspect.
  • This function can be implemented by hardware, or it can be implemented by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions, such as: transceiver module, processing module and storage module.
  • Embodiments of the present application provide a data recovery device.
  • the data recovery device includes at least one processor and a memory.
  • the memory stores computer instructions that can be run on the processor. When the computer instructions are executed by the processor, the processor executes the method described in the above second aspect or any possible implementation manner of the second aspect.
  • embodiments of the present application provide a computer device, which includes at least one processor, a memory, a communication port, a display, and computer execution instructions stored in the memory and executable on the processor.
  • the processor executes the method described in any possible implementation manner of the first aspect or the second aspect.
  • embodiments of the present application provide a computer-readable storage medium that stores one or more computer-executable instructions.
  • the processor executes the method described in any possible implementation manner of the above first aspect or second aspect.
  • embodiments of the present application provide a computer program product (or computer program) that stores one or more computer-executable instructions.
  • the processor executes the method described in any possible implementation manner of the above first aspect or second aspect.
  • the present application provides a chip system, which includes a processor and is used to support a computer device to implement the functions involved in the above aspect.
  • the chip system further includes a memory, and the memory is used to store necessary program instructions and data for the computer device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • Figure 1a is a schematic diagram of a computer device 100 provided by an embodiment of the present application.
  • Figure 1b is a schematic diagram of an application scenario in the embodiment of the present application.
  • Figure 2 is a schematic diagram of a data recovery method in an embodiment of the present application.
  • Figure 3 is a schematic diagram of block-level CDP
  • Figure 4 is a schematic diagram of the collection scene of file index information involved in the embodiment of the present application.
  • Figure 5 is a schematic diagram of a data recovery scenario in an embodiment of the present application.
  • Figure 6 is a schematic diagram of an embodiment of the data recovery device in the embodiment of the present application.
  • cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and network within a wide area network or local area network to realize data calculation, storage, processing and sharing.
  • Cloud technology is a general term for network technology, information technology, integration technology, management platform technology and application technology based on the cloud computing business model. It can form a resource pool and use it on demand, which is flexible and convenient. Cloud computing technology will become an important support.
  • The background services of technical network systems, such as video websites, picture websites, and portal websites, require a large amount of computing and storage resources. With the rapid development of the Internet industry and its applications, in the future each item may have its own identification mark that needs to be transmitted to the backend system for logical processing. Data at different levels will be processed separately, and all types of industry data require strong system backing, which can only be achieved through cloud computing.
  • Cloud computing refers to the delivery and usage model of IT infrastructure, which refers to obtaining the required resources through the network in an on-demand and easily scalable manner;
  • Cloud computing in a broad sense refers to the delivery and usage model of services, that is, obtaining the required services through the network in an on-demand and easily scalable manner.
  • This kind of service can be related to IT, software, or the Internet, or it can be another kind of service.
  • Cloud computing is a product of the development and integration of traditional computer and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage technologies, virtualization, and load balancing. Cloud computing has developed rapidly with the development of the Internet, real-time data streams, diversification of connected devices, and the demand for search services, social networks, mobile commerce, and open collaboration. Unlike earlier parallel distributed computing, the emergence of cloud computing will conceptually promote revolutionary changes in the entire Internet model and enterprise management model.
  • FIG. 1a is a schematic diagram of a computer device 100 provided by an embodiment of the present application.
  • the computer device 100 can be used in the first device and/or the second device proposed in the embodiments of this application.
  • a computer device 100 includes a processor 102 and a memory 104.
  • the processor 102 is connected to the memory 104 through a double data rate (DDR) bus 103.
  • DDR: double data rate
  • different memories 104 may use different data buses to communicate with the processor 102, so the DDR bus 103 may also be replaced with other types of data buses.
  • the embodiment of the present application does not limit the bus type.
  • the computer device 100 also includes various I/O devices, and the processor 102 can access these I/O devices 107 through the PCIe bus 105.
  • the processor (Processor) 102 is the computing core and control core of the computer device 100 .
  • One or more processor cores 204 may be included in the processor 102 .
  • the processor 102 may be a very large scale integrated circuit.
  • An operating system and other software programs are installed in the processor 102, so that the processor 102 can access the memory 104 and various PCIe devices.
  • the Core 204 in the processor 102 may be, for example, a central processing unit (CPU) or another application-specific integrated circuit (ASIC).
  • CPU: central processing unit
  • ASIC: application-specific integrated circuit
  • the processor 102 can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • DSP: digital signal processor
  • ASIC: application-specific integrated circuit
  • FPGA: field programmable gate array
  • the computer device 100 may also include multiple processors.
  • The memory controller is a bus circuit controller that controls the memory 104 within the computer device 100 and is used to manage and plan data transmission from the memory 104 to the Core 204. Through the memory controller, data can be exchanged between the memory 104 and the Core 204.
  • The memory controller can be a separate chip connected to the Core 204 through the system bus. Those skilled in the art will know that the memory controller can also be integrated into the processor 102, built into the north bridge, or be an independent memory controller chip. The embodiment of the present invention does not limit the specific location or form of the memory controller. In actual applications, the memory controller may control the necessary logic to write data to or read data from the memory 104.
  • The memory controller may be a memory controller in a processor system such as a general-purpose processor, a dedicated accelerator, a GPU, an FPGA, or an embedded processor.
  • Memory 104 is the main memory of computer device 100 .
  • the memory 104 is usually used to store various running software in the operating system, input and output data, and information exchanged with external memory.
  • The memory 104 needs to offer fast access speed.
  • DRAM: dynamic random access memory
  • The processor 102 can access the memory 104 at high speed through the memory controller and perform read and write operations on any storage unit in the memory 104.
  • the memory 104 can also be other random access memories, such as static random access memory (Static Random Access Memory, SRAM), etc.
  • the memory 104 may also be a read-only memory (Read Only Memory, ROM).
  • ROM: read-only memory
  • The read-only memory can be, for example, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), etc.
  • PROM: programmable read-only memory
  • EPROM: erasable programmable read-only memory
  • This embodiment does not limit the number and type of memory 104.
  • The memory 104 can be configured with a power-loss protection (data retention) function.
  • The power-loss protection function means that the data stored in the memory is not lost when the system is powered off and then powered on again.
  • The memory 104 with the power-loss protection function is called a non-volatile memory.
  • I/O device 107 refers to hardware that can transmit data, and can also be understood as a device connected to an I/O interface. Common I/O devices include network cards, printers, keyboards, mice, etc. All external storage can also be used as I/O devices, such as hard disks, floppy disks, optical disks, etc.
  • The processor 102 can access the various I/O devices 107 through the PCIe bus 105. It should be noted that the PCIe bus 105 is just one example and can be replaced by other buses, such as the Unified Bus (UB).
  • UB: Unified Bus
  • Baseboard Management Controller (BMC) 106 can upgrade the firmware of the device, manage the operating status of the device, and troubleshoot faults.
  • The processor 102 can access the baseboard management controller 106 through a PCIe bus or a bus such as USB or I2C.
  • The baseboard management controller 106 may also be connected to at least one sensor and obtain status data of the computer device through the sensor, where the status data includes temperature data, current data, voltage data, etc. This application does not specifically limit the type of status data.
  • the baseboard management controller 106 communicates with the processor 102 through the PCIe bus or other types of buses, for example, transmits the acquired status data to the processor 102 for processing.
  • The baseboard management controller 106 can also perform maintenance on the program code in the memory 104, including upgrading or restoring it.
  • the baseboard management controller 106 may also control the power supply circuit or the clock circuit within the computer device 100, etc.
  • the baseboard management controller 106 can manage the computer device 100 in the above manner.
  • the baseboard management controller 106 is only an optional device.
  • the processor 102 can communicate directly with the sensor to directly manage and maintain the computer device.
  • Kernel mode and user mode.
  • The processor has two privilege levels: user mode and kernel mode.
  • When a task or process makes a system call and executes kernel code, the process is said to be in kernel mode. At this time, the processor executes kernel code at the highest privilege level. In kernel mode, the processor can access all data in memory as well as peripheral devices such as hard disks and network cards, and it can switch from one program to another. When a process executes the user's own code, it is said to be in user mode. At this time, the processor runs user code at the lowest privilege level; it can use only the regular processor instruction set and cannot use instructions that operate hardware resources. In user mode, the processor has only limited access to memory and is not allowed to access peripheral devices directly, for example to perform IO reads and writes, access a network card, or allocate memory, and processor resources can be preempted by other programs.
  • A. System call: a system call is a way for a user-mode process to actively request a switch to kernel mode.
  • Through the system call, the user-mode process requests a service routine of the operating system to complete the work.
  • B. Exception: when the CPU is executing a program running in user mode and an unexpected exception occurs, the currently running process switches to the kernel program that handles the exception.
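  • As a minimal sketch of the system call path described above (the pipe and the helper name echo_through_kernel are illustrative assumptions, not part of this application), the user-mode code below asks the kernel to move bytes on its behalf through the write and read system calls, which Python exposes as os.write and os.read:

```python
import os


def echo_through_kernel(message: bytes) -> bytes:
    """Send bytes through a kernel-managed pipe using system calls."""
    # os.pipe() asks the kernel to create a pipe and returns two file descriptors.
    read_fd, write_fd = os.pipe()
    try:
        # os.write() issues the write system call: the process enters kernel
        # mode, the kernel copies the bytes into the pipe buffer, and control
        # then returns to user mode.
        written = os.write(write_fd, message)
        # os.read() issues the read system call to fetch the bytes back.
        return os.read(read_fd, written)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

  • The user code never touches the pipe buffer itself; every access goes through a request that the kernel serves.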
  • Virtualization is the logical representation of resources so that they are not bound by physical limitations. Any technology that maps one form of interfaces and resources to another form of interfaces and resources can be called virtualization technology. It is generally implemented by adding a virtualization software layer to the system, which abstracts the lower-layer resources into another form of resources and provides them to the upper layer for use.
  • the underlying machine that runs the virtual machine is called the host.
  • the software running on the virtual environment is called the guest. If there is an operating system, correspondingly, the operating system running on the underlying machine is called the host operating system (Host Operating System, HostOS).
  • the operating system running on the virtual machine is called the guest operating system (Guest Operating System, GuestOS).
  • a clone is a copy of an existing virtual machine (VM).
  • the existing virtual machine is called the clone's parent virtual machine.
  • the clone is an individual virtual machine, and the snapshot is a copy of the virtual machine's disk files at a given point in time.
  • Snapshots provide a log of changes to a virtual disk that can be used to restore a virtual machine to a specific point in time in the event of a failure or system error. Simply put, a clone is about making a complete copy of something, while a snapshot is about making an initial copy and then just making simple subsequent changes. Additionally, storage snapshots can be used to create clones.
  • clone means making a complete, writable copy of an object.
  • the object can be a file, directory, or volume.
  • a “snapshot” refers to capturing an original image at a specific point in time, after which each subsequent image records only its differences from the previous image.
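  • The difference between the two notions can be sketched as follows. This is only an illustration under simplifying assumptions: an object is modeled as a dictionary mapping a path to its content, and the names clone, take_snapshot, materialize and DELETED are hypothetical.

```python
import copy

DELETED = object()  # marker for a path removed since the previous image


def clone(obj):
    """A clone is a complete, independently writable copy."""
    return copy.deepcopy(obj)


def take_snapshot(previous, current):
    """Record only what differs between the previous image and the current state."""
    delta = {}
    for path, content in current.items():
        if previous.get(path, DELETED) != content:
            delta[path] = content  # new or changed path
    for path in previous:
        if path not in current:
            delta[path] = DELETED  # path removed since the previous image
    return delta


def materialize(base, deltas):
    """Rebuild the state at a point in time from the original image plus deltas."""
    view = dict(base)
    for delta in deltas:
        for path, content in delta.items():
            if content is DELETED:
                view.pop(path, None)
            else:
                view[path] = content
    return view
```

  • A clone costs a full copy up front, whereas each snapshot stores only its delta and relies on the earlier images to be rebuilt.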
  • Figure 1b is a schematic diagram of an application scenario in the embodiments of the present application.
  • the data recovery method proposed in the embodiment of the present application can be used in a distributed storage system or a centralized storage system, and the embodiment of the present application does not limit this.
  • the application scenario includes: a first device, a second device and a third device.
  • the first device is also called the production host.
  • the first device is used to generate data that needs to be backed up.
  • the second device is also called a backup host.
  • the second device is used to back up the data generated by the first device.
  • the third device is also called the user host. The user accesses the second device through the third device, and then issues instructions to the second device through the third device for restoring the backup data.
  • the first device, the second device and the third device are deployed on the same physical computer device (the same physical computer device may include one or more physical computer devices, such as a computer device cluster).
  • the first device, the second device and the third device belong to different virtual machines.
  • the first device and the second device are deployed in different physical computer devices.
  • the third device and the first device are the same physical computer device, and the third device provides a user-oriented function window, such as a function window provided by a browser (Web Browser). By accessing this function window, the user issues instructions to the second device for restoring backup data.
  • the first device, the second device and the third device are respectively deployed in different physical computer devices.
  • Figure 2 is a schematic diagram of a data recovery method in an embodiment of the present application.
  • a data recovery method proposed in the embodiment of this application includes:
  • the first device obtains file index information.
  • the first device first obtains the file index information of the global file in the first device.
  • the file index information includes the file index information of each file in the global file.
  • the file index information of one of the files is, for example: "D:\aaa\bbb\cc.txt".
  • the first device sends file index information to the second device.
  • After the first device obtains the file index information, it sends the file index information to the second device, and the second device backs up and saves it.
  • the first device updates the file index information.
  • After step 201, the first device monitors whether at least one file is added, deleted or modified in the first device.
  • When at least one file is added, deleted or modified in the first device, first sub-index information is generated. The first sub-index information indicates the index information of the at least one file added, deleted or modified in the first device.
  • The first sub-index information also includes the operation type of the at least one file, and the operation type includes one or more of the following: adding, deleting or modifying.
  • the file index information is updated according to the first sub-index information, and the updated file index information includes the first sub-index information.
  • When the first device adds, deletes, or modifies at least one file, for example, when the first device issues an IO write, the first device intercepts the IO write and sends the IO write data to the second device.
  • the second device performs continuous data protection (CDP) processing: it writes the IO write data to the CDP data volume and the log volume, thereby backing up the IO write data.
  • the first device updates the file index information and sends the updated file index information to the second device, and the second device performs backup and storage.
  • the updated file index information is shown in Table 1:
  • For example, the first device obtains the file index information of the global file of the first device for the first time at "14:00:00". Then, the first device monitors whether at least one file is added, deleted, or modified in the first device.
  • When at least one file is added, deleted or modified in the first device, first sub-index information is generated. The first sub-index information indicates the index information of the at least one file added, deleted or modified in the first device, and also includes the operation type of the at least one file. The operation type includes one or more of the following: adding, deleting or modifying.
  • the first sub-index information is, for example, the file index information corresponding to "15:05:22" in Table 1.
  • The first device detects a new (or written) "cc.txt" file at "15:05:22". The file index (or address information) of the "cc.txt" file is "D:\aaa\bbb\cc.txt", and the operation type is "new" (or write).
  • the file index information is updated according to the first sub-index information, and the updated file index information includes the first sub-index information.
  • the first device de-duplicates and merges the first sub-index information and the file index information (index information of the global file) to obtain updated file index information.
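  • The de-duplication and merge step might look like the sketch below. It is not the applicant's implementation: the record layout SubIndexEntry, the function merge_index, and the simplification of the file index to a set of paths are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubIndexEntry:
    """One record of the first sub-index information (field names are hypothetical)."""
    time: str  # recording time as zero-padded "HH:MM:SS" within one day
    path: str  # file index, for example D:\aaa\bbb\cc.txt
    op: str    # operation type: "add", "delete" or "modify"


def merge_index(file_index, entries):
    """Return the updated set of file paths after applying the sub-index entries."""
    updated = set(file_index)  # leave the caller's full index untouched
    # Zero-padded times sort correctly as strings, so apply entries in time order.
    for entry in sorted(entries, key=lambda e: e.time):
        if entry.op == "delete":
            updated.discard(entry.path)
        elif entry.op in ("add", "modify"):
            updated.add(entry.path)  # a set keeps each path once (de-duplication)
        else:
            raise ValueError(f"unknown operation type: {entry.op}")
    return updated
```

  • With the example above, merging the "15:05:22" entry for "cc.txt" into the "14:00:00" full index yields an index that contains the new path exactly once, even if the file is later modified again.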
  • the first device may also obtain the index information of the global file according to instructions or periodically.
  • the first device sends updated file index information to the second device.
  • the first device sends the file data to the second device, and the second device backs up the file data.
  • the second device obtains snapshot information.
  • the second device obtains the snapshot information according to the user policy, and the snapshot information is the snapshot information of the global file in the first device at a certain moment.
  • the user policy includes but is not limited to the following: at a predefined time, the first device obtains file index information and the second device obtains snapshot information of the global file;
  • or the first device periodically obtains file index information and the second device periodically obtains snapshot information of the global file, with the two acquisition periods being the same.
  • the second device writes the snapshot information (or snapshot data) to the CDP data volume.
  • After the second device obtains the snapshot information, the second device associates and saves the file index information and the snapshot information.
  • For example, the first device obtains the file index information (global file) of the first device for the first time at "14:00:00", and the second device also obtains snapshot information of the first device's global file at "14:00:00". Then, when the second device receives the file index information from the first device, the second device stores the file index information of "14:00:00" in association with the snapshot information (T1). After the first device updates the file index information, the first device sends the updated file index information to the second device, and the second device backs it up and stores it.
  • Similarly, after the snapshot information (T2) and the file index information of "15:00:00" are obtained, the second device stores the file index information of "15:00:00" in association with the snapshot information (T2), and so on.
  • the second device can mount the snapshot on the mount server and then scan the mounted snapshot (that is, scan the file system in the snapshot) to obtain the index information of the global file. The second device backs up and stores the index information of the global file obtained by scanning as file index information.
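  • Scanning a mounted snapshot can be approximated by walking its directory tree. The sketch below assumes that the snapshot is already mounted at a local path; the function name scan_file_index is hypothetical, and the mount operation itself is outside the sketch.

```python
import os


def scan_file_index(mount_point):
    """Return the sorted paths of all files under mount_point, relative to it."""
    index = []
    # os.walk visits every directory below the mount point, top down.
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            index.append(os.path.relpath(full_path, mount_point))
    return sorted(index)
```

  • The resulting list plays the role of the index information of the global file and can be stored next to the snapshot it was scanned from.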
  • the second device receives a second recovery request from the third device, and the second recovery request is used to recover data at the target time.
  • the user selects a target time through the third device, and the target time is used as the time when the user expects to restore the data.
  • the user sends a second recovery request to the second device through the third device, and the second recovery request is used to recover data at the target time.
  • the third device displays the data recovery menu on the browser page.
  • the user selects the target moment at which the data is to be recovered in the data recovery menu.
  • the second device determines the target file index information according to the second recovery request.
  • The recording time of the target file index information is earlier than or equal to the target time and, among the recording times in the file index information set, has the smallest difference from the target time. The file index information set includes at least one piece of file index information, and the file index information includes the index information of the global file in the first device.
  • For example, the second device determines target file index information that includes the index information of the global file recorded at 15:00:00 (also called the full file index) and the first sub-index information (also called the incremental file index) recorded from 15:00:00 up to, but not including, 15:47:28.
  • the target file index information is obtained by merging the above-mentioned full file index and incremental file index.
  • the target file index information may indicate the file structure information of the global file. Therefore, the target file index information may be called the directory structure information of the entire file system.
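  • A minimal sketch of this determination is given below, assuming that recording times are zero-padded "HH:MM:SS" strings within one day, that a full file index is a set of paths, and that each incremental record is a (time, path, operation) tuple. The function name target_file_index and these layouts are illustrative assumptions.

```python
def target_file_index(full_indexes, increments, target):
    """Merge the latest full index at or before target with later increments.

    full_indexes maps a recording time to a set of paths;
    increments is a list of (time, path, op) tuples.
    """
    eligible = [t for t in full_indexes if t <= target]
    if not eligible:
        raise LookupError("no full file index recorded at or before the target time")
    base_time = max(eligible)  # smallest difference from the target time
    paths = set(full_indexes[base_time])
    # Tuples sort by time first, so increments are applied in time order.
    for time, path, op in sorted(increments):
        if base_time <= time <= target:
            if op == "delete":
                paths.discard(path)
            else:  # "add" or "modify"
                paths.add(path)
    return paths
```

  • For a target time of 15:47:25, the sketch starts from the 15:00:00 full index and applies only the increments recorded up to 15:47:25, so a record at 15:47:28 is left out.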
  • the second device sends the target file index information to the third device.
  • the third device displays the directory structure of the global file at the target time to the user based on the target file index information.
  • the directory structure of the global files at the target moment is displayed through the browser page, so that the user can select the target file to be restored through the visual interface.
  • the target file can include one or more files.
  • the second device receives the first recovery request from the third device. The first recovery request is used to request recovery of the target file at the target time.
  • the third device sends a first restore request to the second device, and the first restore request is used to request to restore the target file at the target time.
  • the second device determines the target snapshot information from the snapshot information set according to the first recovery request.
  • The target snapshot information includes the index information of the target file. Its snapshot time is earlier than or equal to the target time and, among the snapshot times in the snapshot information set, has the smallest difference from the target time. The snapshot information set includes at least one piece of snapshot information.
  • the snapshot time of the target snapshot information is determined to be T2.
  • the second device determines the target snapshot information (T2).
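  • The selection rule can be sketched with a binary search over sorted snapshot times. The tuple layout (snapshot time, snapshot identifier), the zero-padded "HH:MM:SS" time format and the name select_target_snapshot are assumptions for illustration only.

```python
from bisect import bisect_right


def select_target_snapshot(snapshots, target):
    """Return the id of the latest snapshot taken at or before the target time.

    snapshots is a list of (snapshot_time, snapshot_id) tuples.
    """
    ordered = sorted(snapshots)
    times = [time for time, _ in ordered]
    # bisect_right counts the snapshot times that are <= target.
    position = bisect_right(times, target)
    if position == 0:
        raise LookupError("no snapshot taken at or before the target time")
    return ordered[position - 1][1]
```

  • With snapshots T1 at 14:00:00 and T2 at 15:00:00, a target time of 15:47:25 selects T2, which matches the example above.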
  • the second device creates the target proxy volume based on the target snapshot information.
  • Specifically, the second device performs a linked clone (link clone) based on the target snapshot information to obtain the target proxy volume (that is, a copy of the target snapshot information).
  • the target proxy volume serves as a proxy volume for data access (or file IO).
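  • One common way to realize a linked clone is a writable overlay on top of a shared, unmodified snapshot. The sketch below assumes that model; the class LinkedCloneVolume and the block-number-to-bytes layout are hypothetical and are not taken from this application.

```python
class LinkedCloneVolume:
    """Writable view over a snapshot; only changed blocks are stored in the clone."""

    def __init__(self, snapshot_blocks):
        self._base = snapshot_blocks  # shared with the snapshot, never modified here
        self._overlay = {}            # blocks written through this clone

    def read(self, block_no):
        # Prefer the clone's own copy and fall back to the snapshot otherwise.
        if block_no in self._overlay:
            return self._overlay[block_no]
        return self._base.get(block_no, b"")

    def write(self, block_no, data):
        # Writes never touch the snapshot, so it stays valid for other clones.
        self._overlay[block_no] = data
```

  • Because creating the clone copies no data, a proxy volume of this kind can be made available for file IO almost immediately after the target snapshot is chosen.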
  • the second device mounts the target proxy volume.
  • the second device mounts the target proxy volume through a mount server.
  • the second device redoes the log file according to the target time and restores the target file at the target time.
  • the second device redoes the log file according to the target time, and then restores the target file at the target time from the memory of the second device. For example: extract the target file of the target time "15:47:25" from the memory of the second device.
  • step 214 is entered (the target proxy volume is rolled back to the target time).
  • Step 213 and step 214 are processed asynchronously.
  • the second device rolls back the data of the target proxy volume to the target time according to the target time, and obtains the updated target proxy volume.
  • After the data is rolled back to the target time, the index information indicated by the updated target proxy volume includes index information of the target file at the target time.
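  • Bringing the proxy volume to the target time can be pictured as replaying journal entries on top of the snapshot state. The sketch below assumes that the journal is a list of (time, block number, data) records with zero-padded "HH:MM:SS" times and that the volume is a dictionary of blocks; the function name redo_log is hypothetical.

```python
def redo_log(volume, journal, snapshot_time, target):
    """Return the volume state at the target time.

    volume holds the blocks as of snapshot_time; journal entries recorded after
    the snapshot and no later than the target time are replayed in time order.
    """
    restored = dict(volume)  # keep the snapshot state itself unchanged
    # sorted() is stable, so entries with equal times keep their journal order.
    for time, block_no, data in sorted(journal, key=lambda entry: entry[0]):
        if snapshot_time < time <= target:
            restored[block_no] = data
    return restored
```

  • Entries recorded after the target time, such as a write at 15:47:28 when the target time is 15:47:25, are not applied, so the restored state reflects exactly the target time.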
  • the second device sends the target file to the third device.
  • the target file at the target time is sent to the third device (user) as a file stream over the Hyper Text Transfer Protocol (HTTP), so that the user can browse the target file instantly.
  • the target data at the target time can be browsed in real time without repeated trial runs of point-in-time recovery (PITR). Users can instantly view the target data content, which helps ensure the accuracy of data recovery.
  • In the first device (production host), the user mode includes an application program (APP) and an agent (agent), where the agent includes an application control unit (app-control);
  • the kernel mode includes a file system, a volume (volume), a driver (driver) and a disk (disk), where the driver includes checkpoints (checkpoints), a volume (volume) and a changed block tracking bitmap (CBT bitmap).
  • the first device also includes storage (storage).
  • the second device can be divided into multiple units according to function, including a continuous data protection device (CDP appliance), a snapshot data volume (data vol), data (data) and a log (journal). The CDP appliance includes a manager unit (manager), a snapshot unit (snapshot) and a data receiving unit (data receiver).
  • the block-level CDP process is as follows: after the first device detects that a file IO event occurs in the application (APP), the agent issues a control command (Control CMD) to the checkpoints in the driver (driver). Then, the driver sends the file block data corresponding to the file IO event to the data receiving unit (data receiver) of the second device, where it is stored in the form of IO logs and data for PITR.
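  • The changed block tracking bitmap mentioned for the driver can be pictured as one bit per block that is set whenever a write touches the block. The sketch below assumes that model and a 4096-byte block size; the class ChangedBlockTracker and its methods are hypothetical.

```python
class ChangedBlockTracker:
    """One bit per block; a set bit means the block was written and must be sent."""

    def __init__(self, block_count, block_size=4096):
        self.block_size = block_size
        self.bitmap = bytearray((block_count + 7) // 8)

    def mark_write(self, offset, length):
        """Set the bits of all blocks touched by a write of length bytes at offset."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for block in range(first, last + 1):
            self.bitmap[block // 8] |= 1 << (block % 8)

    def changed_blocks(self):
        """Return the numbers of all blocks whose bit is set, in ascending order."""
        return [
            block
            for block in range(len(self.bitmap) * 8)
            if (self.bitmap[block // 8] >> (block % 8)) & 1
        ]
```

  • Only the blocks reported by changed_blocks would need to be read and forwarded to the data receiving unit, which keeps the amount of transferred data proportional to what actually changed.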
  • Figure 4 is a schematic diagram of the collection scenario of file index information involved in the embodiment of the present application.
  • Figure 5 is a schematic diagram of a data recovery scenario in an embodiment of the present application.
  • the first device adds a new collection unit (fs-catalog) to the user-mode agent.
  • the collection unit (fs-catalog) can obtain file indexing information (indexing data) from the file system (file system).
  • the collection unit (fs-catalog) has the ability to obtain file index information in real time (for example, when a file IO event occurs, obtain the file index information after the file IO event).
  • the second device adds a synchronization unit (catalog engine) to the CDP appliance.
  • the specific collection process is as follows: After collecting file index information (including full file index and incremental file index), the collection unit (fs-catalog) sends the file index information to the synchronization unit (catalog engine) of the second device.
  • the synchronization unit (catalog engine) of the second device stores the file index information into the indexing store.
  • the second device adds a mount server (mount server) and a proxy volume (delegate volume).
  • the mount server is used to automatically and quickly provide the file content when the user views a file instantly (that is, it mounts the target proxy volume).
  • the proxy volume is used to automatically create a linked clone volume (that is, to create the target proxy volume).
  • the specific data recovery process is as follows: when the third device (user host) needs to recover the target data at the target time, the third device instructs the second device, through the browser (Web Browser), to find the target file index information (target time).
  • the synchronization unit (catalog engine) of the second device instructs the indexing store (indexing store) to redo the log files on-demand (on-demand log redo).
  • the second device determines corresponding snapshot information (snapshot 2 (T2)) based on the file index information (target time).
  • the delegate volume restores the target data at the target time from the journal.
  • the second device sends the target data to the third device.
  • the above-mentioned memory isolation device includes hardware structures and/or software modules corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving the hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each specific application, but such implementations should not be considered beyond the scope of this application.
  • Embodiments of the present application can divide the memory isolation device into functional modules according to the above method examples.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or software function modules. It should be noted that the division of modules in the embodiments of the present application is schematic and is only a logical function division. In actual implementation, there may be other division methods.
  • FIG. 6 is a schematic diagram of an embodiment of the data recovery device in the embodiment of the present application.
  • The data recovery device includes:
  • Transceiver module 601 configured to obtain file index information, where the file index information includes index information of global files in the first device;
  • the transceiver module 601 is also configured to send the file index information to a second device, where the second device is configured to perform data backup;
  • the processing module 602 is configured to update the file index information.
  • the updated file index information includes updated index information of at least one file in the first device.
  • the processing module 602 is also used to monitor whether at least one file is added, deleted or modified in the first device;
  • the processing module 602 is also configured to generate first sub-index information when at least one file is added, deleted, or modified in the first device, where the first sub-index information indicates the index information of the at least one file added, deleted or modified in the first device, the first sub-index information also includes the operation type of the at least one file, and the operation type includes one or more of the following: adding, deleting or modifying;
  • the processing module 602 is further configured to update the file index information according to the first sub-index information, and the updated file index information includes the first sub-index information.
  • the transceiver module 601 is also configured to send the updated file index information to the second device.
  • the transceiver module 601 is configured to receive a first recovery request, where the first recovery request is used to request recovery of the target file at the target time;
  • the processing module 602 is configured to determine target snapshot information from a snapshot information set according to the first recovery request, where the target snapshot information includes index information of the target file, the snapshot time of the target snapshot information is earlier than or equal to the target time, the snapshot time of the target snapshot information is the snapshot time in the snapshot information set with the smallest difference from the target time, and the snapshot information set includes at least one piece of snapshot information;
  • the processing module 602 is also configured to create a target proxy volume according to the target snapshot information;
  • the processing module 602 is also used to mount the target proxy volume, which provides file input and output IO services, and the target proxy volume is used to obtain the target file;
  • the processing module 602 is also used to restore the target file according to the target proxy volume.
  • the transceiver module 601 is also configured to receive a second recovery request, where the second recovery request is used to request recovery of the data at the target time;
  • the processing module 602 is also configured to determine the target file index information according to the second recovery request, where the recording time of the target file index information is earlier than or equal to the target time, the recording time of the target file index information is the recording time in the file index information set with the smallest difference from the target time, the file index information set includes at least one piece of file index information, and the file index information includes the index information of the global file in the first device;
  • the processing module 602 is also used to restore the target file index information, and the target file index information is used to determine the target file;
  • the processing module 602 is also configured to redo the log file according to the target time and restore the target file at the target time;
  • the processing module 602 is also configured to roll back the data of the target proxy volume to the target time according to the target time to obtain the updated target proxy volume, where the index information indicated by the updated target proxy volume includes index information of the target file at the target time;
  • the processing module 602 is also configured to restore the target file according to the updated target proxy volume.
  • the transceiver module 601 is also configured to obtain the first snapshot information, which is the snapshot information of the global file in the first device at the first moment.
  • the transceiver module 601 is also configured to obtain second snapshot information, where the second snapshot information is the snapshot information of the global file in the first device at the second moment;
  • the processing module 602 is also configured to update the snapshot information set.
  • the updated snapshot information set includes: the first snapshot information and the second snapshot information.
  • the transceiver module 601 is also configured to receive file index information from the first device, where the file index information includes index information of global files in the first device;
  • the processing module 602 is also configured to associate and save the file index information with the snapshot information set.
  • This application also provides a chip system, which includes a processor and is used to support the above-mentioned terminal device in implementing its related functions, for example, receiving or processing the data involved in the above-mentioned method embodiments.
  • the chip system further includes a memory, and the memory is used to store necessary program instructions and data for the terminal device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • a computer-readable storage medium is also provided.
  • Computer-executable instructions are stored in the computer-readable storage medium.
  • When at least one processor of the device executes the computer-executable instructions, the device performs the methods described in some embodiments of FIGS. 2 to 5 above.
  • In another embodiment, a computer program product is also provided. The computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium; at least one processor of the device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions to cause the device to perform the methods described in some embodiments of FIGS. 2 to 5 above.
  • the device embodiments described above are only illustrative.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • the connection relationship between modules indicates that there are communication connections between them, which can be specifically implemented as one or more communication buses or signal lines.
  • the present application can be implemented by software plus necessary general-purpose hardware. Of course, it can also be implemented by dedicated hardware, including dedicated integrated circuits, dedicated CPUs, dedicated memories, dedicated components, etc. In general, all functions performed by computer programs can be easily implemented with corresponding hardware. Moreover, the specific hardware structures used to implement the same function can be diverse, such as analog circuits, digital circuits or special-purpose circuits. However, for this application, a software program implementation is the better implementation in most cases. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the existing technology, can be embodied in the form of a software product.
  • the computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, mobile hard disk, ROM, RAM, magnetic disk or optical disk, and includes a number of instructions to enable a computer device to perform the methods described in various embodiments of the application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, memory isolation device, computing device, or data center to another website, computer, memory isolation device, computing device, or data center by wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium on which a computer can store data, or a data storage device, such as a training device or a data center, that integrates one or more available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, tape), optical media (e.g., DVD), or semiconductor media (e.g., solid-state disk (SSD)), etc.
  • the terms “system” and “network” are often used interchangeably herein.
  • the term “and/or” in this article merely describes an association between related objects and indicates that three relationships can exist. For example, A and/or B can mean three situations: A exists alone, A and B exist simultaneously, or B exists alone.
  • the character "/" in this article generally indicates that the related objects are an "or" relationship.
  • B corresponding to A means that B is associated with A, and B can be determined based on A.
  • determining B based on A does not mean determining B only based on A.
  • B can also be determined based on A and/or other information.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
  • a unit described as a separate component may or may not be physically separate.
  • a component shown as a unit may or may not be a physical unit, that is, it may be located in one place, or it may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • Integrated units may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as independent products.
  • the technical solution of the present application, in essence, or the part of it that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods of the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Disclosed in the embodiments of the present application are a data recovery method and a related apparatus. The method comprises: receiving a first recovery request, which is used for requesting the recovery of a target file at a target moment; according to the first recovery request, determining target snapshot information from a snapshot information set, wherein the target snapshot information comprises index information of the target file; according to the target snapshot information, creating a target proxy volume; mounting the target proxy volume, wherein the target proxy volume provides a file input/output (IO) service, and the target proxy volume is used for acquiring the target file; and recovering the target file according to the target proxy volume. By means of the method, target data at a target moment can be viewed in real time without the need to continuously try to recover data. A user can immediately view target data content, so as to ensure the accuracy of data recovery.

Description

A data recovery method and related apparatus
Technical field
The present application relates to the field of virtual machine storage technology, and in particular to a data recovery method and related apparatus.
Background
With the continuous development of information technology (IT), more and more enterprises are transforming the IT infrastructure of their data centers into virtualized and cloud environments, using virtualization technology to improve the utilization of computing resources and to achieve an elastic computing system architecture. In a cloud computing or virtualization environment, to avoid losing virtual machine data through accidental deletion or system failure, a backup system is generally introduced to back up the virtual machine data to a third-party backup storage device. When virtual machine data is restored, the previously backed-up data must be restored from the backup storage to the production environment.
Continuous data protection (CDP) is a common virtual machine backup technology. CDP records and saves every input/output (IO) operation of a virtual machine, so that when the virtual machine system fails it can be restored to any moment in a recent period, achieving an IO-level recovery point objective (RPO). CDP thus enables point-in-time recovery (PITR) and greatly reduces the RPO.
However, with current virtual machine backup technology, users cannot know exactly when the target data was damaged, so they often need multiple PITR attempts to restore it. Multiple PITR attempts are also needed during recovery to minimize the amount of data lost.
Summary of the invention
Embodiments of the present application provide a data recovery method that allows target data at a target time to be browsed in real time without repeated trial recoveries. The user can view the content of the target data immediately, which ensures the accuracy of data recovery.
In a first aspect, an embodiment of the present application provides a data recovery method, applied to a first device, including:
obtaining file index information, where the file index information includes index information of global files in the first device;
sending, by the first device, the file index information to a second device, where the second device is configured to perform data backup;
updating the file index information, where the updated file index information includes updated index information of at least one file in the first device.
For example, the file index information includes the file index information of each of the global files. The file index information of one such file is, for example, "D:\aaa\bbb\cc.txt".
For example, when the first device adds, deletes, or modifies at least one file, for instance when an IO write occurs on the first device, the first device intercepts the IO write and sends the written data to the second device. The second device performs CDP processing: it writes the IO data to the CDP data volume and the log volume, thereby backing up the written data. During this process, the first device updates the file index information and sends the updated file index information to the second device, which backs it up.
In this embodiment of the present application, after the first device generates the index information of the global files (the full file index), it monitors in real time whether file IO events occur on the first device and generates, in real time, index information for files that are added, deleted, queried, or modified (the incremental file index). Based on the full file index and the incremental file index, the second device (the backup apparatus) can present the index information of the global files to the user so that the user can select the target file to be restored. The user can view the content of the target data immediately, which ensures the accuracy of data recovery.
With reference to the first aspect, in a possible implementation of the first aspect, the updating, by the first device, of the file index information includes:
monitoring whether at least one file is added, deleted, or modified in the first device;
when at least one file is added, deleted, or modified in the first device, generating first sub-index information, where the first sub-index information indicates the index information of the at least one added, deleted, or modified file and further includes the operation type of the at least one file, the operation type including one or more of the following: add, delete, or modify;
updating the file index information according to the first sub-index information, where the updated file index information includes the first sub-index information.
For example, when at least one file is added, deleted, or modified in the first device, first sub-index information is generated. The first sub-index information indicates the index information of the at least one added, deleted, or modified file and further includes the operation type of that file, the operation type including one or more of the following: add, delete, or modify. The first sub-index information is, for example, the file index information corresponding to "15:05:22": at "15:05:22" the first device detects that the file "cc.txt" was added (or written); the file index (or address information) of "cc.txt" is "D:\aaa\bbb\cc.txt", and the operation type is add (or write), denoted "new".
Optionally, after obtaining the first sub-index information, the first device deduplicates and merges the first sub-index information with the file index information (the index information of the global files) to obtain the updated file index information.
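The deduplicate-and-merge step described above could be sketched as follows. This is only an illustration under stated assumptions: the `IndexRecord` type, the `merge_index` function, the representation of the full index as a set of paths, and the path "D:\aaa\bbb\old.txt" are all hypothetical; the application does not specify a data format or merge rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """One incremental (sub-index) record; a hypothetical layout."""
    time: str  # zero-padded "HH:MM:SS" at which the change was detected
    path: str  # file index (address information)
    op: str    # operation type: "new", "delete", or "modify"


def merge_index(full_index: set[str], increments: list[IndexRecord]) -> set[str]:
    """Apply incremental records, oldest first, to a copy of the full index.

    Using a set deduplicates paths that appear in both the full index
    and the incremental records.
    """
    merged = set(full_index)
    for rec in sorted(increments, key=lambda r: r.time):
        if rec.op == "delete":
            merged.discard(rec.path)
        else:  # "new" and "modify" both leave the path present
            merged.add(rec.path)
    return merged


# Assumed sample data; only the cc.txt entry comes from the text above.
full = {r"D:\aaa\bbb\old.txt"}
records = [
    IndexRecord("15:05:22", r"D:\aaa\bbb\cc.txt", "new"),
    IndexRecord("15:06:00", r"D:\aaa\bbb\cc.txt", "modify"),
    IndexRecord("15:07:10", r"D:\aaa\bbb\old.txt", "delete"),
]
updated = merge_index(full, records)
```

In this sketch the merged result keeps only the set of live paths; an implementation that must retain operation types in the updated index, as the method text suggests, would store the records themselves rather than bare paths.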
Optionally, the first device may also obtain the index information of the global files in response to an instruction or periodically.
With reference to the first aspect, in a possible implementation of the first aspect, after the file index information is sent to the second device, the method further includes: sending the updated file index information to the second device.
In a second aspect, an embodiment of the present application provides a data recovery method, applied to a second device, including:
receiving a first recovery request, where the first recovery request is used to request recovery of a target file at a target time;
determining, according to the first recovery request, target snapshot information from a snapshot information set, where the target snapshot information includes index information of the target file, the snapshot time of the target snapshot information is earlier than or equal to the target time and is the snapshot time in the snapshot information set with the smallest difference from the target time, and the snapshot information set includes at least one piece of snapshot information;
creating a target proxy volume according to the target snapshot information;
mounting the target proxy volume, where the target proxy volume provides a file input/output (IO) service and is used to obtain the target file;
restoring the target file according to the target proxy volume.
Specifically, after the user determines, through a visual interface, the target file to be restored, a third device sends the first recovery request to the second device; the first recovery request is used to request recovery of the target file at the target time.
For example, when the target time is "15:47:25", the snapshot time of the target snapshot information is determined to be T2. The second device then determines the target snapshot information (T2).
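The selection rule for the target snapshot (the latest snapshot time that is earlier than or equal to the target time) can be illustrated with a short sketch. The function name `pick_snapshot` and the sample snapshot times are assumptions made for illustration; in particular, treating 14:00:00, 15:00:00, and 15:47:28 as snapshot times T1, T2, and T3 is hypothetical.

```python
from bisect import bisect_right
from datetime import time


def pick_snapshot(snapshot_times: list[time], target: time) -> time:
    """Return the latest snapshot time that is <= target.

    Among snapshots not later than the target, the latest one is also
    the one with the smallest difference from the target time.
    """
    ordered = sorted(snapshot_times)
    pos = bisect_right(ordered, target)  # count of snapshots <= target
    if pos == 0:
        raise LookupError("no snapshot at or before the target time")
    return ordered[pos - 1]


# Assumed snapshot set: T1, T2, T3
snapshots = [time(14, 0, 0), time(15, 0, 0), time(15, 47, 28)]
chosen = pick_snapshot(snapshots, time(15, 47, 25))  # selects T2 (15:00:00)
```

A binary search is used here only because the snapshot set is naturally ordered by time; a linear scan would give the same result.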
The second device creates the target proxy volume according to the target snapshot information. Specifically, a linked clone is made from the target snapshot information to obtain the target proxy volume (that is, a copy of the target snapshot information). The target proxy volume serves as the proxy volume for data access (or file IO). For example, the second device mounts the target proxy volume through a mount server.
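One common way to realize a linked clone is a copy-on-write overlay: reads fall through to the snapshot unless a block has been written on the clone, and writes never touch the snapshot. The sketch below illustrates that general idea only; the `LinkedClone` class, the block-number keys, and the sample data are assumptions, and the application does not state how its linked clone is implemented.

```python
class LinkedClone:
    """Copy-on-write view over a read-only base snapshot (illustrative)."""

    def __init__(self, base: dict[int, bytes]):
        self._base = base                    # shared snapshot blocks, never modified
        self._delta: dict[int, bytes] = {}   # blocks written on the clone

    def read(self, block: int) -> bytes:
        # Prefer the clone's own copy; otherwise fall through to the snapshot.
        if block in self._delta:
            return self._delta[block]
        return self._base[block]

    def write(self, block: int, data: bytes) -> None:
        # Writes land in the overlay only, so the snapshot stays intact.
        self._delta[block] = data


# Hypothetical snapshot taken at T2, keyed by block number
snapshot_t2 = {0: b"boot", 1: b"old data"}
proxy = LinkedClone(snapshot_t2)
proxy.write(1, b"new data")
```

Because the clone stores only changed blocks, creating it is cheap compared with copying the whole snapshot, which is presumably why a linked clone suits a proxy volume that is mounted on demand.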
After the second device finishes restoring the target file at the target time, it sends the target file to the user. For example, the target file at the target time is sent to the user (the third device) as a file stream over the Hypertext Transfer Protocol (HTTP), so that the user can browse the target file immediately.
In this embodiment of the present application, the target data at the target time can be browsed in real time without repeated trial PITR (data recovery). The user can view the content of the target data immediately, which ensures the accuracy of data recovery.
With reference to the second aspect, in a possible implementation of the second aspect, before the first recovery request is received, the method further includes: receiving a second recovery request, where the second recovery request is used to request recovery of data at the target time;
determining target file index information according to the second recovery request, where the recording time of the target file index information is earlier than or equal to the target time and is the recording time in a file index information set with the smallest difference from the target time, the file index information set includes at least one piece of file index information, and the file index information includes index information of global files in the first device;
restoring the target file index information, where the target file index information is used to determine the target file.
Specifically, the user selects the target time through the third device; the target time is the moment to which the user wants the data restored. The user sends the second recovery request to the second device through the third device, and the second recovery request is used to restore the data at the target time.
For example, the third device displays a data recovery menu on a browser page, and the user selects in that menu the target time to which the data is to be restored.
For example, when the target time is "15:47:25", the recording time of the closest file index information is determined to fall between T2 and T3 (15:00:00 to 15:47:28) and to be earlier than 15:47:28. The second device then determines the target file index information, which includes the index information of the global files recorded at 15:00:00 (also called the full file index) and the first sub-index information recorded from 15:00:00 up to but not including 15:47:28 (also called the incremental file index). Optionally, the target file index information is obtained by merging the full file index and the incremental file index.
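As a rough sketch of how the target file index might be assembled for a given target time: take the latest full index recorded at or before the target time, then apply the incremental records that fall after that full index and no later than the target time. The function `index_at`, the tuple layout `(time, path, op)`, and the sample entries (including "readme.txt" and the delete at 15:47:28) are hypothetical.

```python
from datetime import time


def index_at(full_indexes: dict[time, set[str]],
             increments: list[tuple[time, str, str]],
             target: time) -> set[str]:
    """Rebuild the set of file paths as of the target time."""
    candidates = [t for t in full_indexes if t <= target]
    if not candidates:
        raise LookupError("no full index at or before the target time")
    base_time = max(candidates)          # latest full index <= target
    paths = set(full_indexes[base_time])
    for when, path, op in sorted(increments, key=lambda r: r[0]):
        if base_time < when <= target:   # only changes up to the target time
            if op == "delete":
                paths.discard(path)
            else:                        # "new" or "modify"
                paths.add(path)
    return paths


# Assumed sample data: one full index at T2 and two incremental records
full_indexes = {time(15, 0, 0): {r"D:\aaa\bbb\readme.txt"}}
increments = [
    (time(15, 5, 22), r"D:\aaa\bbb\cc.txt", "new"),
    (time(15, 47, 28), r"D:\aaa\bbb\cc.txt", "delete"),  # after the target, ignored
]
target_index = index_at(full_indexes, increments, time(15, 47, 25))
```

With these sample values the record at 15:47:28 is excluded, so "cc.txt" still appears in the index shown to the user for 15:47:25.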
Optionally, the target file index information may indicate the file structure of the global files; the target file index information may therefore be called the directory structure information of the full file system.
With reference to the second aspect, in a possible implementation of the second aspect, restoring the target file according to the target proxy volume includes:
redoing the log file according to the target time, and restoring the target file at the target time;
rolling back the data of the target proxy volume to the target time according to the target time to obtain an updated target proxy volume, where the index information indicated by the updated target proxy volume includes the index information of the target file at the target time;
restoring the target file according to the updated target proxy volume.
Specifically, the second device redoes the log file according to the target time and then restores the target file at the target time from the memory of the second device, for example by extracting the target file at the target time "15:47:25" from that memory. The second device rolls the data of the target proxy volume back to the target time, which yields the updated target proxy volume. The index information indicated by the updated target proxy volume includes the index information of the target file at the target time.
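One plausible reading of the redo step is that logged IO writes made after the snapshot time and up to the target time are replayed onto the proxy volume, so that the volume reflects the data as of the target time. The sketch below illustrates only that reading; the function `replay_to`, the log entry layout `(time, block, data)`, and the sample values are assumptions and are not taken from the application.

```python
from datetime import time


def replay_to(base_blocks: dict[int, bytes],
              log: list[tuple[time, int, bytes]],
              snapshot_time: time,
              target: time) -> dict[int, bytes]:
    """Replay logged writes in (snapshot_time, target] onto a copy of the snapshot."""
    volume = dict(base_blocks)  # work on a copy; the snapshot is left untouched
    for when, block, data in sorted(log, key=lambda e: e[0]):
        if snapshot_time < when <= target:
            volume[block] = data
    return volume


# Hypothetical snapshot at 15:00:00 and CDP log entries
base = {0: b"v1"}
io_log = [
    (time(15, 10, 0), 0, b"v2"),
    (time(15, 47, 26), 0, b"corrupt"),  # written after the target time, skipped
]
restored = replay_to(base, io_log, time(15, 0, 0), time(15, 47, 25))
```

Stopping the replay at the target time is what lets a write made one second later, such as the damaging one in this example, be left out of the restored view.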
With reference to the second aspect, in a possible implementation of the second aspect, the method further includes: obtaining first snapshot information, where the first snapshot information is snapshot information of the global files in the first device at a first time.
With reference to the second aspect, in a possible implementation of the second aspect, after the first snapshot information is obtained, the method further includes: obtaining second snapshot information, where the second snapshot information is snapshot information of the global files in the first device at a second time; and updating the snapshot information set, where the updated snapshot information set includes the first snapshot information and the second snapshot information.
Specifically, the second device obtains snapshot information according to a user policy; the snapshot information is snapshot information of the global files in the first device at a given moment. The user policy includes but is not limited to the following: at a predefined time, the first device obtains the file index information and, at the same predefined time, the second device obtains the snapshot information of the global files. As another example, the first device periodically obtains the file index information and the second device periodically obtains the snapshot information of the global files, with the same period for both. For example, after obtaining the snapshot information, the second device writes the snapshot information (or snapshot data) to the CDP data volume.
Optionally, after obtaining the snapshot information, the second device may mount the snapshot on a mount server and then scan the mounted snapshot (that is, scan the file system in the snapshot) to obtain the index information of the global files. The second device backs up the index information obtained by the scan as file index information.
With reference to the second aspect, a possible implementation of the second aspect further includes:
receiving file index information from the first device, where the file index information includes index information of global files in the first device;
saving the file index information in association with the snapshot information set.
In a third aspect, an embodiment of the present application provides a data recovery apparatus, applied to a first device, including:
a transceiver module, configured to obtain file index information, where the file index information includes index information of global files in the first device;
the transceiver module is further configured to send the file index information from the first device to a second device, where the second device is configured to perform data backup;
a processing module, configured to update the file index information, where the updated file index information includes updated index information of at least one file in the first device.
In a possible implementation,
the processing module is further configured to monitor whether at least one file is added, deleted, or modified in the first device;
the processing module is further configured to generate first sub-index information when at least one file is added, deleted, or modified in the first device, where the first sub-index information indicates the index information of the at least one added, deleted, or modified file and further includes the operation type of the at least one file, the operation type including one or more of the following: add, delete, or modify;
the processing module is further configured to update the file index information according to the first sub-index information, where the updated file index information includes the first sub-index information.
In a possible implementation,
the transceiver module is further configured to send the updated file index information to the second device.
In a fourth aspect, an embodiment of the present application provides a data recovery apparatus, applied to a second device, including:
a transceiver module, configured to receive a first recovery request, where the first recovery request is used to request recovery of a target file at a target time;
a processing module, configured to determine, according to the first recovery request, target snapshot information from a snapshot information set, where the target snapshot information includes index information of the target file, the snapshot time of the target snapshot information is earlier than or equal to the target time and is the snapshot time in the snapshot information set with the smallest difference from the target time, and the snapshot information set includes at least one piece of snapshot information;
the processing module is further configured to create a target proxy volume according to the target snapshot information;
the processing module is further configured to mount the target proxy volume, where the target proxy volume provides a file input/output (IO) service and is used to obtain the target file;
the processing module is further configured to restore the target file according to the target proxy volume.
In a possible implementation,
the transceiver module is further configured to receive a second recovery request, where the second recovery request is used to request recovery of data at the target time;
the processing module is further configured to determine target file index information according to the second recovery request, where the recording time of the target file index information is earlier than or equal to the target time and is the recording time in a file index information set with the smallest difference from the target time, the file index information set includes at least one piece of file index information, and the file index information includes index information of global files in the first device;
the processing module is further configured to restore the target file index information, where the target file index information is used to determine the target file.
In a possible implementation,
the processing module is further configured to redo the log file according to the target time and restore the target file at the target time;
the processing module is further configured to roll back the data of the target proxy volume to the target time according to the target time to obtain an updated target proxy volume, where the index information indicated by the updated target proxy volume includes the index information of the target file at the target time;
the processing module is further configured to restore the target file according to the updated target proxy volume.
In a possible implementation,
the transceiver module is further configured to obtain first snapshot information, where the first snapshot information is snapshot information of the global files in the first device at a first time.
In a possible implementation,
the transceiver module is further configured to obtain second snapshot information, where the second snapshot information is snapshot information of the global files in the first device at a second time;
the processing module is further configured to update the snapshot information set, where the updated snapshot information set includes the first snapshot information and the second snapshot information.
In a possible implementation,
the transceiver module is further configured to receive file index information from the first device, where the file index information includes index information of global files in the first device;
the processing module is further configured to save the file index information in association with the snapshot information set.
In a fifth aspect, an embodiment of this application provides a data recovery apparatus that has the function of implementing the method of the first aspect or any possible implementation of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function, for example, a transceiver module, a processing module, and a storage module.
In a sixth aspect, an embodiment of this application provides a data recovery apparatus. The data recovery apparatus includes at least one processor and a memory, and the memory stores computer instructions that can be run on the processor. When the computer instructions are executed by the processor, the processor performs the method according to the first aspect or any possible implementation of the first aspect.
In a seventh aspect, an embodiment of this application provides a data recovery apparatus that has the function of implementing the method of the second aspect or any possible implementation of the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function, for example, a transceiver module, a processing module, and a storage module.
In an eighth aspect, an embodiment of this application provides a data recovery apparatus. The data recovery apparatus includes at least one processor and a memory, and the memory stores computer instructions that can be run on the processor. When the computer instructions are executed by the processor, the processor performs the method according to the second aspect or any possible implementation of the second aspect.
In a ninth aspect, an embodiment of this application provides a computer device. The computer device includes at least one processor, a memory, a communication port, a display, and computer-executable instructions that are stored in the memory and can be run on the processor. When the computer-executable instructions are executed by the processor, the processor performs the method according to any possible implementation of the first aspect or the second aspect.
In a tenth aspect, an embodiment of this application provides a computer-readable storage medium storing one or more computer-executable instructions. When the computer-executable instructions are executed by a processor, the processor performs the method according to any possible implementation of the first aspect or the second aspect.
In an eleventh aspect, an embodiment of this application provides a computer program product (also referred to as a computer program) storing one or more computer-executable instructions. When the computer-executable instructions are executed by a processor, the processor performs the method according to any possible implementation of the first aspect or the second aspect.
In a twelfth aspect, this application provides a chip system. The chip system includes a processor configured to support a computer device in implementing the functions involved in the foregoing aspects. In a possible design, the chip system further includes a memory configured to store the program instructions and data necessary for the computer device. The chip system may consist of a chip, or may include a chip and other discrete components.
For the technical effects of the third to twelfth aspects or any possible implementation thereof, refer to the technical effects of the corresponding implementations of the first aspect or the second aspect. Details are not repeated here.
Brief Description of the Drawings
Figure 1a is a schematic diagram of a computer device 100 according to an embodiment of this application;
Figure 1b is a schematic diagram of an application scenario according to an embodiment of this application;
Figure 2 is a schematic diagram of an embodiment of a data recovery method according to an embodiment of this application;
Figure 3 is a schematic diagram of block-level CDP;
Figure 4 is a schematic diagram of a scenario for collecting file index information according to an embodiment of this application;
Figure 5 is a schematic diagram of a data recovery scenario according to an embodiment of this application;
Figure 6 is a schematic diagram of an embodiment of a data recovery apparatus according to an embodiment of this application.
Detailed Description
The terms "first", "second", "third", "fourth", and the like (if any) in the specification, claims, and accompanying drawings of this application are used to distinguish between similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "include" and "corresponding to" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Traditional hardware resources are costly and quickly become outdated, so many enterprises choose to rent cloud resources instead, and cloud technology has emerged in response. Renting cloud resources requires only a rental fee to fully replace the original hardware resources, and upgrades and capacity expansion require no purchase of new hardware, which can reduce an enterprise's costs to some extent.
It can be understood that cloud technology is a hosting technology that unifies hardware, software, network, and other resources within a wide area network or local area network to implement computing, storage, processing, and sharing of data. Cloud technology is a general term for the network technologies, information technologies, integration technologies, management platform technologies, application technologies, and the like that are applied based on the cloud computing business model. These can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. The background services of technical network systems, such as video websites, image websites, and portal websites, require large amounts of computing and storage resources. With the rapid development and application of the Internet industry, every item may in the future have its own identifier that needs to be transmitted to a background system for logical processing, data of different levels will be processed separately, and all kinds of industry data will require strong system support, which can only be achieved through cloud computing.
Cloud computing refers to a delivery and usage model of IT infrastructure, in which required resources are obtained over a network in an on-demand, easily scalable manner. In a broad sense, cloud computing refers to a delivery and usage model of services, in which required services are obtained over a network in an on-demand, easily scalable manner. Such a service may relate to information technology (IT), software, or the Internet, or may be another service. Cloud computing is a product of the development and convergence of traditional computer and network technologies such as grid computing, distributed computing, parallel computing, utility computing, network storage, virtualization, and load balancing. Cloud computing has grown rapidly with the development of the Internet, real-time data streams, and the diversification of connected devices, and with the demand for search services, social networks, mobile commerce, and open collaboration. Unlike earlier parallel distributed computing, the emergence of cloud computing will conceptually drive revolutionary changes in the entire Internet model and in enterprise management models.
Figure 1a is a schematic diagram of a computer device 100 according to an embodiment of this application. The computer device 100 may serve as the first device and/or the second device described in the embodiments of this application. As shown in Figure 1a, the computer device 100 includes a processor 102 and a memory 104. The processor 102 is connected to the memory 104 through a double data rate (DDR) bus 103. Different memories 104 may communicate with the processor 102 over different data buses, so the DDR bus 103 may be replaced with another type of data bus; the embodiments of this application do not limit the bus type. In addition, the computer device 100 includes various I/O devices 107, which the processor 102 can access through a PCIe bus 105.
The processor 102 is the computing core and control core of the computer device 100. The processor 102 may include one or more processor cores (Core) 204. The processor 102 may be a very-large-scale integrated circuit. An operating system and other software programs are installed on the processor 102, so that the processor 102 can access the memory 104 and various PCIe devices. It can be understood that, in the embodiments of the present invention, the Core 204 in the processor 102 may be, for example, a central processing unit (CPU) or another application-specific integrated circuit (ASIC). The processor 102 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. In practice, the computer device 100 may also include multiple processors.
The memory controller is a bus circuit controller inside the computer device 100 that controls the memory 104 and manages and schedules data transfer between the memory 104 and the Core 204. Data can be exchanged between the memory 104 and the Core 204 through the memory controller. The memory controller may be a separate chip connected to the Core 204 through a system bus. A person skilled in the art will know that the memory controller may also be integrated into the processor 102, built into a north bridge, or implemented as an independent memory controller chip; the embodiments of the present invention do not limit the specific location or form of the memory controller. In practice, the memory controller can control the logic necessary to write data to the memory 104 or read data from the memory 104. The memory controller may be the memory controller of a processor system such as a general-purpose processor, a dedicated accelerator, a GPU, an FPGA, or an embedded processor.
The memory 104 is the main memory of the computer device 100. The memory 104 is typically used to store the software running in the operating system, input and output data, and information exchanged with external storage. To keep up with the access speed of the processor 102, the memory 104 needs to offer fast access. In a traditional computer system architecture, dynamic random access memory (DRAM) is typically used as the memory 104. The processor 102 can access the memory 104 at high speed through the memory controller and perform read and write operations on any storage unit in the memory 104. Besides DRAM, the memory 104 may be another random access memory, such as static random access memory (SRAM). The memory 104 may also be a read-only memory (ROM), for example, a programmable read-only memory (PROM) or an erasable programmable read-only memory (EPROM). This embodiment does not limit the number or type of the memory 104. In addition, the memory 104 may be configured with a power-loss protection function, which means that the data stored in the memory is not lost when the system loses power and is powered on again. A memory 104 with the power-loss protection function is called a non-volatile memory.
An input/output (I/O) device 107 is hardware capable of data transfer, and can also be understood as a device connected to an I/O interface. Common I/O devices include network interface cards, printers, keyboards, and mice. All external storage, such as hard disks, floppy disks, and optical discs, can also serve as I/O devices. The processor 102 can access each I/O device 107 through the PCIe bus 105. It should be noted that the PCIe bus 105 is only an example and may be replaced with another bus, such as a Unified Bus (UB).
The baseboard management controller (BMC) 106 can upgrade the firmware of the device, manage the running state of the device, and troubleshoot faults. The processor 102 can access the baseboard management controller 106 through a PCIe bus or a bus such as USB or I2C. The baseboard management controller 106 may also be connected to at least one sensor and obtain status data of the computer device through the sensor, where the status data includes temperature data, current data, voltage data, and so on. This application does not specifically limit the type of the status data. The baseboard management controller 106 communicates with the processor 102 through the PCIe bus or another type of bus, for example, to pass the obtained status data to the processor 102 for processing. The baseboard management controller 106 can also maintain the program code in the memory, including upgrading or restoring it. The baseboard management controller 106 can further control the power supply circuit or clock circuit in the computer device 100. In short, the baseboard management controller 106 can manage the computer device 100 in the foregoing ways. However, the baseboard management controller 106 is only an optional device. In some implementations, the processor 102 can communicate directly with the sensor and thereby manage and maintain the computer device directly.
Some concepts involved in the embodiments of this application are introduced first:
1. Kernel mode and user mode.
Because the access capabilities of different programs need to be restricted, to prevent a program from obtaining the memory data of other programs or the data of peripheral devices, the processor defines two privilege levels: user mode and kernel mode.
When a task or process makes a system call and executes in kernel code, the process is said to be in kernel mode. The processor is then executing kernel code at the highest privilege level. In kernel mode, the processor can access all data in memory as well as peripheral devices such as hard disks and network interface cards, and the processor can also switch itself from one program to another. When a process is executing the user's own code, it is said to be in user mode. The processor is then running user code at the lowest privilege level and can use only the regular processor instruction set, not the instructions that operate on hardware resources. In user mode, the processor has only restricted access to memory and is not allowed to access peripheral devices directly, for example, to perform I/O reads and writes, access a network interface card, or allocate memory. In addition, the processor can be taken over by other programs.
In general, applications run in user mode, but sometimes an application needs to do something that requires kernel mode, such as reading data from a hard disk or obtaining input from the keyboard. Only the operating system can do these things, so the program needs to switch from user mode to kernel mode. There are three ways to switch from user mode to kernel mode. A. System call. A system call is a way for a user-mode process to actively request a switch to kernel mode; through the system call, the user-mode process requests a service routine of the operating system to complete the work. B. Exception. When the CPU is executing a program running in user mode and an exception occurs that could not be known in advance, the current process is switched to the kernel routine that handles the exception, that is, to kernel mode. A page fault is an example. C. Peripheral device interrupt. When a peripheral device completes an operation requested by the user, it sends a corresponding interrupt signal to the CPU. The CPU then suspends the next instruction it was about to execute and runs the handler corresponding to the interrupt signal. If the previously executed instruction belonged to a user-mode program, this transition naturally involves a switch from user mode to kernel mode. For example, when a hard disk read or write completes, the system switches to the interrupt handler for hard disk reads and writes to perform subsequent operations. These three ways are the main ways in which a running system moves from user mode to kernel mode. A system call can be regarded as actively initiated by the user process, whereas exceptions and peripheral device interrupts are passive.
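The system call path (way A) can be observed from an ordinary user-mode program. The following is a minimal sketch in Python, chosen only for illustration: `os.pipe`, `os.write`, and `os.read` are thin wrappers around operating system calls, so each call makes the process enter kernel mode, have the kernel perform the I/O, and return to user mode.

```python
import os

# Creating a pipe requires the kernel: the process enters kernel mode,
# the kernel allocates the pipe, and two file descriptors are returned.
read_fd, write_fd = os.pipe()

# The write and read below are also system calls. The user-mode code never
# touches the kernel's pipe buffer directly; it only asks the kernel to do so.
written = os.write(write_fd, b"hello")
data = os.read(read_fd, 5)

# Releasing the descriptors is again a request to the kernel.
os.close(read_fd)
os.close(write_fd)
```

Exceptions and peripheral interrupts (ways B and C) are not requested by the program and therefore do not appear as explicit calls in user code.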
2. Virtualization technology.
Virtualization is a logical representation of resources that frees them from physical constraints. Any technology that maps one form of interfaces and resources to another form of interfaces and resources can be called a virtualization technology. It is generally implemented by adding a virtualization software layer to the system, which abstracts the lower-layer resources into another form of resources and provides them to the upper layer.
A computer system emulated through virtualization is called a virtual machine (VM). The underlying machine that runs the virtual machine is called the host. The software running in the virtual environment is called the guest. Where operating systems are present, the operating system running on the underlying machine is correspondingly called the host operating system (Host OS), and the operating system running on the virtual machine is called the guest operating system (Guest OS).
3. Snapshot technology.
Enterprises around the world recognize the business value of their data and therefore seek reliable, cost-effective ways to protect the information stored on their computer networks while minimizing the impact on productivity. A company backs up its critical computing systems, such as databases, file servers, and web servers, as part of a daily, weekly, or monthly maintenance schedule. A company also seeks to protect the computer systems used by each of its employees, such as those used by the accounting, marketing, and engineering departments. Given the rapidly growing volume of managed data, companies continue to look for innovative technologies to manage data growth in addition to protecting the data.
Historically, snapshot and clone operations have been performed on storage containers as virtual objects. A clone is a copy of an existing virtual machine (VM). The existing virtual machine is called the parent of the clone. After the clone operation completes, the clone is a separate virtual machine, whereas a snapshot is a copy of the virtual machine's disk files at a given point in time. A snapshot provides a change log for the virtual disk, which is used to restore the virtual machine to a specific point in time when a failure or system error occurs. Put simply, a clone makes a complete copy of something, whereas a snapshot makes an initial copy and then records only the subsequent changes. In addition, storage snapshots can be used to create clones.
For storage area network (SAN) arrays, these operations are performed on logical unit numbers (LUNs); for network attached storage (NAS) arrays, they are performed on volumes or aggregates, both of which are logical objects. The two operations effectively enable a user to capture one or more point-in-time images of a data set (called snapshots) and to share an image among multiple users in a non-overlapping manner (called clones).
In this application, "clone" means making a complete, writable copy of an object. The object may be a file, a directory, or a volume.
In this application, "snapshot" means capturing an original image at a specific point in time, after which each subsequent image records only its differences from the previous image.
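The difference between the two definitions can be sketched in a few lines. In this sketch a volume is modeled as a dictionary from block number to block content; the names `clone` and `SnapshotChain` are hypothetical, and block deletion is ignored for brevity. It is an illustration of the definitions, not the storage layout used by the embodiments.

```python
def clone(volume: dict) -> dict:
    """Clone: a complete, independently writable copy of the object."""
    return dict(volume)


class SnapshotChain:
    """Snapshot: one original image, then only the differences per image."""

    def __init__(self, volume: dict):
        self.base = dict(volume)   # original image at the first point in time
        self.deltas = []           # one dict of changed blocks per later image
        self._last = dict(volume)  # state at the most recent image

    def take(self, volume: dict) -> None:
        # Record only the blocks that differ from the previous image.
        delta = {k: v for k, v in volume.items() if self._last.get(k) != v}
        self.deltas.append(delta)
        self._last = dict(volume)

    def restore(self, n: int) -> dict:
        # Rebuild the n-th image by replaying the first n deltas on the base.
        image = dict(self.base)
        for delta in self.deltas[:n]:
            image.update(delta)
        return image


volume = {0: b"A", 1: b"B"}
chain = SnapshotChain(volume)
volume[1] = b"C"
chain.take(volume)   # stores only block 1
volume[0] = b"D"
chain.take(volume)   # stores only block 0
copy = clone(volume)
```

Each entry in `chain.deltas` holds a single block, whereas `copy` holds the whole volume, which is the storage trade-off that the two definitions describe.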
The data recovery method proposed in the embodiments of this application is described below with reference to the accompanying drawings. First, an application scenario to which the embodiments of this application apply is shown in Figure 1b, which is a schematic diagram of an application scenario according to an embodiment of this application. The data recovery method proposed in the embodiments of this application may be used in a distributed storage system or in a centralized storage system; the embodiments of this application do not limit this.
The application scenario includes a first device, a second device, and a third device. The first device, also called the production host, generates the data that needs to be backed up. The second device, also called the backup host, backs up the data generated by the first device. The third device is also called the user host. A user accesses the second device through the third device and then, through the third device, issues an instruction to the second device to restore the backup data.
In a possible implementation, the first device, the second device, and the third device are deployed on the same physical computer device (which may include one or more physical computer devices, for example, a cluster of computer devices). The first device, the second device, and the third device belong to different virtual machines.
In another possible implementation, the first device and the second device are deployed on different physical computer devices. The third device and the first device are the same physical computer device, and the third device provides a user-facing function window, for example, a function window provided by a web browser. By accessing the function window, the user issues an instruction to the second device to restore the backup data.
In another possible implementation, the first device, the second device, and the third device are deployed on different physical computer devices.
Refer to Figure 2, which is a schematic diagram of an embodiment of a data recovery method according to an embodiment of this application. The data recovery method proposed in the embodiments of this application includes the following steps:
201. The first device obtains file index information.
In this embodiment, the first device first obtains the file index information of the global files in the first device. For example, the file index information includes the file index information of each of the global files. The file index information of one file is, for example, "D:\aaa\bbb\cc.txt".
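One way to picture step 201 is as a traversal of the file system that records the path of every file. The sketch below assumes that the file index information of a file is simply its full path, as in the example "D:\aaa\bbb\cc.txt"; the function name `collect_file_index` is hypothetical, and the application does not state how the first device actually gathers the index.

```python
import os


def collect_file_index(root: str) -> list:
    """Return the path of every file under root, as a sorted list."""
    index = []
    # os.walk visits each directory below root and lists the files in it.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.append(os.path.join(dirpath, name))
    return sorted(index)
```

Calling `collect_file_index` on a drive root would yield one entry per file, which the first device could then send to the second device in step 202.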
202. The first device sends the file index information to the second device.
In this embodiment, after obtaining the file index information, the first device sends the file index information to the second device, and the second device backs it up.
203. The first device updates the file index information.
In this embodiment, after step 201, the first device monitors whether at least one file is added, deleted, or modified in the first device. When at least one file is added, deleted, or modified in the first device, first sub-index information is generated. The first sub-index information indicates the index information of the at least one file that is added, deleted, or modified in the first device, and further includes the operation type of the at least one file, where the operation type includes one or more of the following: add, delete, or modify. The file index information is updated according to the first sub-index information, and the updated file index information includes the first sub-index information.
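The generation of the first sub-index information can be sketched as a comparison of two observations of the file system. The sketch assumes that each observation maps a file path to a change marker such as a modification time or hash, and that the operation types are labeled "new", "delete", and "modify"; these representations, and the names `SubIndexEntry`, `build_sub_index`, and `update_file_index`, are hypothetical. The embodiment may instead detect changes by intercepting I/O rather than by comparing scans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubIndexEntry:
    path: str       # index information of the affected file
    operation: str  # assumed labels: "new", "delete", or "modify"


def build_sub_index(before: dict, after: dict) -> list:
    """Compare two path -> marker mappings and list the changed files."""
    entries = []
    for path in sorted(after.keys() - before.keys()):
        entries.append(SubIndexEntry(path, "new"))
    for path in sorted(before.keys() - after.keys()):
        entries.append(SubIndexEntry(path, "delete"))
    for path in sorted(before.keys() & after.keys()):
        if before[path] != after[path]:
            entries.append(SubIndexEntry(path, "modify"))
    return entries


def update_file_index(file_index: list, sub_index: list) -> list:
    # The updated file index information includes the sub-index information.
    return file_index + sub_index


before = {"D:\\aaa\\a.txt": 1, "D:\\aaa\\b.txt": 1}
after = {"D:\\aaa\\a.txt": 2, "D:\\aaa\\bbb\\cc.txt": 1}
sub_index = build_sub_index(before, after)
updated = update_file_index(sorted(before), sub_index)
```

Appending the sub-index entries, rather than rewriting the original index, keeps a record of what changed and how, which is what the operation type in the sub-index information is for.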
示例性的,当第一设备新增、删除或者修改至少一个文件时,例如第一设备发送IO写入时,第一设备截获该IO写入并将该IO写入的数据发送至第二设备。由第二设备执行CDP处理,将IO写入的数据写入CDP数据卷和日志卷,执行该IO写入的数据的备份存储。在此过程中,第一设备更新文件索引信息,并将更新后的文件索引信息发送至第二设备,由第二设备进行备份存储。For example, when the first device adds, deletes, or modifies at least one file, for example, when the first device sends an IO write, the first device intercepts the IO write and sends the IO write data to the second device. . The second device performs CDP processing, writes IO written data to the CDP data volume and log volume, and performs backup storage of the IO written data. During this process, the first device updates the file index information and sends the updated file index information to the second device, and the second device performs backup and storage.
示例性的,更新后的文件索引信息如表1所示:For example, the updated file index information is shown in Table 1:
Table 1
In Table 1, the first device obtains the file index information of its global files for the first time at "14:00:00". The first device then monitors whether at least one file is added, deleted, or modified in the first device.
When at least one file is added, deleted, or modified in the first device, first sub-index information is generated. The first sub-index information indicates the index information of the at least one file that was added, deleted, or modified in the first device, and also includes the operation type of the at least one file, where the operation type includes one or more of the following: add, delete, or modify. An example of first sub-index information is the file index information corresponding to "15:05:22" in Table 1. Specifically, at "15:05:22" the first device detects that the file "cc.txt" has been added (or written); the file index (or address information) of "cc.txt" is "D:\aaa\bbb\cc.txt", and the operation type is add (or write), recorded as "new".
The file index information is updated according to the first sub-index information, and the updated file index information includes the first sub-index information.
Optionally, after obtaining the first sub-index information, the first device deduplicates and merges the first sub-index information with the file index information (the index information of the global files) to obtain the updated file index information.
Optionally, the first device may also obtain the index information of the global files on instruction or periodically.
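The update in step 203 can be sketched as follows. This is a minimal illustration in Python, not the method as claimed: the `SubIndexEntry` type, the `apply_sub_index` function, the path-to-time dictionary layout, and the path `D:\aaa\old.txt` are all assumptions made for the example. It shows one possible reading of the deduplicate-and-merge step, in which each incremental record either adds, refreshes, or removes a path in the full index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubIndexEntry:
    """One incremental record: when it was recorded, which file, which operation."""
    recorded_at: str  # zero-padded "HH:MM:SS", so string order equals time order
    path: str
    op: str  # "new", "delete" or "modify"


def apply_sub_index(index: dict[str, str], entries: list[SubIndexEntry]) -> dict[str, str]:
    """Return a copy of `index` (path -> last recorded time) with `entries` merged in."""
    merged = dict(index)
    for entry in sorted(entries, key=lambda e: e.recorded_at):
        if entry.op == "delete":
            merged.pop(entry.path, None)
        elif entry.op in ("new", "modify"):
            merged[entry.path] = entry.recorded_at
        else:
            raise ValueError(f"unknown operation type: {entry.op}")
    return merged


# Hypothetical full index taken at 14:00:00, then the "cc.txt" record from the text.
full_index = {r"D:\aaa\old.txt": "14:00:00"}
updated = apply_sub_index(
    full_index,
    [SubIndexEntry("15:05:22", r"D:\aaa\bbb\cc.txt", "new")],
)
```

Returning a new dictionary instead of mutating the input keeps the previously recorded full index available, which matters if older index versions must remain addressable by recording time.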
204. The first device sends the updated file index information to the second device.
In this embodiment, the first device sends the updated file index information to the second device.
Further, the first device sends the file data to the second device, and the second device backs up the file data.
205. The second device obtains snapshot information.
In this embodiment, the second device obtains snapshot information according to a user policy. The snapshot information is a snapshot of the global files in the first device at a given moment. The user policy includes but is not limited to the following: at a predefined time, the first device obtains file index information and, at the same predefined time, the second device obtains snapshot information of the global files. As another example, the first device periodically obtains file index information and the second device periodically obtains snapshot information of the global files, with both using the same period.
For example, after obtaining the snapshot information, the second device writes the snapshot information (also called snapshot data) to the CDP data volume.
After the second device obtains the snapshot information, the second device stores the file index information in association with the snapshot information.
Table 2
In Table 2, the first device obtains its file index information (global files) for the first time at "14:00:00", and at "14:00:00" the second device also obtains the snapshot information of the first device's global files. Then, after the second device receives the file index information from the first device, the second device stores the "14:00:00" file index information in association with the snapshot information (T1). After the first device updates the file index information, it sends the updated file index information to the second device, which backs it up. Then, at time T2 (15:00:00), the first device obtains the snapshot information (T2) and the "15:00:00" file index information, and the second device stores the "15:00:00" file index information in association with the snapshot information (T2). And so on.
Optionally, after the second device obtains the snapshot information, the second device may mount the snapshot in a mount server. It then scans the mounted snapshot (that is, scans the file system in the snapshot) to obtain the index information of the global files. The second device backs up the index information of the global files obtained by the scan as file index information.
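The optional scan of a mounted snapshot can be sketched as a directory walk. This is only an illustration under assumptions: the function name `scan_mounted_snapshot` and the idea that the mounted snapshot is reachable as an ordinary directory path are not stated in the text.

```python
import os


def scan_mounted_snapshot(mount_point: str) -> list[str]:
    """Walk a mounted snapshot and return every file path relative to the mount point."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.append(os.path.relpath(full_path, mount_point))
    return sorted(found)
```

The returned list plays the role of the full file index for the snapshot time; storing paths relative to the mount point keeps the index independent of where the snapshot happens to be mounted.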
206. The second device receives a second recovery request from the third device, where the second recovery request is used to recover the data at the target time.
In this embodiment, the user selects a target time through the third device; the target time is the time whose data the user wants to recover. The user sends the second recovery request to the second device through the third device, and the second recovery request is used to recover the data at the target time.
For example, the third device displays a data recovery menu on a browser page, and the user selects in that menu the target time whose data is to be recovered.
207. The second device determines the target file index information according to the second recovery request.
In this embodiment, the second device determines the target file index information according to the second recovery request. The recording time of the target file index information is earlier than or equal to the target time, and it is the recording time in the file index information set that has the smallest difference from the target time. The file index information set includes at least one piece of file index information, and the file index information includes the index information of the global files in the first device.
For example, taking Table 2, when the target time is "15:47:25", it is determined that the recording time of the closest file index information lies between T2 and T3 (15:00:00 to 15:47:28), and is no earlier than 15:47:28. The second device then determines the target file index information, which includes the index information of the global files recorded at 15:00:00 (also called the full file index) and the first sub-index information recorded from 15:00:00 to 15:47:28, excluding 15:47:28 (also called the incremental file index). Optionally, the target file index information is obtained by merging the full file index and the incremental file index.
Optionally, the target file index information may indicate the file structure information of the global files; the target file index information may therefore be called the directory structure information of the full file system.
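The selection rule in step 207 (latest recording time not later than the target time, then merge the full and incremental indexes) can be sketched as below. The function `build_target_index`, the tuple layouts, and the sample file names other than `cc.txt` are assumptions made for illustration; times are zero-padded "HH:MM:SS" strings on the same day, so that string comparison matches time order.

```python
from bisect import bisect_right


def build_target_index(full_indexes, increments, target):
    """Pick the latest full index recorded at or before `target`, then merge increments.

    full_indexes: list of (recorded_at, set_of_paths), sorted by recorded_at
    increments:   list of (recorded_at, path, op), sorted by recorded_at
    Returns (base_time, set_of_paths_at_target).
    """
    times = [recorded_at for recorded_at, _ in full_indexes]
    pos = bisect_right(times, target)
    if pos == 0:
        raise LookupError("no full index recorded at or before the target time")
    base_time, base_paths = full_indexes[pos - 1]

    paths = set(base_paths)
    for recorded_at, path, op in increments:
        if base_time <= recorded_at <= target:
            if op == "delete":
                paths.discard(path)
            else:  # "new" or "modify" both leave the path present
                paths.add(path)
    return base_time, paths


# Hypothetical data: full indexes at T1 and T2, three incremental records.
fulls = [("14:00:00", {"a.txt"}), ("15:00:00", {"a.txt", "b.txt"})]
incs = [
    ("15:05:22", "cc.txt", "new"),
    ("15:30:00", "b.txt", "delete"),
    ("15:47:28", "d.txt", "new"),
]
base, target_paths = build_target_index(fulls, incs, "15:47:25")
```

In this sketch the record at 15:47:28 is ignored because it is later than the requested time, so the resulting directory view reflects only changes up to the target time.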
208. The second device sends the target file index information to the third device.
In this embodiment, the second device sends the target file index information to the third device. Optionally, the third device displays the directory structure of the global files at the target time to the user based on the target file index information. For example, the directory structure of the global files at the target time is displayed on a browser page so that the user can select, through a visual interface, the target file to be recovered. The target file may include one or more files.
209. The second device receives a first recovery request from the third device, where the first recovery request is used to request recovery of the target file at the target time.
In this embodiment, after the user determines through the visual interface the target file to be recovered, the third device sends the first recovery request to the second device. The first recovery request is used to request recovery of the target file at the target time.
210. The second device determines the target snapshot information from the snapshot information set according to the first recovery request.
In this embodiment, the second device determines the target snapshot information from the snapshot information set according to the first recovery request. The target snapshot information includes the index information of the target file. The snapshot time of the target snapshot information is earlier than or equal to the target time, and it is the snapshot time in the snapshot information set that has the smallest difference from the target time. The snapshot information set includes at least one piece of snapshot information.
For example, taking Table 2, when the target time is "15:47:25", the snapshot time of the target snapshot information is determined to be T2. The second device then determines the target snapshot information (T2).
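The snapshot selection in step 210 amounts to a predecessor lookup over the snapshot times. A minimal sketch follows; the function name `select_target_snapshot` and the third snapshot time "16:00:00" are assumptions, and times are again zero-padded "HH:MM:SS" strings on the same day.

```python
from bisect import bisect_right


def select_target_snapshot(snapshot_times, target):
    """Return the latest snapshot time that is earlier than or equal to `target`."""
    ordered = sorted(snapshot_times)
    pos = bisect_right(ordered, target)
    if pos == 0:
        raise LookupError("no snapshot taken at or before the target time")
    return ordered[pos - 1]


# T1 and T2 follow the example in the text; the third time is hypothetical.
chosen = select_target_snapshot(["14:00:00", "15:00:00", "16:00:00"], "15:47:25")
```

Because the candidate must not be later than the target time, "smallest difference" reduces to "largest snapshot time not exceeding the target", which is what `bisect_right` finds in a sorted list.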
211. The second device creates a target proxy volume according to the target snapshot information.
In this embodiment, the second device creates the target proxy volume according to the target snapshot information. Specifically, a linked clone (link clone) is made from the target snapshot information to obtain the target proxy volume (that is, a copy of the target snapshot information). The target proxy volume serves as the proxy volume for data access (or file IO).
212. The second device mounts the target proxy volume.
In this embodiment, after creating the target proxy volume, the second device mounts it. For example, the second device mounts the target proxy volume through a mount server.
213. The second device redoes the log file according to the target time and recovers the target file at the target time.
In this embodiment, the second device redoes the log file according to the target time and then recovers the target file at the target time from the storage of the second device. For example, the target file at the target time "15:47:25" is extracted from the storage of the second device.
Optionally, after the second device finishes redoing the log file, the procedure proceeds to step 214 (rolling the target proxy volume back to the target time).
Optionally, once the second device has started redoing the log file, the procedure proceeds to step 214; there is no need to wait for the second device to finish redoing the log file before rolling the data of the target proxy volume back to the target time. Steps 213 and 214 are processed asynchronously.
214. The second device rolls the data of the target proxy volume back to the target time according to the target time, and obtains the updated target proxy volume.
In this embodiment, the second device rolls the data of the target proxy volume back to the target time according to the target time. After the rollback to the target time, the updated target proxy volume is obtained. The index information indicated by the updated target proxy volume includes the index information of the target file at the target time.
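Steps 211 to 214 can be illustrated with a toy block model. The sketch below assumes that bringing the proxy volume to the target time means replaying the journaled writes recorded after the snapshot up to and including the target time; the text does not spell out the mechanism, and the function `roll_volume_to`, the block-number dictionary, and the journal tuple layout are all hypothetical.

```python
def roll_volume_to(snapshot_blocks, journal, target):
    """Replay journaled writes onto a copy of the snapshot, stopping after `target`.

    snapshot_blocks: dict of block number -> bytes, the snapshot contents
    journal:         list of (recorded_at, block_number, data)
    """
    volume = dict(snapshot_blocks)  # the copy stands in for the linked clone
    for recorded_at, block_no, data in sorted(journal, key=lambda rec: rec[0]):
        if recorded_at > target:
            break
        volume[block_no] = data
    return volume


snapshot_t2 = {0: b"base0", 1: b"base1"}
journal = [
    ("15:05:22", 1, b"v1"),
    ("15:30:00", 2, b"v2"),
    ("15:50:00", 1, b"too-late"),
]
proxy_volume = roll_volume_to(snapshot_t2, journal, "15:47:25")
```

Working on a copy leaves the snapshot itself untouched, which mirrors the role of the linked clone: the proxy volume can be brought to any target time without altering the stored snapshot.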
215. The second device sends the target file to the third device.
In this embodiment, after the second device finishes recovering the target file at the target time, the second device sends the target file to the third device. For example, the target file at the target time is sent to the third device (the user) as a file stream over the Hypertext Transfer Protocol (HTTP), so that the user can browse the target file immediately.
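Sending the target file "as a file stream" can be pictured as reading it in fixed-size chunks and handing each chunk to the HTTP response as it becomes available. The generator below is only a sketch of that chunking; the name `stream_file` and the 64 KiB default chunk size are assumptions, and the HTTP layer itself is left out.

```python
from typing import BinaryIO, Iterator


def stream_file(fileobj: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the contents of `fileobj` in chunks of at most `chunk_size` bytes."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

Streaming in chunks lets the receiver start displaying the file before the whole file has been read on the sending side.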
In the embodiments of this application, the target data at the target time can be browsed in real time without repeated trial point-in-time recovery (PITR). The user can view the target data content immediately, which helps ensure the accuracy of the data recovery.
With reference to the foregoing embodiments, the following describes, with the accompanying drawings, the application scenarios involved in the embodiments of this application. First, refer to Figure 3, which is a schematic diagram of block-level CDP. The first device (production host) can be divided by function into multiple units. User mode includes an application (APP) and an agent, where the agent includes an application control unit (app-control). Kernel mode includes a file system, a volume, a driver, and a disk, where the driver includes checkpoints, a volume, and a changed block tracking bitmap (CBT bitmap). The first device further includes storage.
The second device (backup apparatus) can be divided by function into multiple units, specifically: a continuous data protection appliance (CDP appliance), a snapshot data volume (data vol), data, and a journal. The CDP appliance includes a manager unit (manager), a snapshot unit (snapshot), and a data receiving unit (data receiver).
Specifically, the block-level CDP procedure is as follows. After the first device detects that a file IO event has occurred in the application (APP), the agent issues a control command (Control CMD) to the checkpoints in the driver. The driver then sends the file block data corresponding to the file IO event to the data receiver of the second device. On the second device side, the data is stored as IO logs and data so that PITR can be performed.
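The receiving side of the block-level CDP procedure can be sketched as follows. This is a toy model under assumptions: the class name `DataReceiver`, its fields, and the record layout are hypothetical and are meant only to show block writes being kept both as a time-ordered journal and as current data.

```python
from dataclasses import dataclass, field


@dataclass
class DataReceiver:
    """Toy stand-in for the data receiver: keeps current data and an IO journal."""
    data_volume: dict[int, bytes] = field(default_factory=dict)
    journal: list[tuple[str, int, bytes]] = field(default_factory=list)

    def receive(self, recorded_at: str, block_no: int, data: bytes) -> None:
        # Append to the journal first so every write stays replayable for PITR.
        self.journal.append((recorded_at, block_no, data))
        self.data_volume[block_no] = data


receiver = DataReceiver()
receiver.receive("15:05:22", 7, b"hello")
receiver.receive("15:06:00", 7, b"world")
```

Keeping every write in the journal, rather than only the latest block contents, is what allows the volume to be reconstructed at an arbitrary earlier time.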
On this basis, the application scenarios proposed in the embodiments of this application are shown in Figure 4 and Figure 5. Figure 4 is a schematic diagram of the collection scenario for the file index information involved in the embodiments of this application. Figure 5 is a schematic diagram of the data recovery scenario in the embodiments of this application.
Figure 4 is described first. In the collection stage of the file index information, the first device adds a collection unit (fs-catalog) to the user-mode agent. The collection unit (fs-catalog) can obtain file index information (indexing data) from the file system. The collection unit (fs-catalog) is able to obtain file index information in real time (for example, when a file IO event occurs, it obtains the file index information as it stands after the file IO event). The second device adds a synchronization unit (catalog engine) to the CDP appliance.
The specific collection procedure is as follows. After the collection unit (fs-catalog) collects the file index information (including the full file index and the incremental file index), it sends the file index information to the synchronization unit (catalog engine) of the second device. The synchronization unit (catalog engine) of the second device stores the file index information in the index store (indexing store).
Figure 5 is described next. In the data recovery stage, the second device adds a mount server and a proxy volume (delegate volume). The mount server is used to provide file content automatically and quickly when the user wants to view the content of a file immediately (that is, it mounts the target proxy volume). The proxy volume (delegate volume) is an automatically created linked clone volume (that is, the created target proxy volume).
The specific data recovery procedure is as follows. When the third device (user host) needs to recover the target data at the target time, the third device instructs the second device, through a web browser, to find the target file index information (for the target time). The synchronization unit (catalog engine) of the second device instructs the index store (indexing store) to redo the log file on demand (on-demand log redo). The second device determines the corresponding snapshot information (snapshot 2 (T2)) based on the file index information (for the target time). The proxy volume (delegate volume) extracts the target snapshot information (snapshot 2 (T2)) from the snapshot data volume (data vol) and then obtains the linked clone volume (link clone), which is also called the target proxy volume. The proxy volume (delegate volume) recovers the target data at the target time from the journal. Finally, the second device sends the target data to the third device.
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of methods. It can be understood that, to implement the foregoing functions, the above-mentioned data recovery apparatus includes hardware structures and/or software modules corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the modules and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such an implementation should not be considered to go beyond the scope of this application.
In the embodiments of this application, the data recovery apparatus may be divided into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is merely a division by logical function; other divisions are possible in an actual implementation.
The data recovery apparatus in this application is described below. Refer to Figure 6, which is a schematic diagram of an embodiment of the data recovery apparatus in the embodiments of this application. The data recovery apparatus includes:
a transceiver module 601, configured to obtain file index information, where the file index information includes the index information of the global files in the first device;
the transceiver module 601 is further configured to send the file index information to a second device, where the second device is configured to perform data backup;
a processing module 602, configured to update the file index information, where the updated file index information includes the updated index information of at least one file in the first device.
In a possible implementation,
the processing module 602 is further configured to monitor whether at least one file is added, deleted, or modified in the first device;
the processing module 602 is further configured to generate first sub-index information when at least one file is added, deleted, or modified in the first device, where the first sub-index information indicates the index information of the at least one file that was added, deleted, or modified in the first device, the first sub-index information further includes the operation type of the at least one file, and the operation type includes one or more of the following: add, delete, or modify;
the processing module 602 is further configured to update the file index information according to the first sub-index information, where the updated file index information includes the first sub-index information.
In a possible implementation,
the transceiver module 601 is further configured to send the updated file index information to the second device.
In yet another possible example:
the transceiver module 601 is configured to receive a first recovery request, where the first recovery request is used to request recovery of the target file at the target time;
the processing module 602 is configured to determine target snapshot information from a snapshot information set according to the first recovery request, where the target snapshot information includes the index information of the target file, the snapshot time of the target snapshot information is earlier than or equal to the target time, the snapshot time of the target snapshot information is the snapshot time in the snapshot information set that has the smallest difference from the target time, and the snapshot information set includes at least one piece of snapshot information;
the processing module 602 is further configured to create a target proxy volume according to the target snapshot information;
the processing module 602 is further configured to mount the target proxy volume, where the target proxy volume provides a file input/output (IO) service and is used to obtain the target file;
the processing module 602 is further configured to recover the target file according to the target proxy volume.
In a possible implementation,
the transceiver module 601 is further configured to receive a second recovery request, where the second recovery request is used to request recovery of the data at the target time;
the processing module 602 is further configured to determine target file index information according to the second recovery request, where the recording time of the target file index information is earlier than or equal to the target time, the recording time of the target file index information is the recording time in the file index information set that has the smallest difference from the target time, the file index information set includes at least one piece of file index information, and the file index information includes the index information of the global files in the first device;
the processing module 602 is further configured to recover the target file index information, where the target file index information is used to determine the target file.
In a possible implementation,
the processing module 602 is further configured to redo the log file according to the target time and recover the target file at the target time;
the processing module 602 is further configured to roll the data of the target proxy volume back to the target time according to the target time to obtain the updated target proxy volume, where the index information indicated by the updated target proxy volume includes the index information of the target file at the target time;
the processing module 602 is further configured to recover the target file according to the updated target proxy volume.
In a possible implementation,
the transceiver module 601 is further configured to obtain first snapshot information, where the first snapshot information is the snapshot information of the global files in the first device at a first moment.
In a possible implementation,
the transceiver module 601 is further configured to obtain second snapshot information, where the second snapshot information is the snapshot information of the global files in the first device at a second moment;
the processing module 602 is further configured to update the snapshot information set, where the updated snapshot information set includes the first snapshot information and the second snapshot information.
In a possible implementation,
the transceiver module 601 is further configured to receive file index information from the first device, where the file index information includes the index information of the global files in the first device;
the processing module 602 is further configured to store the file index information in association with the snapshot information set.
本申请还提供了一种芯片系统,该芯片系统包括处理器,用于支持上述终端设备实现其所涉及的功能,例如,例如接收或处理上述方法实施例中所涉及的数据。在一种可能的设计中,所述芯片系统还包括存储器,所述存储器,用于保存终端设备必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包含芯片和其他分立器件。This application also provides a chip system, which includes a processor and is used to support the above-mentioned terminal device to implement its related functions, for example, for example, receiving or processing the data involved in the above-mentioned method embodiments. In a possible design, the chip system further includes a memory, and the memory is used to store necessary program instructions and data for the terminal device. The chip system may be composed of chips, or may include chips and other discrete devices.
在本申请的另一实施例中,还提供一种计算机可读存储介质,计算机可读存储介质中存储有计算机执行指令,当设备的至少一个处理器执行该计算机执行指令时,设备执行上述图2至图5部分实施例所描述的方法。In another embodiment of the present application, a computer-readable storage medium is also provided. Computer-executable instructions are stored in the computer-readable storage medium. When at least one processor of the device executes the computer-executed instructions, the device executes the above figure. Methods described in some embodiments from 2 to 5.
在本申请的另一实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机执行指令,该计算机执行指令存储在计算机可读存储介质中;设备的至少一个处理器可以从计算机可读存储介质读取该计算机执行指令,至少一个处理器执行该计算机执行指令使得设备执行上述图2至图5部分实施例所描述的方法。In another embodiment of the present application, a computer program product is also provided. The computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium; at least one processor of the device can obtain data from a computer-readable storage medium. The storage medium is read to read the computer execution instructions, and at least one processor executes the computer execution instructions to cause the device to perform the methods described in some embodiments of FIGS. 2 to 5 above.
It should further be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided in this application, a connection between modules indicates that a communication connection exists between them, which may be implemented as one or more communication buses or signal lines.
From the above description of the implementations, those skilled in the art can clearly understand that this application may be implemented by software plus the necessary general-purpose hardware, or by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function performed by a computer program can readily be implemented with corresponding hardware, and the specific hardware structure used to implement the same function may take many forms, such as an analog circuit, a digital circuit, or a dedicated circuit. For this application, however, a software implementation is in most cases the preferred implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device to perform the methods described in the embodiments of this application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, memory-isolated apparatus, computing device, or data center to another website, computer, memory-isolated apparatus, computing device, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium in which a computer can store data, or a data storage device, such as a training device or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state disk (SSD)).
It should be understood that references throughout the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of this application. Therefore, occurrences of "in one embodiment" or "in an embodiment" throughout the specification do not necessarily refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should also be understood that, in the various embodiments of this application, the sequence numbers of the above processes do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of this application.
In addition, the terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
It should be understood that, in the embodiments of this application, "B corresponding to A" means that B is associated with A and that B can be determined according to A. It should also be understood that determining B according to A does not mean that B is determined according to A alone; B may also be determined according to A and/or other information.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are performed by hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through certain interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods in the embodiments of this application.
In summary, the above are merely preferred embodiments of the technical solution of this application and are not intended to limit the scope of protection of this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.

Claims (17)

  1. A data recovery method, wherein the method is applied to a first device and comprises:
    obtaining file index information, wherein the file index information comprises index information of global files in the first device;
    sending, by the first device, the file index information to a second device, wherein the second device is configured to perform data backup; and
    updating the file index information, wherein the updated file index information comprises updated index information of at least one file in the first device.
  2. The method according to claim 1, wherein the updating, by the first device, of the file index information comprises:
    monitoring whether at least one file is added, deleted, or modified in the first device;
    when at least one file is added, deleted, or modified in the first device, generating first sub-index information, wherein the first sub-index information indicates index information of the at least one file added, deleted, or modified in the first device, the first sub-index information further comprises an operation type of the at least one file, and the operation type comprises one or more of the following: addition, deletion, or modification; and
    updating the file index information according to the first sub-index information, wherein the updated file index information comprises the first sub-index information.
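
The monitoring and update steps of claim 2 can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the names `OpType`, `SubIndexEntry`, `FileIndex`, and `diff_states`, and the use of a path-to-modification-time mapping as the "index information", are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class OpType(Enum):
    """Operation types named in claim 2."""
    ADD = "add"
    DELETE = "delete"
    MODIFY = "modify"


@dataclass(frozen=True)
class SubIndexEntry:
    """Hypothetical sub-index record: which file changed and how."""
    path: str
    op: OpType
    mtime: Optional[float]  # None for deletions


@dataclass
class FileIndex:
    """Hypothetical file index: current entries plus the sub-index history."""
    entries: Dict[str, float] = field(default_factory=dict)
    change_log: List[SubIndexEntry] = field(default_factory=list)

    def apply(self, sub_index: List[SubIndexEntry]) -> None:
        # Fold each sub-index entry into the index and keep it as history,
        # so the updated index "comprises" the sub-index information.
        for entry in sub_index:
            if entry.op is OpType.DELETE:
                self.entries.pop(entry.path, None)
            else:
                self.entries[entry.path] = entry.mtime
            self.change_log.append(entry)


def diff_states(old: Dict[str, float], new: Dict[str, float]) -> List[SubIndexEntry]:
    """Compare two path -> mtime mappings and emit sub-index entries."""
    changes = []
    for path, mtime in new.items():
        if path not in old:
            changes.append(SubIndexEntry(path, OpType.ADD, mtime))
        elif old[path] != mtime:
            changes.append(SubIndexEntry(path, OpType.MODIFY, mtime))
    for path in old:
        if path not in new:
            changes.append(SubIndexEntry(path, OpType.DELETE, None))
    return changes
```

A real first device would more likely receive change notifications from the file system than compare two full listings; the comparison is used here only to keep the sketch self-contained.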
  3. The method according to claim 1 or 2, wherein after the file index information is sent to the second device, the method further comprises:
    sending the updated file index information to the second device.
  4. A data recovery method, wherein the method is applied to a second device and comprises:
    receiving a first recovery request, wherein the first recovery request is used to request recovery of a target file at a target time;
    determining target snapshot information from a snapshot information set according to the first recovery request, wherein the target snapshot information comprises index information of the target file, a snapshot time of the target snapshot information is earlier than or equal to the target time, the snapshot time of the target snapshot information is the snapshot time in the snapshot information set that has the smallest difference from the target time, and the snapshot information set comprises at least one piece of snapshot information;
    creating a target proxy volume according to the target snapshot information;
    mounting the target proxy volume, wherein the target proxy volume provides a file input/output (IO) service and is used to obtain the target file; and
    restoring the target file according to the target proxy volume.
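
The selection rule in claim 4 amounts to picking the latest snapshot taken at or before the target time. The sketch below illustrates that rule and the create, mount, and restore sequence with in-memory stand-ins. `SnapshotInfo`, `ProxyVolume`, `select_target_snapshot`, and `restore_file` are hypothetical names, and holding file contents in a dictionary is an assumption made only so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class SnapshotInfo:
    """Hypothetical snapshot record: when it was taken and what it holds."""
    snapshot_time: float
    files: Dict[str, bytes] = field(default_factory=dict)


def select_target_snapshot(
    snapshots: Iterable[SnapshotInfo], target_time: float
) -> Optional[SnapshotInfo]:
    """Return the snapshot closest to target_time among those not later than it."""
    candidates = [s for s in snapshots if s.snapshot_time <= target_time]
    return max(candidates, key=lambda s: s.snapshot_time, default=None)


class ProxyVolume:
    """In-memory stand-in for a proxy volume that serves file reads."""

    def __init__(self, snapshot: SnapshotInfo) -> None:
        self._snapshot = snapshot
        self.mounted = False

    def mount(self) -> None:
        self.mounted = True

    def read(self, path: str) -> bytes:
        if not self.mounted:
            raise RuntimeError("proxy volume is not mounted")
        return self._snapshot.files[path]


def restore_file(
    snapshots: Iterable[SnapshotInfo], target_time: float, path: str
) -> bytes:
    """Select a snapshot, create and mount a proxy volume, and read the file."""
    snapshot = select_target_snapshot(snapshots, target_time)
    if snapshot is None or path not in snapshot.files:
        raise LookupError(f"no snapshot at or before {target_time} contains {path}")
    volume = ProxyVolume(snapshot)
    volume.mount()
    return volume.read(path)
```

In this sketch a request for a time earlier than every snapshot fails with `LookupError`; the claims do not say how that case is handled, so this behavior is an assumption.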
  5. The method according to claim 4, wherein before the first recovery request is received, the method further comprises:
    receiving a second recovery request, wherein the second recovery request is used to request recovery of data at the target time;
    determining target file index information according to the second recovery request, wherein a recording time of the target file index information is earlier than or equal to the target time, the recording time of the target file index information is the recording time in a file index information set that has the smallest difference from the target time, the file index information set comprises at least one piece of file index information, and the file index information comprises index information of global files in a first device; and
    restoring the target file index information, wherein the target file index information is used to determine the target file.
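
When the recording times of the file index information set are kept in ascending order, the lookup in claim 5 can be done with a binary search. The following is a sketch under that assumption; `select_index_record` is a hypothetical helper, and the claim itself does not prescribe any particular search method.

```python
import bisect
from typing import Optional, Sequence


def select_index_record(record_times: Sequence[float], target_time: float) -> Optional[int]:
    """Return the position of the latest recording time <= target_time.

    Assumes record_times is sorted in ascending order. Returns None when
    every recording time is later than target_time.
    """
    pos = bisect.bisect_right(record_times, target_time)
    return pos - 1 if pos > 0 else None
```

`bisect_right` returns the insertion point after any entries equal to `target_time`, so a record taken exactly at the target time is still selected.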
  6. The method according to claim 4 or 5, wherein the restoring of the target file according to the target proxy volume comprises:
    redoing a log file according to the target time, to restore the target file at the target time;
    rolling back data of the target proxy volume to the target time according to the target time, to obtain an updated target proxy volume, wherein index information indicated by the updated target proxy volume comprises index information of the target file at the target time; and
    restoring the target file according to the updated target proxy volume.
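
One way to picture the log step in claim 6 is shown below. The claim does not define a log format, so this sketch assumes that the proxy volume starts from a snapshot taken at or before the target time and that each log entry records a timestamp, an operation, a path, and the new content. Under those assumptions, bringing the volume to the target time means replaying the entries logged after the snapshot and up to the target time. `LogEntry` and `replay_to_target` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log record for one file change."""
    timestamp: float
    op: str  # "add", "modify", or "delete"
    path: str
    content: Optional[bytes] = None


def replay_to_target(
    snapshot_files: Dict[str, bytes],
    snapshot_time: float,
    log: Iterable[LogEntry],
    target_time: float,
) -> Dict[str, bytes]:
    """Return the file state at target_time, starting from the snapshot state."""
    state = dict(snapshot_files)  # leave the snapshot itself untouched
    for entry in sorted(log, key=lambda e: e.timestamp):
        if entry.timestamp <= snapshot_time:
            continue  # already reflected in the snapshot
        if entry.timestamp > target_time:
            break  # later than the requested point in time
        if entry.op == "delete":
            state.pop(entry.path, None)
        else:
            state[entry.path] = entry.content
    return state
```

The returned mapping plays the role of the updated proxy volume: the target file is then read from it rather than from the raw snapshot.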
  7. The method according to any one of claims 4 to 6, wherein the method further comprises:
    obtaining first snapshot information, wherein the first snapshot information is snapshot information of global files in the first device at a first time.
  8. The method according to claim 7, wherein after the first snapshot information is obtained, the method further comprises:
    obtaining second snapshot information, wherein the second snapshot information is snapshot information of global files in the first device at a second time; and
    updating the snapshot information set, wherein the updated snapshot information set comprises the first snapshot information and the second snapshot information.
  9. The method according to claim 7 or 8, wherein the method further comprises:
    receiving file index information from the first device, wherein the file index information comprises index information of global files in the first device; and
    saving the file index information in association with the snapshot information set.
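
Claims 7 to 9 describe the second device accumulating snapshots and storing the received file index alongside them. A simple way to model "saved in association" is a single per-device catalog that holds both. The sketch below assumes that model; `BackupCatalog` and its fields are hypothetical, and the claims do not specify any storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BackupCatalog:
    """Hypothetical per-device record keeping snapshots and index versions together."""
    device_id: str
    snapshot_times: List[float] = field(default_factory=list)
    index_by_time: Dict[float, Dict[str, float]] = field(default_factory=dict)

    def add_snapshot(self, snapshot_time: float) -> None:
        # Updating the snapshot set keeps earlier snapshots and adds the new one.
        self.snapshot_times.append(snapshot_time)
        self.snapshot_times.sort()

    def save_index(self, record_time: float, file_index: Dict[str, float]) -> None:
        # Store a copy so later changes on the sender side do not alter history.
        self.index_by_time[record_time] = dict(file_index)
```

Keeping both collections under one device identifier lets a later recovery request look up the index version and the snapshot for the same target time in one place.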
  10. A data recovery system, wherein the system comprises:
    a first device, wherein the first device is configured to perform the method according to any one of claims 1 to 3; and
    a second device, wherein the second device is configured to perform the method according to any one of claims 4 to 9.
  11. A computing device cluster, comprising at least one computing device, wherein each computing device comprises a processor and a memory; and
    the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster performs the method according to any one of claims 1 to 3 or any one of claims 4 to 9.
  12. A computer program product containing instructions, wherein when the instructions are run by a computer device cluster, the computer device cluster is caused to perform the method according to any one of claims 1 to 9.
  13. A computer-readable storage medium, comprising computer program instructions, wherein when the computer program instructions are executed by a computing device cluster, the computing device cluster performs the method according to any one of claims 1 to 9.
  14. A computer device, used as a first device, comprising:
    a transceiver module, configured to perform the receiving and/or sending operations performed by the first device in the method according to any one of claims 1 to 3; and
    a processing module, configured to perform the operations, other than the receiving and/or sending operations, performed by the first device in the method according to any one of claims 1 to 3.
  15. A computer device, used as a second device, comprising:
    a transceiver module, configured to perform the receiving and/or sending operations performed by the second device in the method according to any one of claims 4 to 9; and
    a processing module, configured to perform the operations, other than the receiving and/or sending operations, performed by the second device in the method according to any one of claims 4 to 9.
  16. A computer device, used as a first device, comprising:
    a communication interface; and
    a processor connected to the communication interface, wherein the communication interface and the processor enable the first device to perform the method according to any one of claims 1 to 3.
  17. A computer device, used as a second device, comprising:
    a communication interface; and
    a processor connected to the communication interface, wherein the communication interface and the processor enable the second device to perform the method according to any one of claims 4 to 9.
PCT/CN2023/077176 2022-04-28 2023-02-20 Data recovery method and related apparatus WO2023207280A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210461500.X 2022-04-28
CN202210461500.XA CN117009133A (en) 2022-04-28 2022-04-28 Data recovery method and related device

Publications (1)

Publication Number Publication Date
WO2023207280A1 true WO2023207280A1 (en) 2023-11-02

Family

ID=88517224

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/077176 WO2023207280A1 (en) 2022-04-28 2023-02-20 Data recovery method and related apparatus

Country Status (2)

Country Link
CN (1) CN117009133A (en)
WO (1) WO2023207280A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101702158A (en) * 2009-10-28 2010-05-05 卓望数码技术(深圳)有限公司 Index file creation synchronized method and search system
US10042719B1 (en) * 2015-09-22 2018-08-07 EMC IP Holding Company LLC Optimizing application data backup in SMB
CN112328435A (en) * 2020-12-07 2021-02-05 武汉绿色网络信息服务有限责任公司 Method, device, equipment and storage medium for backing up and recovering target data
CN112380057A (en) * 2020-11-12 2021-02-19 平安科技(深圳)有限公司 Data recovery method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN117009133A (en) 2023-11-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23794716

Country of ref document: EP

Kind code of ref document: A1