CN116302683A - Data recovery method and computing device - Google Patents


Info

Publication number
CN116302683A
CN116302683A (application number CN202310085772.9A)
Authority
CN
China
Prior art keywords
data
pool
target data
storage
storage address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310085772.9A
Other languages
Chinese (zh)
Inventor
李舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202310085772.9A
Publication of CN116302683A
Legal status: Pending (current)


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/14 Error detection or correction of the data by redundancy in operation
    • G06F 11/1402 Saving, restoring, recovering or retrying
    • G06F 11/1446 Point-in-time backing up or restoration of persistent data
    • G06F 11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 Organizing or formatting or addressing of data
    • G06F 3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F 3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 In-line storage system
    • G06F 3/0673 Single storage device
    • G06F 3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiments of the present application provide a data recovery method and a computing device. The method includes: determining that target data has failed to be written to a first data pool; acquiring the target data from a second data pool; inserting the target data into the data stream to be written to the first data pool; reassigning a first storage address to the target data in the first data pool; and writing the target data into the first data pool according to the reassigned first storage address. The technical solution provided by the embodiments of the present application simplifies the data recovery flow, reduces system resource overhead, and improves storage performance.

Description

Data recovery method and computing device
Technical Field
The embodiments of the present application relate to the field of data processing technologies, and in particular, to a data recovery method and a computing device.
Background
To make better use of the available storage performance, a storage system that provides distributed storage generally adopts a tiered, pooled storage layout with two data pools. Data is written to both pools synchronously and in parallel: one pool can be built on a low-latency storage medium to reduce write latency and improve write performance, while the other pool is where the data is ultimately and actually stored.
Under this parallel synchronous write mechanism, when writing to one data pool fails, for example when the write to the pool that actually stores the data fails, data recovery can be performed based on the data written to the other data pool.
In existing data recovery, the data read from the other data pool is usually written back to the failed location with an in-place retry, i.e., an overwrite. However, the storage medium used by the data pool may not support overwriting, which causes the recovery to fail. Alternatively, all of the data in the affected storage area can be read, modified based on the data from the other data pool, and written back to the storage area as a whole; this process, however, is very complex and costly, and it degrades storage performance.
Disclosure of Invention
The embodiments of the present application provide a data recovery method and a computing device, which are used to solve the technical problems in the prior art of high system resource overhead and degraded storage performance.
In a first aspect, an embodiment of the present application provides a data recovery method, including:
determining that the writing of the target data in the first data pool fails;
acquiring the target data from a second data pool;
reassigning a first storage address to the target data in the first data pool;
and writing the target data into the first data pool according to the reassigned first storage address.
In a second aspect, embodiments of the present application provide a computing device comprising a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions are operable to be invoked by the processing component to perform the data recovery method as described in the first aspect above.
In a third aspect, embodiments of the present application provide a computer storage medium storing a computer program which, when executed by a computer, implements the data recovery method according to the first aspect described above.
In the embodiments of the present application, when a write to the first data pool fails, the target data can be read from the second data pool, a first storage address can then be reassigned to the target data in the first data pool, and the target data can be written into the first data pool according to the reassigned first storage address. Because the target data is rewritten to a newly assigned first storage address in the first data pool, the data recovery flow is simplified, system resource overhead is reduced, and storage performance can be improved.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that other drawings can be derived from them by a person skilled in the art without inventive effort.
Fig. 1 shows a schematic system structure to which the technical solution of the embodiment of the present application is applicable;
FIG. 2 illustrates a flow chart of one embodiment of a data recovery method provided herein;
FIG. 3 is a schematic diagram of a data recovery operation in a practical application according to an embodiment of the present application;
FIG. 4 is a schematic diagram showing address setting in a practical application according to an embodiment of the present application;
FIG. 5 is a flow chart illustrating yet another embodiment of a data recovery method provided herein;
FIG. 6 is a schematic diagram illustrating the structure of one embodiment of a data recovery device provided herein;
FIG. 7 illustrates a schematic diagram of one embodiment of a computing device provided herein.
Detailed Description
In order to enable those skilled in the art to better understand the present application, the following description will make clear and complete descriptions of the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application.
Some of the flows described in the specification, claims, and drawings of this application include operations that occur in a particular order. It should be understood, however, that these operations may be performed out of the order in which they appear, or in parallel; sequence numbers such as 101 and 102 merely distinguish the operations and do not by themselves impose any order of execution. In addition, the flows may include more or fewer operations, and these operations may be performed sequentially or in parallel. It should also be noted that the terms "first" and "second" herein are used to distinguish different messages, devices, modules, and the like; they do not denote a sequence, nor do they require that the "first" and the "second" be of different types.
The technical solutions of the embodiments of the present application can be applied to a storage system that provides distributed storage. Distributed storage is widely used, particularly by cloud service providers, because of advantages such as a horizontally scalable architecture and high data availability.
The IO (Input/Output) path of traditional distributed storage is long and its latency is noticeable. With the progress of storage media over the past several years, new storage media such as NAND flash memory (a storage device that outperforms hard disk drives) and phase change memory have been adopted for storage services at scale. Because these new storage media offer low read/write latency and high throughput, a write cache can be inserted into the data storage path and its usage optimized: writes are appended, and the sequential writing extracts more of the media's performance, which gives rise to design architectures such as tiered storage. Meanwhile, to reduce cost, increase deployment flexibility, and improve the performance stability of the storage system, a tiered, pooled storage mode is provided: two independent data pools are set up and data is written to both pools synchronously and in parallel, which can reduce transmission bandwidth overhead. Assume the two data pools are a first data pool and a second data pool, where the first data pool actually stores the data and the second data pool uses a storage medium with lower write latency. Both can meet the requirements of distributed storage. Compared with a traditional write cache design, the second data pool can have distributed, global attributes and can guarantee persistent storage of data to avoid single points of failure. In addition, the second data pool decouples throughput from capacity, and high-performance media can provide high throughput.
In the course of implementing the present application, the inventor found that, because data is written to the first data pool and the second data pool synchronously and in parallel, the data in the second data pool can be deleted once the write to the first data pool succeeds, freeing the storage space of the second data pool for subsequent writes. Under this parallel synchronous write mechanism, when a write to the first data pool fails, the write to the second data pool has very likely already succeeded because of its lower write latency. When the first data pool is to be recovered, the corresponding data can be located and read from the second data pool. If an overwrite is attempted, the storage medium used by the data pool, such as the new storage media described above, may not support overwriting, and the data recovery fails. Therefore, the usual approach is to read all of the data in the storage area where the data resides, modify it based on the data from the other data pool, and write it back to the storage area as a whole; this process is very complex and costly, and it degrades storage performance.
To solve the technical problems that the data recovery process is complex and costly and therefore degrades storage performance, the inventor arrived at the technical solution of the present application through a series of studies.
To facilitate understanding of the technical solutions of the present application, the technical terms that may be involved are first explained below:
Append writing: a data writing mode in which newly written data is added after the data that has already been written.
Overwriting: a data writing mode in which the currently written data covers data that has already been written.
Sequential writing: a data writing mode in which the positions of successive write operations are contiguous; append writing is sequential writing when its write positions are contiguous (both writing modes are illustrated by the sketch following these term definitions).
Metadata: data describing the attributes of data, such as its storage address, composition, write time, and modification time.
Storage medium: the hardware device in a storage system that stores data; the data ultimately has to be written to such a storage device, which may be, for example, a magnetic disk.
Distributed storage: data is stored dispersedly on multiple independent storage nodes. A storage system providing distributed storage adopts a scalable architecture and uses multiple storage nodes to share the storage load, which improves the reliability, availability, and access efficiency of the system and makes it easy to expand.
Data pool: a distributed data storage system, composed of multiple storage media, that is capable of persistent storage. Two data pools are involved in this embodiment. For example, the first data pool may actually store the data; because the data is ultimately kept there, it may also be called the flush pool. The second data pool uses a storage medium with lower write latency; to improve write performance, data is continuously written into it in append mode, and because this matches the way log data is written, it is also called the log pool. The first data pool can have characteristics such as a large storage capacity, full-lifecycle storage of data, and random reads of data, and it meets the distributed storage requirements on performance, stability, reliability, availability, and the like.
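For illustration only, the following minimal Python sketch contrasts append writing with overwriting; the class and method names are assumptions of the sketch, not part of the claimed scheme.

    # Minimal illustration of the append-write vs. overwrite distinction defined above.
    # All names here are hypothetical.

    class AppendOnlyLog:
        """Every write lands after the last written position (sequential, append writing)."""

        def __init__(self):
            self.blocks = []              # written data blocks, in write order

        def append(self, data: bytes) -> int:
            self.blocks.append(data)      # new data never disturbs already written data
            return len(self.blocks) - 1   # position (address) of the newly written block


    class OverwritableStore:
        """A write may land on an already written position and replace its content."""

        def __init__(self, size: int):
            self.blocks = [b""] * size

        def overwrite(self, position: int, data: bytes) -> None:
            self.blocks[position] = data  # the previous content at `position` is lost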
In the embodiments of the present application, when a write to the first data pool fails, the target data can be read from the second data pool, then inserted into the data stream to be written to the first data pool, and a first storage address is reassigned to it in the first data pool; the target data is then written into the first data pool according to the reassigned first storage address. Because the target data is rewritten, as new data, to a newly assigned first storage address in the first data pool, the data recovery flow is simplified, system resource overhead is reduced, and storage performance can be improved.
The following clearly and completely describes the technical solutions in the embodiments of the present application with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without inventive effort shall fall within the protection scope of the present application.
Fig. 1 shows a system architecture diagram to which the technical solutions of the embodiments of the present application can be applied. The system architecture may include a storage system 100, which may take the form of a cluster and is provided with a first data pool 101 and a second data pool 102; in addition, the storage system 100 may further include a storage engine 103 responsible for operations such as data management and control.
The system architecture may further include a data accessor 104. The data accessor 104 may be a front-end application, with the storage system acting as a back-end storage cluster that provides data access capabilities and the like.
The first data pool 101 and the second data pool 102 may each adopt a distributed storage architecture composed of multiple storage nodes, and each storage node may be implemented as a storage medium or the like.
The second data pool 102 may use a high-speed storage medium with lower write latency. Because the first data pool 101 and the second data pool 102 use a parallel synchronous write mechanism, the second data pool 102 will in most cases complete its write first. Once the first data pool has also been written successfully, the data in the second data pool may be deleted to free its storage space for subsequent data writes.
When the data accessor acts as a front-end application, it may be a browser, an APP (Application), a web application such as an H5 (HyperText Markup Language 5) application, a light application (also called an applet, a lightweight application), a cloud application, or the like. It may be deployed in an electronic device and may need to run depending on the device or on certain APPs in the device. The electronic device may, for example, have a display screen and support information browsing, and may be a personal mobile terminal such as a mobile phone, a tablet, a personal computer, a desktop computer, a smartphone, or a smart watch.
In a practical application, the technical solutions of the embodiments of the present application may be applied to a cloud computing scenario, in which the first data pool and the second data pool may provide cloud storage and the storage engine may be implemented as, for example, a cloud server.
It should be noted that the embodiments of the present application may involve the use of user data. In practical applications, user-specific personal data may be used in the solutions described herein only within the scope permitted by the applicable laws and regulations of the relevant country (for example, with the user's explicit consent, after the user has actually been notified, and so on).
Implementation details of the technical solutions of the embodiments of the present application are set forth in detail below.
Fig. 2 is a flowchart of an embodiment of a data recovery method provided in the embodiment of the present application, where the technical solution of the present embodiment may be executed by a storage engine in the system architecture, and the method may include the following steps:
201: determining that the target data fails to be written in the first data pool.
The target data is written to the first data pool and the second data pool synchronously and in parallel. A write failure will be reported as an error, so it can be determined whether writing the target data to the first data pool has failed.
The first data pool and the second data pool may both be written sequentially in append mode, and the target data may be any data whose writing is requested by a data access request.
202: target data is read from the second data pool.
When a write to the first data pool fails, the second data pool has, with high probability, already been written successfully. In the embodiments of the present application, the target data may therefore be acquired from the second data pool when the write to the first data pool fails.
203: reassigning a first storage address in the first data pool for the target data;
204: and writing the target data into the first data pool according to the reassigned first storage address.
To write the target data into the first data pool, in the embodiments of the present application a first storage address may be reassigned in the first data pool instead of reusing the old address at which the write failed. Compared with the traditional approach of retrying the write at the same storage location, re-queuing the target data for writing decouples data writing from fault handling, simplifies the management and control flow, normalizes traffic allocation, overcomes congestion through predictable bandwidth resource allocation, and guarantees QoS (Quality of Service) and other performance properties.
Since the first data pool also receives other data whose writing is requested by the data access party, the target data can be added to the data stream to be written as a new data block, so that the target data and the other data can be written into the first data pool sequentially based on the insertion position. Each piece of data in the data stream is assigned a storage address in the first data pool before being written; the position of the target data in the stream, determined by its insertion position, serves as the address allocation order and the data write order. A first storage address can therefore be allocated to each piece of data in the stream in this order and the data written into the first data pool, which achieves the goal of writing the target data into the first data pool. A piece of data may be written to the first data pool as soon as its first storage address has been allocated, or first storage addresses may be allocated to several pieces of data in the stream and the pieces, the target data among them, then written in their respective order. The first storage addresses of different pieces of data in the stream can be arranged according to the ordering of the stream, which keeps the write positions contiguous and realizes sequential writing. In summary, in some embodiments, the method may further include: inserting the target data into the data stream to be written to the first data pool.
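As a purely illustrative, non-authoritative sketch of this mechanism (the class name FirstPoolWriter, the deque-based pending stream and the byte-length address arithmetic below are assumptions made for illustration, not the patented implementation), the re-insertion and sequential address allocation could look roughly as follows in Python:

    from collections import deque

    class FirstPoolWriter:
        """Hypothetical sketch: recovered data re-enters the pending write stream
        as a new block and receives a freshly allocated first storage address, so
        it is flushed sequentially together with the other pending writes."""

        def __init__(self):
            self.pending = deque()       # data stream waiting to be written to the first pool
            self.next_address = 0        # next free sequential (append-only) address
            self.address_map = {}        # data_id -> reassigned first storage address

        def enqueue(self, data_id: str, payload: bytes) -> None:
            # Newly requested writes and recovered data enter the same stream,
            # ordered by the time they reach the storage engine.
            self.pending.append((data_id, payload))

        def recover(self, data_id: str, payload_from_second_pool: bytes) -> None:
            # The failed block is treated as brand-new data: no in-place retry.
            self.enqueue(data_id, payload_from_second_pool)

        def flush(self, first_pool: dict) -> None:
            # Addresses are allocated in queue order, so write positions stay contiguous.
            while self.pending:
                data_id, payload = self.pending.popleft()
                address = self.next_address
                self.next_address += len(payload)
                first_pool[address] = payload        # append-only write
                self.address_map[data_id] = address  # reassigned first storage address

    # Illustrative usage: data C failed in the first pool and is recovered from the
    # copy held by the second pool, re-entering the stream behind later writes.
    writer = FirstPoolWriter()
    writer.enqueue("A", b"aaaa")
    writer.enqueue("B", b"bbbb")
    writer.recover("C", b"cccc")
    first_pool = {}
    writer.flush(first_pool)     # A, B and the recovered C are written sequentially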
Optionally, reassigning a first storage address to the target data in the first data pool may be: acquiring the target data from the data stream according to its insertion position, and reassigning the first storage address to the target data in the first data pool. The target data can be acquired from the data stream in the order corresponding to its insertion position in the stream.
Optionally, writing the target data into the first data pool according to the reassigned first storage address may be: writing the target data into the first data pool, according to the reassigned first storage address, in the order corresponding to its insertion position in the data stream.
Since data is written into both the first data pool and the second data pool, it is allocated a storage address in the first data pool and a storage address in the second data pool, and the target data rewritten to the first data pool is also reassigned a storage address. Therefore, in some embodiments, after the target data has been written successfully in the first data pool, the first storage address that corresponded to the target data when its write to the first data pool failed may be deleted.
In some embodiments, inserting the target data into the data stream to be written to the first data pool may be: inserting the target data into the data stream to be written to the first data pool in the order of its reception time.
The reception time may refer to the time at which the write was requested, the time at which the data entered the storage engine, or the like.
To facilitate understanding of the data recovery flow in the embodiments of the present application, in the data recovery operation diagram shown in Fig. 3, the storage engine 103 may receive requests to write multiple pieces of data, which may be arranged in order of reception time to form the data stream to be written to the first data pool 101. Assume the target data is data B in Fig. 3. For ease of distinction, the previously written data B is denoted B1 and the rewritten data B is denoted B2, with B1 = B2 = B. The data already written in sequence has the form A B1 C D. When the write of B1 to the first data pool fails, data B2 can be obtained from the second data pool 102; B2 enters the storage engine, is treated as new data, and is inserted into the data stream. Assuming the insertion position is determined by reception time and B2 falls after data E and before data F, B2 can be written into the first data pool according to this insertion position, and the corresponding written form is E B2 F G.
In some embodiments, reading the target data from the second data pool may include:
determining a first storage address corresponding to the target data in the first data pool;
determining a second storage address, in the second data pool, corresponding to the first storage address;
the target data is read from the second data pool based on the second storage address.
The first storage address and the second storage address may have an address mapping relationship, so that the second storage address corresponding to the first storage address may be determined by searching the address mapping relationship.
The first storage address corresponding to the target data in the first data pool may refer to a first storage address allocated by a last allocation operation before the reallocation operation.
In addition, since the storage address is stored in the metadata, the metadata of the first data pool and the metadata of the second data pool may have a corresponding address mapping relationship based on the data identification, and thus the second storage address corresponding to the first storage address may also be determined by searching the metadata.
In addition, the first data pool and the second data pool can also uniformly manage metadata, and the metadata comprises a first storage address in the first data pool and a second storage address in the second data pool, so that the second storage address corresponding to the first storage address can be rapidly determined by searching the metadata.
In addition, to improve address determination efficiency, an address format of a second storage address corresponding to the second data pool may include a node identification, a start address, and an address offset; thus, in some embodiments, based on the second storage address, reading the target data from the second data pool may include:
determining the storage node corresponding to the node identifier based on the second storage address;
and reading target data from the storage position corresponding to the starting address and the address offset.
The corresponding storage node can be located according to the node identification, and the corresponding storage location can be determined according to the starting address and the address offset, so that the target data can be read from the storage location.
The start address may refer to the start address of the data segment, in the storage node, into which the target data was requested to be written; the offset address is the offset of the target data relative to the start address, so the storage location of the target data is the start address plus the offset address. As shown in the address format diagram of Fig. 4, the second storage address may include a node identifier, a start address, and an offset address, and the first storage address and the second storage address may establish a fast mapping relationship based on the metadata at each location.
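The following minimal sketch shows one possible representation of such an address and how the target data could be located with it; the dataclass fields, the dictionary-based node lookup and the address mapping shown here are assumptions made only for illustration, not the claimed implementation.

    from dataclasses import dataclass

    @dataclass
    class SecondPoolAddress:
        node_id: str      # identifies the storage node in the second data pool
        start: int        # start address of the data segment on that node
        offset: int       # offset of the target data within the segment

    def read_from_second_pool(nodes: dict, addr: SecondPoolAddress, length: int) -> bytes:
        """Locate the node by its identifier, then read at start + offset."""
        node = nodes[addr.node_id]                # node_id -> in-memory byte buffer (illustrative)
        position = addr.start + addr.offset       # storage location of the target data
        return node[position:position + length]

    # Illustrative usage with a mapping from first-pool addresses to second-pool addresses.
    address_mapping = {}                          # first storage address -> SecondPoolAddress
    nodes = {"node-1": b"....target data...."}
    address_mapping[4096] = SecondPoolAddress("node-1", start=4, offset=0)
    data = read_from_second_pool(nodes, address_mapping[4096], length=11)   # b"target data"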
As can be seen from the above description, the first storage address and the second storage address may correspond to an address mapping relationship, and thus, in some embodiments, the method may further include:
the address mapping relationship is updated based on the reassigned first storage address.
Furthermore, in some embodiments, the method may further comprise:
after the target data has been written successfully in the first data pool, deleting from the first data pool the first storage address that corresponded to the target data when its write to the first data pool failed.
Optionally, the deleted first storage address may also be reclaimed. Specifically, after the target data has been written successfully in the first data pool, the reclaimed first storage address can continue to be used to store data, and so on.
Furthermore, to facilitate address reclamation, in some embodiments, the method may further comprise:
setting an invalid mark for a first storage address corresponding to the target data when the writing of the target data in the first data pool fails;
the first memory address where the invalid flag is set may be reclaimed after the target data is written successfully in the first data pool.
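A minimal sketch of this mark-then-reclaim handling, assuming a simple in-memory address manager (the class and method names are illustrative, not part of the claimed scheme), is given below:

    class FirstPoolAddressManager:
        """Hypothetical sketch: the old address of a failed write is only marked
        invalid; it is reclaimed (made reusable) after the rewritten data has been
        stored successfully at its newly allocated address."""

        def __init__(self):
            self.invalid = set()     # addresses marked invalid, not yet reclaimed
            self.free = []           # addresses reclaimed and available for reuse

        def mark_invalid(self, old_address: int) -> None:
            # Called when the write of the target data at old_address fails.
            self.invalid.add(old_address)

        def reclaim_after_success(self, old_address: int) -> None:
            # Called once the target data has been written successfully
            # at its reassigned address in the first data pool.
            if old_address in self.invalid:
                self.invalid.remove(old_address)
                self.free.append(old_address)   # the space can now store new data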
In combination with the data recovery operation diagram shown in Fig. 3, after B2 has been written successfully, the storage address corresponding to B1 can be deleted, and the deleted address can be reclaimed, and so on.
In some embodiments, before it is determined that the target data has failed to be written to the first data pool, the method may further comprise:
for target data to be written, allocating a first storage address to the target data in the first data pool;
judging whether the target data originates from the second data pool;
if yes, writing the target data into the first data pool according to the first storage address;
if not, allocating a second storage address to the target data in the second data pool, and writing the target data into the first data pool and the second data pool based on the first storage address and the second storage address respectively.
That is, the target data to be written may originate from the second data pool: the data whose recovery is requested may be data that a user requested to write or data recovered through other channels. Therefore, before the write operation is performed, it can first be judged whether the target data originates from the second data pool. If it does, the target data does not need to be rewritten to the second data pool and only needs to be written into the first data pool; if it does not, both the first data pool and the second data pool need to be written.
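By way of illustration only, the following sketch outlines this pre-write decision; the TargetData fields, the allocator and the pool write methods are assumptions of the sketch rather than the claimed implementation.

    from dataclasses import dataclass

    @dataclass
    class TargetData:
        data_id: str
        payload: bytes
        origin_second_pool: bool    # e.g. carried as a specific flag on recovered data

    def write_target_data(target: TargetData, first_pool, second_pool, allocator) -> None:
        """Sketch of the pre-write decision: data that came back from the second
        data pool is only rewritten to the first pool; other data goes to both."""
        first_addr = allocator.alloc_first(target)        # first storage address
        if target.origin_second_pool:
            # Already persisted in the second pool; no need to rewrite it there.
            first_pool.write(first_addr, target.payload)
        else:
            second_addr = allocator.alloc_second(target)  # second storage address
            # In the described scheme these writes are issued synchronously in parallel;
            # they are shown one after the other here only for simplicity.
            first_pool.write(first_addr, target.payload)
            second_pool.write(second_addr, target.payload)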
In the above, when the target data is written into the first data pool according to the reassigned first storage address, it may also first be judged whether the target data originates from the second data pool; if so, the target data is written into the first data pool according to the reassigned first storage address.
Since the target data may also have failed to be written to the second data pool, or may already have been deleted from it, the target data can, for recovery purposes, be obtained from upstream or downstream backup data, which includes backup data held by the data access party. Therefore, in some embodiments, the method may further include: if acquiring the target data from the second data pool fails, acquiring the target data from the data access party; and writing the target data into the first data pool.
In addition, if it is determined that the write to the second data pool has failed, the target data acquired from the data access party can be written to the first data pool and the second data pool synchronously and in parallel.
In some embodiments, the method may further comprise:
and if the target data fails to be acquired from the second data pool, outputting fault prompt information.
The fault notification may be used to indicate that the second data pool is faulty, etc.
The fault prompt information may be displayed or sent to a communication account, where the communication account may be a mailbox account or an instant messaging account, for example.
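A hedged sketch of this recovery-source selection, including the fault prompt on a second-pool failure, might look as follows; the notifier and data_accessor interfaces are assumptions made only for illustration.

    def recover_target_data(data_id, second_pool, data_accessor, notifier):
        """Sketch of recovery-source selection: prefer the copy in the second data
        pool; if that read fails, report the fault and fall back to backup data
        held by the data access party."""
        try:
            return second_pool.read(data_id)
        except Exception:
            # Acquiring the data from the second pool failed: output fault prompt
            # information (for example to a mailbox or instant-messaging account).
            notifier.send(f"second data pool fault while recovering {data_id}")
            return data_accessor.read_backup(data_id)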
In addition, to facilitate data access before the target data has been safely stored in the first data pool, a read request for the target data can be routed according to whether the write to the first data pool has succeeded. Therefore, in some embodiments, the method may further comprise:
receiving a read request for the target data; judging whether the target data has been written successfully in the first data pool; if yes, accessing the first data pool to read the target data; if not, accessing the second data pool to read the target data.
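Illustratively, and assuming hypothetical pool interfaces that expose write_succeeded and read methods, this read routing could be sketched as:

    def read_target_data(data_id, first_pool, second_pool):
        """Sketch of read routing before the data is safely stored in the first
        pool: serve the read from the first pool only once its write succeeded,
        otherwise fall back to the second data pool."""
        if first_pool.write_succeeded(data_id):
            return first_pool.read(data_id)
        return second_pool.read(data_id)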
In some embodiments, to improve data security, inserting the target data into the data stream to be written to the first data pool may include: performing data verification on the target data, and inserting the target data into the data stream to be written to the first data pool after the verification passes. The data verification may adopt a method such as a CRC check, which is not limited in this application.
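As a non-authoritative example of such a check, the sketch below uses the CRC32 function from Python's standard zlib module; the function name and the writer interface are assumptions of this sketch.

    import zlib

    def verify_and_enqueue(payload: bytes, expected_crc: int, writer) -> bool:
        """Sketch of a CRC check before recovered data re-enters the pending
        write stream of the first data pool."""
        if zlib.crc32(payload) != expected_crc:
            return False                               # verification failed: do not insert
        writer.enqueue("recovered-block", payload)     # insert into the data stream to be written
        return True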
Fig. 5 is a flowchart of another embodiment of a data recovery method according to an embodiment of the present application, where the method may include the following steps:
501: and allocating a first storage address for the target data in the first data pool aiming at the target data to be written.
502: judging whether the target data is from a second data pool; if yes, go to step 503, if no, go to step 504.
If the target data originates from the second data pool, it is rewritten data; in this case the first data pool can be written according to the first storage address. The target data obtained from the second data pool can be inserted into the data stream to be written, and a storage address can be allocated to each piece of data in turn according to its order in the stream, so as to guarantee sequential writing. The order may be determined according to the sequence of reception times or the like, where the reception time may refer to the time at which the respective piece of data entered the storage engine.
Here, the data acquired from the second data pool may carry a specific flag, so that whether the target data originates from the second data pool can be judged based on that flag.
If the target data does not originate from the second data pool, it may be data that the data access party requested to write or target data obtained from upstream or downstream backup data. The upstream or downstream backup data may include, for example, backup data held by the data access party.
The data from the data access party, the backup data, and the data from the second data pool can be arranged in order of reception time to form the data stream to be written, and a storage address can be allocated to each piece of data according to its position in the stream.
503: the second data pool is written according to the first memory address.
504: and allocating a second storage address for the target data in the second data pool, and writing the target data into the first data pool and the second data pool respectively based on the first storage address and the second storage address.
The target data can be synchronously written into the first data pool and the second data pool in parallel. The target data can be written into the first data pool and the second data pool sequentially by adopting an additional writing mode.
505: judging whether the target data is successfully written in the first data pool, if so, ending the flow, otherwise, executing step 506:
506: an invalid flag is set for the first memory address.
507: and determining a second storage address corresponding to the first storage address, and acquiring target data from the second data pool based on the second storage address.
The second storage address can adopt a concise address format of node identification, a starting address and an offset address so as to quickly search target data, thereby simplifying an address searching mode.
The first memory address and the second memory address have an address mapping relationship, and after the target data originates from the second data pool and the first memory address is reassigned to the target data, the address mapping relationship may be updated based on the reassigned first memory address.
508: if the second data pool is successful, step 509 is executed, and if not, step 511 is executed.
509: and inserting the target data into the data stream to be written in the first data pool.
510: according to the insertion position, the target data requested to be written is determined, and the execution is continued by returning to step 501.
The target data may be obtained from the data stream in accordance with a corresponding arrangement of insertion positions in the data stream.
511: target data is obtained from the upstream and downstream backup data, and the process returns to step 501 to continue execution.
If the second data pool fails to acquire, fault prompt information can be output.
512: and after the target data is written successfully, recovering the corresponding first storage address based on the invalid mark.
In addition, if a read request for the target data is received before the target data has been safely stored in the first data pool, the first data pool is accessed to read the target data when the target data has been written successfully in the first data pool; if it has not been written successfully, the second data pool is accessed to read the target data, i.e., the request is directed to the second data pool to realize data access.
In the embodiments of the present application, the newly added data is injected into the data stream to be written, which avoids flow switching and any impact on ongoing operations, and the rewrite into the first data pool can be completed through address mapping and subsequent space reclamation. The management and scheduling of write priorities in the first data pool is simplified, IO characteristics that are friendlier to storage medium access are constructed, data is flushed to the storage medium sequentially, and the performance advantage of the storage medium for sequential operations is exploited. Both the first data pool and the second data pool can record data by append writing, and the second data pool is tightly coupled with the first data pool, which simplifies the data query process and allows data to be located quickly. Through the embodiments of the present application, system resource overhead can be reduced, storage read/write performance can be improved, and the stability of storage QoS can be guaranteed.
Fig. 6 is a schematic structural diagram of an embodiment of a data recovery device according to an embodiment of the present application, where the device may include:
a determining module 601, configured to determine that writing of the target data in the first data pool fails;
a data acquisition module 602, configured to acquire target data from the second data pool;
an address allocation module 603, configured to reallocate a first storage address in the first data pool for the target data;
The data recovery module 604 is configured to write the target data into the first data pool according to the reassigned first storage address.
In some embodiments, the data acquisition module is specifically configured to determine a first storage address corresponding to the target data in the first data pool; determining a second storage address corresponding to the first storage address; the target data is read from the second data pool based on the second storage address.
In some embodiments, the second storage address includes a node identifier, a start address, and an offset address, and the data acquisition module is specifically configured to determine, based on the second storage address, the storage node corresponding to the node identifier, and to read the target data from the storage location corresponding to the start address and the offset address.
In some embodiments, the apparatus may further comprise:
the data implantation module is used for inserting the target data into the data stream to be written in the first data pool;
the data recovery module specifically writes the target data into the first data pool according to the reassigned first storage address based on the corresponding arrangement sequence of the insertion positions in the data stream.
In some embodiments, the apparatus may further comprise:
a deleting module, configured to delete, after the target data has been written successfully in the first data pool, the first storage address that corresponded to the target data when its write to the first data pool failed.
In some embodiments, the apparatus may further comprise:
the recovery module is used for setting an invalid mark for a first storage address corresponding to the target data when the writing of the target data in the first data pool fails; and after the target data is successfully written in the first data pool, recovering the corresponding first storage address based on the invalid mark.
In some embodiments, the determining module may specifically find an address mapping relationship, and determine a second storage address corresponding to the first storage address;
the address allocation module is further configured to update the address mapping relationship based on the reassigned first storage address after reassigning the first storage address in the first data pool for the target data.
In some embodiments, the apparatus may further comprise:
the data writing module is used for distributing a first storage address for target data in a first data pool aiming at the target data to be written; judging whether the target data is from a second data pool; if yes, writing the first data pool into the second data pool according to the first storage address; if not, a second storage address is allocated for the target data in the second data pool, and the target data is written into the first data pool and the second data pool respectively based on the first storage address and the second storage address.
In some embodiments, the data recovery module is further configured to acquire the target data from the data access party if the target data fails to be acquired from the second data; the target data is written to the first data pool.
In some embodiments, the data implantation module may specifically insert the target data into the data stream to be written in the first data pool according to the receiving time sequence.
In some embodiments, the apparatus may further comprise:
the data access module is used for receiving a read request aiming at target data; judging whether the target data is successfully written in the first data pool or not; if yes, accessing the first data pool to read target data; if not, accessing the second data pool to read the target data.
The data recovery device shown in Fig. 6 may perform the data recovery method described in the embodiment shown in Fig. 2; its implementation principles and technical effects are similar and are not repeated here. The specific manner in which the modules and units of the data recovery device in the above embodiments perform operations has been described in detail in the embodiments related to the method and will not be detailed again here.
Embodiments of the present application also provide a computing device, as shown in fig. 7, that may include a storage component 701 and a processing component 702;
The storage component 701 stores one or more computer instructions for execution by the processing component 702 to implement the data recovery method as described in any of the embodiments above.
Of course, the computing device may also include other components, such as input/output interfaces, display components, and communication components. The input/output interface provides an interface between the processing component and peripheral interface modules, which may be output devices, input devices, and the like. The communication component is configured to facilitate wired or wireless communication between the computing device and other devices, and the like.
Wherein the processing component may include one or more processors to execute computer instructions to perform all or part of the steps of the methods described above. Of course, the processing component may also be implemented as one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic elements for executing the methods described above.
The storage component is configured to store various types of data to support operations at the terminal. The memory component may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The display component may be an Electroluminescent (EL) element, a liquid crystal display or a micro display having a similar structure, or a retina-directly displayable or similar laser scanning type display.
It should be noted that, the above-mentioned computing device may be a physical device or an elastic computing host provided by a cloud computing platform, etc. It may be implemented as a distributed cluster of multiple servers or terminal devices, or as a single server or single terminal device.
The embodiment of the application also provides a computer readable storage medium storing a computer program, which when executed by a computer, can implement the data recovery method of any of the above embodiments. The computer-readable medium may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device.
The embodiments of the present application also provide a computer program product, which includes a computer program loaded on a computer readable storage medium, where the computer program can implement the data recovery method of any of the embodiments described above when executed by a computer. In such embodiments, the computer program may be downloaded and installed from a network, and/or installed from a removable medium. The computer program, when executed by a processor, performs the various functions defined in the system of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (12)

1. A method of data recovery, comprising:
determining that the writing of the target data in the first data pool fails;
acquiring the target data from a second data pool;
reassigning a first storage address in the first data pool for the target data;
and writing the target data into the first data pool according to the reassigned first storage address.
2. The method of claim 1, wherein the obtaining the target data from the second data pool comprises:
determining a first storage address corresponding to the target data in the first data pool;
determining a second storage address, in the second data pool, corresponding to the first storage address;
And reading the target data from a second data pool based on the second storage address.
3. The method of claim 2, wherein the second storage address comprises a node identification, a start address, and an offset address; the obtaining the target data from the second data pool based on the second storage address includes:
determining a corresponding storage node of the node identification based on the second storage address;
and reading the target data from the storage positions corresponding to the starting address and the offset address.
4. The method as recited in claim 1, further comprising:
inserting the target data into a data stream to be written in the first data pool;
said writing said target data into said first data pool according to said reassigned first memory address comprising:
writing the target data into the first data pool based on the corresponding arrangement sequence of the insertion positions in the data stream according to the reassigned first storage address;
the method further comprises the steps of:
and deleting a first storage address corresponding to the target data when the target data is written into the first data pool successfully after the target data is written into the first data pool.
5. The method of claim 2, wherein the determining the second memory address to which the first memory address corresponds comprises:
searching an address mapping relation and determining a second storage address corresponding to the first storage address;
after said reassigning the first storage address in the first data pool for the target data, the method further comprises:
and updating the address mapping relation based on the reassigned first storage address.
6. The method of claim 1, wherein after the determining that the target data fails to be written to the first data pool, the method further comprises:
setting an invalid mark for a first storage address corresponding to the target data when the writing of the target data in the first data pool fails;
and after the target data is successfully written in the first data pool, recovering the corresponding first storage address based on the invalid mark.
7. The method of claim 1, wherein before the determining that the target data fails to be written to the first data pool, the method further comprises:
aiming at target data to be written, a first storage address is allocated for the target data in a first data pool;
Judging whether the target data is from a second data pool;
if yes, writing the target data into the first data pool according to the first storage address;
if not, a second storage address is allocated for the target data in a second data pool, and the target data is respectively written into the first data pool and the second data pool based on the first storage address and the second storage address.
8. The method as recited in claim 1, further comprising:
if acquiring the target data from the second data pool fails, acquiring the target data from a data access party;
and writing the target data into the first data pool.
9. The method of claim 4, wherein the inserting the target data into the data stream to be written in the first data pool comprises:
and inserting the target data into the data stream to be written in the first data pool according to the receiving time sequence.
10. The method as recited in claim 1, further comprising:
receiving a read request for the target data;
judging whether the target data is successfully written in the first data pool or not;
If yes, accessing the first data pool to read the target data;
if not, accessing the second data pool to read the target data.
11. A computing device comprising a processing component and a storage component;
the storage component stores one or more computer instructions; the one or more computer instructions are operable to be invoked and executed by the processing component to implement a data recovery method as claimed in any one of claims 1 to 10.
12. A computer storage medium, characterized in that a computer program is stored, which, when executed by a computer, implements the data recovery method according to any one of claims 1 to 10.
CN202310085772.9A 2023-01-11 2023-01-11 Data recovery method and computing device Pending CN116302683A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310085772.9A CN116302683A (en) 2023-01-11 2023-01-11 Data recovery method and computing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310085772.9A CN116302683A (en) 2023-01-11 2023-01-11 Data recovery method and computing device

Publications (1)

Publication Number Publication Date
CN116302683A true CN116302683A (en) 2023-06-23

Family

ID=86821352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310085772.9A Pending CN116302683A (en) 2023-01-11 2023-01-11 Data recovery method and computing device

Country Status (1)

Country Link
CN (1) CN116302683A (en)

Similar Documents

Publication Publication Date Title
JP4809040B2 (en) Storage apparatus and snapshot restore method
US20190245919A1 (en) Method and apparatus for information processing, server and computer readable medium
US10127243B2 (en) Fast recovery using self-describing replica files in a distributed storage system
CN109271098B (en) Data migration method and device
US9983947B2 (en) Snapshots at real time intervals on asynchronous data replication system
CN110018897B (en) Data processing method and device and computing equipment
CN110399227B (en) Data access method, device and storage medium
CN110427258A (en) Scheduling of resource control method and device based on cloud platform
US9798638B2 (en) Systems and methods providing mount catalogs for rapid volume mount
CN114780019A (en) Electronic device management method and device, electronic device and storage medium
CN110888769B (en) Data processing method and computer equipment
CN109189480B (en) File system starting method and device
US20210326271A1 (en) Stale data recovery using virtual storage metadata
CN111124294B (en) Sector mapping information management method and device, storage medium and equipment
KR101676175B1 (en) Apparatus and method for memory storage to protect data-loss after power loss
US20210109922A1 (en) Database migration technique
CN116302683A (en) Data recovery method and computing device
JP4512201B2 (en) Data processing method and system
CN113127438B (en) Method, apparatus, server and medium for storing data
US10445338B2 (en) Method and system for replicating data in a cloud storage system
CN111930707A (en) Method and system for correcting drive letter of windows cloud migration
CN104636086A (en) HA storage device and HA state managing method
CN111435342A (en) Poster updating method, poster updating system and poster management system
CN113204520B (en) Remote sensing data rapid concurrent read-write method based on distributed file system
US11782630B1 (en) Efficient use of optional fast replicas via asymmetric replication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination