WO2020024772A1 - Method and device for querying data - Google Patents

Method and device for querying data

Info

Publication number
WO2020024772A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage
snapshot
time slice
time
data
Prior art date
Application number
PCT/CN2019/095103
Other languages
English (en)
French (fr)
Inventor
李鹏
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP19845149.4A (EP3822762A4)
Publication of WO2020024772A1
Priority to US17/164,981 (US11579986B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1435 Saving, restoring, recovering or retrying at system level using file system or storage system metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2477 Temporal data queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/835 Timestamp
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/84 Using snapshots, i.e. a logical point-in-time copy of the data

Definitions

  • the present application relates to the field of storage, and more particularly, to a method and device for querying data.
  • a snapshot is a mirror of a data set at a specific moment, also known as an instant copy, and it is a fully available copy of this data set.
  • the Storage Networking Industry Association (SNIA) defines a snapshot as: a fully available copy of a specified data set, the copy including an image of the corresponding data at a point in time.
  • a snapshot can be either a duplicate of the data it represents or a replica of that data.
  • Existing snapshot technologies fall into two types, full snapshots and incremental snapshots, each using a different snapshot technique.
  • a full snapshot uses split-mirror snapshot technology: before the preset snapshot time point is reached, a complete mirror volume is created and maintained for the source data volume, and each write goes to the source data volume and the mirror volume simultaneously, which takes up a lot of storage space.
  • Incremental snapshots can track changes to data volumes and snapshot volumes. When a new incremental snapshot is generated, the old snapshot data will be refreshed.
  • the present application provides a method and device for querying data, which can provide a bucket-level snapshot method to achieve access to all object contents of a snapshot without affecting business performance.
  • a method for querying data is provided, including: generating a mapping relationship, which indicates a one-to-one correspondence between N operations corresponding to a first storage and N storage spaces, where the N operations occur at different times, a first storage space among the N storage spaces is used to store a first storage object after processing based on a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces; generating an operation record, which records the occurrence times of the N operations; receiving a first query request, which requests a query of the storage state of the first storage object at a first moment; determining, according to the operation record, the first operation before the first moment from the N operations; and determining, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
  • when a client accesses a storage object, it can obtain read-only access to all object contents of a snapshot by establishing the snapshot.
  • the time-slice-based snapshot method can form a snapshot at every time point from the first snapshot onward without additional storage overhead, and the snapshot process does not affect existing services or read/write performance at all.
  • the operation record is divided into N time slices, and the operation record in each time slice includes time information of the time slice.
  • generating the operation record includes: generating new metadata according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
  • the first storage creates a first snapshot at a first time slice, and when the first storage receives a read request for the first snapshot sent by the client, the method further includes: determining the first time slice according to the snapshot name of the first snapshot; and determining the metadata and the first storage object according to the first time slice.
  • the first storage creates a first snapshot at a first time slice, and when the first storage receives a rollback request for the first snapshot sent by the client, the method further includes: determining a first time period according to the time slice in which the rollback request is located and the first time slice; and deleting all the operation records in the first time period.
  • an apparatus for querying data is provided, including: a generating unit, configured to generate a mapping relationship that indicates a one-to-one correspondence between N operations corresponding to the first storage and N storage spaces, where the N operations occur at different times, a first storage space among the N storage spaces is used to store a first storage object after processing based on a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces; the generating unit is further configured to generate an operation record that records the occurrence times of the N operations; a receiving unit, configured to receive a first query request that requests a query of the storage state of the first storage object at a first moment; a determining unit, configured to determine, according to the operation record, the first operation before the first moment from the N operations; the determining unit is further configured to determine, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
  • the operation record is divided into N time slices, and the operation record in each time slice includes time information of the time slice.
  • the generating unit is further configured to generate new metadata according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
  • the first storage creates a first snapshot at a first time slice, and when the first storage receives a read request for the first snapshot sent by the client, the determining unit is further configured to: determine the first time slice according to the snapshot name of the first snapshot; and determine the metadata and the first storage object according to the first time slice.
  • the first storage creates a first snapshot at a first time slice, and when the first storage receives a rollback request for the first snapshot sent by the client, the determining unit is further configured to: determine a first period according to the time slice where the rollback request is located and the first time slice; and delete all the operation records in the first period.
  • an apparatus for querying data includes a processor and a communication interface, and the processor is configured to execute a program.
  • when the processor executes code, the processor and the communication interface implement the method for querying data in the first aspect or any possible implementation manner of the first aspect.
  • a memory may be integrated in the processor, or the processing device may include a processor.
  • the present application provides a computer-readable storage medium.
  • the computer-readable storage medium stores program code for execution by a snapshot processing apparatus.
  • the program code includes instructions for executing the snapshot processing method in the first aspect or any one of the possible implementation manners of the first aspect.
  • the present application provides a computer program product containing instructions.
  • when the computer program product runs on a snapshot processing apparatus, the snapshot processing apparatus is caused to execute the method for querying data in the first aspect or any one of the possible implementation manners of the first aspect.
  • the present application provides a chip, and the chip includes a processor and a communication interface.
  • the processor is used to execute a program.
  • the processor and the communication interface implement the method for querying data in the first aspect or any possible implementation manner of the first aspect.
  • the chip may further include a memory.
  • the memory and the processor may be integrated together.
  • FIG. 1 is a schematic diagram of a data reading and writing process between an object storage system and a client to which an embodiment of the present application is applied.
  • FIG. 2 is a flowchart of a method for querying data according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an example of a bucket provided by an embodiment of the present application.
  • FIG. 4 shows a schematic block diagram of an apparatus for querying data according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an apparatus for querying data according to another embodiment of the present application.
  • FIG. 1 is a schematic diagram of a data reading and writing process between an object storage system and a client.
  • the object storage system usually adopts a method of separating data from metadata.
  • the object storage system 100 may include an object semantic interface and a service layer, a metadata storage system, and a data storage system. Among them, the object semantic interface and the service layer are connected to the metadata storage system and the data storage system at the same time.
  • when the object storage system executes a data write request, the basic flow is as shown in FIG. 1: the object semantic interface and service layer receives the client's write request, sends the data to be written to the data storage system, and obtains the unique data identifier or data organization form (the write 1 operation in FIG. 1); the object semantic interface and service layer then combines the obtained unique data identifier with the other object attributes into metadata and writes it into the metadata storage system (the write 2 operation in FIG. 1).
  • the metadata storage system can uniquely index to the metadata of the object through the object identifier.
  • when the object storage system executes a data read request, the basic flow is as shown in FIG. 1: the object semantic interface and service layer uses the object identifier to index the metadata storage system, obtains the object metadata, and from it obtains the unique data identifier or data organization form (the read 1 operation in FIG. 1); the object semantic interface and service layer then uses the unique data identifier to index the data storage system and obtain the data (the read 2 operation in FIG. 1), which is returned to the client along with the other metadata attributes.
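The write 1 / write 2 and read 1 / read 2 flows above can be sketched as follows. This is a minimal in-memory illustration, not the implementation described in the application; the class names `DataStore`, `MetadataStore` and `ObjectService`, and the use of a random UUID as the unique data identifier, are assumptions made for the example.

```python
import uuid


class DataStore:
    """Hypothetical data storage system: maps a unique data ID to bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, payload):
        data_id = uuid.uuid4().hex  # assumed form of the unique data identifier
        self._blobs[data_id] = payload
        return data_id

    def get(self, data_id):
        return self._blobs[data_id]


class MetadataStore:
    """Hypothetical metadata storage system, indexed by object identifier."""

    def __init__(self):
        self._meta = {}

    def put(self, object_id, metadata):
        self._meta[object_id] = metadata

    def get(self, object_id):
        return self._meta[object_id]


class ObjectService:
    """Stands in for the object semantic interface and service layer."""

    def __init__(self):
        self.data = DataStore()
        self.meta = MetadataStore()

    def write(self, object_id, payload, **attrs):
        data_id = self.data.put(payload)                          # write 1
        self.meta.put(object_id, {"data_id": data_id, **attrs})   # write 2

    def read(self, object_id):
        metadata = self.meta.get(object_id)                       # read 1
        payload = self.data.get(metadata["data_id"])              # read 2
        return payload, metadata
```

A client would call `write("photo.jpg", b"...", content_type="image/jpeg")` and later `read("photo.jpg")`; the data itself never passes through the metadata store, which is the separation the text describes.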
  • snapshots are one of the effective methods of preventing data loss on online storage devices.
  • a snapshot is a fully available copy of a specified set of data, that is, a snapshot can be either a duplicate of the data it represents or a replica of that data. Alternatively, a snapshot can be understood as a technology for protecting file system data, used to preserve the state of the file system at a certain time (for example, when a data backup is started).
  • a snapshot is a reference mark or pointer to data stored in a storage device.
  • snapshots can be used for online data recovery. In the event of an application failure or file corruption on the storage device, timely data recovery can be performed to restore the data to the state at the time when the snapshot was generated.
  • users can access the snapshot data and use the snapshot for testing and other tasks.
  • snapshots fall into two types, full snapshots and incremental snapshots, each using a different snapshot technique.
  • full volume snapshot is also called full copy snapshot or copy as is.
  • with split-mirror snapshot technology, before the preset snapshot time point is reached, a complete mirror volume is first created and maintained for the source data volume. Every write goes to both the source data volume and the mirror volume at the same time. This ensures that two copies of the same data are stored on the source data volume and the mirror volume respectively, and the two form a mirror pair.
  • at the snapshot time point, write operations to the mirror pair are stopped, and the mirror volume quickly leaves the mirror pair and is converted into a snapshot volume, thus yielding a data snapshot.
  • after the snapshot volume has served applications such as data snapshot or data backup, it resynchronizes with the source data volume and becomes a mirror volume again.
  • the benefit of split-mirror snapshots is that the data is well isolated, which makes offline data access possible and simplifies recovering, copying, or archiving all data on a hard disk.
  • most importantly, the operation is very short: it only takes the time needed to split the mirror volume pair, usually just a few milliseconds. Such a small backup window hardly affects upper-layer applications.
  • Incremental snapshots track changes to data volumes and snapshot volumes. When a new incremental snapshot is generated, the old snapshot data is refreshed. The first snapshot and every subsequent incremental snapshot are timestamped. Using the timestamps, the snapshot data can be rolled back to an arbitrary point in time. Incremental snapshot technology speeds up the generation of subsequent snapshots and consumes only slightly more space. As a result, snapshots can be created more frequently and kept longer. Incremental snapshot techniques include copy-on-write (COW) and redirect-on-write (ROW).
  • COW copy-on-write
  • ROW redirect-on-write
  • COW first creates a data pointer table for each source data to store physical pointers to all data of the source data.
  • when a snapshot is created, the storage system makes a copy of the source data pointer table, which is used as the pointer table of the snapshot data.
  • COW creates a snapshot volume only when creating a snapshot. The snapshot volume only takes up a relatively small amount of storage space and is used to save the updated data in the source data volume after the snapshot time point.
  • because each write operation after the snapshot is created requires the original data in the source data to be copied to the snapshot before the source data can be written, the write performance of the source data volume is reduced. If multiple snapshots of the same source data are taken, the write performance is even lower, so COW does not meet the write performance requirements of a single bucket holding massive data in object storage.
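The write penalty described above can be made concrete with a toy copy-on-write volume. This is a deliberately naive sketch, not a real storage implementation: the class `CowVolume`, its per-snapshot dictionaries and the `copies` counter are all assumptions introduced to show how extra copy work grows with the number of snapshots.

```python
class CowVolume:
    """Toy copy-on-write volume (hypothetical, for illustration only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live source volume
        self.snapshots = []         # one dict per snapshot: block index -> original data
        self.copies = 0             # extra copy operations caused by snapshots

    def create_snapshot(self):
        # Creating a snapshot is cheap: nothing is copied yet.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, preserve the current block in every snapshot
        # that has not already saved its own copy of this block.
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = self.blocks[index]
                self.copies += 1
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # Unchanged blocks are still read from the source volume.
        return self.snapshots[snap_id].get(index, self.blocks[index])
```

In this sketch a write to a block that no snapshot has preserved costs one extra copy per snapshot, which is the effect the paragraph attributes to COW when many snapshots exist.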
  • with ROW, the snapshot data pointer table holds the original copy of the source data, while the source data volume's data pointer table holds the updated copy, which results in the need to synchronize the data pointed to by the snapshot volume's data pointer table back to the source data volume.
  • when multiple snapshots are taken, a snapshot chain is generated, which makes accessing the original data, tracking the snapshots and source data, and deleting snapshots extremely complicated.
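For contrast with COW, the redirect-on-write idea can be sketched with pointer tables. This is a simplified illustration under stated assumptions: the class `RowVolume`, the append-only `store` list and the frozen pointer-table copies are hypothetical, and the sketch does not model snapshot chains or the synchronization on snapshot deletion mentioned above.

```python
class RowVolume:
    """Toy redirect-on-write volume (hypothetical, for illustration only)."""

    def __init__(self, blocks):
        self.store = list(blocks)             # physical blocks, append-only
        self.live = list(range(len(blocks)))  # live pointer table: logical -> physical
        self.snapshots = []                   # frozen copies of the pointer table

    def create_snapshot(self):
        # A snapshot is only a copy of the pointer table, not of the data.
        self.snapshots.append(list(self.live))
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Redirect: new data goes to a new physical location and the live
        # pointer is updated; the original block stays where it was.
        self.store.append(data)
        self.live[index] = len(self.store) - 1

    def read(self, index):
        return self.store[self.live[index]]

    def read_snapshot(self, snap_id, index):
        return self.store[self.snapshots[snap_id][index]]
```

No data is copied on the write path here, but every snapshot adds another pointer table that has to be tracked, which hints at why long snapshot chains become hard to manage.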
  • This application provides a method for querying data that also implements a time-slice-based, bucket-level snapshot method for object storage systems, suited to the massive data volumes typical of object storage. Users have read-only access to all the object contents of a snapshot, and they can quickly roll back the entire bucket to a snapshot without affecting their own business performance.
  • FIG. 2 is a flowchart of a method for querying data according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an example bucket provided by an embodiment of the present application.
  • a method for querying data provided by an embodiment of the present application will be described in detail with reference to FIGS. 2 and 3. As shown in FIG. 2, the method 200 includes:
  • S210 Generate a mapping relationship, which is used to indicate a one-to-one correspondence between the N operations corresponding to the first storage and the N storage spaces, where the N operations occur at different times, the first storage space among the N storage spaces is used to store a first storage object after processing based on a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces.
  • S230 Receive a first query request, where the first query request is used to request to query a storage state of the first storage object at a first moment.
  • the largest rectangular frame in FIG. 3 can be regarded as a bucket.
  • a bucket can be a container for storing objects.
  • An object must belong to and only belong to one bucket.
  • in this application, a bucket is referred to as the "first storage" or the "object storage system".
  • the first storage may be divided into N storage spaces. That is, based on the prior art, the object storage system is divided according to time slices, and can be divided into N storage spaces, where N is a positive integer.
  • the management of the time slice can be maintained through a dedicated module. It should be understood that the time slice is monotonically increasing, as shown in FIG. 3 from time slice 1 to time slice 7.
  • taking storage object 1 as an example, storage object 1 is stored in the first storage space indicated by time slice 1.
  • when the second storage space indicated by time slice 2 is reached, storage object 1 is rewritten.
  • the mapping relationship mentioned in S210 can be understood as the correspondence between storage space and operations.
  • the operation corresponding to the first storage space is a new write of storage object 1, and the operation corresponding to the second storage space is a rewrite of storage object 1.
  • the storage object 1 rewritten in the second storage space is the first storage object after being processed based on the rewrite operation.
  • the operation recorded here is a "rewrite", that is, storage object 1 is changed; the difference from the existing technology is that changes in this application only append to the storage object and never actually overwrite it.
  • taking storage object 2 as an example, storage object 2 is stored in the first storage space indicated by time slice 1.
  • when the second storage space indicated by time slice 2 is reached, no operation is performed on storage object 2; when the third storage space indicated by time slice 3 is reached, a delete operation is performed on storage object 2.
  • the delete instruction here only marks storage object 2 as deleted and does not actually delete it.
  • suppose snapshot A is generated at the time represented by time slice 2. Because storage object 2 is deleted in time slice 3, storage object 2 can no longer be accessed normally after that point; but because the deletion does not actually remove storage object 2, users can still access storage object 2 by accessing snapshot A.
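The behaviour of storage objects 1 and 2 above can be sketched as an append-only operation record with mark-only deletion. This is a minimal sketch under assumptions, not the structure claimed in the application: the names `Operation`, `TimeSlicedBucket`, `advance` and `read_at` are hypothetical, and the linear scan over the record is chosen for clarity rather than performance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Operation:
    time_slice: int                 # time slice in which the operation occurred
    object_id: str
    kind: str                       # "write" or "delete"
    payload: Optional[bytes] = None


class TimeSlicedBucket:
    """Hypothetical bucket whose changes are only ever appended."""

    def __init__(self):
        self.current_slice = 1
        self.log = []               # operation record, ordered by time slice

    def advance(self):
        # Time slices increase monotonically.
        self.current_slice += 1
        return self.current_slice

    def write(self, object_id, payload):
        # A rewrite appends a new version; earlier versions are kept.
        self.log.append(Operation(self.current_slice, object_id, "write", payload))

    def delete(self, object_id):
        # Deletion is only a marker; no stored data is removed.
        self.log.append(Operation(self.current_slice, object_id, "delete"))

    def read_at(self, object_id, time_slice):
        # Find the latest operation on the object at or before the time slice.
        latest = None
        for op in self.log:
            if op.object_id == object_id and op.time_slice <= time_slice:
                latest = op
        if latest is None or latest.kind == "delete":
            return None
        return latest.payload
```

Replaying the example from FIG. 3: write objects 1 and 2 in time slice 1, rewrite object 1 in time slice 2, and mark object 2 deleted in time slice 3. Reading object 2 at time slice 2 still returns its data, while reading it at time slice 3 returns nothing.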
  • the operation record is divided into N time slices, and the operation record in each time slice includes time information of the time slice.
  • the storage object needs to be accessed through the mapping relationship between the metadata and the storage object.
  • because the concept of a time slice is added to the object storage system, the metadata must support searching and enumeration by time slice.
  • the time slice information of the current operation can be added to the storage object metadata.
  • the metadata system needs to add an index on time slices.
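One way to picture a metadata index on time slices is a per-object sorted list that supports binary search. The sketch below is an assumption-laden illustration, not the index used in the application: the class `TimeSliceIndex`, the `deleted` flag and the `time_slice` field added to each metadata entry are hypothetical names.

```python
import bisect
from collections import defaultdict


class TimeSliceIndex:
    """Hypothetical metadata index keyed by object and time slice."""

    def __init__(self):
        self._slices = defaultdict(list)   # object_id -> sorted time slices
        self._entries = defaultdict(list)  # object_id -> metadata, same order

    def add(self, object_id, time_slice, metadata):
        slices = self._slices[object_id]
        if slices and time_slice < slices[-1]:
            raise ValueError("time slices must be monotonically increasing")
        slices.append(time_slice)
        # New metadata = existing metadata plus the time slice of the operation.
        self._entries[object_id].append({**metadata, "time_slice": time_slice})

    def lookup(self, object_id, time_slice):
        # Search: latest metadata entry at or before the given time slice.
        slices = self._slices.get(object_id, [])
        pos = bisect.bisect_right(slices, time_slice)
        if pos == 0:
            return None
        return self._entries[object_id][pos - 1]

    def enumerate_at(self, time_slice):
        # Enumerate: every object visible (not marked deleted) at the time slice.
        visible = {}
        for object_id in self._slices:
            entry = self.lookup(object_id, time_slice)
            if entry is not None and not entry.get("deleted", False):
                visible[object_id] = entry
        return visible
```

Keeping the time slices sorted lets `lookup` use binary search instead of scanning every version, which matters when a single bucket holds a very large number of objects and versions.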
  • the first storage establishes a first snapshot at a first time slice, and when the first storage receives a read request for the first snapshot sent by the client, the object storage system may determine the first time slice according to the snapshot name of the first snapshot, and then determine the metadata and the first storage object according to the first time slice.
  • that is, when a client sends an access request for the data of a snapshot, the object storage system first finds the corresponding time slice according to the snapshot name, and then finds the corresponding metadata and data according to the time slice.
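The two-step lookup described above (snapshot name to time slice, then time slice to data) can be sketched as follows. The names `SnapshotCatalog` and `read_snapshot`, and the representation of an object's history as a list of `(time_slice, data)` pairs with `None` marking a deletion, are assumptions made for this example only.

```python
class SnapshotCatalog:
    """Hypothetical table mapping snapshot names to time slices."""

    def __init__(self):
        self._by_name = {}

    def create(self, name, time_slice):
        if name in self._by_name:
            raise ValueError("snapshot name already exists: " + name)
        self._by_name[name] = time_slice

    def time_slice_of(self, name):
        return self._by_name[name]


def read_snapshot(catalog, versions, snapshot_name, object_id):
    """Read-only access to an object as it was when the snapshot was taken.

    versions maps object_id to a list of (time_slice, data) pairs ordered by
    time slice; data is None where the object was marked deleted.
    """
    snapshot_slice = catalog.time_slice_of(snapshot_name)  # step 1: name -> time slice
    visible = None
    for op_slice, data in versions.get(object_id, []):     # step 2: time slice -> data
        if op_slice > snapshot_slice:
            break
        visible = data
    return visible
```

Because a snapshot is just a name bound to a time slice, creating one in this sketch stores a single dictionary entry and copies no object data.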
  • the first storage creates a first snapshot at a first time slice, and when the first storage receives a rollback request for the first snapshot sent by the client, the method further includes: determining the first period according to the time slice where the rollback request is located and the first time slice; and deleting all the operation records in the first period.
  • that is, the system background actually deletes all changes between the two time slices, based on the time slice of the rollback and the time slice of the snapshot, while foreground access ignores all change records in this time period.
  • for example, snapshot A is generated in the second storage space corresponding to time slice 2. When a rollback to snapshot A is requested in the sixth storage space corresponding to time slice 6, all operations from time slice 2 to time slice 6 are deleted, that is, the storage returns to the state of time slice 2.
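The rollback in this example can be sketched as a filter over the operation record. The function `rollback` and the tuple layout of the log entries are hypothetical. The sketch also assumes that operations recorded in the snapshot's own time slice are kept and only later ones, up to and including the rollback time slice, are dropped; the text does not spell out the boundary handling.

```python
def rollback(log, snapshot_slice, rollback_slice):
    """Return the operation record with the rolled-back period removed.

    log is a list of (time_slice, object_id, kind, payload) tuples.
    Assumption: the removed period is snapshot_slice < t <= rollback_slice.
    """
    if rollback_slice < snapshot_slice:
        raise ValueError("rollback must not precede the snapshot")
    return [
        op for op in log
        if not (snapshot_slice < op[0] <= rollback_slice)
    ]
```

With snapshot A at time slice 2 and a rollback requested in time slice 6, every record from time slices 3 through 6 disappears and the remaining record describes the bucket exactly as it was at time slice 2.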
  • when a client accesses a storage object, it can obtain read-only access to all object contents of a snapshot by establishing the snapshot.
  • the time-slice-based snapshot method can form a snapshot at every time point from the first snapshot onward without additional storage overhead, and the snapshot process does not affect existing services or read/write performance at all.
  • when the client undergoes a major business transformation or has important buckets, they can be protected by snapshots. When an accident occurs, a bucket-level overall rollback can be performed to protect the data.
  • FIG. 4 shows a schematic block diagram of an apparatus 400 for querying data according to an embodiment of the present application.
  • the apparatus 400 may correspond to the object storage system described in the foregoing method 200, and may also be a chip or component applied to the object storage system.
  • Each module or unit in the apparatus 400 is respectively configured to perform various actions or processing processes performed by the object storage system in the method 200 described above.
  • the apparatus 400 may include a generating unit 410, a receiving unit 420, and a determining unit 430.
  • the generating unit 410 is configured to generate a mapping relationship used to indicate a one-to-one correspondence between N operations corresponding to the first storage and N storage spaces, where the N operations occur at different times, a first storage space among the N storage spaces is used to store a first storage object after processing based on a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces.
  • the generating unit 410 is further configured to generate an operation record, where the operation record is used to record occurrence times of the N operations.
  • the receiving unit 420 is configured to receive a first query request, where the first query request is used to query a storage state of the first storage object at a first moment.
  • the determining unit 430 is configured to determine a first operation before the first time from the N operations according to the operation record.
  • the determining unit 430 is further configured to determine a first storage object stored in a storage space corresponding to the first operation according to the mapping relationship.
  • the operation record is divided into N time slices, and the operation record in each time slice includes time information of the time slice.
  • the generating unit 410 is further configured to generate new metadata according to time information of the time slice of the first operation and metadata corresponding to the first storage object.
  • the first storage creates a first snapshot at a first time slice, and when the first storage receives a read request for the first snapshot sent by a client, the determining unit 430 is further configured to: determine the first time slice according to a snapshot name of the first snapshot; and determine the metadata and the first storage object according to the first time slice.
  • the first storage creates a first snapshot at a first time slice, and when the first storage receives a rollback request for the first snapshot sent by a client, the determining unit 430 is further configured to: determine the first period according to the time slice where the rollback request is located and the first time slice; and delete all the operation records in the first period.
  • FIG. 5 is a schematic structural diagram of an apparatus for querying data according to another embodiment of this application. It should be understood that the apparatus 500 shown in FIG. 5 is merely an example; the apparatus for querying data in the embodiments of this application may further include other modules or units, may include modules similar in function to the modules in FIG. 5, or need not include all the modules in FIG. 5.
  • The memory 510 stores program code for implementing the functions of the modules shown in FIG. 2.
  • The processor 520 is configured to execute the program code stored in the memory 510.
  • The communication interface 530 is configured to communicate with a storage device or storage medium, to read data from the storage device or storage medium, or to write data to it.
  • The processor 520 may invoke the communication interface 530 to implement the method for querying data in FIG. 2. For brevity, details are not repeated here.
  • The disclosed systems, apparatuses, and methods may be implemented in other ways.
  • The apparatus embodiments described above are merely illustrative.
  • The division into units is merely a division by logical function.
  • Multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • The mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
  • The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • The functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • The technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application.
  • The aforementioned storage media include USB flash drives, removable hard disks, read-only memories (ROMs), random access memories (RAMs), magnetic disks, optical discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

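This application provides a method and apparatus for querying data. The method includes: dividing an object storage system into multiple storage spaces according to time slices; establishing a mapping relationship among storage spaces, time slices, and operation records; and, after a snapshot is generated, recording the time slice information corresponding to that snapshot, so as to support snapshot-based services such as read-only access, snapshot rollback, and object changes. This can improve query performance without adding extra storage overhead. In addition, operations on a bucket can be performed as appends only, without actually overwriting or deleting, which protects data when a client undergoes a major service change or when a bucket is important.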

Description

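Method and Apparatus for Querying Data
Technical Field
This application relates to the storage field, and more specifically, to a method and apparatus for querying data.
Background
A snapshot is an image of a data set at a particular point in time, also called an instant copy; it is a complete, usable copy of that data set. The Storage Networking Industry Association (SNIA) defines a snapshot as a fully usable copy of a specified data set that includes an image of the corresponding data at a point in time. A snapshot may be a copy of the data it represents, or a replica of that data.
Existing snapshot techniques fall into two types, full snapshots and incremental snapshots, each using a different technique. A full snapshot uses the split-mirror technique: before the preset snapshot time is reached, a complete mirror volume is first created and maintained for the source data volume, and every write goes to both the source data volume and the mirror volume, which consumes a large amount of storage space. An incremental snapshot tracks changes to the data volume and the snapshot volume; when a new incremental snapshot is generated, the old snapshot data is refreshed.
The techniques listed above are all snapshot techniques for objects such as blocks and files. There is currently no snapshot technique for an object storage system, for example one for the bucket, the container that holds objects. In addition, for massive-data storage systems with sufficiently large storage space, such as cloud storage, there is no snapshot technique that gives users read-only access to the contents of the object storage system without affecting the service performance of the storage system.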
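Summary
This application provides a method and apparatus for querying data that offer a bucket-level snapshot method, making it possible to access the contents of all objects in a given snapshot without affecting service performance.
According to a first aspect, a method for querying data is provided, including: generating a mapping relationship, where the mapping relationship indicates a one-to-one correspondence between N operations on a first storage and N storage spaces, the N operations occur at different times, a first storage space of the N storage spaces is used to store a first storage object as processed by a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces; generating an operation record, where the operation record records the times at which the N operations occur; receiving a first query request, where the first query request requests the storage state of the first storage object at a first moment; determining, according to the operation record, the first operation preceding the first moment from among the N operations; and determining, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
With the technical solution provided in the embodiments of this application, dividing storage into time slices lets a client, when accessing storage objects, create a snapshot and then read, in read-only mode, the contents of all objects in that snapshot. With this time-slice-based snapshot method, every point in time after the first snapshot can form a snapshot, no extra storage overhead is introduced, and taking a snapshot does not affect existing services or read/write performance at all.
With reference to the first aspect, in some implementations of the first aspect, the operation record is divided into N time slices, and the operation record in each time slice includes time information of that time slice.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, generating the operation record includes: generating new metadata according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the first storage creates a first snapshot in a first time slice, and when the first storage receives a read request for the first snapshot from a client, the method further includes: determining the first time slice according to the snapshot name of the first snapshot; and determining the metadata and the first storage object according to the first time slice.
With reference to the first aspect and the foregoing implementations, in some implementations of the first aspect, the first storage creates a first snapshot in a first time slice, and when the first storage receives a rollback request for the first snapshot from a client, the method further includes: determining a first period according to the time slice in which the rollback request falls and the first time slice; and deleting all the operation records within the first period.
It should be understood that the technical solution provided in this application is well suited to cloud storage, because cloud storage has sufficiently large storage space. When the data in cloud storage is important and deletions and changes are not especially frequent, the entire bucket can be quickly rolled back to a given snapshot without affecting its service capability.
With the foregoing technical solution, when a client undergoes a major service change or has an important bucket, the bucket can be protected by means of snapshots, and if something goes wrong, the whole bucket can be rolled back, achieving data protection.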
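According to a second aspect, an apparatus for querying data is provided, including: a generating unit, configured to generate a mapping relationship, where the mapping relationship indicates a one-to-one correspondence between N operations on a first storage and N storage spaces, the N operations occur at different times, a first storage space of the N storage spaces is used to store a first storage object as processed by a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces, the generating unit being further configured to generate an operation record, where the operation record records the times at which the N operations occur; a receiving unit, configured to receive a first query request, where the first query request requests the storage state of the first storage object at a first moment; and a determining unit, configured to determine, according to the operation record, the first operation preceding the first moment from among the N operations, the determining unit being further configured to determine, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
With reference to the second aspect, in some implementations of the second aspect, the operation record is divided into N time slices, and the operation record in each time slice includes time information of that time slice.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the generating unit is further configured to generate new metadata according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the first storage creates a first snapshot in a first time slice, and when the first storage receives a read request for the first snapshot from a client, the determining unit is further configured to: determine the first time slice according to the snapshot name of the first snapshot; and determine the metadata and the first storage object according to the first time slice.
With reference to the second aspect and the foregoing implementations, in some implementations of the second aspect, the first storage creates a first snapshot in a first time slice, and when the first storage receives a rollback request for the first snapshot from a client, the determining unit is further configured to: determine a first period according to the time slice in which the rollback request falls and the first time slice; and delete all the operation records within the first period.
According to a third aspect, an apparatus for querying data is provided. The snapshot processing apparatus includes a processor and a communication interface, and the processor is configured to execute a program. When the processor executes the code, the processor and the communication interface implement the method for querying data according to the first aspect or any implementation of the first aspect.
A memory may be integrated in the processor, or the processing apparatus may include a processor.
According to a fourth aspect, this application provides a computer-readable storage medium. The computer-readable storage medium stores program code to be executed by a snapshot processing apparatus. The program code includes instructions for performing the snapshot processing method according to the first aspect or any possible implementation of the first aspect.
According to a fifth aspect, this application provides a computer program product containing instructions. When the computer program product runs on a snapshot processing apparatus, the snapshot processing apparatus is caused to perform the method for querying data according to the first aspect or any possible implementation of the first aspect.
According to a sixth aspect, this application provides a chip. The chip system includes a processor and a communication interface. The processor is configured to execute a program. When the processor executes the code, the processor and the communication interface implement the method for querying data according to the first aspect or any possible implementation of the first aspect.
Optionally, the chip may further include a memory. Further, the memory and the processor may be integrated together.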
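Brief Description of the Drawings
FIG. 1 is a schematic diagram of the data read/write process between a client and an object storage system to which an embodiment of this application is applied.
FIG. 2 is a flowchart of an example of a method for querying data according to an embodiment of this application.
FIG. 3 is a schematic diagram of an example of a bucket according to an embodiment of this application.
FIG. 4 is a schematic block diagram of an apparatus for querying data according to an embodiment of this application.
FIG. 5 is a schematic structural diagram of an apparatus for querying data according to another embodiment of this application.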
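Detailed Description
The technical solutions in this application are described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of the data read/write process between an object storage system and a client. As shown in FIG. 1, an object storage system usually separates data from metadata. The object storage system 100 may include an object semantic interface and service layer, a metadata storage system, and a data storage system. The object semantic interface and service layer is connected to both the metadata storage system and the data storage system.
During normal service between the client and the object storage system, for example when the object storage system handles a data write request, the basic flow shown in FIG. 1 is as follows. The object semantic interface and service layer receives the write request from the client, sends the data to the data storage system, and obtains a unique data identifier or data organization form (write 1 in FIG. 1). The object semantic interface and service layer then combines the obtained unique data identifier with the object's other attributes into metadata and writes it to the metadata storage system (write 2 in FIG. 1). The metadata storage system can uniquely index an object's metadata by its object identifier.
When the object storage system handles a data read request, the basic flow shown in FIG. 1 is as follows. The object semantic interface and service layer looks up the object's metadata in the metadata storage system by object identifier and obtains the unique data identifier or data organization form (read 1 in FIG. 1). It then uses the unique data identifier to look up the data in the data storage system (read 2 in FIG. 1) and returns the data to the client together with the other attributes in the metadata.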
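The two-step write and read flows of FIG. 1 can be illustrated with a minimal Python sketch. It is not taken from the application: the class name `ObjectStore`, the in-memory dictionaries, and the use of a random UUID as the unique data identifier are all assumptions made for illustration.

```python
import uuid


class ObjectStore:
    """Toy service layer that keeps data and metadata in separate stores.

    Hypothetical illustration of FIG. 1; names and storage layout are assumed.
    """

    def __init__(self):
        self.data_store = {}      # unique data identifier -> payload
        self.metadata_store = {}  # object identifier -> metadata dict

    def put(self, object_id, payload, **attributes):
        # Write 1: store the data and obtain its unique data identifier.
        data_id = uuid.uuid4().hex
        self.data_store[data_id] = payload
        # Write 2: merge the data identifier with the object's other
        # attributes into metadata, indexed by the object identifier.
        self.metadata_store[object_id] = {
            "data_id": data_id,
            "size": len(payload),
            **attributes,
        }

    def get(self, object_id):
        # Read 1: look up the metadata by object identifier.
        metadata = self.metadata_store[object_id]
        # Read 2: use the data identifier from the metadata to fetch the data.
        payload = self.data_store[metadata["data_id"]]
        return payload, metadata
```

Keeping the data identifier inside the metadata is what lets a later scheme add new metadata versions that point at older data without touching the data store.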
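As storage application requirements grow, users need to protect data online, and snapshots are one of the effective ways for online storage devices to guard against data loss. A snapshot is a fully usable copy of a specified data set; that is, a snapshot may be a copy of the data it represents, or a replica of that data. Alternatively, a snapshot can be understood as a technique for protecting file system data, used to preserve the state of the file system at a certain time (for example, the time a data backup starts). In a concrete implementation, a snapshot is a reference marker or pointer to data stored in a storage device. In addition, a snapshot enables online data recovery: when an application failure or file corruption occurs on a storage device, data can be promptly restored to its state at the time the snapshot was taken. A snapshot can also give storage users another data access channel: while the source data is being processed by online applications, users can access the snapshot data and can also use the snapshot for testing and similar work.
Several snapshot techniques are briefly introduced below. According to the classification of the Storage Networking Industry Association (SNIA), snapshots are of two types, full snapshots and incremental snapshots, each using a different technique. A full snapshot, also called a full-copy snapshot or exact copy, uses the split-mirror technique. Before the preset snapshot time is reached, a complete mirror volume is first created and maintained for the source data volume. Every write goes to both the source data volume and the mirror volume, so that two copies of the same data are kept on the source data volume and the mirror volume respectively, and the two form a mirror pair. When the preset snapshot time arrives, writes to the mirror pair are stopped, and the mirror volume quickly detaches from the mirror pair and becomes a snapshot volume, yielding a data snapshot. After the snapshot volume has served applications such as data snapshot or data backup, it is resynchronized with the source data volume and becomes a new mirror volume.
The advantages of a split-mirror snapshot are good data isolation, which makes offline data access possible, and a simpler process for restoring, copying, or archiving all the data on a disk. Most importantly, the operation is very short: it takes only the time needed to break the mirror pair, usually just a few milliseconds, and such a small backup window has almost no impact on upper-layer applications. The snapshot volume and the source data volume do not affect each other. However, this approach lacks flexibility, because a snapshot cannot be created for an arbitrary data volume at an arbitrary time. It also requires one or more mirror volumes with the same capacity as the source data volume, which consumes a large amount of storage space. Because every write is performed twice, write performance suffers noticeably, and synchronizing the mirror also lowers the overall performance of the storage system.
An incremental snapshot is characterized by tracking changes to the data volume and the snapshot volume. When a new incremental snapshot is generated, the old snapshot data is refreshed. The first snapshot and every incremental snapshot created afterward carry a timestamp, and the timestamps make it possible to roll the snapshot data back to any of those points in time. Incremental snapshots speed up the creation of subsequent snapshots and consume only nominally more space. As a result, snapshots can be created more frequently and retained longer. Incremental snapshots include copy-on-write (COW) snapshots and redirect-on-write (ROW) snapshots.
COW first creates, for each source data volume, a data pointer table that holds the physical pointers to all of the source data. When a snapshot is created, the storage system makes a copy of the source data pointer table, and this copy serves as the snapshot data pointer table. COW creates the snapshot volume only when the snapshot is created, and the snapshot volume occupies only a relatively small amount of storage space, which holds the data in the source data volume that is updated after the snapshot point. However, after a snapshot is created, every write must first copy the original data from the source into the snapshot before the source can be written, which lowers the write performance of the source data volume. Clearly, write performance degrades further after multiple snapshots have been taken of the same source data, so COW does not meet the write-performance requirements of object storage with massive data in a single bucket.
With ROW, after a snapshot of the source data is created, an update to the source data does not directly modify the original data as COW does. Instead, new space is allocated to hold the new data that updates the original data. The ROW snapshot data pointer table holds the original copy of the source data, while the source data volume's pointer table holds the updated copy. Consequently, before the snapshot volume is deleted, the data pointed to by the snapshot volume's pointer table must be synchronized to the source data volume. Moreover, once multiple snapshots have been created, a snapshot chain forms, making access to the original data, tracking of snapshots and source data, and deletion of snapshots extremely complex.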
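The difference between the two write paths can be made concrete with a small Python sketch. It is a simplified model under stated assumptions, not an implementation from the application: a volume is a Python list of blocks, only a single snapshot is supported, and the class names `CowVolume` and `RowVolume` are hypothetical.

```python
class CowVolume:
    """Copy-on-write: preserve the original block, then overwrite in place."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshot_area = None  # block index -> preserved original block
        self.copies = 0            # extra copy operations caused by the snapshot

    def snapshot(self):
        self.snapshot_area = {}

    def write(self, index, value):
        if self.snapshot_area is not None and index not in self.snapshot_area:
            # First write to this block since the snapshot: copy the original
            # out before the source is modified (the extra write-path cost).
            self.snapshot_area[index] = self.blocks[index]
            self.copies += 1
        self.blocks[index] = value

    def read_snapshot(self, index):
        if self.snapshot_area is None:
            raise RuntimeError("no snapshot has been taken")
        return self.snapshot_area.get(index, self.blocks[index])


class RowVolume:
    """Redirect-on-write: leave the original block alone, write elsewhere."""

    def __init__(self, blocks):
        self.original = list(blocks)
        self.redirected = None  # block index -> new data written after snapshot

    def snapshot(self):
        self.redirected = {}

    def write(self, index, value):
        if self.redirected is None:
            self.original[index] = value
        else:
            self.redirected[index] = value  # no copy of the old data is needed

    def read(self, index):
        if self.redirected is not None and index in self.redirected:
            return self.redirected[index]
        return self.original[index]

    def read_snapshot(self, index):
        return self.original[index]
```

In this model the COW write path pays one extra copy per first-touched block, while the ROW read path must consult the redirection table, which mirrors the trade-off described above.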
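There is currently no snapshot at the level of an entire bucket, that is, no way to access a bucket as of a given point in time. The data replication technology of object storage systems can replicate at bucket level, but replication to another bucket lags somewhat, and there is no well-defined rollback method: users must restore data themselves and can roll back only object by object.
This application provides a method for querying data and, with it, a time-slice-based bucket-level snapshot method for object storage systems, an implementation suited to the massive-data characteristics of object storage. Users can access the contents of all objects in a given snapshot in read-only mode and can also quickly roll the entire bucket back to a given snapshot, without affecting service performance.
FIG. 2 is a flowchart of an example of a method for querying data according to an embodiment of this application. FIG. 3 is a schematic diagram of an example of a bucket according to an embodiment of this application. The method for querying data provided in the embodiments of this application is described in detail below with reference to FIG. 2 and FIG. 3. As shown in FIG. 2, the method 200 includes:
S210: Generate a mapping relationship, where the mapping relationship indicates a one-to-one correspondence between N operations on a first storage and N storage spaces, the N operations occur at different times, a first storage space of the N storage spaces is used to store a first storage object as processed by a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces.
S220: Generate an operation record, where the operation record records the times at which the N operations occur.
S230: Receive a first query request, where the first query request requests the storage state of the first storage object at a first moment.
S240: Determine, according to the operation record, the first operation preceding the first moment from among the N operations.
S250: Determine, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
Specifically, taking FIG. 3 as an example, the largest rectangle in FIG. 3 can be regarded as a bucket. A bucket is a container for storage objects, and an object must belong to one and only one bucket. In this application, the bucket is referred to as the "first storage" or the "object storage system". As shown in FIG. 3, the first storage can be divided into N storage spaces. That is, building on the prior art, the object storage system is divided by time slice into N storage spaces, where N is a positive integer. Time slices can be managed by a dedicated module. It should be understood that time slices increase monotonically, as in the order from time slice 1 to time slice 7 in FIG. 3.
Take storage object 1 as an example. Storage object 1 is written into the first storage space, represented by time slice 1. In the second storage space, represented by time slice 2, storage object 1 is rewritten. The mapping relationship in S210 can be understood as the correspondence between storage spaces and operations. For storage object 1, for example, the operation corresponding to the first storage space is the initial write of storage object 1, and the operation corresponding to the second storage space is the rewrite of storage object 1. The storage object 1 rewritten in the second storage space is the first storage object as processed by the rewrite operation, and the operation record here is "rewrite", that is, a change to storage object 1. Unlike the prior art, however, a change in this application only appends to the storage object and does not actually overwrite it.
Deletion works the same way. Take storage object 2 as an example. Storage object 2 is written into the first storage space, represented by time slice 1. In the second storage space, represented by time slice 2, no operation is performed on storage object 2. In the third storage space, represented by time slice 3, a delete write is performed on storage object 2. This only marks storage object 2 as deleted; no actual deletion takes place.
Suppose snapshot A is generated at the time represented by time slice 2. Because storage object 2 is deleted in time slice 3, storage object 2 can no longer be accessed after time slice 2. However, because the deletion does not actually remove storage object 2, the user can still access storage object 2 through snapshot A.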
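Steps S210 to S250 and the FIG. 3 example can be sketched in Python as follows. This is a minimal illustration under assumptions, not the application's implementation: the class `TimeSlicedBucket`, the `TOMBSTONE` sentinel used as the delete marker, and the reading of "the operation preceding the first moment" as "the latest operation at or before the queried time slice" are all choices made here.

```python
import bisect

TOMBSTONE = object()  # hypothetical delete marker; nothing is physically removed


class TimeSlicedBucket:
    def __init__(self):
        self.current_slice = 1
        # Mapping relationship (S210): time slice -> storage space,
        # where a storage space maps object key -> stored value or TOMBSTONE.
        self.spaces = {}
        # Operation record (S220): object key -> ascending list of the
        # time slices in which an operation on that object occurred.
        self.op_record = {}

    def advance(self):
        """Move to the next time slice; slices increase monotonically."""
        self.current_slice += 1

    def _append(self, key, value):
        ts = self.current_slice
        self.spaces.setdefault(ts, {})[key] = value
        slices = self.op_record.setdefault(key, [])
        if not slices or slices[-1] != ts:
            slices.append(ts)

    def put(self, key, value):
        self._append(key, value)      # a rewrite appends, it never overwrites

    def delete(self, key):
        self._append(key, TOMBSTONE)  # a delete only records a marker

    def query(self, key, at_slice):
        """S230-S250: return the object's state as of time slice `at_slice`."""
        slices = self.op_record.get(key, [])
        position = bisect.bisect_right(slices, at_slice)  # S240
        if position == 0:
            return None               # the object did not exist yet
        value = self.spaces[slices[position - 1]][key]    # S250
        return None if value is TOMBSTONE else value


bucket = TimeSlicedBucket()
bucket.put("object1", "v1")           # time slice 1
bucket.put("object2", "data2")
bucket.advance()                      # time slice 2
bucket.put("object1", "v2")           # rewrite of storage object 1
snapshot_a = bucket.current_slice     # snapshot A taken in time slice 2
bucket.advance()                      # time slice 3
bucket.delete("object2")              # delete marker for storage object 2
```

With this state, `bucket.query("object2", snapshot_a)` still finds the data written in time slice 1, while a query at time slice 3 sees the delete marker and reports the object as absent.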
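Optionally, the operation record is divided into N time slices, and the operation record in each time slice includes time information of that time slice. New metadata is generated according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
Once a time slice has been designated as a snapshot, no operation on a storage object or the bucket deletes or overwrites the existing data and metadata. Instead, new metadata is generated according to an internal version number based on the time slice, and the bucket is accessed according to the latest time slice and version number. It should be understood that this access operation is turned on and off by the client, for example by sending an access request; the object storage system cannot turn it on by itself.
As described above, an object storage system accesses a storage object through the mapping between metadata and storage objects. Because the embodiments of this application add the concept of time slices to the object storage system, every operation on an object or the bucket can append the time slice information of that operation to the storage object's metadata, and the metadata system needs to add an index on time slices, so that metadata can be looked up and listed efficiently by time slice. When a storage object is accessed, its metadata can be queried through the time slice index, and the storage object is then accessed according to that metadata.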
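One possible shape for metadata that carries time slice information, together with a time slice index, is sketched below in Python. The record fields (`key`, `time_slice`, `data_id`, `op`) and the class name `MetadataStore` are assumptions for illustration; the application does not specify a metadata layout.

```python
from collections import defaultdict


class MetadataStore:
    """Append-only metadata records with a secondary index on time slice."""

    def __init__(self):
        self.records = []                  # every metadata version, in order
        self.by_slice = defaultdict(list)  # time slice -> positions in records

    def append(self, key, time_slice, data_id, op):
        # New metadata is generated for each operation; older versions stay.
        record = {"key": key, "time_slice": time_slice,
                  "data_id": data_id, "op": op}
        self.by_slice[time_slice].append(len(self.records))
        self.records.append(record)
        return record

    def list_slice(self, time_slice):
        """List all metadata records written in one time slice."""
        return [self.records[i] for i in self.by_slice.get(time_slice, [])]

    def latest(self, key, up_to_slice):
        """Return the newest record for `key` at or before `up_to_slice`."""
        eligible = sorted((s for s in self.by_slice if s <= up_to_slice),
                          reverse=True)
        for time_slice in eligible:
            for position in reversed(self.by_slice[time_slice]):
                if self.records[position]["key"] == key:
                    return self.records[position]
        return None
```

A production system would more likely index on the pair of object key and time slice so that `latest` avoids scanning whole slices; the linear scan here only keeps the sketch short.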
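Optionally, the first storage creates a first snapshot in a first time slice. When the first storage receives a read request for the first snapshot from a client, the object storage system may determine the first time slice according to the snapshot name of the first snapshot, and determine the metadata and the first storage object according to the first time slice.
When the client sends an access request to access the data of a given snapshot, the object storage system first finds the corresponding time slice by the snapshot name and then finds the corresponding metadata and data by that time slice.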
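The snapshot read path (snapshot name, then time slice, then metadata, then data) might look like the following Python sketch. The function `read_snapshot`, the plain-dictionary stores, and the `deleted` flag in each metadata record are hypothetical; the sketch also assumes at most one metadata record per object per time slice.

```python
def read_snapshot(snapshots, metadata, data_store, snapshot_name, key):
    """Read one object as it existed when the named snapshot was taken.

    snapshots:  snapshot name -> time slice in which it was created
    metadata:   list of dicts with key, time_slice, data_id, deleted
    data_store: data identifier -> payload
    """
    # Step 1: resolve the snapshot name to its time slice.
    time_slice = snapshots[snapshot_name]
    # Step 2: find the newest metadata for the object at or before that slice.
    visible = [m for m in metadata
               if m["key"] == key and m["time_slice"] <= time_slice]
    if not visible:
        return None
    latest = max(visible, key=lambda m: m["time_slice"])
    if latest["deleted"]:
        return None
    # Step 3: follow the metadata to the data itself.
    return data_store[latest["data_id"]]


snapshots = {"A": 2}
data_store = {"d1": "data2"}
metadata = [
    {"key": "object2", "time_slice": 1, "data_id": "d1", "deleted": False},
    {"key": "object2", "time_slice": 3, "data_id": None, "deleted": True},
]
```

Because the delete in time slice 3 only added a marker, reading `object2` through snapshot `A` still resolves to the data written in time slice 1.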
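In another possible implementation, the first storage creates a first snapshot in a first time slice. When the first storage receives a rollback request for the first snapshot from a client, the method further includes: determining a first period according to the time slice in which the rollback request falls and the first time slice; and deleting all the operation records within the first period.
When the client sends a request to roll back to the data of a given snapshot, the system, in the background, actually deletes all changes between the rollback time slice and the snapshot time slice; meanwhile, foreground accesses ignore all change records within this period.
Specifically, taking the bucket in FIG. 3 as an example, snapshot A is generated in the second storage space, corresponding to time slice 2. When a rollback to snapshot A is initiated in the sixth storage space, corresponding to time slice 6, all operations between time slice 2 and time slice 6 are deleted, returning the bucket to its storage state at time slice 2.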
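A rollback of this kind reduces to dropping the operation records that fall inside the first period, as the Python sketch below shows. The function name `rollback` and the record layout are hypothetical, and the boundary handling is an assumption: records in the snapshot's own time slice are kept, while records after it, up to and including the rollback time slice, are removed. The text does not state the boundaries precisely.

```python
def rollback(op_records, snapshot_slice, rollback_slice):
    """Return the operation records that survive a rollback to a snapshot.

    The first period is taken to be (snapshot_slice, rollback_slice]:
    this interval is an assumption, chosen so that the bucket ends up in
    its state as of the snapshot's time slice.
    """
    if rollback_slice < snapshot_slice:
        raise ValueError("rollback request precedes the snapshot")
    return [record for record in op_records
            if not (snapshot_slice < record["time_slice"] <= rollback_slice)]


op_records = [
    {"time_slice": 1, "key": "object1", "op": "put"},
    {"time_slice": 1, "key": "object2", "op": "put"},
    {"time_slice": 2, "key": "object1", "op": "rewrite"},
    {"time_slice": 3, "key": "object2", "op": "delete"},
    {"time_slice": 5, "key": "object1", "op": "rewrite"},
]
# Snapshot A was taken in time slice 2; the rollback arrives in time slice 6.
remaining = rollback(op_records, snapshot_slice=2, rollback_slice=6)
```

After the call, only the three records from time slices 1 and 2 remain, so the delete of storage object 2 in time slice 3 is undone along with the later rewrite.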
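It should be understood that the technical solution provided in this application is well suited to cloud storage, because cloud storage has sufficiently large storage space. When the data in cloud storage is important and deletions and changes are not especially frequent, the entire bucket can be quickly rolled back to a given snapshot without affecting its service capability.
With the technical solution provided in the embodiments of this application, dividing storage into time slices lets a client, when accessing storage objects, create a snapshot and then read, in read-only mode, the contents of all objects in that snapshot. With this time-slice-based snapshot method, every point in time after the first snapshot can form a snapshot, no extra storage overhead is introduced, and taking a snapshot does not affect existing services or read/write performance at all. In addition, when a client undergoes a major service change or has an important bucket, the bucket can be protected by means of snapshots, and if something goes wrong, the whole bucket can be rolled back, achieving data protection.
The method for querying data in the embodiments of this application has been described in detail above with reference to FIG. 1 to FIG. 3. The apparatus for querying data in the embodiments of this application is described in detail below with reference to FIG. 4 and FIG. 5.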
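FIG. 4 is a schematic block diagram of an apparatus 400 for querying data according to an embodiment of this application. The apparatus 400 may correspond to the object storage system described in the method 200 above, or may be a chip or component applied to that object storage system. Each module or unit in the apparatus 400 is configured to perform the corresponding action or processing performed by the object storage system in the method 200. As shown in FIG. 4, the apparatus 400 may include a generating unit 410, a receiving unit 420, and a determining unit 430.
The generating unit 410 is configured to generate a mapping relationship, where the mapping relationship indicates a one-to-one correspondence between N operations on a first storage and N storage spaces, the N operations occur at different times, a first storage space of the N storage spaces is used to store a first storage object as processed by a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces.
The generating unit 410 is further configured to generate an operation record, where the operation record records the times at which the N operations occur.
The receiving unit 420 is configured to receive a first query request, where the first query request requests the storage state of the first storage object at a first moment.
The determining unit 430 is configured to determine, according to the operation record, the first operation preceding the first moment from among the N operations.
The determining unit 430 is further configured to determine, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
Optionally, the operation record is divided into N time slices, and the operation record in each time slice includes time information of that time slice.
Optionally, the generating unit 410 is further configured to generate new metadata according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
In a possible implementation, the first storage creates a first snapshot in a first time slice, and when the first storage receives a read request for the first snapshot from a client, the determining unit 430 is further configured to: determine the first time slice according to the snapshot name of the first snapshot; and determine the metadata and the first storage object according to the first time slice.
In another possible implementation, the first storage creates a first snapshot in a first time slice, and when the first storage receives a rollback request for the first snapshot from a client, the determining unit is further configured to: determine a first period according to the time slice in which the rollback request falls and the first time slice; and delete all the operation records within the first period.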
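FIG. 5 is a schematic structural diagram of an apparatus for querying data according to another embodiment of this application. It should be understood that the apparatus 500 shown in FIG. 5 is merely an example; the apparatus for querying data in the embodiments of this application may further include other modules or units, may include modules similar in function to the modules in FIG. 5, or need not include all the modules in FIG. 5.
The memory 510 stores program code for implementing the functions of the modules shown in FIG. 2.
The processor 520 is configured to execute the program code stored in the memory 510.
The communication interface 530 is configured to communicate with a storage device or storage medium, to read data from the storage device or storage medium, or to write data to it.
Specifically, when the processor executes the program code stored in the memory 510, the processor 520 may invoke the communication interface 530 to implement the method for querying data in FIG. 2. For brevity, details are not repeated here.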
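A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.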

Claims (12)

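  1. A method for querying data, comprising:
    generating a mapping relationship, wherein the mapping relationship indicates a one-to-one correspondence between N operations on a first storage and N storage spaces, the N operations occur at different times, a first storage space of the N storage spaces is used to store a first storage object as processed by a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces;
    generating an operation record, wherein the operation record records the times at which the N operations occur;
    receiving a first query request, wherein the first query request requests the storage state of the first storage object at a first moment;
    determining, according to the operation record, the first operation preceding the first moment from among the N operations; and
    determining, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
  2. The method according to claim 1, wherein the operation record is divided into N time slices, and the operation record in each time slice comprises time information of the time slice.
  3. The method according to claim 2, wherein generating the operation record comprises:
    generating new metadata according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
  4. The method according to any one of claims 1 to 3, wherein the first storage creates a first snapshot in a first time slice, and when the first storage receives a read request for the first snapshot from a client, the method further comprises:
    determining the first time slice according to the snapshot name of the first snapshot; and
    determining the metadata and the first storage object according to the first time slice.
  5. The method according to any one of claims 1 to 3, wherein the first storage creates a first snapshot in a first time slice, and when the first storage receives a rollback request for the first snapshot from a client, the method further comprises:
    determining a first period according to the time slice in which the rollback request falls and the first time slice; and
    deleting all the operation records within the first period.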
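  6. An apparatus for querying data, comprising:
    a generating unit, configured to generate a mapping relationship, wherein the mapping relationship indicates a one-to-one correspondence between N operations on a first storage and N storage spaces, the N operations occur at different times, a first storage space of the N storage spaces is used to store a first storage object as processed by a first operation, the first operation corresponds to the first storage space, and the first storage space is any one of the N storage spaces;
    the generating unit being further configured to generate an operation record, wherein the operation record records the times at which the N operations occur;
    a receiving unit, configured to receive a first query request, wherein the first query request requests the storage state of the first storage object at a first moment; and
    a determining unit, configured to determine, according to the operation record, the first operation preceding the first moment from among the N operations;
    the determining unit being further configured to determine, according to the mapping relationship, the first storage object stored in the storage space corresponding to the first operation.
  7. The apparatus according to claim 6, wherein the operation record is divided into N time slices, and the operation record in each time slice comprises time information of the time slice.
  8. The apparatus according to claim 7, wherein the generating unit is further configured to:
    generate new metadata according to the time information of the time slice of the first operation and the metadata corresponding to the first storage object.
  9. The apparatus according to any one of claims 6 to 8, wherein the first storage creates a first snapshot in a first time slice, and when the first storage receives a read request for the first snapshot from a client, the determining unit is further configured to:
    determine the first time slice according to the snapshot name of the first snapshot; and
    determine the metadata and the first storage object according to the first time slice.
  10. The apparatus according to any one of claims 6 to 8, wherein the first storage creates a first snapshot in a first time slice, and when the first storage receives a rollback request for the first snapshot from a client, the determining unit is further configured to:
    determine a first period according to the time slice in which the rollback request falls and the first time slice; and
    delete all the operation records within the first period.
  11. An apparatus for querying data, wherein the snapshot processing apparatus comprises a processor and a communication interface, the processor is configured to execute a program, and when the processor executes the code, the processor and the communication interface are configured to implement the method for querying data according to any one of claims 1 to 5.
  12. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program runs on a computer, the computer is caused to perform the method for querying data according to any one of claims 1 to 5.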
PCT/CN2019/095103 2018-08-03 2019-07-08 Method and apparatus for querying data WO2020024772A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19845149.4A EP3822762A4 (en) 2018-08-03 2019-07-08 DATA INTERROGATION PROCESS AND APPARATUS
US17/164,981 US11579986B2 (en) 2018-08-03 2021-02-02 Data query method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810879580.4A CN109144416B (zh) 2018-08-03 2018-08-03 Method and apparatus for querying data
CN201810879580.4 2018-08-03

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/164,981 Continuation US11579986B2 (en) 2018-08-03 2021-02-02 Data query method and apparatus

Publications (1)

Publication Number Publication Date
WO2020024772A1 (zh) 2020-02-06

Family

ID=64791418

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/095103 WO2020024772A1 (zh) 2018-08-03 2019-07-08 Method and apparatus for querying data

Country Status (4)

Country Link
US (1) US11579986B2 (zh)
EP (1) EP3822762A4 (zh)
CN (1) CN109144416B (zh)
WO (1) WO2020024772A1 (zh)

Also Published As

Publication number Publication date
US11579986B2 (en) 2023-02-14
CN109144416A (zh) 2019-01-04
US20210157685A1 (en) 2021-05-27
CN109144416B (zh) 2020-04-28
EP3822762A1 (en) 2021-05-19
EP3822762A4 (en) 2021-09-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19845149

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE