WO2019062574A1 - Metadata query method and apparatus - Google Patents

Metadata query method and apparatus

Info

Publication number
WO2019062574A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
identifier
volume
data block
snapshot
Prior art date
Application number
PCT/CN2018/105969
Other languages
English (en)
French (fr)
Inventor
李勇
杨忠兵
涂妍
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP18861134.7A (patent EP3678015B1)
Publication of WO2019062574A1
Priority to US16/831,005 (patent US11474972B2)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/14 - Details of searching files based on file metadata
    • G06F16/148 - File search processing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 - Error detection or correction of the data by redundancy in operation
    • G06F11/1402 - Saving, restoring, recovering or retrying
    • G06F11/1446 - Point-in-time backing up or restoration of persistent data
    • G06F11/1458 - Management of the backup or restore process
    • G06F11/1469 - Backup restoration techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 - Error detection or correction of the data by redundancy in operation
    • G06F11/1402 - Saving, restoring, recovering or retrying
    • G06F11/1471 - Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/11 - File system administration, e.g. details of archiving or snapshots
    • G06F16/128 - Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/14 - Details of searching files based on file metadata
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/14 - Details of searching files based on file metadata
    • G06F16/156 - Query results presentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/18 - File system types
    • G06F16/182 - Distributed file systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G06F16/24553 - Query execution of query operations
    • G06F16/24554 - Unary operations; Data partitioning operations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • The embodiments of the present invention relate to the field of data storage, and in particular, to a metadata query method and apparatus.
  • Snapshots are an indispensable storage feature. Almost all cloud storage services, such as AWS and Facebook Cloud, support a snapshot function.
  • A snapshot provides an image of the system at a point in time, so that the system can be restored to an available point in time in the event of a system failure.
  • A snapshot volume can reuse the data of the base volume, which effectively reduces the storage space overhead of snapshots.
  • Metadata refers to information that describes the attributes of data in a snapshot volume and supports functions such as indicating storage location, recording historical data, resource lookup, and file logging.
  • Snapshot organizations such as tree snapshots and chain snapshots are used to reduce the metadata information of snapshots.
  • Tree snapshots require multiple copies of metadata when a snapshot is formed, but there are no close dependencies between snapshot devices.
  • Chain snapshots require only a small amount of metadata copying, but there are tight dependencies between snapshot devices.
  • In a chain snapshot, a snapshot volume records only the incremental metadata generated after the most recent snapshot point, so metadata space is saved.
  • However, the incremental metadata stored on a snapshot volume often cannot cover all the metadata, so a lookup has to search multiple snapshot volumes and the base volume according to their dependencies, and the search time overhead is relatively large.
  • The embodiments of the invention provide a metadata query method and apparatus that address the large search time incurred in the prior art when metadata is searched for in multiple snapshot volumes and base volumes, and that improve the efficiency of metadata lookup.
  • The first aspect provides a metadata query method applied to a chain snapshot. The method comprises: receiving a metadata query request, where the metadata query request includes a volume identifier of a first snapshot volume and a data block identifier; obtaining a first timing identifier from the first snapshot volume according to the volume identifier of the first snapshot volume, where the first timing identifier indicates the creation timing of the first snapshot volume; querying historical index information according to the data block identifier and the first timing identifier, where the historical index information includes a correspondence between query data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query timing interval; and, when the data block identifier exists in the historical index information and the first timing identifier is within the query timing interval corresponding to the data block identifier, obtaining the corresponding target volume identifier from the historical index information and obtaining the address metadata corresponding to the data block identifier from the second snapshot volume indicated by the target volume identifier.
  • In this technical solution, when a metadata query request is received, the historical index information may be queried first. When the data block identifier exists in the historical index information and the first timing identifier is within the query timing interval corresponding to the data block identifier, the corresponding target volume identifier is obtained from the historical index information, and the address metadata corresponding to the data block identifier is obtained from the second snapshot volume indicated by the target volume identifier. Because the historical index information stores the correspondence between query data block identifiers and historical query snapshot information, the step-by-step search of the prior art can be avoided, which reduces search time and improves metadata query efficiency.
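  • The lookup described above can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the names `QueryRecord`, `HistoryIndex`, and `lookup_target_volume`, the dictionary layout, and the inclusive interval bounds are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class QueryRecord:
    """One cached result (hypothetical layout): the volume that held the
    address metadata and the timing-identifier range the answer covers."""
    volume_id: str
    interval_low: int   # smallest timing identifier covered (assumed inclusive)
    interval_high: int  # largest timing identifier covered (assumed inclusive)


# Historical index information: data block identifier -> cached records.
HistoryIndex = Dict[int, List[QueryRecord]]


def lookup_target_volume(index: HistoryIndex, block_id: int,
                         timing_id: int) -> Optional[str]:
    """Return the cached target volume identifier, or None on a miss."""
    for record in index.get(block_id, []):
        if record.interval_low <= timing_id <= record.interval_high:
            return record.volume_id
    return None


history: HistoryIndex = {7: [QueryRecord("snap-5", 2, 5)]}
hit = lookup_target_volume(history, block_id=7, timing_id=3)
miss_interval = lookup_target_volume(history, block_id=7, timing_id=6)
miss_block = lookup_target_volume(history, block_id=8, timing_id=3)
```

  • A miss (`None`) corresponds to the case in which the caller falls back to the step-by-step search along the chain.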
  • The historical query snapshot information includes at least one query volume identifier, and a query timing identifier and a query hop count corresponding to each query volume identifier. The query timing identifier and the query hop count corresponding to a query volume identifier indicate the query timing interval corresponding to that query volume identifier.
  • Before the historical index information is queried according to the data block identifier and the first timing identifier, the method further includes: querying location identification information according to the data block identifier, where the location identification information indicates the volume location of the address metadata corresponding to the data block identifier; and, when the volume location of the address metadata is determined to be the base volume, obtaining the address metadata from the base volume according to the data block identifier. Correspondingly, querying the historical index information according to the data block identifier and the first timing identifier specifically includes: when the volume location of the address metadata is determined to be a snapshot volume, querying the historical index information according to the data block identifier and the first timing identifier.
  • In this way, whether the address metadata corresponding to the data block identifier is located in the base volume or in a snapshot volume can be determined quickly, and the metadata is then queried from the base volume or a snapshot volume accordingly, which improves query efficiency and reduces the storage space needed for the historical index information.
  • The location identification information includes at least one data block identifier and the latest snapshot timing identifier corresponding to each of the at least one data block identifier. The method further includes: when the data block identifier exists in the location identification information and the first timing identifier is smaller than the latest snapshot timing identifier corresponding to the data block identifier, determining that the volume location of the address metadata is a snapshot volume.
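  • The location check above can be illustrated with a short Python sketch. The names `LocationInfo` and `metadata_in_snapshot_volume` are hypothetical, and treating an unknown data block identifier as "base volume" is an assumption; the text only states the condition under which the location is a snapshot volume.

```python
from typing import Dict

# Location identification information (assumed layout): data block
# identifier -> latest snapshot timing identifier recorded for that block.
LocationInfo = Dict[int, int]


def metadata_in_snapshot_volume(location: LocationInfo, block_id: int,
                                timing_id: int) -> bool:
    """True if the address metadata should be looked up in a snapshot volume."""
    latest = location.get(block_id)
    if latest is None:
        # Assumption: a block with no entry is served from the base volume.
        return False
    # Condition from the text: first timing identifier < latest snapshot
    # timing identifier for this data block.
    return timing_id < latest


location: LocationInfo = {7: 5}
in_snapshot = metadata_in_snapshot_volume(location, block_id=7, timing_id=3)
at_latest = metadata_in_snapshot_volume(location, block_id=7, timing_id=5)
unknown_block = metadata_in_snapshot_volume(location, block_id=9, timing_id=3)
```

  • Only when the check returns `True` does the historical index information need to be consulted, which is why the text notes that this step also keeps the historical index small.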
  • When the data block identifier does not exist in the historical index information, or the data block identifier exists in the historical index information but the first timing identifier is not within the query timing interval corresponding to the data block identifier, the method further includes: searching step by step starting from the first snapshot volume to determine the volume identifier of the third snapshot volume where the address metadata corresponding to the data block identifier is located, together with a target timing interval; and updating the historical index information according to the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval.
  • In this way, the volume identifier of the third snapshot volume where the address metadata is located and the target timing interval are determined by the step-by-step search, and the historical index information is updated with this information, which reduces the time overhead of subsequent searches for the address metadata and improves metadata query efficiency.
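  • The miss path above can be sketched in Python as follows. This is an illustration under stated assumptions, not the method as claimed: the classes `SnapshotVolume` and `SearchResult`, the function `search_step_by_step`, the choice to walk volumes in increasing timing-identifier order, and the definition of the target timing interval as (first timing identifier, timing identifier of the volume found) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SnapshotVolume:
    volume_id: str
    timing_id: int
    address_metadata: Dict[int, int] = field(default_factory=dict)


@dataclass
class SearchResult:
    volume_id: str             # the "third snapshot volume" in the text
    address: int
    hops: int                  # volumes skipped before the metadata was found
    interval: Tuple[int, int]  # assumed target timing interval (inclusive)


def search_step_by_step(chain: List[SnapshotVolume], first_timing_id: int,
                        block_id: int) -> Optional[SearchResult]:
    # Assumption: the walk starts at the first snapshot volume and moves
    # through snapshot volumes created after it.
    later = sorted((v for v in chain if v.timing_id >= first_timing_id),
                   key=lambda v: v.timing_id)
    for hops, volume in enumerate(later):
        if block_id in volume.address_metadata:
            return SearchResult(volume.volume_id,
                                volume.address_metadata[block_id], hops,
                                (first_timing_id, volume.timing_id))
    return None  # not found in any snapshot volume


chain = [SnapshotVolume("snap-1", 1), SnapshotVolume("snap-2", 2),
         SnapshotVolume("snap-3", 3, {7: 0xA000})]
result = search_step_by_step(chain, first_timing_id=1, block_id=7)

# Update the historical index so the next query for block 7 skips the walk.
history: Dict[int, List[Tuple[str, Tuple[int, int]]]] = {}
if result is not None:
    history.setdefault(7, []).append((result.volume_id, result.interval))
```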
  • The historical index information includes M partitions, each partition is used to store N query records, and one query record includes the correspondence between a historical data block identifier and a query volume identifier and query timing interval, where M and N are positive integers.
  • Updating the historical index information according to the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval includes: when the number of query records in a first partition of the M partitions is less than N, storing the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval in the first partition.
  • In this way, the historical index information is updated based on the query result, which reduces the time overhead of searching for the address metadata again and improves metadata query efficiency.
  • Updating the historical index information according to the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval includes: when the number of query records in the first partition of the M partitions is equal to N, and the target query hop count of the target timing interval is greater than a query hop count in the N query records of the first partition, replacing the query record with the smallest query hop count in the first partition with the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval.
  • In this way, the query record with the smallest query hop count in the historical index information is replaced based on the query result, which reduces the search time overhead for metadata with a large query hop count and improves metadata query efficiency.
  • Updating the historical index information according to the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval includes: when the number of query records in the first partition of the M partitions is equal to N, and the target query hop count of the target timing interval is smaller than the query hop counts in the N query records of the first partition, storing the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval in a second partition.
  • In this way, the historical index information is updated based on the query result, which reduces the time overhead of searching for the address metadata again and improves metadata query efficiency.
  • When no query record is stored in the second partition, the storage space corresponding to the second partition is empty. Before the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval is stored in the second partition, the method further includes: allocating, to the second partition, storage space for storing N query records.
  • In this way, no storage space is allocated for the second partition until a record needs to be stored in it, which improves storage utilization and avoids wasting storage space.
  • Each of the M partitions corresponds to a hop count threshold, and the query hop counts in the N query records stored in a partition are greater than or equal to the hop count threshold corresponding to that partition.
  • Because each of the M partitions corresponds to a hop count threshold, querying the historical index information is more efficient, which reduces the search time overhead for metadata and improves metadata query efficiency.
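  • The partition update rules above can be sketched in Python as follows. This is a simplified illustration under assumptions, not the claimed design: the names `Record` and `PartitionedHistoryIndex` are hypothetical, "greater than the query hop count in the N query records" is read as "greater than the smallest hop count in the partition", the same rule is applied to each later partition in turn, a record is dropped when no partition accepts it, and the per-partition hop count thresholds are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    block_id: int
    volume_id: str
    hops: int  # query hop count of the cached lookup


class PartitionedHistoryIndex:
    def __init__(self, m: int, n: int) -> None:
        self.n = n
        # None means no storage has been allocated for that partition yet.
        self.partitions: List[Optional[List[Record]]] = [None] * m

    def insert(self, record: Record) -> bool:
        """Store the record; return False if no partition accepted it."""
        for i in range(len(self.partitions)):
            if self.partitions[i] is None:
                self.partitions[i] = []  # allocate storage on first use
            part = self.partitions[i]
            if len(part) < self.n:
                part.append(record)
                return True
            smallest = min(part, key=lambda r: r.hops)
            if record.hops > smallest.hops:
                # Full partition: evict the cheapest-to-recompute record.
                part[part.index(smallest)] = record
                return True
            # Otherwise try the next partition.
        return False


index = PartitionedHistoryIndex(m=2, n=2)
index.insert(Record(1, "snap-a", hops=5))
index.insert(Record(2, "snap-b", hops=3))
index.insert(Record(3, "snap-c", hops=4))   # replaces the hops=3 record
second_before = index.partitions[1]         # still unallocated here
index.insert(Record(4, "snap-d", hops=1))   # spills into the second partition
```

  • Keeping the records with the largest hop counts means the cache favours exactly the lookups that would be most expensive to repeat.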
  • A metadata query apparatus applied to a chain snapshot is provided. The apparatus comprises: a receiving unit, configured to receive a metadata query request, where the metadata query request includes a volume identifier of a first snapshot volume and a data block identifier; an obtaining unit, configured to obtain a first timing identifier from the first snapshot volume according to the volume identifier of the first snapshot volume, where the first timing identifier indicates the creation timing of the first snapshot volume; and a query unit, configured to query historical index information according to the data block identifier and the first timing identifier, where the historical index information includes a correspondence between query data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query timing interval. The obtaining unit is further configured to: when the data block identifier exists in the historical index information and the first timing identifier is within the corresponding query timing interval, obtain the corresponding target volume identifier from the historical index information, and obtain the address metadata corresponding to the data block identifier from the second snapshot volume indicated by the target volume identifier.
  • The historical query snapshot information includes at least one query volume identifier, and a query timing identifier and a query hop count corresponding to each query volume identifier; the query timing identifier and the query hop count are used to indicate the query timing interval.
  • The query unit is further configured to query location identification information according to the data block identifier, where the location identification information indicates the volume location of the address metadata corresponding to the data block identifier. The obtaining unit is further configured to: when the volume location of the address metadata is determined to be the base volume, obtain the address metadata from the base volume according to the data block identifier. Correspondingly, the query unit is further configured to: when the volume location of the address metadata is determined to be a snapshot volume, query the historical index information according to the data block identifier and the first timing identifier.
  • The location identification information includes at least one data block identifier and the latest snapshot timing identifier corresponding to each of the at least one data block identifier. The query unit is further configured to: when the data block identifier exists in the location identification information and the first timing identifier is smaller than the latest snapshot timing identifier corresponding to the data block identifier, determine that the volume location of the address metadata is a snapshot volume.
  • The query unit is further configured to: when the data block identifier does not exist in the historical index information, or the first timing identifier is not within the corresponding query timing interval, search step by step starting from the first snapshot volume to determine the volume identifier of the third snapshot volume where the address metadata corresponding to the data block identifier is located, together with a target timing interval. The apparatus further includes an update unit, configured to update the historical index information according to the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval.
  • The historical index information includes M partitions, each partition is used to store N query records, and one query record includes the correspondence between a historical data block identifier and a query volume identifier and query timing interval, where M and N are positive integers.
  • The update unit is further configured to: when the number of query records in a first partition of the M partitions is less than N, store the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval in the first partition.
  • The update unit is further configured to: when the number of query records in the first partition of the M partitions is equal to N, and the target query hop count of the target timing interval is greater than a query hop count in the N query records of the first partition, replace the query record with the smallest query hop count in the first partition with the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval.
  • The update unit is further configured to: when the number of query records in the first partition of the M partitions is equal to N, and the target query hop count of the target timing interval is smaller than the query hop counts in the N query records of the first partition, store the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target timing interval in a second partition.
  • When no query record is stored in the second partition, the storage space corresponding to the second partition is empty, and the apparatus further includes an allocating unit, configured to allocate, to the second partition, storage space for storing N query records.
  • Each of the M partitions corresponds to a hop count threshold, and the query hop counts in the N query records stored in a partition are greater than or equal to the hop count threshold corresponding to that partition.
  • An apparatus is provided, comprising a memory, a processor, a bus, and a communication interface, where the memory stores code and data, the processor and the memory are connected by the bus, and the processor runs the code in the memory to cause the apparatus to perform the metadata query method provided by the first aspect or any possible implementation of the first aspect.
  • A still further aspect of the present application provides a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the metadata query method provided by the first aspect or any possible implementation of the first aspect.
  • A computer program product comprising instructions is provided; when the instructions are run on a computer, they cause the computer to perform the metadata query method provided by the first aspect or any possible implementation of the first aspect.
  • FIG. 1 is a schematic structural diagram of a chain snapshot according to an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a metadata index of a volume according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of a metadata query method according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of metadata organization of a snapshot volume according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of metadata organization of a base volume according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of historical index information according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of searching for address metadata according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic flowchart of another metadata query method according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of a metadata query apparatus according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of another metadata query apparatus according to an embodiment of the present application.
  • Snapshots are an indispensable storage feature. Almost all cloud storage services, such as AWS and Facebook Cloud, support a snapshot function.
  • A snapshot provides an image of the system at a point in time, so that the system can be restored to an available point in time in the event of a system failure.
  • A snapshot can refer to a fully available copy of a specified data set, which includes an image of the corresponding data at a point in time (the point in time at which the copy begins).
  • A snapshot can be a copy of the data it represents, or it can be a replica of the data.
  • A snapshot volume can effectively reduce storage space overhead by reusing the data of the source volume (also known as the base volume).
  • Most snapshot implementations optimize the storage space of snapshot data, for example by using redirect-on-write (ROW) or copy-on-write (COW) snapshot technology.
  • ROW means that during snapshot generation, all write operations are redirected to another storage medium.
  • COW means that when data is written to a storage medium for the first time, the original content is first read out and written to another storage medium, and the new data is then written to the original storage medium.
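  • The difference between the two techniques can be seen in a toy in-memory model. The sketch below is only an illustration: the classes `CowVolume` and `RowVolume` and their methods are hypothetical, each supports a single snapshot, and Python dictionaries stand in for storage media.

```python
class CowVolume:
    """Copy-on-write: preserve the old block elsewhere, then overwrite in place."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # live data, always current
        self.snapshot_store = None   # old contents saved after the snapshot

    def take_snapshot(self):
        self.snapshot_store = {}

    def write(self, block_id, data):
        first_write = (self.snapshot_store is not None
                       and block_id not in self.snapshot_store)
        if first_write:
            # Read out the original content and save it before overwriting.
            self.snapshot_store[block_id] = self.blocks.get(block_id)
        self.blocks[block_id] = data

    def read_snapshot(self, block_id):
        if self.snapshot_store is not None and block_id in self.snapshot_store:
            return self.snapshot_store[block_id]
        return self.blocks.get(block_id)  # unchanged since the snapshot


class RowVolume:
    """Redirect-on-write: leave the old block alone, send new writes elsewhere."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # frozen once a snapshot exists
        self.redirected = None       # new writes made after the snapshot

    def take_snapshot(self):
        self.redirected = {}

    def write(self, block_id, data):
        if self.redirected is None:
            self.blocks[block_id] = data
        else:
            self.redirected[block_id] = data

    def read(self, block_id):
        if self.redirected is not None and block_id in self.redirected:
            return self.redirected[block_id]
        return self.blocks.get(block_id)

    def read_snapshot(self, block_id):
        return self.blocks.get(block_id)


cow = CowVolume({0: "old"})
cow.take_snapshot()
cow.write(0, "new")

row = RowVolume({0: "old"})
row.take_snapshot()
row.write(0, "new")
```

  • In both models the snapshot still reads "old" while the live volume reads "new"; the two differ in whether the old or the new data is the one that moves.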
  • The data in a snapshot volume is very sparse.
  • The data of multiple snapshots either points directly to the data of the source volume or preserves old data from the source volume. Therefore, across multiple snapshots of the same source volume, metadata is highly duplicated because of the sparse nature of the data.
  • Metadata refers to information that describes the data stored on a volume, including the access rights of the data, its modification time, its address, and the like.
  • The address metadata in the present application refers to the address of the data.
  • Chain snapshots reduce the metadata information of snapshots by sharing metadata, which optimizes the storage space of snapshot volume metadata.
  • FIG. 1 is a schematic diagram of a chain snapshot.
  • The structure of a chain snapshot includes a base volume and snapshot volumes. A newly created snapshot volume is inserted directly behind the base volume, so the base volume and the snapshot volumes form a chain structure similar to a linked list.
  • The position of a snapshot volume in a chain snapshot identifies the timing relationship between snapshot volumes: the earlier a snapshot volume was created, the further back it sits in the chain structure.
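  • The chain ordering described above can be modelled in a few lines of Python. The class `SnapshotChain`, its method names, and the use of an incrementing integer as the timing identifier are assumptions made for illustration only.

```python
class SnapshotChain:
    """Toy model of a chain snapshot: base volume first, newest snapshot next."""

    def __init__(self, base_volume_id):
        self.base_volume_id = base_volume_id
        self.snapshots = []          # index 0 sits directly behind the base volume
        self._next_timing_id = 1     # assumed: timing identifiers count upward

    def create_snapshot(self, volume_id):
        timing_id = self._next_timing_id
        self._next_timing_id += 1
        # A newly created snapshot volume is inserted right behind the base volume.
        self.snapshots.insert(0, (volume_id, timing_id))
        return timing_id

    def order(self):
        """Volume identifiers from the head of the chain to its tail."""
        return [self.base_volume_id] + [vid for vid, _ in self.snapshots]


chain = SnapshotChain("base")
for name in ("snap-1", "snap-2", "snap-3"):
    chain.create_snapshot(name)
chain_order = chain.order()
```

  • The earliest snapshot (`snap-1`) ends up at the tail of the chain, matching the statement that older snapshots sit further back.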
  • The embodiments of the present application concern the metadata of a chain snapshot.
  • The metadata index of a volume uses a two-level indexing mechanism.
  • The second-level index is the metadata that stores the addresses of the data in the snapshot volume, and the first-level index is the metadata that stores the addresses of the second-level index.
  • The storage unit corresponding to the first-level index and the second-level index may be a container.
  • The capacity of each container may be 8 megabytes (MB).
  • The container of the first-level index is composed of a plurality of metadata units, each 16 bytes (B) in size, and the minimum unit of reading and writing can be one metadata unit.
  • The first-level index can be referred to as a Volume Container Map (VCM).
  • When the second-level index of a snapshot volume shares metadata on the snapshot chain, this is represented by a special identifier (for example, "-2" in FIG. 2).
  • The container of the second-level index can be divided into a plurality of blocks, each 4 kilobytes (KB) in size, and the minimum unit of reading and writing can be one block.
  • Each block can contain up to 256 metadata units (4 KB / 16 B), and "A" in FIG. 2 represents an address, namely the address of the container of the second-level index.
  • The parameter values above are exemplary. A person skilled in the art will understand that each value can be adjusted for different scenarios, which is not limited in the embodiments of the present application.
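  • The arithmetic implied by these exemplary sizes can be checked with a short Python sketch. It assumes binary units (1 MB = 1024 × 1024 bytes, 1 KB = 1024 bytes); the constant names and the helper `locate_unit` are hypothetical.

```python
# Exemplary sizes from the text, assuming binary units.
CONTAINER_SIZE = 8 * 1024 * 1024   # 8 MB container
METADATA_UNIT_SIZE = 16            # 16 B metadata unit
BLOCK_SIZE = 4 * 1024              # 4 KB block in a second-level container

units_per_block = BLOCK_SIZE // METADATA_UNIT_SIZE
blocks_per_container = CONTAINER_SIZE // BLOCK_SIZE
units_per_container = CONTAINER_SIZE // METADATA_UNIT_SIZE


def locate_unit(unit_index):
    """Map a metadata unit index within a container to
    (block number, byte offset inside that block)."""
    block = unit_index // units_per_block
    offset = (unit_index % units_per_block) * METADATA_UNIT_SIZE
    return block, offset


example_location = locate_unit(300)
```

  • With these values a block holds 256 units and a container holds 2048 blocks, so a single 4 KB read brings in the address metadata for 256 consecutive entries.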
  • The base volume has a complete set of metadata, and a snapshot volume can share metadata on the base volume and the snapshot chain.
  • In a chain snapshot, a snapshot volume records only the incremental metadata generated after the last snapshot point, so metadata space is saved. However, the incremental metadata stored on a snapshot volume often cannot cover all the metadata. When a piece of metadata is searched for, multiple snapshot volumes and the base volume have to be searched according to their dependencies, so the search time overhead is relatively large.
  • The embodiments of the present application therefore use historical index information, which stores the metadata information found over a preceding period of time. The next time metadata is searched for, the historical index information can be consulted directly, which avoids a step-by-step search along the chain of volumes and greatly reduces search time.
  • In addition, the historical index information stores the metadata information whose historical lookups were relatively time-consuming (that is, whose query hop count is relatively large), which improves metadata query efficiency while keeping the metadata storage space small.
  • FIG. 3 is a schematic flowchart of a metadata query method according to an embodiment of the present disclosure. Referring to FIG. 3, the method is applied to an electronic device that manages a chain snapshot and includes the following steps.
  • Step 301: Receive a metadata query request, where the metadata query request includes a volume identifier of a first snapshot volume and a data block identifier.
  • The electronic device managing the chain snapshot may be a server or a management device. It may be used to manage and maintain the base volume and the snapshot volumes, for example, to manage the metadata in the base volume and the snapshot volumes, to create or delete a snapshot volume, and to manage the data blocks of the base volume and the snapshot volumes.
  • When an upper-layer application of the chain snapshot (for example, a file system or a database) performs an information query (for example, querying the volume size), the corresponding metadata query process is triggered, so that the electronic device managing the chain snapshot receives the metadata query request. The metadata query request includes the volume identifier of the first snapshot volume and the data block identifier. The volume identifier of the first snapshot volume is the volume identifier of the snapshot volume currently used by the upper-layer application, and the metadata query request is used to query the address metadata of the data block indicated by the data block identifier, that is, the address of the storage space in which the data block is stored.
  • The address metadata corresponding to the data block identifier may first be queried from the first snapshot volume according to the volume identifier of the first snapshot volume. If the address metadata is in the first snapshot volume, it is obtained directly from the first snapshot volume. If the address metadata is not in the first snapshot volume, it is metadata that the first snapshot volume shares from the base volume or from a snapshot volume created after the first snapshot volume; that is, the address metadata is located in the base volume or in a snapshot volume created after the first snapshot volume, and step 302 below is performed.
  • Step 302 Acquire a first timing identifier from the first snapshot volume according to the volume identifier of the first snapshot volume, where the first timing identifier is used to indicate a creation timing of the first snapshot volume.
  • Each snapshot volume is assigned a volume identifier and a timing identifier.
  • the volume identifier of a snapshot volume is used to identify the snapshot volume.
  • the timing identifier of a snapshot volume is used to indicate the creation timing of the snapshot volume.
  • the timing identifier can be stored in the corresponding snapshot volume.
  • the electronic device may determine the first snapshot volume according to the volume identifier of the first snapshot volume included in the metadata query request, and obtain the creation timing of the first snapshot volume from the first snapshot volume, that is, the first timing identifier.
  • For example, FIG. 4 shows a metadata organization of a snapshot volume.
  • The LUNID in FIG. 4 is the volume identifier of the base volume, which identifies the storage space used for storing metadata.
  • The SID is the volume identifier of the snapshot volume and uniquely identifies the snapshot volume.
  • The TID is the timing identifier of the snapshot volume and indicates the creation timing of the snapshot volume. The order in which multiple snapshot volumes were created can be determined from their timing identifiers.
  • The other metadata information in FIG. 4 represents further metadata of the snapshot volume, such as the address metadata of the data on the snapshot volume and the access rights of the snapshot volume.
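
The metadata layout of FIG. 4 and steps 301 and 302 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the class name `SnapshotVolumeMeta`, the function `start_query`, and the use of an in-memory dictionary for address metadata are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class SnapshotVolumeMeta:
    """Hypothetical in-memory model of the fields named in FIG. 4."""
    lunid: str   # volume identifier of the base volume
    sid: str     # volume identifier of this snapshot volume
    tid: int     # timing identifier (creation order)
    # Assumed representation: data block identifier (VBN) -> storage address.
    address_metadata: Dict[str, int] = field(default_factory=dict)


def start_query(volumes: Dict[str, SnapshotVolumeMeta],
                sid: str, vbn: str) -> Tuple[str, int]:
    """Steps 301-302: look in the first snapshot volume; on a miss,
    return its timing identifier so the lookup can continue."""
    first = volumes[sid]
    if vbn in first.address_metadata:
        return ("address", first.address_metadata[vbn])
    return ("first_tid", first.tid)
```

A caller would continue with step 303 only when the result is tagged `"first_tid"`.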
  • Step 303 Query historical index information according to the data block identifier and the first time sequence identifier.
  • The historical index information includes a correspondence between query data block identifiers and historical query snapshot information, where the historical query snapshot information indicates a query volume identifier and a query timing interval.
  • The historical index information is a record of historical queries, that is, information about metadata that was queried before the current metadata lookup. The address of the storage space holding the historical index information may be stored on the base volume, so that the electronic device can obtain that address directly from the base volume, read the historical index information from it, and then query the historical index information according to the data block identifier and the first timing identifier.
  • other related information of the historical index information may also be stored in the base volume, which is not limited in this embodiment of the present application.
  • FIG. 5 takes as an example historical index information that includes four partitions, each corresponding to a hop count threshold.
  • In FIG. 5, LUNID identifies the storage space that stores the metadata of the base volume; P1, P2, P3, and P4 represent the addresses of the storage space occupied by the four partitions of the historical index information; TS1, TS2, TS3, and TS4 indicate the hop count thresholds of the four partitions; and the other metadata information represents further metadata of the base volume, for example, the address metadata of the data on the base volume and the access rights of the data on the base volume.
  • the historical index information may include a correspondence between the query data block identifier and the historical query snapshot information, where the historical query snapshot information is used to indicate the query volume identifier and the query timing interval.
  • The historical query snapshot information may include at least one query volume identifier, together with a query timing identifier and a query hop count corresponding to each query volume identifier. The query timing identifier and the query hop count may be used to indicate the query timing interval.
  • The query timing identifier corresponding to a query volume identifier indicates the creation timing of the snapshot volume identified by that query volume identifier. For example, TID_10 may indicate that the snapshot volume identified by SID1 is the 10th snapshot volume created.
  • The query hop count corresponding to a query volume identifier indicates the number of snapshot volumes searched during the metadata query that produced the historical query record, where each snapshot volume counts as one hop. For example, a query hop count of 5 indicates that the metadata query corresponding to this historical query record searched 5 snapshot volumes.
  • For a query timing identifier of TID_10 and a query hop count of 5, the determined query timing interval may be [TID_5, TID_10].
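
The relationship between the query timing identifier, the query hop count, and the query timing interval can be written out as a short sketch. The function names are hypothetical, timing identifiers are modeled as plain integers, and treating the interval as closed at both ends is an assumption; the text only gives the example [TID_5, TID_10].

```python
from typing import Tuple


def query_interval(query_tid: int, query_hops: int) -> Tuple[int, int]:
    """Interval covered by one historical query record.

    Assumes consecutive integer timing identifiers, so a lookup that
    ended at query_tid after query_hops hops started at
    query_tid - query_hops.
    """
    return (query_tid - query_hops, query_tid)


def in_interval(tid: int, interval: Tuple[int, int]) -> bool:
    """Closed-interval membership test (both ends included; assumption)."""
    low, high = interval
    return low <= tid <= high
```

With these definitions, `query_interval(10, 5)` gives `(5, 10)`, matching the example in the text.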
  • FIG. 6 is a schematic diagram of historical index information.
  • the correspondence between the data block identifier and the historical query snapshot information in the history index information may include four elements, namely, a data block identifier (VBN), a query timing identifier TID, a query volume identifier SID, and a query hop count JNUM.
  • One piece of historical query snapshot information may include one or more query records. For example, in FIG. 6 the historical query snapshot information corresponding to data block VBNi includes three query records, that corresponding to VBNi+1 includes two query records, and that corresponding to VBNn includes three query records.
  • Each query record includes a query timing identifier TID, a query volume identifier SID, and a query hop count JNUM.
  • When SIDs are assigned according to the creation timing of the snapshot volumes, the SID itself can identify the creation timing of the snapshot volume, so the TID and the SID may be merged into a single identifier.
  • Taking VBNi in FIG. 6 to be VBN3 as an example, VBN3 corresponds to three query records.
  • the query timing intervals in the three query records corresponding to the VBN3 may be [TID_50, TID_100], [TID_20, TID_30], and [TID_10, TID_0], respectively.
  • At the time of the initial query, the historical index information may be empty, that is, there is no historical query record. In that case the electronic device may query the corresponding address metadata according to the prior art and store information about the queried metadata in the historical index information. After a period of time, the historical index information holds a number of historical query records, so that when the electronic device queries metadata it may first query the historical index information.
  • step 304 is performed.
  • Step 304 When the data block identifier exists in the historical index information and the first timing identifier is within the query timing interval corresponding to the data block identifier, obtain the corresponding target volume identifier from the historical index information, and obtain the address metadata corresponding to the data block identifier from the second snapshot volume indicated by the target volume identifier.
  • The electronic device may determine the corresponding historical query snapshot information according to the data block identifier, determine within it the query timing interval in which the first timing identifier lies, and then obtain from it the target volume identifier corresponding to that query timing interval.
  • The target volume identifier is the volume identifier of the second snapshot volume, and the address metadata corresponding to the data block identifier is obtained from the second snapshot volume.
  • For example, suppose the volume identifier in the metadata query request is SID7, the data block identifier is VBN3, and the timing identifier obtained for SID7 is TID_25.
  • The query timing intervals corresponding to VBN3 in the historical index information are [TID_50, TID_100], [TID_20, TID_30], and [TID_10, TID_0]. The timing identifier TID_25 lies in the query timing interval [TID_20, TID_30], whose corresponding volume identifier is SID2, so the corresponding metadata is obtained from the snapshot volume indicated by SID2.
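
The lookup of steps 303 and 304 can be sketched as below. This is an illustrative sketch only: `QueryRecord`, `HistoryIndex`, and `lookup_target_volume` are hypothetical names, timing identifiers are modeled as integers, and each record's interval is assumed to be the closed range from `tid - jnum` to `tid`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class QueryRecord:
    tid: int    # query timing identifier (TID)
    sid: str    # query volume identifier (SID)
    jnum: int   # query hop count (JNUM)

    def covers(self, first_tid: int) -> bool:
        # Assumed closed interval [tid - jnum, tid].
        return self.tid - self.jnum <= first_tid <= self.tid


# Data block identifier (VBN) -> its historical query records.
HistoryIndex = Dict[str, List[QueryRecord]]


def lookup_target_volume(index: HistoryIndex, vbn: str,
                         first_tid: int) -> Optional[str]:
    """Return the target volume identifier on a hit, or None on a miss
    (unknown VBN, or first_tid outside every recorded interval)."""
    for record in index.get(vbn, []):
        if record.covers(first_tid):
            return record.sid
    return None
```

For the VBN3 example, a record with TID 30, hop count 10, and volume identifier SID2 covers TID_25, so the lookup returns `"SID2"`. A `None` result corresponds to the case where the step-by-step search is still needed.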
  • FIG. 7 is a schematic diagram of querying metadata in a chain snapshot. The link in the direction from the base volume to the snapshot volumes is referred to as the downstream link, and the link in the direction from the snapshot volumes to the base volume is referred to as the upstream link.
  • FIG. 7 includes 100 snapshot volumes, and their corresponding volume identifiers are SID1-SID100, and the corresponding timing identifiers are TID_0-TID_100, respectively.
  • the case where the data block identifier included in the received metadata query request is VBN3 and the volume identifier of the first snapshot volume is SID50 is taken as an example.
  • The method for looking up the metadata may be as follows: the electronic device searches the first snapshot volume indicated by SID50 for the address metadata corresponding to the data block identifier VBN3. If it is not found, the first timing identifier TID_50 is obtained from the first snapshot volume. Suppose the historical index information includes the data block identifier VBN3 with the corresponding historical query snapshot information VBN3 <[TID_60, 20]>.
  • By comparison, the prior-art process of looking up the address metadata may be as follows: search the first snapshot volume indicated by SID50 (that is, the 50th snapshot volume) for the address metadata corresponding to the data block identifier VBN3. If it is not found, the 51st snapshot volume, the 52nd snapshot volume, ..., and the 60th snapshot volume are searched step by step along the upstream link, until the corresponding address metadata is obtained in the 60th snapshot volume.
  • The address of the second-level container holding the address metadata may be determined from the first-level index of the metadata of snapshot volume SID60 according to the data block identifier VBN3 (that is, A in the first-level container in FIG. 2), and the corresponding address metadata is then determined according to the data block identifier VBN3 and the address of the second-level container (that is, the data block in the second-level container in FIG. 2).
  • For example, if VBN3 is represented by eight binary bits, the first two bits indicate A in the first-level container and the last six bits indicate the data block in the second-level container, so that the data block in which the corresponding address metadata is located can be obtained.
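
The eight-bit split described above can be illustrated with a few bit operations. The function name `split_vbn` is hypothetical, and the sketch assumes the data block identifier is available as an integer in the range 0 to 255.

```python
from typing import Tuple


def split_vbn(vbn: int) -> Tuple[int, int]:
    """Split an 8-bit data block identifier into two container indexes.

    The top two bits select the entry in the first-level container and
    the low six bits select the data block in the second-level container.
    """
    if not 0 <= vbn <= 0xFF:
        raise ValueError("expected an 8-bit data block identifier")
    first_level = vbn >> 6       # top 2 bits
    second_level = vbn & 0x3F    # low 6 bits
    return first_level, second_level
```

For example, `split_vbn(0b10000011)` yields first-level index 2 and second-level index 3.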
  • In this embodiment, the snapshot volume where the address metadata is located can be determined directly, and the corresponding address metadata obtained directly from that snapshot volume. This avoids the step-by-step search of the prior art, reduces the lookup time, and improves the query efficiency of metadata.
  • In this embodiment, when receiving the metadata query request, the electronic device may first query the historical index information. When the data block identifier exists in the historical index information and the first timing identifier is within the query timing interval corresponding to the data block identifier, the corresponding target volume identifier is obtained from the historical index information, and the address metadata corresponding to the data block identifier is obtained from the second snapshot volume indicated by the target volume identifier. Obtaining the address metadata through the historical index information avoids the step-by-step search of the prior art, thereby reducing the lookup time and improving the query efficiency of metadata.
  • Further, before step 303, the method may further include step 3021.
  • Step 3021 Query location identification information according to the data block identifier, where the location identification information indicates the volume location of the address metadata corresponding to the data block identifier.
  • The location identification information may include at least one data block identifier and a latest snapshot timing identifier corresponding to each of the at least one data block identifier.
  • The data block identifier may be looked up in the location identification information. If the data block identifier exists in the location identification information, the corresponding latest snapshot timing identifier is obtained from the location identification information according to the data block identifier.
  • When the first timing identifier is less than the obtained latest snapshot timing identifier, the volume location of the address metadata to be looked up is determined to be a snapshot volume; when the first timing identifier is greater than the obtained latest snapshot timing identifier, the volume location is determined to be the base volume. If the data block identifier does not exist in the location identification information, the volume location of the address metadata to be looked up is determined to be the base volume.
  • When a push operation is performed, the electronic device can record the timing identifier TID_new of the latest snapshot volume in which the data block affected by the push operation is located, thereby forming the location identification information.
  • For example, <VBN, TID_new> is used to represent the location identification information, which is saved in the base volume.
  • The corresponding identification method may be as follows: the electronic device compares the timing identifier TID_snap of the first snapshot volume with TID_new.
  • If TID_snap is less than TID_new, the address metadata to be looked up is determined to be in a snapshot volume.
  • If TID_snap is greater than TID_new, the address metadata to be looked up is determined to be in the base volume, and the corresponding address metadata is looked up directly in the base volume.
  • the location identification information may be stored on the base volume, so that when the location identification information is queried according to the data block identifier, the location identification information may be read from the base volume, and the location identification information is further queried according to the data block identifier.
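
The location check of step 3021 can be sketched as follows. The function name `locate_volume` and the string results are hypothetical, and timing identifiers are modeled as integers. The text does not state what happens when TID_snap equals TID_new; this sketch treats that case as a snapshot volume, which is an assumption.

```python
from typing import Dict


def locate_volume(location_info: Dict[str, int], vbn: str,
                  tid_snap: int) -> str:
    """Decide where the address metadata for vbn should be looked up.

    location_info maps a data block identifier to TID_new, the timing
    identifier of the latest snapshot volume holding that block.
    """
    tid_new = location_info.get(vbn)
    if tid_new is None:
        # Identifier absent from the location identification information.
        return "base"
    if tid_snap > tid_new:
        return "base"
    # tid_snap < tid_new per the text; equality handled here by assumption.
    return "snapshot"
```

Only a `"snapshot"` result leads on to the historical index query; a `"base"` result means the address metadata is read from the base volume directly.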
  • When it is determined in step 3021 that the volume location where the address metadata is located is the base volume, the corresponding address metadata may be obtained directly from the base volume.
  • Correspondingly, step 303 is specifically: when it is determined that the volume location of the address metadata to be looked up is a snapshot volume, query the historical index information according to the data block identifier and the first timing identifier.
  • In this embodiment, the location identification information is used to determine the volume location of the address metadata to be looked up. When the volume location is the base volume, the corresponding address metadata may be obtained directly from the base volume; when the volume location is a snapshot volume, the corresponding address metadata may be obtained through the historical index information. Quickly identifying the volume location through the location identification information improves the query efficiency of metadata and reduces the storage space required by the historical index information, which in turn improves the efficiency of querying the historical index information.
  • Further, when the data block identifier does not exist in the historical index information, or the data block identifier exists but the first timing identifier is not within the corresponding query timing interval, the method further includes step 305 and step 306.
  • Step 305 Perform a step-by-step search starting from the first snapshot volume to determine the volume identifier of the third snapshot volume in which the address metadata corresponding to the data block identifier is located, and the target timing interval.
  • The electronic device may search step by step from the first snapshot volume toward the base volume according to the prior-art method until it obtains the address metadata corresponding to the data block identifier, thereby determining the volume identifier of the third snapshot volume in which that address metadata is located and the target timing interval of the step-by-step search.
  • The target timing interval may be determined from the timing identifier of the third snapshot volume and the number of snapshot volumes searched during the step-by-step search.
  • The number of snapshot volumes searched may also be called the target query hop count, that is, the number of hops between the first snapshot volume and the third snapshot volume. For example, if the volume identifier of the third snapshot volume is SID4, its timing identifier is TID_15, and the corresponding query hop count is 10, the target timing interval determined from the timing identifier TID_15 and the query hop count 10 is [TID_5, TID_15].
  • Step 306 Update the historical index information according to the correspondence between the data block identifier and the volume identifier and target timing interval of the third snapshot volume.
  • The electronic device may update the historical index information according to the correspondence between the data block identifier and the volume identifier and target timing interval of the third snapshot volume, so that when the address metadata is queried again it can be located directly through the historical index information. This avoids the step-by-step query of the prior art and reduces the lookup time.
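
The step-by-step search of step 305 can be sketched as below, under several assumptions: the chain is modeled as a list of hypothetical `ChainVolume` objects, the walk proceeds from the first snapshot volume through increasing timing identifiers (as in the 50th-to-60th example), and the hop count is the number of volumes passed before the one that holds the metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ChainVolume:
    sid: str
    tid: int
    address_metadata: Dict[str, int] = field(default_factory=dict)


def search_step_by_step(chain: List[ChainVolume], first_tid: int,
                        vbn: str) -> Optional[Tuple[str, int, int]]:
    """Walk the chain from the first snapshot volume onward.

    Returns (sid, tid, hops) of the volume holding the address metadata
    for vbn, or None if no snapshot volume in the chain holds it.
    """
    hops = 0
    for volume in sorted(chain, key=lambda v: v.tid):
        if volume.tid < first_tid:
            continue  # created before the first snapshot volume
        if vbn in volume.address_metadata:
            return volume.sid, volume.tid, hops
        hops += 1
    return None
```

For a chain whose timing identifiers run from 5 to 15, with the metadata held by the volume SID4 at TID 15, a search starting at TID 5 returns a hop count of 10, so the target timing interval is [TID_5, TID_15] as in the example above. The returned triple is what step 306 would store as a new query record.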
  • the history index information may include M partitions (for example, 4 partitions), that is, the storage space for storing the historical index information may be divided into M partitions, and each partition may be used to store N query records, M and N is a positive integer, and a query record may include a correspondence between a historical data block identifier and a query volume identifier and a query timing interval.
  • For example, when M is 4 and the partition size is 8 MB, 8 MB of storage space is allocated for the second partition, and storage space is allocated for the third partition and the fourth partition in the same manner.
  • Each partition may correspond to a hop count threshold, that is, the query hop count in every query record stored in a partition is greater than or equal to the hop count threshold of that partition.
  • The hop count thresholds corresponding to the M partitions are set according to certain rules. For example, the hop count thresholds of adjacent partitions may be fixed in descending order of hop count. For instance, the hop count threshold of the first partition may be 20 and that of the second partition may be 15; the query hop count in each query record stored in the first partition is then greater than or equal to 20, and the query hop count in each query record stored in the second partition is greater than or equal to 15 and less than 20.
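
Choosing a partition from descending hop count thresholds can be sketched as follows. The function name `partition_for` is hypothetical, and in the example only the thresholds 20 and 15 come from the text; the values 10 and 5 for the third and fourth partitions are made up for illustration.

```python
from typing import List, Optional


def partition_for(hops: int, thresholds: List[int]) -> Optional[int]:
    """Return the index of the first partition whose threshold the hop
    count meets, assuming thresholds are listed in descending order.

    Returns None when the hop count is below every threshold.
    """
    for index, threshold in enumerate(thresholds):
        if hops >= threshold:
            return index
    return None
```

With thresholds `[20, 15, 10, 5]`, a record with 25 hops belongs in the first partition (index 0) and a record with 17 hops in the second (index 1).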
  • How the electronic device updates the historical index information according to the correspondence between the data block identifier and the volume identifier and target timing interval of the third snapshot volume depends on the query records already stored in the historical index information and on the storage space of the historical index information, as described in detail below.
  • Take the first partition as an example. When the number of query records stored in the first partition has not reached the upper limit, that is, the first partition includes fewer than N query records, the electronic device may store the correspondence between the data block identifier and the volume identifier and target timing interval of the third snapshot volume in the first partition as a query record.
  • When the number of query records stored in the first partition has reached the upper limit, that is, the first partition includes N query records, and the target query hop count is greater than a query hop count among those N query records, the electronic device may replace the query record with the smallest query hop count in the first partition with the correspondence between the data block identifier and the volume identifier and target timing interval of the third snapshot volume.
  • The electronic device may further store the replaced query record, that is, the record that had the smallest query hop count in the first partition, in the second partition.
  • When the number of query records stored in the first partition has reached the upper limit, that is, the first partition includes N query records, and the target query hop count is smaller than the query hop count of each of those N query records, the electronic device stores the correspondence between the data block identifier and the volume identifier and target timing interval of the third snapshot volume directly in the second partition.
  • When no query record is stored in the second partition, the storage space corresponding to the second partition may be empty. In that case, before storing a query record in the second partition, the electronic device allocates the corresponding storage space for the second partition.
  • The other partitions among the M partitions may be allocated storage space in the same way.
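
The update rules above can be sketched as one insertion routine. This is a sketch under stated assumptions rather than the described implementation: `insert_record` and the tuple layout are hypothetical, an unallocated partition is modeled as `None`, a hop count equal to the current minimum is treated like a smaller one, and a record that fits in no partition is silently dropped. Hop count thresholds are left out to keep the sketch short.

```python
from typing import List, Optional, Tuple

# (data block identifier, volume identifier, timing identifier, hop count)
Record = Tuple[str, str, int, int]


def insert_record(partitions: List[Optional[List[Record]]],
                  capacity: int, record: Record) -> None:
    """Store a query record, spilling into later partitions when full."""
    for index in range(len(partitions)):
        if partitions[index] is None:
            partitions[index] = []          # allocate storage on first use
        part = partitions[index]
        if len(part) < capacity:
            part.append(record)             # partition not yet full
            return
        smallest = min(part, key=lambda r: r[3])
        if record[3] > smallest[3]:
            # Replace the record with the smallest hop count; the
            # displaced record moves on to the next partition.
            part[part.index(smallest)] = record
            record = smallest
        # Otherwise the new record itself moves on to the next partition.
```

Keeping records with larger hop counts in the earlier partitions follows the text's rationale that those are the lookups that cost the most to repeat.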
  • The electronic device may also update the historical index information when the corresponding operation is performed. For example, when the electronic device operates on a data block VBN, the SID in the query record corresponding to that VBN may be updated to the volume identifier of the latest snapshot volume in which the data block is located, to keep the historical index information accurate and valid. Note that when the base volume updates metadata, all snapshot volumes still share the original metadata, so the original metadata cannot be discarded and must be pushed to the latest snapshot volume to maintain the metadata sharing relationship. This process of pushing metadata is called a push operation.
  • When the number of query records actually stored in a partition is less than a preset value, the partition may also be closed, that is, the storage space corresponding to the partition is reclaimed.
  • For example, suppose a partition can store 50 query records and the preset value is 10. If the number of query records actually stored in the partition is less than 10, the partition can be closed. If other partitions have free storage space, the query records stored in the closed partition may also be stored in those partitions.
  • The historical index information corresponding to the snapshot volume may be read into memory, after which all operations on the historical index information (such as lookup, insert, and update) are performed in memory.
  • When the snapshot volume is closed, the historical index information can be saved and released from memory.
  • The number of bits used to represent the snapshot volume timing identifier TID may be limited. Therefore, when the TID of a snapshot volume reaches the upper limit that the bits can represent, the timing identifiers of the snapshot volumes may be renumbered and the historical index information deleted.
  • By setting an appropriate maximum timing value, the frequency of deleting the historical index information can be controlled, ensuring that the historical index information is continuously updated from the historical lookup records of different periods and improving the utilization of the historical index information.
  • the electronic device herein may also be referred to as a metadata query device.
  • the metadata query device includes hardware structures and/or software modules corresponding to the execution of the respective functions in order to implement the above functions.
  • In conjunction with the network elements and algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is implemented in hardware or by computer software driving hardware depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present application.
  • In the embodiments of the present application, the metadata query device may be divided into function modules according to the foregoing method examples.
  • each function module may be divided according to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of the module in the embodiment of the present application is schematic, and is only a logical function division, and the actual implementation may have another division manner.
  • FIG. 9 is a schematic diagram of a possible structure of the metadata query device in the foregoing embodiments, for the case where each function module corresponds to one function. The metadata query device includes a receiving unit 901, an obtaining unit 902, and a query unit 903.
  • The receiving unit 901 is configured to support the metadata query device in performing step 301 in FIG. 3 or FIG. 8.
  • The obtaining unit 902 is configured to support the metadata query device in performing step 302 and step 304 in FIG. 3, or step 302, step 304, or step 305 in FIG. 8.
  • The query unit 903 is configured to support the metadata query device in performing step 303 in FIG. 3 or FIG. 8.
  • Further, the metadata query device includes an updating unit 904 and an allocating unit 905. The updating unit 904 is configured to support the metadata query device in performing step 306 in FIG. 8, and the allocating unit 905 is configured to support the metadata query device in performing the step of allocating storage space for the second partition.
  • the foregoing obtaining unit 902, the query unit 903, the updating unit 904, and the allocating unit 905 may be a processor; the receiving unit 901 may be a receiver, and the receiver and the transmitter may constitute a communication interface.
  • FIG. 10 is a schematic diagram showing a possible logical structure of a metadata query apparatus involved in the foregoing embodiment provided by an embodiment of the present application.
  • the metadata query device includes a processor 1002, a communication interface 1003, a memory 1001, and a bus 1004.
  • the processor 1002, the communication interface 1003, and the memory 1001 are connected to one another via a bus 1004.
  • the processor 1002 is configured to perform control management on the action of the metadata query device.
  • For example, the processor 1002 is configured to support the metadata query device in performing steps 302 to 304 in FIG. 3, or steps 302 to 306 in FIG. 8, and/or other processes for the techniques described herein.
  • the communication interface 1003 is configured to support the metadata query device for communication.
  • the memory 1001 is configured to store program codes and data of the metadata query device.
  • the processor 1002 can be a central processing unit, a general purpose processor, a digital signal processor, a dedicated integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It is possible to implement or carry out the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor may also be a combination of computing functions, for example, including one or more microprocessor combinations, combinations of digital signal processors and microprocessors, and the like.
  • the bus 1004 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus.
  • A readable storage medium stores computer-executable instructions; when a device (which may be a single-chip microcomputer, a chip, or the like) or a processor executes the instructions, the metadata query method provided in FIG. 3 or FIG. 8 is performed.
  • the aforementioned readable storage medium may include various media that can store program codes, such as a USB flash drive, a removable hard disk, a read only memory, a random access memory, a magnetic disk, or an optical disk.
  • A computer program product comprises computer-executable instructions stored in a computer-readable storage medium. At least one processor of a device may read the computer-executable instructions from the computer-readable storage medium, and execution of those instructions by the at least one processor causes the device to implement the metadata query method provided in FIG. 3 or FIG. 8.
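  • In the embodiments of the present application, when a metadata query request is received, the historical index information may be queried first. When the data block identifier exists in the historical index information and the first timing identifier is within the query timing interval corresponding to the data block identifier, the corresponding target volume identifier is obtained from the historical index information, and the address metadata is obtained from the second snapshot volume that it indicates. Because the historical index information stores the correspondence between query data block identifiers and historical query snapshot information, the step-by-step search of the prior art is avoided, which reduces the lookup time and improves the query efficiency of metadata.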

Abstract

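A metadata query method and apparatus, relating to the field of data storage and used to improve metadata lookup efficiency. The method is applied to chain snapshots and includes: receiving a metadata query request, where the metadata query request includes a volume identifier of a first snapshot volume and a data block identifier (301); obtaining a first timing identifier from the first snapshot volume according to the volume identifier of the first snapshot volume, where the first timing identifier indicates the creation timing of the first snapshot volume (302); querying historical index information according to the data block identifier and the first timing identifier, where the historical index information includes a correspondence between query data block identifiers and historical query snapshot information (303); and when the data block identifier exists and the first timing identifier is within the corresponding query timing interval, obtaining the corresponding target volume identifier from the historical index information, and obtaining the address metadata corresponding to the data block identifier from a second snapshot volume indicated by the target volume identifier (304).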

Description

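Metadata query method and apparatus
Technical Field
Embodiments of this application relate to the field of data storage, and in particular, to a metadata query method and apparatus.
Background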
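Snapshots are an indispensable feature, and almost all cloud storage services support them, for example AWS and Alibaba Cloud. A snapshot provides an image of a system at a point in time, so that when the system fails it can be restored to its state at an available point in time. In cloud scenarios, a snapshot volume can reuse the data of the base volume, which effectively reduces the storage overhead of snapshots. However, because the number of snapshot volumes is large, an effective method for retrieving snapshot metadata is needed. Metadata is information that describes the attributes of the data in a snapshot volume and supports functions such as indicating storage locations, historical data, resource lookup, and file records.
Currently, there are methods such as snapshot chains for reducing snapshot metadata, for example tree snapshots and chain snapshots. A tree snapshot requires multiple copies of the metadata when a snapshot is formed, but there is no tight dependency between snapshot devices. A chain snapshot requires only a small amount of metadata copying, but there is a tight dependency between snapshot devices. In a chain snapshot, what is recorded is the incremental metadata since the most recent snapshot point, so metadata space is saved. However, the incremental metadata stored on a volume often does not cover all of the metadata, and lookups must follow the dependency relationships across multiple snapshot volumes and the base volume, so the lookup time overhead is relatively large.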
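Summary
Embodiments of the present invention provide a metadata query method and apparatus, which solve the prior-art problem that looking up metadata across multiple snapshot volumes and the base volume incurs a relatively large lookup time overhead, and improve metadata lookup efficiency.
To achieve the foregoing objective, the embodiments of the present invention adopt the following technical solutions: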
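According to a first aspect, a metadata query method is provided, applied to chain snapshots. The method includes: receiving a metadata query request, where the metadata query request includes a volume identifier of a first snapshot volume and a data block identifier; obtaining a first timing identifier from the first snapshot volume according to the volume identifier of the first snapshot volume, where the first timing identifier indicates the creation timing of the first snapshot volume; querying historical index information according to the data block identifier and the first timing identifier, where the historical index information includes a correspondence between query data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query timing interval; and when the data block identifier exists in the historical index information and the first timing identifier is within the query timing interval corresponding to the data block identifier, obtaining the corresponding target volume identifier from the historical index information, and obtaining the address metadata corresponding to the data block identifier from a second snapshot volume indicated by the target volume identifier.
In the foregoing technical solution, when a metadata query request is received, the historical index information may be queried first. When the data block identifier exists in the historical index information and the first timing identifier is within the query timing interval corresponding to the data block identifier, the corresponding target volume identifier is obtained from the historical index information, and the address metadata corresponding to the data block identifier is obtained from the second snapshot volume indicated by the target volume identifier. Because the historical index information stores the correspondence between query data block identifiers and historical query snapshot information, the step-by-step search of the prior art can be avoided, which reduces the lookup time and improves the query efficiency of metadata.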
在第一方面的一种可能的实现方式中,该历史查询快照信息包括至少一个查询卷标识,以及与每个查询卷标识对应的查询时序标识和查询跳数,一个查询卷标识对应的查 询时序标识和查询跳数用于指示该查询卷标识对应的查询时序区间。
在第一方面的一种可能的实现方式中,根据该数据块标识和第一时序标识查询历史索引信息之前,该方法还包括:根据该数据块标识查询位置识别信息,该位置识别信息用于指示该数据块标识对应的地址元数据所在的卷位置;当确定该地址元数据所在的卷位置为基卷时,则根据该数据块标识从该基卷上获取该地址元数据;相应的,该根据该数据块标识和第一时序标识查询历史索引信息,具体包括:当确定该地址元数据所在的卷位置为快照卷时,则根据该数据块标识和第一时序标识查询历史索引信息。上述可能的实现方式中,通过位置识别信息,可以快速的确定该数据块标识对应的地址元数据所在的卷位置为基卷或者快照卷,并从基卷或者快照卷中查询元数据,从而可以提高查询效率,减小历史索引信息的存储空间。
在第一方面的一种可能的实现方式中,该位置识别信息包括至少一个数据块标识、以及与至少一个数据块标识中每个数据块标识对应的最新快照时序标识,该方法还包括:当该位置识别信息中存在所述数据块标识,且第一时序标识小于该数据块标识对应的最新快照时序标识时,确定该地址元数据所在的卷位置为快照卷。
在第一方面的一种可能的实现方式中,当该历史索引信息中不存在该数据块标识、或者该历史索引信息中存在该数据块标识、且第一时序标识不在该数据块标识对应的查询时序区间时,该方法还包括:从第一快照卷开始逐级查找,以确定该数据块标识对应的地址元数据所在的第三快照卷的卷标识和目标时序区间;根据该数据块标识与第三快照卷的卷标识和目标时序区间的对应关系,更新该历史索引信息。上述可能的实现方式中,通过逐级查找确定该数据块标识对应的地址元数据所在的第三快照卷的卷标识和目标时序区间,并基于此更新历史索引信息,从而可以降低后续再次查找该地址元数据的时间开销,提高了元数据的查询效率。
在第一方面的一种可能的实现方式中,该历史索引信息包括M个分区,每个分区用于存储N条查询记录,一条查询记录包括一个历史数据块标识与查询卷标识和查询时序区间之间的对应关系,M、N为正整数。上述可能的实现方式中,通过将历史索引信息分区,可以提高查询历史索引信息的效率,从而降低元数据的查找时间,提高查询效率。
在第一方面的一种可能的实现方式中,根据该数据块标识与第三快照卷的卷标识和目标时序区间的对应关系,更新该历史索引信息,包括:当M个分区中的第一分区包括的查询记录小于N时,将该数据块标识与第三快照卷的卷标识和目标时序区间的对应关系,存储在第一分区中。上述可能的实现方式中,基于查询结果更新历史索引信息,从而可以降低后续再次查找该地址元数据的时间开销,提高了元数据的查询效率。
在第一方面的一种可能的实现方式中,根据该数据块标识与第三快照卷的卷标识和目标时序区间的对应关系,更新该历史索引信息,包括:当M个分区中的第一分区包括的查询记录等于N,且目标时序区间中的目标查询跳数大于第一分区包括的N条查询记录中的查询跳数时,将第一分区中查询跳数最小的查询记录替换为该数据块标识与第三快照卷的卷标识和所述目标时序区间的对应关系。上述可能的实现方式中,基于查询结果替换历史索引信息中查询跳数较小的查询记录,从而可以降低查询跳数较大的元数据的查找时间的开销,提高元数据的查询效率。
在第一方面的一种可能的实现方式中,根据该数据块标识与第三快照卷的卷标识和目标时序区间的对应关系,更新该历史索引信息,包括:当M个分区中的第一分区包括 的查询记录等于N,且时序区间中的目标查询跳数小于第一分区包括的N条查询记录中的查询跳数时,将该数据块标识与第三快照卷的卷标识和目标时序区间的对应关系,存储在第二分区。上述可能的实现方式中,基于查询结果更新历史索引信息,从而可以降低后续再次查找该地址元数据的时间开销,提高了元数据的查询效率。
在第一方面的一种可能的实现方式中,当第二分区中未存储有查询记录时,第二分区对应的存储空间为空,将该数据块标识与第三快照卷的卷标识和目标时序区间的对应关系,存储在第二分区之前,该方法还包括:为第二分区分配用于存储N条查询记录的存储空间。上述可能的实现方式中,当不需要在第二分区中存储时,则不为第二分区分配存储空间,当需要在第二分区存储时,再为第二分区分配存储空间,可以提高存储的利用率,避免存储空间的浪费。
在第一方面的一种可能的实现方式中,M个分区中的每个分区对应一个跳数阈值,一个分区中存储的N条查询记录中的查询跳数大于或等于分区对应的跳数阈值。上述可能的实现方式中,M个分区中的每个分区对应一个跳数阈值,从而可以提高查询历史索引信息的效率,进而降低元数据的查找时间开销,以提高元数据的查询效率。
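According to a second aspect, a metadata query apparatus is provided, applied to chained snapshots. The apparatus includes: a receiving unit, configured to receive a metadata query request, where the metadata query request includes a volume identifier of a first snapshot volume and a data block identifier; an obtaining unit, configured to obtain a first time sequence identifier from the first snapshot volume based on the volume identifier of the first snapshot volume, where the first time sequence identifier indicates the creation order of the first snapshot volume; and a query unit, configured to query historical index information based on the data block identifier and the first time sequence identifier, where the historical index information includes a correspondence between queried data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query time sequence interval. The obtaining unit is further configured to: when the data block identifier exists in the historical index information and the first time sequence identifier falls within the corresponding query time sequence interval, obtain the corresponding target volume identifier from the historical index information, and obtain the address metadata corresponding to the data block identifier from the second snapshot volume indicated by the target volume identifier.
In a possible implementation of the second aspect, the historical query snapshot information includes at least one query volume identifier, and a query time sequence identifier and a query hop count corresponding to each query volume identifier, where the query time sequence identifier and the query hop count indicate the query time sequence interval.
In a possible implementation of the second aspect, the query unit is further configured to query location identification information based on the data block identifier, where the location identification information indicates the volume location of the address metadata corresponding to the data block identifier; the obtaining unit is further configured to: when the volume location of the address metadata is determined to be the base volume, obtain the address metadata from the base volume based on the data block identifier; and, correspondingly, the query unit is further configured to: when the volume location of the address metadata is determined to be a snapshot volume, query the historical index information based on the data block identifier and the first time sequence identifier.
In a possible implementation of the second aspect, the location identification information includes at least one data block identifier and a latest snapshot time sequence identifier corresponding to each of the at least one data block identifier, and the query unit is further configured to: when the data block identifier exists in the location identification information and the first time sequence identifier is less than the latest snapshot time sequence identifier corresponding to the data block identifier, determine that the volume location of the address metadata is a snapshot volume.
In a possible implementation of the second aspect, the query unit is further configured to: when the data block identifier does not exist in the historical index information, or the first time sequence identifier does not fall within the corresponding query time sequence interval, search level by level starting from the first snapshot volume, to determine the volume identifier of a third snapshot volume in which the address metadata corresponding to the data block identifier is located, and a target time sequence interval. The apparatus further includes an update unit, configured to update the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval.
In a possible implementation of the second aspect, the historical index information includes M partitions, each partition is used to store N query records, and a query record includes a correspondence between a historical data block identifier and a query volume identifier and query time sequence interval, where M and N are positive integers.
In a possible implementation of the second aspect, the update unit is further configured to: when the number of query records in a first partition of the M partitions is less than N, store the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in the first partition.
In a possible implementation of the second aspect, the update unit is further configured to: when the number of query records in a first partition of the M partitions is equal to N, and the target query hop count of the target time sequence interval is greater than the query hop counts in the N query records in the first partition, replace the query record with the smallest query hop count in the first partition with the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval.
In a possible implementation of the second aspect, the update unit is further configured to: when the number of query records in a first partition of the M partitions is equal to N, and the target query hop count of the target time sequence interval is less than the query hop counts in the N query records in the first partition, store the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in a second partition.
In a possible implementation of the second aspect, when no query record is stored in the second partition, the storage space corresponding to the second partition is empty, and the apparatus further includes an allocation unit, configured to allocate, to the second partition, storage space for storing N query records.
In a possible implementation of the second aspect, each of the M partitions corresponds to a hop count threshold, and the query hop counts in the N query records stored in a partition are greater than or equal to the hop count threshold corresponding to that partition.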
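According to a third aspect, a device is provided. The device includes a memory, a processor, a bus, and a communications interface. The memory stores code and data, the processor is connected to the memory through the bus, and the processor runs the code in the memory so that the device performs the metadata query method provided in the first aspect or any possible implementation of the first aspect.
According to still another aspect of this application, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the metadata query method provided in the first aspect or any possible implementation of the first aspect.
According to still another aspect of this application, a computer program product containing instructions is provided. When the computer program product runs on a computer, the computer performs the metadata query method provided in the first aspect or any possible implementation of the first aspect.
It can be understood that any apparatus, computer storage medium, or computer program product provided above for the metadata query method is used to perform the corresponding method provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method; details are not repeated here.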
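Brief Description of Drawings
Figure 1 is a schematic structural diagram of a chained snapshot according to an embodiment of this application;
Figure 2 is a schematic structural diagram of the metadata index of a volume according to an embodiment of this application;
Figure 3 is a schematic flowchart of a metadata query method according to an embodiment of this application;
Figure 4 is a schematic structural diagram of the metadata organization of a snapshot volume according to an embodiment of this application;
Figure 5 is a schematic structural diagram of the metadata organization of a base volume according to an embodiment of this application;
Figure 6 is a schematic structural diagram of historical index information according to an embodiment of this application;
Figure 7 is a schematic diagram of looking up address metadata according to an embodiment of this application;
Figure 8 is a schematic flowchart of another metadata query method according to an embodiment of this application;
Figure 9 is a schematic structural diagram of a metadata query apparatus according to an embodiment of this application;
Figure 10 is a schematic structural diagram of another metadata query apparatus according to an embodiment of this application.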
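Description of Embodiments
Snapshots are an indispensable function, and almost all cloud storage services, such as AWS and Alibaba Cloud, support them. A snapshot provides an image of a system at a point in time, so that when the system fails it can be restored to its state at an available point in time. A snapshot may be a fully usable copy of a specified data set, where the copy includes an image of the data at a point in time (the point in time at which the copy starts). A snapshot may be a duplicate of the data it represents, or a replica of that data.
In cloud scenarios, a snapshot volume reuses the data of the source volume (also called the base volume), which effectively reduces storage overhead. Most snapshot implementations optimize the storage space of snapshot data with techniques such as redirect on write (ROW) or copy on write (COW). With ROW, all write operations during a snapshot generation period are redirected to another storage medium. When a snapshot is to be created, the data in the source medium that corresponds to all writes redirected since the previous snapshot is copied out to generate the snapshot for that point in time, and the redirected data is then written back to the corresponding locations in the source medium, which completes snapshot generation. With COW, when data is written to a storage medium for the first time, the original content is first read out and written to another storage medium, and the new data is then written to the original storage medium. However, because of how snapshots work, the data in a snapshot volume is very sparse. With ROW or COW, the data of the generated snapshots usually either points directly to the data of the source volume or additionally preserves the old data of the source volume. Therefore, because of this sparsity, the metadata of multiple snapshots of the same source volume is highly repetitive. Metadata is information that describes the data stored on a volume, including access permissions, modification time, and the address of the data; in this application, address metadata refers to the address of the data. Currently, chained snapshots reduce snapshot metadata by sharing metadata, which optimizes the storage space of snapshot volume metadata.
Figure 1 is a schematic structural diagram of a chained snapshot. The structure includes a base volume and snapshot volumes, which form a chain. The most recently created snapshot volume is inserted directly behind the base volume, and as snapshot volumes are added, the base volume and the snapshot volumes form a chain structure similar to a linked list. The position of a snapshot volume in the chain identifies the time sequence relationship between snapshot volumes: the earlier a snapshot volume was created, the further back it is in the chain. The embodiments of this application are directed to chained snapshots of metadata.
In a chained snapshot, the metadata index of a volume uses a two-level index mechanism. The second-level index stores the metadata that describes the addresses of the data in the snapshot volume, and the first-level index stores the metadata that records the addresses of the second-level index. The storage unit for the first-level and second-level indexes may be a container; for example, the capacity of each container may be 8 megabytes (MB). As shown in Figure 2, a first-level index container consists of multiple metadata units, each 16 bytes (B) in size, and the minimum read/write unit may be one metadata unit. The first-level index may be called the volume container map (VCM). If a second-level index of a snapshot volume shares metadata on the snapshot chain, this is indicated by a special marker (for example, "-2" in Figure 2). A second-level index container can be divided into multiple blocks, each 4 kilobytes (KB) in size, and the minimum read/write unit may be one block. Each block can contain about 250 metadata units (4 KB / 16 B = 256). "A" in Figure 2 denotes an address, namely the address of a second-level index container.
In the embodiments of this application, the values of the parameters (for example, the container capacity, the metadata unit size, and the block size) are all examples. A person skilled in the art can understand that these values can be adjusted for different scenarios, and the embodiments of this application do not limit them.
In a chained snapshot, the base volume holds a complete copy of the metadata, and snapshot volumes can share the metadata on the base volume and on the snapshot chain. A snapshot volume shares metadata in one of two ways. First, when a second-level index container of a snapshot volume contains no private metadata at all and instead shares the metadata on the base volume and the snapshot chain, it is marked with a special marker (for example, "-2" in Figure 2), and no storage space is requested for that second-level index container ("NULL" in Figure 2). Second, when a second-level index container of a snapshot volume contains some private metadata, only that private metadata is written to the container.
In a chained snapshot, each snapshot records only the incremental metadata generated since the previous snapshot point, so metadata space is saved. However, the incremental metadata stored on a snapshot often does not cover all metadata, and looking up a piece of metadata requires following the dependency relationships across multiple snapshot volumes and the base volume, so the lookup time overhead is high. The embodiments of this application use historical index information, which stores the metadata information found during a preceding period. The next metadata lookup can therefore be served directly from the historical index information, which avoids searching upward level by level from a snapshot volume toward the base volume and greatly reduces lookup time. In addition, because the storage space available for the historical index information may be limited, the historical index information in the embodiments of this application stores the metadata information whose historical lookups took a long time (that is, had a large query hop count), so that metadata query efficiency is improved while metadata storage space is still saved.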
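Figure 3 is a schematic flowchart of a metadata query method according to an embodiment of this application. Referring to Figure 3, the method is applied to an electronic device that manages chained snapshots and includes the following steps.
Step 301: Receive a metadata query request, where the metadata query request includes a volume identifier of a first snapshot volume and a data block identifier.
The electronic device that manages chained snapshots may be a server, a management device, or the like. The electronic device can be used to manage and maintain the base volume and the snapshot volumes, for example, to manage the metadata in the base volume and the snapshot volumes, to create or delete a snapshot volume, and to manage the data blocks of the base volume and the snapshot volumes.
In addition, when an upper-layer application built on chained snapshots (for example, a file system or a database) performs I/O reads and writes or certain information queries (for example, querying the volume size) and needs to query the metadata of the base volume or a snapshot volume, a corresponding metadata query procedure is triggered, so that the electronic device that manages the chained snapshots receives a metadata query request. The metadata query request includes the volume identifier of the first snapshot volume and a data block identifier. The volume identifier of the first snapshot volume is the volume identifier of the snapshot volume that the upper-layer application is currently using. The metadata query request is used to query the address metadata of the data block indicated by the data block identifier, that is, the address of the storage space in which the data block is stored.
Specifically, when the metadata query request is received, the address metadata corresponding to the data block identifier may first be looked up in the first snapshot volume based on the volume identifier of the first snapshot volume. If the address metadata is in the first snapshot volume, it is obtained directly from the first snapshot volume. If the address metadata is not in the first snapshot volume, the address metadata is metadata that the first snapshot volume shares from the base volume or from a snapshot volume created after the first snapshot volume; that is, the address metadata is located on the base volume or on a snapshot volume created after the first snapshot volume. In this case, step 302 below is performed.
Step 302: Obtain a first time sequence identifier from the first snapshot volume based on the volume identifier of the first snapshot volume, where the first time sequence identifier indicates the creation order of the first snapshot volume.
Each snapshot volume is assigned a volume identifier and a time sequence identifier when it is created. The volume identifier of a snapshot volume identifies the snapshot volume, and the time sequence identifier of a snapshot volume indicates its creation order. The volume identifier and the time sequence identifier can be stored in the corresponding snapshot volume. Specifically, the electronic device can determine the first snapshot volume based on the volume identifier of the first snapshot volume included in the metadata query request, and obtain the creation order of the first snapshot volume, that is, the first time sequence identifier, from the first snapshot volume.
For example, Figure 4 shows the metadata organization of a snapshot volume. In Figure 4, LUNID is the volume identifier of the base volume and identifies the storage space in which metadata is stored. SID is the volume identifier of the snapshot volume and uniquely identifies the snapshot volume. TID is the time sequence identifier of the snapshot volume and identifies its creation order; the order in which multiple snapshot volumes were created can be determined from their time sequence identifiers. "Other" in Figure 4 represents other metadata information of the snapshot volume, for example, the address metadata of the data on the snapshot volume and the access permissions of the snapshot volume.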
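Step 303: Query historical index information based on the data block identifier and the first time sequence identifier, where the historical index information includes a correspondence between queried data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query time sequence interval.
The historical index information is historical query record information, that is, information about metadata that was queried before the current metadata lookup. The address of the storage space corresponding to the historical index information can be stored on the base volume, so that the electronic device can obtain that address directly from the base volume, read the historical index information based on the address, and then query the historical index information based on the data block identifier and the first time sequence identifier. The base volume may also store other information related to the historical index information, which is not limited in the embodiments of this application.
For example, Figure 5 shows the metadata organization of a base volume, using historical index information with four partitions as an example, where each partition corresponds to a hop count threshold. In Figure 5, LUNID identifies the storage space in which the metadata of the base volume is stored; P1, P2, P3, and P4 denote the addresses of the storage space occupied by each of the four partitions of the historical index information; TS1, TS2, TS3, and TS4 denote the hop count thresholds corresponding to the four partitions; and "Other" represents other metadata information of the base volume, for example, the address metadata of the data on the base volume and the access permissions of the data on the base volume.
It should be noted that, for a description of the partitions included in the historical index information and the hop count threshold corresponding to each partition, refer to the related description in step 306 below; details are not repeated here.
In addition, the historical index information may include a correspondence between queried data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query time sequence interval. Specifically, the historical query snapshot information may include at least one query volume identifier, and a query time sequence identifier and a query hop count corresponding to each of the at least one query volume identifier. The query time sequence identifier and the query hop count can be used to indicate the query time sequence interval. The query time sequence identifier corresponding to a query volume identifier indicates the creation order of the snapshot volume identified by that query volume identifier. For example, if the query volume identifier is SID1 and its query time sequence identifier is TID_10, TID_10 indicates that the snapshot volume identified by SID1 was the 10th snapshot volume created. The query hop count corresponding to a query volume identifier indicates the number of snapshot volumes searched during the metadata query that produced this historical query record, where searching one snapshot volume counts as one hop. For example, a query hop count of 5 indicates that 5 snapshot volumes were searched during that metadata query. Based on the query time sequence identifier TID_10 and a query hop count of 5, the query time sequence interval can be determined as [TID_5, TID_10].
For example, Figure 6 is a schematic diagram of historical index information. In the historical index information, the correspondence between a data block identifier and historical query snapshot information may include four elements: the data block identifier (VBN), the query time sequence identifier (TID), the query volume identifier (SID), and the query hop count (JNUM). One piece of historical query snapshot information may include one or more query records. In Figure 6, for example, the historical query snapshot information corresponding to data block VBNi includes three query records, that corresponding to VBNi+1 includes two query records, and that corresponding to VBNn includes three query records. Each query record includes one query time sequence identifier TID, one query volume identifier SID, and one query hop count JNUM. Optionally, when SIDs are assigned according to the creation order of the snapshot volumes, the SID can also identify the creation order of the snapshot volume, so the TID and the SID can be merged into a single identifier.
For example, taking the data block VBNi in Figure 6 to be VBN3, the three query records corresponding to VBN3 may be as follows:
VBN3{<TID_100,50>|SID3 <TID_30,10>|SID2 <TID_10,10>|SID1}
The query time sequence intervals of the three query records corresponding to VBN3 are then [TID_50,TID_100], [TID_20,TID_30], and [TID_0,TID_10], respectively.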
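The way a query time sequence interval follows from a record's TID and hop count can be illustrated with a minimal Python sketch. The `QueryRecord` type and its field names are hypothetical and chosen only for this illustration; TIDs are modeled as plain integers, which the source does not prescribe.

```python
from typing import NamedTuple


class QueryRecord(NamedTuple):
    """One historical query record <TID, JNUM>|SID (names are illustrative)."""
    tid: int   # query time sequence identifier of the volume holding the metadata
    sid: str   # query volume identifier
    jnum: int  # query hop count

    def interval(self) -> tuple[int, int]:
        # The interval covers every starting TID from which the same
        # volume would be reached: [TID - JNUM, TID].
        return (self.tid - self.jnum, self.tid)


# The three records listed for VBN3 in the text.
vbn3 = [
    QueryRecord(100, "SID3", 50),
    QueryRecord(30, "SID2", 10),
    QueryRecord(10, "SID1", 10),
]
intervals = [r.interval() for r in vbn3]
```

Under this modeling, `intervals` holds the three intervals stated in the text, with the lower bound written first.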
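In practice, the historical index information may be empty at the time of the first query, that is, there may be no historical query records. When the electronic device looks up metadata, it can find the corresponding address metadata according to the prior art and store the information about the queried metadata in the historical index information. After some time, the historical index information will contain multiple historical query records, so that the electronic device can query the historical index information first when it looks up metadata.
Specifically, when the electronic device queries the historical index information based on the data block identifier and the first time sequence identifier, if the data block identifier exists in the historical index information and the first time sequence identifier falls within the query time sequence interval corresponding to the data block identifier, step 304 is performed.
Step 304: When the data block identifier exists in the historical index information and the first time sequence identifier falls within the query time sequence interval corresponding to the data block identifier, obtain the corresponding target volume identifier from the historical index information, and obtain the address metadata corresponding to the data block identifier from the second snapshot volume indicated by the target volume identifier.
When the data block identifier exists in the historical index information and the first time sequence identifier falls within the query time sequence interval corresponding to the data block identifier, this indicates that the address metadata corresponding to the data block identifier was looked up before and is recorded in the historical index information. The electronic device can therefore determine the corresponding historical query snapshot information based on the data block identifier, and then determine which query time sequence interval in that historical query snapshot information contains the first time sequence identifier. It then obtains, from the historical query snapshot information, the target volume identifier corresponding to that query time sequence interval. The target volume identifier is the volume identifier of the second snapshot volume, and the address metadata corresponding to the data block identifier is obtained from the second snapshot volume. For example, suppose the volume identifier in the metadata query request is SID7, the data block identifier is VBN3, and the time sequence identifier obtained for SID7 is TID_25. The query time sequence intervals corresponding to VBN3 in the historical index information are [TID_50,TID_100], [TID_20,TID_30], and [TID_0,TID_10]. TID_25 falls within [TID_20,TID_30], whose volume identifier is SID2, so the corresponding metadata is obtained from the snapshot volume indicated by SID2.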
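The hit test in step 304 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the dictionary layout of `history` and the function name `lookup` are assumptions, and TIDs are modeled as integers.

```python
def lookup(history, vbn, first_tid):
    """Return the target volume identifier for (vbn, first_tid), or None on a miss.

    history maps a data block identifier to a list of (tid, sid, jnum)
    records; a record matches when tid - jnum <= first_tid <= tid.
    """
    for tid, sid, jnum in history.get(vbn, []):
        if tid - jnum <= first_tid <= tid:
            return sid
    return None  # miss: fall back to the level-by-level search


# Records for VBN3 as given in the text.
history = {"VBN3": [(100, "SID3", 50), (30, "SID2", 10), (10, "SID1", 10)]}

hit = lookup(history, "VBN3", 25)    # TID_25 lies in [TID_20, TID_30]
miss = lookup(history, "VBN3", 40)   # TID_40 lies in no recorded interval
```

With the records from the text, a request whose first snapshot volume has TID_25 resolves to SID2, while TID_40 misses and would trigger the level-by-level search described later.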
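For example, Figure 7 is a schematic diagram of querying metadata in a chained snapshot of metadata. The link from the base volume toward the snapshot volumes is called the downstream link, and the link from the snapshot volumes toward the base volume is called the upstream link. Assume that Figure 7 includes 100 snapshot volumes whose volume identifiers are SID1-SID100 and whose time sequence identifiers are TID_0-TID_100. Take as an example a received metadata query request in which the data block identifier is VBN3 and the volume identifier of the first snapshot volume is SID50. In the embodiments of this application, the metadata can be looked up as follows. The electronic device first looks for the address metadata corresponding to VBN3 in the first snapshot volume indicated by SID50. If it is not found there, the electronic device obtains the first time sequence identifier TID_50 from the first snapshot volume. If the historical index information contains the data block identifier VBN3 with the historical query snapshot information VBN3{<TID_60,20>|SID60}, where the query time sequence identifier TID_60 and the query hop count 20 indicate the query time sequence interval [TID_40,TID_60], it can be determined that VBN3 exists in the historical index information and that the first time sequence identifier TID_50 falls within [TID_40,TID_60]. The target volume identifier obtained from the historical index information is therefore SID60, and the address metadata corresponding to VBN3 is obtained from the snapshot volume indicated by SID60 (that is, the 60th snapshot volume).
In the prior art, by contrast, the address metadata would be looked up as follows: the search for the address metadata corresponding to VBN3 starts in the first snapshot volume indicated by SID50 (that is, the 50th snapshot volume). If it is not found there, the search proceeds level by level along the upstream link through the 51st snapshot volume, the 52nd snapshot volume, and so on up to the 60th snapshot volume, where the corresponding address metadata is finally obtained.
Further, after it is determined that the address metadata to be found is in the snapshot volume indicated by SID60, the address of the second-level container in which the address metadata is located (that is, "A" in the first-level container in Figure 2) can be determined from the first-level index of the metadata of snapshot volume SID60 based on the data block identifier VBN3, and the corresponding address metadata (that is, the data block in the second-level container in Figure 2) can then be determined based on VBN3 and the address of the second-level container. For example, if VBN3 is represented by eight binary digits, the first two bits indicate "A" in the first-level container and the last six bits indicate the data block in the second-level container. In this way, the data block in which the corresponding address metadata is located can be obtained.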
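The eight-bit example above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the list/dictionary layout of the two index levels, and the address values are hypothetical; only the 2-bit/6-bit split and the "-2" shared marker come from the text.

```python
SHARED = -2  # marker from Figure 2: the second-level container is shared, not private


def split_vbn(vbn: int) -> tuple[int, int]:
    """Split an 8-bit data block identifier into (first-level slot, second-level slot)."""
    if not 0 <= vbn <= 0xFF:
        raise ValueError("this sketch assumes an 8-bit data block identifier")
    return vbn >> 6, vbn & 0x3F  # high 2 bits, low 6 bits


def read_address(first_level, containers, vbn):
    """Resolve address metadata through the two index levels, or None if not private."""
    top, low = split_vbn(vbn)
    container_addr = first_level[top]      # "A" in the first-level container
    if container_addr == SHARED:
        return None                        # metadata is shared from elsewhere in the chain
    return containers[container_addr].get(low)


# Hypothetical snapshot volume: only the first second-level container is private.
first_level = [0xA0, SHARED, SHARED, SHARED]
containers = {0xA0: {3: 0x1234}}

addr = read_address(first_level, containers, 0b00000011)
shared = read_address(first_level, containers, 0b01000011)
```

Here the identifier `0b00000011` selects first-level slot 0 and second-level slot 3, while `0b01000011` lands on a slot marked as shared and therefore yields no private address metadata.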
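Therefore, with the method provided in the embodiments of this application, when the corresponding metadata information exists in the historical index information, the snapshot volume in which the address metadata is located can be determined directly and the address metadata can be obtained directly from that snapshot volume. This avoids the level-by-level lookup of the prior art, reduces lookup time, and improves metadata query efficiency.
In the embodiments of this application, when the electronic device receives a metadata query request, it can first query the historical index information. When the data block identifier exists in the historical index information and the first time sequence identifier falls within the query time sequence interval corresponding to the data block identifier, the electronic device obtains the corresponding target volume identifier from the historical index information and obtains the address metadata corresponding to the data block identifier from the second snapshot volume indicated by the target volume identifier. Obtaining the address metadata by way of the historical index information avoids the level-by-level lookup of the prior art, which reduces lookup time and improves metadata query efficiency.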
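Further, before step 303, the method may also include step 3021.
Step 3021: Query location identification information based on the data block identifier, where the location identification information indicates the volume location of the address metadata corresponding to the data block identifier.
The location identification information may include at least one data block identifier and a latest snapshot time sequence identifier corresponding to each of the at least one data block identifier. Correspondingly, when the location identification information is queried based on the data block identifier, it can first be checked whether the data block identifier exists in the location identification information. If it does, the corresponding latest snapshot time sequence identifier is obtained from the location identification information based on the data block identifier. When the first time sequence identifier is less than the obtained latest snapshot time sequence identifier, the volume location of the address metadata to be found is determined to be a snapshot volume; when the first time sequence identifier is greater than the obtained latest snapshot time sequence identifier, the volume location is determined to be the base volume. If the data block identifier does not exist in the location identification information, the volume location of the address metadata to be found is determined to be the base volume.
For example, when the base volume performs a push operation, the electronic device can record the time sequence identifier TID_new of the newest snapshot volume on which the data block involved in the push operation is located, thereby forming the location identification information. For example, the location identification information is represented as <VBN,TID_new> and saved on the base volume. When the location identification information contains the latest time sequence identifier corresponding to the data block identifier, the identification can proceed as follows: the electronic device obtains the time sequence identifier TID_snap of the first snapshot volume and compares TID_snap with TID_new. When TID_snap is less than TID_new, the address metadata being looked up is determined to be on a snapshot volume. When TID_snap is greater than TID_new, the address metadata being looked up is determined to be on the base volume, and it is looked up directly on the base volume.
Optionally, the location identification information can be stored on the base volume, so that when the location identification information is queried based on the data block identifier, it can be read from the base volume and then queried based on the data block identifier.
Specifically, when the volume location of the address metadata is determined to be the base volume, the corresponding address metadata can be obtained directly from the base volume. When the volume location of the address metadata is determined to be a snapshot volume, step 303 above is performed; that is, step 303 is specifically: when the volume location of the address metadata to be found is determined to be a snapshot volume, query the historical index information based on the data block identifier and the first time sequence identifier.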
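The decision made in step 3021 can be sketched as a small function. The name `locate` and the dictionary representation of the `<VBN,TID_new>` pairs are assumptions for illustration. The text does not say what happens when TID_snap equals TID_new; this sketch treats that case as "base", which is a choice made here and not something the source states.

```python
def locate(location_info, vbn, tid_snap):
    """Decide whether the address metadata for vbn is on a snapshot volume or the base volume.

    location_info maps a data block identifier to TID_new, the time sequence
    identifier of the newest snapshot volume holding that block.
    """
    tid_new = location_info.get(vbn)
    if tid_new is not None and tid_snap < tid_new:
        return "snapshot"   # continue with step 303 (historical index lookup)
    # Block not recorded, or TID_snap >= TID_new (equality is an assumption here).
    return "base"           # read the address metadata directly from the base volume


location_info = {"VBN3": 60}
where_old = locate(location_info, "VBN3", 50)
where_new = locate(location_info, "VBN3", 70)
where_unknown = locate(location_info, "VBN8", 50)
```

Only the "snapshot" outcome leads on to the historical index lookup; the other two outcomes are served from the base volume without touching the index.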
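In the embodiments of this application, the location identification information makes it possible to determine the volume location of the address metadata to be found. When the volume location is the base volume, the corresponding address metadata can be obtained directly from the base volume; when the volume location is a snapshot volume, the corresponding address metadata can be obtained by way of the historical index information. Quickly identifying the volume location through the location identification information therefore improves metadata query efficiency and reduces the demand placed on the historical index information, which in turn improves the efficiency of querying the historical index information.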
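Further, referring to Figure 8, after step 303, if it is determined that the data block identifier does not exist in the historical index information, or that the data block identifier exists in the historical index information but the first time sequence identifier does not fall within the query time sequence interval corresponding to the data block identifier, the method further includes steps 305 and 306.
Step 305: Search level by level starting from the first snapshot volume, to determine the volume identifier of a third snapshot volume in which the address metadata corresponding to the data block identifier is located, and a target time sequence interval.
When the data block identifier does not exist in the historical index information, or the data block identifier exists in the historical index information but the first time sequence identifier does not fall within the query time sequence interval corresponding to the data block identifier, the electronic device can search level by level from the first snapshot volume toward the base volume according to the prior art, thereby obtaining the address metadata corresponding to the data block identifier and determining the volume identifier of the third snapshot volume in which that address metadata is located, as well as the target time sequence interval of the level-by-level search. The target time sequence interval can be determined from the time sequence identifier of the third snapshot volume and the number of snapshot volumes searched during the level-by-level search. The number of snapshot volumes searched can also be called the target query hop count, that is, the number of hops between the first snapshot volume and the third snapshot volume. For example, if the volume identifier of the third snapshot volume is SID4, its time sequence identifier is TID_15, and the corresponding query hop count is 10, the target time sequence interval determined from TID_15 and the hop count 10 is [TID_5,TID_15].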
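The level-by-level search in step 305 and the resulting target time sequence interval can be sketched as follows. The chain is modeled here as a list of `(tid, sid, private_metadata)` tuples sorted by ascending TID, which is an assumption of this illustration, as are the function name and the volume names other than SID4.

```python
def walk_chain(chain, start_tid, vbn):
    """Search from the starting snapshot toward newer snapshots for vbn.

    chain: list of (tid, sid, private_metadata) sorted by ascending tid.
    Returns (sid, hops, (low_tid, high_tid)) or None if no snapshot holds vbn.
    """
    hops = 0
    for tid, sid, private in chain:
        if tid < start_tid:
            continue                      # created before the starting volume: not searched
        if vbn in private:
            return sid, hops, (tid - hops, tid)
        hops += 1                         # one more volume searched without a hit
    return None                           # fall back to the base volume


# Hypothetical chain: TID_5..TID_14 hold nothing for VBN3; SID4 (TID_15) does.
chain = [(t, f"S{t}", {}) for t in range(5, 15)] + [(15, "SID4", {"VBN3": 0xA0})]

found = walk_chain(chain, 5, "VBN3")
absent = walk_chain(chain, 5, "VBN7")
```

Starting from TID_5, the search reaches SID4 after 10 hops, which reproduces the interval [TID_5,TID_15] from the example in the text; the returned triple is what step 306 would record in the historical index information.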
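Step 306: Update the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval.
After the volume identifier of the third snapshot volume in which the address metadata corresponding to the data block identifier is located and the target time sequence interval are determined in step 305, the electronic device can update the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval. When the same address metadata is queried again later, the electronic device can obtain it directly by way of the historical index information, which avoids the level-by-level query of the prior art and reduces lookup time.
Further, the historical index information may include M partitions (for example, four partitions); that is, the storage space used to store the historical index information can be divided into M partitions. Each partition can be used to store N query records, where M and N are positive integers, and a query record may include a correspondence between a historical data block identifier and a query volume identifier and query time sequence interval. When the historical index information is divided into M partitions, a certain amount of storage space can be allocated to each partition in advance, or storage space can be allocated to the M partitions gradually during metadata querying as the number of query records in the historical index information grows.
For example, if M is 4 and the partition size is 8 MB, 8 MB of storage space can be allocated in advance to each of the four partitions. Alternatively, 8 MB of storage space is first allocated to the first of the four partitions; when the number of query records stored in the first partition reaches the upper limit, 8 MB of storage space is allocated to the second partition; and storage space is allocated to the third and fourth partitions in the same way.
In addition, each partition can correspond to a hop count threshold, meaning that the query hop counts in the query records stored in a partition are greater than or equal to the hop count threshold of that partition. The hop count thresholds of the M partitions are set according to a rule, for example in descending order, and the difference between the hop count thresholds of two adjacent partitions can be fixed. Taking the first and second partitions as an example, the hop count threshold of the first partition may be 20 and that of the second partition may be 15; the query hop counts in the query records stored in the first partition are then greater than or equal to 20, and the query hop counts in the query records stored in the second partition are greater than or equal to 15 and less than 20.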
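How descending hop count thresholds map a record to a partition can be sketched in a few lines. The function name is hypothetical, and the thresholds 10 and 5 for the third and fourth partitions are assumed values that simply continue the fixed difference of 5 from the example thresholds 20 and 15 given in the text.

```python
def pick_partition(thresholds, jnum):
    """Return the index of the first partition whose threshold jnum meets, else None.

    thresholds must be in descending order, one per partition.
    """
    for index, threshold in enumerate(thresholds):
        if jnum >= threshold:
            return index
    return None  # hop count too small to be worth recording in any partition


# 20 and 15 come from the text; 10 and 5 are assumed for partitions three and four.
thresholds = [20, 15, 10, 5]

first = pick_partition(thresholds, 25)
second = pick_partition(thresholds, 17)
none = pick_partition(thresholds, 3)
```

Because the thresholds descend, scanning from the first partition yields the partition with the highest threshold that the hop count still satisfies, which matches the ranges described for the first two partitions.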
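Further, the process by which the electronic device updates the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval depends on the query records already stored in the historical index information and on the storage space of the historical index information. This is described in detail below.
For any of the M partitions, taking the first partition as an example: when the number of query records stored in the first partition has not reached the upper limit, that is, the number of query records in the first partition is less than N, the electronic device can store the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in the first partition as a query record.
When the number of query records stored in the first partition has reached the upper limit, that is, the number of query records in the first partition is equal to N, and the target query hop count is greater than the query hop count of each of the N query records in the first partition, the electronic device can replace the query record with the smallest query hop count in the first partition with the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval. In addition, the electronic device can store the query record that had the smallest query hop count in the first partition in the second partition; that is, the query record replaced in the first partition is moved to the second partition.
When the number of query records stored in the first partition has reached the upper limit, that is, the number of query records in the first partition is equal to N, and the target query hop count is less than the query hop count of each of the N query records in the first partition, the electronic device stores the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval directly in the second partition. Optionally, when the first partition is full and no query record is stored in the second partition, the storage space corresponding to the second partition can be empty, and the electronic device allocates the corresponding storage space to the second partition only when a query record needs to be stored in it. Similarly, storage space can be allocated to the other partitions of the M partitions in the same way.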
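The three update cases above, together with on-demand allocation of later partitions, can be combined into one hedged sketch. It simplifies the text in one respect that should be read as an assumption: a new record replaces the smallest-hop record whenever its hop count exceeds that minimum, whereas the text states the replacement case for a hop count greater than every record in the partition and leaves the in-between case open. The function name, the tuple layout, and the use of `None` for an unallocated partition are also illustrative.

```python
def insert_record(partitions, n, record):
    """Insert record = (vbn, sid, tid, jnum) into a list of partitions.

    partitions[i] is None until storage is first needed (lazy allocation).
    Returns False if a record fell off the end of the last partition.
    """
    for i in range(len(partitions)):
        if partitions[i] is None:
            partitions[i] = []                 # allocate space for up to n records
        part = partitions[i]
        if len(part) < n:
            part.append(record)                # partition not full: store here
            return True
        victim = min(part, key=lambda r: r[3])
        if record[3] > victim[3]:
            part[part.index(victim)] = record  # replace the smallest-hop record
            record = victim                    # the evicted record moves on
        # otherwise the new record itself moves on to the next partition
    return False


partitions = [None, None]                      # M = 2, nothing allocated yet
for rec in [("VBN1", "SID1", 10, 5), ("VBN2", "SID2", 20, 8),
            ("VBN3", "SID3", 30, 9), ("VBN4", "SID4", 40, 2)]:
    insert_record(partitions, 2, rec)
hops = [[r[3] for r in p] for p in partitions]
```

With N = 2, the third record (hop count 9) displaces the hop-count-5 record into the second partition, which is allocated only at that moment, and the fourth record (hop count 2) goes straight to the second partition. Per-partition hop count thresholds are deliberately left out of this sketch.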
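In addition, when the electronic device deletes a snapshot volume or performs a push operation, it can also update the historical index information according to that operation. For example, when the electronic device operates on a data block VBN, it can update the SID in the query record corresponding to that VBN to the volume identifier of the newest snapshot volume on which the data block is located, to keep the historical index information accurate and valid. It should be noted that when the base volume updates metadata, all snapshot volumes still share the original metadata, so the original metadata must not be lost and needs to be pushed to the nearest snapshot volume to preserve the metadata sharing relationship. This process of pushing metadata is called a push operation.
During updates of the historical index information, when the number of query records in a partition of the historical index information is less than a preset value, the partition can also be closed, that is, the storage space corresponding to the partition is reclaimed. For example, if a partition can store 50 query records and the preset value is 10, the partition can be closed when fewer than 10 query records are actually stored in it. If other partitions have free storage space, the query records stored in the closed partition can be moved to those partitions.
For example, when the electronic device opens a snapshot volume, it reads the historical index information corresponding to that snapshot volume into memory. After that, all operations on the historical index information (for example, lookup, insertion, and update) are performed in memory. When the snapshot volume is closed, the historical index information can be saved and then released from memory.
In addition, in practice the number of bits used to represent the time sequence identifier TID of a snapshot volume may be limited. Therefore, when the TID of a snapshot volume reaches the upper limit that these bits can represent, the TIDs of the snapshot volumes can be reassigned, and the historical index information can be deleted at that point. By setting a suitable maximum time sequence value, the frequency at which the historical index information is deleted can be controlled, while ensuring that the historical index information is continually refreshed from the historical lookup records of different periods, which improves the utilization of the historical index information.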
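The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of the electronic device, which may also be called a metadata query apparatus. It can be understood that, to implement the foregoing functions, the metadata query apparatus includes corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should readily appreciate that, in combination with the network elements and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of this application.
In the embodiments of this application, the metadata query apparatus can be divided into functional modules according to the foregoing method examples. For example, one functional module can be defined for each function, or two or more functions can be integrated into one processing module. The integrated module can be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into modules in the embodiments of this application is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation.
When one functional module is defined for each function, Figure 9 shows a possible schematic structural diagram of the metadata query apparatus in the foregoing embodiments. The metadata query apparatus includes a receiving unit 901, an obtaining unit 902, and a query unit 903. The receiving unit 901 is configured to support the metadata query apparatus in performing step 301 in Figure 3 or Figure 8; the obtaining unit 902 is configured to support the metadata query apparatus in performing steps 302 and 304 in Figure 3, or step 302, step 304, or step 305 in Figure 8; and the query unit 903 is configured to support the metadata query apparatus in performing step 303 in Figure 3 or Figure 8. Further, the metadata query apparatus also includes an update unit 904 and an allocation unit 905. The update unit 904 is configured to support the metadata query apparatus in performing step 306 in Figure 8, and the allocation unit 905 is configured to support the metadata query apparatus in performing the step of allocating storage space to the second partition.
In a hardware implementation, the obtaining unit 902, the query unit 903, the update unit 904, and the allocation unit 905 can be a processor, and the receiving unit 901 can be a receiver; the receiver and a transmitter can together form a communications interface.
Figure 10 is a possible schematic logical structure diagram of the metadata query apparatus in the foregoing embodiments according to an embodiment of this application. The metadata query apparatus includes a processor 1002, a communications interface 1003, a memory 1001, and a bus 1004. The processor 1002, the communications interface 1003, and the memory 1001 are connected to each other through the bus 1004. In the embodiments of this application, the processor 1002 is configured to control and manage the actions of the metadata query apparatus. For example, the processor 1002 is configured to support the metadata query apparatus in performing steps 302 to 304 in Figure 3, or steps 302 to 306 in Figure 8, and/or other processes of the techniques described herein. The communications interface 1003 is configured to support the metadata query apparatus in communicating. The memory 1001 is configured to store the program code and data of the metadata query apparatus.
The processor 1002 can be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various example logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor can also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The bus 1004 can be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be classified as an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Figure 10, but this does not mean that there is only one bus or one type of bus.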
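In another embodiment of this application, a readable storage medium is also provided. The readable storage medium stores computer-executable instructions, and when a device (which may be a single-chip microcomputer, a chip, or the like) or a processor executes the computer-executable instructions, it performs the metadata query method provided in Figure 3 or Figure 8. The readable storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
In another embodiment of this application, a computer program product is also provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium. At least one processor of a device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions so that the device implements the metadata query method provided in Figure 3 or Figure 8.
In the embodiments of this application, when a metadata query request is received, the historical index information is queried first. When the data block identifier exists in the historical index information and the first time sequence identifier falls within the query time sequence interval corresponding to the data block identifier, the corresponding target volume identifier is obtained from the historical index information, and the address metadata corresponding to the data block identifier is obtained from the second snapshot volume indicated by the target volume identifier. Because the historical index information stores the correspondence between queried data block identifiers and historical query snapshot information, the level-by-level lookup of the prior art is avoided, which reduces lookup time and improves metadata query efficiency.
Finally, it should be noted that the foregoing is merely a description of specific implementations of this application, and the protection scope of this application is not limited thereto. Any variation or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.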

Claims (25)

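  1. A metadata query method, applied to chained snapshots, wherein the method comprises:
    receiving a metadata query request, wherein the metadata query request comprises a volume identifier of a first snapshot volume and a data block identifier;
    obtaining a first time sequence identifier from the first snapshot volume based on the volume identifier of the first snapshot volume, wherein the first time sequence identifier indicates the creation order of the first snapshot volume;
    querying historical index information based on the data block identifier and the first time sequence identifier, wherein the historical index information comprises a correspondence between queried data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query time sequence interval; and
    when the data block identifier exists in the historical index information and the first time sequence identifier falls within the corresponding query time sequence interval, obtaining the corresponding target volume identifier from the historical index information, and obtaining the address metadata corresponding to the data block identifier from a second snapshot volume indicated by the target volume identifier.
  2. The method according to claim 1, wherein the historical query snapshot information comprises at least one query volume identifier, and a query time sequence identifier and a query hop count corresponding to each query volume identifier, and the query time sequence identifier and the query hop count indicate the query time sequence interval.
  3. The method according to claim 1 or 2, wherein before the querying historical index information based on the data block identifier and the first time sequence identifier, the method further comprises:
    querying location identification information based on the data block identifier, wherein the location identification information indicates the volume location of the address metadata corresponding to the data block identifier; and
    when the volume location of the address metadata is determined to be a base volume, obtaining the address metadata from the base volume based on the data block identifier;
    and correspondingly, the querying historical index information based on the data block identifier and the first time sequence identifier specifically comprises:
    when the volume location of the address metadata is determined to be a snapshot volume, querying the historical index information based on the data block identifier and the first time sequence identifier.
  4. The method according to claim 3, wherein the location identification information comprises at least one data block identifier and a latest snapshot time sequence identifier corresponding to each of the at least one data block identifier, and the method further comprises:
    when the data block identifier exists in the location identification information and the first time sequence identifier is less than the latest snapshot time sequence identifier corresponding to the data block identifier, determining that the volume location of the address metadata is a snapshot volume.
  5. The method according to any one of claims 1 to 4, wherein when the data block identifier does not exist in the historical index information, or the first time sequence identifier does not fall within the corresponding query time sequence interval, the method further comprises:
    searching level by level starting from the first snapshot volume, to determine a volume identifier of a third snapshot volume in which the address metadata corresponding to the data block identifier is located, and a target time sequence interval; and
    updating the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval.
  6. The method according to any one of claims 1 to 5, wherein the historical index information comprises M partitions, each partition is used to store N query records, a query record comprises a correspondence between a historical data block identifier and a query volume identifier and query time sequence interval, and M and N are positive integers.
  7. The method according to claim 6, wherein the updating the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval comprises:
    when the number of query records in a first partition of the M partitions is less than N, storing the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in the first partition.
  8. The method according to claim 6, wherein the updating the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval comprises:
    when the number of query records in a first partition of the M partitions is equal to N, and the target query hop count of the target time sequence interval is greater than the query hop counts in the N query records in the first partition, replacing the query record with the smallest query hop count in the first partition with the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval.
  9. The method according to claim 6, wherein the updating the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval comprises:
    when the number of query records in a first partition of the M partitions is equal to N, and the target query hop count of the target time sequence interval is less than the query hop counts in the N query records in the first partition, storing the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in a second partition.
  10. The method according to claim 9, wherein when no query record is stored in the second partition, the storage space corresponding to the second partition is empty, and before the storing the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in the second partition, the method further comprises:
    allocating, to the second partition, storage space for storing N query records.
  11. The method according to any one of claims 6 to 10, wherein each of the M partitions corresponds to a hop count threshold, and the query hop counts in the N query records stored in a partition are greater than or equal to the hop count threshold corresponding to that partition.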
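  12. A metadata query apparatus, applied to chained snapshots, wherein the apparatus comprises:
    a receiving unit, configured to receive a metadata query request, wherein the metadata query request comprises a volume identifier of a first snapshot volume and a data block identifier;
    an obtaining unit, configured to obtain a first time sequence identifier from the first snapshot volume based on the volume identifier of the first snapshot volume, wherein the first time sequence identifier indicates the creation order of the first snapshot volume; and
    a query unit, configured to query historical index information based on the data block identifier and the first time sequence identifier, wherein the historical index information comprises a correspondence between queried data block identifiers and historical query snapshot information, and the historical query snapshot information indicates a query volume identifier and a query time sequence interval;
    wherein the obtaining unit is further configured to: when the data block identifier exists in the historical index information and the first time sequence identifier falls within the corresponding query time sequence interval, obtain the corresponding target volume identifier from the historical index information, and obtain the address metadata corresponding to the data block identifier from a second snapshot volume indicated by the target volume identifier.
  13. The apparatus according to claim 12, wherein the historical query snapshot information comprises at least one query volume identifier, and a query time sequence identifier and a query hop count corresponding to each query volume identifier, and the query time sequence identifier and the query hop count indicate the query time sequence interval.
  14. The apparatus according to claim 12 or 13, wherein
    the query unit is further configured to query location identification information based on the data block identifier, wherein the location identification information indicates the volume location of the address metadata corresponding to the data block identifier;
    the obtaining unit is further configured to: when the volume location of the address metadata is determined to be a base volume, obtain the address metadata from the base volume based on the data block identifier; and
    correspondingly, the query unit is further configured to: when the volume location of the address metadata is determined to be a snapshot volume, query the historical index information based on the data block identifier and the first time sequence identifier.
  15. The apparatus according to claim 14, wherein the location identification information comprises at least one data block identifier and a latest snapshot time sequence identifier corresponding to each of the at least one data block identifier, and the query unit is further configured to:
    when the data block identifier exists in the location identification information and the first time sequence identifier is less than the latest snapshot time sequence identifier corresponding to the data block identifier, determine that the volume location of the address metadata is a snapshot volume.
  16. The apparatus according to any one of claims 12 to 15, wherein when the data block identifier does not exist in the historical index information, or the first time sequence identifier does not fall within the corresponding query time sequence interval,
    the query unit is further configured to search level by level starting from the first snapshot volume, to determine a volume identifier of a third snapshot volume in which the address metadata corresponding to the data block identifier is located, and a target time sequence interval; and
    the apparatus further comprises an update unit, configured to update the historical index information based on the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval.
  17. The apparatus according to any one of claims 12 to 16, wherein the historical index information comprises M partitions, each partition is used to store N query records, a query record comprises a correspondence between a historical data block identifier and a query volume identifier and query time sequence interval, and M and N are positive integers.
  18. The apparatus according to claim 17, wherein the update unit is further configured to:
    when the number of query records in a first partition of the M partitions is less than N, store the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in the first partition.
  19. The apparatus according to claim 18, wherein the update unit is further configured to:
    when the number of query records in a first partition of the M partitions is equal to N, and the target query hop count of the target time sequence interval is greater than the query hop counts in the N query records in the first partition, replace the query record with the smallest query hop count in the first partition with the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval.
  20. The apparatus according to claim 18, wherein the update unit is further configured to:
    when the number of query records in a first partition of the M partitions is equal to N, and the target query hop count of the target time sequence interval is less than the query hop counts in the N query records in the first partition, store the correspondence between the data block identifier and the volume identifier of the third snapshot volume and the target time sequence interval in a second partition.
  21. The apparatus according to claim 20, wherein when no query record is stored in the second partition, the storage space corresponding to the second partition is empty, and the apparatus further comprises:
    an allocation unit, configured to allocate, to the second partition, storage space for storing N query records.
  22. The apparatus according to any one of claims 17 to 21, wherein each of the M partitions corresponds to a hop count threshold, and the query hop counts in the N query records stored in a partition are greater than or equal to the hop count threshold corresponding to that partition.
  23. A device, wherein the device comprises a memory, a processor, a bus, and a communications interface, the memory stores code and data, the processor is connected to the memory through the bus, and the processor runs the code in the memory so that the device performs the metadata query method according to any one of claims 1 to 11.
  24. A readable storage medium, wherein the readable storage medium stores instructions, and when the instructions are run on a device, the device is caused to perform the metadata query method according to any one of claims 1 to 11.
  25. A computer program product, wherein when the computer program product runs on a computer, the computer is caused to perform the metadata query method according to any one of claims 1 to 11.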
PCT/CN2018/105969 2017-09-27 2018-09-17 Metadata query method and apparatus WO2019062574A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18861134.7A EP3678015B1 (en) 2017-09-27 2018-09-17 Metadata query method and device
US16/831,005 US11474972B2 (en) 2017-09-27 2020-03-26 Metadata query method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710889269.3 2017-09-27
CN201710889269.3A CN110018983B (zh) 2017-09-27 2017-09-27 Metadata query method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/831,005 Continuation US11474972B2 (en) 2017-09-27 2020-03-26 Metadata query method and apparatus

Publications (1)

Publication Number Publication Date
WO2019062574A1 (zh) 2019-04-04

Family

ID=65900830

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/105969 WO2019062574A1 (zh) 2017-09-27 2018-09-17 Metadata query method and apparatus

Country Status (4)

Country Link
US (1) US11474972B2 (zh)
EP (1) EP3678015B1 (zh)
CN (1) CN110018983B (zh)
WO (1) WO2019062574A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677252A (zh) * 2016-01-06 2016-06-15 华为技术有限公司 Data reading method, data processing method, and related storage device
US20170032005A1 (en) * 2015-07-31 2017-02-02 Netapp, Inc. Snapshot and/or clone copy-on-write
CN107179964A (zh) * 2016-03-11 2017-09-19 中兴通讯股份有限公司 Snapshot read/write method and apparatus

Also Published As

Publication number Publication date
US11474972B2 (en) 2022-10-18
US20200226100A1 (en) 2020-07-16
EP3678015A4 (en) 2020-09-09
CN110018983B (zh) 2021-07-16
CN110018983A (zh) 2019-07-16
EP3678015B1 (en) 2022-11-09
EP3678015A1 (en) 2020-07-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18861134

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018861134

Country of ref document: EP

Effective date: 20200330