CN103942011B - A differential snapshot system and its usage method - Google Patents
- Publication number
- CN103942011B (application CN201410077212.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a differential snapshot system and a method of using it. The system comprises: a snapshot module, which maintains the dependency between source volumes and snapshot volumes, maintains snapshot policy semantics, and manages snapshot metadata; and a thin provisioning module, which has a resource management submodule that implements resource management for snapshots, including resource allocation, resource mapping, and resource reclamation. The snapshot module communicates with the thin provisioning module through the generic block layer interface by sending it data requests or resource management commands. The system decouples snapshot semantics from physical resource management, allocates storage resources on demand, and supports the processing logic of copy-on-write, redirect-on-write, and write-anywhere, improving the effective utilization of snapshot resources.
Description
Technical field
The invention relates to the field of information technology, and in particular to differential snapshot technology in the storage field.
Background
The Storage Networking Industry Association (SNIA) classifies snapshots by implementation technique into two categories: full-copy snapshots and differential snapshots. Compared with full-copy snapshots, differential snapshots have low time overhead and high space utilization, so they are widely used in industry. There are currently three implementations of differential snapshots, copy-on-write (COW), redirect-on-write (ROW), and write-anywhere (WA), which serve the needs of different use cases.
Copy-on-write works as shown in Figure 1. After a snapshot is created, before a block of source volume data is overwritten for the first time, the old data is first copied to the space reserved for the snapshot. The new data is then written to the source volume, and an index table records the data blocks modified since the snapshot was created. Copy-on-write preserves the physical layout of data on the source volume, so read performance on the source volume is unaffected. However, the first write after snapshot creation incurs a physical copy of the data, so write performance on the source volume is poor. In addition, the incremental data sets of different points in time are scattered across the snapshot volumes, which hurts the performance of reading snapshot volume data.
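The conventional copy-on-write behavior described above can be illustrated with a minimal in-memory sketch. The class and method names (`CowSnapshot`, `write`, `read_snapshot`) are hypothetical and chosen only for illustration; a Python list stands in for the source volume and a dictionary for the index table.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over an in-memory source volume (illustrative only)."""

    def __init__(self, source):
        self.source = source  # list of blocks; the source volume is updated in place
        self.saved = {}       # index table: block number -> data as of snapshot time

    def write(self, block_no, data):
        # First overwrite since the snapshot: copy the old data aside before writing.
        if block_no not in self.saved:
            self.saved[block_no] = self.source[block_no]
        self.source[block_no] = data

    def read_snapshot(self, block_no):
        # Unmodified blocks are still read from the source volume.
        return self.saved.get(block_no, self.source[block_no])


source = ["a", "b"]
snap = CowSnapshot(source)
snap.write(0, "x")
snap.write(0, "y")  # second write to the same block triggers no further copy
```

Only the first write to a block pays for the extra copy, which is the write penalty the paragraph refers to.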
Redirect-on-write works as shown in Figure 2. After a snapshot is created, writes to the source volume are redirected to the space reserved for the snapshot, and an index table records where the data is located. The old data in the source volume stays in its original location. Redirect-on-write gives better write performance on the source volume, and because the incremental data set of each point in time is stored together in one snapshot volume, reading snapshot volume data is also faster than with copy-on-write snapshots. However, because new data is stored in the snapshot volume, the source data set becomes scattered, which hurts the performance of reading it. Moreover, the source volume's data set is stored partly in the source volume and partly in the snapshot volume, so when a snapshot is deleted, the data in the snapshot must be migrated and merged back into the source volume.
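A comparable sketch of conventional redirect-on-write, under the same assumptions (hypothetical names, in-memory stand-ins), shows why deleting a snapshot requires merging data back into the source volume.

```python
class RowSnapshot:
    """Toy redirect-on-write snapshot (illustrative only)."""

    def __init__(self, source):
        self.source = source  # old data stays in place after the snapshot
        self.redirected = {}  # index table: block number -> new data in snapshot space

    def write(self, block_no, data):
        # New data never touches the source volume; it is redirected.
        self.redirected[block_no] = data

    def read_source(self, block_no):
        # The current source view is scattered across both locations.
        return self.redirected.get(block_no, self.source[block_no])

    def read_snapshot(self, block_no):
        return self.source[block_no]

    def delete(self):
        # Deleting the snapshot means migrating redirected data back into the source.
        for block_no, data in self.redirected.items():
            self.source[block_no] = data
        self.redirected.clear()


src = ["a", "b"]
snap = RowSnapshot(src)
snap.write(0, "x")
current, frozen = snap.read_source(0), snap.read_snapshot(0)
snap.delete()
```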
Write-anywhere works as shown in Figure 3. The basic idea is to virtualize all physical resources: the disk is divided at a fixed granularity into equally sized physical blocks, a mapping table records the mapping between every logical block and its physical block, and data blocks are managed through mapping pointers. When a snapshot is created, a copy of the current mapping pointers is kept as the snapshot. When a data block changes, write-anywhere does not modify the old data. Instead, it requests a new free physical block for the data block, points the source's mapping pointer for that block at the new physical block, and writes the new data there. The snapshot's mapping pointer still points to the physical block holding the old data. Write-anywhere has good write performance, and request scheduling can further turn random writes into sequential writes. When a snapshot is deleted, write-anywhere only needs to reclaim data blocks; no data merging or migration is required. Write-anywhere manages the storage space of the source volume and the snapshot volumes together, so the source volume and the snapshot volumes have the same read performance.
However, write-anywhere produces a large amount of data fragmentation, and read requests trigger many merge operations, which seriously degrades the performance of reading both source data and snapshot data.
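The mapping-pointer idea can be sketched as follows. This is a toy model with hypothetical names; it does not handle pool exhaustion or reclaim blocks that no snapshot references.

```python
class WriteAnywhereVolume:
    """Toy write-anywhere volume: every write goes to a fresh physical block."""

    def __init__(self, n_physical):
        self.physical = [None] * n_physical  # equally sized physical blocks
        self.free = list(range(n_physical))  # free physical block numbers
        self.live = {}                       # mapping pointers of the source volume
        self.snapshots = []                  # each snapshot is a copy of the pointers

    def write(self, logical, data):
        new_block = self.free.pop(0)   # never overwrite old data in place
        self.physical[new_block] = data
        self.live[logical] = new_block  # repoint the source mapping

    def snapshot(self):
        self.snapshots.append(dict(self.live))  # copy pointers, not data
        return len(self.snapshots) - 1

    def read(self, logical, snap=None):
        mapping = self.live if snap is None else self.snapshots[snap]
        return self.physical[mapping[logical]]


vol = WriteAnywhereVolume(4)
vol.write(0, "a")
s = vol.snapshot()
vol.write(0, "b")
```

After the second write, the source and the snapshot point at different physical blocks for the same logical block, and no data was copied.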
Analysis of the three implementations shows what they have in common: maintaining snapshot-layer semantics requires allocating storage space for new or old data, so a snapshot implementation must also support resource management. Across the three implementations, snapshot technology uses two resource management approaches, space reservation and dynamic resource allocation. With space reservation, available resources are reserved for the snapshot volume at a fixed ratio when the snapshot is created. This is static resource management with low resource utilization, and it is used by copy-on-write and redirect-on-write. With dynamic resource allocation, all available resources in the system are managed uniformly and allocated on demand using an allocate-on-write policy. This approach achieves higher resource utilization, but resource usage must be updated dynamically and the implementation is more complex; it is used by write-anywhere.
Given the above, resource management and snapshot semantics are tightly coupled in conventional differential snapshot technology, which leads to the following deficiencies:
First, the snapshot system must maintain resource mapping information. Both resource management approaches require the snapshot system to support resource mapping and to modify or create the corresponding mappings whenever new data is produced. Managing and maintaining resource mappings increases the complexity of the snapshot system.
Second, the snapshot implementation technique is coupled to the resource management approach. According to the SNIA definitions, copy-on-write and redirect-on-write use resource reservation and do not support on-demand allocation. As a result, the effective resource utilization of these two implementations is difficult to improve.
Summary of the invention
To solve the above problems, the object of the invention is to provide a differential snapshot system based on a thin provisioning system, and a method of using it, that decouples snapshot semantics from physical resource management, allocates storage resources on demand, and supports the processing logic of copy-on-write, redirect-on-write, and write-anywhere, improving the effective utilization of snapshot resources.
To achieve this object, the invention proposes a differential snapshot system for allocating storage resources on demand, characterized in that the system comprises:
a snapshot module, for maintaining the dependency between source volumes and snapshot volumes, maintaining snapshot policy semantics, and managing snapshot metadata;
a thin provisioning module having a resource management submodule, for implementing resource management for snapshots, including resource allocation, resource mapping, and resource reclamation;
wherein the underlying device of a logical volume of the snapshot module is a logical volume of the thin provisioning module, and storage resources are allocated on demand through the thin provisioning module's resource management of that logical volume; the snapshot module and the thin provisioning module communicate through the generic block layer interface, with the snapshot module sending data requests or resource management commands to the thin provisioning module.
The differential snapshot system of the invention is characterized in that the snapshot module uses copy-on-write or redirect-on-write as its snapshot technique, and the thin provisioning module uses allocate-on-write and supports write-anywhere snapshot semantics.
The differential snapshot system of the invention is characterized in that the thin provisioning module further comprises:
a resource pool, for unified virtualized management of physical storage resources, which provides available snapshot logical blocks to upper layers;
physical storage devices, for storing snapshot data blocks;
wherein the snapshot logical blocks are associated with the snapshot data blocks through the resource mapping table of the logical volume managed by the thin provisioning module;
a free resource table, which records the number of physical resource blocks currently available in the resource pool and information about them.
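One way to picture these components is the following data-structure sketch. All names (`PhysicalBlock`, `ResourcePool`, `ThinVolume`) and field layouts are assumptions made for illustration; the patent does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhysicalBlock:
    device: str   # physical storage device holding the block
    address: int  # block address within that device


@dataclass
class ResourcePool:
    free: list  # free resource table: currently available PhysicalBlock entries

    def allocate(self):
        return self.free.pop()

    def release(self, block):
        self.free.append(block)


@dataclass
class ThinVolume:
    name: str
    # resource mapping table: snapshot logical block number -> PhysicalBlock
    mapping: dict = field(default_factory=dict)


pool = ResourcePool([PhysicalBlock("sda", i) for i in range(3)])
vol = ThinVolume("lv0")
vol.mapping[7] = pool.allocate()  # logical block 7 now has backing storage
```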
The invention also relates to a method of using the differential snapshot system described above, characterized in that the method comprises snapshot module steps and thin provisioning module steps, wherein
the snapshot module steps comprise:
a user request processing step: when the snapshot module receives a user's write request for a snapshot data block, it performs copy-on-write or redirect-on-write processing according to the snapshot policy applied to the logical volume accessed by the write request, and sends a remap command or a write request to the thin provisioning layer; when the snapshot module receives a user's read request, it determines the logical volume holding the requested data block from the read request address and the snapshot logic, and forwards the read request to the thin provisioning layer; after the write or read request returns, it returns to the user;
a data merging step: merging the snapshot data blocks that correspond to the same snapshot logical address in different snapshot volumes, and sending a remap command or a resource reclamation command to the thin provisioning layer;
the thin provisioning module steps comprise:
a write request step: when the thin provisioning module receives a write request for a snapshot data block from the snapshot module, it decides according to policy whether to allocate resources for the snapshot data block and establish a mapping between the snapshot data block and a physical resource block, then forwards the request to the underlying physical storage module to write the data to the physical storage device, and returns upward layer by layer after the write completes;
a read request step: when the thin provisioning module receives a read request for a snapshot data block from the snapshot module, it redirects the request to the location of the data block on the physical storage device, forwards it to the physical storage device, and returns the data read from the device to the user;
a remap step: when the thin provisioning module receives a remap command for a snapshot data block from the snapshot module, it performs the remap operation on that snapshot data block;
a resource reclamation step: when the thin provisioning module receives a resource reclamation command for a snapshot data block, it reclaims the physical resources of that snapshot data block.
The method of using the differential snapshot system of the invention is characterized in that the copy-on-write processing comprises the following steps:
Step 11: the source volume receives a user's write request for a snapshot logical block and queries the metadata of the current snapshot volume that depends on the source volume to determine whether this is the first write to the snapshot logical block. If the snapshot volume's metadata indicates that the snapshot logical block has no valid snapshot data block, this is the first write and step 12 is executed; otherwise step 13 is executed.
Step 12: the source volume sends a remap command to its underlying device, that is, the logical volume of the thin provisioning system, remapping the old-version snapshot data block of the snapshot logical block from the source volume's logical space to the snapshot volume's logical space; it then updates the snapshot volume's metadata and notifies the source volume's data logic unit that the logical copy of the snapshot data block for this snapshot logical block is complete.
Step 13: the source volume forwards the user's snapshot data write request to the underlying device, that is, the thin-provisioned logical volume, and returns to the user layer after the write request returns.
The redirect-on-write processing comprises the following steps:
Step 21: the source volume receives a user's write request for a snapshot logical block, looks up the current snapshot volume that depends on the source volume, and redirects the write request to that snapshot volume's logical space.
Step 22: the snapshot volume receives the user's snapshot logical block write request and forwards it to the underlying device.
Step 23: after the write request returns, the snapshot volume metadata is updated and control returns to the user layer.
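The two flows above can be sketched against a toy stand-in for the thin provisioning layer. `ThinLayer`, `cow_write`, and `row_write` are hypothetical names, and the sketch assumes that step 12 falls through to step 13, which the text implies but does not state. The point of the example is that the "copy" in step 12 is a remap of the mapping entry, so no data is physically moved.

```python
class ThinLayer:
    """Hypothetical stand-in for the thin provisioning module."""

    def __init__(self):
        self.maps = {}    # volume name -> {logical block: physical block id}
        self.data = {}    # physical block id -> payload
        self.next_id = 0

    def write(self, vol, lb, payload):
        mapping = self.maps.setdefault(vol, {})
        if lb not in mapping:            # allocate on first write
            mapping[lb] = self.next_id
            self.next_id += 1
        self.data[mapping[lb]] = payload

    def read(self, vol, lb):
        return self.data[self.maps[vol][lb]]

    def remap(self, lb, src, dst):
        pb = self.maps.get(src, {}).pop(lb, None)
        if pb is not None:
            self.maps.setdefault(dst, {})[lb] = pb


def cow_write(thin, source, snap, snap_meta, lb, payload):
    if lb not in snap_meta:           # step 11: first write since the snapshot
        thin.remap(lb, source, snap)  # step 12: logical copy via remap
        snap_meta.add(lb)
    thin.write(source, lb, payload)   # step 13: new data gets a fresh block


def row_write(thin, snap, snap_meta, lb, payload):
    thin.write(snap, lb, payload)     # steps 21-22: redirect to the snapshot volume
    snap_meta.add(lb)                 # step 23: update snapshot metadata


thin = ThinLayer()
thin.write("src", 0, "old")
meta = set()
cow_write(thin, "src", "snap", meta, 0, "new")
cow_write(thin, "src", "snap", meta, 0, "newer")
meta2 = set()
row_write(thin, "snap2", meta2, 5, "x")
```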
The method of using the differential snapshot system of the invention is characterized in that the data merging step is specifically as follows. Suppose the snapshot data block of snapshot logical block B, which has a valid snapshot data block in snapshot logical volume LV2, is to be merged into snapshot logical volume LV1, where LV1 and LV2 are snapshot volumes of the same source volume taken at different times, or a source volume and its snapshot volume.
If the data set in LV1 is newer than the data set in LV2, the following steps are performed:
Step 41: for snapshot logical block B in LV2, check the metadata of LV1 to determine whether LV1 holds a newer-version snapshot data block for this snapshot logical block. If so, execute step 42; otherwise execute step 43.
Step 42: the snapshot module sends a resource reclamation command for the snapshot logical block to the logical volume underlying LV2. After the command returns successfully, it updates the metadata of LV2 to mark the snapshot logical block as having no valid snapshot data block. This completes the version merge for the snapshot data block of this snapshot logical block.
Step 43: the snapshot module sends a remap command for the snapshot logical block to the logical volume underlying LV2, remapping the corresponding snapshot data block from the logical volume underlying LV2 to the logical volume underlying LV1. The remap command takes the following parameters: the logical block number of the snapshot logical block, the source mapping device, and the target mapping device.
Step 44: after the remap command returns successfully, update the metadata of LV1 and LV2 respectively: mark the snapshot logical block in LV1 as having a valid snapshot data block, and mark it in LV2 as having no valid snapshot data block. This completes the version merge for the snapshot data block of this snapshot logical block.
If the data set in LV1 is older than the data set in LV2, the following steps are performed:
Step 51: for snapshot logical block B in LV2, send a remap command for the snapshot logical block to the thin provisioning system, remapping the snapshot data block of B from the logical volume underlying LV2 to the logical volume underlying LV1. The remap command takes the following parameters: the logical block number of the snapshot logical block, the source mapping device, and the target mapping device.
Step 52: after the remap command returns successfully, merge the metadata of the snapshot logical block in LV1 and LV2, update the metadata of LV1 accordingly, and update the metadata of LV2 to mark the snapshot logical block as having no valid snapshot data block. The merge of this snapshot data block is then complete.
Here, the source mapping device and target mapping device in the remap command parameters are the underlying devices of LV2 and LV1 respectively. Once the valid snapshot data blocks of all snapshot logical blocks in LV2 have been merged into LV1, the data merge of LV2 into LV1 is complete.
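The per-block merge decision can be sketched as below. `ThinLayer` and `merge_block` are hypothetical names; the metadata is simplified to a set of logical blocks with valid data per volume, and the remap is assumed to release any block the target already holds (matching the remap procedure of the thin provisioning module).

```python
class ThinLayer:
    """Hypothetical thin provisioning layer exposing remap and reclaim."""

    def __init__(self, maps):
        self.maps = maps  # volume name -> {logical block: physical block}
        self.freed = []   # physical blocks returned to the pool

    def remap(self, lb, src, dst):
        pb = self.maps[src].pop(lb)
        old = self.maps[dst].get(lb)
        if old is not None:
            self.freed.append(old)  # target's previous block is released
        self.maps[dst][lb] = pb

    def reclaim(self, vol, lb):
        self.freed.append(self.maps[vol].pop(lb))


def merge_block(thin, lv1, lv2, meta, lb, lv1_is_newer):
    """Merge block lb of lv2 into lv1; meta maps volume -> set of valid blocks."""
    if lb not in meta[lv2]:
        return
    if lv1_is_newer and lb in meta[lv1]:
        thin.reclaim(lv2, lb)     # step 42: lv1 already has the newer version
    else:
        thin.remap(lb, lv2, lv1)  # step 43 (lv1 newer) or step 51 (lv1 older)
        meta[lv1].add(lb)         # step 44 / step 52
    meta[lv2].discard(lb)         # lv2 no longer holds a valid block


# Case 1: LV1 is newer than LV2.
thin = ThinLayer({"lv1": {0: "p1"}, "lv2": {0: "p2", 1: "p3"}})
meta = {"lv1": {0}, "lv2": {0, 1}}
merge_block(thin, "lv1", "lv2", meta, 0, True)
merge_block(thin, "lv1", "lv2", meta, 1, True)

# Case 2: LV1 is older than LV2.
thin2 = ThinLayer({"lv1": {0: "p1"}, "lv2": {0: "p2"}})
meta2 = {"lv1": {0}, "lv2": {0}}
merge_block(thin2, "lv1", "lv2", meta2, 0, False)
```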
The method of using the differential snapshot system of the invention is characterized in that, in the thin provisioning module steps, the write request step is specifically:
Step 61: a logical volume of the thin provisioning module receives a write request from the upper layer and looks up the mapping entry of the target snapshot logical block in the logical volume's resource mapping table. If the mapping information shows no physical resource block associated with the snapshot logical block, this is the first write to that block and step 62 is executed; otherwise step 64 is executed;
Step 62: allocate resources for the snapshot logical block from the resource pool of the resource management module, and update the free resource record in the resource pool;
Step 63: establish the mapping between the snapshot logical block and the physical resource block by adding the physical resource block's information to the snapshot logical block's mapping entry, the information including the physical storage device holding the physical resource block and its address;
Step 64: forward the write request to the physical storage module;
Step 65: after the physical storage module's write returns, return upward layer by layer.
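Steps 61 to 65 amount to allocate-on-first-write. A minimal sketch follows, assuming a shared list as the free resource table and dictionaries as physical devices; the `ThinVolume` class is hypothetical and does not handle pool exhaustion.

```python
class ThinVolume:
    """Toy thin-provisioned logical volume with allocate-on-write."""

    def __init__(self, pool, devices):
        self.pool = pool        # free resource table: list of (device, address)
        self.devices = devices  # device name -> dict acting as physical storage
        self.mapping = {}       # resource mapping table: logical block -> (device, address)

    def write(self, lb, payload):
        entry = self.mapping.get(lb)   # step 61: look up the mapping entry
        if entry is None:              # first write to this logical block
            entry = self.pool.pop()    # step 62: allocate and update the free table
            self.mapping[lb] = entry   # step 63: record device and address
        device, address = entry
        self.devices[device][address] = payload  # step 64: forward to storage
        return True                               # step 65: return upward


pool = [("sda", 0), ("sda", 1)]
devices = {"sda": {}}
vol = ThinVolume(pool, devices)
vol.write(9, b"a")
vol.write(9, b"b")  # second write reuses the existing mapping
```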
The method of using the differential snapshot system of the invention is characterized in that, in the thin provisioning module steps, the read request step is specifically:
Step 71: a logical volume of the thin provisioning module receives a read request from the upper layer, looks up the resource mapping entry for the snapshot logical block accessed by the read request in the logical volume's resource mapping table, and obtains the address on the physical storage device of the snapshot data block corresponding to that snapshot logical block;
Step 72: redirect the read request to the storage location of the snapshot data block on the physical storage device and forward it to the physical storage module;
Step 73: after the underlying device returns, return upward layer by layer.
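The read path is a lookup followed by a redirected read. In the sketch below, `thin_read` is a hypothetical helper, and returning `None` for an unmapped block is an assumption; the text does not say what a read of a never-written block returns.

```python
def thin_read(mapping, devices, lb):
    """Resolve logical block lb through the mapping table and read it."""
    entry = mapping.get(lb)           # step 71: find the resource mapping entry
    if entry is None:
        return None                   # assumption: unmapped blocks read as "no data"
    device, address = entry
    return devices[device][address]   # steps 72-73: redirected read, then return


devices = {"sda": {4: b"payload"}}
mapping = {0: ("sda", 4)}
hit = thin_read(mapping, devices, 0)
miss = thin_read(mapping, devices, 1)
```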
The method of using the differential snapshot system of the invention is characterized in that, in the thin provisioning module steps, the remap step is specifically:
Step 81: the thin provisioning module receives a remap command and parses its parameters, obtaining the logical block number, the source mapping logical volume, and the target mapping logical volume required by the remap operation. Let the source mapping logical volume be LV3, the target mapping logical volume be LV4, and the snapshot logical block to be remapped be B;
Step 82: look up the mapping entry for snapshot logical block B in the resource mapping table of LV3. If the entry associates B with a physical resource block, obtain that physical resource block P and execute step 83; otherwise execute step 87;
Step 83: look up the mapping entry for snapshot logical block B in the resource mapping table of LV4. If an entry is found, obtain its physical resource block Q and execute step 84; otherwise execute step 85;
Step 84: remove the association between the snapshot logical block and physical resource block Q, release the physical resource block Q occupied by snapshot logical block B in LV4, and update the free resource record in the shared resource pool;
Step 85: in the resource mapping table of LV4, associate snapshot logical block B with physical resource block P by modifying B's mapping entry and writing in the information of P, which includes the physical device holding the physical resource block and its address on that device;
Step 86: in the resource mapping table of LV3, remove the association between snapshot logical block B and physical resource block P; the remap operation is complete;
Step 87: return to the upper layer.
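The remap procedure only touches mapping tables and the free resource record. A minimal sketch follows, with a hypothetical `remap` function whose boolean return value is an assumption added for illustration.

```python
def remap(maps, free_table, lb, src, dst):
    """Move logical block lb's physical block from volume src to volume dst."""
    p = maps[src].get(lb)     # step 82: physical block P in the source volume
    if p is None:
        return False          # step 87: nothing mapped, return to the upper layer
    q = maps[dst].get(lb)     # step 83: existing block Q in the target volume
    if q is not None:
        free_table.append(q)  # step 84: release Q and update the free record
    maps[dst][lb] = p         # step 85: associate B with P in the target
    del maps[src][lb]         # step 86: drop the association in the source
    return True               # step 87


maps = {"LV3": {0: ("sda", 5)}, "LV4": {0: ("sda", 9)}}
free_table = []
first = remap(maps, free_table, 0, "LV3", "LV4")
second = remap(maps, free_table, 0, "LV3", "LV4")  # source no longer maps block 0
```

No payload is read or written, which is why a remap can stand in for the physical copy of conventional copy-on-write.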
The method of using the differential snapshot system of the invention is characterized in that, in the thin provisioning module steps, the resource reclamation step is specifically:
Step 91: the thin provisioning module receives a resource reclamation command and parses its parameters, obtaining the logical volume and logical block information;
Step 92: look up the mapping entry of the given logical block in the resource mapping table of the logical volume specified by the parameters. If the entry is associated with a physical resource block, execute step 93; otherwise execute step 95;
Step 93: remove the mapping between the logical block and the physical resource block, and release the mapped physical resource;
Step 94: update the free resource record of the resource pool;
Step 95: return to the upper layer.
The positive effects of the present invention are as follows:
First, thin provisioning is combined with snapshot technology. On the one hand, physical resource management is separated from the snapshot system, so the snapshot layer only needs to maintain snapshot semantics and performs no complex resource management. On the other hand, global resources are allocated dynamically and the snapshot implementation is decoupled from resource management, so copy-on-write and redirect-on-write semantics do not require pre-allocated snapshot space.
Second, the thin provisioning system implements the write-anywhere mechanism while the snapshot layer maintains copy-on-write and redirect-on-write semantics, which nests copy-on-write and redirect-on-write with the write-anywhere technique.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the principle of copy-on-write in differential snapshot implementations;
FIG. 2 is a schematic diagram of the principle of redirect-on-write in differential snapshot implementations;
FIG. 3 is a schematic diagram of the principle of write-anywhere in differential snapshot implementations;
FIG. 4 is a schematic diagram of the logical structure of the differential snapshot system of the present invention;
FIG. 5 is a schematic diagram of the functional structure of the differential snapshot system of the present invention;
FIG. 6 is a schematic diagram of the nesting of snapshot semantics in the differential snapshot system of the present invention;
FIG. 7 is a flow chart of the copy-on-write processing steps of the differential snapshot system of the present invention.
Detailed Description
To make the purpose, technical solution, and advantages of the present invention clearer, the differential snapshot system and its method of use are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.
The present invention provides users with two types of logical volume, source volumes and snapshot volumes, where a snapshot volume depends on a source volume. The device underlying each logical volume of the snapshot module is a logical volume exported by the thin provisioning module. The invention combines the processing logic of the three implementation approaches (copy-on-write, redirect-on-write, and write-anywhere) and divides the logical functions of snapshots into abstract layers; the hierarchical structure is shown in Figure 2 and the functional structure in Figure 3. The thin provisioning system manages the logical volumes of the snapshot module and thereby allocates resources on demand, so the snapshot module only needs to maintain snapshot semantics and implement the copy-on-write or redirect-on-write logic.
The snapshot module communicates with the thin provisioning module through the interface module by sending it I/O requests or resource management commands. Two processes are involved: user request processing and data merging. Following the system structure, the overall procedure is divided into snapshot module steps and thin provisioning module steps. The steps are described below for a single snapshot data block of the snapshot module.
I. Snapshot module steps
These comprise the user write request processing steps, the user read request processing steps, and the data merging steps, each described in turn below.
1. User write request processing steps
When a user write request is received and copy-on-write is used, the following steps are performed:
Step 11: the source volume receives the user's write request for a snapshot logical block and queries the metadata of the current snapshot volume that depends on this source volume to determine whether this is the first write to the corresponding snapshot data block. If the snapshot volume's metadata marks the snapshot logical block as having no valid snapshot data block, this is the first write and step 12 is performed; otherwise step 13 is performed;
Step 12: the snapshot source volume sends a remapping command to its underlying device, i.e. the logical volume of the thin provisioning system, which remaps the old data block from the logical space of the source volume to the logical space of the snapshot volume; the snapshot volume's metadata is updated, and the source volume is notified that the logical copy of this snapshot data block is complete;
Step 13: the source volume forwards the user's write request to the underlying device and, once the write returns, returns to the user layer;
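The copy-on-write path in steps 11 to 13 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `ThinPool`, `cow_write`, the volume names `"src"` and `"snap"`, and the use of a set for the snapshot metadata are all hypothetical. What it shows is that the first write preserves the old block by moving a mapping in the thin provisioning layer rather than by copying data.

```python
class ThinPool:
    """Hypothetical stand-in for the thin provisioning layer.

    Maps (volume name, logical block number) to the stored data.
    """

    def __init__(self):
        self.blocks = {}

    def remap(self, block, src, dst):
        # Move the mapping from src to dst; no data is copied.
        if (src, block) in self.blocks:
            self.blocks[(dst, block)] = self.blocks.pop((src, block))

    def write(self, vol, block, data):
        self.blocks[(vol, block)] = data


def cow_write(pool, source, snapshot, snap_meta, block, data):
    """Steps 11-13: preserve the old block on first write, then write."""
    if block not in snap_meta:                 # step 11: first write since the snapshot?
        pool.remap(block, source, snapshot)    # step 12: "logical copy" via remap
        snap_meta.add(block)                   # step 12: update snapshot metadata
    pool.write(source, block, data)            # step 13: forward the write


pool = ThinPool()
pool.write("src", 7, b"old")                   # state before the snapshot was taken
meta = set()                                   # snapshot metadata: blocks with valid data
cow_write(pool, "src", "snap", meta, 7, b"new")
cow_write(pool, "src", "snap", meta, 7, b"newer")
```

After both writes the snapshot volume still holds `b"old"` for block 7, while the source volume holds the latest data; the second write skips the remap because the metadata already marks the block as preserved.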
When redirect-on-write is used, the following steps are performed:
Step 21: the source volume receives the user's write request for a snapshot logical block, looks up the current snapshot volume that depends on this source volume, and redirects the write request to the logical space of that snapshot volume;
Step 22: the snapshot volume receives the user's write request for the snapshot logical block and forwards it to the underlying device;
Step 23: after the write request returns, the snapshot volume's metadata is updated and control returns to the user layer.
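For comparison with copy-on-write, the redirect-on-write steps 21 to 23 can be sketched as below. The function name `row_write`, the dictionary `backing` keyed by (volume, block), and the set `meta` are assumptions made for illustration only.

```python
def row_write(backing, snapshot, snap_meta, block, data):
    """Steps 21-23: redirect the write into the current snapshot volume."""
    backing[(snapshot, block)] = data   # steps 21-22: the write lands in the snapshot volume
    snap_meta.add(block)                # step 23: mark the block valid after the write returns


backing = {("src", 3): b"old"}          # hypothetical backing store: (volume, block) -> data
meta = set()                            # metadata of the current snapshot volume
row_write(backing, "snap1", meta, 3, b"new")
```

The old data in the source volume is left untouched, which is why redirect-on-write needs no extra copy or remap on the write path.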
2. User read request processing steps
User read requests follow the processing logic defined by SNIA and fall into two cases: reading the snapshot source volume and reading a snapshot volume.
Step 31: the user reads the snapshot source volume, as follows:
Step 311: under copy-on-write, perform the following steps:
(a) redirect the read request to the logical volume underlying the snapshot source volume, i.e. forward it to the source volume's underlying thin provisioning module logical volume;
(b) the thin provisioning module logical volume receives the read request, executes its read procedure, and returns the data read to the upper layer;
(c) the snapshot source volume receives the result from the underlying logical volume and returns it directly to the user.
Step 312: under redirect-on-write, perform the following steps:
(a) the snapshot source volume receives the user read request and looks up the current snapshot volume that depends on it;
(b) check the metadata for the requested logical block in the current snapshot volume to determine whether valid data exists for that block; if not, perform step (c); otherwise perform step (d);
(c) obtain the predecessor snapshot logical volume of the current snapshot volume, treat it as the current snapshot volume, and perform (b);
(d) redirect the read request to the current snapshot volume, which forwards it directly to its underlying thin provisioning module logical volume;
(e) the thin provisioning module logical volume receives the read request, executes its own read procedure, forwards the request to the underlying physical storage device, and returns the data read upward layer by layer.
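The predecessor walk in step 312 can be illustrated with the short Python sketch below. The dictionary layout (`pred`, `valid`, `data`), the volume names, and the choice to end the chain at a volume called `"base"` and to return `None` for a block that was never written are all assumptions; the text above does not specify how the chain terminates.

```python
def row_read_source(volumes, current, block):
    """Walk from the current snapshot volume back through its predecessors."""
    vol = volumes[current]
    while vol is not None:
        if block in vol["valid"]:           # step (b): does this volume hold valid data?
            return vol["data"][block]       # steps (d)-(e): read from this volume
        vol = volumes.get(vol["pred"])      # step (c): move to the predecessor volume
    return None                             # assumption: block was never written


# Hypothetical chain: snap2 -> snap1 -> base
volumes = {
    "base":  {"pred": None,    "valid": {0, 1}, "data": {0: b"a0", 1: b"b0"}},
    "snap1": {"pred": "base",  "valid": {1},    "data": {1: b"b1"}},
    "snap2": {"pred": "snap1", "valid": set(),  "data": {}},
}
```

Reading block 1 through `"snap2"` finds the newest version in `"snap1"`, while block 0 falls through to `"base"`.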
Step 32: the user reads a snapshot volume, as follows:
Step 321: under copy-on-write, perform the following steps:
(a) the current snapshot volume receives the user read request and checks its own metadata for the requested data; if the data is absent, perform (b); otherwise perform (c);
(b) obtain the predecessor logical volume of the current snapshot volume, treat it as the current snapshot volume, and perform (a);
(c) the current snapshot volume forwards the request to its underlying thin provisioning module logical volume;
(d) the underlying logical volume receives the read request and executes its own read procedure;
(e) the data read is returned upward layer by layer.
Step 322: under redirect-on-write, perform the following steps:
(a) the snapshot volume receives the read request, obtains its predecessor logical volume, and redirects the read request to that predecessor;
(b) the predecessor logical volume receives the read request and checks its own metadata for the requested data; if the data does not exist in this logical volume, perform (c); otherwise perform (d);
(c) treat this predecessor logical volume as the currently requested logical volume, obtain its predecessor logical volume, redirect the read request to that predecessor, and perform (b);
(d) forward the request to the underlying thin provisioning module logical volume;
(e) the thin provisioning module logical volume receives the read request and executes its own read procedure;
(f) the data read is returned upward layer by layer.
3. Data merging steps
When the snapshot data in two snapshot volumes must be merged by version, the following steps are performed.
Data merging takes each valid data block in volume LV2, one by one, and merges it by version with the data in volume LV1. The merge of the valid data of snapshot logical block B is used as the example below.
Suppose the valid snapshot data block of snapshot logical block B in logical volume LV2 is to be merged into logical volume LV1, where LV1 and LV2 are snapshot volumes of the same source volume taken at different times. If the data set in LV1 is newer than the data set in LV2, the following steps are performed:
Step 41: for snapshot logical block B, which has a valid snapshot data block in LV2, check the metadata of LV1 to determine whether LV1 holds a newer version of the snapshot data block for this logical block; if so, perform step 42; otherwise perform step 43;
Step 42: LV2 sends a resource reclamation command for snapshot logical block B to its underlying thin provisioning logical volume; after the command returns successfully, update the metadata of LV2 to mark the snapshot logical block as having no valid snapshot data block, which completes the version merge for the snapshot data block of snapshot logical block B;
Step 43: LV2 sends a remapping command for snapshot logical block B to its underlying thin provisioning logical volume, remapping the corresponding snapshot data block from the logical volume underlying LV2 to the logical volume underlying LV1; the remapping command carries the following parameters: the logical block number of the snapshot logical block, the source mapping logical volume, and the target mapping logical volume;
Step 44: after the remapping command returns successfully, update the metadata of LV1 and LV2 respectively, marking the snapshot logical block in LV1 as having a valid snapshot data block and in LV2 as having no valid snapshot data block, which completes the version merge for the snapshot data block of snapshot logical block B;
If the data set in LV1 is older than the data set in LV2, the following steps are performed:
Step 51: for snapshot logical block B in LV2, send a remapping command for snapshot logical block B to the thin provisioning system, remapping the corresponding snapshot data block from the logical volume underlying LV2 to the logical volume underlying LV1; the remapping command carries the following parameters: the logical block number of the snapshot logical block, the source mapping logical volume, and the target mapping logical volume;
Step 52: after the remapping command returns successfully, merge the metadata of snapshot logical block B in LV1 and LV2, update the metadata of LV1 accordingly, and update the metadata of LV2 to mark the snapshot logical block as having no valid snapshot data block; the version merge for the snapshot data block of snapshot logical block B is then complete.
Here, the source mapping logical volume and the target mapping logical volume in the remapping command parameters are the logical volumes underlying LV2 and LV1 respectively. Once the valid snapshot data blocks of all snapshot logical blocks in LV2 have been merged into LV1, the data merge of LV2 into LV1 is complete.
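The two merge cases (steps 41 to 44 and steps 51 to 52) can be sketched for a single block as follows. This is an illustrative sketch only: `merge_block`, the `pool` dictionary keyed by (volume, block) with made-up physical block names, and the metadata sets `meta1` and `meta2` are hypothetical, and the reclaim and remap commands are reduced to dictionary operations.

```python
def merge_block(pool, meta1, meta2, block, lv1_is_newer):
    """Merge LV2's valid copy of `block` into LV1."""
    if block not in meta2:
        return                                            # nothing valid in LV2 to merge
    if lv1_is_newer and block in meta1:
        pool.pop(("LV2", block))                          # step 42: reclaim LV2's stale copy
    else:
        # Steps 43 / 51: remap from LV2 to LV1. Any older mapping in LV1 is
        # replaced; in the remapping step proper its physical block is freed.
        pool[("LV1", block)] = pool.pop(("LV2", block))
        meta1.add(block)                                  # steps 44 / 52: LV1 now holds valid data
    meta2.discard(block)                                  # LV2 no longer holds valid data


# Hypothetical state: LV1 is the newer snapshot and already has block 0.
pool = {("LV1", 0): "P0", ("LV2", 0): "Q0", ("LV2", 1): "Q1"}
meta1, meta2 = {0}, {0, 1}
for b in sorted(meta2):
    merge_block(pool, meta1, meta2, b, lv1_is_newer=True)
```

Block 0 is reclaimed from LV2 because LV1 already has a newer version, and block 1 is moved to LV1 by remapping alone, so no data is copied in either case.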
II. Thin provisioning module steps
The thin provisioning module steps include the write request step, the read request step, the remapping step, the resource reclamation step, and so on.
1. Write request steps
The steps are as follows:
Step 61: the logical volume of the thin provisioning module receives a write request from the upper layer and looks up the mapping information of the current snapshot logical block in its resource mapping table; if no mapping between the snapshot logical block and a physical resource block is found, this is the first write to the snapshot logical block and step 62 is performed; otherwise step 64 is performed;
Step 62: allocate a resource for the snapshot logical block from the resource pool of the resource management module, and update the free-resource record in the resource pool;
Step 63: establish the mapping between the snapshot logical block and the physical resource block by adding the physical resource block's information, namely the physical storage device on which it resides and its address, to the entry for the snapshot logical block in the logical volume's resource mapping table;
Step 64: forward the write request to the physical storage module;
Step 65: after the write to the physical storage module returns, return to the upper layer.
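Steps 61 to 65 amount to allocate-on-first-write. The sketch below illustrates this under simplifying assumptions: `ThinVolume` is a hypothetical class, the shared resource pool is a plain list of free physical block ids handed out in order, and a dictionary stands in for the physical storage module.

```python
class ThinVolume:
    """Hypothetical thin-provisioned logical volume."""

    def __init__(self, free, disk):
        self.free = free      # shared pool: list of free physical block ids
        self.disk = disk      # stand-in for the physical storage module
        self.mapping = {}     # resource mapping table: logical -> physical

    def write(self, lblock, data):
        if lblock not in self.mapping:               # step 61: no mapping, so first write
            self.mapping[lblock] = self.free.pop(0)  # steps 62-63: allocate and map
        self.disk[self.mapping[lblock]] = data       # step 64: forward to physical storage


free, disk = [10, 11, 12], {}
vol = ThinVolume(free, disk)
vol.write(500, b"x")   # first write to logical block 500
vol.write(3, b"y")     # first write to logical block 3
vol.write(500, b"z")   # overwrite: no new allocation
```

In this sketch the scattered logical blocks 500 and 3 land on adjacent physical blocks 10 and 11, which is one simple way a write-anywhere allocator can turn random logical writes into sequential physical ones.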
2. Read request steps
The steps are as follows:
Step 71: the logical volume of the thin provisioning module receives a read request from the upper layer, looks up the corresponding entry in the resource mapping table, and obtains the physical device and physical address at which the requested data block is stored;
Step 72: redirect the read request to the storage location of the snapshot data block on the physical storage device and forward it to the physical storage module;
Step 73: after the underlying device returns to this module, the result is returned upward layer by layer.
3. Remapping steps
The snapshot layer's distinction between source volumes and snapshot volumes is transparent to the thin provisioning module: it does not distinguish between the snapshot module's source volumes and snapshot volumes, only between the source device and the target device of a remapping operation, i.e. the source logical volume and the target logical volume. The steps of the remapping operation are described below for a single snapshot logical block.
The steps are as follows:
Step 81: the thin provisioning module receives a remapping command and parses its parameters, i.e. the source mapping logical volume, the logical block number, and the target mapping logical volume; assume the source mapping logical volume is LV3, the target mapping logical volume is LV4, and the snapshot logical block to be remapped is B;
Step 82: look up the mapping entry for snapshot logical block B in the resource mapping table of the source mapping logical volume LV3; if the entry associates snapshot logical block B with a physical resource block, obtain that physical resource block P and perform step 83; otherwise perform step 87;
Step 83: look up the mapping entry for snapshot logical block B in the resource mapping table of the target mapping logical volume LV4; if the entry is associated with a physical resource block, obtain that physical resource block Q and perform step 84; otherwise perform step 85;
Step 84: remove the association between snapshot logical block B and physical resource block Q, release physical resource block Q occupied by snapshot logical block B in the target logical volume LV4, and update the free-resource record in the shared resource pool;
Step 85: in the resource mapping table of the target logical volume LV4, associate snapshot logical block B with physical resource block P by modifying the mapping entry of snapshot logical block B and writing into it the information of physical resource block P, namely the physical storage device on which the block resides and its physical address;
Step 86: in the resource mapping table of the source mapping logical volume LV3, remove the association between snapshot logical block B and physical resource block P; the remapping operation is then complete;
Step 87: return to the upper layer.
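The remapping steps 81 to 87 can be sketched as the function below. It is a simplified illustration: `remap`, the `tables` dictionary holding one mapping table per volume, the `free` list, and the physical block ids 100 and 200 are hypothetical, and a physical block is reduced to a single integer rather than a device plus address.

```python
def remap(tables, free, src, dst, block):
    """Move `block`'s physical mapping from volume `src` to volume `dst`."""
    p = tables[src].get(block)       # step 82: find physical block P in the source table
    if p is None:
        return False                 # no mapping in the source: go straight to step 87
    q = tables[dst].get(block)       # step 83: does the target already map this block?
    if q is not None:
        free.append(q)               # step 84: release Q back to the shared pool
    tables[dst][block] = p           # step 85: target entry now points at P
    del tables[src][block]           # step 86: drop the source association
    return True                      # step 87: return to the upper layer


tables = {"LV3": {5: 100}, "LV4": {5: 200}}   # logical block 5 mapped in both volumes
free = []
first = remap(tables, free, "LV3", "LV4", 5)
second = remap(tables, free, "LV3", "LV4", 5)  # source no longer maps the block
```

Only mapping-table entries change hands; the data in physical block 100 is never read or rewritten.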
4. Resource reclamation steps
The steps are as follows:
Step 91: the thin provisioning module receives a resource reclamation command and parses its parameters, i.e. obtains the logical volume and logical block information;
Step 92: look up the mapping entry of the corresponding logical block in the resource mapping table of the logical volume specified by the parameters; if the entry is found, perform step 93; otherwise perform step 95;
Step 93: remove the mapping between the logical block and the physical resource block, and release the mapped physical resource;
Step 94: update the free-resource record of the resource pool;
Step 95: return to the upper layer.
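Resource reclamation (steps 91 to 95) reduces to removing one mapping entry and returning its physical block to the pool. The names `reclaim`, `table`, and `free` in the sketch below are assumptions for illustration.

```python
def reclaim(table, free, block):
    """Unmap `block` and return its physical block to the free pool."""
    p = table.pop(block, None)   # steps 92-93: look up the entry and remove the mapping
    if p is not None:
        free.append(p)           # step 94: update the free-resource record
    return p is not None         # step 95: report back to the upper layer


table = {8: 42}                  # hypothetical mapping table: logical -> physical
free = []
hit = reclaim(table, free, 8)
miss = reclaim(table, free, 8)   # second call finds no mapping and does nothing
```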
The present invention has the following advantages:
First, snapshot semantics are decoupled from physical resource management, and resource management is transparent to the snapshot layer. The thin provisioning system serves as the underlying supporting technology and implements resource management; its core functions include resource allocation, resource mapping, and resource reclamation. The core function of the snapshot layer is to maintain snapshot semantics: maintaining the dependencies between snapshot volumes, managing snapshot-layer metadata (which marks whether a data block has valid data in a snapshot volume's logical space), and implementing the copy-on-write or redirect-on-write logic.
Second, the thin provisioning system follows the allocate-on-demand principle and applies an allocate-on-write policy, which realizes write-anywhere semantics and allows random writes to be optimized into sequential writes, improving write performance.
Third, separating resource management from snapshot semantics solves the problem of combining copy-on-write and redirect-on-write with on-demand resource allocation, and at the same time nests copy-on-write and redirect-on-write with the write-anywhere mechanism, as shown in Figure 4.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410077212.XA CN103942011B (en) | 2014-03-04 | 2014-03-04 | A kind of residual quantity fast photographic system and its application method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410077212.XA CN103942011B (en) | 2014-03-04 | 2014-03-04 | A kind of residual quantity fast photographic system and its application method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103942011A CN103942011A (en) | 2014-07-23 |
CN103942011B true CN103942011B (en) | 2017-06-09 |
Family
ID=51189692
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410077212.XA Expired - Fee Related CN103942011B (en) | 2014-03-04 | 2014-03-04 | A kind of residual quantity fast photographic system and its application method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103942011B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104407935B (en) * | 2014-11-07 | 2018-05-18 | 华为数字技术(成都)有限公司 | Snapshot rollback method and storage device |
CN105808449B (en) * | 2014-12-31 | 2018-11-27 | 中国电信股份有限公司 | A kind of virtual memory image method for edition management and system for virtual machine |
CN105988895B (en) * | 2015-02-10 | 2020-11-03 | 中兴通讯股份有限公司 | Snapshot processing method and device |
CN105278878B (en) * | 2015-09-30 | 2018-09-21 | 成都华为技术有限公司 | A kind of disk space distribution method and device |
CN107357928B (en) * | 2017-07-26 | 2020-09-18 | 苏州浪潮智能科技有限公司 | Method and system for realizing snapshot storage |
CN107491363A (en) * | 2017-08-24 | 2017-12-19 | 郑州云海信息技术有限公司 | A kind of Snapshot Method and device of the storage volume based on linux kernel |
CN109598156B (en) * | 2018-11-19 | 2023-04-11 | 杭州信核数据科技股份有限公司 | Method for redirecting engine snapshot stream during writing |
CN109739688B (en) * | 2018-12-18 | 2021-01-26 | 杭州宏杉科技股份有限公司 | Snapshot resource space management method and device and electronic equipment |
CN110502187B (en) * | 2019-07-09 | 2020-12-04 | 华为技术有限公司 | Snapshot rollback method and device |
CN110941511B (en) * | 2019-11-21 | 2023-03-21 | 深信服科技股份有限公司 | Snapshot merging method, device, equipment and storage medium |
CN113157199A (en) * | 2020-01-22 | 2021-07-23 | 阿里巴巴集团控股有限公司 | Snapshot occupation space calculation method and device, electronic equipment and storage medium |
CN111552437B (en) * | 2020-04-22 | 2024-03-15 | 上海天玑科技股份有限公司 | Snapshot method and snapshot device applied to distributed storage system |
CN114816250B (en) * | 2022-04-15 | 2022-10-25 | 北京志凌海纳科技有限公司 | Continuous data protection method under distributed storage system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102163177A (en) * | 2010-02-24 | 2011-08-24 | 株式会社日立制作所 | Reduction of i/o latency for writable copy-on-write snapshot function |
CN102915275A (en) * | 2012-09-13 | 2013-02-06 | 曙光信息产业(北京)有限公司 | Thin provisioning system |
CN103080894A (en) * | 2010-12-28 | 2013-05-01 | 株式会社日立制作所 | Storage system, management method of the storage system, and program |
CN103561098A (en) * | 2013-11-05 | 2014-02-05 | 华为技术有限公司 | Method, device and system for selecting storage resources |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170609 |