WO2016095761A1

WO2016095761A1 - Cache processing method and apparatus

Info

Publication number: WO2016095761A1
Application number: PCT/CN2015/097176
Authority: WO
Inventors: 李明君
Original assignee: 华为技术有限公司
Priority date: 2014-12-16
Filing date: 2015-12-11
Publication date: 2016-06-23
Also published as: CN104503703B; CN104503703A

Abstract

A cache processing method and apparatus, the method comprising: receiving an input-output-IO-read-request, according to the IO-read-request, reading small data block from a parent mirror (S101); if a first large data block containing the small data block is not in a cache mirror, checking whether storage space of the cache mirror has no space or an occupancy ratio of the storage space is greater than or equal to a preset ratio threshold (S102); if so, replacing a second large data block of the cache mirror with the first large data block, and writing the small data block into the storage space corresponding to the first large data block in the cache mirror (S103), replacing the dated second large data block with a new first large data block, and writing the small data block needed to be written into the cache mirror to effectively increase utilization rate of the storage space.

Description

Cache processing method and device

Technical field

The embodiments of the present invention relate to a storage technology, and in particular, to a cache processing method and apparatus.

Background technique

The desktop cloud separates the personal computer desktop environment from the physical machine through the cloud computing mode, and becomes a service that can provide desktops externally.

FIG. 1 is a schematic diagram of an application scenario of a desktop cloud in the prior art. As shown in FIG. 1 , the application scenario includes a parent image 1 , a cache image 2 , a sub mirror 3 , and a virtual machine 4 . The cache image 2 link is cloned in the parent image 1 and the sub mirror 3 link is cloned in the cache image 2 . In this scenario, as shown in FIG. 1, for the read/write IO in the virtual machine 4, the write IO directly writes the data block into the sub-image 3; the read IO first reads the data block into the sub-image, if the data block If it exists in the submirror, it is read in the submirror; if the data block does not exist in the submirror, it is read in the cache image; if the data block does not exist in the cache image, it is read in the parent image. The read data block is written to the cache image so that the data block can be read from the cache image the next time the same data block is read.

However, the parent image, the cache image, and the sub-mirror are in the virtual hard disk format (VHD). The maximum size of the VHD file does not exceed the size of the local storage file system, and the VHD cannot perform the data block replacement function. For example, the data blocks read from the parent image can only be written to the cache image in the same way. If the local storage file system space where the cache image resides is used up, the new data block cannot be written to the cache. In the image, some data in the cache image may not be read later or rarely, so the proportion of useful data blocks in the cache image is reduced, resulting in wasted cache space.

Summary of the invention

The embodiment of the invention provides a method and a device for processing a cache, which replaces the second largest data block that is aged with a new first large data block, and then writes the small data block that needs to be written into the cache image, thereby effectively improving The utilization of storage space.

A first aspect of the embodiments of the present invention provides a method for processing a cache, including:

Receiving a read input and output IO request, and reading a small data block from the parent image according to the read IO request;

If the first large data block that includes the small data block is not in the cache image, check whether the storage space of the cache image is full or whether the occupation ratio of the storage space is greater than or equal to a preset ratio threshold;

If the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to the preset ratio threshold, replacing the first large data block with the second largest data block in the cache image, And writing the small data block to a storage space corresponding to the first large data block in the cache image.

In a first possible implementation manner of the first aspect, after the reading the small data block from the parent image, the method further includes:

Adding an access count value of the first large data block containing the small data block by one;

If the first large data block that includes the small data block is not in the cache image, check whether the storage space of the cache image is full or whether the occupied ratio of the storage space is greater than or equal to a preset ratio threshold. Previously, the method further includes:

The identifier of the first large data block and the corresponding access count value are written into the management linked list.

In conjunction with the first possible implementation of the first aspect, in a second possible implementation manner of the first aspect, the method further includes:

Adding the first large data block to the cache image and writing the small data block if the storage space of the cache image is not full or the occupation ratio of the storage space is less than the preset ratio threshold. The storage space corresponding to the first large data block in the cache image.

In conjunction with the first possible implementation of the first aspect, in a third possible implementation of the first aspect, the replacing the first large data block with the second largest data block in the cache image, and The small data block is written into the storage space corresponding to the first large data block in the cache image, and specifically includes:

Determining, according to the access count value of the first large data block, starting from the header position of the management linked list, acquiring the second large data block, and determining the identifier and the access count value of the second large data block from Removing the management linked list, and writing the identifier of the first large data block and the access count value to the identifier of the third large data block in the management linked list; wherein the second large data block is In the management In the linked list, the access data is less than or equal to the first data block of the access count value of the first large data block; and the third large data block is in the management linked list after the identifier of the second large data block is deleted. Accessing a first data block whose count value is greater than or equal to an access count value of the first large data block;

Deactivating a location flag corresponding to the second largest data block in the block allocation table BAT, and storing a starting offset position of the first large data block to a location corresponding to the first large data block in the BAT; The starting offset position of the first large data block is the starting offset position of the second large data block;

Writing the small data block into a storage space corresponding to a starting offset position of the first large data block in the cache image.

In conjunction with the second possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the first large data block is added to the cache image, and the small data block is written The storage space corresponding to the first large data block in the cache image includes:

And writing an identifier of the first large data block and an access count value into the management linked list according to an access count value of the first large data block;

Finding a free block in the cache image, and writing the small data block to the free block; wherein the free block is a storage space corresponding to the first large data block;

Writing a starting offset position of the free block to a location corresponding to the free block in the BAT.

In conjunction with the first possible implementation of the first aspect, in a fifth possible implementation manner of the first aspect, if the first large data block is in a cache image, the management is performed according to the access count value. Updating a location of the identifier of the first large data block and the access count value in a linked list, and writing the small data block to a storage space corresponding to the first large data block in the cache image.

A second aspect of the embodiments of the present invention provides a cache processing apparatus, including:

a reading module, configured to receive a read input and output IO request, and read a small data block from the parent image according to the read IO request;

The checking module is configured to: if the first large data block that includes the small data block is not in the cache image, check whether the storage space of the cache image is full or whether the occupied ratio of the storage space is greater than or equal to a preset ratio Threshold value

a processing module, if the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to the preset ratio threshold, replacing the first large data block with the first one of the cache image Two large data blocks, and writing the small data block to the first big data in the cache image The storage space corresponding to the block.

In a first possible implementation manner of the second aspect, the reading module is further configured to increase an access count value of the first large data block that includes the small data block by one;

The processing module is further configured to write the identifier of the first large data block and the corresponding access count value into the management linked list.

With reference to the first possible implementation of the second aspect, in a second possible implementation manner of the second aspect, the processing module is further configured to: if the storage space of the cache image is not full or the proportion of the storage space is occupied If the threshold is smaller than the preset ratio, the first large data block is added to the cache image, and the small data block is written into a storage space corresponding to the first large data block in the cache image.

With reference to the first possible implementation of the second aspect, in a third possible implementation manner of the second aspect, the processing module includes:

a second large data block obtaining unit, configured to sequentially search from the header position of the management linked list according to an access count value of the first large data block, acquire the second large data block, and use the second The identification and access count value of the big data block is removed from the management linked list, and the identifier and the access count value of the first large data block are written before the identifier of the third large data block in the management linked list; The second large data block is the first data block in the management linked list, and the access count value is less than or equal to the access count value of the first large data block; the third large data block is deleted. In the management chain table after the identification of the second largest data block, the access data is greater than or equal to the first data block of the access count value of the first large data block;

a location marking unit, configured to invalidate a location identifier corresponding to the second large data block in the BAT, and store a starting offset location of the first large data block to a location corresponding to the first large data block in the BAT; The starting offset position of the first large data block is a starting offset position of the second large data block;

And a small data block writing unit, configured to write the small data block into a storage space corresponding to a starting offset position of the first large data block in the cache image.

With reference to the second possible implementation of the second aspect, in a fourth possible implementation manner of the second aspect, the processing module includes:

An identifier writing unit, configured to write an identifier of the first large data block and an access count value into the management linked list according to an access count value of the first large data block;

a free block obtaining unit, configured to find a free block in the cache image, and write the small data block into the free block; wherein the free block is a storage corresponding to the first large data block Empty between;

And a location writing unit, configured to write a starting offset position of the free block to a location corresponding to the free block in the BAT.

With reference to the first possible implementation of the second aspect, in a fifth possible implementation manner of the second aspect, the processing module includes:

And an update unit, configured to: when the first large data block is in the cache image, update the location of the identifier of the first large data block and the access count in the management link table according to the access count value And storing the small data block into a storage space corresponding to the first large data block in the cache image.

The processing method of the cache provided in this embodiment receives a read input and output IO request, and reads a small data block from the parent image according to the read IO request, and if the first large data block including the small data block is not in the cache image, Check whether the storage space of the cache mirror is full or the proportion of storage space is greater than or equal to the preset ratio threshold. If the storage space of the cache mirror is full or the storage ratio is greater than or equal to the preset ratio threshold, it will be the first. The big data block replaces the second largest data block in the cache image, and writes the small data block into the storage space corresponding to the first largest data block in the cache image. Compared with the prior art, when the storage space of the cache image is used up, the new data block can no longer be written into the cache image. In this embodiment, when the storage space in the cache image is full. Or, when the occupation ratio of the storage space is greater than or equal to the preset ratio threshold, replace the second largest data block in the cache image with the first large data block, and write the small data block into the storage corresponding to the second largest data block in the cache image. Space, that is, replace the aging second largest data block with the new first large data block, and then write the small data block that needs to be written into the cache image, which effectively improves the utilization of the storage space.

DRAWINGS

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, a brief description of the drawings used in the embodiments or the prior art description will be briefly described below. Obviously, the drawings in the following description It is a certain embodiment of the present invention, and other drawings can be obtained from those skilled in the art without any inventive labor.

FIG. 1 is a schematic diagram of an application scenario of a desktop cloud in the prior art;

2 is a flowchart of a method for processing a cache according to Embodiment 1 of the present invention;

3 is a flowchart of a method for processing a cache according to Embodiment 2 of the present invention;

4 is a schematic diagram of a management chain table before the identification of the first big data block moves;

5 is a schematic diagram of a management chain table after the identification of the first large data block is moved;

Figure 6 is a schematic diagram of bit chart management;

FIG. 7 is a flowchart of a method for processing a cache according to Embodiment 3 of the present invention;

8 is a schematic diagram of a management link table before the first large data block enters the cache image in the third embodiment;

9 is a schematic diagram of a management chain table after the first large data block enters the cache image in the third embodiment;

10 is a flowchart of a method for processing a cache according to Embodiment 4 of the present invention;

11 is a schematic diagram of a management chain table before the first large data block enters the cache image in the fourth embodiment;

12 is a schematic diagram of a management chain table after the first large data block enters the cache image in the fourth embodiment;

FIG. 13 is a schematic structural diagram of a cache processing apparatus according to Embodiment 5 of the present invention; FIG.

14 is a schematic structural diagram of a cache processing apparatus according to Embodiment 6 of the present invention;

15 is a schematic structural diagram of a cache processing apparatus according to Embodiment 7 of the present invention;

FIG. 16 is a schematic structural diagram of a cache processing apparatus according to Embodiment 8 of the present invention.

detailed description

The technical solutions in the embodiments of the present invention will be clearly and completely described in conjunction with the drawings in the embodiments of the present invention. It is a partial embodiment of the invention, and not all of the embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.

FIG. 2 is a flowchart of a method for processing a cache according to Embodiment 1 of the present invention. In this embodiment, the executor may be a central processing unit (CPU), a server, a physical host, a terminal device, etc., and is not limited thereto. As shown in Figure 1, the method includes the following steps:

Step 101: Receive an Input and Output (IO) request, and read a small data block from the parent image according to the read IO request.

In this embodiment, when a read IO request is received, and none of the sub-mirrors and the cache image are missed according to the read IO, that is, no small data blocks are read in the sub-mirror and the cache image according to the read IO, The small data block needs to be read from the parent image according to the read IO request. The read IO request may include an identifier of the small data block, for example, a number of the small data block or a starting offset address of the small data block storage. A large data block can be divided into multiple small data blocks. For example, 512B can be used as a small data block and 2MB as a large data block. A 2MB data block can also be used as a small data block, and a 20 GB data block can be used as a big data block. The block can be set according to actual conditions, and is not limited in the present invention.

Step 102: If the first large data block that includes the small data block is not in the cache image, check whether the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to a preset ratio threshold.

In this embodiment, the first large data block including the small data block is not in the cache image, and it is indicated that any small data block in the first large data block is stored in the storage space of the cache image, if the first big data At least one small data block in the block is stored in the storage space of the cache image, and the first large data block is in the cache image. The storage space of the cache image can be a local high-speed storage device. For example, the storage speed and the input and output per second (IOPS) are much larger than those of the ordinary hard disk. It can be either a memory or a solid state disk. A storage device that can perform high-speed interoperability, such as Solid State Disk (SSD), can select a suitable storage device for caching from the perspective of cost, performance, and life cycle.

Step 103: If the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to a preset ratio threshold, replace the first large data block with the second largest data block in the cache image, and write the small data block. The storage space corresponding to the first largest data block in the cache image.

In this embodiment, if the storage space of the cached image is full, that is, the storage space cannot be stored in an additional manner, or if the storage ratio of the cached image is greater than or equal to a preset ratio threshold, for example, When the occupation ratio of the storage space exceeds 80%, the first large data block is replaced with the second largest data block in the cache image, and the small data block is written into the storage space corresponding to the first large data block in the cache image. The second largest data block is a large data block in the cache image that is accessed relatively small times, or a large data block that has not been accessed within a preset time period, or may be accessed less than or equal to the number of times. A large block of data that is accessed by a large block of data.

Optionally, in this embodiment, a preset cache replacement policy may be used to replace the second largest data block in the cache image with the first large data block. For example, the cache replacement policy may be a Least Recently Used (LRU) policy, or a Least Frequently Used (LFU) policy, or other policies, and those skilled in the art may In terms of characteristics and usage scenarios, an appropriate cache replacement strategy is selected, which is not limited in this invention.

It should be noted that, in this embodiment, instead of replacing the second largest data block in the cache image with the data in the first large data block, an aging time with a relatively small number of times of being accessed is found in the cache image. After the large data block, the aging large data block is used as the second largest data block, and the storage space of the second largest data block in the cache image is reserved to the first large data block, and the small data in the first large data block is used. After the block is read, the small data block is written to the storage space corresponding to the original second largest data block in the cache image.

FIG. 3 is a flowchart of a method for processing a cache according to Embodiment 2 of the present invention. As shown in FIG. 3, the method includes the following steps:

Step 201: Receive a read input and output IO request, and read a small data block from the parent image according to the read IO request.

Step 202: Add an access count value of the first large data block including the small data block by one.

In this embodiment, the access count value is used to record the number of times the first large data block is accessed, when there is a small data block in the first large data block, regardless of whether the first large data block is in the cache image, Add 1 to the access count value of the first large data block.

Step 203: Determine whether the first large data block that includes the small data block is in the cache image. If yes, that is, the first large data block that includes the small data block is in the cache image, step 208 is performed; if not, the small data is included. If the first large data block of the block is not in the cache image, step 204 is performed.

Step 204: Write the identifier of the first large data block and the corresponding access count value into the management linked list.

In this embodiment, the management linked list is used to manage large data blocks in the cache image and store big data. The identifier of the block and the access count value of the big data block, and the management chain table may specifically be an LFU linked list. In the management list, the head node points to the most recent big block with the smallest access count, and the tail point points to the oldest block with the largest access count. If the identifier of the first large data block exists in the management linked list, the first data block is in the cache image, that is, the small data block in the first large data block is stored in the storage space of the cache image, if there is no management list The identifier of the first large data block is not in the cache image, that is, no small data block in the first large data block is stored in the storage space of the cache image.

When the small data block in the first large data block is accessed, and the first large data block is not in the cache image, the identifier of the first large data block and the corresponding access count value are written into the management linked list, indicating that the first largest The data block is in the cache image. In the initial state, the access count value of all large data blocks is set to 0, and these large data blocks are managed by the management linked list. If there are small data blocks in the large data block into the cache image, that is, the small data blocks included in the first large data block. When it needs to be stored in the storage space of the cache image after being read, the first big block identifier and the access count value are added to the management list.

Step 205: Check whether the storage space of the cached image is full or the proportion of the storage space is greater than or equal to a preset ratio threshold. If yes, the storage space of the cached image is full or the occupied proportion of the storage space is greater than or equal to a preset ratio threshold. Then, step 206 is performed; if no, that is, the storage space of the cache image is not full or the occupation ratio of the storage space is less than the preset ratio threshold, step 207 is performed.

Step 206: Replace the first large data block with the second largest data block in the cache image, and write the small data block into the storage space corresponding to the first large data block in the cache image.

Step 207: Add the first large data block to the cache image, and write the small data block into the storage space corresponding to the first large data block in the cache image. End.

In this embodiment, the first large data block can be added to the cache image by using two methods, one is preemption in advance, and the other is dynamic allocation. The preemption method is to occupy the cache space in advance when the first large data block is cached and added to the cache, regardless of whether the small data block in the first large data block can be accessed and stored in the storage space of the cache image. The space required for the first largest data block in the middle (such as 2MB); the dynamic allocation method allocates only the space required for small data blocks in the storage space (such as several 512B) when the first large data block is added to the cache image. This way you can make more useful small data blocks into the cache image.

Step 208: Update, according to the access count value, a location and an access count value of the identifier of the first large data block in the management linked list, and write the small data block into the storage space corresponding to the first large data block in the cache image. between.

In this embodiment, when the first large data block is in the cache image, that is, the management link list has the identifier and the access count value of the first large data block, according to the access count value, in the management linked list, from the first big data. The location of the identified identity of the block begins, the first large data block whose access count value is less than or equal to the access count value of the first large data block is found, and the identifier of the first large data block and the access count value are written to the first large Before the data block.

4 is a schematic diagram of a management chain table before the identification of the first large data block moves, and FIG. 5 is a schematic diagram of the management chain table after the identification of the first large data block moves. As shown in Figure 4 and Figure 5, there are three big data blocks in the cache image: big data block 1 (BLOCK1), big data block 6 (BLOCK6), and big data block 8 (BLOCK8). Their access count values are respectively The access count value of BLOCK1 is 3, the access count value of BLOCK6 is 2, and the access count value of BLOCK8 is 2. When BLOCK8 is accessed again, its access count value is incremented to 3, and the access count value of BLOCK8 is already greater than BLOCK6. In the management list, BLOCK8 is still in front of BLOCK6, then BLOCK8 needs to be moved in the management list. . At the current position of BLOCK8, look backwards to find the first data block whose access count value is greater than or equal to 3, in this case BLOCK1, as shown in Figure 5, move BLOCK8 to BLOCK1 in the management list.

Optionally, in this embodiment, a small data block in a large data block may also be managed by using a bitmap, for example, a 2 MB data block is used as a large data block, and a 512B data block is used as a small data block, The 2MB big data block can be further divided into 4096 512B small data blocks. These 512B data blocks can be managed by using a bit chart. Each 512B in the bit chart is managed with 1 bit. 6 is a schematic diagram of bit chart management. As shown in FIG. 6, the bitmap shows the management diagram of two 2MB data blocks. If the corresponding bit value is 0, it means that the corresponding 512B is not in the storage space of the cache image. Conversely, if the corresponding bit value is 1, it means that the corresponding 512B is in the storage space of the cache image. It should be noted that FIG. 6 only shows eight 512B data blocks, and all 512B data blocks in one 2MB data block are not used. The actual operation is subject to actual conditions.

In this embodiment, before the small data block is written into the storage space of the cache image, the corresponding position in the bitmap is 0. After the small data block is written into the storage space corresponding to the first large data block in the cache image, The position corresponding to the small data block in the bit chart is updated to 1.

The processing method of the cache provided in this embodiment receives the read input and output IO request, and reads the small data block from the parent image according to the read IO request, and adds the access count value of the first large data block to 1, if the first The big data block is in the cache image, and according to the access count value, the location and the access count value of the identifier of the first large data block are updated in the management linked list, and the small data block is written into the first large data block in the cache image. The storage space, if the first large data block is not in the cache image, write the identifier of the first large data block and the corresponding access count value into the management chain table, and check whether the storage space of the cache image is full or the storage space is occupied. Whether the ratio is greater than or equal to the preset ratio threshold. If the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to the preset ratio threshold, the first large data block is replaced with the second largest data block in the cache image. And the small data block is written into the storage space corresponding to the first large data block in the cache image. If the storage space of the cache image is not full or the storage space occupancy ratio is less than the preset ratio threshold, if not, the storage space of the cache image is not The occupancy ratio of full or storage space is less than the preset ratio threshold. Compared with the prior art, when the storage space of the cache image is used up, the new data block can no longer be written into the cache image. In this embodiment, not only the new first large data block can be implemented. Replace the aging second largest data block, and then write the small data block that needs to be written into the cache image, which effectively improves the utilization of the storage space. Moreover, the management chain table is used to manage the large data blocks in the cache image according to the big data. The access count value of the block can quickly and easily find the aging data block, and improve the efficiency of the replacement, join and data block identification of the big data block in the linked list.

FIG. 7 is a flowchart of a method for processing a cache according to Embodiment 3 of the present invention. On the basis of the foregoing embodiment 2, the step of “replace the first large data block with the second largest data block in the cache image and write the small data block into the storage space corresponding to the first large data block in the cache image” The implementation includes the following steps:

Step 301: According to the access count value of the first big data block, search from the header position of the management linked list, obtain the second largest data block, and remove the identifier and the access count value of the second large data block from the management linked list. And write the identity and access count value of the first large data block before the identity of the third big data block in the management linked list.

The second largest data block is the first data block in the management linked list, and the access count value is less than or equal to the access count value of the first large data block; the third largest data block is the identifier of the second largest data block deleted. In the subsequent management list, the access count value is greater than or equal to the first data block of the access count value of the first large data block.

In this embodiment, in the management linked list, according to the access count value of the first large data block, starting from the position of the header, the data block whose first access count value is less than or equal to the access count value of the second data is found. The data block is used as the second largest data block, and the identifier and the access count value of the second largest data block are removed from the management linked list, indicating that the second large data block does not exist in the cache image, and the first big data is The identity of the block and the access count value are written into the management list, indicating that the first large block of data exists in the cache image.

FIG. 8 is a schematic diagram of a management link table before the first large data block enters the cache image in the third embodiment, and FIG. 9 is a schematic diagram of the management link table after the first large data block enters the cache image in the third embodiment. As shown in FIG. 8, the management list includes data block 3 (BLOCK3), data block 6 (BLOCK6), data block 1 (BLOCK1), and data block 8 (BLOCK8). The access count value of BLOCK3 is 5, and the access of BLOCK6 is as shown in FIG. The count value is 7, the access count value of BLOCK1 is 7, and the access count value of BLOCK8 is 8. In the other data blocks except the cache image, the first large data block is BLOCK5 and the access count value is 5. When BLOCK5 is accessed again, its access count value is increased to 1 and the access count value of BLOCK5 is already greater than BLOCK3. That is, BLOCK3 is the data block whose first access count value is less than BLOCK5, then BLOCK3 is the second largest. Data block, and in the management linked list, BLOCK5 is not in the linked list, that is, the first large data block is not in the cache image, then it needs to be replaced in the management linked list, as shown in Figure 9, BLOCK3 is removed from the linked list, and BLOCK5 joins the management list. When adding BLOCK5 to the management linked list, look up from the beginning to find the first data block with the access count value greater than or equal to 6. In this case, BLOCK6, that is, BLOCK6 is the fourth data block, then add BLOCK5 to the management list. Before BLOCK6.

It should be noted that, in this embodiment, the access count value of the identifier of more data blocks may be included in the management linked list, and is not limited to FIG. 8 and FIG. 9 .

Step 302: Disable the location tag corresponding to the second large data block in the BAT, and store the starting offset location of the first large data block to a location corresponding to the first large data block in the BAT.

The starting offset position of the first large data block is the starting offset position of the second large data block.

In this embodiment, 2 MB data blocks are used as big data blocks, 512B data blocks are used as small data blocks, and BAT is used to manage these 2 MB blocks, and BAT stores the start of each 2 MB data block in the storage space. Offset position. For example, for a VHD virtual hard disk file with a capacity of 20 GB, the number of BAT entries is 10,240. If the 5th 2MB data block is stored at the location where the VHD virtual hard disk file offset address is 100M, the fifth in the BAT. The value at the table entry is 100M. If an item in the BAT is 0XFFFFFFFF, it means that the data block does not exist in the cache image.

Table 1 is the BAT before the first large data block is added to the cache image in the third embodiment. Table 2 is the BAT after the first large data block is added to the cache image in the third embodiment. As shown in Table 1, the BAT stores the starting position offset addresses of the three data blocks, which are 6M+1.5K, 4M+1K, and 2M+512, respectively, and 0M indicates the storage space corresponding to the BAT. Relative starting offset position, not the actual starting point in the storage space Offset position. In this embodiment, the starting offset position of the second large data block is 6M+1.5K, and the third entry in the BAT is marked as 0XFFFFFFFF, indicating that the third data is not in the cache image, and then The initial offset position is written into the fifth entry of the BAT written by 6M+1.5K. As shown in Table 2, the fifth entry in the BAT corresponds to the starting offset position of the first large data block. It is indicated that the first large data block exists in the cache image, and the storage space of the first large data block is the storage space of the original second largest data block.

Table 1

OMOM
OXFFFFFFFFOXFFFFFFFF
6M+1.5K6M+1.5K
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
4M+1K4M+1K
OXFFFFFFFFOXFFFFFFFF
2M+5122M+512

Table 2

OMOM
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
6M+1.5K6M+1.5K
4M+1K4M+1K
OXFFFFFFFFOXFFFFFFFF
2M+5122M+512

Step 303: Write the small data block into the storage space corresponding to the starting offset position of the first large data block in the cache image.

In this embodiment, each small data block has a corresponding position in the first large data block, and the first large data block occupies the storage space of the original second largest data block, if the small size needs to be written If the data block has a data block storage at a position corresponding to the original second largest data block, the data block in the original corresponding position is directly replaced by the small data block; if the small data block to be written is in the original second largest data block If there is no data block storage in the corresponding location, the small data block needs to be allocated a storage space in the corresponding storage location to store the small data block in the allocated storage space.

It should be noted that, in this embodiment, as shown in Table 1 and Table 2, the third item in the entry corresponds to the second largest data block, and is marked as 0XFFFFFFFF, indicating that the second large data block is invalid, and may be This location stores the new data block, but in fact the small data block that was originally accessed in the second largest data block is still stored in the storage space of the cache image. For example, in FIG. 6, if the data block 1 is the original second largest data block, after the second large data block is replaced by the first large data block, although the BAT entry corresponding to the second largest data block is marked as 0XFFFFFFFF, However, the third 512B small data block is still stored in the storage space. When the small data block of the first data block is written into the cache image, if the small data block of the first data block is stored in the data block, The position of the third 512B data block of the bit chart directly replaces the third 512B data block of the original second largest data block with the small data block of the first data block; if the small data block of the first data block The storage location is the location of the second 512B data block in the bitmap, but the location is 0, that is, there is no storage space at the location, and the storage space needs to be allocated in the location of the second 512B data block, and the small data is allocated. The block is stored in and the value of the second position in the corresponding bit chart is set to 1.

The processing method of the cache provided in this embodiment searches in order from the header position of the management linked list according to the access count value of the first large data block, acquires the second largest data block, and counts the identifier and access of the second largest data block. The value is removed from the management linked list, and the identifier of the first large data block and the access count value are written before the identifier of the third large data block in the management linked list, and the location mark corresponding to the second largest data block is invalidated in the BAT. And storing a starting offset position of the first large data block to a position corresponding to the first large data block in the BAT, and writing the small data block to a corresponding offset position of the first large data block in the cache image Storage space, using the management linked list and BAT to manage the data blocks in the cache image, can quickly and easily find the aging large data blocks in the cache image, and replace the old large data blocks with new big data blocks, which will read small The data block is stored in the storage space of the cache image, which not only improves the utilization of the storage space, but also accesses the cache image directly when the next time the small data block is accessed, thereby alleviating the parent image. The pressure stored remotely.

FIG. 10 is a flowchart of a method for processing a cache according to Embodiment 4 of the present invention. On the basis of the foregoing embodiment 2, the specific implementation manner of the step of adding the first large data block to the cache image and writing the small data block to the storage space corresponding to the first large data block in the cache image includes the following steps:

Step 401: Write the identifier of the first large data block and the access count value into the management linked list according to the access count value of the first large data block.

In this embodiment, when the first large data block including the small data block is not in the cache image, and the storage space of the cache image is not full or the occupation ratio of the storage space is less than a preset ratio threshold, according to The access count value of the first large data block starts from the head node of the management linked list, finds the first large data block whose access count value is greater than or equal to the access count value of the first large data block, and the first large data block The identity and access count value are written before the first big data block.

11 is a schematic diagram of a management link table before the first large data block enters the cache image in the fourth embodiment, and FIG. 12 is a schematic diagram of the management link table after the first large data block enters the cache image in the fourth embodiment. As shown in FIG. 11, the cache image has two large data blocks, namely, data block 1 (BLOCK1) and data block 6 (BLOCK6), the access count value of BLOCK1 is 1, and the access count value of BLOCK6 is 3. As shown in FIG. 12, the access count value of the first large data block (BLOCK8) outside the cache image is changed from 1 to 2. When BLOCK8 is added to the management list, starting from the head node, the first access count value is greater than or equal to BLOCK8. The data block of the access count value, here BLOCK6, adds the BLOCK8 identification and access count value to the BLOCK6 in the management list.

Step 402: Find a free block in the cache image and write the small data block to the free block.

The free block is a storage space corresponding to the first large data block.

In this embodiment, a free block is searched in the storage space of the cache image, and the free block is used as a storage space corresponding to the first large data block, and the small data block is written into the free block.

Step 403: Write a starting offset position of the free block to a position corresponding to the free block in the BAT.

In this embodiment, the initial offset position of the free block is written to the entry position corresponding to the free block in the BAT, so that the storage space of the first large data block is found according to the initial offset position during the next access. .

Table 3 is the BAT before the first large data block is added to the cache image in the fourth embodiment. Table 4 is the BAT after the first large data block is added to the cache image in the fourth embodiment. As shown in Table 3, the starting offset position of BLOCK1 is 0M, and the starting offset position of BLOCK6 is 2M+512. As shown in Table 4, when BLOCK8 is added to the cache image, the found offset position of the free block is 4M+1K, and 4M+1K is written to the entry position corresponding to BLOCK8 in the BAT.

table 3

OMOM
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
2M+5122M+512
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF

Table 4

OMOM
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
OXFFFFFFFFOXFFFFFFFF
2M+5122M+512
OXFFFFFFFFOXFFFFFFFF
4M+1K4M+1K

The processing method of the cache provided in this embodiment, when the first large data block including the small data block is not in the cache image, and the storage space of the cache image is not full or the occupation ratio of the storage space is less than a preset ratio threshold, according to the first The access count value of a large data block, the identifier of the first large data block and the access count value are written into the management linked list, and a free block is found in the cache image, and the small data block is written into the free block, and the idle block is idle. The starting offset position of the block is written to the position corresponding to the free block in the BAT, and the management chain table and the BAT are used to manage the data block in the cache image, so that the first large data block can be quickly and conveniently added to the cache image, and the read small The data block is stored in the storage space of the cache image, which not only improves the utilization of the storage space, but also accesses the cache image directly when the next time the small data block is accessed, thereby reducing the pressure on the remote storage where the parent image is located.

FIG. 13 is a schematic structural diagram of a cache processing apparatus according to Embodiment 5 of the present invention. As shown in FIG. 13, the apparatus includes a reading module 11, an inspection module 12, and a processing module 13. The reading module 11 is configured to receive the read input and output IO request and read the small data block from the parent image according to the read IO request. The checking module 12 is configured to check whether the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to a preset ratio threshold if the first large data block that includes the small data block is not in the cache image. The processing module 13 is configured to replace the first large data block with the second largest data block in the cache image, and replace the small data block, if the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to a preset ratio threshold. Writes the storage space corresponding to the first large data block in the cache image.

The device in this embodiment may be used to implement the technical solution of the method embodiment shown in FIG. 2, and the implementation principle and technical effects are similar, and details are not described herein again.

FIG. 14 is a schematic structural diagram of a cache processing apparatus according to Embodiment 6 of the present invention. On the basis of the foregoing embodiment 5, as shown in FIG. 14, the processing module includes an updating unit 21, configured to update the first large data block in the management linked list according to the access count value if the first large data block is in the cache image. The location of the identifier and the access count value, and write the small data block to the cache corresponding to the first largest data block in the cache image. Storage space. The reading module 11 is further configured to increment the access count value of the first large data block containing the small data block by one. The processing module 13 is further configured to write the identifier of the first large data block and the corresponding access count value into the management linked list. The processing module 13 is further configured to add the first large data block to the cache image and write the small data block into the cache image if the storage space of the cache image is not full or the occupation ratio of the storage space is less than a preset ratio threshold. The storage space corresponding to the data block.

The device in this embodiment may be used to implement the technical solution of the method embodiment shown in FIG. 3, and the implementation principle and technical effects are similar, and details are not described herein again.

FIG. 15 is a schematic structural diagram of a cache processing apparatus according to Embodiment 7 of the present invention. On the basis of the above-described sixth embodiment, as shown in FIG. 15, the processing module 13 includes a second large data block acquiring unit 22, a position marking unit 23, and a small data block writing unit 24. The second large data block obtaining unit 22 is configured to search, according to the access count value of the first large data block, from the header position of the management linked list, obtain the second large data block, and identify the second large data block. The access count value is removed from the management linked list, and the identifier of the first large data block and the access count value are written before the identifier of the third large data block in the management linked list; wherein the second largest data block is in the management linked list The first data block whose access count value is less than or equal to the access count value of the first large data block; the third largest data block is the management link list after the identifier of the second largest data block is deleted, and the access count value is greater than or equal to The first data block of the access count value of the first large data block. The location marking unit 23 is configured to invalidate the location tag corresponding to the second large data block in the BAT, and store the starting offset location of the first large data block to a location corresponding to the first large data block in the BAT; The starting offset position of the first large data block is the starting offset position of the second largest data block. The small data block writing unit 24 is configured to write the small data block into the storage space corresponding to the start offset position of the first large data block in the cache image.

The device in this embodiment may be used to implement the technical solution of the method embodiment shown in FIG. 7. The implementation principle and technical effects are similar, and details are not described herein again.

FIG. 16 is a schematic structural diagram of a cache processing apparatus according to Embodiment 8 of the present invention. On the basis of the above-described sixth embodiment, as shown in FIG. 16, the processing module 13 includes an identification writing unit 25, a free block acquiring unit 26, and a position writing unit 27. The identifier writing unit 25 is configured to write the identifier of the first big data block and the access count value into the management linked list according to the access count value of the first large data block. The free block obtaining unit 26 is configured to find a free block in the cache image and write the small data block into the free block; wherein the free block is the storage space corresponding to the first large data block. The position writing unit 27 is for writing the start offset position of the free block to the position corresponding to the free block in the BAT.

The device in this embodiment may be used to implement the technical solution of the method embodiment shown in FIG. 10, and the implementation principle and technical effects are similar, and details are not described herein again.

One of ordinary skill in the art will appreciate that all or part of the steps to implement the various method embodiments described above may be accomplished by hardware associated with the program instructions. The aforementioned program can be stored in a computer readable storage medium. The program, when executed, performs the steps including the foregoing method embodiments; and the foregoing storage medium includes various media that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Finally, it should be noted that the above embodiments are merely illustrative of the technical solutions of the present invention, and are not intended to be limiting; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that The technical solutions described in the foregoing embodiments may be modified, or some or all of the technical features may be equivalently replaced; and the modifications or substitutions do not deviate from the technical solutions of the embodiments of the present invention. range.

Claims

A method for processing a cache, comprising:

Receiving a read input and output IO request, and reading a small data block from the parent image according to the read IO request;

If the first large data block that includes the small data block is not in the cache image, check whether the storage space of the cache image is full or whether the occupation ratio of the storage space is greater than or equal to a preset ratio threshold;

If the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to the preset ratio threshold, replacing the first large data block with the second largest data block in the cache image, And writing the small data block to a storage space corresponding to the first large data block in the cache image.
The method according to claim 1, wherein after the reading the small data block from the parent image, the method further comprises:

Adding an access count value of the first large data block containing the small data block by one;

If the first large data block that includes the small data block is not in the cache image, check whether the storage space of the cache image is full or whether the occupied ratio of the storage space is greater than or equal to a preset ratio threshold. Previously, the method further includes:

The identifier of the first large data block and the corresponding access count value are written into the management linked list.
The method according to claim 1 or 2, further comprising:

Adding the first large data block to the cache image and writing the small data block if the storage space of the cache image is not full or the occupation ratio of the storage space is less than the preset ratio threshold. The storage space corresponding to the first large data block in the cache image.
The method according to claim 2, wherein said replacing said first large data block with a second largest data block in said cache image and writing said small data block into said cache image The storage space corresponding to the first large data block specifically includes:

Determining, according to the access count value of the first large data block, starting from the header position of the management linked list, acquiring the second large data block, and determining the identifier and the access count value of the second large data block from Removing the management linked list, and writing the identifier of the first large data block and the access count value to the identifier of the third large data block in the management linked list; wherein the second large data block is In the management linked list, the access data is less than or equal to the first data block of the access count value of the first large data block; The third large data block is a first data block in the management link table after the identifier of the second large data block is deleted, and the access count value is greater than or equal to the access count value of the first large data block;

Deactivating a location flag corresponding to the second largest data block in the block allocation table BAT, and storing a starting offset position of the first large data block to a location corresponding to the first large data block in the BAT; The starting offset position of the first large data block is the starting offset position of the second large data block;

Writing the small data block into a storage space corresponding to a starting offset position of the first large data block in the cache image.
The method according to claim 3, wherein said adding said first large data block to said cache image and writing said small data block to said first large data block in said cache image Corresponding storage space, including:

And writing an identifier of the first large data block and an access count value into the management linked list according to an access count value of the first large data block;

Finding a free block in the cache image, and writing the small data block to the free block; wherein the free block is a storage space corresponding to the first large data block;

Writing a starting offset position of the free block to a location corresponding to the free block in the BAT.
The method according to claim 2, wherein if the first large data block is in a cache image, updating the identifier of the first large data block in the management linked list according to the access count value a location and the access count value, and writing the small data block to a storage space corresponding to the first large data block in the cache image.
A cache processing device, comprising:

a reading module, configured to receive a read input and output IO request, and read a small data block from the parent image according to the read IO request;

The checking module is configured to: if the first large data block that includes the small data block is not in the cache image, check whether the storage space of the cache image is full or whether the occupied ratio of the storage space is greater than or equal to a preset ratio Threshold value

a processing module, if the storage space of the cache image is full or the occupation ratio of the storage space is greater than or equal to the preset ratio threshold, replacing the first large data block with the first one of the cache image And storing the small data block into a storage space corresponding to the first large data block in the cache image.
The apparatus of claim 7 wherein said reading module is further for The access count value of the first large data block of the small data block is increased by one;

The processing module is further configured to write the identifier of the first large data block and the corresponding access count value into the management linked list.
The device according to claim 7 or 8, wherein the processing module is further configured to: if the storage space of the cache image is not full or the occupation ratio of the storage space is less than the preset ratio threshold, The first large data block is added to the cache image, and the small data block is written into a storage space corresponding to the first large data block in the cache image.
The device of claim 8 wherein the processing module comprises:

a second large data block obtaining unit, configured to sequentially search from the header position of the management linked list according to an access count value of the first large data block, acquire the second large data block, and use the second The identification and access count value of the big data block is removed from the management linked list, and the identifier and the access count value of the first large data block are written before the identifier of the third large data block in the management linked list; The second large data block is the first data block in the management linked list, and the access count value is less than or equal to the access count value of the first large data block; the third large data block is deleted. In the management chain table after the identification of the second largest data block, the access data is greater than or equal to the first data block of the access count value of the first large data block;

a location marking unit, configured to invalidate a location identifier corresponding to the second large data block in the BAT, and store a starting offset location of the first large data block to a location corresponding to the first large data block in the BAT; The starting offset position of the first large data block is a starting offset position of the second large data block;

And a small data block writing unit, configured to write the small data block into a storage space corresponding to a starting offset position of the first large data block in the cache image.
The device according to claim 9, wherein the processing module comprises:

An identifier writing unit, configured to write an identifier of the first large data block and an access count value into the management linked list according to an access count value of the first large data block;

a free block obtaining unit, configured to find a free block in the cache image, and write the small data block into the free block; wherein the free block is a storage corresponding to the first large data block space;

And a location writing unit, configured to write a starting offset position of the free block to a location corresponding to the free block in the BAT.
The method according to claim 8, wherein the processing module further comprises:

And an update unit, configured to: when the first large data block is in the cache image, update the location of the identifier of the first large data block and the access count in the management link table according to the access count value And storing the small data block into a storage space corresponding to the first large data block in the cache image.