WO1993000635A1 - Data storage management systems - Google Patents

Data storage management systems

Info

Publication number
WO1993000635A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
data storage
management system
blocks
storage management
Prior art date
Application number
PCT/GB1992/001137
Other languages
French (fr)
Inventor
Alan Welsh Sinclair
Original Assignee
Anamartic Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anamartic Limited filed Critical Anamartic Limited
Publication of WO1993000635A1 publication Critical patent/WO1993000635A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/1724Details of de-fragmentation performed by the file system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/40Specific encoding of data in memory or cache
    • G06F2212/401Compressed data

Definitions

  • This invention relates to data storage management systems and in particular to systems for storing data which has a variable block size, such as may arise when data compression techniques are employed.
  • Data is held on the physical storage medium in corresponding blocks whose physical addresses can readily be derived from their logical addresses.
  • the logical to physical address mapping requirement constrains the device to operate with a fixed physical block size and for physical and logical addresses to follow the same order.
  • the physical block address is simply a function of the logical address and the physical block size.
  • the physical address space may not be continuous but may accommodate the characteristics of the storage medium.
  • the effective capacity of a storage device can be increased by the use of data compression techniques, such as run-length encoding, which are commonly applied to data transmission to reduce the volume of data.
  • This type of technique can be applied in magnetic tape devices where data is written to and read from the tape in a continuous stream and fast random access of a block of data is not necessary. It can also be applied by either hardware or software means to a file of data to reduce its volume before it is passed to a device driver which interfaces with a storage device and transfers data in units of a fixed blocksize.
  • the logical address of a block of data is translated to a physical address algorithmically, that is, the physical location of the block can be computed from a knowledge of sequential block number, fixed blocksize and the characteristics of the medium.
  • This physical address is defined in terms of cylinder or track and sector and the physical blocks are ordered identically to the logical blocks. Deviation from such a logically sequential ordering of physical blocks would require the use of a lookup table for address translation which, for a high capacity disk, would be of a size which could reasonably be stored only on the disk itself.
  • the seek and latency times for retrieval from magnetic disk of address information from a look up table would unacceptably compromise the performance of the disk system.
  • a logical block is assigned a fixed physical area on a magnetic disk when the disk is first formatted.
  • One object of the present invention is to provide a scheme for efficient management of storage of data having a variable blocksize.
  • Figure 1 shows schematically one example of compression of a logical block of data
  • Figure 2 shows examples of continuous data space, and partitioned data space
  • Figure 3 shows an address map for use with continuous data space
  • Figure 4 shows various options for writing a data block to memory
  • Figure 5 shows various options for relocating a block of stored data
  • Figure 6 shows a block diagram of the control system for the storage management system of the present invention.
  • Figure 7 shows the storage system architecture for a system which includes a memory overflow device.
  • a data compression system is resident in the data path between the device interface, which operates with fixed blocksize, and the memory controller which manages the storage of the resultant blocks of data of variable size.
  • the compression of a logical block of data is shown schematically in figure 1.
  • the logical data block is of fixed size, which may be defined during the initial formatting of the device.
  • the size of the compressed block is variable, and whilst it is typically significantly smaller than the logical block, it may also be larger if the data proves to be uncompressible.
  • the physical blocksize is not infinitely variable but is constrained to be an integral number of capacity units of the storage device which may be designated as tiles.
  • the size of a tile is defined during initial formatting of the device and the maximum blocksize is set to typically 16 tiles. Thus the number of tiles in a block of data to be stored depends on the data compression ratio achieved for that block.
  • a tile is a physically contiguous rectangular area of N memory locations.
  • the addressing scheme for a tile is set out in WO90/09634-A and the arrangement is such that the N memory locations are addressed by N successive counts of an address counter. This is achieved by interleaving at least the higher order address bits between row and column decoders.
  • a tile is not limited to a physically contiguous area of memory but is used to describe a set of consecutive memory addresses which may not be physically contiguous on the memory.
  • the data storage management technique employed has four elements:
  • Data is stored in the memory in the form of data blocks, whose length is variable, in discrete steps. That is to say each block comprises an integral number of smaller fixed capacity units called tiles.
  • a physical block may typically comprise up to 16 tiles.
  • the order in which the blocks are stored in the memory device is not constrained to be the same as their logical order and a block may be located without any fixed relationship to other logical tiles.
  • a look up table is required to translate a logical tile number to a physical address.
  • This address map may be held in a semiconductor memory which is either within the memory controller or is in a dedicated region of the memory device and is accessed for each data block transfer.
  • a record of the logical block number can be stored as a header with each data block so that any block read from physical memory may be identified.
  • the organisation of stored data blocks may be one of the following.
  • data is stored in a continuous area within the storage memory and blocks of differing size can be located adjacently, as shown in figure 2.
  • the memory space can be considered as cyclic and the head of the data space can wrap around to the bottom of the physical memory if it is free.
  • An address map provides the physical start address and physical block length for each logical block number, as shown in figure 3.
  • data is stored within discrete partitioned areas of the memory space according to the length of the data block.
  • all blocks have identical length and one partitioned area is allocated for each possible size of data block, as shown in figure 2.
  • the memory space can be treated as cyclic in the same manner as for the continuous data space.
  • the address map need contain only the physical block number related to the bottom address for the partitioning area, since all blocksizes are identical within the partition.
  • the partitions are of variable size and their boundaries may be moved by a process known as adaptive partitioning in accordance with the actual volume of data stored in each partitioned area. If a partitioned area cannot be expanded further because of data stored in adjacent areas, the partition may be fragmented in two or more non-adjacent areas. This may be achieved by simple management of the address pointers indicating the partition boundaries.
  • Data is written to the storage memory in blocks of variable size.
  • When a block is modified its size may increase, so there may not be sufficient space available at the address it previously occupied for it to be re-stored there. For this reason a write scheme is used which will locate the block at a suitable physical location irrespective of whether the block exists elsewhere in the memory.
  • an entry is made in the address table to define its physical location, thereby mapping its logical to its physical address. If the block previously existed elsewhere in memory, it becomes obsolete and a hole is effectively created in the data space. The start address and length of this hole is entered in a separate hole address table.
  • a head of the data space may be defined as an address immediately above which is an area of free memory which is large enough to guarantee storage of a data block of maximum size.
  • the exact length of the physical block is known before its location in memory need be determined and the block can be located in a hole of the exact size to accommodate the block. Block and hole address tables are updated accordingly.
  • This scheme is similar to the write to exact and best fit schemes, but the hole selected for the write location is the first encountered which is large enough to accommodate the data block.
  • the data block from either the head or tail of the data space is read into a buffer memory and is written to a location where a hole exists in accordance with the write to exact, best or first fit schemes.
  • a hole in the data space is enlarged by relocation of an adjacent data block in accordance with any of the data write schemes.
  • Data may be swapped between the storage memory and an alternative storage device to which the memory controller also has access.
  • This facility may be used to accommodate data overflow which may occur from a mass storage device which operates with variable blocksize.
  • When data compression is employed in a storage device, the variability of the data blocksize as a function of the data characteristics results in the exact storage capacity of the device being undefined and therefore the possibility of a device overflow as a result of a data block write exists.
  • the storage device controller may write the data block to a location on an alternative storage device and record this location in the block address table.
  • the data block may be written in either compressed or uncompressed format.
  • the block may be relocated to the main storage device when sufficient memory space later becomes free.
  • This organisation is particularly suited to the use of a low cost magnetic disk as an overflow storage device for a high cost semiconductor solid state disk storage device.
  • the magnetic disk may also function as a backup non-volatile storage device.
  • Data blocks may also be relocated between the storage device memory and the alternative storage device. This allows data which has been written to the alternative storage device to be restored to the storage memory. It also allows data to be interchanged between the two storage devices in accordance with a cacheing algorithm resident in the storage device controller. This allows a high cost solid state disk to be configured as a high-speed cache memory to a much higher capacity, low cost, magnetic disk memory.
  • Figure 6 shows a block diagram of a data storage device which contains a hardware data compression facility and a storage management system for handling the variable blocksize resulting from data compression.
  • An interface bus 2 couples the system to a host computer and typically conforms to SCSI or other industry standards for peripheral device interfaces. Communications on the bus are controlled by the Host Controller (4) which contains a small buffer memory for transfer of bursts of data on the bus.
  • a data Storage device (6) may typically be either semiconductor memory as in the case of a solid state disk or a magnetic medium in the case of a magnetic disk.
  • a data Controller (8) controls both the Data Storage device and all data transfers and manipulations within the system controller. It contains and executes the algorithms for data relocation and data swapping and performs all address translation and table management functions. The software required for these can be implemented in a straightforward manner by a man skilled in the art.
  • the data compressor (10) performs compression and decompression in real time on data which is transferred to it.
  • Data transfers are routed through a data buffer (12), which can store several blocks of data, via data channels (14) and a DMA Controller (16).
  • An interface Controller (18) provides access to alternative storage devices via another industry standard bus (20). Data transfers will typically occur in units of one block. Data is written from a host via bus 2, to the data storage device 6 via the DMA controller 16, the data buffer 12 and the data controller 8, and is read via the inverse path. Data is relocated within the Data Storage device by reading out to the data buffer 12 and then back to the storage device 6.
  • Data in compressed format is swapped between the Data Storage device and an alternative storage device on bus 20 via the data controller 8, the DMA controller 16, the data buffer 12, and the interface controller 18 and is restored via the inverse path.
  • Interface Controller (18) may operate with a fixed blocksize and may store data in compressed format with the addition of a pad.
  • Data in uncompressed format is swapped by being routed from the data storage device to the bus 20 via the data compressor 10 and restored via the inverse path. Data is transferred directly between a host on Bus A and a backup storage device on Bus B via the host controller 4, the DMA controller 16, the buffer 12 and the interface controller 18.
  • a high performance memory system will use semiconductor memory for the data storage device 6 and may take the form of a solid state disk.
  • the host controller (4) need not operate on an I/O channel and may be coupled directly to the CPU bus of a host computer system.
  • the storage system will normally incorporate a magnetic disk as a backup storage device to provide non-volatility for the stored data, and the system will be configured as shown in figure 7.
  • a solid state disk 24 is connected to a host via bus A and to a backup storage device 26 (e.g. a magnetic disk) and to other storage devices 28 via a bus B.
  • the cost of the additional magnetic disk is low relative to the cost of semiconductor memory. Because semiconductor data storage is relatively expensive in comparison with magnetic storage, optimised use of data compression is very desirable.
  • Figure 7 allows the system to continue operation in the event of a solid state disk overflow with only a small impact on performance.
  • a magnetic disk 26 of capacity at least as large as the maximum logical capacity of the solid state disk 24 is connected as a backup storage device on bus B to provide non-volatility when power is disconnected. If the physical capacity of the solid state disk is filled before its stated logical capacity, the storage system controller may locate additional data blocks which are written by the host on the magnetic disk. This results in a performance degradation rather than a storage system failure, but will occur very infrequently.
  • the frequency of occurrence is a direct function of the logical capacity to physical capacity ratio which is assumed for the solid state disk and it is therefore possible to trade off average input/output rate for the solid state disk against logical capacity.
  • One implementation of the management system operates with the storage system controller configuring the data storage device as a continuous data space.
  • a hole is defined as a gap in the data space which is not guaranteed to be large enough to accommodate a data block of undefined exact size.
  • a head is defined as a gap in the data space which is guaranteed to be large enough to accommodate a data block of undefined exact size.
  • a head may wrap around from above the highest address in the data space to below the lowest address.
  • Read and write of data blocks is given priority by the controller and relocation of data blocks is run as a background task, which can run concurrently if the data bandwidth to the data storage device 6 on channel 14 is greater than that to the host on bus 2.
  • the controller manages data write and data relocation according to the following algorithms.
  • For a medium performance memory system such as a magnetic disk, the characteristics of the storage medium dictate a different choice of options for data storage, data write and data relocation, but the same architecture for the system controller as shown in figure 6 can be applied.
  • the data storage is organised as partitioned data space with variable blocksize.
  • the partition is arranged to be a track or cylinder on the disk and is fixed when the disk is formatted. Sectors need not be formatted within a track or cylinder, therefore each partition has capacity for an unknown number of data blocks of variable blocksize.
  • Each block is written within the partition with an associated header to identify its logical block number and the block address table need only identify the partition within which each logical block is located, that is, the block's track or cylinder number.
  • a block is read by reading a partition until the target block is identified from its header.
  • the access time is the time to locate the partition plus the latency.
  • the average latency is half a rotation of the disk which is the time to read 50% of a partition.
  • Data of variable blocksize is always written within a partition at the head of the data space.
  • the controller selects the partition to which to allocate a block on the basis of the head size available in each partition.
  • the operation of writing a block is combined with the operation of relocating blocks within the partition using the compact partition scheme.
  • the complete data space from the partition is first read into the data buffer and compacted to eliminate any holes which exist.
  • the block being written is then appended to the end of this data space, and the compacted partition written back to the data storage device.
  • a hole address table for each partition is maintained so that obsolete blocks may be identified and eliminated when compaction is performed in the data buffer.
  • the logical number of the block is entered in the hole address table for that other partition.
  • the typical latency is half a rotation of the disk, which is the time to read or write half a track.
  • the variable blocksize management scheme requires two rotations of the disk to perform the compaction of the partition.
  • average seek time for a write is 14 ms and average latency is 6 ms, giving a typical write time of 20 ms.
  • the average increase in disk access time for the variable blocksize management scheme is 18%.
  • variable blocksize storage management techniques support incorporation of a data compression facility within a data storage device in a manner which is transparent to the host system.
  • the system architecture allows use of a magnetic disk as an overflow device for a solid state disk memory with data compression to eliminate system errors resulting from the undefined capacity of the compressed data storage. Compromises can be made between solid state disk capacity and average input/output time.
  • the storage management enables introduction of data compression techniques to magnetic disk at the expense of only a modest increase in device access time.
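The overflow arrangement described in the points above can be illustrated with a minimal Python sketch. All names, the tile-based free-space accounting and the allocator callbacks are illustrative assumptions, not taken from the patent: a block that still fits is placed in the main (solid state) store, and a block that no longer fits is written to the alternative (magnetic) device, with its location recorded in the block address table either way.

```python
# Sketch of solid-state-disk overflow to a magnetic disk (names assumed).
main_free_tiles = 2            # remaining capacity of the main store, in tiles
block_table = {}               # logical block no. -> ("main" | "overflow", location)

def write_block(logical_no, n_tiles, main_alloc, overflow_alloc):
    """Place a block in the main store if it fits, otherwise on the
    alternative device; record the location in the block address table."""
    global main_free_tiles
    if n_tiles <= main_free_tiles:
        main_free_tiles -= n_tiles
        block_table[logical_no] = ("main", main_alloc(n_tiles))
    else:
        block_table[logical_no] = ("overflow", overflow_alloc(n_tiles))

write_block(1, 2, lambda n: 0, lambda n: 500)    # fits: goes to main store
write_block(2, 3, lambda n: 64, lambda n: 500)   # no longer fits: overflows
```

A block written to the overflow device can later be relocated back to the main store simply by rewriting it and updating the same table, which is how the caching configuration described above would operate.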

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data storage management system stores logical blocks of data in a main memory device (6). A data compressor (10) compresses the blocks prior to storage and thereby produces blocks having various sizes. A data controller (8) stores the blocks at physical locations in free memory space. A supplementary memory device (fig. 7, 26) is used for the storage of logical blocks whose physical size exceeds the largest area of free memory space available in the main memory device. Blocks of data are stored within partitions in the memory device, along with an identifying header. A block address table then stores the partition address corresponding to the partition in which the logical block is stored.

Description

DATA STORAGE MANAGEMENT SYSTEMS
This invention relates to data storage management systems and in particular to systems for storing data which has a variable block size, such as may arise when data compression techniques are employed.
In many data storage devices such as magnetic disk, optical disk and magnetic tape data is stored in units of fixed size called blocks, which may typically be of size 512 bytes. Each block is allocated a logical address which uniquely identifies it and the set of logical block addresses forms a continuous series.
Data is held on the physical storage medium in corresponding blocks whose physical addresses can readily be derived from their logical addresses. For a random block access storage device such as a magnetic disk, the logical to physical address mapping requirement constrains the device to operate with a fixed physical block size and for physical and logical addresses to follow the same order. The physical block address is simply a function of the logical address and the physical block size. The physical address space may not be continuous but may accommodate the characteristics of the storage medium.
The effective capacity of a storage device can be increased by the use of data compression techniques, such as run-length encoding, which are commonly applied to data transmission to reduce the volume of data. This type of technique can be applied in magnetic tape devices where data is written to and read from the tape in a continuous stream and fast random access of a block of data is not necessary. It can also be applied by either hardware or software means to a file of data to reduce its volume before it is passed to a device driver which interfaces with a storage device and transfers data in units of a fixed blocksize. However, it has not been possible to incorporate such data compression within the immediate control structure of a random block access storage device such as magnetic disk because of the difficulty of managing the variable physical blocksize which results from compression of a fixed logical block of data.
For a magnetic disk, the logical address of a block of data is translated to a physical address algorithmically, that is, the physical location of the block can be computed from a knowledge of sequential block number, fixed blocksize and the characteristics of the medium. This physical address is defined in terms of cylinder or track and sector and the physical blocks are ordered identically to the logical blocks. Deviation from such a logically sequential ordering of physical blocks would require the use of a lookup table for address translation which, for a high capacity disk, would be of a size which could reasonably be stored only on the disk itself. The seek and latency times for retrieval from magnetic disk of address information from a look up table would unacceptably compromise the performance of the disk system.
Thus a logical block is assigned a fixed physical area on a magnetic disk when the disk is first formatted. This presents a major obstacle to effective use of any technique such as data compression which results in variation in the size of a block of data to be stored. Any increase in size on modification of a previously stored block cannot be tolerated if no spare capacity for the block has been allowed, and the alternative of allocating sufficient storage capacity to a block to accommodate its maximum size for worst case data compression ratio is not viable because of the wide spread of data compression efficiency for different types of data.
Current techniques for incorporation of data compression in magnetic disk storage systems are confined to operation during data transfer to the device driver for the magnetic disk. The device driver accesses the disk with a single fixed blocksize, or with a very limited range of fixed blocksizes if partitioning of the disk is employed.
Another problem which can arise after data compression is the fact that some data is very incompressible and may result in a larger blocksize than that of the original data. Clearly this presents a problem if the compressed blocksize is larger than the blocksize on the memory device.
One object of the present invention is to provide a scheme for efficient management of storage of data having a variable blocksize.
The invention is defined in the appended claims to which reference should now be made.
The invention will now be described in detail by way of example with reference to the accompanying drawings in which:
Figure 1 shows schematically one example of compression of a logical block of data;
Figure 2 shows examples of continuous data space, and partitioned data space;
Figure 3 shows an address map for use with continuous data space;
Figure 4 shows various options for writing a data block to memory;
Figure 5 shows various options for relocating a block of stored data;
Figure 6 shows a block diagram of the control system for the storage management system of the present invention; and
Figure 7 shows the storage system architecture for a system which includes a memory overflow device.
In this description of the invention no specific details of the software used by the management system are given since this can be implemented straightforwardly by a man skilled in the art.
In the context of this invention it is assumed that a data compression system is resident in the data path between the device interface, which operates with fixed blocksize, and the memory controller which manages the storage of the resultant blocks of data of variable size. The compression of a logical block of data is shown schematically in figure 1. The logical data block is of fixed size, which may be defined during the initial formatting of the device. The size of the compressed block is variable, and whilst it is typically significantly smaller than the logical block, it may also be larger if the data proves to be uncompressible. The physical blocksize is not infinitely variable but is constrained to be an integral number of capacity units of the storage device which may be designated as tiles. The size of a tile is defined during initial formatting of the device and the maximum blocksize is set to typically 16 tiles. Thus the number of tiles in a block of data to be stored depends on the data compression ratio achieved for that block.
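The rounding of a compressed block up to a whole number of tiles can be sketched in a few lines of Python; the 512-byte tile size is an illustrative value for the formatting-time parameter, and the 16-tile limit is the typical maximum quoted above.

```python
import math

TILE_SIZE = 512           # bytes per tile (assumed; fixed at formatting time)
MAX_TILES_PER_BLOCK = 16  # typical maximum blocksize from the text

def tiles_needed(compressed_bytes):
    """Number of tiles occupied by a compressed block: the physical
    blocksize is constrained to an integral number of tiles."""
    tiles = math.ceil(compressed_bytes / TILE_SIZE)
    if tiles > MAX_TILES_PER_BLOCK:
        raise ValueError("block exceeds the maximum physical blocksize")
    return tiles

# A 4096-byte logical block compressed to 1300 bytes occupies 3 tiles;
# incompressible data may need more tiles than the uncompressed block would.
```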
A tile is a physically contiguous rectangular area of N memory locations. The addressing scheme for a tile is set out in WO90/09634-A and the arrangement is such that the N memory locations are addressed by N successive counts of an address counter. This is achieved by interleaving at least the higher order address bits between row and column decoders.
For the purposes of the current invention a tile is not limited to a physically contiguous area of memory but is used to describe a set of consecutive memory addresses which may not be physically contiguous on the memory.
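The details of WO90/09634-A are not reproduced here, so the following is only an illustrative Python sketch of the general idea of interleaving address bits between row and column decoders: when alternate bits of a linear count feed the two decoders, N successive counts sweep a compact rectangular area rather than a single long row.

```python
def row_col(count, bits=4):
    """Split a linear address count into row and column by interleaving:
    even-numbered bits drive the column decoder, odd-numbered bits the
    row decoder (an illustrative scheme, not the patented one)."""
    row = col = 0
    for i in range(bits):
        col |= ((count >> (2 * i)) & 1) << i
        row |= ((count >> (2 * i + 1)) & 1) << i
    return row, col

# Counts 0..3 cover a 2x2 square: (0,0), (0,1), (1,0), (1,1);
# counts 0..15 cover a 4x4 square.
```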
The data storage management technique employed has four elements:
(1) the organisation of the storage of data blocks in physical memory and the means of logical to physical address translation
(2) the operation of writing a data block to the storage memory
(3) the relocation of data blocks within the storage memory to eliminate fragmentation of the available memory space
(4) the operation of swapping a data block between the storage memory and another storage device
Data is stored in the memory in the form of data blocks, whose length is variable, in discrete steps. That is to say each block comprises an integral number of smaller fixed capacity units called tiles. A physical block may typically comprise up to 16 tiles. The order in which the blocks are stored in the memory device is not constrained to be the same as their logical order and a block may be located without any fixed relationship to other logical tiles. To accommodate this storage scheme, a look up table is required to translate a logical tile number to a physical address. This address map may be held in a semiconductor memory which is either within the memory controller or is in a dedicated region of the memory device and is accessed for each data block transfer. A record of the logical block number can be stored as a header with each data block so that any block read from physical memory may be identified.
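A minimal Python sketch of the look up table and per-block header just described; the table layout and the 4-byte header width are assumptions for illustration only.

```python
# Address map: logical block number -> (physical start tile, length in tiles).
address_map = {}

def record_block(logical_no, start_tile, n_tiles):
    """Enter a block's physical location in the address map."""
    address_map[logical_no] = (start_tile, n_tiles)

def physical_location(logical_no):
    """Translate a logical block number to its physical address."""
    return address_map[logical_no]

def make_header(logical_no):
    """Header stored with each data block, so that any block read back
    from physical memory can be identified (4-byte width assumed)."""
    return logical_no.to_bytes(4, "big")

record_block(7, 120, 3)   # logical block 7 occupies tiles 120-122
```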
The organisation of stored data blocks may be one of the following.
(i) Continuous data space.
With this organisation, data is stored in a continuous area within the storage memory and blocks of differing size can be located adjacently, as shown in figure 2. The memory space can be considered as cyclic and the head of the data space can wrap around to the bottom of the physical memory if it is free. An address map provides the physical start address and physical block length for each logical block number, as shown in figure 3.
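The cyclic wrap-around of the continuous data space amounts to modular arithmetic on tile numbers, as this small sketch shows; the total memory size is an assumed figure.

```python
MEMORY_TILES = 1024  # total tiles in the data space (illustrative)

def block_tiles(start_tile, length):
    """Tiles occupied by a block; the head of the data space wraps
    around to the bottom of physical memory when it is free."""
    return [(start_tile + i) % MEMORY_TILES for i in range(length)]

# A 5-tile block starting at tile 1022 wraps past the top of memory.
```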
(ii) Partitioned data space, fixed blocksize.
With this organisation, data is stored within discrete partitioned areas of the memory space according to the length of the data block. Within one partitioned area all blocks have identical length and one partitioned area is allocated for each possible size of data block, as shown in figure 2. The memory space can be treated as cyclic in the same manner as for the continuous data space. The address map need contain only the physical block number related to the bottom address for the partitioning area, since all blocksizes are identical within the partition. The partitions are of variable size and their boundaries may be moved by a process known as adaptive partitioning in accordance with the actual volume of data stored in each partitioned area. If a partitioned area cannot be expanded further because of data stored in adjacent areas, the partition may be fragmented in two or more non-adjacent areas. This may be achieved by simple management of the address pointers indicating the partition boundaries.
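Because every block in a fixed-blocksize partition has the same length, the address map can hold a bare block number and the physical address follows directly; a sketch with assumed parameter names:

```python
def physical_address(partition_base, block_number, blocksize_tiles):
    """All blocks within the partition have identical length, so only
    the block number relative to the partition base need be stored."""
    return partition_base + block_number * blocksize_tiles

# In a partition of 4-tile blocks based at tile 100, block 3 starts at 112.
```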
(iii) Partitioned data space, variable blocksize.
With this organisation, data is stored within discrete partitioned areas of the memory space but blocks of differing size can be located adjacently, as shown in figure 2. The partitions are of fixed size and location. This technique is particularly appropriate to data storage on magnetic disk, where a partitioned area could be a track or cylinder on the disk. The address map need contain only the address of the partitioned area at which each logical block number is located.
Data is written to the storage memory in blocks of variable size. When a block is modified its size may increase. For this reason there may not be sufficient space available at the address it previously occupied for it to be re-stored there, and so a write scheme is used which will locate the block at a suitable physical location irrespective of whether the block exists elsewhere in the memory. Whenever a block is written, an entry is made in the address table to define its physical location, thereby mapping its logical to its physical address. If the block previously existed elsewhere in memory, it becomes obsolete and a hole is effectively created in the data space. The start address and length of this hole are entered in a separate hole address table.
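The bookkeeping described here, writing a block anywhere and turning its previous location into a hole, can be sketched as follows; the table layouts are assumptions:

```python
block_table = {}     # logical block number -> (start address, length)
hole_table = []      # start address and length of each hole

def write_block(logical_no, start, length):
    # If the block previously existed elsewhere in memory, that copy
    # becomes obsolete and its space is recorded as a hole.
    if logical_no in block_table:
        hole_table.append(block_table[logical_no])
    block_table[logical_no] = (start, length)

write_block(3, 100, 8)
write_block(3, 200, 10)   # modified block has grown and is re-located
```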
Options for locating a data block during a write operation are shown in figure 4 and are as follows.
(i) Write to head of data space.
Data is written at the first available address above the highest address occupied by data, and an entry is made in the block address table. The length of the data block need not be known prior to starting the write operation and so this scheme does not require prior buffering of the write data stream, unless it is implemented in conjunction with the partitioned data space with fixed blocksize organisation. A head of the data space may be defined as an address immediately above which is an area of free memory which is large enough to guarantee storage of a data block of maximum size. Several heads may therefore exist concurrently within the data space and these can be marked by address pointers in a separate head address table.
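A sketch of the head address table: each entry guarantees room for a maximum-size block, and is retired once it can no longer give that guarantee. Sizes and layout are hypothetical:

```python
MAX_BLOCK = 16                     # maximum block size in tiles (hypothetical)
head_table = [(0, 64), (500, 20)]  # (head address, free tiles above it)

def write_to_head(length):
    # Write at the first head; keep its entry only while the remaining
    # free area can still guarantee storage of a maximum-size block.
    addr, free = head_table[0]
    remaining = free - length
    if remaining >= MAX_BLOCK:
        head_table[0] = (addr + length, remaining)
    else:
        del head_table[0]
    return addr
```

Note that the block length is only needed after the write, to advance the head, which is why no prior buffering of the write data stream is required.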
(ii) Write to exact fit.
If the data stream is buffered prior to starting the write operation, the exact length of the physical block is known before its location in memory need be determined and the block can be located in a hole of the exact size to accommodate the block. Block and hole address tables are updated accordingly.
(iii) Write to best fit.
This scheme is similar to the write to exact fit scheme, but the hole selected for the write location is the one with the closest available size to the data block.
(iv) Write to first fit.
This scheme is similar to the write to exact and best fit schemes, but the hole selected for the write location is the first encountered which is large enough to accommodate the data block.
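The three fit policies differ only in how a hole is selected from the hole address table. A sketch, with a hypothetical hole list:

```python
holes = [(40, 6), (120, 9), (300, 12)]   # (start address, length) of free areas

def exact_fit(length):
    # (ii) a hole of exactly the block's size, or None
    return next((h for h in holes if h[1] == length), None)

def best_fit(length):
    # (iii) the smallest hole that still accommodates the block
    candidates = [h for h in holes if h[1] >= length]
    return min(candidates, key=lambda h: h[1], default=None)

def first_fit(length):
    # (iv) the first hole encountered which is large enough
    return next((h for h in holes if h[1] >= length), None)
```

Exact fit leaves no residual fragment but may fail; best fit minimises the fragment left behind; first fit minimises search time.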
The schemes described for writing a block of data have the effect of introducing holes in the data space where blocks had previously been located and this leads to fragmentation of the unused memory space and effective reduction of the available memory space for writing data. To combat this fragmentation problem, a data relocation scheme which involves physical movement of blocks of data within the storage memory can be adopted. A block or group of blocks is read from the storage memory to a data buffer and is relocated by a write operation to a different location in memory. Block, hole and head address tables are updated accordingly.
Options for data relocation are shown in figure 5 and are as follows.
(i) Fill hole from head or tail.
The data block from either the head or tail of the data space is read into a buffer memory and is written to a location where a hole exists in accordance with the write to exact, best or first fit schemes.
(ii) Enlarge hole.
A hole in the data space is enlarged by relocation of an adjacent data block in accordance with any of the data write schemes.
(iii) Compact partition.
With this technique, all data blocks residing in a partition of the memory space are read into a buffer memory and are written back to the same partition after compaction to eliminate holes.
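Compaction can be sketched as reading every live block of the partition into a buffer and writing the blocks back contiguously; the record layout is an assumption:

```python
def compact_partition(blocks):
    # blocks: (logical number, start, length) tuples for one partition.
    # Read into the buffer in address order, then write back from the
    # bottom of the partition with all holes squeezed out.
    buffered = sorted(blocks, key=lambda b: b[1])   # read into buffer memory
    compacted, addr = [], 0
    for logical, _start, length in buffered:
        compacted.append((logical, addr, length))
        addr += length
    return compacted
```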
Data may be swapped between the storage memory and an alternative storage device to which the memory controller also has access. This facility may be used to accommodate data overflow which may occur from a mass storage device which operates with variable blocksize. When data compression is employed in a storage device, the variability of the data blocksize as a function of the data characteristics results in the exact storage capacity of the device being undefined, and therefore the possibility exists of a device overflow as a result of a data block write. In such circumstances, the storage device controller may write the data block to a location on an alternative storage device and record this location in the block address table. The data block may be written in either compressed or uncompressed format. The block may be relocated to the main storage device when sufficient memory space later becomes free. This organisation is particularly suited to the use of a low cost magnetic disk as an overflow storage device for a high cost semiconductor solid state disk storage device. The magnetic disk may also function as a backup non-volatile storage device.
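The overflow handling can be sketched as a check at write time; the device names and table layout below are assumptions:

```python
block_table = {}   # logical block number -> (device, length)

def write_with_overflow(logical_no, length, main_free):
    # When a compressed block no longer fits on the main device, it is
    # written to the alternative (overflow) device instead, and the
    # block address table records which device holds it.
    if length <= main_free:
        block_table[logical_no] = ("solid_state_disk", length)
        return main_free - length
    block_table[logical_no] = ("magnetic_disk", length)
    return main_free

free = write_with_overflow(9, 8, main_free=10)    # fits on the main device
free = write_with_overflow(5, 6, main_free=free)  # overflows to magnetic disk
```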
Data blocks may also be relocated between the storage device memory and the alternative storage device. This allows data which has been written to the alternative storage device to be restored to the storage memory. It also allows data to be interchanged between the two storage devices in accordance with a cacheing algorithm resident in the storage device controller. This allows a high cost solid state disk to be configured as a high-speed cache memory to a much higher capacity, low cost, magnetic disk memory.
Figure 6 shows a block diagram of a data storage device which contains a hardware data compression facility and a storage management system for handling the variable blocksize resulting from data compression.
An interface bus 2 couples the system to a host computer and typically conforms to SCSI or other industry standards for peripheral device interfaces. Communications on the bus are controlled by the host controller (4), which contains a small buffer memory for transfer of bursts of data on the bus. A data storage device (6) may typically be either semiconductor memory, in the case of a solid state disk, or a magnetic medium, in the case of a magnetic disk. A data controller (8) controls both the data storage device and all data transfers and manipulations within the system controller. It contains and executes the algorithms for data relocation and data swapping and performs all address translation and table management functions. The software required for these can be implemented in a straightforward manner by a man skilled in the art. It also contains a block buffer memory for data transferred from a data compressor (10) so that the blocksize of a compressed block may be determined prior to allocation of a physical address for the block. The data compressor (10) performs compression and decompression in real time on data which is transferred to it. Data transfers are routed through a data buffer (12), which can store several blocks of data, via data channels (14) and a DMA controller (16). An interface controller (18) provides access to alternative storage devices via another industry standard bus (20). Data transfers will typically occur in units of one block. Data is written from a host via bus 2 to the data storage device 6 via the DMA controller 16, the data buffer 12 and the data controller 8, and is read via the inverse path. Data is relocated within the data storage device by reading out to the data buffer 12 and then back to the storage device 6.
Data in compressed format is swapped between the data storage device and an alternative storage device on bus 20 via the data controller 8, the DMA controller 16, the data buffer 12 and the interface controller 18, and is restored via the inverse path. The interface controller 18 may operate with a fixed blocksize and may store data in compressed format with the addition of a pad. Data in uncompressed format is swapped by being routed from the data storage device to the bus 20 via the data compressor 10, and restored via the inverse path. Data is transferred directly between a host on bus A and a backup storage device on bus B via the host controller 4, the DMA controller 16, the buffer 12 and the interface controller 18.
A high performance memory system will use semiconductor memory for the data storage device 6 and may take the form of a solid state disk. However, the host controller (4) need not operate on an I/O channel and may be coupled directly to the CPU bus of a host computer system. The storage system will normally incorporate a magnetic disk as a backup storage device to provide non-volatility for the stored data, and the system will be configured as shown in figure 7. A solid state disk 24 is connected to a host via bus A, and to a backup storage device 26 (e.g. a magnetic disk) and to other storage devices 28 via a bus B. The cost of the additional magnetic disk is low relative to the cost of semiconductor memory. Because semiconductor data storage is relatively expensive in comparison with magnetic storage, optimised use of data compression is very desirable. This, however, leads to an increased probability of solid state disk overflow because the data compression efficiency actually achieved has not been sufficient to provide the logical capacity assumed when the system was first initialised. The configuration shown in figure 7 allows the system to continue operation in the event of a solid state disk overflow with only a small impact on performance. A magnetic disk 26 of capacity at least as large as the maximum logical capacity of the solid state disk 24 is connected as a backup storage device on bus B to provide non-volatility when power is disconnected. If the physical capacity of the solid state disk is filled before its stated logical capacity, the storage system controller may locate additional data blocks which are written by the host on the magnetic disk. This results in a performance degradation rather than a storage system failure, but will occur very infrequently.
The frequency of occurrence is a direct function of the logical capacity to physical capacity ratio which is assumed for the solid state disk, and it is therefore possible to trade off average input/output rate for the solid state disk against logical capacity.
One implementation of the management system operates with the storage system controller configuring the data storage device as a continuous data space. A hole is defined as a gap in the data space which is not guaranteed to be large enough to accommodate a data block of undefined exact size. A head is defined as a gap in the data space which is guaranteed to be large enough to accommodate a data block of undefined exact size. A head may wrap around from above the highest address in the data space to below the lowest address. Reading and writing of data blocks are given priority by the controller and relocation of data blocks is run as a background task, which can run concurrently if the data bandwidth to the data storage device 6 on channel 14 is greater than that to the host on bus 2.
The controller manages data write and data relocation according to the following algorithms.
Data Write Algorithm
If data block matches a hole with exact fit then
    write data block to hole of exact fit
    update block and hole address tables
else
    write data block to smallest head
    update block and head address tables
end
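A literal rendering of the data write algorithm, with the tables as simple Python structures (layouts assumed, bulk data movement elided):

```python
def data_write(logical_no, length, holes, heads, block_table):
    # Prefer a hole of exactly matching size; otherwise write to the
    # smallest head, shrinking it by the block just written.
    exact = next((h for h in holes if h[1] == length), None)
    if exact is not None:
        block_table[logical_no] = exact[0]
        holes.remove(exact)                        # update hole address table
    else:
        i = min(range(len(heads)), key=lambda j: heads[j][1])
        addr, free = heads[i]
        block_table[logical_no] = addr
        heads[i] = (addr + length, free - length)  # update head address table

holes, heads, table = [(30, 4)], [(100, 64), (400, 20)], {}
data_write(1, 4, holes, heads, table)   # exact fit: the hole at address 30
data_write(2, 8, holes, heads, table)   # no exact fit: smallest head, at 400
```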
Data Relocation Algorithm
If number of holes > 0 then
    perform hole relocation
else
    perform head relocation
end

Hole Relocation Routine

If data block following largest hole matches a hole with exact fit then
    move data block to hole of exact fit
    update block and hole address tables
else
    move data block to smallest head
    update block and head address tables
end

Head Relocation Routine

If number of heads > 1 then
    move data block following largest head to smallest head
    update block and head address tables
end
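The dispatch between the two routines can be sketched as one step of the background task; the state representation is assumed and the data move itself is elided:

```python
def relocation_step(holes, heads):
    # Holes are eliminated first; only when none remain are multiple
    # heads merged by moving data into the smallest head.
    if holes:
        largest = max(holes, key=lambda h: h[1])
        return ("hole relocation", largest[0])   # move block following this hole
    if len(heads) > 1:
        largest = max(heads, key=lambda h: h[1])
        smallest = min(heads, key=lambda h: h[1])
        return ("head relocation", largest[0], smallest[0])
    return ("idle",)
```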
In a medium performance memory system such as a magnetic disk, the characteristics of the storage medium dictate a different choice of options for data storage, data write and data relocation, but the same architecture for the system controller as shown in figure 6 can be applied.
The data storage is organised as a partitioned data space with variable blocksize. For a magnetic disk, the partition is arranged to be a track or cylinder on the disk and is fixed when the disk is formatted. Sectors need not be formatted within a track or cylinder; therefore each partition has capacity for an unknown number of data blocks of variable blocksize.
Each block is written within the partition with an associated header to identify its logical block number, and the block address table need only identify the partition within which each logical block is located, that is, the block's track or cylinder number. A block is read by reading a partition until the target block is identified from its header. The access time is the time to locate the partition, the latency. For a magnetic disk, the average latency is half a rotation of the disk, which is the time to read 50% of a partition.
Data of variable blocksize is always written within a partition at the head of the data space. The controller selects the partition to which to allocate a block on the basis of the head size available in each partition. The operation of writing a block is combined with the operation of relocating blocks within the partition using the compact partition scheme. When a block is to be written in a partition, the complete data space from the partition is first read into the data buffer and compacted to eliminate any holes which exist. The block being written is then appended to the end of this data space, and the compacted partition is written back to the data storage device.
A hole address table for each partition is maintained so that obsolete blocks may be identified and eliminated when compaction is performed in the data buffer. When a block is written which has previously been located in another partition, the logical number of the block is entered in the hole address table for that other partition.
For conventional operation of a magnetic disk with fixed blocksize, the typical latency is half a rotation of the disk, which is the time to read or write half a track. The variable blocksize management scheme requires two rotations of the disk to perform the compaction of the partition. For a typical disk, average seek time for a write is 14 ms and average latency is 6 ms, giving a typical write time of 20 ms. For the variable blocksize management scheme the typical write time is 14 ms + (2 x 12) ms = 38 ms. For a typical read to write ratio of 4 to 1, the average increase in disk access time for the variable blocksize management scheme is 18%.
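The 18% figure follows directly from the quoted timings; a worked check:

```python
# Figures from the text: 14 ms average seek, 6 ms average latency (half a
# 12 ms rotation), two full rotations for compaction, 4:1 read/write ratio.
conventional = 14 + 6            # 20 ms per conventional access
variable_write = 14 + 2 * 12     # 38 ms per write under the variable scheme
reads, writes = 4, 1
new_avg = (reads * conventional + writes * variable_write) / (reads + writes)
increase = (new_avg - conventional) / conventional   # fractional increase
```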
The variable blocksize storage management techniques support incorporation of a data compression facility within a data storage device in a manner which is transparent to the host system.
The system architecture allows use of a magnetic disk as an overflow device for a solid state disk memory with data compression to eliminate system errors resulting from the undefined capacity of the compressed data storage. Compromises can be made between solid state disk capacity and average input/output time. The storage management enables introduction of data compression techniques to magnetic disk at the expense of only a modest increase in device access time.

Claims

1. A data storage management system for storing logical blocks of data in a main memory device comprising means for compressing the logical blocks of data to produce compressed blocks having varying sizes, means for storing the blocks at physical locations in free memory space in the memory device, and a supplementary memory device for storage of logical blocks of data whose size exceeds the largest area of free memory space available in the main memory device.
2. A data storage management system according to claim 1 including means for relocating blocks stored in the supplementary memory device in the main memory when sufficient free memory space becomes available.
3. A data storage management system according to claim 1 or 2 in which the main memory device comprises a solid state memory.
4. A data storage management system according to claim 1, 2, or 3 in which the supplementary memory device comprises a magnetic disc.
5. A data storage management system according to any preceding claim in which the storing means is arranged to write a data block to a physical address above which is an area of free memory space large enough to store a data block of maximum possible size.
6. A data storage management system according to any preceding claim including means for storing a list of the physical addresses of all areas of free memory space and their sizes.
7. A data storage management system according to claim 6 in which the storing means stores a logical block of data in an area of free memory space of identical physical size to the physical size of the logical block.
8. A data storage management system according to claim 6 in which the storing means stores a logical block of data in an area of free memory space whose physical size is closest to that of the logical block.
9. A data storage management system according to any preceding claim including means for relocating stored blocks of data thereby reducing fragmentation of free memory space.
10. A data storage management system according to any preceding claim including means for compacting logical blocks stored within a partition in the memory device thereby reducing fragmentation of free memory space in that partition.
11. A data storage management system according to claim 10 in which the compacting means comprises means for relocating all blocks stored in a partition to a buffer memory, and means for re-storing the logical blocks sequentially from the start of the partition.
12. A data storage management system according to any preceding claim in which the storing means stores logical blocks of data in partitions within the memory along with an identifying header and stores a partition address for that logical block in a block address table.
13. A data storage management system according to claim 12 including means responsive to a logical block address to read a partition address from the block address table, means for searching a partition for a header identifying a logical block, and means for reading that logical block.
14. A data storage management system for storing logical blocks of data of variable size comprising a memory device divided into a plurality of partitions, means for storing a logical block of data in a partition along with an identifying header, and means for storing a partition address, corresponding to the partition in which the logical block is stored, in a block address table.
15. A data storage management system according to claim 14 including means responsive to the logical block address to read a partition address from a block address table, means for searching a partition for a header identifying the logical block, and means for reading the logical block.
16. A data storage management system according to claim 14 or 15 including a hole address table for each partition storing the physical addresses of all areas of free memory space within a respective partition.
17. A data storage management system according to claim 14, 15, or 16 including means for compacting all blocks of data stored within a partition prior to writing a new block of data into the partition.
18. A data storage management system for a memory device comprising input means receiving logical blocks of data, means for compressing the data in the blocks thereby altering the block sizes of the received logical blocks, means for storing the compressed blocks at physical locations in the memory device such that the fragmentation of free memory space is minimised, and means for translating logical addresses of stored blocks to physical addresses in the memory device in response to a request to access a logical block.
19. A data storage management system according to claim 18 including means for storing a list of the physical addresses of all areas of free memory space and their sizes.
20. A data storage management system according to claim 19 in which the storing means stores a logical block of data in an area of free memory space of identical physical size to the physical size of the logical block.
21. A data storage management system according to claim 19 in which the storing means stores a logical block of data in an area of free memory space whose physical size is closest to that of the logical block.
22. A data storage management system according to any one of claims 18 to 21 including means for relocating stored blocks of data.
23. A data storage management system according to any of claims 18 to 22 including means for compacting logical blocks of data stored within a partition in the memory device.
24. A data storage management system according to claim 23 in which the compacting means comprises means for relocating all blocks stored in a partition to a memory buffer and means for re-storing the logical blocks sequentially from the start of the partition.
25. A data storage management system substantially as herein described with reference to the accompanying drawings.
PCT/GB1992/001137 1991-06-21 1992-06-22 Data storage management systems WO1993000635A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB919113469A GB9113469D0 (en) 1991-06-21 1991-06-21 Data storage management systems
GB9113469.2 1991-06-21

Publications (1)

Publication Number Publication Date
WO1993000635A1 true WO1993000635A1 (en) 1993-01-07

Family

ID=10697117

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1992/001137 WO1993000635A1 (en) 1991-06-21 1992-06-22 Data storage management systems

Country Status (2)

Country Link
GB (1) GB9113469D0 (en)
WO (1) WO1993000635A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2226705A1 (en) * 1973-04-23 1974-11-15 Honeywell Inf Systems
US4075694A (en) * 1975-10-23 1978-02-21 Telefonaktiebolaget L M Ericsson Apparatus in connection with a computer memory for enabling transportation of an empty memory field from one side to the other of an adjacent data field while the computer is operative
JPS62257553A (en) * 1986-04-30 1987-11-10 Toshiba Corp Disk controller

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
COMPUTER ARCHITECTURE NEWS. vol. 19, no. 2, April 1991, IEEE,WASHINGTON D.C., US pages 96 - 107; APPEL ET AL.: 'Virtual memory primitives for user programs' *
PATENT ABSTRACTS OF JAPAN vol. 12, no. 136 (P-694)26 April 1988 & JP,A,62 257 553 ( TOSHIBA ) 10 November 1987 *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8019925B1 (en) * 2004-05-06 2011-09-13 Seagate Technology Llc Methods and structure for dynamically mapped mass storage device
US7617358B1 (en) 2005-05-05 2009-11-10 Seagate Technology, Llc Methods and structure for writing lead-in sequences for head stability in a dynamically mapped mass storage device
US7603530B1 (en) 2005-05-05 2009-10-13 Seagate Technology Llc Methods and structure for dynamic multiple indirections in a dynamically mapped mass storage device
US7620772B1 (en) 2005-05-05 2009-11-17 Seagate Technology, Llc Methods and structure for dynamic data density in a dynamically mapped mass storage device
US7752491B1 (en) 2005-05-05 2010-07-06 Seagate Technology Llc Methods and structure for on-the-fly head depopulation in a dynamically mapped mass storage device
US7685360B1 (en) 2005-05-05 2010-03-23 Seagate Technology Llc Methods and structure for dynamic appended metadata in a dynamically mapped mass storage device
US7653847B1 (en) 2005-05-05 2010-01-26 Seagate Technology Llc Methods and structure for field flawscan in a dynamically mapped mass storage device
US7916421B1 (en) 2005-05-05 2011-03-29 Seagate Technology Llc Methods and structure for recovery of write fault errors in a dynamically mapped mass storage device
EP1898301A3 (en) * 2006-09-01 2009-10-07 Continental Automotive GmbH Database system, method for operating a database system and computer program product
EP1898301A2 (en) * 2006-09-01 2008-03-12 Siemens VDO Automotive AG Database system, method for operating a database system and computer program product
US8122216B2 (en) 2006-09-06 2012-02-21 International Business Machines Corporation Systems and methods for masking latency of memory reorganization work in a compressed memory system
WO2008030672A2 (en) * 2006-09-06 2008-03-13 International Business Machines Corporation Systems and methods for masking latency of memory reorganization work in a compressed memory system
WO2008030672A3 (en) * 2006-09-06 2008-05-08 Ibm Systems and methods for masking latency of memory reorganization work in a compressed memory system
US7562203B2 (en) 2006-09-27 2009-07-14 Network Appliance, Inc. Storage defragmentation based on modified physical address and unmodified logical address
WO2008039527A2 (en) * 2006-09-27 2008-04-03 Network Appliance, Inc. Method and apparatus for defragmenting a storage device
WO2008039527A3 (en) * 2006-09-27 2008-07-24 Network Appliance Inc Method and apparatus for defragmenting a storage device
WO2008042283A2 (en) * 2006-09-28 2008-04-10 Network Appliance, Inc. Write-in-place within a write-anywhere filesystem
US7562189B2 (en) 2006-09-28 2009-07-14 Network Appliance, Inc. Write-in-place within a write-anywhere filesystem
WO2008042283A3 (en) * 2006-09-28 2008-07-03 Network Appliance Inc Write-in-place within a write-anywhere filesystem
US8331663B2 (en) 2007-06-28 2012-12-11 Qualcomm Incorporated Efficient image compression scheme to minimize storage and bus bandwidth requirements
WO2009006099A2 (en) * 2007-06-28 2009-01-08 Qualcomm Incorporated An efficient image compression scheme to minimize storage and bus bandwidth requirements
WO2009006099A3 (en) * 2007-06-28 2009-02-19 Qualcomm Inc An efficient image compression scheme to minimize storage and bus bandwidth requirements
EP2012544A3 (en) * 2007-06-28 2009-03-11 Qualcomm Incorporated An efficient image compression scheme to minimize storage and bus bandwidth requirements
KR101139563B1 (en) * 2007-06-28 2012-04-27 콸콤 인코포레이티드 An efficient image compression scheme to minimize storage and bus bandwidth requirements
US20110099350A1 (en) * 2009-10-23 2011-04-28 Seagate Technology Llc Block boundary resolution for mismatched logical and physical block sizes
US8745353B2 (en) * 2009-10-23 2014-06-03 Seagate Technology Llc Block boundary resolution for mismatched logical and physical block sizes
US9329991B2 (en) 2013-01-22 2016-05-03 Seagate Technology Llc Translation layer partitioned between host and controller
WO2014114947A1 (en) * 2013-01-24 2014-07-31 Acunu Ltd Method and system for allocating space on a storage device
GB2519211A (en) * 2013-08-16 2015-04-15 Lsi Corp Translation layer partitioned between host and controller

Also Published As

Publication number Publication date
GB9113469D0 (en) 1991-08-07


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): GB JP US