WO2024051913A1 - Tape storage system - Google Patents

Tape storage system

Publication number
WO2024051913A1
Authority
WO
WIPO (PCT)
Prior art keywords
tape storage
tape
catalog
level catalog
level
Prior art date
Application number
PCT/EP2022/074588
Other languages
French (fr)
Inventor
Yair Toaff
Assaf Natanzon
Idan Zach
Aviv Kuvent
Michael Sternberg
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/EP2022/074588
Publication of WO2024051913A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061: Improving I/O performance
    • G06F3/0611: Improving I/O performance in relation to response time
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638: Organizing or formatting or addressing of data
    • G06F3/0644: Management of space entities, e.g. partitions, extents, pools
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0668: Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671: In-line storage system
    • G06F3/0683: Plurality of storage devices
    • G06F3/0686: Libraries, e.g. tape libraries, jukebox

Definitions

  • the present disclosure relates generally to a tape storage system, and more particularly, the present disclosure relates to a method of optimised writing of a catalog to a tape.
  • Tape storage is a storage system in which a magnetic tape is used as a recording medium to store data.
  • the need for tape storage is increasing as disks are no longer big enough to accommodate the needs.
  • the focus for secondary storage and archiving is moving from the disk with deduplication to tape technologies due to the tape's cheaper price and its ability to retain the data for a longer time. Further, introducing deduplication in the tape storage makes the price of the tape even cheaper.
  • Cataloging the data is used in many storage systems to search for specific files/objects in an efficient way in accordance with different search parameters. Due to the enormous size of the tape-based storage system, a global catalog for the entire system is no longer maintainable in a regular database (DB) based on fast storage (e.g. Solid State Drive (SSD)/Hard Disk Drive (HDD)).
  • the catalog for such systems has to be kept on the tapes. Since the seek time in the tape is relatively long (e.g. 100 seconds), the query time while searching for a file becomes very long. Thus, there is a need to reduce the seek time required while maintaining the detailed catalog intact.
  • in modern tapes, a length of the tape (i.e. the length of every wrap) can exceed 1 kilometer (km) (e.g. in Linear Tape Open-9, LTO-9).
  • Running the tape back-and-forth, also known as shoeshining, is inefficient as it greatly reduces the life-span of the tape. Further, continuously stopping and restarting the tape movement is inefficient as it reduces the durability of the tape. For these reasons, it is important to buffer large chunks of data in other storage media (e.g. hard disk drive, or solid-state drive) before dumping the data onto the tape.
  • in the DB, all of the file parameters (e.g. a name, a date, an owner, key terms, etc.) are inserted, and searches can be performed according to any combination of the file parameters.
  • the DB solution is more flexible, but it requires large storage for all the internal indexing that the DB performs for improved performance. For a very large tape storage system with thousands of cartridges, and especially one that contains deduplication, which makes it even bigger, a disk-based storage system is required just for holding the DB data. If this DB data is held on the tape, the query time may be unacceptably high.
  • the present disclosure provides a tape storage system and a method of optimised writing of a catalog to a tape.
  • a tape storage system comprising: a disk storage module for fast access to data; a plurality of tape storage modules for long-term storage of data; and a catalog module for searching data stored on one or more of the plurality of tape storage modules, wherein the catalog module comprises a second level catalog, written on each of a plurality of tape storage modules, configured to point to specific data stored in a specific tape storage module; and a first level catalog, written on the disk storage module, configured to point to the specific tape storage module and includes offsets of the second level catalog.
  • the disclosed tape storage system enables free text searches on the catalog without taking too much space in the disk storage module, thereby limiting the query time to an acceptable length.
  • the tape storage system maintains a 2-level catalog which enables the second level catalog to be written several times in different offsets on the tape storage module to reduce the seek time before reading the catalog.
  • the tape storage system increases the lifespan and the durability of the tape as it buffers large chunks of data in the disk storage module before writing the data to the tape storage module.
  • the second level catalog is written at several wrap locations on each of the corresponding tape storage modules.
  • the first level catalog is configured to point to the specific tape storage module storing data in response to a search query.
  • the respective second level catalog of the specific tape storage module is configured to be read to the disk storage medium to process the search query.
  • the second level catalog is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules such that irrespective of the position of a reader head, the second level catalog is accessed before the reader head has rolled 1/N-th part of the tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog on the tape storage modules.
  • the disk storage module may be a Solid-State drive.
  • the tape storage module is a Linear Tape Open-9 (LTO-9) drive.
  • the tape storage module may include any type of tape storage technology.
  • a method of optimised writing of a catalog to a tape comprising: segmenting the catalog into a first level catalog and a second level catalog; writing the second level catalog on a plurality of tape storage modules such that the second level catalog points to specific data stored on a respective tape storage module; and writing the first level catalog on a disk storage module such that the first level catalog points to one or more of the plurality of tape storage modules and includes offsets of the second level catalog.
  • the method enables free text searches on the catalog without taking too much space in the disk storage module, thereby limiting the query time to an acceptable length.
  • the method maintains a 2-level catalog which enables the second-level catalog to be written several times in different offsets on the tape storage module to reduce the seek time before reading the catalog.
  • the method increases the life-span and the durability of the tape as it buffers large chunks of data in the disk storage module before writing the data to the tape storage module.
  • the method further comprises writing the second level catalog at several wrap locations on each of the corresponding tape storage modules.
  • the second level catalog is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules such that irrespective of the position of a reader head, the second level catalog is accessed before the reader head has rolled 1/N-th part of the tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog on the tape storage modules.
  • FIG. 1 illustrates a block diagram of a tape storage system in accordance with an implementation of the present disclosure
  • FIG. 2 illustrates an exemplary view of a second level catalog that is written at one or more wrap locations (e.g. wrap 0, wrap 1, wrap 2, wrap 3) on a tape storage module in accordance with an implementation of the present disclosure
  • FIG. 3 illustrates an exemplary view of a second level catalog that is written at one or more wrap locations (e.g. wrap 0, wrap 1, wrap 2, wrap 3) on a tape storage module irrespective of a position of a reader head in accordance with an implementation of the present disclosure
  • FIG. 4 is a flow diagram that illustrates a method of optimised writing of a catalog to a tape in accordance with an implementation of the present disclosure.
  • Implementations of the present disclosure provide a tape storage system and a method of optimised writing of a catalog to a tape.
  • a process, a method, a system, a product, or a device that includes a series of steps or units is not necessarily limited to expressly listed steps or units but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.
  • Catalog: A catalog is a searchable database table that contains details about the different files/objects written to the storage it relates to.
  • Tracks: A tape is composed of tracks, on which data is written and from which data is read. Each track runs the length of the tape, from tape start to tape end.
  • Wraps: A head of a tape drive is used for reading and writing.
  • the head contains multiple read and write elements, which allow reading/writing of multiple adjacent tracks. A set of such adjacent tracks is called a wrap.
  • Bands: The tape is divided into bands.
  • a tape drive head covers the width of a single band.
  • Each band is composed of an even number of wraps.
  • a first wrap is written from start to end, then a second wrap from end to start, then a third wrap from start to end, and so on.
  • the direction of recording and of reading is the direction of the relevant wrap. So, wraps 0, 2, 4, ... are from the start of tape to the end of tape, while wraps 1, 3, 5, ... are from the end of tape to the start of tape.
  • a band is composed of an equal number of start-to-end wraps and end-to-start wraps.
  • LTO-9 tapes: In LTO-9 tapes (i.e. tapes constructed according to the Linear Tape Open specifications, generation 9), there are 32 tracks per wrap, 52 wraps per band (26 start-to-end wraps and 26 end-to-start wraps) and 4 bands.
  • LTO partition: In LTO-9, there is an option for up to 4 partitions. Each partition contains an even number of wraps.
  • FIG. 1 illustrates a block diagram of a tape storage system 100 in accordance with an implementation of the present disclosure.
  • the tape storage system 100 includes a disk storage module 102 (such as an SSD) for fast access to data, a plurality of tape storage modules 104A-N (such as LTO-9) for long-term storage of data, and a catalog module 106 for searching data stored on one or more of the plurality of tape storage modules 104A-N.
  • the catalog module 106 includes a first level catalog 108 and a second level catalog 110.
  • the second level catalog 110, written on each of the plurality of tape storage modules 104A-N, is configured to point to specific data stored in the specific tape storage module (e.g. 104A).
  • the first level catalog 108, written on the disk storage module 102, is configured to point to the specific tape storage module (e.g. 104A) and includes offsets of the second level catalog 110.
  • any suitable technology or standard can be used for implementing the disk storage module 102 and the tape storage modules 104A-N, and the present disclosure is not limited to SSD and LTO-9.
  • the tape storage system 100 enables free text searches on the catalog without taking too much space in the disk storage module 102, thereby limiting the query time to an acceptable length.
  • the tape storage system 100 maintains a 2-level catalog (i.e. the first level catalog 108 and the second level catalog 110) which enables the second level catalog 110 to be written several times in different offsets on the tape storage module to reduce the seek time before reading the catalog.
  • the first level catalog 108 in the disk storage module 102 points to the specific tape storage module or a small number of the tape storage modules 104A-N.
  • the second level catalog 110 is written to each of the tape storage modules 104A-N (i.e. each tape storage module holds the part of the catalog that describes the content of that tape storage module).
  • the tape storage system 100 increases the life-span and the durability of the tape as it buffers large chunks of data in the disk storage module 102 before writing the data to the tape storage module (e.g. 104A-N).
  • the second level catalog 110 is written at several wrap locations on each of the corresponding tape storage modules 104A-N.
  • the first level catalog 108 is configured to point, in response to a search query, to the specific tape storage module (e.g. 104A) storing the data, and includes offsets of the second level catalog 110.
  • the respective second level catalog 110 of the specific tape storage module (e.g. 104A) is configured to be read to the disk storage medium to process the search query.
  • the second level catalog 110 is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules 104A-N such that irrespective of the position of a reader head, the second level catalog 110 is accessed before the reader head has rolled 1/N-th part of the tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog 110 on the tape storage modules 104A-N.
  • the disk storage module 102 may be a Solid-State drive (SSD).
  • the tape storage module (e.g. 104A-N) may be a Linear Tape Open-9 (LTO-9) drive.
  • the second level catalog from a specific tape storage module may be read to the disk storage module (or to a memory cache) to process a search query. This may take more than a minute due to the seek time.
  • the tape storage system 100 reduces this seek time by saving several copies (N) of the second level catalog 110 on the tape storage modules 104A-N at different offsets, so that the maximal distance to roll is a 1/N-th part of the tape storage module instead of the whole tape storage module (i.e. in this case, the average seek time may be half when compared to the seek time of a conventional storage system).
  • the average distance with N copies is (1/N)/2 of the tape length (the maximum is 1/N and the average is half of it). For example, if N is 10, the tape storage system 100 reduces the seek time to a maximum of ~10 seconds instead of a maximum of ~100 seconds. That is, if the seek time for the whole tape storage module is up to 100 seconds, then for a 10th part of the distance, the seek time is up to 10 seconds with an average of 5 seconds.
  • N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog 110 on the tape storage modules 104A-N.
  • FIG. 2 illustrates an exemplary view of a second level catalog 206 that is written at one or more wrap locations (e.g. wrap 0, wrap 1 202, wrap 2, wrap 3 204) on a tape storage module in accordance with an implementation of the present disclosure.
  • the second level catalog 206 is written on wrap 1 202.
  • the tape storage module/tape needs to be rolled almost for its entire length.
  • the second level catalog 206 is written on wrap 3 204.
  • the tape storage module/tape needs to be rolled only for a small length.
  • FIG. 3 illustrates an exemplary view of a second level catalog 306 that is written at one or more wrap locations (e.g. wrap 0, wrap 1 302, wrap 2, wrap 3 304) on a tape storage module irrespective of a position of a reader head in accordance with an implementation of the present disclosure.
  • the second level catalog 306 is written on 1/3 and 2/3 areas of the wraps (i.e. wrap 1 302 and wrap 3 304).
  • with the second level catalog 306 written on wrap 1 302 and wrap 3 304, the tape storage module has less than 1/3 of the tape length to roll before it accesses the second level catalog 306.
  • the second level catalog 306 is written on wrap 3 304 and the tape storage module/tape needs to be rolled only a small length.
  • the tape storage system has two methods to write the second level catalog 306 to the tape storage module.
  • the tape storage system writes the second level catalog 306 on the tape storage module in a single pass.
  • the whole data for this tape storage module is reordered in a staging area. This reorder includes deduplication of data and reordering of different segments in a way that may improve the read times.
  • the second level catalog 306 is placed according to the selected N in a way that ensures that the tape storage module/tape does not need to be rolled more than 1/N of the tape length.
  • the tape storage system writes the second level catalog 306 on the tape storage module in several appends.
  • a first partition of the tape storage module may be reserved for the second level catalog 306.
  • the second level catalog 306 is written to the first partition at least N times or until a full wrap is filled. If only a single wrap is filled, the tape storage system may fill an opposite wrap next time. Otherwise, the tape storage system may start from the partition start. This method is typically performed with very large appends; otherwise, the tape storage module/tape may be ruined by rewriting too many times.
  • FIG. 4 is a flow diagram that illustrates a method of optimised writing of a catalog to a tape in accordance with an implementation of the present disclosure.
  • the catalog is segmented into a first level catalog and a second level catalog.
  • the second level catalog is written on a plurality of tape storage modules such that the second level catalog points to specific data stored on a respective tape storage module.
  • the first level catalog is written on a disk storage module such that the first level catalog points to one or more of a plurality of tape storage modules and includes offsets of the second level catalog.
  • the method enables free text searches on the catalog without taking too much space in the disk storage module, thereby limiting the query time to an acceptable length.
  • the method maintains a 2-level catalog which enables the second level catalog to be written several times in different offsets on the tape storage module to reduce the seek time before reading the catalog.
  • the method increases the life-span and the durability of the tape as it buffers large chunks of data in the disk storage module before writing the data to the tape storage module.
  • the method further comprises writing the second level catalog at several wrap locations on each of the corresponding tape storage modules.
  • the second level catalog is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules such that irrespective of the position of a reader head, the second level catalog is accessed before the reader head has rolled 1/N-th part of the tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog on the tape storage modules.
  • a computer program that includes instructions which, when the program is executed by a computer, cause the computer to carry out the steps of the above method.
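The serpentine wrap ordering given in the definitions above (wraps 0, 2, 4, ... run from start of tape to end of tape, wraps 1, 3, 5, ... run from end to start, and a band holds an even number of wraps) may be illustrated by the following minimal Python sketch. The function names `wrap_direction` and `band_direction_counts` are hypothetical and are introduced only for this illustration; they are not part of the disclosure.

```python
def wrap_direction(wrap_index: int) -> str:
    """Return the recording/reading direction of a wrap.

    Even-numbered wraps run from tape start to tape end, odd-numbered
    wraps run from tape end to tape start, as stated in the definitions.
    """
    if wrap_index < 0:
        raise ValueError("wrap index must be non-negative")
    return "start-to-end" if wrap_index % 2 == 0 else "end-to-start"


def band_direction_counts(wraps_per_band: int) -> tuple[int, int]:
    """Count start-to-end and end-to-start wraps in one band.

    Assumes wraps inside a band are numbered from 0, which is an
    assumption of this sketch.
    """
    if wraps_per_band <= 0 or wraps_per_band % 2 != 0:
        raise ValueError("a band is composed of an even number of wraps")
    directions = [wrap_direction(w) for w in range(wraps_per_band)]
    return directions.count("start-to-end"), directions.count("end-to-start")
```

With the figure of 52 wraps per band quoted above, `band_direction_counts(52)` yields an equal split of 26 wraps in each direction, consistent with the statement that a band has as many start-to-end wraps as end-to-start wraps.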

Abstract

Provided is a tape storage system (100). The tape storage system includes a disk storage module (102) for fast access to data, a plurality of tape storage modules (104A-N) for long-term storage of data, and a catalog module (106) for searching data stored on one or more of the plurality of tape storage modules. The catalog module includes a first level catalog (108) and a second level catalog (110, 206, 306). The second level catalog, written on each of a plurality of tape storage modules, is configured to point to specific data stored in a specific tape storage module. The first level catalog, written on the disk storage module, is configured to point to the specific tape storage module and includes offsets of the second level catalog.

Description

TAPE STORAGE SYSTEM
TECHNICAL FIELD
The present disclosure relates generally to a tape storage system, and more particularly, the present disclosure relates to a method of optimised writing of a catalog to a tape.
BACKGROUND
Tape storage is a storage system in which a magnetic tape is used as a recording medium to store data. The need for tape storage is increasing as disks are no longer big enough to accommodate the needs. The focus for secondary storage and archiving is moving from the disk with deduplication to tape technologies due to the tape's cheaper price and its ability to retain the data for a longer time. Further, introducing deduplication in the tape storage makes the price of the tape even cheaper.
Cataloging the data is used in many storage systems to search for specific files/objects in an efficient way in accordance with different search parameters. Due to the enormous size of the tape-based storage system, a global catalog for the entire system is no longer maintainable in a regular database (DB) based on fast storage (e.g. Solid State Drive (SSD)/Hard Disk Drive (HDD)). The catalog for such systems has to be kept on the tapes. Since the seek time in the tape is relatively long (e.g. 100 seconds), the query time while searching for a file becomes very long. Thus, there is a need to reduce the seek time required while maintaining the detailed catalog intact.
In modern tapes, a length of the tape (i.e. the length of every wrap) can exceed 1 kilometer (km) (e.g. in Linear Tape Open-9, LTO-9). Running the tape back-and-forth, also known as shoeshining, is inefficient as it greatly reduces the life-span of the tape. Further, continuously stopping and restarting the tape movement is inefficient as it reduces the durability of the tape. For these reasons, it is important to buffer large chunks of data in other storage media (e.g. hard disk drive, or solid-state drive) before dumping the data onto the tape.
An existing solution provides a Linear Tape File System (LTFS) that is implemented on the tape to keep the index (i.e. very similar to the catalog) on a first partition of an LTFS cartridge. When there are changes to the tape, a new version of the index is written after the old index on the first partition. When the LTFS cartridge is loaded, the index is held in memory until the cartridge is unloaded. This may lead to a large memory consumption (i.e. a tape library may include hundreds of tape drives). It is also limited to LTFS searches on file names/dates that search the file system (FS) tree completely. Thus, the cartridge has to be loaded in order to know what is stored in it. There are versions of LTFS for libraries, but for keeping everything online, a special server is required.
Another existing solution provides a storage system whose use is based on a database (DB), usually a DB that can handle free text searches (e.g. ElasticSearch). In the DB, all of the file parameters (e.g. a name, a date, an owner, key terms, etc.) are inserted, and searches can be performed according to any combination of the file parameters. The DB solution is more flexible, but it requires large storage for all the internal indexing that the DB performs for improved performance. For a very large tape storage system with thousands of cartridges, and especially one that contains deduplication, which makes it even bigger, a disk-based storage system is required just for holding the DB data. If this DB data is held on the tape, the query time may be unacceptably high.
Therefore, there arises a need to address the aforementioned technical problem/drawbacks in writing a catalog for a tape storage system.
SUMMARY
It is an object of the present disclosure to provide a tape storage system and a method of optimised writing of a catalog to a tape while avoiding one or more disadvantages of prior art approaches.
This object is achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description, and the figures.
The present disclosure provides a tape storage system and a method of optimised writing of a catalog to a tape.
According to a first aspect, there is provided a tape storage system comprising: a disk storage module for fast access to data; a plurality of tape storage modules for long-term storage of data; and a catalog module for searching data stored on one or more of the plurality of tape storage modules, wherein the catalog module comprises a second level catalog, written on each of a plurality of tape storage modules, configured to point to specific data stored in a specific tape storage module; and a first level catalog, written on the disk storage module, configured to point to the specific tape storage module and includes offsets of the second level catalog.
Advantageously, the disclosed tape storage system enables free text searches on the catalog without taking too much space in the disk storage module, thereby limiting the query time to an acceptable length. The tape storage system maintains a 2-level catalog which enables the second level catalog to be written several times in different offsets on the tape storage module to reduce the seek time before reading the catalog. The tape storage system increases the lifespan and the durability of the tape as it buffers large chunks of data in the disk storage module before writing the data to the tape storage module.
Preferably, the second level catalog is written at several wrap locations on each of the corresponding tape storage modules.
Preferably, the first level catalog is configured to point to the specific tape storage module storing data in response to a search query.
Preferably, the respective second level catalog of the specific tape storage module is configured to be read to the disk storage medium to process the search query.
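For illustration only, the two-level lookup described above might be sketched as follows in Python. The record layout, the keying of the first level catalog by search term, and all names (`FileRecord`, `Tape`, `FirstLevelCatalog`, `add_file`, `search`) are assumptions made for this sketch and are not taken from the disclosure; the tapes are stood in for by in-memory objects.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    name: str
    terms: frozenset[str]  # hypothetical searchable key terms
    data_offset: int       # hypothetical position of the data on the tape


@dataclass
class Tape:
    tape_id: str
    # Second level catalog: details of the data stored on this tape only.
    second_level: list[FileRecord] = field(default_factory=list)


@dataclass
class FirstLevelCatalog:
    # Kept on disk: search term -> ids of the tapes that hold matching data.
    term_to_tapes: dict[str, set[str]] = field(default_factory=dict)
    # Kept on disk: tape id -> offsets of the second level catalog copies.
    catalog_offsets: dict[str, list[int]] = field(default_factory=dict)


def add_file(first: FirstLevelCatalog, tape: Tape, record: FileRecord) -> None:
    """Register a file in the tape's second level catalog and on disk."""
    tape.second_level.append(record)
    for term in record.terms:
        first.term_to_tapes.setdefault(term, set()).add(tape.tape_id)


def search(first: FirstLevelCatalog, tapes: dict[str, Tape], term: str):
    """Resolve a term to (tape id, file name, data offset) tuples."""
    hits = []
    for tape_id in sorted(first.term_to_tapes.get(term, ())):
        # A real system would seek to one of first.catalog_offsets[tape_id]
        # and read the second level catalog from tape to disk at this point.
        for record in tapes[tape_id].second_level:
            if term in record.terms:
                hits.append((tape_id, record.name, record.data_offset))
    return hits
```

In this sketch the disk-resident structure only narrows a query down to one or a few tapes, while the per-file details stay in the catalog held with each tape, which is the division of labour the two catalog levels are described as providing.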
Preferably, the second level catalog is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules such that irrespective of the position of a reader head, the second level catalog is accessed before the reader head has rolled 1/N-th part of the tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog on the tape storage modules.
The disk storage module may be a Solid-State drive.
Preferably, the tape storage module is a Linear Tape Open-9 (LTO-9) drive. It is to be noted, however, that the tape storage module may include any type of tape storage technology.
According to a second aspect, there is provided a method of optimised writing of a catalog to a tape comprising: segmenting the catalog into a first level catalog and a second level catalog; writing the second level catalog on a plurality of tape storage modules such that the second level catalog points to specific data stored on a respective tape storage module; and writing the first level catalog on a disk storage module such that the first level catalog points to one or more of the plurality of tape storage modules and includes offsets of the second level catalog.
Advantageously, the method enables free text searches on the catalog without taking too much space in the disk storage module, thereby limiting the query time to an acceptable length. The method maintains a 2-level catalog which enables the second-level catalog to be written several times in different offsets on the tape storage module to reduce the seek time before reading the catalog. The method increases the life-span and the durability of the tape as it buffers large chunks of data in the disk storage module before writing the data to the tape storage module.
Preferably, the method further comprises writing the second level catalog at several wrap locations on each of the corresponding tape storage modules.
Preferably, the second level catalog is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules such that irrespective of the position of a reader head, the second level catalog is accessed before the reader head has rolled 1/N-th part of the tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog on the tape storage modules.
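The 1/N placement rule above can be checked with a short Python sketch. It rests on two assumptions that are not stated in the disclosure: that the reader head may reach a copy by rolling in either direction, and that seek time scales linearly with the distance rolled. The function names are hypothetical, and the 100 second full-length seek used in the usage note is the example figure quoted in the background.

```python
import math
from fractions import Fraction


def catalog_positions(n: int) -> list[Fraction]:
    """Fractions of the tape length holding a copy: 1/N, 2/N, ..., (N-1)/N."""
    if n < 2:
        raise ValueError("N must be at least 2")
    return [Fraction(k, n) for k in range(1, n)]


def worst_case_roll(n: int, samples: int = 1000) -> Fraction:
    """Largest distance from a sampled head position to the nearest copy.

    Head positions are sampled evenly over the tape length, including
    both tape ends.
    """
    positions = catalog_positions(n)
    heads = [Fraction(i, samples) for i in range(samples + 1)]
    return max(min(abs(h - p) for p in positions) for h in heads)


def seek_bounds(full_seek_s: float, n: int) -> tuple[float, float]:
    """Maximum and average seek time, taking the average as half the maximum."""
    max_s = full_seek_s / n
    return max_s, max_s / 2


def smallest_n(full_seek_s: float, target_max_s: float) -> int:
    """Smallest N (at least 2) whose maximum seek time meets the target."""
    return max(2, math.ceil(full_seek_s / target_max_s))
```

For example, `seek_bounds(100, 10)` gives a maximum of 10 seconds and an average of 5 seconds, and `worst_case_roll(10)` confirms that no sampled head position is further than 1/10 of the tape length from a copy. A larger N shortens the seek further but stores more copies of the second level catalog on each tape, which is the trade-off the choice of N has to balance.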
These and other aspects of the present disclosure will be apparent from the implementation(s) described below.
BRIEF DESCRIPTION OF DRAWINGS
Implementations of the present disclosure will now be described, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1 illustrates a block diagram of a tape storage system in accordance with an implementation of the present disclosure;
FIG. 2 illustrates an exemplary view of a second level catalog that is written at one or more wrap locations (e.g. wrap 0, wrap 1, wrap 2, wrap 3) on a tape storage module in accordance with an implementation of the present disclosure;
FIG. 3 illustrates an exemplary view of a second level catalog that is written at one or more wrap locations (e.g. wrap 0, wrap 1, wrap 2, wrap 3) on a tape storage module irrespective of a position of a reader head in accordance with an implementation of the present disclosure; and
FIG. 4 is a flow diagram that illustrates a method of optimised writing of a catalog to a tape in accordance with an implementation of the present disclosure.
DETAILED DESCRIPTION OF THE DRAWINGS
Implementations of the present disclosure provide a tape storage system and a method of optimised writing of a catalog to a tape.
To make solutions of the present disclosure more comprehensible for a person skilled in the art, the following implementations of the present disclosure are described with reference to the accompanying drawings.
Terms such as "a first", "a second", "a third", and "a fourth" (if any) in the summary, claims, and foregoing accompanying drawings of the present disclosure are used to distinguish between similar objects and are not necessarily used to describe a specific sequence or order. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the implementations of the present disclosure described herein are, for example, capable of being implemented in sequences other than the sequences illustrated or described herein. Furthermore, the terms "include" and "have" and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, a method, a system, a product, or a device that includes a series of steps or units, is not necessarily limited to expressly listed steps or units but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.
Definitions:
Catalog: Catalog refers to a searchable database table that contains details about the different files/objects written into the storage it relates to.
Tracks: A tape is composed of tracks, on which data is written and from which data is read. A track runs the length of the tape, from tape start to tape end.
Wraps: A head of a tape drive is used for reading and writing. The head contains multiple read and write elements, which allow reading/writing of multiple adjacent tracks. A set of such adjacent tracks is called a wrap.
Bands: The tape is divided into bands. A tape drive head covers a width of a single band. Each band is composed of an even number of wraps.
Write Order: Data is recorded on a tape in a linear serpentine manner. A first wrap is written from start to end, then a second wrap from end to start, then a third wrap from start to end, and so on. The direction of recording and of reading is the direction of the relevant wrap. So, wraps 0, 2, 4, ... are from the start of tape to the end of tape, while wraps 1, 3, 5, ... are from the end of tape to the start of tape. A band is composed of an equal number of start-to-end wraps and end-to- start wraps.
Linear Tape Open-9 (LTO-9): In LTO-9 tapes (i.e. tapes constructed according to the Linear Tape Open specifications, generation 9), there are 32 tracks per wrap, 52 wraps per band (26 start-to-end wraps and 26 end-to-start wraps) and 4 bands.
Linear Tape Open (LTO) partition: In LTO-9, there is an option for up to 4 partitions. Each partition contains an even number of wraps.
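The write order defined above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `wrap_direction` and `direction_counts` are hypothetical, the wrap and band counts are simply the figures quoted in the definitions, and it is assumed that each band starts on an even-numbered wrap.

```python
def wrap_direction(wrap: int) -> str:
    """Return the recording direction of a wrap in linear serpentine order."""
    if wrap < 0:
        raise ValueError("wrap index must be non-negative")
    # Even wraps run from start of tape to end of tape, odd wraps run back.
    return "start_to_end" if wrap % 2 == 0 else "end_to_start"


def direction_counts(wraps_per_band: int) -> dict[str, int]:
    """Count forward and backward wraps in one band (assumed to start on an even wrap)."""
    if wraps_per_band % 2 != 0:
        raise ValueError("a band is composed of an even number of wraps")
    counts = {"start_to_end": 0, "end_to_start": 0}
    for wrap in range(wraps_per_band):
        counts[wrap_direction(wrap)] += 1
    return counts


# Figures quoted in the definitions above.
WRAPS_PER_BAND = 52
BANDS = 4

print([wrap_direction(w) for w in range(4)])
print(direction_counts(WRAPS_PER_BAND))
print("total wraps:", WRAPS_PER_BAND * BANDS)
```

With the quoted figures, each band splits evenly into start-to-end and end-to-start wraps, which is why a band (and a partition) always holds an even number of wraps.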
FIG. 1 illustrates a block diagram of a tape storage system 100 in accordance with an implementation of the present disclosure. The tape storage system 100 includes a disk storage module 102 (such as an SSD) for fast access to data, a plurality of tape storage modules 104A-N (such as LTO-9) for long-term storage of data, and a catalog module 106 for searching data stored on one or more of the plurality of tape storage modules 104A-N. The catalog module 106 includes a first level catalog 108 and a second level catalog 110. The second level catalog 110, written on each of the plurality of tape storage modules 104A-N, is configured to point to specific data stored in the specific tape storage module (e.g. 104A). The first level catalog 108, written on the disk storage module 102, is configured to point to the specific tape storage module (e.g. 104A) and includes offsets of the second level catalog 110. It is to be understood that any suitable technology or standard can be used for implementing the disk storage module 102 and the tape storage modules 104A-N, and the present disclosure is not limited to SSD and LTO-9.
Advantageously, the tape storage system 100 enables free text searches on the catalog without taking too much space in the disk storage module 102, thereby limiting the query time to an acceptable length. The tape storage system 100 maintains a 2-level catalog (i.e. the first level catalog 108 and the second level catalog 110) which enables the second level catalog 110 to be written several times at different offsets on the tape storage module (e.g. 104A-N) to reduce the seek time before reading the catalog, thereby ensuring fast response to search queries. The first level catalog 108 in the disk storage module 102 points to the specific tape storage module or a small number of the tape storage modules 104A-N. The second level catalog 110 is written to each of the tape storage modules 104A-N (i.e. each tape storage module holds the part of the catalog module 106 that describes the content of that tape storage module). The tape storage system 100 increases the life-span and the durability of the tape as it buffers large chunks of data in the disk storage module 102 before writing the data to the tape storage module (e.g. 104A-N).
Preferably, the second level catalog 110 is written at several wrap locations on each of the corresponding tape storage modules 104A-N.
Preferably, the first level catalog 108 is configured to point to the specific tape storage module (e.g. 104A) storing data in response to a search query and to include offsets of the second level catalog 110 present in the first level catalog 108.
Preferably, the respective second level catalog 110 of the specific tape storage module (e.g. 104A) is configured to be read to the disk storage medium to process the search query.
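The query path described above (the first level catalog on disk selects a tape, then that tape's second level catalog is read to locate the data) might be sketched as follows. This is a minimal sketch under stated assumptions: the class `FirstLevelCatalog`, the function `locate`, the term-to-tape index and the sample file names are all hypothetical, in-memory dictionaries stand in for the disk and the tapes, and the disclosure does not specify how the first level catalog is organised internally.

```python
from dataclasses import dataclass


@dataclass
class FirstLevelCatalog:
    """Kept on disk: which tapes may match a term, and where each tape's catalog copies sit."""
    term_to_tapes: dict[str, set[str]]       # hypothetical search index
    catalog_offsets: dict[str, list[float]]  # tape id -> positions as a fraction of tape length


# Stand-in for the tapes: tape id -> second level catalog (object name -> data offset).
SecondLevelCatalogs = dict[str, dict[str, int]]


def locate(term: str, first: FirstLevelCatalog, tapes: SecondLevelCatalogs):
    """Return (tape id, object name, data offset) for every object whose name contains the term."""
    hits = []
    # The first level catalog narrows the search to a small number of tapes.
    for tape_id in sorted(first.term_to_tapes.get(term, set())):
        # In a real system this read would seek to one of first.catalog_offsets[tape_id].
        second = tapes[tape_id]
        for name, offset in sorted(second.items()):
            if term in name:
                hits.append((tape_id, name, offset))
    return hits


first = FirstLevelCatalog(
    term_to_tapes={"invoice": {"tape-A"}},
    catalog_offsets={"tape-A": [1 / 3, 2 / 3]},
)
tapes = {
    "tape-A": {"invoice-2021.pdf": 4096, "photo.jpg": 8192},
    "tape-B": {"notes.txt": 0},
}
print(locate("invoice", first, tapes))
```

Only the tape named by the first level catalog is consulted, so the second level catalog of tape-B is never read for this query.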
Preferably, the second level catalog 110 is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules 104A-N such that irrespective of the position of a reader head, the second level catalog 110 is accessed before the reader head has rolled 1/Nth part of tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog 110 on the tape storage modules 104A-N.
The disk storage module 102 may be a Solid-State drive (SSD). Preferably, the tape storage module (e.g. 104A-N) is a Linear Tape Open-9 (LTO-9) drive, although the tape storage module (e.g. 104A-N) may include any type of tape storage technology known in the art.
Typically, in conventional storage systems, the second level catalog from a specific tape storage module may be read to the disk storage module (or to a memory cache) to process a search query. This may take more than a minute due to the seek time. The tape storage system 100 reduces this seek time by saving several copies (N) of the second level catalog 110 on the tape storage modules 104A-N at different offsets, so that the maximum seek distance is 1/Nth part of the tape storage module instead of the whole tape storage module. The average seek distance with N copies is (1/N)/2, i.e. half of the maximum of 1/N. For example, if N is 10, the tape storage system 100 reduces the seek time to a maximum of ~10 seconds instead of a maximum of ~100 seconds. That is, if the seek time for the whole tape storage module is up to 100 seconds, then for a 10th part of the distance the seek time is up to 10 seconds, with an average of 5 seconds. N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog 110 on the tape storage modules 104A-N.
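The seek-time arithmetic above can be checked with a short calculation. The sketch below assumes that seek time scales linearly with the distance rolled and that N copies of the catalog are stored; the function names `seek_bounds` and `choose_n`, and the size and budget parameters, are hypothetical and only illustrate the trade-off between seek time and storage space.

```python
import math
from typing import Optional


def seek_bounds(full_tape_seek_s: float, n: int) -> tuple[float, float]:
    """Return (maximum, average) seek time in seconds when the catalog is at most 1/N of the tape away."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    max_seek = full_tape_seek_s / n
    # The average is half of the maximum, i.e. (1/N)/2 of the full-tape seek.
    return max_seek, max_seek / 2


def choose_n(full_tape_seek_s: float, desired_max_seek_s: float,
             catalog_size: float, space_budget: float) -> Optional[int]:
    """Pick the smallest N meeting the desired maximum seek time, or None if N copies exceed the budget."""
    n = max(1, math.ceil(full_tape_seek_s / desired_max_seek_s))
    if n * catalog_size > space_budget:
        return None
    return n


print(seek_bounds(100.0, 10))
print(choose_n(100.0, 10.0, catalog_size=2.0, space_budget=50.0))
```

For a full-tape seek of 100 seconds and N = 10, this reproduces the maximum of 10 seconds and the average of 5 seconds given in the text.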
FIG. 2 illustrates an exemplary view of a second level catalog 206 that is written at one or more wrap locations (e.g. wrap 0, wrap 1 202, wrap 2, wrap 3 204) on a tape storage module in accordance with an implementation of the present disclosure. In this example, in a first scenario, the second level catalog 206 is written on wrap 1 202. As a result, the tape storage module/tape needs to be rolled almost for its entire length. In the second scenario, the second level catalog 206 is written on wrap 3 204. As a result, the tape storage module/tape needs to be rolled only for a small length.
FIG. 3 illustrates an exemplary view of a second level catalog 306 that is written at one or more wrap locations (e.g. wrap 0, wrap 1 302, wrap 2, wrap 3 304) on a tape storage module irrespective of a position of a reader head in accordance with an implementation of the present disclosure. In this example, the second level catalog 306 is written on 1/3 and 2/3 areas of the wraps (i.e. wrap 1 302 and wrap 3 304). Thus, irrespective of where the reader head is located, the tape storage module has less than 1/3 area to roll before it accesses the second level catalog 306. In this example, the second level catalog 306 is written on wrap 3 304 and the tape storage module/tape needs to be rolled only a small length. The same can be done for any N selected, where N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog 306 on the tape storage modules.
In an embodiment, the tape storage system has two methods to write the second level catalog 306 to the tape storage module. In a first method, the tape storage system writes the second level catalog 306 on the tape storage module in a single pass. In this first method, the whole data for this tape storage module is reordered in a staging area. This reordering includes deduplication of data and reordering of different segments in a way that may improve the read times. The second level catalog 306 is placed according to the selected N in a way that ensures that the tape storage module/tape does not need to be rolled more than 1/N of the tape length.
In a second method, the tape storage system writes the second level catalog 306 on the tape storage module in several appends. For this, a first partition of the tape storage module may be reserved for the second level catalog 306. Every time an append of data ends, the second level catalog 306 is written to the first partition at least N times or until a full wrap is filled. If only a single wrap is filled, the tape storage system may fill an opposite wrap next time. Otherwise, the tape storage system may start from the partition start. This method is typically performed with very large appends; otherwise, the tape storage module/tape may be ruined by rewriting too many times.
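The placement rule of the FIG. 3 example (copies at 1/3 and 2/3 of the tape length) can be illustrated with a simplified model. The sketch below treats the head position as a single fraction of the tape length between 0 and 1 and ignores wrap changes and direction, which is an assumption made for illustration; `catalog_positions` and `distance_to_nearest` are hypothetical names.

```python
def catalog_positions(n: int) -> list[float]:
    """Positions 1/N, 2/N, ..., (N-1)/N of the catalog copies, as fractions of the tape length."""
    if n < 2:
        raise ValueError("n must be at least 2 to place a copy inside the tape")
    return [k / n for k in range(1, n)]


def distance_to_nearest(head: float, n: int) -> float:
    """Fraction of the tape length to roll from the head position to the nearest catalog copy."""
    if not 0.0 <= head <= 1.0:
        raise ValueError("head position must lie between 0 and 1")
    return min(abs(head - position) for position in catalog_positions(n))


# Sweep the head over the whole tape and record the worst case for N = 3.
worst = max(distance_to_nearest(i / 1000, 3) for i in range(1001))
print(catalog_positions(3))
print(round(worst, 4))
```

In this model the worst case occurs when the head sits at either end of the tape, where the nearest copy is exactly 1/N of the tape length away.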
FIG. 4 is a flow diagram that illustrates a method of optimised writing of a catalog to a tape in accordance with an implementation of the present disclosure. At a step 402, the catalog is segmented into a first level catalog and a second level catalog. At a step 404, the second level catalog is written on a plurality of tape storage modules such that the second level catalog points to specific data stored on a respective tape storage module. At a step 406, the first level catalog is written on a disk storage module such that the first level catalog points to one or more of a plurality of tape storage modules and includes offsets of the second level catalog.
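Steps 402 to 406 can be sketched as a single function that splits a flat catalog into the two levels. This is an illustrative sketch, not the claimed implementation: `write_catalog` is a hypothetical name, dictionaries stand in for the disk and the tapes, catalog copies are placed at k/N of the tape length, and indexing object names by lowercase word is an assumption, since the text does not state how the first level catalog is kept small.

```python
import re


def write_catalog(entries, n):
    """entries: iterable of (tape id, object name, data offset). Returns (disk, tapes)."""
    if n < 2:
        raise ValueError("n must be at least 2")

    # Step 402: segment the catalog into a per-tape second level and a term index for the first level.
    second_level = {}
    term_to_tapes = {}
    for tape_id, name, data_offset in entries:
        second_level.setdefault(tape_id, {})[name] = data_offset
        for term in re.findall(r"[a-z0-9]+", name.lower()):
            term_to_tapes.setdefault(term, set()).add(tape_id)

    # Step 404: write a copy of each tape's second level catalog at every chosen position on that tape.
    copy_positions = [k / n for k in range(1, n)]
    tapes = {
        tape_id: {position: dict(catalog) for position in copy_positions}
        for tape_id, catalog in second_level.items()
    }

    # Step 406: write the first level catalog on disk, including the offsets of the second level copies.
    disk = {
        "term_to_tapes": term_to_tapes,
        "catalog_offsets": {tape_id: list(copy_positions) for tape_id in tapes},
    }
    return disk, tapes


entries = [
    ("tape-A", "invoice-2021.pdf", 4096),
    ("tape-B", "invoice-2022.pdf", 0),
    ("tape-A", "photo.jpg", 8192),
]
disk, tapes = write_catalog(entries, n=3)
print(sorted(disk["term_to_tapes"]["invoice"]))
print(disk["catalog_offsets"]["tape-A"])
```

The disk-side structure holds only terms, tape identifiers and offsets, while the per-object data offsets stay on the tapes.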
Advantageously, the method enables free text searches on the catalog without taking too much space in the disk storage module, thereby limiting the query time to an acceptable length. The method maintains a 2-level catalog which enables the second level catalog to be written several times in different offsets on the tape storage module to reduce the seek time before reading the catalog. The method increases the life-span and the durability of the tape as it buffers large chunks of data in the disk storage module before writing the data to the tape storage module. Preferably, the method further comprises writing the second level catalog at several wrap locations on each of the corresponding tape storage modules.
Preferably, the second level catalog is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules such that irrespective of the position of a reader head, the second level catalog is accessed before the reader head has rolled 1/Nth part of tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog on the tape storage modules.
In an implementation, a computer program is provided that includes instructions which, when the program is executed by a computer, cause the computer to carry out the steps of the above method.
It should be understood that the arrangement of components illustrated in the described figures is exemplary and that other arrangements may be possible. It should also be understood that the various system components (and means) defined by the claims, described below, and illustrated in the various block diagrams represent components in some systems configured according to the subject matter disclosed herein. For example, one or more of these system components (and means) may be realized, in whole or in part, by at least some of the components illustrated in the arrangements shown in the described figures.
In addition, while at least one of these components is implemented at least partially as an electronic hardware component, and therefore constitutes a machine, the other components may be implemented in software that, when included in an execution environment, constitutes a machine, hardware, or a combination of software and hardware.

Claims

1. A tape storage system (100) comprising: a disk storage module (102) for fast access to data; a plurality of tape storage modules (104A-N) for long-term storage of data; and a catalog module (106) for searching data stored on one or more of the plurality of tape storage modules (104A-N), wherein the catalog module (106) comprises a second level catalog (110, 206, 306), written on each of the plurality of tape storage modules (104A-N), configured to point to specific data stored in the specific tape storage module; and a first level catalog (108), written on the disk storage module (102), configured to point to a specific tape storage module.
2. The tape storage system (100) of claim 1, wherein the second level catalog (110, 206, 306) is written at several wrap locations on each of the corresponding tape storage modules (104A-N).
3. The tape storage system (100) of claim 1 or 2, wherein the first level catalog (108) is configured to point to the specific tape storage module storing data in response to a search query.
4. The tape storage system (100) of claim 3, wherein the respective second level catalog (110, 206, 306) of the specific tape storage module is configured to be read to the disk storage medium to process the search query.
5. The tape storage system (100) of any one of claims 2 to 4, wherein the second level catalog (110, 206, 306) is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules (104A-N) such that irrespective of the position of a reader head, the second level catalog (110, 206, 306) is accessed before the reader head has rolled 1/Nth part of tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog (110, 206, 306) on the tape storage modules (104A-N).
6. The tape storage system (100) of any preceding claim, wherein the disk storage module (102) is a Solid-State drive.
7. The tape storage system (100) of any preceding claim, wherein the tape storage module (104A-N) is an LTO-9 drive.
8. A method of optimised writing of a catalog to a tape comprising: segmenting the catalog into a first level catalog (108) and a second level catalog (110, 206, 306); writing the second level catalog (110, 206, 306) on a plurality of tape storage modules (104A-N) such that the second level catalog (110, 206, 306) points to specific data stored on a respective tape storage module; and writing the first level catalog (108) on a disk storage module (102) such that the first level catalog (108) points to one or more of the plurality of tape storage modules (104A-N) and comprises offsets of the second level catalog (110, 206, 306).
9. The method of claim 8, further comprising writing the second level catalog (110, 206, 306) at several wrap locations on each of the corresponding tape storage modules (104A-N).
10. The method of claim 9, wherein the second level catalog (110, 206, 306) is written in 1/N, 2/N up to (N-1)/N areas of wraps of the tape storage modules (104A-N) such that irrespective of the position of a reader head, the second level catalog (110, 206, 306) is accessed before the reader head has rolled 1/Nth part of tape length, wherein N is an integer and is selected according to a desired maximum seek time and storage space required to store multiple copies of the second level catalog (110, 206, 306) on the tape storage modules (104A-N).

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/074588 WO2024051913A1 (en) 2022-09-05 2022-09-05 Tape storage system

Publications (1)

Publication Number Publication Date
WO2024051913A1 (en) 2024-03-14

Family

ID=83398289

Country Status (1)

Country Link
WO (1) WO2024051913A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010034811A1 (en) * 1997-12-10 2001-10-25 Robert Beverley Basham Host-available device block map for optimized file retrieval from serpentine tape drives
US20150261465A1 (en) * 2014-02-21 2015-09-17 Netapp, Inc. Systems and methods for storage aggregates and infinite storage volumes
US20220164110A1 (en) * 2020-11-24 2022-05-26 International Business Machines Corporation Non-volatile storage of high resolution tape directory

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22773468

Country of ref document: EP

Kind code of ref document: A1