WO2023241783A1 - Device and method for improved redundant storing of sequential access data - Google Patents


Info

Publication number
WO2023241783A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
storage
sads
store
parity information
Prior art date
Application number
PCT/EP2022/066073
Other languages
French (fr)
Inventor
Assaf Natanzon
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/EP2022/066073
Publication of WO2023241783A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1044 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1048 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/108 Parity data distribution in semiconductor storages, e.g. in SSD

Definitions

  • the present disclosure relates to the field of storage, data redundancy, RAID (that is, a redundant array of independent disks) and tape drives and provides a device for improved redundant storing of data in a sequential access manner. Moreover, the present disclosure provides a corresponding method and computer program.
  • an erasure code is used for redundant storing of data.
  • a regenerative code is a further type of erasure code that allows for faster rebuilding.
  • Erasure codes are codes that allow for recovery in case of loss of some of the data.
  • RAID (redundant array of independent disks) 5 for example (the concept of which is illustrated in FIG. 7) uses one additional disk which includes the parity data of all the other devices and allows for recovery of lost data in case of failure of one disk.
  • RAID 6 for example is a scheme that allows rebuilding data in case of two drive failures.
  • RAID 5 and RAID 6 require reading all data from remaining disks in case of a drive failure.
  • a RAID 5 system with two disks of data and one disk for parity is shown in FIG. 7.
  • the disks are sliced, and parity is kept for each slice.
  • the parity is stored on the first disk and the data on the two other disks.
  • each block on the disk can be restored using (Disk0 XOR Disk1).
  • RAIT: redundant array of independent tapes
  • FIG. 8, e.g., shows how subsequent blocks of File A are written to three tape drives in a round robin fashion.
  • File A can also be called “sequential access data”.
  • File A is striped to three physical tape drives, which however are visualized to appear as a single drive.
  • the disadvantage is that many tape drives are needed to achieve a RAID with a large parity number and low overhead.
  • an objective of embodiments of the present disclosure is to provide an improved redundancy mechanism for storing sequential access data.
  • a first aspect of the present disclosure provides a device for redundancy of sequential access data, wherein the device is configured to store a predefined amount of data in a sequential access data store, SADS; determine parity information based on the predefined amount of data; and when the predefined amount of data is stored in the SADS, store the parity information in the SADS.
  • SADS: sequential access data store
  • the device is configured to obtain the predefined amount of data, before storing it to the SADS.
  • the predefined amount of data is obtained from an external device.
  • the SADS can be external to the device for redundancy of sequential access data.
  • the SADS can be part of the device for redundancy of sequential access data.
  • sequential access data comprises data which is to be stored in an SADS.
  • the device is further configured to store the parity information in the SADS when the predefined amount of data is stored in the SADS completely.
  • the device is further configured to store the parity information in a direct access data store, DADS.
  • DADS: direct access data store
  • parity information can be stored in a manner which allows direct access to the parity information, which is very efficient during calculation of the parity information, as it does not need to be stored in a sequential access manner.
  • the DADS can be external to the device for redundancy of sequential access data.
  • the DADS can be part of the device for redundancy of sequential access data.
  • the device is further configured to store the parity information in the DADS before the parity information is stored in the SADS.
  • the device is further configured to, for each element of the predefined amount of data that is being stored in the SADS, update the parity information. This ensures that the parity information can be updated, e.g., while it is stored in the DADS, so that the parity information only needs to be written to the SADS once.
  • the element comprises at least one of: a file; a data block.
  • the DADS is a storage of a higher storage tier, compared to the SADS.
  • the DADS comprises at least one of primary storage, secondary storage.
  • the primary storage comprises at least one of a main memory, a random access memory.
  • the secondary storage comprises at least one of a mass storage device, a hard disk drive, a solid state drive, a flash drive, a RAM disk, a non-volatile memory.
  • the SADS comprises tertiary storage.
  • the tertiary storage comprises a tape drive.
  • the SADS comprises a number of X storage entities, wherein X is an integer greater than 1 and wherein each storage entity in the X storage entities can store the same amount of data. It is also possible that one or more storage entities in the X storage entities can store a different amount of data.
  • a storage entity is a virtual or a physical storage drive.
  • a storage medium in the storage entity can be replaced during storing the predefined amount of data in the SADS.
  • the storage entity is a tape drive
  • a tape in the tape drive can be replaced when it is full.
  • the DADS comprises storage space for a number of Y storage entities, wherein each storage entity in the Y storage entities can store the same amount of data as a storage entity in the X storage entities.
  • the ratio of X and Y corresponds to a RAID level of the predefined amount of data when being stored by means of the device.
  • RAID relates to redundant array of inexpensive disks.
  • a second aspect of the present disclosure provides a method for redundancy of sequential access data, the method comprising the steps of storing, by a device, a predefined amount of data in a sequential access data store, SADS; determining, by the device, parity information based on the predefined amount of data; and when the predefined amount of data is stored in the SADS, storing, by the device, the parity information in the SADS.
  • the method further comprises storing, by the device, the parity information in the SADS when the predefined amount of data is stored in the SADS completely.
  • the method further comprises storing, by the device, the parity information in a direct access data store, DADS.
  • the method further comprises storing, by the device, the parity information in the DADS before the parity information is stored in the SADS.
  • the method further comprises, for each element of the predefined amount of data that is being stored in the SADS, updating, by the device, the parity information.
  • the DADS is a storage of a higher storage tier, compared to the SADS.
  • the DADS comprises at least one of primary storage, secondary storage.
  • the primary storage comprises at least one of a main memory, a random access memory.
  • the secondary storage comprises at least one of a mass storage device, a hard disk drive, a solid state drive, a flash drive, a RAM disk, a non-volatile memory.
  • the SADS comprises tertiary storage.
  • the tertiary storage comprises a tape drive.
  • the SADS comprises a number of X storage entities, wherein X is an integer greater than 1 and wherein each storage entity in the X storage entities can store the same amount of data. It is also possible that one or more storage entities in the X storage entities can store a different amount of data.
  • the DADS comprises storage space for a number of Y storage entities, wherein each storage entity in the Y storage entities can store the same amount of data as a storage entity in the X storage entities.
  • the second aspect and its implementation forms include the same advantages as the first aspect and its respective implementation forms.
  • a third aspect of the present disclosure provides a computer program comprising instructions which, when the program is executed by a computer, cause the computer to perform the method according to the second aspect or any of its implementation forms.
  • the third aspect includes the same advantages as the first aspect and its respective implementation forms.
  • FIG. 1 shows a schematic view of a device according to an embodiment of the present disclosure
  • FIG. 2 shows a schematic view of a device according to an embodiment of the present disclosure in more detail
  • FIG. 3 shows a schematic view of a storage archive system
  • FIG. 4 shows a schematic view of an operating scenario according to the present disclosure
  • FIG. 5 shows another schematic view of an operating scenario according to the present disclosure
  • FIG. 6 shows a schematic view of a method according to an embodiment of the present disclosure
  • FIG. 7 shows a schematic view of a conventional RAID system
  • FIG. 8 shows a schematic view of a conventional RAIT system.
  • FIG. 1 shows a schematic view of a device 100.
  • the device 100 is for improved redundant storing of sequential access data.
  • the device 100 is configured to store a predefined amount of data 101 in a sequential access data store, SADS 102.
  • the device 100 may be configured to obtain the predefined amount of data 101 from an external device with the intention to have the predefined amount of data 101 stored in a SADS 102.
  • the predefined amount of data 101 may be provided to the device from a computing device, e.g., via a network connection or a system bus.
  • the device 100 is further configured to determine parity information 103 based on the predefined amount of data 101.
  • the device 100 obtains the parity information 103 and also stores the parity information 103. More precisely, when the predefined amount of data 101 is stored in the SADS 102, the device 100 stores the parity information 103 in the SADS 102. Thereby, the parity information 103 can be stored independently from the predefined amount of data 101 (e.g., a file, a data object, or a block), which increases efficiency and reduces overhead.
  • the device 100 may first store the predefined amount of data 101 in the SADS 102 completely and then store the parity information 103 in the SADS 102. That is, the parity information 103 is stored in the SADS 102, after its final state is determined and no further changes to the parity information 103 (which can be changed with every piece of the predefined amount of data 101 that is stored in the SADS 102) are to be expected.
  • the device 100 is now going to be described in more detail in view of FIG. 2.
  • the device 100 of FIG. 2 includes all functions and features of the device 100 as described in view of FIG. 1.
  • the device 100 may optionally store the parity information 103 in a direct access data store, DADS 201.
  • in the DADS 201, changes can be easily applied to the parity information 103 based on the predefined amount of data 101, while the predefined amount of data 101 is stored to the SADS 102.
  • the device 100 may store the parity information 103 in the DADS 201 before the parity information 103 is stored in the SADS 102.
  • the parity information 103 is stored in the SADS after it was stored in the DADS 201, where it could be easily amended by means of direct access to the parity information 103.
  • FIG. 2 schematically shows that optionally the parity information 103 can be updated for (and based on) each element 202 of the predefined amount of data 101 that is being stored in the SADS 102. That is, if the predefined amount of data 101 is not yet stored in the SADS 102 completely, the parity information 103 may change based on said elements 202.
  • the DADS 201 can be a storage of a higher storage tier, compared to the SADS 102.
  • Storage tiering is a method of storing data according to how they are accessed on different storage media (so-called storage tiers).
  • the DADS 201 comprises at least one of primary storage 203, secondary storage 204.
  • the primary storage 203 may comprise at least one of a main memory, a random access memory.
  • the secondary storage 204 may comprise at least one of: a mass storage device, a hard disk drive, a solid state drive, a flash drive, a RAM disk, a non-volatile memory.
  • the SADS 102 may comprise tertiary storage 205.
  • the tertiary storage 205 may comprise a tape drive.
  • the device 100 optionally also may be regarded as a storage system (or as a part of such), which may expose multiple protocols such as file system and object storage and can keep data in multiple tiers including a tape tier long term archive (which corresponds to the SADS 102 and is labelled with 102 in FIG. 3).
  • the device 100 may support any RAID type independently from the number of tapes.
  • the device 100 may keep the RAID for the tapes on a higher tier (i.e., the DADS 201, which comprises a FILE system/object interface, an SSD tier or a HDD tier).
  • the parity information 103 is stored in a higher tier (i.e. the DADS 201), before it is stored in the SADS 102.
  • the device 100 may need space for N tapes in the higher tiers, where N is the number of parity devices. So, for example if a tape needs double protection, and each tape has 10TB of data, the total space needed on higher HDD/SSD tiers will be 20TB.
  • Data that is to be written to tapes can be either tiered to the HDD/SSD (and written to tape later) or directly written to the storage.
  • the data will be written directly to the tape and the parity of the data will be kept on the higher tier.
  • the device 100 can create a RAID, which is larger than the storage volume of single tapes by writing simultaneously to multiple tapes (i.e., the SADS 102) and updating the parity in the DADS 201.
  • as the parity is linear, it can be assumed that the tapes that are not written yet contain only zeros and thus do not impact the current parity.
  • FIG. 4 and FIG. 5 below show how to build a RAID 6 (four data units + two parity units) using only three tape drives. This can be realized, as in the beginning objects are only written to the first three tape drives (which form the SADS 102), and parity is kept on the HDD/SSD (i.e., the DADS 201).
  • the SADS 102 comprises a number of X storage entities, wherein X is an integer greater than 1 and wherein each storage entity in the X storage entities can store the same amount of data.
  • X equals three and the three storage entities correspond to “Tape 1”, “Tape 2” and “Tape 3”.
  • cassettes are replaced in the three tape drives, and three new cassettes are put in.
  • the data is written to cassette four, while the parity is calculated from the saved parity on the higher tier with the new data being written. Portions of the parity written to the tape can then be erased from the higher tier.
  • the DADS 201 is formed by the “higher tier” (i.e., by “Parity 1” and “Parity 2”) and comprises storage space for a number of Y storage entities, wherein each storage entity in the Y storage entities can store the same amount of data as a storage entity in the X storage entities.
  • Y equals two.
  • Data can be written to multiple tape drives in parallel, i.e. file 1 to tape 1, file 2 to tape 2 and file 3 again to tape 1. Once all the tapes are written up to some location, the parity of that portion can be written to the parity tape.
  • Recovery of a faulty tape can be done in a similar way, by rebuilding the data of the faulty tape in the higher tier. This allows the rebuild to be done with any number of available tapes.
  • FIG. 6 shows a schematic view of a method 600.
  • the method 600 is for redundancy of sequential access data and comprises a first step of storing 601, by a device 100, a predefined amount of data 101 in a sequential access data store, SADS 102.
  • the method further comprises a second step of determining 602, by the device 100, parity information 103 based on the predefined amount of data 101.
  • the method comprises a third step of, when the predefined amount of data 101 is stored in the SADS 102, storing 603, by the device 100, the parity information 103 in the SADS 102.
  • the present disclosure has been described in conjunction with various embodiments as examples as well as implementations.
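The statements above that the parity is linear, that unwritten tapes can be treated as zeros, and that the higher tier needs space for N tapes can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: it uses a single XOR parity (the second parity unit of a RAID 6 would need a different linear code, which is not shown), and the names `xor_into`, `tapes` and `parity_space_tb` as well as the 4-byte tape contents are hypothetical.

```python
def xor_into(parity: bytearray, block: bytes) -> None:
    """XOR a block into the parity buffer in place (hypothetical helper)."""
    if len(block) != len(parity):
        raise ValueError("block and parity must have the same length")
    for i, byte in enumerate(block):
        parity[i] ^= byte


BLOCK = 4  # illustrative size; real tapes hold terabytes
tapes = [
    b"\x01\x01\x01\x01",
    b"\x02\x02\x02\x02",
    b"\x04\x04\x04\x04",
    b"\x08\x08\x08\x08",
]

# The parity lives on the higher tier (the DADS) and starts as all zeros.
parity = bytearray(BLOCK)

# First batch: only three tape drives exist, so tapes 1 to 3 are written.
for tape in tapes[:3]:
    xor_into(parity, tape)
parity_after_first_batch = bytes(parity)

# Treating the unwritten tape 4 as all zeros leaves the parity unchanged.
with_zero_tape = bytearray(parity_after_first_batch)
xor_into(with_zero_tape, bytes(BLOCK))

# Second batch: cassettes are swapped, tape 4 is written, and the saved
# parity is updated with the new data.
xor_into(parity, tapes[3])
final_parity = bytes(parity)


def parity_space_tb(parity_units: int, tape_capacity_tb: float) -> float:
    """Space needed on the higher tier: one tape capacity per parity unit."""
    return parity_units * tape_capacity_tb
```

With two parity units and 10 TB per tape, `parity_space_tb(2, 10)` gives the 20 TB stated above.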

Abstract

The present disclosure relates to the field of storage, data redundancy, RAID and tape drives. A device (100) is proposed for redundancy of sequential access data, wherein the device (100) is configured to store a predefined amount of data (101) in a sequential access data store, SADS (102). The device is further configured to determine parity information (103) based on the predefined amount of data (101) and, when the predefined amount of data (101) is stored in the SADS (102), store the parity information (103) in the SADS (102).

Description

DEVICE AND METHOD FOR IMPROVED REDUNDANT STORING OF SEQUENTIAL ACCESS DATA
TECHNICAL FIELD
The present disclosure relates to the field of storage, data redundancy, RAID (that is, a redundant array of independent disks) and tape drives and provides a device for improved redundant storing of data in a sequential access manner. Moreover, the present disclosure provides a corresponding method and computer program.
BACKGROUND
In a conventional storage system, an erasure code is used for redundant storing of data. A regenerative code is a further type of erasure code that allows for faster rebuilding. Erasure codes are codes that allow for recovery in case of loss of some of the data. RAID (redundant array of independent disks) 5 for example (the concept of which is illustrated in FIG. 7) uses one additional disk which includes the parity data of all the other devices and allows for recovery of lost data in case of failure of one disk. RAID 6 for example is a scheme that allows rebuilding data in case of two drive failures. RAID 5 and RAID 6 require reading all data from remaining disks in case of a drive failure.
Further codes, such as zigzag codes, allow reading significantly less data in case of a drive failure, and can also fix more than one disk failure. Unfortunately, zigzag codes get exponentially big and thus are not practical for systems with more than 10 disk drives.
As a conventional example, a RAID 5 system with two disks of data and one disk for parity is shown in FIG. 7. The disks are sliced, and parity is kept for each slice. The disk which contains the parity is changed for every slice: for the first slice, the data is on the first two disks (A1, A2), and the parity Ap = (A1 XOR A2) is stored on the third disk. For the second slice, the parity is stored on the first disk and the data on the two other disks. In case disk 2 fails, each block on the disk can be restored using (Disk0 XOR Disk1). In a tape system RAID (called redundant array of independent tapes, RAIT), the data usually is written to multiple tapes which appear as a single tape. FIG. 8, e.g., shows how subsequent blocks of File A are written to three tape drives in a round robin fashion. As File A is to be written to tape drives, which only allow for sequential access, File A can also be called “sequential access data”. As shown in the figure, File A is striped to three physical tape drives, which however are visualized to appear as a single drive.
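The XOR relation used in the RAID 5 example above can be sketched in a few lines of Python. This is a minimal illustration, not part of the disclosure: the helper name `xor_blocks` and the 4-byte block contents are assumptions chosen for readability.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# One slice of the two-data-disk layout: A1 and A2 are data, Ap is the parity.
a1 = b"\x01\x02\x03\x04"
a2 = b"\x10\x20\x30\x40"
ap = xor_blocks(a1, a2)

# If the disk holding A2 fails, A2 is rebuilt from the two surviving blocks.
rebuilt_a2 = xor_blocks(a1, ap)
```

Because XOR is its own inverse, the same function serves both for computing the parity and for rebuilding a lost block.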
Tape libraries which support RAID (i.e., RAIT) write the data directly to the tape together with the parity. This means that the number of tape drives determines the RAID that is available to the system.
The disadvantage is that many tape drives are needed to achieve a RAID with a large parity number and low overhead.
As a result, there is the need for an improved redundancy mechanism for storing sequential access data.
SUMMARY
In view of the above-mentioned problem, an objective of embodiments of the present disclosure is to provide an improved redundancy mechanism for storing sequential access data.
This or other objectives may be achieved by embodiments of the present disclosure as described in the enclosed independent claims. Advantageous implementations of embodiments of the present disclosure are further defined in the dependent claims.
A first aspect of the present disclosure provides a device for redundancy of sequential access data, wherein the device is configured to store a predefined amount of data in a sequential access data store, SADS; determine parity information based on the predefined amount of data; and when the predefined amount of data is stored in the SADS, store the parity information in the SADS.
This ensures that the parity information is stored in the SADS when the predefined amount of data is already stored in the SADS. That is, as the parity information, which changes during storing the predefined amount of data, can be stored separately from the predefined amount of data, efficiency of storing sequential access data is improved and overhead is decreased.
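The order of operations in the first aspect (store the data, determine the parity, store the parity once the data is in the SADS) might be sketched as follows. This is a hedged illustration only: `SequentialStore` is a hypothetical append-only stand-in for a tape, XOR is assumed as the parity function, and all elements are assumed to have the same length.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SequentialStore:
    """Hypothetical append-only stand-in for an SADS such as a tape."""
    records: List[Tuple[str, bytes]] = field(default_factory=list)

    def append(self, label: str, payload: bytes) -> None:
        # Sequential access: records can only be added at the end.
        self.records.append((label, payload))


def store_with_trailing_parity(sads: SequentialStore, elements: List[bytes]) -> bytes:
    """Append every element to the SADS, then append a single parity record."""
    if not elements:
        raise ValueError("at least one element is required")
    size = len(elements[0])
    parity = bytearray(size)  # the parity changes while the data is stored
    for index, element in enumerate(elements):
        if len(element) != size:
            raise ValueError("all elements must have the same length")
        sads.append(f"data-{index}", element)
        for i, byte in enumerate(element):
            parity[i] ^= byte
    # The parity is final only now, so it is written to the SADS exactly once.
    sads.append("parity", bytes(parity))
    return bytes(parity)


tape = SequentialStore()
final_parity = store_with_trailing_parity(tape, [b"\x0f\x0f", b"\xf0\x01"])
```

In this sketch the parity record always follows the data records, so no interim parity ever has to be written to the sequential store.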
In particular, the device is configured to obtain the predefined amount of data, before storing it to the SADS. In particular, the predefined amount of data is obtained from an external device. In particular, the SADS can be external to the device for redundancy of sequential access data. In particular, the SADS can be part of the device for redundancy of sequential access data.
In particular, sequential access data comprises data which is to be stored in an SADS.
In an implementation form of the first aspect, the device is further configured to store the parity information in the SADS when the predefined amount of data is stored in the SADS completely.
This ensures that the parity information can be fully calculated before it is stored in the SADS, thereby reducing overhead (e.g., for interim storing of parity information in an SADS) and further increasing efficiency.
In a further implementation form of the first aspect, the device is further configured to store the parity information in a direct access data store, DADS.
This ensures that the parity information can be stored in a manner which allows direct access to the parity information, which is very efficient during calculation of the parity information, as it does not need to be stored in a sequential access manner.
In particular, the DADS can be external to the device for redundancy of sequential access data. In particular, the DADS can be part of the device for redundancy of sequential access data.
In a further implementation form of the first aspect, the device is further configured to store the parity information in the DADS before the parity information is stored in the SADS.
This is beneficial as the parity information does not need to be stored to an SADS during storing the predefined amount of data.
In a further implementation form of the first aspect, the device is further configured to, for each element of the predefined amount of data that is being stored in the SADS, update the parity information. This ensures that the parity information can be updated, e.g., while it is stored in the DADS, so that the parity information only needs to be written to the SADS once.
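A per-element parity update in a direct access store could look like the following sketch. It assumes XOR parity and an in-memory buffer as the DADS. The class `ParityBuffer`, the list `sads_log` and the tape labels are hypothetical names introduced only for this illustration.

```python
class ParityBuffer:
    """Hypothetical parity region held in a direct access store (e.g. RAM)."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def update(self, offset: int, element: bytes) -> None:
        """XOR an element into the parity at the given offset, in place."""
        if offset < 0 or offset + len(element) > len(self.data):
            raise ValueError("element does not fit into the parity buffer")
        for i, byte in enumerate(element):
            self.data[offset + i] ^= byte


buffer = ParityBuffer(4)
sads_log = []  # stand-in for the sequential writes that reach the tapes

# Elements for two tapes arrive interleaved; equal offsets on different
# tapes contribute to the same parity bytes.
incoming = [
    (1, 0, b"\xaa\xbb"),
    (2, 0, b"\x0f\x0f"),
    (1, 2, b"\x01\x02"),
    (2, 2, b"\x10\x20"),
]
for tape, offset, element in incoming:
    sads_log.append((f"tape-{tape}", offset, element))
    buffer.update(offset, element)  # direct access: any offset, any order

# Only after all data elements are stored is the parity written to the SADS.
sads_log.append(("parity-tape", 0, bytes(buffer.data)))
parity_writes = sum(1 for label, _, _ in sads_log if label == "parity-tape")
```

The design point is that every update touches the buffer in place, which a sequential store could not do without rewriting earlier parity.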
In particular, the element comprises at least one of: a file; a data block.
In a further implementation form of the first aspect the DADS is a storage of a higher storage tier, compared to the SADS.
This is beneficial as a higher storage tier allows for direct access to data, while a lower storage tier only allows for sequential access to data.
In a further implementation form of the first aspect the DADS comprises at least one of primary storage, secondary storage.
This ensures that storage types can be used which allow for effectively using direct access to data.
In a further implementation form of the first aspect the primary storage comprises at least one of a main memory, a random access memory.
This ensures that memory types can be used which allow for effectively using direct access to data.
In a further implementation form of the first aspect the secondary storage comprises at least one of a mass storage device, a hard disk drive, a solid state drive, a flash drive, a RAM disk, a non-volatile memory.
This ensures that memory or disk types can be used which allow for effectively using direct access to data.
In a further implementation form of the first aspect the SADS comprises tertiary storage.
This ensures that storage types can be used which allow for effectively using sequential access to data.
In a further implementation form of the first aspect the tertiary storage comprises a tape drive.
This ensures that drive types can be used which allow for effectively using sequential access to data.
In a further implementation form of the first aspect the SADS comprises a number of X storage entities, wherein X is an integer greater than 1 and wherein each storage entity in the X storage entities can store the same amount of data. It is also possible that one or more storage entities in the X storage entities can store a different amount of data.
This ensures that a number of X storages, e.g., tape drives, can be used to implement the SADS.
In particular, a storage entity is a virtual or a physical storage drive. In particular, a storage medium in the storage entity can be replaced during storing the predefined amount of data in the SADS. For example, if the storage entity is a tape drive, a tape in the tape drive can be replaced when it is full.
In a further implementation form of the first aspect the DADS comprises storage space for a number of Y storage entities, wherein each storage entity in the Y storage entities can store the same amount of data as a storage entity in the X storage entities.
This ensures that a desired RAID level can be flexibly implemented by the DADS and the SADS.
In particular, the ratio of X and Y corresponds to a RAID level of the predefined amount of data when being stored by means of the device.
In particular, the term RAID relates to redundant array of inexpensive disks.
A second aspect of the present disclosure provides a method for redundancy of sequential access data, the method comprising the steps of storing, by a device, a predefined amount of data in a sequential access data store, SADS; determining, by the device, parity information based on the predefined amount of data; and when the predefined amount of data is stored in the SADS, storing, by the device, the parity information in the SADS.
In an implementation form of the second aspect, the method further comprises storing, by the device, the parity information in the SADS when the predefined amount of data is stored in the SADS completely.
In a further implementation form of the second aspect, the method further comprises storing, by the device, the parity information in a direct access data store, DADS.
In a further implementation form of the second aspect, the method further comprises storing, by the device, the parity information in the DADS before the parity information is stored in the SADS.
In a further implementation form of the second aspect, the method further comprises, for each element of the predefined amount of data that is being stored in the SADS, updating, by the device, the parity information.
In a further implementation form of the second aspect the DADS is a storage of a higher storage tier, compared to the SADS.
In a further implementation form of the second aspect the DADS comprises at least one of primary storage, secondary storage.
In a further implementation form of the second aspect the primary storage comprises at least one of a main memory, a random access memory.
In a further implementation form of the second aspect the secondary storage comprises at least one of a mass storage device, a hard disk drive, a solid state drive, a flash drive, a RAM disk, a non-volatile memory.
In a further implementation form of the second aspect the SADS comprises tertiary storage.
In a further implementation form of the second aspect the tertiary storage comprises a tape drive.
In a further implementation form of the second aspect the SADS comprises a number of X storage entities, wherein X is an integer greater than 1 and wherein each storage entity in the X storage entities can store the same amount of data. It is also possible that one or more storage entities in the X storage entities can store a different amount of data.
In a further implementation form of the second aspect the DADS comprises storage space for a number of Y storage entities, wherein each storage entity in the Y storage entities can store the same amount of data as a storage entity in the X storage entities.
The second aspect and its implementation forms include the same advantages as the first aspect and its respective implementation forms.
A third aspect of the present disclosure provides a computer program comprising instructions which, when the program is executed by a computer, cause the computer to perform the method according to the second aspect or any of its implementation forms.
The third aspect includes the same advantages as the first aspect and its respective implementation forms.
It has to be noted that all devices, elements, units and means described in the present application could be implemented in the software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.
BRIEF DESCRIPTION OF DRAWINGS
The above-described aspects and implementation forms of the present disclosure will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which
FIG. 1 shows a schematic view of a device according to an embodiment of the present disclosure;
FIG. 2 shows a schematic view of a device according to an embodiment of the present disclosure in more detail;
FIG. 3 shows a schematic view of a storage archive system;
FIG. 4 shows a schematic view of an operating scenario according to the present disclosure;
FIG. 5 shows another schematic view of an operating scenario according to the present disclosure;
FIG. 6 shows a schematic view of a method according to an embodiment of the present disclosure;
FIG. 7 shows a schematic view of a conventional RAID system;
FIG. 8 shows a schematic view of a conventional RAIT system.
DETAILED DESCRIPTION OF EMBODIMENTS
FIG. 1 shows a schematic view of a device 100. The device 100 is for improved redundant storing of sequential access data. To this end, the device 100 is configured to store a predefined amount of data 101 in a sequential access data store, SADS 102. The device 100 may be configured to obtain the predefined amount of data 101 from an external device with the intention to have the predefined amount of data 101 stored in a SADS 102. For example, the predefined amount of data 101 may be provided to the device from a computing device, e.g., via a network connection or a system bus. The device 100 is further configured to determine parity information 103 based on the predefined amount of data 101. To store the predefined amount of data 101 in a redundant manner, the device 100 obtains the parity information 103 and also stores the parity information 103. More precisely, when the predefined amount of data 101 is stored in the SADS 102, the device 100 stores the parity information 103 in the SADS 102. Thereby, the parity information 103 can be stored independently from the predefined amount of data 101 (e.g., a file, a data object, or a block), which increases efficiency and reduces overhead.
Optionally, the device 100 may first store the predefined amount of data 101 in the SADS 102 completely and then store the parity information 103 in the SADS 102. That is, the parity information 103 is stored in the SADS 102, after its final state is determined and no further changes to the parity information 103 (which can be changed with every piece of the predefined amount of data 101 that is stored in the SADS 102) are to be expected.
The device 100 is now going to be described in more detail in view of FIG. 2. The device 100 of FIG. 2 includes all functions and features of the device 100 as described in view of FIG. 1.
As it is illustrated in FIG. 2, the device 100 may optionally store the parity information 103 in a direct access data store, DADS 201. In the DADS 201, changes can be easily applied to the parity information 103 based on the predefined amount of data 101, while the predefined amount of data 101 is stored to the SADS 102.
Further optionally, the device 100 may store the parity information 103 in the DADS 201 before the parity information 103 is stored in the SADS 102. Thus, the parity information 103 is stored in the SADS after it was stored in the DADS 201, where it could be easily amended by means of direct access to the parity information 103.
FIG. 2 schematically shows that optionally the parity information 103 can be updated for (and based on) each element 202 of the predefined amount of data 101 that is being stored in the SADS 102. That is, if the predefined amount of data 101 is not yet stored in the SADS 102 completely, the parity information 103 may change based on said elements 202.
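The ordering described above (parity kept and updated in the DADS while data is written, then stored in the SADS once the data is complete) can be illustrated with a minimal sketch. It assumes plain XOR parity and equally sized elements; the names `SequentialStore` and `store_with_parity` are hypothetical and do not appear in the disclosure.

```python
class SequentialStore:
    """Append-only stand-in for a SADS such as a tape (hypothetical helper)."""

    def __init__(self):
        self._records = []

    def append(self, record: bytes) -> None:
        self._records.append(record)

    def records(self) -> list:
        return list(self._records)


def store_with_parity(elements: list, sads: SequentialStore, dads: dict) -> None:
    """Write elements to the SADS while updating XOR parity held in the DADS.

    Assumption: all elements have the same length and parity is a plain XOR.
    """
    dads["parity"] = bytes(len(elements[0]))  # parity starts as all zeros
    for element in elements:
        sads.append(element)  # sequential write of one element
        # direct-access update of the parity for this element
        dads["parity"] = bytes(a ^ b for a, b in zip(dads["parity"], element))
    # the data is stored completely, so the parity has reached its final state
    sads.append(dads.pop("parity"))


sads = SequentialStore()
dads = {}
store_with_parity([b"\x01\x02", b"\x10\x20", b"\x0f\x0f"], sads, dads)
```

In this sketch the parity record is the last record appended to the sequential store, and the staging entry is removed from the direct access store once it has been written.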
Further optionally, the DADS 201 can be a storage of a higher storage tier, compared to the SADS 102. Storage tiering is a method of storing data according to how they are accessed on different storage media (so-called storage tiers).
Further optionally, and as illustrated in FIG. 2, the DADS 201 comprises at least one of primary storage 203, secondary storage 204. The primary storage 203 may comprise at least one of a main memory, a random access memory. The secondary storage 204 may comprise at least one of: a mass storage device, a hard disk drive, a solid state drive, a flash drive, a RAM disk, a non-volatile memory.
Further optionally, and as illustrated in FIG. 2, the SADS 102 may comprise tertiary storage 205. Optionally, the tertiary storage 205 may comprise a tape drive.
As it is now going to be described in view of FIG. 3, the device 100 optionally also may be regarded as a storage system (or as a part of such), which may expose multiple protocols such as file system and object storage and can keep data in multiple tiers including a tape tier long term archive (which corresponds to the SADS 102 and is labelled with 102 in FIG. 3).
The device 100 may support any RAID type independently from the number of tapes. The device 100 may keep the RAID for the tapes on a higher tier (i.e., the DADS 201, which comprises a FILE system/object interface, an SSD tier or a HDD tier). In other words, the parity information 103 is stored in a higher tier (i.e. the DADS 201), before it is stored in the SADS 102. Thus, the device 100 may need space for N tapes in the higher tiers, where N is the number of parity devices. So, for example if a tape needs double protection, and each tape has 10TB of data, the total space needed on higher HDD/SSD tiers will be 20TB.
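The space estimate in the example above is a simple product of the number of parity devices and the tape capacity. The following sketch only restates that arithmetic; the function name `staging_space_tb` is hypothetical.

```python
def staging_space_tb(parity_devices: int, tape_capacity_tb: int) -> int:
    """Space needed in the higher tier: one tape's worth per parity device."""
    return parity_devices * tape_capacity_tb


# Double protection (two parity devices) with 10 TB tapes, as in the text
needed = staging_space_tb(parity_devices=2, tape_capacity_tb=10)
print(needed)  # 20
```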
Data that is to be written to tapes (i.e., the predefined amount of data 101) can be either tiered to the HDD/SSD (and written to tape later) or directly written to the storage. The data will be written directly to the tape and the parity of the data will be kept on the higher tier.
Since parity algorithms are linear, the device 100 can create a RAID, which is larger than the storage volume of single tapes by writing simultaneously to multiple tapes (i.e., the SADS 102) and updating the parity in the DADS 201. Typically, data is written to tapes sequentially, and since the parity is linear it can be assumed that the tapes that are not written yet have zero data, thus not impacting the current parity.
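The linearity argument can be checked with a small sketch. The disclosure does not specify the parity code; the example below assumes a common RAID 6 convention, namely a P parity (XOR) and a Q parity computed in GF(2^8) with the polynomial 0x11d and generator 2. All function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(base: int, exp: int) -> int:
    """Raise base to a non-negative integer power in GF(2^8)."""
    result = 1
    for _ in range(exp):
        result = gf_mul(result, base)
    return result


def pq_parity(tapes: list, length: int) -> tuple:
    """Compute P (XOR) and Q (weighted XOR) over the given tapes.

    Bytes that a tape has not written yet are simply absent and therefore
    contribute zero to both parities.
    """
    p = bytearray(length)
    q = bytearray(length)
    for index, data in enumerate(tapes):
        coeff = gf_pow(2, index)  # per-tape coefficient for Q
        for i, byte in enumerate(data):
            p[i] ^= byte
            q[i] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


LENGTH = 4
t1, t2, t3 = b"\x01\x02\x03\x04", b"\x05\x06\x07\x08", b"\x09\x0a\x0b\x0c"
t4 = b"\xf0\xe0\xd0\xc0"

# Parity over three written tapes equals parity with an all-zero fourth tape
p3, q3 = pq_parity([t1, t2, t3], LENGTH)
p3_zero, q3_zero = pq_parity([t1, t2, t3, bytes(LENGTH)], LENGTH)

# When the fourth tape is written later, only its contribution is added
p4 = bytes(a ^ b for a, b in zip(p3, t4))
q4 = bytes(a ^ gf_mul(gf_pow(2, 3), b) for a, b in zip(q3, t4))
p_full, q_full = pq_parity([t1, t2, t3, t4], LENGTH)
```

Because both parities are sums over the field, the incrementally updated values `p4` and `q4` equal the values computed over all four tapes at once, which is why the saved parity can be combined with newly arriving data.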
FIG. 4 and FIG. 5 below show how to build a RAID 6 (four data units + two parity units) using only three tape drives. This can be realized, as in the beginning objects are only written to the first three tape drives (which form the SADS 102), and parity is kept on the HDD/SSD (i.e., the DADS 201). As can be seen in figures 4 and 5, the SADS 102 comprises a number of X storage entities, wherein X is an integer greater than 1 and wherein each storage entity in the X storage entities can store the same amount of data. In the figures, X equals three and the three storage entities correspond to “Tape 1”, “Tape 2” and “Tape 3”.
Once the three cassettes in the tape drives are full, cassettes are replaced in the three tape drives, and three new cassettes are put in. As new data is arriving, the data is written to cassette four, while the parity is calculated from the saved parity on the higher tier with the new data being written. Portions of the parity written to the tape can then be erased from the higher tier.
In the figures, the DADS 201 is formed by the “higher tier” (i.e., by “Parity 1” and “Parity 2”) and comprises storage space for a number of Y storage entities, wherein each storage entity in the Y storage entities can store the same amount of data as a storage entity in the X storage entities. In figures 4 and 5, Y equals two.
It is to be noted that, while the example is a RAID 6 (4+2), with 3 tapes any protection level X+Y can be realized as long as there is space for the parity information in the higher tier.
Data can be written to multiple tape drives in parallel, i.e. file 1 to tape 1, file 2 to tape 2 and file 3 again to tape 1. Once all the tapes are written up to some location, the parity of that portion can be written to the parity tape.
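One way to picture the rule that parity for a portion is written once all tapes have reached that location is a watermark equal to the smallest write position across the data tapes. The sketch below assumes XOR parity and byte-granular offsets; the class `ParityStager` and its fields are hypothetical and only model the behaviour described above.

```python
from dataclasses import dataclass, field


@dataclass
class ParityStager:
    data_tapes: int
    written: list = field(init=False)  # bytes written so far per data tape
    staged: dict = field(default_factory=dict)  # offset -> parity byte (higher tier)
    parity_tape: bytearray = field(default_factory=bytearray)  # stand-in for the parity tape

    def __post_init__(self) -> None:
        self.written = [0] * self.data_tapes

    def write(self, tape: int, payload: bytes) -> None:
        """Append payload to one data tape and update the staged parity."""
        start = self.written[tape]
        for i, byte in enumerate(payload):
            offset = start + i
            self.staged[offset] = self.staged.get(offset, 0) ^ byte
        self.written[tape] = start + len(payload)
        self._flush()

    def _flush(self) -> None:
        """Move parity below the watermark from the higher tier to the parity tape."""
        watermark = min(self.written)  # location reached by all data tapes
        while len(self.parity_tape) < watermark:
            offset = len(self.parity_tape)
            self.parity_tape.append(self.staged.pop(offset))  # frees higher-tier space


stager = ParityStager(data_tapes=2)
stager.write(0, b"\x01\x02\x03")  # file 1 to tape 1
stager.write(1, b"\x10\x20")      # file 2 to tape 2
stager.write(0, b"\x04")          # file 3 to tape 1 again
```

After these three writes, both tapes have been written up to offset 2, so two parity bytes are on the parity tape, while the parity for offsets 2 and 3 remains staged in the higher tier.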
Recovery of a faulty tape can be done in a similar way, by rebuilding the data of the faulty tape in the higher tier. This allows the rebuild to be performed with any number of available tape drives.
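A minimal sketch of such a rebuild follows, assuming a single lost tape and plain XOR parity (a second parity would require a different code, which is not shown). The rebuild buffer stands for space in the higher tier, and the surviving tapes can be read one after another; the function names are hypothetical.

```python
def xor_blocks(blocks: list, length: int) -> bytes:
    """XOR parity over equally aligned blocks (shorter blocks count as zero-padded)."""
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def rebuild_lost_tape(surviving: list, parity: bytes) -> bytes:
    """Rebuild one lost tape by folding surviving tapes into the parity."""
    rebuilt = bytearray(parity)  # rebuild buffer kept in the higher tier
    for tape in surviving:       # tapes may be mounted and read one at a time
        for i, byte in enumerate(tape):
            rebuilt[i] ^= byte
    return bytes(rebuilt)


tapes = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(tapes, 4)
recovered = rebuild_lost_tape([tapes[0], tapes[2]], parity)  # tape 2 is lost
```

Since each surviving tape is folded into the buffer independently, the number of tapes that must be mounted at the same time does not depend on the width of the RAID group.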
FIG. 6 shows a schematic view of a method 600. The method 600 is for redundancy of sequential access data and comprises a first step of storing 601, by a device 100, a predefined amount of data 101 in a sequential access data store, SADS 102. The method further comprises a second step of determining 602, by the device 100, parity information 103 based on the predefined amount of data 101. The method comprises a third step of, when the predefined amount of data 101 is stored in the SADS 102, storing 603, by the device 100, the parity information 103 in the SADS 102.
The present disclosure has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed disclosure, from a study of the drawings, this disclosure, and the independent claims. In the claims as well as in the description, the word “comprising” does not exclude other elements or steps and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Claims

1. A device (100) for redundancy of sequential access data, wherein the device (100) is configured to
- store a predefined amount of data (101) in a sequential access data store, SADS (102);
- determine parity information (103) based on the predefined amount of data (101); and
- when the predefined amount of data (101) is stored in the SADS (102), store the parity information (103) in the SADS (102).
2. The device (100) according to claim 1, further configured to store the parity information (103) in the SADS (102) when the predefined amount of data (101) is stored in the SADS (102) completely.
3. The device (100) according to claim 1 or 2, further configured to store the parity information (103) in a direct access data store, DADS (201).
4. The device (100) according to any of the preceding claims, further configured to store the parity information (103) in the DADS (201) before the parity information (103) is stored in the SADS (102).
5. The device (100) according to any of the preceding claims, further configured to, for each element (202) of the predefined amount of data (101) that is being stored in the SADS (102), update the parity information (103).
6. The device (100) according to any of the preceding claims, wherein the DADS (201) is a storage of a higher storage tier, compared to the SADS (102).
7. The device (100) according to any of the preceding claims, wherein the DADS (201) comprises at least one of: primary storage (203), secondary storage (204).
8. The device (100) according to claim 7, wherein the primary storage (203) comprises at least one of: a main memory, a random access memory.
9. The device (100) according to claim 7 or 8, wherein the secondary storage (204) comprises at least one of: a mass storage device, a hard disk drive, a solid state drive, a flash drive, a RAM disk, a non-volatile memory.
10. The device (100) according to any one of the preceding claims, wherein the SADS (102) comprises tertiary storage (205).
11. The device (100) according to claim 10, wherein the tertiary storage (205) comprises a tape drive.
12. The device (100) according to any one of the preceding claims, wherein the SADS (102) comprises a number of X storage entities, wherein X is an integer greater than 1 and wherein each storage entity in the X storage entities can store the same amount of data.
13. The device (100) according to any one of claims 3 to 12, wherein the DADS (201) comprises storage space for a number of Y storage entities, wherein each storage entity in the Y storage entities can store the same amount of data as a storage entity in the X storage entities.
14. A method (600) for redundancy of sequential access data, the method (600) comprising the steps of
- storing (601), by a device (100), a predefined amount of data (101) in a sequential access data store, SADS (102);
- determining (602), by the device (100), parity information (103) based on the predefined amount of data (101); and
- when the predefined amount of data (101) is stored in the SADS (102), storing (603), by the device (100), the parity information (103) in the SADS (102).
15. A computer program comprising instructions which, when the program is executed by a computer, cause the computer to perform the method (600) according to claim 14.
PCT/EP2022/066073 2022-06-14 2022-06-14 Device and method for improved redundant storing of sequential access data WO2023241783A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/066073 WO2023241783A1 (en) 2022-06-14 2022-06-14 Device and method for improved redundant storing of sequential access data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/066073 WO2023241783A1 (en) 2022-06-14 2022-06-14 Device and method for improved redundant storing of sequential access data

Publications (1)

Publication Number Publication Date
WO2023241783A1 true WO2023241783A1 (en) 2023-12-21

Family

ID=82319683

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/066073 WO2023241783A1 (en) 2022-06-14 2022-06-14 Device and method for improved redundant storing of sequential access data

Country Status (1)

Country Link
WO (1) WO2023241783A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120079184A1 (en) * 2010-09-29 2012-03-29 International Business Machines Corporation Methods for managing ownership of redundant data and systems thereof
US20160117222A1 (en) * 2014-10-27 2016-04-28 International Business Machines Corporation Time multiplexed redundant array of independent tapes
US11340987B1 (en) * 2021-03-04 2022-05-24 Netapp, Inc. Methods and systems for raid protection in zoned solid-state drives

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22735107

Country of ref document: EP

Kind code of ref document: A1