US20210034289A1 - User stream aware file systems with user stream detection - Google Patents

User stream aware file systems with user stream detection

Info

Publication number
US20210034289A1
Authority
US
United States
Prior art keywords
data
stream
sub
digests
streams
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US16/526,391
Other versions
US10929066B1 (en)
Inventor
Nickolay Dalmatov
Richard P. Ruef
Kurt W. Everson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EMC Corp
Original Assignee
EMC IP Holding Co LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by EMC IP Holding Co LLC filed Critical EMC IP Holding Co LLC
Priority to US16/526,391
Assigned to EMC IP Holding Company LLC reassignment EMC IP Holding Company LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: RUEF, RICHARD P., Everson, Kurt W.
Assigned to CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH reassignment CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH SECURITY AGREEMENT Assignors: DELL PRODUCTS L.P., EMC CORPORATION, EMC IP Holding Company LLC
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT PATENT SECURITY AGREEMENT (NOTES) Assignors: DELL PRODUCTS L.P., EMC CORPORATION, EMC IP Holding Company LLC
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A. reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A. SECURITY AGREEMENT Assignors: CREDANT TECHNOLOGIES INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., WYSE TECHNOLOGY L.L.C.
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT reassignment THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DELL PRODUCTS L.P., EMC CORPORATION, EMC IP Holding Company LLC
Assigned to EMC IP Holding Company LLC reassignment EMC IP Holding Company LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DALMATOV, Nickolay Alexandrovich
Publication of US20210034289A1
Publication of US10929066B1
Application granted
Assigned to DELL PRODUCTS L.P., EMC CORPORATION, EMC IP Holding Company LLC reassignment DELL PRODUCTS L.P. RELEASE OF SECURITY INTEREST AT REEL 050406 FRAME 421 Assignors: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH
Assigned to EMC IP Holding Company LLC, EMC CORPORATION, DELL PRODUCTS L.P. reassignment EMC IP Holding Company LLC RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (050724/0571) Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Assigned to DELL PRODUCTS L.P., EMC CORPORATION, EMC IP Holding Company LLC reassignment DELL PRODUCTS L.P. RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169) Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601: Interfaces specially adapted for storage systems
    • G06F 3/0668: Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671: In-line storage system
    • G06F 3/0673: Single storage device
    • G06F 3/0679: Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G06F 3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0608: Saving storage space on storage systems
    • G06F 3/0614: Improving the reliability of storage systems
    • G06F 3/0616: Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G06F 3/0619: Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G06F 3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638: Organizing or formatting or addressing of data
    • G06F 3/064: Management of blocks
    • G06F 3/0641: De-duplication techniques
    • G06F 3/0655: Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F 3/0659: Command handling arrangements, e.g. command buffers, queues, command scheduling

Definitions

  • Data storage systems include storage processing circuitry coupled to arrays of non-volatile storage devices, such as, for example, solid state drives (SSDs), hard disk drives (HDDs), optical drives, and so on.
  • the storage processing circuitry is configured to service host-generated storage input/output (IO) requests, which specify data blocks, data files, data pages, and/or other data elements to be written to, read from, created on, and/or deleted from the respective non-volatile storage devices.
  • Such storage processing circuitry is further configured to execute software programs for managing the storage IO requests (e.g., write requests, read requests), and for performing various processing tasks to organize and/or secure the data blocks, data files, data pages, and/or other data elements on the respective non-volatile storage devices.
  • Solid state drives can be configured to support multi-streaming capabilities, which allow the placement of data within the flash-based SSDs to be controlled to reduce write amplification, as well as minimize the processing overhead of garbage collection activities performed within the flash-based SSDs.
  • storage processing circuitry can be configured to service host-generated write requests, each of which can include a tag (also referred to herein as a “stream identifier” or “stream ID”) added to the write request.
  • the storage processing circuitry can group or associate a data element (e.g., a data block) specified by the write request with one or more other data blocks having the same stream ID.
  • the storage processing circuitry can also direct an SSD to place the data block in the same storage segment as the other data block(s) in the group.
  • each such data block in the group can have at least one common attribute relating to, for example, temporal locality, spatial locality, stream ID, logical block address (LBA), data type, port number, and so on.
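  • To make the tagging scheme above concrete, the following sketch (all names hypothetical; not taken from the patent) shows how write requests carrying a host-supplied stream ID could be grouped so that data blocks with the same stream ID land in the same storage segment.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class WriteRequest:
    lba: int           # logical block address of the data block
    data: bytes        # data block payload
    stream_id: int     # tag ("stream identifier") added to the write request

@dataclass
class Segment:
    """One storage segment of the SSD's data log (hypothetical model)."""
    blocks: List[WriteRequest] = field(default_factory=list)

class MultiStreamPlacer:
    """Groups data blocks by stream ID and directs each group to its own segment."""
    def __init__(self) -> None:
        self.segments: Dict[int, Segment] = {}

    def write(self, request: WriteRequest) -> None:
        # Blocks sharing a stream ID are placed in the same segment, which keeps
        # data with a common attribute physically together on the SSD.
        segment = self.segments.setdefault(request.stream_id, Segment())
        segment.blocks.append(request)

# Usage: writes tagged with different stream IDs end up in different segments.
placer = MultiStreamPlacer()
placer.write(WriteRequest(lba=100, data=b"db page", stream_id=1))
placer.write(WriteRequest(lba=512, data=b"journal entry", stream_id=2))
placer.write(WriteRequest(lba=101, data=b"db page", stream_id=1))
assert len(placer.segments[1].blocks) == 2    # stream 1 blocks grouped together
```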
  • Such a stream-aware data storage system can include storage processing circuitry configured to service host-generated storage IO requests (e.g., write requests, read requests), which can direct the data storage system to write and/or read data blocks, data files, data pages, and/or other data elements to/from file systems, logical units (LUNs), and/or any other suitable storage objects.
  • the stream-aware data storage system can further include a file system that has a log-based architecture design, and can employ one or more SSDs (e.g., flash-based SSDs) that provide log-based data storage, which can include a data log divided into a series of storage segments of equal or varying size.
  • the storage processing circuitry can service a plurality of host-generated write requests specifying a plurality of data blocks, respectively, in an incoming stream of data.
  • the storage processing circuitry can detect and/or identify one or more separate sub-streams in the incoming stream of data based on at least one attribute of the data.
  • the storage processing circuitry can detect and/or identify the respective sub-streams based on certain attribute information such as the temporal locality of the data blocks, the spatial locality of the data blocks, a stream ID associated with each data block, an LBA associated with the data blocks, the type of each data block, the port number through which each data block is received, the host computer that generated the storage IO request, and so on, or any suitable combination thereof. Having detected and/or identified the separate sub-streams of data blocks, the storage processing circuitry can form a group of data blocks corresponding to each respective sub-stream, and associate, bind, and/or assign a stream ID to each data block in the respective sub-stream.
  • the storage processing circuitry can then write each group of data blocks having the same stream ID to the same segment of the data log included in the SSD(s).
  • the storage processing circuitry can manage and/or maintain, in persistent data storage, the attribute information pertaining to the groups of data blocks in the respective sub-streams relative to time periods during which the respective groups of data blocks were written, received, and/or created.
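  • As a rough illustration of the sub-stream detection described above, the sketch below (hypothetical names, using the block type as the detection attribute) splits an incoming sequence of blocks into separate sub-streams and assigns each detected sub-stream its own stream ID.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count
from typing import Dict, List

@dataclass
class Block:
    lba: int
    block_type: str    # e.g., "INT" or "IMG"; one attribute usable for detection
    port: int          # another attribute that could be used instead or in addition

def detect_sub_streams(incoming: List[Block]) -> Dict[int, List[Block]]:
    """Partition an incoming data stream into sub-streams keyed on block type,
    assigning a distinct stream ID to each detected sub-stream."""
    next_id = count(start=1)
    id_by_type: Dict[str, int] = {}
    sub_streams: Dict[int, List[Block]] = defaultdict(list)
    for block in incoming:
        # The first time a block type is seen, open a new sub-stream for it.
        if block.block_type not in id_by_type:
            id_by_type[block.block_type] = next(next_id)
        sub_streams[id_by_type[block.block_type]].append(block)
    return sub_streams

incoming = [Block(10, "INT", 0), Block(87, "IMG", 1), Block(11, "INT", 0)]
groups = detect_sub_streams(incoming)
assert [b.lba for b in groups[1]] == [10, 11]   # "INT" sub-stream
assert [b.lba for b in groups[2]] == [87]       # "IMG" sub-stream
```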
  • a method of handling multiple data streams in a stream-aware data storage system includes identifying one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams, forming one or more groups of data elements from the respective data sub-streams, and writing one or more groups of data elements as log structured data to one or more segments of a data log.
  • the method includes identifying one or more data sub-streams in the incoming data stream based on one or more of a temporal locality of the data elements, a spatial locality of the data elements, a type of each data element, and a port number through which each data element is received.
  • the method includes associating a stream identifier (ID) to each data element in each data sub-stream.
  • the method includes writing each group of data elements having the same stream ID to the same segment of the data log.
  • the method includes maintaining, in persistent data storage, information pertaining to at least one attribute of the data elements in the respective data sub-streams relative to time periods during which the respective groups of data elements were written.
  • the method includes generating a digest for each data element in each respective data sub-stream, thereby generating a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream.
  • the method includes forming a group of digests from the plurality of digests, and associating a stream identifier (ID) to each digest in the group of digests.
  • the method includes writing the group of digests as a data stream of log structured data to a segment of the data log.
  • the method includes maintaining, in persistent data storage, information pertaining to (i) at least one attribute of the data elements in the respective data sub-stream, and (ii) the respective digests in the data stream, relative to a time period during which each of a respective group of data elements from the respective data sub-stream and the group of digests from the data stream were written to the data log.
  • a data storage system includes a memory, and processing circuitry configured to execute program instructions out of the memory to identify one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams, form one or more groups of data elements from the respective data sub-streams, and write one or more groups of data elements as log structured data to one or more segments of a data log.
  • the processing circuitry is configured to execute the program instructions out of the memory to identify one or more data sub-streams in the incoming data stream based on one or more of a temporal locality of the data elements, a spatial locality of the data elements, a type of each data element, and a port number through which each data element is received.
  • the processing circuitry is configured to execute the program instructions out of the memory to associate a stream identifier (ID) to each data element in each data sub-stream.
  • the processing circuitry is configured to execute the program instructions out of the memory to write each group of data elements having the same stream ID to the same segment of the data log.
  • the processing circuitry is configured to execute the program instructions out of the memory to maintain, in persistent data storage, information pertaining to at least one attribute of the data elements in the respective data sub-streams relative to time periods during which the respective groups of data elements were written.
  • the processing circuitry is configured to execute the program instructions out of the memory to generate a digest for each data element in each respective data sub-stream, thereby generating a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream.
  • the processing circuitry is configured to execute the program instructions out of the memory to form a group of digests from the plurality of digests, and associate a stream identifier (ID) to each digest in the group of digests.
  • the processing circuitry is configured to execute the program instructions out of the memory to write the group of digests as a data stream of log structured data to a segment of the data log.
  • the processing circuitry is configured to execute the program instructions out of the memory to maintain, in persistent data storage, information pertaining to (i) at least one attribute of the data elements in the respective data sub-stream, and (ii) the respective digests in the data stream, relative to a time period during which each of a respective group of data elements from the respective data sub-stream and the group of digests from the data stream were written to the data log.
  • a computer program product includes a set of non-transitory, computer-readable media having instructions that, when executed by control circuitry of a computerized apparatus, cause the control circuitry to perform a method of handling multiple data streams in a stream-aware data storage system.
  • the method includes identifying one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams, forming one or more groups of data elements from the respective data sub-streams, and writing one or more groups of data elements as log structured data to one or more segments of a data log.
  • the method includes generating a digest for each data element in each respective data sub-stream in order to generate a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream, forming a group of digests from the plurality of digests, associating a stream identifier (ID) to each digest in the group of digests, and writing the group of digests as a data stream of log structured data to a segment of the data log.
  • FIG. 1 is a block diagram of an exemplary storage environment, in which techniques can be practiced for handling multiple sub-streams in an incoming stream of data in a stream-aware data storage system;
  • FIG. 2 is a block diagram of exemplary stream detection logic, exemplary deduplication logic, and exemplary stream placement logic included in the data storage system of FIG. 1 , for use in forming, as one or more sub-streams and/or streams, one or more groups of data blocks and/or digests having similar attributes, as well as exemplary persistent data storage for use in managing and/or maintaining information pertaining to the attributes of the groups of data blocks/digests in the respective sub-streams/streams;
  • FIG. 3 a is a block diagram of an exemplary data log included in log-based data storage associated with the data storage system of FIG. 1 , illustrating the placement of the sub-streams/streams of FIG. 2 in respective segments of the data log;
  • FIG. 3 b is a block diagram illustrating an exemplary garbage collection function being performed on data blocks of one of the sub-streams of FIG. 2 ;
  • FIG. 3 c is a block diagram illustrating an exemplary garbage collection function being performed on a stream of digests corresponding to the respective data blocks of FIG. 3 b ;
  • FIG. 4 is a flow diagram of an exemplary method of handling multiple sub-streams in an incoming stream of data in a stream-aware data storage system.
  • the data storage systems can detect and/or identify multiple sub-streams in an incoming stream of data, form a group of data elements (e.g., data blocks) corresponding to each respective sub-stream, and associate, bind, and/or assign a stream identifier (ID) to each data block in the respective sub-stream.
  • the data storage systems can write each group of data blocks having the same stream ID to the same segment of a data log included in one or more non-volatile storage devices (e.g., solid state drives (SSDs)), and manage and/or maintain, in persistent data storage, attribute information pertaining to the groups of data blocks in the respective sub-streams relative to time periods during which the respective groups of data blocks were written, received, and/or created.
  • the disclosed techniques can be employed in data storage systems to improve the detection and/or identification of multiple sub-streams in an incoming stream of data, as well as improve the management and/or maintenance of attribute information pertaining to groups of data blocks in the respective sub-streams.
  • FIG. 1 depicts an illustrative embodiment of an exemplary storage environment 100 , in which techniques can be practiced for handling multiple data streams in stream-aware data storage systems.
  • the storage environment 100 can include a plurality of host computers 102 . 1 , 102 . 2 , . . . , 102 . n , a data storage system 104 , and a communications medium 103 that includes at least one network 106 .
  • each of the plurality of host computers 102 . 1 , . . . , 102 . n can be configured as a web server computer, a file server computer, an email server computer, an enterprise server computer, or any other suitable client or server computer or computerized device.
  • the plurality of host computers 102 . 1 , . . . , 102 . n can be further configured to provide, over the network 106 , storage input/output (IO) requests (e.g., small computer system interface (SCSI) commands, network file system (NFS) commands) to the data storage system 104 .
  • Such storage IO requests (e.g., write requests, read requests) specify the data (also referred to herein as "host data") to be written to and/or read from the data storage system 104.
  • the communications medium 103 can be configured to interconnect the plurality of host computers 102 . 1 , . . . , 102 . n with the data storage system 104 to enable them to communicate and exchange data and/or control signaling.
  • the communications medium 103 can be illustrated as a “cloud” to represent different communications topologies such as a backbone topology, a hub-and-spoke topology, a loop topology, an irregular topology, and so on, or any suitable combination thereof.
  • the communications medium 103 can include copper based data communications devices and cabling, fiber optic based communications devices and cabling, wireless communications devices, and so on, or any suitable combination thereof.
  • the communications medium 103 can be further configured to support storage area network (SAN) based communications, network attached storage (NAS) based communications, local area network (LAN) based communications, metropolitan area network (MAN) based communications, wide area network (WAN) based communications, wireless communications, distributed infrastructure communications, and/or any other suitable communications.
  • the data storage system 104 can include a communications interface 108 , storage processing circuitry 110 , a memory 112 , and log-based storage media 114 .
  • the communications interface 108 can include SCSI target adapters, network interface adapters, and/or any other suitable adapters for converting electronic, optical, and/or wireless signals received over the network 106 to a form suitable for use by the storage processing circuitry 110 .
  • the memory 112 can include persistent memory (e.g., flash memory, magnetic memory) and non-persistent memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM)).
  • the memory 112 can accommodate specialized software constructs including stream detection logic 116 , stream placement logic 118 , deduplication logic 120 , and a log-based file system 122 .
  • the log-based storage media 114 can accommodate specialized hardware constructs (e.g., processor or processing circuitry, memory) and/or software constructs including a garbage collector 124 , a data log 126 , and/or any other suitable hardware/software construct(s), as well as one or more non-volatile storage devices 128 . 0 , . . . , 128 . m such as solid state drives (e.g., flash-based SSDs).
  • the data log 126 can be implemented on one or more of the flash-based SSDs 128 . 0 , . . . , 128 . m , and can be divided into a series of storage segments (or “windows,” using CBFS® Storage terminology) of equal or varying size.
  • the stream detection logic 116 can be configured to detect and/or identify one or more separate sub-streams in an incoming data stream (e.g., an incoming data stream 202 ; see FIG. 2 ) based on at least one attribute of the data. For example, if the incoming data stream includes an incoming stream of data blocks, then the stream detection logic 116 can detect and/or identify the respective sub-streams based on certain attribute information such as the temporal locality of the data blocks, the spatial locality of the data blocks, a stream identifier (ID) associated with each data block, a logical block address (LBA) associated with the data blocks, the type of each data block (also referred to herein as the “block type”) (e.g., ASCII data type, integer data type, pointer data type, image data type, multimedia data type, digest data type), the port number through which each data block is received, the host computer that generated the storage IO request, and so on, or any suitable combination thereof.
  • the term “temporal locality” refers to a number of data block addresses referenced by storage IO requests per unit time. For example, if the temporal locality of references to a data block address is high, then it is likely that the data block at that address will be accessed again soon. Further, the term “spatial locality” refers to a number of data block addresses referenced by storage IO requests per unit address space. For example, if the spatial locality of references relative to a data block address is high, then it is likely that one or more other data block addresses close to that data block address will also be accessed.
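  • The two locality notions above can be expressed as simple rates over a trace of referenced block addresses. The sketch below (hypothetical helper functions, not from the patent) counts references per unit time and references per unit address space.

```python
from typing import List, Tuple

Trace = List[Tuple[float, int]]   # (timestamp in seconds, block address) pairs

def temporal_locality(trace: Trace, address: int, window: float) -> float:
    """References to `address` per unit time over the trailing `window` seconds."""
    if not trace:
        return 0.0
    latest = max(t for t, _ in trace)
    hits = sum(1 for t, a in trace if a == address and latest - t <= window)
    return hits / window

def spatial_locality(trace: Trace, address: int, radius: int) -> float:
    """References landing within `radius` blocks of `address`, per unit address space."""
    span = 2 * radius + 1
    hits = sum(1 for _, a in trace if abs(a - address) <= radius)
    return hits / span

trace: Trace = [(0.0, 100), (0.5, 101), (1.0, 100), (1.5, 4096)]
print(temporal_locality(trace, address=100, window=2.0))   # 1.0 reference per second
print(spatial_locality(trace, address=100, radius=2))      # 0.6 references per block
```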
  • the stream placement logic 118 can be configured to form a group of data blocks for each detected and/or identified sub-stream, to associate, bind, and/or assign a stream ID to each data block in the group, and to write the group of data blocks having the same stream ID to logical addresses of the log-based file system 122 .
  • the log-based file system 122 can translate the logical addresses to physical addresses of the log-based storage media 114 , and write the group of data blocks to the respective physical addresses, which can correspond to the same segment of the data log 126 . In this way, the placement of a data sub-stream in a storage segment of the data log 126 of the log-based storage media 114 can be accomplished.
  • the deduplication logic 120 can be configured to generate a digest for each data block (e.g., by applying a hash function to the data block) in each group of data blocks formed by the stream placement logic 118 . Once digests for a respective group of data blocks have been generated, the stream placement logic 118 can group the digests, associate, bind, and/or assign a stream ID to each digest in the group, and write the group of digests having the same stream ID to the same segment of the data log 126 .
  • the deduplication logic 120 can generate a digest for the received data block, compare the generated digest with the respective grouped digests for that sub-stream, and determine whether there is a matching digest, possibly signifying multiple copies of the received data block. If an actual copy of the received data block is found (such as by a bit-by-bit comparison), the storage processing circuitry 110 can remove the received data block from the data storage system 104 , and replace it with a reference to the copy of the data block stored on the log-based storage media 114 , thereby saving storage space.
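  • A minimal sketch of that digest check follows (hypothetical structures, assuming SHA-256 as the digest function, which the source does not prescribe): the digest of a received block is compared against the digests already grouped for the sub-stream, and only after a byte-for-byte confirmation is the block replaced with a reference to the stored copy.

```python
import hashlib
from typing import Dict, Optional

class DedupIndex:
    """Per-sub-stream mapping of digest -> stored data block (hypothetical sketch)."""
    def __init__(self) -> None:
        self.by_digest: Dict[bytes, bytes] = {}

    def ingest(self, block: bytes) -> Optional[bytes]:
        """Return the digest of an existing copy if `block` is a duplicate;
        otherwise store the block and return None."""
        digest = hashlib.sha256(block).digest()
        existing = self.by_digest.get(digest)
        if existing is not None and existing == block:   # bit-by-bit confirmation
            return digest        # caller replaces the block with a reference
        self.by_digest[digest] = block
        return None

index = DedupIndex()
assert index.ingest(b"data block A") is None        # first copy is stored
assert index.ingest(b"data block A") is not None    # duplicate: keep only a reference
```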
  • the storage processing circuitry 110 can manage and/or maintain, in persistent data storage (e.g., in the memory 112 and/or on the log-based storage media 114 ), attribute information pertaining to the groups of data blocks in the respective sub-streams, as well as attribute information pertaining to their respective digests, relative to time periods during which the groups of data blocks/digests were written, received, created, and/or generated.
  • the storage processing circuitry 110 can include one or more physical storage processors or engines (running specialized software), data movers, director boards, blades, IO modules, storage drive controllers, switches, and/or any other suitable computer hardware or combination thereof.
  • the storage processing circuitry 110 can execute program instructions out of the memory 112 , process storage IO requests (e.g., write requests, read requests) provided by the respective host computers 102 . 1 , . . . , 102 . n , and store host data in any suitable storage environment (e.g., a redundant array of independent disks (RAID) environment) implemented by the flash-based SSDs 128 . 0 , . . . , 128 . m.
  • a computer program product can be configured to deliver all or a portion of the specialized software constructs to the respective processor(s).
  • a computer program product can include one or more non-transient computer-readable storage media, such as a magnetic disk, a magnetic tape, a compact disk (CD), a digital versatile disk (DVD), an optical disk, a flash drive, a solid state drive (SSD), a secure digital (SD) chip or device, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and so on.
  • the non-transient computer-readable storage media can be encoded with sets of instructions that, when executed by the respective processor(s), perform the techniques disclosed herein.
  • the data log 126 included in the log-based storage media 114 can be divided into a series of storage segments of equal or varying size.
  • a variety of techniques can be employed to partition the data log 126 into the series of storage segments based on, for example, logical addresses, physical addresses, RAID groups, RAID stripes, RAID extents, and/or storage device extents.
  • the series of storage segments can be distributed across different storage tiers, such as a high speed tier of SSDs, a medium speed tier of serial attached SCSI (SAS) devices, a low speed tier of near-line SAS devices, and so on.
  • the stream placement logic 118 can associate, bind, and/or assign a stream ID to each data block in the group, and write the group of data blocks having the same stream ID to logical addresses of the log-based file system 122 .
  • the log-based file system 122 can, in turn, write the group of data blocks to physical addresses of the log-based storage media 114 that correspond to the same next adjacent or non-adjacent unoccupied (or available) segment of the data log 126 .
  • the garbage collector 124 can perform garbage collection functions to reclaim storage space in the segment containing the group of data blocks, thereby reducing fragmentation.
  • garbage collection functions can include combining or consolidating any remaining valid data blocks in the storage segment, copying the valid data blocks to unoccupied storage space in a next available segment of the data log 126 , and/or erasing the data blocks in the segment to make its storage space available for reuse.
  • the garbage collector 124 can perform garbage collection functions to reclaim storage space in the segment containing the group of digests for the respective data blocks, due to one or more of the data blocks and their respective digests being concurrently invalidated.
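  • The garbage collection steps above can be sketched as follows (hypothetical segment model): invalidated blocks are dropped, the remaining valid blocks are consolidated and copied to the next available segment in their original order, and the old segment is erased for reuse.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LoggedBlock:
    lba: int
    data: bytes
    valid: bool = True     # cleared once the block has been superseded elsewhere

@dataclass
class Segment:
    blocks: List[LoggedBlock] = field(default_factory=list)

def garbage_collect(old: Segment, destination: Segment) -> None:
    """Consolidate valid blocks from `old` into `destination`, then erase `old`."""
    survivors = [b for b in old.blocks if b.valid]   # invalidated blocks are dropped
    destination.blocks.extend(survivors)             # copied in their original order
    old.blocks.clear()                               # storage space reclaimed for reuse

segment = Segment([LoggedBlock(1, b"a"), LoggedBlock(2, b"b"), LoggedBlock(3, b"c", valid=False)])
spare = Segment()
garbage_collect(segment, spare)
assert [b.lba for b in spare.blocks] == [1, 2] and segment.blocks == []
```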
  • the data storage system 104 (see FIG. 1 ) services host-generated storage IO requests (e.g., write requests, read requests), which direct the data storage system 104 to write and/or read data blocks of the incoming data stream 202 (see FIG. 2 ) to/from logical addresses of the log-based file system 122 .
  • FIG. 2 depicts the incoming data stream 202, which includes, in the order of the storage IO requests, at least a block 212(p+1), a block 212(p), a block 212(q+1), a block 212(q), a block 212(p−1), and a block 212(q−1).
  • the incoming data stream 202 is operated on by the stream detection logic 116 , which detects and/or identifies one or more separate sub-streams in the incoming data stream 202 based on at least one attribute of the data.
  • the stream detection logic 116 detects the respective sub-streams based on certain attribute information such as the block type (e.g., ASCII data type, integer data type, pointer data type, image data type, multimedia data type, digest data type). For example, the stream detection logic 116 can inspect a header of each data block to detect or identify at least (i) a first sub-stream including the blocks 212(p−1), 212(p), 212(p+1), each of which has the block type "integer" (INT), and (ii) a second sub-stream including the blocks 212(q−1), 212(q), 212(q+1), each of which has the block type "image" (IMG).
  • the stream placement logic 118 forms a first group of data blocks (i.e., ... block 212(p−1), block 212(p), block 212(p+1) ...) corresponding to the first sub-stream, and a second group of data blocks (i.e., ... block 212(q−1), block 212(q), block 212(q+1) ...) corresponding to the second sub-stream, and associates, binds, and/or assigns a stream ID to each data block in the respective groups of data blocks.
  • the stream placement logic 118 associates the stream ID "1" to each data block in the first group of data blocks (i.e., ... block 212(p−1), block 212(p), block 212(p+1) ...), and associates the stream ID "2" to each data block in the second group of data blocks (i.e., ... block 212(q−1), block 212(q), block 212(q+1) ...).
  • As a result, a data sub-stream 204 (see FIG. 2) is generated that includes the first group of data blocks having the stream ID "1" and block type "INT."
  • Likewise, a data sub-stream 208 (see also FIG. 2) is generated that includes the second group of data blocks having the stream ID "2" and block type "IMG."
  • Having generated the data sub-stream 204 and the data sub-stream 208, the deduplication logic 120 generates a digest for each data block in the data sub-stream 204, and likewise generates a digest for each data block in the data sub-stream 208.
  • the stream placement logic 118 groups the digests for the data blocks in the data sub-stream 204 , groups the digests for the data blocks in the data sub-stream 208 , and associates, binds, and/or assigns a stream ID to each digest in the respective digest groupings.
  • the stream placement logic 118 associates the stream ID "3" to each digest in the grouping corresponding to the data blocks in the data sub-stream 204, and associates the stream ID "4" to each digest in the grouping corresponding to the data blocks in the data sub-stream 208.
  • the stream placement logic 118 also associates a data type, namely, “digest” (DIG), to the respective groupings of digests.
  • the stream placement logic 118 (i) writes the data sub-stream 204 to the log-based file system 122 starting at logical block address (LBA) “W,” (ii) writes the data stream 206 to the log-based file system 122 starting at LBA “X,” (iii) writes the data sub-stream 208 to the log-based file system 122 starting at LBA “Y,” and (iv) writes the data stream 210 to the log-based file system 122 starting at LBA “Z.”
  • the log-based file system 122 then (i) writes the data sub-stream 204 to a first segment of the data log 126 starting at a first physical address translated from the LBA "W," (ii) writes the data stream 206 to a second segment of the data log 126 starting at a second physical address translated from the LBA "X," (iii) writes the data sub-stream 208 to a third segment of the data log 126 starting at a third physical address translated from the LBA "Y," and (iv) writes the data stream 210 to a fourth segment of the data log 126 starting at a fourth physical address translated from the LBA "Z."
  • the storage processing circuitry 110 manages and/or maintains, in persistent data storage (e.g., in the memory 112 and/or on the log-based storage media 114 ), attribute information pertaining to the data sub-stream 204 , the data stream 206 , the data sub-stream 208 , and the data stream 210 written to the first segment, the second segment, the third segment, and the fourth segment, respectively, of the data log 126 .
  • attribute information for the respective data sub-streams/streams 204 , 206 , 208 , 210 is managed and/or maintained in a log 214 (see FIG. 2 ) relative to time periods during which the corresponding groups of data blocks/digests were written, received, created, and/or generated.
  • For example, the attribute information for the data sub-stream 204 (i.e., ID "1", LBA "W", type "INT") and the attribute information for its corresponding group of digests in the data stream 206 (i.e., ID "3", LBA "X", type "DIG") are maintained in the log 214 relative to the time period during which they were written.
  • Likewise, the attribute information for the data sub-stream 208 (i.e., ID "2", LBA "Y", type "IMG") and the attribute information for its corresponding group of digests in the data stream 210 (i.e., ID "4", LBA "Z", type "DIG") are maintained in the log 214 relative to the time period during which they were written.
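  • For concreteness, the entries kept in the log 214 for this example might be rendered as sketched below (a hypothetical in-memory layout; the source does not prescribe a format), keyed by the time period during which each group was written.

```python
# Hypothetical rendering of log 214: each entry records the stream ID, the starting
# LBA, and the data type of a group, keyed by the time period of the write.
log_214 = {
    "t0": [
        {"stream_id": 1, "lba": "W", "type": "INT"},   # data sub-stream 204
        {"stream_id": 3, "lba": "X", "type": "DIG"},   # its digest stream 206
    ],
    "t1": [
        {"stream_id": 2, "lba": "Y", "type": "IMG"},   # data sub-stream 208
        {"stream_id": 4, "lba": "Z", "type": "DIG"},   # its digest stream 210
    ],
}

# Example lookup: which digest stream accompanies the data written during t0?
digest_entries = [entry for entry in log_214["t0"] if entry["type"] == "DIG"]
assert digest_entries[0]["stream_id"] == 3
```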
  • FIG. 3a depicts an exemplary embodiment of the data log 126, which is divided into a series of storage segments of equal or varying size, including at least a storage segment 310, a storage segment 311, a storage segment 312, a storage segment 313, a storage segment 314, a storage segment 315, a storage segment 316, and a storage segment 317.
  • the data sub-stream 204 including the first group of blocks ... 212(p−1), 212(p), 212(p+1) ... is written to the segment 310, the data stream 206 including the group of digests ... 212(p−1), 212(p), 212(p+1) ... is written to the segment 312, the data sub-stream 208 including the second group of blocks ... 212(q−1), 212(q), 212(q+1) ... is written to the segment 314, and the data stream 210 including the group of digests ... 212(q−1), 212(q), 212(q+1) ... is written to the segment 316.
  • any suitable number of storage segments 302 can be used to store data blocks corresponding to the data sub-stream 204, and any suitable number of storage segments 304 can be used to store digests corresponding to the data stream 206, such that the temporal order of the digests ... 212(p−1), 212(p), 212(p+1) ... in the storage segments 304 is maintained relative to the temporal order of the blocks ... 212(p−1), 212(p), 212(p+1) ... in the storage segments 302.
  • Likewise, any suitable number of storage segments 306 can be used to store data blocks corresponding to the data sub-stream 208, and any suitable number of storage segments 308 can be used to store digests corresponding to the data stream 210, such that the temporal order of the digests ... 212(q−1), 212(q), 212(q+1) ... in the storage segments 308 is maintained relative to the temporal order of the blocks ... 212(q−1), 212(q), 212(q+1) ... in the storage segments 306.
  • t0 represents the time period during which the first group of blocks ... 212(p−1), 212(p), 212(p+1) ... was received and its corresponding group of digests ... 212(p−1), 212(p), 212(p+1) ... was generated.
  • t1 represents the time period during which the second group of blocks ... 212(q−1), 212(q), 212(q+1) ... was received and its corresponding group of digests ... 212(q−1), 212(q), 212(q+1) ... was generated.
  • FIG. 3b depicts a garbage collection function performed by the garbage collector 124 on the data sub-stream 204, which includes the first group of blocks ... 212(p−1), 212(p), 212(p+1) ... written to the storage segment 310, as well as blocks ... 212(p+7), 212(p+8) ... written to the storage segment 311.
  • information is managed and/or maintained (e.g., in the memory 112 and/or the log-based storage media 114) in the log 214, including one or more attributes of the original blocks 212(p−1), 212(p), 212(p+1), 212(p+7), 212(p+8) such as the block type "INT."
  • At least the original block 212(p+1) included in the data sub-stream 204 is modified.
  • Such modification of the original block 212(p+1) is represented by a new block 212(p+1), which is sequentially written as log structured data to the storage segment 311, such as after the block 212(p+8).
  • the garbage collector 124 can then perform its garbage collection function, which includes invalidating at least the original block 212(p+1) written to the storage segment 310 (as indicated by a cross "X" drawn through the block 212(p+1); see FIG. 3b), consolidating any remaining valid data blocks (such as the original blocks 212(p−1), 212(p)) in the storage segment 310, copying the group of valid data blocks (including the original blocks 212(p−1), 212(p)) to a next unoccupied (or available) segment (not shown) among the storage segments 302, and erasing the original blocks from the storage segment 310 to make its storage space available for reuse.
  • information pertaining to one or more attributes of the copied data blocks is managed and/or maintained in the log 214 relative to the time period during which the group of data blocks were copied, written, and/or created.
  • FIG. 3c depicts a garbage collection function performed by the garbage collector 124 on the data stream 206, which includes the group of digests ... 212(p−1), 212(p), 212(p+1) ... written to the segment 312, as well as digests ... 212(p+7), 212(p+8) ... written to the storage segment 313.
  • the digests 212(p−1), 212(p), 212(p+1) are generated for the blocks 212(p−1), 212(p), 212(p+1), respectively, and the digests 212(p+7), 212(p+8) are generated for the blocks 212(p+7), 212(p+8), respectively.
  • information is managed and/or maintained (e.g., in the memory 112 and/or the log-based storage media 114) in the log 214, including one or more attributes of the original digests 212(p−1), 212(p), 212(p+1), 212(p+7), 212(p+8) such as the data type "DIG."
  • the original block 212(p+1) included in the data sub-stream 204 was modified, and such modification of the original block 212(p+1) was written as the new block 212(p+1) to the storage segment 311.
  • the original digest 212(p+1) for the original block 212(p+1) is effectively modified as a new digest 212(p+1) for the new block 212(p+1).
  • the deduplication logic 120 generates the new digest 212(p+1), which is sequentially written as log structured data to the storage segment 313, such as after the digest 212(p+8).
  • the garbage collector 124 can then perform its garbage collection function, which includes invalidating at least the original digest 212(p+1) written to the storage segment 312 (as indicated by a cross "X" drawn through the digest 212(p+1); see FIG. 3c), consolidating any remaining valid digests (such as the original digests 212(p−1), 212(p)) in the storage segment 312, copying the valid digests (including the original digests 212(p−1), 212(p)) to a next unoccupied (or available) segment (not shown) among the storage segments 304, and erasing at least the original digests 212(p−1), 212(p), 212(p+1) from the storage segment 312 to make its storage space available for reuse.
  • the temporal order of the digests ... 212(p+7), 212(p+8), 212(p+1) ... (including the copied valid digests) in the storage segments 304 is maintained relative to the temporal order of the data blocks ... 212(p+7), 212(p+8), 212(p+1) ... (including the copied valid data blocks) in the storage segments 302.
  • In addition, information pertaining to one or more attributes of the copied valid digests (e.g., stream ID, LBA, data type) is managed and/or maintained in the log 214 relative to the time period during which the digests were copied, written, and/or generated.
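  • The lockstep behavior described above can be pictured with a small sketch (hypothetical helper; SHA-256 is only an assumed digest function): whenever a modified block is appended to the block stream, its freshly generated digest is appended to the digest stream, so the two streams stay in the same temporal order.

```python
import hashlib
from typing import List, Tuple

block_stream: List[Tuple[str, bytes]] = []    # (label, data) appended in write order
digest_stream: List[Tuple[str, bytes]] = []   # digests appended in the same order

def write_block(label: str, data: bytes) -> None:
    """Append a (possibly modified) block as log structured data and append its
    newly generated digest to the digest stream at the corresponding position."""
    block_stream.append((label, data))
    digest_stream.append((label, hashlib.sha256(data).digest()))

write_block("212(p+7)", b"...")
write_block("212(p+8)", b"...")
write_block("212(p+1)", b"new contents")   # modification of the original block 212(p+1)

# The digest stream mirrors the temporal order of the block stream.
assert [label for label, _ in digest_stream] == [label for label, _ in block_stream]
```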
  • the size of the deduplication domain of the data sub-stream 204 can be reduced, allowing the deduplication logic 120 to perform its deduplication activities with increased efficiency. Such efficiencies can likewise be achieved while performing deduplication activities involving the data blocks of the data sub-stream 208 and the digests of the data stream 210 , due to the reduced size of the deduplication domain of the data sub-stream 208 .
  • improved temporal and/or spatial localities of data blocks in a data sub-stream can allow for the possibility of a reduced deduplication index footprint.
  • a predetermined sampling of the total number of digests can be maintained in the respective storage segments 304 , 308 to further increase deduplication efficiencies. Once a matching digest among the predetermined sampling of digests is identified, the deduplication logic 120 can then access a fuller or full set of the digests to complete its deduplication activities.
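  • One way to read the sampling idea above (a sketch under the assumption that every Nth digest is held in a small sample): an incoming digest is first checked against the sample, and only on a hit is the fuller set of digests consulted; misses in the sample simply forgo deduplication for that block.

```python
import hashlib
from typing import List, Set

class SampledDigestIndex:
    """Keep every Nth digest in a small sample; consult the full set only on a hit."""
    def __init__(self, sample_every: int = 8) -> None:
        self.sample_every = sample_every
        self.full: List[bytes] = []      # fuller/full set of digests (kept on the data log)
        self.sample: Set[bytes] = set()  # predetermined sampling held for quick checks

    def add(self, block: bytes) -> None:
        digest = hashlib.sha256(block).digest()
        self.full.append(digest)
        if len(self.full) % self.sample_every == 0:
            self.sample.add(digest)

    def maybe_duplicate(self, block: bytes) -> bool:
        digest = hashlib.sha256(block).digest()
        if digest not in self.sample:    # cheap check against the sample first;
            return False                 # a miss skips dedup for this block
        return digest in self.full       # on a hit, complete the check against the full set
```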
  • one or more data sub-streams are identified in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams.
  • one or more groups of data elements are formed from the respective data sub-streams.
  • one or more groups of data elements are written as log structured data to one or more segments of a data log.
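  • Putting the three steps above together, a compact sketch (hypothetical names; the block type stands in for "at least one attribute") identifies sub-streams, forms one group per sub-stream, and appends each group as log structured data to its own segment of a data log.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class DataElement:
    lba: int
    block_type: str
    payload: bytes

DataLog = List[List[DataElement]]   # the data log modeled as a list of segments

def handle_stream(incoming: List[DataElement], log: DataLog) -> Dict[int, int]:
    """Identify sub-streams, form one group per sub-stream, and write each group
    to its own segment; returns stream ID -> index of the segment written."""
    # Step 1: identify sub-streams based on at least one attribute (block type here).
    groups: Dict[str, List[DataElement]] = defaultdict(list)
    for element in incoming:
        groups[element.block_type].append(element)
    # Steps 2 and 3: form a group per sub-stream and append it to a fresh segment.
    placement: Dict[int, int] = {}
    for stream_id, (_, group) in enumerate(sorted(groups.items()), start=1):
        log.append(list(group))          # sequential, log structured append
        placement[stream_id] = len(log) - 1
    return placement

log: DataLog = []
stream = [DataElement(1, "INT", b"a"), DataElement(9, "IMG", b"b"), DataElement(2, "INT", b"c")]
print(handle_stream(stream, log))        # e.g., {1: 0, 2: 1}
```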
  • storage system is intended to be broadly construed to encompass, for example, private or public cloud computing systems for storing data, as well as systems for storing data comprising virtual infrastructure and those not comprising virtual infrastructure.
  • client refers, interchangeably, to any person, system, or other entity that uses a storage system to read/write data.
  • the term “storage device” may also refer to a storage array including multiple storage devices.
  • a storage device may refer to any non-volatile memory (NVM) device, including hard disk drives (HDDs), solid state drives (SSDs), flash devices (e.g., NAND flash devices, NOR flash devices), and similar devices that may be accessed locally and/or remotely (e.g., via a storage area network (SAN)).
  • a storage array (or disk array) may refer to a data storage system used for block-based, file-based, or object storage, in which storage arrays can include, for example, dedicated storage hardware containing spinning hard disk drives (HDDs), solid state disk drives, and/or all-flash drives.
  • a data storage entity may be any one or more of a file system, object storage, a virtualized device, a logical unit (LU), a logical unit number (LUN), a logical volume, a logical device, a physical device, and/or a storage medium.
  • a logical unit (LU) may be a logical entity provided by a storage system for accessing data from the storage system.
  • a logical unit (LU) is used interchangeably with a logical volume.
  • a LU or LUN may be used interchangeably with each other.
  • a LUN may be a logical unit number for identifying a logical unit, and may also refer to one or more virtual disks or virtual LUNs, which may correspond to one or more virtual machines.
  • a physical storage unit may be a physical entity, such as a disk or an array of disks, for storing data in storage locations that can be accessed by address, in which a physical storage unit is used interchangeably with a physical volume.
  • the term “storage medium” may refer to one or more storage media such as a hard drive, a combination of hard drives, flash storage, a combination of flash storage, a combination of hard drives, flash storage, and other storage devices, and other types and/or combinations of computer readable storage media.
  • a storage medium may also refer to both physical and logical storage media, and may include multiple levels of virtual-to-physical mappings, and may be or include an image or disk image.
  • a storage medium may be computer-readable, and may also be referred to as a computer-readable program medium.
  • IO request, or simply "IO," may be used to refer to an input or output request, such as a data read request or a data write request.
  • defragmentation refers to a process performed by a computer to reduce fragmentation by combining portions of data blocks, data files, or portions of other types of data storage units stored across non-contiguous areas of memory. Such combining of portions of data storage units makes subsequent access to the respective types of data storage units more efficient, and makes the resulting freed storage space available for reuse.
  • the terms, “such as,” “for example,” “e.g.,” “exemplary,” and variants thereof, describe non-limiting embodiments and mean “serving as an example, instance, or illustration.” Any embodiments described herein using such phrases and/or variants are not necessarily to be construed as preferred or more advantageous over other embodiments, and/or to exclude the incorporation of features from other embodiments.
  • the term “optionally” is employed herein to mean that a feature or process, etc., is provided in certain embodiments and not provided in other certain embodiments. Any particular embodiment of the present disclosure may include a plurality of “optional” features unless such features conflict with one another.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Techniques for handling multiple data streams in stream-aware data storage systems. The data storage systems can detect multiple sub-streams in an incoming stream of data, form a group of data blocks corresponding to each respective sub-stream, and associate, bind, and/or assign a stream ID to each data block in the respective sub-stream. The data storage systems can write each group of data blocks having the same stream ID to the same segment of a data log in one or more non-volatile storage devices, and manage and/or maintain, in persistent data storage, attribute information pertaining to the groups of data blocks in the respective sub-streams relative to time periods during which the respective groups of data blocks were written and/or received. The techniques can improve the detection of multiple sub-streams in an incoming stream of data, and improve the management of attribute information pertaining to data blocks in the respective sub-streams.

Description

    BACKGROUND
  • Data storage systems include storage processing circuitry coupled to arrays of non-volatile storage devices, such as, for example, solid state drives (SSDs), hard disk drives (HDDs), optical drives, and so on. The storage processing circuitry is configured to service host-generated storage input/output (IO) requests, which specify data blocks, data files, data pages, and/or other data elements to be written to, read from, created on, and/or deleted from the respective non-volatile storage devices. Such storage processing circuitry is further configured to execute software programs for managing the storage IO requests (e.g., write requests, read requests), and for performing various processing tasks to organize and/or secure the data blocks, data files, data pages, and/or other data elements on the respective non-volatile storage devices.
  • SUMMARY
  • Solid state drives (e.g., flash-based SSDs) can be configured to support multi-streaming capabilities, which allow the placement of data within the flash-based SSDs to be controlled to reduce write amplification, as well as minimize the processing overhead of garbage collection activities performed within the flash-based SSDs. To implement such multi-streaming capabilities, storage processing circuitry can be configured to service host-generated write requests, each of which can include a tag (also referred to herein as a “stream identifier” or “stream ID”) added to the write request. The storage processing circuitry can group or associate a data element (e.g., a data block) specified by the write request with one or more other data blocks having the same stream ID. The storage processing circuitry can also direct an SSD to place the data block in the same storage segment as the other data block(s) in the group. For example, each such data block in the group can have at least one common attribute relating to, for example, temporal locality, spatial locality, stream ID, logical block address (LBA), data type, port number, and so on.
  • Unfortunately, there are shortcomings to implementing multi-streaming capabilities in data storage systems. For example, in such data storage systems, information pertaining to groups of data blocks written to SSDs (e.g., various attributes of data blocks in the respective groups) is often not well managed and/or maintained. As a result, locality information (e.g., temporal locality, spatial locality), stream IDs, LBAs, data types, and/or other attribute information for the respective data blocks can be lost or otherwise made unavailable, particularly after garbage collection activities are performed within the SSDs. In addition, incoming streams of data received at the data storage systems can sometimes include multiple separate sub-streams originating from different users and/or applications. However, in such data storage systems, multiple sub-streams in an incoming stream of data are often not well detected and/or identified. Even if such sub-streams were well detected/identified by the data storage systems, less than optimal management and/or maintenance of attribute information pertaining to groups of data blocks in the respective sub-streams can lead to undesirable comingling of data blocks from different sub-streams.
  • Techniques are disclosed herein for handling multiple data streams in stream-aware data storage systems. The disclosed techniques can be employed in data storage systems to improve the detection and/or identification of multiple sub-streams in an incoming stream of data, as well as improve the management and/or maintenance of attribute information pertaining to groups of data blocks in the respective sub-streams. Such a stream-aware data storage system can include storage processing circuitry configured to service host-generated storage IO requests (e.g., write requests, read requests), which can direct the data storage system to write and/or read data blocks, data files, data pages, and/or other data elements to/from file systems, logical units (LUNs), and/or any other suitable storage objects. The stream-aware data storage system can further include a file system that has a log-based architecture design, and can employ one or more SSDs (e.g., flash-based SSDs) that provide log-based data storage, which can include a data log divided into a series of storage segments of equal or varying size.
  • In the stream-aware data storage system, the storage processing circuitry can service a plurality of host-generated write requests specifying a plurality of data blocks, respectively, in an incoming stream of data. The storage processing circuitry can detect and/or identify one or more separate sub-streams in the incoming stream of data based on at least one attribute of the data. For example, if the incoming stream of data includes an incoming stream of data blocks, then the storage processing circuitry can detect and/or identify the respective sub-streams based on certain attribute information such as the temporal locality of the data blocks, the spatial locality of the data blocks, a stream ID associated with each data block, an LBA associated with the data blocks, the type of each data block, the port number through which each data block is received, the host computer that generated the storage IO request, and so on, or any suitable combination thereof. Having detected and/or identified the separate sub-streams of data blocks, the storage processing circuitry can form a group of data blocks corresponding to each respective sub-stream, and associate, bind, and/or assign a stream ID to each data block in the respective sub-stream. The storage processing circuitry can then write each group of data blocks having the same stream ID to the same segment of the data log included in the SSD(s). In addition, the storage processing circuitry can manage and/or maintain, in persistent data storage, the attribute information pertaining to the groups of data blocks in the respective sub-streams relative to time periods during which the respective groups of data blocks were written, received, and/or created.
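  • The following Python sketch condenses this flow into a single routine (detect sub-streams, group their blocks, bind a stream ID, write each group to its own log segment, and record the attributes against the write time). The classify() helper, the in-memory lists standing in for the data log and the persistent attribute store, and all field names are assumptions made for illustration only:

      import time
      from collections import defaultdict

      def classify(block):
          # Hypothetical attribute extractor: a 3-byte type tag at the head of
          # each block (e.g., b"INT" or b"IMG"). Real detection could also use
          # temporal/spatial locality, LBA, port number, or the originating host.
          return block[:3]

      def handle_stream(incoming_blocks):
          # Identify sub-streams by attribute and group their data blocks.
          groups = defaultdict(list)
          for block in incoming_blocks:
              groups[classify(block)].append(block)

          data_log = []        # each entry stands in for one log segment
          attribute_log = []   # stands in for the persistent attribute store
          for stream_id, (attr, blocks) in enumerate(sorted(groups.items()), 1):
              data_log.append(list(blocks))        # same stream ID -> same segment
              attribute_log.append({"stream_id": stream_id,
                                    "attribute": attr.decode(),
                                    "segment": len(data_log) - 1,
                                    "written_at": time.time()})
          return data_log, attribute_log

      segments, attrs = handle_stream([b"INTaaaa", b"IMGbbbb", b"INTcccc"])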
  • In certain embodiments, a method of handling multiple data streams in a stream-aware data storage system includes identifying one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams, forming one or more groups of data elements from the respective data sub-streams, and writing one or more groups of data elements as log structured data to one or more segments of a data log.
  • In certain arrangements, the method includes identifying one or more data sub-streams in the incoming data stream based on one or more of a temporal locality of the data elements, a spatial locality of the data elements, a type of each data element, and a port number through which each data element is received.
  • In certain arrangements, the method includes associating a stream identifier (ID) to each data element in each data sub-stream.
  • In certain arrangements, the method includes writing each group of data elements having the same stream ID to the same segment of the data log.
  • In certain arrangements, the method includes maintaining, in persistent data storage, information pertaining to at least one attribute of the data elements in the respective data sub-streams relative to time periods during which the respective groups of data elements were written.
  • In certain arrangements, the method includes generating a digest for each data element in each respective data sub-stream, thereby generating a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream.
  • In certain arrangements, the method includes forming a group of digests from the plurality of digests, and associating a stream identifier (ID) to each digest in the group of digests.
  • In certain arrangements, the method includes writing the group of digests as a data stream of log structured data to a segment of the data log.
  • In certain arrangements, the method includes maintaining, in persistent data storage, information pertaining to (i) at least one attribute of the data elements in the respective data sub-stream, and (ii) the respective digests in the data stream, relative to a time period during which each of a respective group of data elements from the respective data sub-stream and the group of digests from the data stream were written to the data log.
  • In certain embodiments, a data storage system includes a memory, and processing circuitry configured to execute program instructions out of the memory to identify one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams, form one or more groups of data elements from the respective data sub-streams, and write one or more groups of data elements as log structured data to one or more segments of a data log.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to identify one or more data sub-streams in the incoming data stream based on one or more of a temporal locality of the data elements, a spatial locality of the data elements, a type of each data element, and a port number through which each data element is received.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to associate a stream identifier (ID) to each data element in each data sub-stream.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to write each group of data elements having the same stream ID to the same segment of the data log.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to maintain, in persistent data storage, information pertaining to at least one attribute of the data elements in the respective data sub-streams relative to time periods during which the respective groups of data elements were written.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to generate a digest for each data element in each respective data sub-stream, thereby generating a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to form a group of digests from the plurality of digests, and associate a stream identifier (ID) to each digest in the group of digests.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to write the group of digests as a data stream of log structured data to a segment of the data log.
  • In certain arrangements, the processing circuitry is configured to execute the program instructions out of the memory to maintain, in persistent data storage, information pertaining to (i) at least one attribute of the data elements in the respective data sub-stream, and (ii) the respective digests in the data stream, relative to a time period during which each of a respective group of data elements from the respective data sub-stream and the group of digests from the data stream were written to the data log.
  • In certain embodiments, a computer program product includes a set of non-transitory, computer-readable media having instructions that, when executed by control circuitry of a computerized apparatus, cause the control circuitry to perform a method of handling multiple data streams in a stream-aware data storage system. The method includes identifying one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams, forming one or more groups of data elements from the respective data sub-streams, and writing one or more groups of data elements as log structured data to one or more segments of a data log.
  • In certain arrangements, the method includes generating a digest for each data element in each respective data sub-stream in order to generate a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream, forming a group of digests from the plurality of digests, associating a stream identifier (ID) to each digest in the group of digests, and writing the group of digests as a data stream of log structured data to a segment of the data log.
  • Other features, functions, and aspects of the present disclosure will be evident from the Detailed Description that follows.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other objects, features, and advantages will be apparent from the following description of particular embodiments of the present disclosure, as illustrated in the accompanying drawings, in which like reference characters refer to the same parts throughout the different views.
  • FIG. 1 is a block diagram of an exemplary storage environment, in which techniques can be practiced for handling multiple sub-streams in an incoming stream of data in a stream-aware data storage system;
  • FIG. 2 is a block diagram of exemplary stream detection logic, exemplary deduplication logic, and exemplary stream placement logic included in the data storage system of FIG. 1, for use in forming, as one or more sub-streams and/or streams, one or more groups of data blocks and/or digests having similar attributes, as well as exemplary persistent data storage for use in managing and/or maintaining information pertaining to the attributes of the groups of data blocks/digests in the respective sub-streams/streams;
  • FIG. 3a is a block diagram of an exemplary data log included in log-based data storage associated with the data storage system of FIG. 1, illustrating the placement of the sub-streams/streams of FIG. 2 in respective segments of the data log;
  • FIG. 3b is a block diagram illustrating an exemplary garbage collection function being performed on data blocks of one of the sub-streams of FIG. 2;
  • FIG. 3c is a block diagram illustrating an exemplary garbage collection function being performed on a stream of digests corresponding to the respective data blocks of FIG. 3b; and
  • FIG. 4 is a flow diagram of an exemplary method of handling multiple sub-streams in an incoming stream of data in a stream-aware data storage system.
  • DETAILED DESCRIPTION
  • Techniques are disclosed herein for handling multiple data streams in stream-aware data storage systems. The data storage systems can detect and/or identify multiple sub-streams in an incoming stream of data, form a group of data elements (e.g., data blocks) corresponding to each respective sub-stream, and associate, bind, and/or assign a stream identifier (ID) to each data block in the respective sub-stream. The data storage systems can write each group of data blocks having the same stream ID to the same segment of a data log included in one or more non-volatile storage devices (e.g., solid state drives (SSDs)), and manage and/or maintain, in persistent data storage, attribute information pertaining to the groups of data blocks in the respective sub-streams relative to time periods during which the respective groups of data blocks were written, received, and/or created. The disclosed techniques can be employed in data storage systems to improve the detection and/or identification of multiple sub-streams in an incoming stream of data, as well as improve the management and/or maintenance of attribute information pertaining to groups of data blocks in the respective sub-streams.
  • FIG. 1 depicts an illustrative embodiment of an exemplary storage environment 100, in which techniques can be practiced for handling multiple data streams in stream-aware data storage systems. As shown in FIG. 1, the storage environment 100 can include a plurality of host computers 102.1, 102.2, . . . , 102.n, a data storage system 104, and a communications medium 103 that includes at least one network 106. For example, each of the plurality of host computers 102.1, . . . , 102.n can be configured as a web server computer, a file server computer, an email server computer, an enterprise server computer, or any other suitable client or server computer or computerized device. The plurality of host computers 102.1, . . . , 102.n can be further configured to provide, over the network 106, storage input/output (IO) requests (e.g., small computer system interface (SCSI) commands, network file system (NFS) commands) to the data storage system 104. For example, such storage IO requests (e.g., write requests, read requests) can direct the data storage system 104 to write and/or read data blocks, data files, data pages, and/or other data elements (also referred to herein as “host data”) to/from file systems, logical units (LUNs), and/or any other suitable storage objects maintained in association with the data storage system 104.
  • The communications medium 103 can be configured to interconnect the plurality of host computers 102.1, . . . , 102.n with the data storage system 104 to enable them to communicate and exchange data and/or control signaling. As shown in FIG. 1, the communications medium 103 can be illustrated as a “cloud” to represent different communications topologies such as a backbone topology, a hub-and-spoke topology, a loop topology, an irregular topology, and so on, or any suitable combination thereof. As such, the communications medium 103 can include copper based data communications devices and cabling, fiber optic based communications devices and cabling, wireless communications devices, and so on, or any suitable combination thereof. The communications medium 103 can be further configured to support storage area network (SAN) based communications, network attached storage (NAS) based communications, local area network (LAN) based communications, metropolitan area network (MAN) based communications, wide area network (WAN) based communications, wireless communications, distributed infrastructure communications, and/or any other suitable communications.
  • The data storage system 104 can include a communications interface 108, storage processing circuitry 110, a memory 112, and log-based storage media 114. The communications interface 108 can include SCSI target adapters, network interface adapters, and/or any other suitable adapters for converting electronic, optical, and/or wireless signals received over the network 106 to a form suitable for use by the storage processing circuitry 110. The memory 112 can include persistent memory (e.g., flash memory, magnetic memory) and non-persistent memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM)). Further, the memory 112 can accommodate specialized software constructs including stream detection logic 116, stream placement logic 118, deduplication logic 120, and a log-based file system 122. The log-based storage media 114 can accommodate specialized hardware constructs (e.g., processor or processing circuitry, memory) and/or software constructs including a garbage collector 124, a data log 126, and/or any other suitable hardware/software construct(s), as well as one or more non-volatile storage devices 128.0, . . . , 128.m such as solid state drives (e.g., flash-based SSDs). The data log 126 can be implemented on one or more of the flash-based SSDs 128.0, . . . , 128.m, and can be divided into a series of storage segments (or "windows," using CBFS® Storage terminology) of equal or varying size.
  • The stream detection logic 116 can be configured to detect and/or identify one or more separate sub-streams in an incoming data stream (e.g., an incoming data stream 202; see FIG. 2) based on at least one attribute of the data. For example, if the incoming data stream includes an incoming stream of data blocks, then the stream detection logic 116 can detect and/or identify the respective sub-streams based on certain attribute information such as the temporal locality of the data blocks, the spatial locality of the data blocks, a stream identifier (ID) associated with each data block, a logical block address (LBA) associated with the data blocks, the type of each data block (also referred to herein as the “block type”) (e.g., ASCII data type, integer data type, pointer data type, image data type, multimedia data type, digest data type), the port number through which each data block is received, the host computer that generated the storage IO request, and so on, or any suitable combination thereof. As employed herein, the term “temporal locality” refers to a number of data block addresses referenced by storage IO requests per unit time. For example, if the temporal locality of references to a data block address is high, then it is likely that the data block at that address will be accessed again soon. Further, the term “spatial locality” refers to a number of data block addresses referenced by storage IO requests per unit address space. For example, if the spatial locality of references relative to a data block address is high, then it is likely that one or more other data block addresses close to that data block address will also be accessed.
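  • One way such locality metrics might be computed is over a sliding window of recent block references. The window length, address radius, and class name below are illustrative values chosen for the sketch, not parameters from the disclosure:

      from collections import deque

      class LocalityTracker:
          """Illustrative locality metrics over a sliding window of (time, LBA)
          references; window length and address radius are arbitrary choices."""

          def __init__(self, window_seconds=1.0, address_radius=64):
              self.window_seconds = window_seconds
              self.address_radius = address_radius
              self.refs = deque()   # (timestamp, lba) pairs, oldest first

          def record(self, timestamp, lba):
              self.refs.append((timestamp, lba))
              while self.refs and timestamp - self.refs[0][0] > self.window_seconds:
                  self.refs.popleft()

          def temporal_locality(self, lba):
              # references to this block address per unit time within the window
              return sum(1 for _, a in self.refs if a == lba) / self.window_seconds

          def spatial_locality(self, lba):
              # references per unit address space around this block address
              near = sum(1 for _, a in self.refs if abs(a - lba) <= self.address_radius)
              return near / (2 * self.address_radius + 1)

      tracker = LocalityTracker()
      tracker.record(0.0, 1000)
      tracker.record(0.2, 1001)
      print(tracker.temporal_locality(1000), tracker.spatial_locality(1000))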
  • The stream placement logic 118 can be configured to form a group of data blocks for each detected and/or identified sub-stream, to associate, bind, and/or assign a stream ID to each data block in the group, and to write the group of data blocks having the same stream ID to logical addresses of the log-based file system 122. The log-based file system 122 can translate the logical addresses to physical addresses of the log-based storage media 114, and write the group of data blocks to the respective physical addresses, which can correspond to the same segment of the data log 126. In this way, the placement of a data sub-stream in a storage segment of the data log 126 of the log-based storage media 114 can be accomplished.
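  • A minimal sketch of this placement step is shown below; the mapping table, method names, and the use of Python lists to stand in for the log-based file system and the segments of the data log are assumptions made for illustration:

      class LogBasedFileSystem:
          """Toy model of the placement step: a group of blocks sharing a stream
          ID is written at consecutive logical addresses, and the file system maps
          those addresses onto one segment of the underlying data log."""

          def __init__(self):
              self.data_log = []   # list of segments, each a list of blocks
              self.mapping = {}    # logical block address -> (segment index, offset)

          def write_group(self, start_lba, blocks):
              segment = []                       # next available (unoccupied) segment
              self.data_log.append(segment)
              seg_idx = len(self.data_log) - 1
              for offset, block in enumerate(blocks):
                  segment.append(block)          # whole group lands in one segment
                  self.mapping[start_lba + offset] = (seg_idx, offset)
              return seg_idx

      fs = LogBasedFileSystem()
      fs.write_group(start_lba=0x100, blocks=[b"p-1", b"p", b"p+1"])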
  • The deduplication logic 120 can be configured to generate a digest for each data block (e.g., by applying a hash function to the data block) in each group of data blocks formed by the stream placement logic 118. Once digests for a respective group of data blocks have been generated, the stream placement logic 118 can group the digests, associate, bind, and/or assign a stream ID to each digest in the group, and write the group of digests having the same stream ID to the same segment of the data log 126. For each received data block corresponding to a detected and/or identified sub-stream, the deduplication logic 120 can generate a digest for the received data block, compare the generated digest with the respective grouped digests for that sub-stream, and determine whether there is a matching digest, possibly signifying multiple copies of the received data block. If an actual copy of the received data block is found (such as by a bit-by-bit comparison), the storage processing circuitry 110 can remove the received data block from the data storage system 104, and replace it with a reference to the copy of the data block stored on the log-based storage media 114, thereby saving storage space. In addition, the storage processing circuitry 110 can manage and/or maintain, in persistent data storage (e.g., in the memory 112 and/or on the log-based storage media 114), attribute information pertaining to the groups of data blocks in the respective sub-streams, as well as attribute information pertaining to their respective digests, relative to time periods during which the groups of data blocks/digests were written, received, created, and/or generated.
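  • The digest comparison described above might look roughly like the sketch below. SHA-256 is an assumed choice of hash function (the disclosure only states that a hash function is applied), and the in-memory dictionary stands in for the per-sub-stream group of digests:

      import hashlib

      class Deduplicator:
          """Sketch of per-sub-stream deduplication: compute a digest for each
          incoming block, compare it against the digests already grouped for the
          sub-stream, and confirm a match bit-for-bit before keeping a reference
          instead of a second copy of the block."""

          def __init__(self):
              self.digest_index = {}   # digest -> stored block (one sub-stream)

          def ingest(self, block):
              digest = hashlib.sha256(block).digest()
              stored = self.digest_index.get(digest)
              if stored is not None and stored == block:   # bit-by-bit confirmation
                  return ("reference", digest)             # duplicate: keep a reference
              self.digest_index[digest] = block
              return ("stored", digest)

      dedup = Deduplicator()
      dedup.ingest(b"block-1")   # ("stored", ...)
      dedup.ingest(b"block-1")   # ("reference", ...): copy detected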
  • The storage processing circuitry 110 can include one or more physical storage processors or engines (running specialized software), data movers, director boards, blades, IO modules, storage drive controllers, switches, and/or any other suitable computer hardware or combination thereof. For example, the storage processing circuitry 110 can execute program instructions out of the memory 112, process storage IO requests (e.g., write requests, read requests) provided by the respective host computers 102.1, . . . , 102.n, and store host data in any suitable storage environment (e.g., a redundant array of independent disks (RAID) environment) implemented by the flash-based SSDs 128.0, . . . , 128.m.
  • In the context of the storage processing circuitry 110 being implemented using one or more processors running specialized software, a computer program product can be configured to deliver all or a portion of the specialized software constructs to the respective processor(s). Such a computer program product can include one or more non-transient computer-readable storage media, such as a magnetic disk, a magnetic tape, a compact disk (CD), a digital versatile disk (DVD), an optical disk, a flash drive, a solid state drive (SSD), a secure digital (SD) chip or device, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and so on. The non-transient computer-readable storage media can be encoded with sets of instructions that, when executed by the respective processor(s), perform the techniques disclosed herein.
  • During operation, the data log 126 included in the log-based storage media 114 can be divided into a series of storage segments of equal or varying size. A variety of techniques can be employed to partition the data log 126 into the series of storage segments based on, for example, logical addresses, physical addresses, RAID groups, RAID stripes, RAID extents, and/or storage device extents. In certain embodiments, the series of storage segments can be distributed across different storage tiers, such as a high speed tier of SSDs, a medium speed tier of serial attached SCSI (SAS) devices, a low speed tier of near-line SAS devices, and so on. Once a group of data blocks from a detected and/or identified sub-stream contains a full segment's worth of data, the stream placement logic 118 can associate, bind, and/or assign a stream ID to each data block in the group, and write the group of data blocks having the same stream ID to logical addresses of the log-based file system 122. The log-based file system 122 can, in turn, write the group of data blocks to physical addresses of the log-based storage media 114 that correspond to the same next adjacent or non-adjacent unoccupied (or available) segment of the data log 126.
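  • The full-segment trigger can be sketched as a small per-sub-stream buffer; the segment size, class name, and list-based data log below are illustrative assumptions rather than features of the disclosed system:

      SEGMENT_SIZE = 4   # blocks per segment; the size is illustrative only

      class SubStreamBuffer:
          """Accumulates blocks of one detected sub-stream and flushes them to the
          next available log segment only once a full segment's worth is buffered."""

          def __init__(self, stream_id, data_log):
              self.stream_id = stream_id
              self.data_log = data_log   # list of segments
              self.pending = []

          def append(self, block):
              self.pending.append((self.stream_id, block))
              if len(self.pending) == SEGMENT_SIZE:
                  self.data_log.append(self.pending)   # next unoccupied segment
                  self.pending = []

      log = []
      buf = SubStreamBuffer(stream_id=1, data_log=log)
      for blk in (b"a", b"b", b"c", b"d"):
          buf.append(blk)
      # log now holds one full segment of stream-1 blocks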
  • In the event one or more data blocks in the group is modified, updated, overwritten, unmapped, or otherwise invalidated, the garbage collector 124 can perform garbage collection functions to reclaim storage space in the segment containing the group of data blocks, thereby reducing fragmentation. For example, such garbage collection functions can include combining or consolidating any remaining valid data blocks in the storage segment, copying the valid data blocks to unoccupied storage space in a next available segment of the data log 126, and/or erasing the data blocks in the segment to make its storage space available for reuse. Likewise, the garbage collector 124 can perform garbage collection functions to reclaim storage space in the segment containing the group of digests for the respective data blocks, due to one or more of the data blocks and their respective digests being concurrently invalidated. By controlling the placement of groups of data blocks and/or their respective digests in segments of the data log 126 based at least on the sub-stream/stream to which each group of data blocks/digests belongs, the processing overhead associated with performing such garbage collection functions can be significantly reduced.
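  • A simplified version of this reclaim step is sketched below. The validity predicate, function name, and list-based segments are assumptions, and a real garbage collector would also update the logical-to-physical mapping and the maintained attribute information for the copied blocks:

      def collect_segment(segment, valid, data_log):
          """Copy the still-valid blocks of a segment to the next available
          segment, then erase the old segment so its space can be reused."""
          survivors = [block for block in segment if valid(block)]
          if survivors:
              data_log.append(survivors)   # consolidate survivors in a fresh segment
          segment.clear()                  # make the old segment available for reuse
          return survivors

      data_log = [[b"p-1", b"p", b"p+1"]]
      collect_segment(data_log[0], valid=lambda b: b != b"p+1", data_log=data_log)
      # data_log[0] is now empty; the valid blocks b"p-1" and b"p" live in data_log[1]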
  • The disclosed techniques for handling multiple data streams in stream-aware data storage systems will be further understood with reference to the following illustrative example, as well as FIGS. 1, 2, and 3a-3c. In this example, the data storage system 104 (see FIG. 1) services host-generated storage IO requests (e.g., write requests, read requests), which direct the data storage system 104 to write and/or read data blocks of the incoming data stream 202 (see FIG. 2) to/from logical addresses of the log-based file system 122.
  • FIG. 2 depicts the incoming data stream 202, which includes, in the order of the storage IO requests, at least a block 212(p+1), a block 212(p), a block 212(q+1), a block 212(q), a block 212(p−1), and a block 212(q−1). As shown in FIG. 2, the incoming data stream 202 is operated on by the stream detection logic 116, which detects and/or identifies one or more separate sub-streams in the incoming data stream 202 based on at least one attribute of the data. In this example, the stream detection logic 116 detects the respective sub-streams based on certain attribute information such as the block type (e.g., ASCII data type, integer data type, pointer data type, image data type, multimedia data type, digest data type). For example, the stream detection logic 116 can inspect a header of each data block to detect or identify at least (i) a first sub-stream including the blocks 212(p−1), 212(p), 212(p+1), each of which has the block type, “integer” (INT), and (ii) a second sub-stream including the blocks 212(q−1), 212(q), 212(q+1), each of which has the block type, “image” (IMG).
  • Having detected the first sub-stream and the second sub-stream in the incoming data stream 202, the stream placement logic 118 forms a first group of data blocks (i.e., . . . block 212(p−1), block 212(p), block 212(p+1) . . . ) corresponding to the first sub-stream, and a second group of data blocks (i.e., . . . block 212(q−1), block 212(q), block 212(q+1) . . . ) corresponding to the second sub-stream, and associates, binds, and/or assigns a stream ID to each data block in the respective groups of data blocks. In this example, the stream placement logic 118 associates the stream ID “1” to each data block in the first group of data blocks (i.e., . . . block 212(p−1), block 212(p), block 212(p+1) . . . ), and associates the stream ID “2” to each data block in the second group of data blocks (i.e., . . . block 212(q−1), block 212(q), block 212(q+1) . . . ). In this way, a data sub-stream 204 (see FIG. 2) is generated that includes the first group of data blocks having the stream ID “1” and block type “INT,” and a data sub-stream 208 (see also FIG. 2) is generated that includes the second group of data blocks having the stream ID “2” and block type “IMG.”
  • Having generated the data sub-stream 204 and the data sub-stream 208, the deduplication logic 120 generates a digest for each data block in the data sub-stream 204, and likewise generates a digest for each data block in the data sub-stream 208. Once the digests are generated for each data block in the respective sub-streams 204, 208, the stream placement logic 118 groups the digests for the data blocks in the data sub-stream 204, groups the digests for the data blocks in the data sub-stream 208, and associates, binds, and/or assigns a stream ID to each digest in the respective digest groupings. In this example, the stream placement logic 118 associates the stream ID “3” to each digest in the grouping corresponding to the data blocks in the data sub-stream 204, and associates the stream ID “4” to each digest in the grouping corresponding to the data blocks in the data sub-stream 208. The stream placement logic 118 also associates a data type, namely, “digest” (DIG), to the respective groupings of digests. In this way, a data stream 206 is generated that includes the grouping of digests having the stream ID “3” and data type “DIG,” and a data stream 210 is generated that includes the grouping of digests having the stream ID “4” and data type “DIG.”
  • The stream placement logic 118 (i) writes the data sub-stream 204 to the log-based file system 122 starting at logical block address (LBA) “W,” (ii) writes the data stream 206 to the log-based file system 122 starting at LBA “X,” (iii) writes the data sub-stream 208 to the log-based file system 122 starting at LBA “Y,” and (iv) writes the data stream 210 to the log-based file system 122 starting at LBA “Z.” The log-based file system 122 then (i) writes the data sub-stream 204 to a first segment of the data log 126 starting at a first physical address translated from the LBA “W,” (ii) writes the data stream 206 to a second segment of the data log 126 starting at a second physical address translated from the LBA “X,” (iii) writes the data sub-stream 208 to a third segment of the data log 126 starting at a third physical address translated from the LBA “Y,” and (iv) writes the data stream 210 to a fourth segment of the data log 126 starting at a fourth physical address translated from the LBA “Z.”
  • The storage processing circuitry 110 manages and/or maintains, in persistent data storage (e.g., in the memory 112 and/or on the log-based storage media 114), attribute information pertaining to the data sub-stream 204, the data stream 206, the data sub-stream 208, and the data stream 210 written to the first segment, the second segment, the third segment, and the fourth segment, respectively, of the data log 126. Such attribute information for the respective data sub-streams/streams 204, 206, 208, 210 is managed and/or maintained in a log 214 (see FIG. 2) relative to time periods during which the corresponding groups of data blocks/digests were written, received, created, and/or generated. In this example, the attribute information for the data sub-stream 204 (i.e., ID "1", LBA "W", type "INT"), as well as the attribute information for its corresponding group of digests in the data stream 206 (i.e., ID "3", LBA "X", type "DIG"), are managed relative to a time period "t0" during which the corresponding groups of data blocks/digests were received/generated. Further, the attribute information for the data sub-stream 208 (i.e., ID "2", LBA "Y", type "IMG"), as well as the attribute information for its corresponding group of digests in the data stream 210 (i.e., ID "4", LBA "Z", type "DIG"), are managed relative to a time period "t1" during which the corresponding groups of data blocks/digests were received/generated.
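  • For reference, the contents of the log 214 in this example can be pictured as the following structure. The field names and the grouping by time period are illustrative; only the stream IDs, starting LBAs, types, and time periods come from the example above:

      # Illustrative reconstruction of the attribute log 214 of FIG. 2.
      attribute_log = {
          "t0": [   # period in which sub-stream 204 and digest stream 206 were written
              {"stream_id": 1, "start_lba": "W", "type": "INT"},
              {"stream_id": 3, "start_lba": "X", "type": "DIG"},
          ],
          "t1": [   # period in which sub-stream 208 and digest stream 210 were written
              {"stream_id": 2, "start_lba": "Y", "type": "IMG"},
              {"stream_id": 4, "start_lba": "Z", "type": "DIG"},
          ],
      }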
  • FIG. 3a depicts an exemplary embodiment of the data log 126, which is divided into a series of storage segments of equal or varying size, including at least a storage segment 310, a storage segment 311, a storage segment 312, a storage segment 313, a storage segment 314, a storage segment 315, a storage segment 316, and a storage segment 317. In this example, the data sub-stream 204 including the first group of blocks . . . 212(p−1), 212(p), 212(p+1) . . . is written to the segment 310, the data stream 206 including the group of digests . . . 212(p−1), 212(p), 212(p+1) . . . is written to the segment 312, the data sub-stream 208 including the second group of blocks . . . 212(q−1), 212(q), 212(q+1) . . . is written to the segment 314, and the data stream 210 including the group of digests . . . 212(q−1), 212(q), 212(q+1) . . . is written to the segment 316.
  • It is noted that any suitable number of storage segments 302 (including the storage segments 310, 311) can be used to store data blocks corresponding to the data sub-stream 204, and any suitable number of storage segments 304 (including the storage segments 312, 313) can be used to store digests corresponding to the data stream 206, such that the temporal order of the digests . . . 212(p−1), 212(p), 212(p+1) . . . in the storage segments 304 is maintained relative to the temporal order of the blocks . . . 212(p−1), 212(p), 212(p+1) . . . in the storage segments 302. Likewise, any suitable number of storage segments 306 (including the storage segments 314, 315) can be used to store data blocks corresponding to the data sub-stream 208, and any suitable number of storage segments 308 (including the storage segments 316, 317) can be used to store digests corresponding to the data stream 210, such that the temporal order of the digests . . . 212(q−1), 212(q), 212(q+1) . . . in the storage segments 308 is maintained relative to the temporal order of the blocks . . . 212(q−1), 212(q), 212(q+1) . . . in the storage segments 306. It is also noted that "t0" represents the time period during which the first group of blocks . . . 212(p−1), 212(p), 212(p+1) . . . was received and its corresponding group of digests . . . 212(p−1), 212(p), 212(p+1) . . . was generated. Further, "t1" represents the time period during which the second group of blocks . . . 212(q−1), 212(q), 212(q+1) . . . was received and its corresponding group of digests . . . 212(q−1), 212(q), 212(q+1) . . . was generated.
  • FIG. 3b depicts a garbage collection function performed by the garbage collector 124 on the data sub-stream 204, which includes the first group of blocks . . . 212(p−1), 212(p), 212(p+1) . . . written to the storage segment 310, as well as blocks . . . 212(p+7), 212(p+8) . . . written to the storage segment 311. In this example, the blocks . . . 212(p−1), 212(p), 212(p+1), 212(p+7), 212(p+8) . . . (corresponding to the stream ID “1”) each have the block type “INT.” Further, at least the block 212(p−1), the block 212(p), and the block 212(p+1) are sequentially written as log structured data to the storage segment 310, and at least the block 212(p+7) and the block 212(p+8) are sequentially written as log structured data to the storage segment 311. In addition, information is managed and/or maintained (e.g., in the memory 112 and/or the log-based storage media 114) in the log 214, including one or more attributes of the original blocks 212(p−1), 212(p), 212(p+1), 212(p+7), 212(p+8) such as the block type “INT.”
  • In this example, at least the original block 212(p+1) included in the data sub-stream 204 is modified. Such modification of the original block 212(p+1) is represented by a new block 212(p+1), which is sequentially written as log structured data to the storage segment 311, such as after the block 212(p+8). Once at least the new block 212(p+1) is written to the storage segment 311, the garbage collector 124 can perform its garbage collection function, which includes invalidating at least the original block 212(p+1) written to the storage segment 310 (as indicated by a cross "X" drawn through the block 212(p+1); see FIG. 3b), combining or consolidating any remaining valid data blocks (such as the original blocks 212(p−1), 212(p)) as a group in the storage segment 310, copying the group of valid data blocks (including the original blocks 212(p−1), 212(p)) to a next unoccupied (or available) segment (not shown) among the storage segments 302, and erasing at least the original blocks 212(p−1), 212(p), 212(p+1) from the storage segment 310 to make its storage space available for reuse. It is noted that information pertaining to one or more attributes of the copied data blocks (e.g., stream ID, LBA, block type) is managed and/or maintained in the log 214 relative to the time period during which the group of data blocks were copied, written, and/or created.
  • FIG. 3c depicts a garbage collection function performed by the garbage collector 124 on the data stream 206, which includes the group of digests . . . 212(p−1), 212(p), 212(p+1) . . . written to the storage segment 312, as well as digests . . . 212(p+7), 212(p+8) . . . written to the storage segment 313. In this example, the digests 212(p−1), 212(p), 212(p+1) are generated for the blocks 212(p−1), 212(p), 212(p+1), respectively, and the digests 212(p+7), 212(p+8) are generated for the blocks 212(p+7), 212(p+8), respectively. The digests . . . 212(p−1), 212(p), 212(p+1), 212(p+7), 212(p+8) . . . (corresponding to the stream ID "3") each have the data type "DIG." Further, at least the digest 212(p−1), the digest 212(p), and the digest 212(p+1) are sequentially written as log structured data to the storage segment 312, and at least the digest 212(p+7) and the digest 212(p+8) are sequentially written as log structured data to the storage segment 313. In addition, information is managed and/or maintained (e.g., in the memory 112 and/or the log-based storage media 114) in the log 214, including one or more attributes of the original digests 212(p−1), 212(p), 212(p+1), 212(p+7), 212(p+8) such as the data type "DIG."
  • As described herein, at least the original block 212(p+1) included in the data sub-stream 204 was modified, and such modification of the original block 212(p+1) was written as the new block 212(p+1) to the storage segment 311. In this example, because the original block 212(p+1) was modified, the original digest 212(p+1) for the original block 212(p+1) is effectively modified as a new digest 212(p+1) for the new block 212(p+1). The deduplication logic 120 generates the new digest 212(p+1), which is sequentially written as log structured data to the storage segment 313, such as after the digest 212(p+8). Once at least the new digest 212(p+1) is generated and written to the storage segment 313, the garbage collector 124 can perform its garbage collection function, which includes invalidating at least the original digest 212(p+1) written to the storage segment 312 (as indicated by a cross "X" drawn through the digest 212(p+1); see FIG. 3c), combining or consolidating any remaining valid digests (such as the original digests 212(p−1), 212(p)) in the storage segment 312, copying the valid digests (including the original digests 212(p−1), 212(p)) to a next unoccupied (or available) segment (not shown) among the storage segments 304, and erasing at least the original digests 212(p−1), 212(p), 212(p+1) from the storage segment 312 to make its storage space available for reuse.
  • It is noted that the temporal order of the digests . . . 212(p+7), 212(p+8), 212(p+1) . . . (including the copied valid digests) in the storage segments 304 is maintained relative to the temporal order of the data blocks . . . 212(p+7), 212(p+8), 212(p+1) . . . (including the copied valid data blocks) in the storage segments 302. It is further noted that information pertaining to one or more attributes of the copied valid digests (e.g., stream ID, LBA, data type) is managed and/or maintained in the log 214 relative to the time period during which the group of digests were copied, written, and/or generated.
  • By maintaining the digests of the data stream 206 as a group in the storage segments 304, while maintaining their temporal order relative to the data blocks of the data sub-stream 204 in the storage segments 302, the size of the deduplication domain of the data sub-stream 204 can be reduced, allowing the deduplication logic 120 to perform its deduplication activities with increased efficiency. Such efficiencies can likewise be achieved while performing deduplication activities involving the data blocks of the data sub-stream 208 and the digests of the data stream 210, due to the reduced size of the deduplication domain of the data sub-stream 208. Moreover, improved temporal and/or spatial localities of data blocks in a data sub-stream can allow for the possibility of a reduced deduplication index footprint. In certain embodiments, rather than maintaining each digest of the respective data streams 206, 210 in the storage segments 304, 308, respectively, a predetermined sampling of the total number of digests can be maintained in the respective storage segments 304, 308 to further increase deduplication efficiencies. Once a matching digest among the predetermined sampling of digests is identified, the deduplication logic 120 can then access a fuller or full set of the digests to complete its deduplication activities.
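  • The sampling idea can be sketched as keeping every N-th digest in the fast index and falling back to the full digest set only after a sampled hit; the sampling ratio, hash function, and function name below are illustrative assumptions:

      import hashlib

      SAMPLE_EVERY = 8   # keep one digest in eight; the ratio is illustrative

      def sampled_digests(blocks):
          """Return (sampled index, full digest set) for one sub-stream; a hit on a
          sampled digest is the cue to consult the full set to finish deduplication."""
          full = [hashlib.sha256(b).digest() for b in blocks]
          return full[::SAMPLE_EVERY], full

      sample, full = sampled_digests([b"blk%d" % i for i in range(32)])
      # 'sample' holds 4 digests, 'full' holds all 32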
  • An exemplary method of handling multiple data streams in stream-aware data storage systems is described below with reference to FIG. 4. As depicted in block 402, one or more data sub-streams are identified in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams. As depicted in block 404, one or more groups of data elements are formed from the respective data sub-streams. As depicted in block 406, one or more groups of data elements are written as log structured data to one or more segments of a data log.
  • Several definitions of terms are provided below for the sole purpose of aiding understanding of the foregoing description, as well as the claims set forth hereinbelow.
  • As employed herein, the term “storage system” is intended to be broadly construed to encompass, for example, private or public cloud computing systems for storing data, as well as systems for storing data comprising virtual infrastructure and those not comprising virtual infrastructure.
  • As employed herein, the terms “client,” “host,” and “user” refer, interchangeably, to any person, system, or other entity that uses a storage system to read/write data.
  • As employed herein, the term "storage device" may also refer to a storage array including multiple storage devices. Such a storage device may refer to any non-volatile memory (NVM) device, including hard disk drives (HDDs), solid state drives (SSDs), flash devices (e.g., NAND flash devices, NOR flash devices), and similar devices that may be accessed locally and/or remotely (e.g., via a storage area network (SAN)). A storage array (or disk array) may refer to a data storage system used for block-based, file-based, or object storage, in which storage arrays can include, for example, dedicated storage hardware containing spinning hard disk drives (HDDs), solid state disk drives, and/or all-flash drives. A data storage entity may be any one or more of a file system, object storage, a virtualized device, a logical unit (LU), a logical unit number (LUN), a logical volume, a logical device, a physical device, and/or a storage medium. A logical unit (LU) may be a logical entity provided by a storage system for accessing data from the storage system. The term "logical unit (LU)" is used interchangeably with "logical volume," and the terms "LU" and "LUN" may be used interchangeably with each other. A LUN may be a logical unit number for identifying a logical unit, and may also refer to one or more virtual disks or virtual LUNs, which may correspond to one or more virtual machines. A physical storage unit may be a physical entity, such as a disk or an array of disks, for storing data in storage locations that can be accessed by address, in which a physical storage unit is used interchangeably with a physical volume.
  • As employed herein, the term “storage medium” may refer to one or more storage media such as a hard drive, a combination of hard drives, flash storage, a combination of flash storage, a combination of hard drives, flash storage, and other storage devices, and other types and/or combinations of computer readable storage media. A storage medium may also refer to both physical and logical storage media, and may include multiple levels of virtual-to-physical mappings, and may be or include an image or disk image. A storage medium may be computer-readable, and may also be referred to as a computer-readable program medium.
  • As employed herein, the term "IO request" or simply "IO" may be used to refer to an input or output request, such as a data read request or a data write request.
  • As employed herein, the term “defragmentation” refers to a process performed by a computer to reduce fragmentation by combining portions of data blocks, data files, or portions of other types of data storage units stored across non-contiguous areas of memory. Such combining of portions of data storage units makes subsequent access to the respective types of data storage units more efficient, and makes the resulting freed storage space available for reuse.
  • As employed herein, the terms, “such as,” “for example,” “e.g.,” “exemplary,” and variants thereof, describe non-limiting embodiments and mean “serving as an example, instance, or illustration.” Any embodiments described herein using such phrases and/or variants are not necessarily to be construed as preferred or more advantageous over other embodiments, and/or to exclude the incorporation of features from other embodiments. In addition, the term “optionally” is employed herein to mean that a feature or process, etc., is provided in certain embodiments and not provided in other certain embodiments. Any particular embodiment of the present disclosure may include a plurality of “optional” features unless such features conflict with one another.
  • While various embodiments of the present disclosure have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the present disclosure, as defined by the appended claims.

Claims (20)

1. A method of handling multiple data streams in a stream-aware data storage system, comprising:
identifying one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams;
forming one or more groups of data elements from the respective data sub-streams;
writing the one or more groups of data elements as log structured data to one or more segments of a data log; and
maintaining, in persistent data storage, information pertaining to the at least one attribute of the data elements in the respective data sub-streams relative to time periods during which the respective groups of data elements were written.
2. The method of claim 1 wherein the identifying of one or more data sub-streams in the incoming data stream is based on one or more of a temporal locality of the data elements, a spatial locality of the data elements, a type of each data element, and a port number through which each data element is received.
3. The method of claim 1 further comprising:
associating a stream identifier (ID) to each data element in each data sub-stream.
4. The method of claim 3 wherein the writing of the one or more groups of data elements as log structured data to one or more segments of the data log includes writing each group of data elements having the same stream ID to the same segment of the data log.
5. (canceled)
6. The method of claim 1 further comprising:
generating a digest for each data element in each respective data sub-stream,
thereby generating a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream.
7. The method of claim 6 further comprising:
forming a group of digests from the plurality of digests; and
associating a stream identifier (ID) to each digest in the group of digests.
8. The method of claim 7 further comprising:
writing the group of digests as a data stream of log structured data to a segment of the data log.
9. The method of claim 8 further comprising:
maintaining, in persistent data storage, information pertaining to (i) the at least one attribute of the data elements in the respective data sub-stream, and (ii) the respective digests in the data stream, relative to a time period during which each of a respective group of data elements from the respective data sub-stream and the group of digests from the data stream were written to the data log.
10. A data storage system, comprising:
a memory; and
processing circuitry configured to execute program instructions out of the memory to:
identify one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams;
form one or more groups of data elements from the respective data sub-streams;
write the one or more groups of data elements as log structured data to one or more segments of a data log; and
maintain, in persistent data storage, information pertaining to the at least one attribute of the data elements in the respective data sub-streams relative to time periods during which the respective groups of data elements were written.
11. The data storage system of claim 10 wherein the processing circuitry is further configured to execute the program instructions out of the memory to identify one or more data sub-streams in the incoming data stream based on one or more of a temporal locality of the data elements, a spatial locality of the data elements, a type of each data element, and a port number through which each data element is received.
12. The data storage system of claim 10 wherein the processing circuitry is further configured to execute the program instructions out of the memory to associate a stream identifier (ID) to each data element in each data sub-stream.
13. The data storage system of claim 12 wherein the processing circuitry is further configured to execute the program instructions out of the memory to write each group of data elements having the same stream ID to the same segment of the data log.
14. (canceled)
15. The data storage system of claim 10 wherein the processing circuitry is further configured to execute the program instructions out of the memory to:
generate a digest for each data element in each respective data sub-stream,
thereby generating a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream.
16. The data storage system of claim 15 wherein the processing circuitry is further configured to execute the program instructions out of the memory to:
form a group of digests from the plurality of digests; and
associate a stream identifier (ID) to each digest in the group of digests.
17. The data storage system of claim 16 wherein the processing circuitry is further configured to execute the program instructions out of the memory to write the group of digests as a data stream of log structured data to a segment of the data log.
18. The data storage system of claim 17 wherein the processing circuitry is further configured to execute the program instructions out of the memory to maintain, in persistent data storage, information pertaining to (i) the at least one attribute of the data elements in the respective data sub-stream, and (ii) the respective digests in the data stream, relative to a time period during which each of a respective group of data elements from the respective data sub-stream and the group of digests from the data stream were written to the data log.
19. A computer program product including a set of non-transitory, computer-readable media having instructions that, when executed by control circuitry of a computerized apparatus, cause the control circuitry to perform a method of handling multiple data streams in a stream-aware data storage system, the method comprising:
identifying one or more data sub-streams in an incoming data stream based on at least one attribute of data elements in the respective data sub-streams;
forming one or more groups of data elements from the respective data sub-streams;
writing the one or more groups of data elements as log structured data to one or more segments of a data log; and
maintaining, in persistent data storage, information pertaining to the at least one attribute of the data elements in the respective data sub-streams relative to time periods during which the respective groups of data elements were written.
20. The computer program product of claim 19 wherein the method further comprises:
generating a digest for each data element in each respective data sub-stream, thereby generating a plurality of digests for a plurality of data elements, respectively, in the respective data sub-stream;
forming a group of digests from the plurality of digests;
associating a stream identifier (ID) to each digest in the group of digests; and
writing the group of digests as a data stream of log structured data to a segment of the data log.
US16/526,391 2019-07-30 2019-07-30 User stream aware file systems with user stream detection Active US10929066B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/526,391 US10929066B1 (en) 2019-07-30 2019-07-30 User stream aware file systems with user stream detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US16/526,391 US10929066B1 (en) 2019-07-30 2019-07-30 User stream aware file systems with user stream detection

Publications (2)

Publication Number Publication Date
US20210034289A1 true US20210034289A1 (en) 2021-02-04
US10929066B1 US10929066B1 (en) 2021-02-23

Family

ID=74258574

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/526,391 Active US10929066B1 (en) 2019-07-30 2019-07-30 User stream aware file systems with user stream detection

Country Status (1)

Country Link
US (1) US10929066B1 (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12093550B2 (en) 2021-07-28 2024-09-17 EMC IP Holding Company LLC Per-service storage of attributes
KR20230097866A (en) * 2021-12-24 2023-07-03 삼성전자주식회사 Storage device including memory controller and operation method thereof

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6928526B1 (en) 2002-12-20 2005-08-09 Datadomain, Inc. Efficient data storage system
US8060713B1 (en) 2005-12-21 2011-11-15 Emc (Benelux) B.V., S.A.R.L. Consolidating snapshots in a continuous data protection system using journaling
US8396839B1 (en) * 2010-06-25 2013-03-12 Emc Corporation Representing de-duplicated file data
US9317377B1 (en) * 2011-03-23 2016-04-19 Riverbed Technology, Inc. Single-ended deduplication using cloud storage protocol
US10120875B1 (en) 2014-12-02 2018-11-06 EMC IP Holding Company LLC Method and system for detecting boundaries of data blocks for deduplication
US9594513B1 (en) 2015-06-29 2017-03-14 EMC IP Holding Company LLC Data storage system with file system stream detection
US10459648B1 (en) * 2015-12-14 2019-10-29 EMC IP Holding Company LLC Change rate estimation
US10303797B1 (en) 2015-12-18 2019-05-28 EMC IP Holding Company LLC Clustering files in deduplication systems
US10305954B1 (en) 2016-07-25 2019-05-28 EMC IP Holding Company LLC Storage system for scalable video streaming using software-defined storage pool
US10289566B1 (en) 2017-07-28 2019-05-14 EMC IP Holding Company LLC Handling data that has become inactive within stream aware data storage equipment
US10402091B1 (en) 2018-04-30 2019-09-03 EMC IP Holding Company LLC Managing data in log-structured storage systems

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220131953A1 (en) * 2020-10-28 2022-04-28 Viavi Solutions Inc. Analyzing network data for debugging, performance, and identifying protocol violations using parallel multi-threaded processing
US11665262B2 (en) * 2020-10-28 2023-05-30 Viavi Solutions Inc. Analyzing network data for debugging, performance, and identifying protocol violations using parallel multi-threaded processing
US12099473B1 (en) * 2020-12-14 2024-09-24 Cigna Intellectual Property, Inc. Systems and methods for centralized logging for enhanced scalability and security of web services
US20230106882A1 (en) * 2021-10-04 2023-04-06 Samsung Electronics Co., Ltd. Flash translation layer with rewind
US11960757B2 (en) * 2021-10-04 2024-04-16 Samsung Electronics Co., Ltd. Flash translation layer with rewind
US20230195386A1 (en) * 2021-12-20 2023-06-22 Samsung Electronics Co., Ltd. Method of writing data in storage device and method of reading data from storage device using sensor information, storage device performing the same and method of operating storage device using the same

Also Published As

Publication number Publication date
US10929066B1 (en) 2021-02-23

Similar Documents

Publication Publication Date Title
US10929066B1 (en) User stream aware file systems with user stream detection
US9996435B2 (en) Reliability scheme using hybrid SSD/HDD replication with log structured management
US8046537B2 (en) Virtualization engine and method, system, and computer program product for managing the storage of data
US8775388B1 (en) Selecting iteration schemes for deduplication
US9229870B1 (en) Managing cache systems of storage systems
US9727245B2 (en) Method and apparatus for de-duplication for solid state disks (SSDs)
US8694563B1 (en) Space recovery for thin-provisioned storage volumes
US9846718B1 (en) Deduplicating sets of data blocks
US11061827B2 (en) Metadata representation for enabling partial page duplication
US10365845B1 (en) Mapped raid restripe for improved drive utilization
US20150363134A1 (en) Storage apparatus and data management
US20200341684A1 (en) Managing a raid group that uses storage devices of different types that provide different data storage characteristics
US11163496B1 (en) Systems and methods of updating persistent statistics on a multi-transactional and multi-node storage system
US20200341656A1 (en) Handling pattern identifiers in a data storage system
US11429318B2 (en) Redirect-on-write snapshot mechanism with delayed data movement
US11016884B2 (en) Virtual block redirection clean-up
US11526447B1 (en) Destaging multiple cache slots in a single back-end track in a RAID subsystem
US10732840B2 (en) Efficient space accounting mechanisms for tracking unshared pages between a snapshot volume and its parent volume
US11789622B2 (en) Method, device and computer program product for storage management
US11315028B2 (en) Method and apparatus for increasing the accuracy of predicting future IO operations on a storage system
US11157198B2 (en) Generating merge-friendly sequential IO patterns in shared logger page descriptor tiers
US12045505B2 (en) Scanning pages of shared memory
US20230176743A1 (en) Handling data with different lifetime characteristics in stream-aware data storage equipment
US11144445B1 (en) Use of compression domains that are more granular than storage allocation units
US11947803B2 (en) Effective utilization of different drive capacities

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RUEF, RICHARD P.;EVERSON, KURT W.;SIGNING DATES FROM 20190726 TO 20190729;REEL/FRAME:050162/0850

AS Assignment

Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, NORTH CAROLINA

Free format text: SECURITY AGREEMENT;ASSIGNORS:DELL PRODUCTS L.P.;EMC CORPORATION;EMC IP HOLDING COMPANY LLC;REEL/FRAME:050406/0421

Effective date: 20190917

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT, TEXAS

Free format text: PATENT SECURITY AGREEMENT (NOTES);ASSIGNORS:DELL PRODUCTS L.P.;EMC CORPORATION;EMC IP HOLDING COMPANY LLC;REEL/FRAME:050724/0571

Effective date: 20191010

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:053546/0001

Effective date: 20200409

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS COLLATERAL AGENT, TEXAS

Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC CORPORATION;EMC IP HOLDING COMPANY LLC;REEL/FRAME:053311/0169

Effective date: 20200603

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DALMATOV, NICKOLAY ALEXANDROVICH;REEL/FRAME:053562/0289

Effective date: 20190911

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE OF SECURITY INTEREST AT REEL 050406 FRAME 421;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058213/0825

Effective date: 20211101

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST AT REEL 050406 FRAME 421;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058213/0825

Effective date: 20211101

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST AT REEL 050406 FRAME 421;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058213/0825

Effective date: 20211101

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742

Effective date: 20220329

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053311/0169);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060438/0742

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (050724/0571);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060436/0088

Effective date: 20220329

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (050724/0571);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060436/0088

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (050724/0571);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060436/0088

Effective date: 20220329

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4