US10761750B2 - Selectively storing data into allocation areas using streams - Google Patents
- Publication number
- US10761750B2 (application US15/453,949)
- Authority
- US
- United States
- Prior art keywords
- data
- storage device
- allocation
- stream
- allocation area
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G06F3/0631: Configuration or reconfiguration of storage systems by allocating resources to storage systems
- G06F3/0605: Improving or facilitating administration, e.g. storage management by facilitating the interaction with a user or administrator
- G06F3/0626: Reducing size or complexity of storage systems
- G06F3/0644: Management of space entities, e.g. partitions, extents, pools
- G06F3/068: Hybrid storage device
- G06F3/0685: Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays
- G06F3/0689: Disk arrays, e.g. RAID, JBOD
- H04L65/4069
- H04L65/61: Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
- H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Definitions
- a file system may store that data within a storage device or across multiple storage devices.
- Such data may have various characteristics, such as being user data (e.g., a user database file) or metadata (e.g., a volume size of a volume, a network address of a storage controller, a replication policy, and/or other data used by the file system and/or storage controller), for example.
- the characteristics can correspond to hot data (e.g., data that is being accessed above a threshold frequency, such as metadata that is being frequently modified by the file system) or cold data (e.g., user data that is being accessed below the threshold frequency).
- the characteristics can correspond to sequentially accessed data (e.g., data stored within contiguous blocks) or randomly accessed data (e.g., data stored within blocks that are not contiguous).
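As a minimal sketch of the two distinctions above, data could be labelled hot or cold by comparing its access frequency against a threshold, and sequential or random by checking whether the blocks it touches are contiguous. The `AccessStats` type, the function names, and the default threshold value are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AccessStats:
    accesses_per_hour: float   # observed access frequency (assumed unit)
    block_numbers: List[int]   # blocks touched, in access order


def temperature(stats: AccessStats, threshold: float = 100.0) -> str:
    """Hot if accessed above the threshold frequency, otherwise cold."""
    return "hot" if stats.accesses_per_hour > threshold else "cold"


def access_pattern(stats: AccessStats) -> str:
    """Sequential if each block directly follows the previous one."""
    blocks = stats.block_numbers
    contiguous = all(b - a == 1 for a, b in zip(blocks, blocks[1:]))
    return "sequential" if contiguous else "random"
```

A real file system would likely track these statistics over time; the sketch only illustrates the classification rule itself.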
- a virtualization layer can be used as an indirection layer that groups together physical storage from multiple storage devices into what appears to be a single storage object to clients and applications (e.g., a volume or logical unit number (LUN) may span multiple physical storage devices).
- the virtualization layer abstracts away the physical layout of storage, and thus operates in a logical address space that is mapped to the underlying physical address space.
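The indirection described above can be illustrated with a small sketch in which several devices are concatenated into one logical block address space. The `VirtualVolume` class, the simple concatenation layout, and the device names are assumptions made for illustration; an actual virtualization layer may map addresses very differently.

```python
from bisect import bisect_right


class VirtualVolume:
    """Presents several devices as one logical block address space."""

    def __init__(self, device_sizes):
        # device_sizes: {device_name: number_of_blocks}; names are hypothetical
        self.names = list(device_sizes)
        self.starts = []
        total = 0
        for name in self.names:
            self.starts.append(total)   # first logical block on this device
            total += device_sizes[name]
        self.total_blocks = total

    def resolve(self, logical_block):
        """Map a logical block to (device_name, physical_block)."""
        if not 0 <= logical_block < self.total_blocks:
            raise IndexError("logical block out of range")
        i = bisect_right(self.starts, logical_block) - 1
        return self.names[i], logical_block - self.starts[i]
```

Clients only ever see logical block numbers; the device and physical offset stay hidden behind `resolve`.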
- the storage device may assume the role of physically storing data within physical blocks of the storage device in locations chosen by the storage device.
- the storage device may store any type of data, such as hot data, cold data, user data, and metadata together without any logical/physical separation. Unfortunately, data with different characteristics may have different access and overwrite patterns, and thus fragmentation can result when such data is stored together.
- a solid state drive (SSD) may not have the capability to overwrite a previously written block, and can only write to empty destination cells.
- the data may be moved to a different empty cell and the destination cell must be reprogrammed (e.g., erased) so that new data can be written to the destination cell.
- the storage device can reserve space to provide for background garbage collection that can proactively free cells.
- a substantial amount of storage space may be reserved such as about 28% or any other percentage of storage of the storage device. This leads to inefficient usage of storage resources and increased cost due to over-provisioning.
- write amplification becomes problematic on subsequent overwrites, which can lead to degraded performance and wear on the storage device.
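The cost of over-provisioning and write amplification can be made concrete with a short arithmetic sketch. The function names are hypothetical, and the write amplification factor is computed with the commonly used definition (data physically written by the device divided by data written by the host), which the text above does not itself spell out.

```python
def usable_capacity(raw_gb: float, reserved_fraction: float) -> float:
    """Capacity left for user data and metadata after the reserve."""
    return raw_gb * (1.0 - reserved_fraction)


def write_amplification(host_bytes_written: int, device_bytes_written: int) -> float:
    """Ratio of bytes the device physically wrote to bytes the host wrote."""
    return device_bytes_written / host_bytes_written
```

For instance, with the 28% reserve mentioned above, a 1000 GB device would leave roughly 720 GB for user data and metadata.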
- FIG. 1 is a component block diagram illustrating an example clustered network in accordance with one or more of the provisions set forth herein.
- FIG. 2 is a component block diagram illustrating an example data storage system in accordance with one or more of the provisions set forth herein.
- FIG. 3 is a flow chart illustrating an exemplary method of selectively storing data into allocation areas using streams.
- FIG. 4 is a component block diagram illustrating an exemplary computing device for selectively storing data into allocation areas using streams, where allocation areas are defined and policies are assigned to allocation areas.
- FIG. 5 is a component block diagram illustrating an exemplary computing device for selectively storing data into allocation areas using streams, where allocation areas are defined across multiple storage devices.
- FIG. 6 is a flow chart illustrating an exemplary method of selectively storing data into allocation areas using streams.
- FIG. 7 is a component block diagram illustrating an exemplary computing device for selectively storing data into allocation areas using streams.
- FIG. 8 is an example of a computer readable medium in accordance with one or more of the provisions set forth herein.
- One or more techniques and/or computing devices for selectively storing data into allocation areas using streams are provided herein.
- a storage device may be used by a virtualization layer to provide virtualized storage to clients (e.g., the virtualization layer may hide the underlying details of physical storage, and may group physical storage of multiple physical devices into a single storage object exposed to clients and applications). If the storage device does not have a well-defined mapping of logical address space to physical address space, then the storage device will merely store any type of data together. Storing different types of data together over time (e.g., data having different access frequencies, data having different overwrite patterns and frequencies, data of different aggregates, randomly accessed data, sequentially accessed data, hot data, cold data, user data, metadata, etc.) can result in fragmentation of the storage device. Write amplification will also result on subsequent overwrites.
- certain types of storage devices such as a solid state drive over-provision storage (e.g., reserve a percentage of otherwise free storage) for use by garbage collection functionality to proactively free cells of solid state drives.
- overprovisioning wastes storage space that could otherwise be used to store user data and/or metadata.
- data of a write stream is assigned to different streams based upon characteristics of such data.
- for example, frequently accessed data may be assigned to a first stream, infrequently accessed data may be assigned to a second stream, randomly accessed data may be assigned to a third stream, sequentially accessed data may be assigned to a fourth stream, etc., based upon one or more policies specifying that data with different characteristics is to be stored in different allocation areas of a storage device (e.g., within different physical address ranges or virtual block numbers of the storage device).
- the policy and the assignment of data to streams may be implemented by a file system so that data can be stored in separate locations within the storage device even if storage of the storage device is virtualized and/or the storage device does not maintain a well-defined mapping of logical address space to physical address space and thus would otherwise just store all data together or without any discernment.
- Each stream may be tagged with a particular stream identifier assigned by the policy to a corresponding allocation area.
- the policy may specify that frequently accessed data is to be stored in an allocation area (C), and thus the first stream of frequently accessed data is tagged with a stream identifier that is used as an indicator to the storage device that data of the first stream is to be processed (e.g., stored within) using the allocation area (C).
- Storing data with similar characteristics together in the same allocation area and storing data with dissimilar characteristics in separate allocation areas will reduce fragmentation and write amplification for the overall storage device (e.g., frequently overwritten data can be contained within a single allocation area as opposed to being spread across the entire storage device, such that fragmentation from overwrites will not affect the entire storage device; otherwise, fragmentation would result across the entire storage device, especially for a write anywhere file system that writes data to new locations for any write operation). This also improves storage efficiency because a background garbage collection process may not be needed or may use a much smaller reserved area of the storage device for garbage collecting.
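The policy-driven assignment described above might be sketched as follows. Only the pairing of frequently accessed data with the first stream and allocation area (C) comes from the text; the remaining area letters, the `POLICY` table layout, and the function names are hypothetical.

```python
# Hypothetical policy: data characteristic -> stream identifier and allocation area.
POLICY = {
    "frequently_accessed":   {"stream_id": 1, "allocation_area": "C"},
    "infrequently_accessed": {"stream_id": 2, "allocation_area": "A"},  # assumed area
    "randomly_accessed":     {"stream_id": 3, "allocation_area": "B"},  # assumed area
    "sequentially_accessed": {"stream_id": 4, "allocation_area": "D"},  # assumed area
}


def split_into_streams(writes):
    """Group (characteristic, payload) writes by the stream the policy assigns."""
    streams = {}
    for characteristic, payload in writes:
        stream_id = POLICY[characteristic]["stream_id"]
        streams.setdefault(stream_id, []).append(payload)
    return streams


def allocation_area_for(stream_id):
    """Look up which allocation area a tagged stream should be processed in."""
    for rule in POLICY.values():
        if rule["stream_id"] == stream_id:
            return rule["allocation_area"]
    raise KeyError(stream_id)
```

In this sketch the file system side only needs the characteristic of each write; the stream identifier is what carries the placement decision to the storage device.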
- FIG. 1 illustrates an embodiment of a clustered network environment 100 or a network storage environment. It may be appreciated, however, that the techniques, etc. described herein may be implemented within the clustered network environment 100 , a non-cluster network environment, and/or a variety of other computing environments, such as a desktop computing environment. That is, the instant disclosure, including the scope of the appended claims, is not meant to be limited to the examples provided herein. It will be appreciated that where the same or similar components, elements, features, items, modules, etc. are illustrated in later figures but were previously discussed with regard to prior figures, that a similar (e.g., redundant) discussion of the same may be omitted when describing the subsequent figures (e.g., for purposes of simplicity and ease of understanding).
- FIG. 1 is a block diagram illustrating the clustered network environment 100 that may implement at least some embodiments of the techniques and/or systems described herein.
- the clustered network environment 100 comprises data storage systems 102 and 104 that are coupled over a cluster fabric 106 , such as a computing network embodied as a private Infiniband, Fibre Channel (FC), or Ethernet network facilitating communication between the data storage systems 102 and 104 (and one or more modules, components, etc. therein, such as nodes 116 and 118 , for example).
- nodes 116 , 118 comprise storage controllers (e.g., node 116 may comprise a primary or local storage controller and node 118 may comprise a secondary or remote storage controller) that provide client devices, such as host devices 108 , 110 , with access to data stored within data storage devices 128 , 130 .
- clustered networks are not limited to any particular geographic areas and can be clustered locally and/or remotely.
- a clustered network can be distributed over a plurality of storage systems and/or nodes located in a plurality of geographic locations; while in another embodiment a clustered network can include data storage systems (e.g., 102 , 104 ) residing in a same geographic location (e.g., in a single onsite rack of data storage devices).
- one or more host devices 108 , 110 which may comprise, for example, client devices, personal computers (PCs), computing devices used for storage (e.g., storage servers), and other computers or peripheral devices (e.g., printers), are coupled to the respective data storage systems 102 , 104 by storage network connections 112 , 114 .
- A network connection may comprise a local area network (LAN) or wide area network (WAN), for example, that utilizes Network Attached Storage (NAS) protocols, such as a Common Internet File System (CIFS) protocol or a Network File System (NFS) protocol to exchange data packets, a Storage Area Network (SAN) protocol, such as Small Computer System Interface (SCSI) or Fibre Channel Protocol (FCP), an object protocol, such as S3, etc.
- the host devices 108 , 110 may be general-purpose computers running applications, and may interact with the data storage systems 102 , 104 using a client/server model for exchange of information. That is, the host device may request data from the data storage system (e.g., data on a storage device managed by a network storage controller configured to process I/O commands issued by the host device for the storage device), and the data storage system may return results of the request to the host device via one or more storage network connections 112 , 114 .
- the nodes 116 , 118 on clustered data storage systems 102 , 104 can comprise network or host nodes that are interconnected as a cluster to provide data storage and management services, such as to an enterprise having remote locations, cloud storage (e.g., a storage endpoint may be stored within a data cloud), etc., for example.
- a node in the clustered network environment 100 can be a device attached to the network as a connection point, redistribution point or communication endpoint, for example.
- a node may be capable of sending, receiving, and/or forwarding information over a network communications channel, and could comprise any device that meets any or all of these criteria.
- One example of a node may be a data storage and management server attached to a network, where the server can comprise a general purpose computer or a computing device particularly configured to operate as a server in a data storage and management system.
- a first cluster of nodes such as the nodes 116 , 118 (e.g., a first set of storage controllers configured to provide access to a first storage aggregate comprising a first logical grouping of one or more storage devices) may be located on a first storage site.
- a second cluster of nodes may be located at a second storage site (e.g., a second set of storage controllers configured to provide access to a second storage aggregate comprising a second logical grouping of one or more storage devices).
- the first cluster of nodes and the second cluster of nodes may be configured according to a disaster recovery configuration where a surviving cluster of nodes provides switchover access to storage devices of a disaster cluster of nodes in the event a disaster occurs at a disaster storage site comprising the disaster cluster of nodes (e.g., the first cluster of nodes provides client devices with switchover data access to storage devices of the second storage aggregate in the event a disaster occurs at the second storage site).
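The switchover behaviour described above could be sketched as a lookup that returns the owning cluster while it is online and a surviving cluster otherwise. The `Cluster` class, the `online` flag, and the rule of picking the first surviving cluster are assumptions for illustration only.

```python
class Cluster:
    def __init__(self, name, aggregate):
        self.name = name
        self.aggregate = aggregate   # storage aggregate this cluster normally serves
        self.online = True           # set to False when a disaster hits its site


def serving_cluster(aggregate, clusters):
    """Return the cluster that should serve client access to an aggregate."""
    owner = next(c for c in clusters if c.aggregate == aggregate)
    if owner.online:
        return owner
    survivors = [c for c in clusters if c.online]
    if not survivors:
        raise RuntimeError("no surviving cluster available for switchover")
    return survivors[0]              # surviving cluster provides switchover access
```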
- nodes 116 , 118 can comprise various functional components that coordinate to provide distributed storage architecture for the cluster.
- the nodes can comprise network modules 120 , 122 and disk modules 124 , 126 .
- Network modules 120 , 122 can be configured to allow the nodes 116 , 118 (e.g., network storage controllers) to connect with host devices 108 , 110 over the storage network connections 112 , 114 , for example, allowing the host devices 108 , 110 to access data stored in the distributed storage system.
- the network modules 120 , 122 can provide connections with one or more other components through the cluster fabric 106 .
- the network module 120 of node 116 can access a second data storage device by sending a request through the disk module 126 of node 118 .
- Disk modules 124 , 126 can be configured to connect one or more data storage devices 128 , 130 , such as disks or arrays of disks, flash memory, or some other form of data storage, to the nodes 116 , 118 .
- the nodes 116 , 118 can be interconnected by the cluster fabric 106 , for example, allowing respective nodes in the cluster to access data on data storage devices 128 , 130 connected to different nodes in the cluster.
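A rough sketch of the request path described above: a network module serves a read locally when its own node holds the device, and otherwise forwards it over the cluster fabric to the disk module of the node that does. All class and function names, and the dictionary-based device model, are hypothetical.

```python
class Node:
    def __init__(self, name, local_devices):
        self.name = name
        self.local_devices = local_devices   # {device_name: {block: data}}

    def disk_module_read(self, device, block):
        """Disk module: read a block from a locally attached device."""
        return self.local_devices[device][block]


class ClusterFabric:
    def __init__(self, nodes):
        self.nodes = nodes

    def owner_of(self, device):
        """Find the node whose disk module is attached to the device."""
        for node in self.nodes:
            if device in node.local_devices:
                return node
        raise LookupError(device)


def network_module_read(entry_node, fabric, device, block):
    """Network module: serve locally or forward through the fabric."""
    if device in entry_node.local_devices:
        owner = entry_node
    else:
        owner = fabric.owner_of(device)
    return owner.name, owner.disk_module_read(device, block)
```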
- disk modules 124 , 126 communicate with the data storage devices 128 , 130 according to the SAN protocol, such as SCSI or FCP, for example.
- the data storage devices 128 , 130 can appear as locally attached to the operating system. In this manner, different nodes 116 , 118 , etc. may access data blocks through the operating system, rather than expressly requesting abstract files.
- while the clustered network environment 100 illustrates an equal number of network and disk modules, other embodiments may comprise a differing number of these modules.
- there may be a plurality of network and disk modules interconnected in a cluster that does not have a one-to-one correspondence between the network and disk modules. That is, different nodes can have a different number of network and disk modules, and the same node can have a different number of network modules than disk modules.
- a host device 108 , 110 can be networked with the nodes 116 , 118 in the cluster, over the storage networking connections 112 , 114 .
- respective host devices 108 , 110 that are networked to a cluster may request services (e.g., exchanging of information in the form of data packets) of nodes 116 , 118 in the cluster, and the nodes 116 , 118 can return results of the requested services to the host devices 108 , 110 .
- the host devices 108 , 110 can exchange information with the network modules 120 , 122 residing in the nodes 116 , 118 (e.g., network hosts) in the data storage systems 102 , 104 .
- the data storage devices 128 , 130 comprise volumes 132 , which are an implementation of storage of information onto disk drives or disk arrays or other storage (e.g., flash) as a file-system for data, for example.
- a disk array can include all traditional hard drives, all flash drives, or a combination of traditional hard drives and flash drives.
- Volumes can span a portion of a disk, a collection of disks, or portions of disks, for example, and typically define an overall logical arrangement of file storage on disk space in the storage system.
- a volume can comprise stored data as one or more files that reside in a hierarchical directory structure within the volume.
- Volumes are typically configured in formats that may be associated with particular storage systems, and respective volume formats typically comprise features that provide functionality to the volumes, such as providing an ability for volumes to form clusters. For example, where a first storage system may utilize a first format for their volumes, a second storage system may utilize a second format for their volumes.
- the host devices 108 , 110 can utilize the data storage systems 102 , 104 to store and retrieve data from the volumes 132 .
- the host device 108 can send data packets to the network module 120 in the node 116 within data storage system 102 .
- the node 116 can forward the data to the data storage device 128 using the disk module 124 , where the data storage device 128 comprises volume 132 A.
- the host device can access the volume 132 A, to store and/or retrieve data, using the data storage system 102 connected by the storage network connection 112 .
- the host device 110 can exchange data with the network module 122 in the node 118 within the data storage system 104 (e.g., which may be remote from the data storage system 102 ).
- the node 118 can forward the data to the data storage device 130 using the disk module 126 , thereby accessing volume 132 B associated with the data storage device 130 .
- allocation areas may be defined within the data storage device 128 and/or the data storage device 130 .
- Data may be selectively sent through streams to the data storage device 128 and/or the data storage device 130 .
- the streams may be tagged with stream identifiers corresponding to allocation areas from which such streams are to be processed.
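On the device side, the tagging described above could be sketched as a storage device that keeps one block range per stream identifier and places each tagged write inside the matching range. The `StorageDevice` class, the block ranges, and the append-only placement are assumptions, not the disclosed implementation.

```python
class StorageDevice:
    def __init__(self, areas):
        # areas: {stream_id: (first_block, last_block)}, a hypothetical layout
        self.areas = areas
        self.next_free = {sid: first for sid, (first, _) in areas.items()}
        self.blocks = {}

    def write(self, stream_id, data):
        """Store data in the allocation area that the stream identifier selects."""
        _, last = self.areas[stream_id]
        block = self.next_free[stream_id]
        if block > last:
            raise RuntimeError("allocation area full")
        self.blocks[block] = data
        self.next_free[stream_id] = block + 1
        return block
```

Because each stream only ever advances within its own range, overwrite-heavy data in one stream cannot fragment the range used by another.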
- selectively storing data into allocation areas using streams may be implemented for and/or between any type of computing environment, and may be transferrable between physical devices (e.g., node 116 , node 118 , a desktop computer, a tablet, a laptop, a wearable device, a mobile device, a storage device, a server, etc.) and/or a cloud computing environment (e.g., remote to the clustered network environment 100 ).
- FIG. 2 is an illustrative example of a data storage system 200 (e.g., 102 , 104 in FIG. 1 ), providing further detail of an embodiment of components that may implement one or more of the techniques and/or systems described herein.
- the data storage system 200 comprises a node 202 (e.g., nodes 116 , 118 in FIG. 1 ), and a data storage device 234 (e.g., data storage devices 128 , 130 in FIG. 1 ).
- the node 202 may be a general purpose computer, for example, or some other computing device particularly configured to operate as a storage server.
- a host device 205 (e.g., 108 , 110 in FIG. 1 ) can be connected to the node 202 over a network 216 , for example, to provide access to files and/or other data stored on the data storage device 234 .
- the node 202 comprises a storage controller that provides client devices, such as the host device 205 , with access to data stored within data storage device 234 .
- the data storage device 234 can comprise mass storage devices, such as disks 224 , 226 , 228 of a disk array 218 , 220 , 222 . It will be appreciated that the techniques and systems, described herein, are not limited by the example embodiment.
- disks 224 , 226 , 228 may comprise any type of mass storage devices, including but not limited to magnetic disk drives, flash memory, and any other similar media adapted to store information, including, for example, data (D) and/or parity (P) information.
- the node 202 comprises one or more processors 204 , a memory 206 , a network adapter 210 , a cluster access adapter 212 , and a storage adapter 214 interconnected by a system bus 242 .
- the data storage system 200 also includes an operating system 208 installed in the memory 206 of the node 202 that can, for example, implement a Redundant Array of Independent (or Inexpensive) Disks (RAID) optimization technique to optimize a reconstruction process of data of a failed disk in an array.
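The text above does not say which RAID scheme is used, so the following sketch assumes a simple single-parity layout in which the parity block is the bytewise XOR of the data blocks. Under that assumption, the block of a failed disk can be rebuilt by XOR-ing the surviving data blocks with the parity block. Function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity(data_blocks):
    """Parity (P) block for a stripe of data (D) blocks."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks, parity_block):
    """Rebuild the single missing data block of a stripe."""
    return xor_blocks(list(surviving_blocks) + [parity_block])
```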
- the operating system 208 can also manage communications for the data storage system, and communications between other data storage systems that may be in a clustered network, such as attached to a cluster fabric 215 (e.g., 106 in FIG. 1 ).
- the node 202 , such as a network storage controller, can respond to host device requests to manage data on the data storage device 234 (e.g., or additional clustered devices) in accordance with these host device requests.
- the operating system 208 can often establish one or more file systems on the data storage system 200 , where a file system can include software code and data structures that implement a persistent hierarchical namespace of files and directories, for example.
- the operating system 208 is informed where, in an existing directory tree, new files associated with the new data storage device are to be stored. This is often referred to as “mounting” a file system.
- memory 206 can include storage locations that are addressable by the processors 204 and adapters 210 , 212 , 214 for storing related software application code and data structures.
- the processors 204 and adapters 210 , 212 , 214 may, for example, include processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures.
- the operating system 208 , portions of which are typically resident in the memory 206 and executed by the processing elements, functionally organizes the storage system by, among other things, invoking storage operations in support of a file service implemented by the storage system.
- the network adapter 210 includes the mechanical, electrical and signaling circuitry needed to connect the data storage system 200 to a host device 205 over a network 216 , which may comprise, among other things, a point-to-point connection or a shared medium, such as a local area network.
- the host device 205 (e.g., 108 , 110 of FIG. 1 ) may be a general-purpose computer configured to execute applications. As described above, the host device 205 may interact with the data storage system 200 in accordance with a client/host model of information delivery.
- the storage adapter 214 cooperates with the operating system 208 executing on the node 202 to access information requested by the host device 205 (e.g., access data on a storage device managed by a network storage controller).
- the information may be stored on any type of attached array of writeable media such as magnetic disk drives, flash memory, and/or any other similar media adapted to store information.
- the information can be stored in data blocks on the disks 224 , 226 , 228 .
- the storage adapter 214 can include input/output (I/O) interface circuitry that couples to the disks over an I/O interconnect arrangement, such as a storage area network (SAN) protocol (e.g., Small Computer System Interface (SCSI), iSCSI, hyperSCSI, Fiber Channel Protocol (FCP)).
- the information is retrieved by the storage adapter 214 and, if necessary, processed by the one or more processors 204 (or the storage adapter 214 itself) prior to being forwarded over the system bus 242 to the network adapter 210 (and/or the cluster access adapter 212 if sending to another node in the cluster) where the information is formatted into a data packet and returned to the host device 205 over the network 216 (and/or returned to another node attached to the cluster over the cluster fabric 215 ).
- storage of information on disk arrays 218 , 220 , 222 can be implemented as one or more storage volumes 230 , 232 that are comprised of a cluster of disks 224 , 226 , 228 defining an overall logical arrangement of disk space.
- the disks 224 , 226 , 228 that comprise one or more volumes are typically organized as one or more groups of RAIDs.
- volume 230 comprises an aggregate of disk arrays 218 and 220 , which comprise the cluster of disks 224 and 226 .
- the operating system 208 may implement a file system (e.g., write anywhere file system) that logically organizes the information as a hierarchical structure of directories and files on the disks.
- respective files may be implemented as a set of disk blocks configured to store information.
- directories may be implemented as specially formatted files in which information about other files and directories is stored.
- data can be stored as files within physical and/or virtual volumes, which can be associated with respective volume identifiers, such as file system identifiers (FSIDs), which can be 32-bits in length in one example.
- a physical volume corresponds to at least a portion of physical storage devices whose address, addressable space, location, etc. doesn't change, such as at least some of one or more data storage devices 234 (e.g., a Redundant Array of Independent (or Inexpensive) Disks (RAID system)).
- the location of the physical volume doesn't change in that the (range of) address(es) used to access it generally remains constant.
- a virtual volume in contrast, is stored over an aggregate of disparate portions of different physical storage devices.
- the virtual volume may be a collection of different available portions of different physical storage device locations, such as some available space from each of the disks 224 , 226 , and/or 228 . It will be appreciated that since a virtual volume is not “tied” to any one particular storage device, a virtual volume can be said to include a layer of abstraction or virtualization, which allows it to be resized and/or flexible in some regards.
- a virtual volume can include one or more logical unit numbers (LUNs) 238 , directories 236 , Qtrees 235 , and files 240 .
- these features allow the disparate memory locations within which data is stored to be identified, for example, and grouped as a data storage unit.
- the LUNs 238 may be characterized as constituting a virtual disk or drive upon which data within the virtual volume is stored within the aggregate.
- LUNs are often referred to as virtual drives, such that they emulate a hard drive from a general purpose computer, while they actually comprise data blocks stored in various parts of a volume.
- one or more data storage devices 234 can have one or more physical ports, wherein each physical port can be assigned a target address (e.g., SCSI target address).
- a target address on the data storage device can be used to identify one or more LUNs 238 .
- a connection between the node 202 and the one or more LUNs 238 underlying the volume is created.
- respective target addresses can identify multiple LUNs, such that a target address can represent multiple volumes.
- the I/O interface which can be implemented as circuitry and/or software in the storage adapter 214 or as executable code residing in memory 206 and executed by the processors 204 , for example, can connect to volume 230 by using one or more addresses that identify the one or more LUNs 238 .
- allocation areas may be defined within the one or more data storage devices 234 .
- Data may be selectively sent through streams to the one or more data storage devices 234 .
- the streams may be tagged with stream identifiers corresponding to allocation areas from which such streams are to be processed.
- selectively storing data into allocation areas using streams may be implemented for and/or between any type of computing environment, and may be transferrable between physical devices (e.g., node 202 , host device 205 , a desktop computer, a tablet, a laptop, a wearable device, a mobile device, a storage device, a server, etc.) and/or a cloud computing environment (e.g., remote to the node 202 and/or the host device 205 ).
- a first region of a storage device may be defined as a first allocation area.
- a second region of the storage device may be defined as a second allocation area. It may be appreciated that any number of allocation areas may be defined for the storage device and/or that a single allocation area may span across any number and types of storage devices (e.g., an allocation area spanning a first portion of first storage media, a second portion of second storage media, etc.; an allocation area spanning a first disk, a second disk, and a parity disk of a RAID configuration; etc.).
- the storage device may comprise any type of storage device, such as a solid state device, a flash device, a partitioned storage device, a storage device lacking a straight forward mapping of a logical address space to a physical address space, a storage device used by an indirection layer such as a virtualization layer that virtualizes storage of the storage device, etc.
- allocation areas may be defined as integer multiples of erase block units of the storage device, such as of a solid state device (e.g., data is written to flash memory in page units comprised of multiple cells, and the flash memory can only be erased in larger units referred to as block units comprised of multiple page units).
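- as a rough illustration of sizing an allocation area as an integer multiple of erase block units, the following Python sketch rounds a requested size up to a whole number of erase blocks. The page size, the pages-per-block count, and the function name are assumptions chosen for illustration and are not taken from any particular device.

```python
PAGE_SIZE = 4096        # assumed page size in bytes (hypothetical)
PAGES_PER_BLOCK = 64    # assumed number of pages per erase block (hypothetical)
ERASE_BLOCK_SIZE = PAGE_SIZE * PAGES_PER_BLOCK


def align_to_erase_blocks(requested_bytes: int,
                          erase_block_size: int = ERASE_BLOCK_SIZE) -> int:
    """Round a requested allocation area size up to a whole number of erase blocks."""
    if requested_bytes <= 0:
        raise ValueError("requested size must be positive")
    blocks = -(-requested_bytes // erase_block_size)  # ceiling division
    return blocks * erase_block_size
```

- with these assumed values, a request of one byte more than a single erase block yields an allocation area of two erase blocks, so no erase block is shared between two allocation areas.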
- a negotiation may be facilitated with the storage device (e.g., by a file system) to specify that a first stream identifier will be used as a first indicator for the storage device to indicate that streams tagged with the first stream identifier are to be processed using the first allocation area (e.g., data of a stream tagged with the first stream identifier is to be stored within the first allocation area by the storage device and not stored within other allocation areas).
- a negotiation may be facilitated with the storage device (e.g., by the file system) to specify that a second stream identifier will be used as a second indicator for the storage device to indicate that streams tagged with the second stream identifier are to be processed using the second allocation area. In this way, the storage device will agree to process streams using allocation areas corresponding to stream identifiers used to tag such streams by the file system.
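- one way to picture the outcome of such a negotiation is a table, held by the storage device, that binds each stream identifier to an allocation area. The sketch below is a toy model only: the `StorageDevice` class, its `negotiate` and `write` methods, and the area names are hypothetical and do not represent any real device interface.

```python
class StorageDevice:
    """Toy stand-in for a device that honors negotiated stream identifiers."""

    def __init__(self):
        self.stream_to_area = {}  # stream identifier -> allocation area name
        self.areas = {}           # allocation area name -> stored payloads

    def negotiate(self, stream_id: int, area: str) -> bool:
        """Agree that streams tagged with stream_id are processed using area."""
        if stream_id in self.stream_to_area:
            return False  # identifier is already bound to an allocation area
        self.stream_to_area[stream_id] = area
        self.areas.setdefault(area, [])
        return True

    def write(self, stream_id: int, payload: bytes) -> str:
        """Store payload in the area bound to stream_id and return the area name."""
        area = self.stream_to_area[stream_id]  # KeyError if never negotiated
        self.areas[area].append(payload)
        return area


device = StorageDevice()
device.negotiate(1, "first_allocation_area")
device.negotiate(2, "second_allocation_area")
```

- after the two `negotiate` calls, a write tagged with identifier 1 lands only in the first allocation area, mirroring the agreement described above.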
- a policy may be maintained that specifies that data with certain characteristics is to be processed using certain allocation areas.
- the policy may specify that data with a first characteristic is to be processed using the first allocation area (e.g., such data is to be stored and read from the first allocation area) and that data with a second characteristic is to be processed using the second allocation area (e.g., such data is to be stored and read from the second allocation area).
- it may be appreciated that a single policy may be specified for a single characteristic or for multiple characteristics (e.g., the policy specifies where to store hot data, where to store cold data, where to store randomly accessed data, where to store sequentially accessed data), and/or that one or more policies may be specified for individual characteristics or pairings of characteristics (e.g., a first policy for hot data and cold data, a second policy for user data and metadata, etc.). Policies may be assigned to the allocation areas to which they apply, and may specify the stream identifiers of those allocation areas.
- the first characteristic may correspond to a user data characteristic and the second characteristic may correspond to a metadata characteristic (e.g., metadata may be overwritten more frequently than user data, and thus has a different access pattern and should be stored separately).
- the first characteristic may correspond to a first data frequency access characteristic and the second characteristic may correspond to a second data frequency access characteristic (e.g., more frequently accessed data such as hot data may be stored within a different allocation area than less frequently accessed data such as cold data).
- the first characteristic may correspond to a sequential access characteristic and the second characteristic may correspond to a random access characteristic (e.g., sequentially accessed data may be stored within a different allocation area than randomly accessed data).
- the first characteristic may correspond to a first storage aggregate characteristic and the second characteristic may correspond to a second storage aggregate characteristic (e.g., data of a first storage aggregate provided to a first client may be stored within a different allocation area than data of a second storage aggregate provided to a second client). It may be appreciated that a variety of other characteristics may be defined within the policy.
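- a policy of this kind can be modeled as a lookup from a data characteristic to an allocation area and its stream identifier. The following is a minimal sketch under that assumption; `PolicyRule`, `POLICY`, `rule_for`, and the identifier values are illustrative names, not part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    characteristic: str   # e.g. "user_data", "metadata", "hot", "cold"
    allocation_area: str  # where data with this characteristic is processed
    stream_id: int        # identifier used to tag streams for that area


# Hypothetical policy with one rule per characteristic.
POLICY = {
    rule.characteristic: rule
    for rule in (
        PolicyRule("user_data", "first_allocation_area", 1),
        PolicyRule("metadata", "second_allocation_area", 2),
    )
}


def rule_for(characteristic: str) -> PolicyRule:
    """Return the policy rule for a characteristic, or raise if none is defined."""
    try:
        return POLICY[characteristic]
    except KeyError:
        raise LookupError(
            f"no policy covers characteristic {characteristic!r}"
        ) from None
```

- other characteristic pairings (hot/cold, sequential/random, per-aggregate) would be added as further rules in the same table.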
- a set of allocation areas are defined for the storage device.
- Policies may be assigned to allocation areas of the set of allocation areas.
- the set of allocation areas are dynamically sorted (e.g., sorted and/or resorted on-the-fly as write streams are received by the file system for processing) as a sorted set of allocation areas based upon the policies, amounts of available free space of each allocation area, and/or other sorting criteria (e.g., if user data can be stored within the first allocation area and a fifth allocation area, then the allocation area with more available storage space may be ranked higher and thus used).
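- the sorting step can be sketched as ranking allocation areas first by whether a policy allows the data's characteristic and then by available free space. This is only one possible ordering consistent with the example above; the `AllocationArea` class, the `accepts` field, and the sample values are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class AllocationArea:
    name: str
    free_blocks: int
    accepts: set = field(default_factory=set)  # characteristics allowed by policy


def sort_areas(areas, characteristic):
    """Rank areas for one characteristic: eligible first, then most free space."""
    return sorted(
        areas,
        key=lambda a: (characteristic in a.accepts, a.free_blocks),
        reverse=True,
    )


areas = [
    AllocationArea("first", free_blocks=10, accepts={"user_data"}),
    AllocationArea("second", free_blocks=90, accepts={"metadata"}),
    AllocationArea("fifth", free_blocks=40, accepts={"user_data"}),
]
ranked = sort_areas(areas, "user_data")
```

- here the fifth allocation area ranks above the first because both accept user data and the fifth has more free space, while the second ranks last despite having the most free space because no policy allows user data there. Re-running `sort_areas` as free space changes gives the on-the-fly resorting described above.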
- the set of policies are used to assign data to streams based upon characteristics of the data.
- the set of policies are also used to tag streams with appropriate stream identifiers (e.g., a policy may indicate that metadata is to be stored within a third allocation area, and thus the policy is used to assign metadata of a write stream into a stream and the policy is used to tag the stream with a stream identifier of the third allocation area).
- a write stream of data to write to the storage device is received.
- a file system receives the write stream.
- Characteristics of the data may be identified, such as user data, metadata, and/or other types of data such as randomly accessed data.
- the sorted set of allocation areas may be evaluated to identify allocation areas that are to be used to process the user data and the metadata.
- a policy may specify that a first allocation area and/or other allocation areas are to be used for processing user data.
- the policy or a different policy may specify that a second allocation area and/or other allocation areas are to be used for processing metadata.
- a target allocation area may be selected from the sorted set of allocation areas for storing the user data based upon the target allocation area having a sorted rank above a threshold in relation to user data (e.g., a highest rank of allocation areas that can be used for storing user data, such as the first allocation area).
- a target allocation area may be selected from the sorted set of allocation areas for storing the metadata based upon the target allocation area having a sorted rank above the threshold in relation to metadata (e.g., a highest rank of allocation areas that can be used for storing metadata, such as the second allocation area).
- data of the write stream may be provided to the storage device through streams tagged with stream identifiers of corresponding allocation areas.
- the user data may be assigned to a first stream.
- the first stream may be tagged with the first stream identifier for the first allocation area that is to be used for processing user data.
- the metadata may be assigned to a second stream.
- the second stream may be tagged with the second stream identifier for the second allocation area that is to be used for processing metadata. In this way, when the storage device receives the second stream, the storage device will know to process the metadata of the second stream using the second allocation area based upon the second stream identifier.
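- the splitting of a write stream into tagged streams might look like the sketch below, which assumes the characteristic of each item has already been identified and that stream identifiers were negotiated beforehand. `STREAM_IDS` and `split_write_stream` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical mapping negotiated beforehand: characteristic -> stream identifier.
STREAM_IDS = {"user_data": 1, "metadata": 2}


def split_write_stream(write_stream):
    """Group (characteristic, payload) items into streams keyed by stream identifier."""
    streams = defaultdict(list)
    for characteristic, payload in write_stream:
        streams[STREAM_IDS[characteristic]].append(payload)
    return dict(streams)


write_stream = [
    ("user_data", b"document contents"),
    ("metadata", b"volume size"),
    ("user_data", b"more contents"),
]
tagged = split_write_stream(write_stream)
```

- each key of `tagged` is the stream identifier the storage device would use to select the allocation area, and the order of payloads within a stream is preserved.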
- An allocation area may be determined to have an amount of free space below a threshold.
- a policy for the allocation area can be terminated (e.g., automatically terminated or a suggestion may be provided to a storage administrator for terminating the policy).
- the allocation area may be redefined to increase the amount of free space, and the policy may be retained for the allocation area.
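- the free-space check and its two outcomes can be expressed as a small decision function. The threshold value, the `can_grow` flag, and the returned action names in this sketch are illustrative assumptions rather than values from the description.

```python
def handle_low_free_space(free_blocks, total_blocks, threshold=0.10, can_grow=False):
    """Return the action for an allocation area given its free-space ratio.

    threshold and can_grow are illustrative parameters, not values from the text.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    if free_blocks / total_blocks >= threshold:
        return "keep_policy"
    # Below the threshold: enlarge the area and keep the policy, or end the policy.
    return "redefine_area" if can_grow else "terminate_policy"
```

- a caller could act on `"terminate_policy"` automatically or surface it as a suggestion to a storage administrator, as noted above.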
- FIG. 4 illustrates an example of a system 400 for selectively storing data into allocation areas using streams.
- a set of allocation areas may be defined for a storage device 416 , such as for storage media 418 of the storage device 416 .
- a first allocation area 420 may be defined as encompassing a first block range such as from a virtual block number ( 0 ) to a virtual block number ( 49 ).
- a second allocation area 422 may be defined as encompassing a second block range such as from a virtual block number ( 50 ) to a virtual block number ( 99 ).
- a third allocation area 424 may be defined as encompassing a third block range such as from a virtual block number ( 100 ) to a virtual block number ( 149 ).
- it may be appreciated that allocation areas may have the same or different sizes as one another, and that the entire storage space or merely a portion of the storage space of the storage device 416 may be used for defining allocation areas. In this way, any number of allocation areas may be defined for the storage device 416 .
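- using the block ranges from this example, the allocation area containing a given virtual block number can be found with a binary search over the range starts. The sketch assumes non-overlapping ranges sorted by start; `AREAS`, `STARTS`, and `area_for_vbn` are illustrative names.

```python
from bisect import bisect_right

# Block ranges from the example: (first VBN, last VBN, area label).
AREAS = [
    (0, 49, "first allocation area 420"),
    (50, 99, "second allocation area 422"),
    (100, 149, "third allocation area 424"),
]
STARTS = [start for start, _, _ in AREAS]


def area_for_vbn(vbn: int):
    """Return the label of the allocation area containing vbn, or None."""
    index = bisect_right(STARTS, vbn) - 1
    if index < 0:
        return None
    start, end, label = AREAS[index]
    return label if start <= vbn <= end else None
```

- because the explicit end of each range is checked, the lookup also works when only a portion of the storage space is covered by allocation areas or when the areas differ in size.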
- Policies 402 may be assigned to allocation areas.
- a first policy 404 may be assigned to the first allocation area 420 and/or other allocation areas.
- the first policy 404 may specify that hot data (e.g., data that is accessed above a threshold frequency) is to be stored within the first allocation area 420 and/or the other allocation areas.
- a second policy 406 may be assigned to the second allocation area 422 and/or other allocation areas.
- the second policy 406 may specify that cold data (e.g., data that is accessed below the threshold frequency) is to be stored within the second allocation area 422 and/or the other allocation areas.
- a third policy 408 may be assigned to the third allocation area 424 and/or other allocation areas.
- the third policy 408 may specify that user data (e.g., a user text document) is to be stored within the third allocation area 424 and/or the other allocation areas.
- a fourth policy 410 may be assigned to a fourth allocation area and/or other allocation areas.
- the fourth policy 410 may specify that metadata (e.g., metadata maintained by a storage file system, such as volume size information, partner storage controller information, replication policy information, backup policies, etc.) is to be stored within a fourth allocation area and/or the other allocation areas.
- a fifth policy 412 may be assigned to a fifth allocation area and/or other allocation areas.
- the fifth policy 412 may specify that randomly accessed data is to be stored within a fifth allocation area and/or the other allocation areas.
- a sixth policy 414 may be assigned to a sixth allocation area and/or other allocation areas.
- the sixth policy 414 may specify that sequentially accessed data is to be stored within a sixth allocation area and/or the other allocation areas.
- a seventh policy may be assigned to a seventh allocation area and/or other allocation areas.
- the seventh policy may specify that data of a first aggregate is to be stored within a seventh allocation area and/or the other allocation areas.
- An eighth policy may be assigned to an eighth allocation area and/or other allocation areas.
- the eighth policy may specify that data of a second aggregate is to be stored within an eighth allocation area and/or the other allocation areas.
- it may be appreciated that multiple policies may be assigned to a single allocation area (e.g., a ninth policy specifying that hot data can be stored within a seventh allocation area and a tenth policy specifying that metadata can be stored within the seventh allocation area), and that a policy may be assigned to more than one allocation area (e.g., a policy specifying that hot data can be stored within the first allocation area 420 , the seventh allocation area, and a ninth allocation area).
- a policy may apply to a single classification of data (e.g., hot data) or may apply to multiple classifications of data (e.g., a policy specifying where to store hot data, where to store cold data, where to store user data, etc.).
- FIG. 5 illustrates an example of a system 500 for selectively storing data into allocation areas using streams.
- Allocation areas may be defined across multiple storage devices 502 .
- a first allocation area 512 may be defined across first portions of a first storage device 504 , a second storage device 506 , a third storage device 508 , a parity storage device 510 , and/or other storage devices (e.g., storage devices having a RAID configuration).
- data within a stream that is tagged with a stream identifier associated with the first allocation area 512 , may be stored within the first allocation area 512 such as stored across one or more of the first storage device 504 , the second storage device 506 , the third storage device 508 , and/or the parity storage device 510 .
- a second allocation area 514 may be defined across second portions of the first storage device 504 , the second storage device 506 , the third storage device 508 , the parity storage device 510 , and/or other storage devices. In this way, data, within a stream that is tagged with a stream identifier associated with the second allocation area 514 , may be stored within the second allocation area 514 such as stored across one or more of the first storage device 504 , the second storage device 506 , the third storage device 508 , and/or the parity storage device 510 .
- a third allocation area 516 may be defined across third portions of the first storage device 504 , the second storage device 506 , the third storage device 508 , the parity storage device 510 , and/or other storage devices. In this way, data, within a stream that is tagged with a stream identifier associated with the third allocation area 516 , may be stored within the third allocation area 516 such as stored across one or more of the first storage device 504 , the second storage device 506 , the third storage device 508 , and/or the parity storage device 510 .
- it may be appreciated that allocation areas, whether considered individually or as a set, may be defined within a single storage device or across any number of storage devices.
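- to make the idea of an allocation area spanning several devices concrete, the sketch below computes the portion of each device that belongs to a given allocation area. It assumes, purely for illustration, that every allocation area occupies an equal-sized portion at the same block offset on each device; the description does not require this layout, and `area_extents` is a hypothetical helper.

```python
DEVICES = [
    "first storage device 504",
    "second storage device 506",
    "third storage device 508",
    "parity storage device 510",
]


def area_extents(area_index: int, blocks_per_portion: int, devices=DEVICES):
    """Return {device: (first_block, last_block)} for the area's portion on each device."""
    if area_index < 0 or blocks_per_portion <= 0:
        raise ValueError("area_index must be >= 0 and blocks_per_portion > 0")
    first = area_index * blocks_per_portion
    last = first + blocks_per_portion - 1
    return {device: (first, last) for device in devices}
```

- with 100 blocks per portion, index 0 would correspond to the first portions (blocks 0 to 99 on every device), index 1 to the second portions, and so on.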
- first data and second data may be received.
- a file system of a storage controller may receive a write stream comprising the first data and the second data from client applications.
- the file system may be associated with an indirection layer, such as a virtualization layer that virtualizes data of a storage device into which the first data and the second data are to be written.
- a policy may define a first characteristic as a user data characteristic, a second characteristic as a metadata characteristic, a third characteristic as a hot data characteristic, a fourth characteristic as a cold data characteristic, etc.
- the first data may be identified as having the first characteristic defined within a policy (e.g., the first data is user data in a user database).
- the second data may be identified as having the second characteristic defined within the policy (e.g., the second data is metadata used by a storage controller to manage replication of the user database). In this way, the first data may be identified as user data and the second data may be identified as metadata.
- the first data is assigned to a first stream.
- the first stream is tagged with a first stream identifier specified by the policy for the first characteristic of user data.
- the first stream identifier is associated with a first allocation area, of the storage device, that is defined by the policy for storing user data (e.g., the file system and the storage device may have negotiated to determine that the first stream identifier would be used to tag streams of user data that is to be processed using the first allocation area).
- the second data is assigned to a second stream.
- the second stream is tagged with a second stream identifier specified by the policy for the second characteristic of metadata.
- the second stream identifier is associated with a second allocation area, of the storage device, that is defined by the policy for storing metadata (e.g., the file system and the storage device may have negotiated to determine that the second stream identifier would be used to tag streams of metadata that is to be processed using the second allocation area).
- the first stream is sent to the storage device for writing the first data of user data to the first allocation area based upon the first stream being tagged with the first stream identifier.
- user data may be selectively stored within the first allocation area and not in other allocation areas that are not designated for user data.
- the second stream is sent to the storage device for writing the second data of metadata to the second allocation area based upon the second stream being tagged with the second stream identifier.
- metadata may be selectively stored within the second allocation area and not in other allocation areas that are not designated for metadata.
- FIG. 7 illustrates an example of a system 700 for selectively storing data into allocation areas using streams.
- a file system 702 or any other hardware or software module may define one or more allocation areas within a storage device 712 .
- a first allocation area 714 may be defined for a first block range (e.g., a first range of virtual block numbers) of the storage device 712 .
- a second allocation area 716 may be defined for a second block range (e.g., a second range of virtual block numbers) of the storage device 712 .
- a third allocation area 718 may be defined for a third block range (e.g., a third range of virtual block numbers) of the storage device 712 .
- a plurality of allocation areas may be defined within the storage device 712 and/or across other storage devices.
- the file system 702 may negotiate with the storage device 712 to determine stream identifiers that the file system 702 will use to tag streams of data.
- a stream identifier will be an indicator to the storage device 712 that data of a stream tagged with the stream identifier is to be processed (e.g., stored) within a corresponding allocation area.
- a first stream identifier may be specified for the first allocation area 714 .
- a second stream identifier 724 may be specified for the second allocation area 716 .
- a third stream identifier 726 may be specified for the third allocation area 718 .
- the file system 702 may assign policies 704 to allocation areas. For example, a policy may be assigned to the second allocation area 716 and/or the third allocation area 718 .
- the policy may specify that hot data (e.g., data accessed at a frequency greater than a threshold) is to be stored within the second allocation area 716 and that streams of hot data are to be tagged with the second stream identifier 724 specified for the second allocation area 716 . In this way, hot data will be stored/contained within the second allocation area 716 and not in other allocation areas. Thus, fragmentation resulting from frequent access to the hot data may be contained within the second allocation area 716 and will not introduce additional fragmentation to other allocation areas.
- the policy may specify that cold data (e.g., data accessed at a frequency below the threshold) is to be stored within the third allocation area 718 and that streams of cold data are to be tagged with the third stream identifier 726 specified for the third allocation area 718 .
- cold data will be stored/contained within the third allocation area 718 and not in other allocation areas.
- garbage collection techniques and/or other techniques that move valid data from destination cells to free cells so that new data can be written to those destination cells are not needlessly moving the cold data around.
- Allocation areas may be sorted into a sorted set of allocation areas.
- the allocation areas may be sorted based upon the policies 704 , available free space, and/or other sorting criteria. For example, a policy may specify that randomly accessed data can be stored within the second allocation area 716 and the third allocation area 718 .
- allocation areas may be dynamically sorted in relation to a current scenario of storing randomly accessed data within the storage device 712 .
- the first allocation area 714 may be ranked below the second allocation area 716 and below the third allocation area 718 based upon the policy specifying that the second allocation area 716 and the third allocation area 718 , but not the first allocation area 714 , are to be used for storing randomly accessed data.
- the second allocation area 716 may be ranked higher than the third allocation area 718 based upon the second allocation area 716 having more available free space than the third allocation area 718 .
- the file system 702 may receive a write stream 706 .
- the write stream 706 may comprise hot data 708 (e.g., data accessed at a frequency greater than a threshold), cold data 710 (e.g., data accessed at a frequency below the threshold), and/or data having other characteristics.
- the file system 702 may utilize the policies 704 to determine (e.g., to sort allocation areas and select a highest ranked allocation area) that the second allocation area 716 is to be used for storing the hot data 708 .
- the file system 702 may assign the hot data 708 to a first stream 720 .
- the file system 702 may tag the first stream 720 with the second stream identifier 724 of the second allocation area 716 .
- the first stream 720 is provided to the storage device 712 .
- the storage device 712 will process the hot data 708 of the first stream 720 using the second allocation area 716 based upon the first stream 720 being tagged with the second stream identifier 724 .
- the file system 702 may utilize the policies 704 to determine (e.g., to sort allocation areas and select a highest ranked allocation area) that the third allocation area 718 is to be used for storing the cold data 710 .
- the file system 702 may assign the cold data 710 to a second stream 722 .
- the file system 702 may tag the second stream 722 with the third stream identifier 726 of the third allocation area 718 . In this way, the second stream 722 is provided to the storage device 712 .
- the storage device 712 will process the cold data 710 of the second stream 722 using the third allocation area 718 based upon the second stream 722 being tagged with the third stream identifier 726 .
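- the hot/cold flow of this example can be sketched as follows. The reference numerals 724 and 726 are reused as integer identifier values only for readability; the threshold value, the treatment of data exactly at the threshold as cold, and the function names are assumptions.

```python
HOT_THRESHOLD = 100  # assumed accesses per day; illustrative only

SECOND_STREAM_ID = 724  # tags streams bound for the second allocation area 716
THIRD_STREAM_ID = 726   # tags streams bound for the third allocation area 718


def tag_for(access_frequency: float) -> int:
    """Pick the stream identifier: hot data -> 724, everything else -> 726."""
    if access_frequency > HOT_THRESHOLD:
        return SECOND_STREAM_ID
    return THIRD_STREAM_ID  # assumption: data exactly at the threshold is cold


def build_streams(items):
    """items: iterable of (name, access_frequency). Returns {stream_id: [names]}."""
    streams = {SECOND_STREAM_ID: [], THIRD_STREAM_ID: []}
    for name, frequency in items:
        streams[tag_for(frequency)].append(name)
    return streams
```

- for instance, `build_streams([("a", 500), ("b", 3)])` places `"a"` in the stream tagged 724 and `"b"` in the stream tagged 726, matching the first stream 720 and the second stream 722 above.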
- Still another embodiment involves a computer-readable medium comprising processor-executable instructions configured to implement one or more of the techniques presented herein.
- An example embodiment of a computer-readable medium or a computer-readable device that is devised in these ways is illustrated in FIG. 8 , wherein the implementation 800 comprises a computer-readable medium 808 , such as a compact disc-recordable (CD-R), a digital versatile disc-recordable (DVD-R), flash drive, a platter of a hard disk drive, etc., on which is encoded computer-readable data 806 .
- This computer-readable data 806, such as binary data comprising at least one of a zero or a one, in turn comprises processor-executable computer instructions 804 configured to operate according to one or more of the principles set forth herein.
- The processor-executable computer instructions 804 are configured to perform a method 802, such as at least some of the exemplary method 300 of FIG. 3 and/or at least some of the exemplary method 600 of FIG. 6, for example.
- The processor-executable computer instructions 804 are configured to implement a system, such as at least some of the exemplary system 400 of FIG. 4, at least some of the exemplary system 500 of FIG. 5, and/or at least some of the exemplary system 700 of FIG. 7, for example.
- Many such computer-readable media are contemplated to operate in accordance with the techniques presented herein.
- Computer readable media can include processor-executable instructions configured to implement one or more of the methods presented herein, and may include any mechanism for storing this data that can be thereafter read by a computer system.
- Examples of computer readable media include (hard) drives (e.g., accessible via network attached storage (NAS)), Storage Area Networks (SAN), volatile and non-volatile memory, such as read-only memory (ROM), random-access memory (RAM), electrically erasable programmable read-only memory (EEPROM) and/or flash memory, compact disk read only memory (CD-ROM)s, CD-Rs, compact disk re-writeable (CD-RW)s, DVDs, cassettes, magnetic tape, magnetic disk storage, optical or non-optical data storage devices and/or any other medium which can be used to store data.
- The claimed subject matter is implemented as a method, apparatus, or article of manufacture using standard application or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed subject matter.
- The term “article of manufacture” as used herein is intended to encompass a computer application accessible from any computer-readable device, carrier, or media.
- A component includes a process running on a processor, a processor, an object, an executable, a thread of execution, an application, or a computer.
- Both an application running on a controller and the controller can be a component.
- One or more components may reside within a process or thread of execution, and a component may be localized on one computer or distributed between two or more computers.
- The term “exemplary” is used herein to mean serving as an example, instance, illustration, etc., and not necessarily as advantageous.
- “or” is intended to mean an inclusive “or” rather than an exclusive “or”.
- “a” and “an” as used in this application are generally to be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form.
- At least one of A and B and/or the like generally means A or B and/or both A and B.
- Such terms are intended to be inclusive in a manner similar to the term “comprising”.
- “first,” “second,” or the like are not intended to imply a temporal aspect, a spatial aspect, an ordering, etc. Rather, such terms are merely used as identifiers, names, etc. for features, elements, items, etc.
- A first set of information and a second set of information generally correspond to set of information A and set of information B or two different or two identical sets of information or the same set of information.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/453,949 US10761750B2 (en) | 2017-03-09 | 2017-03-09 | Selectively storing data into allocation areas using streams |
EP18714658.4A EP3593238A1 (en) | 2017-03-09 | 2018-03-09 | Selectively storing data into allocations areas using streams |
CN201880028884.8A CN110612511B (zh) | 2017-03-09 | 2018-03-09 | 使用流选择性地向分配区域中存储数据 |
PCT/US2018/021659 WO2018165502A1 (en) | 2017-03-09 | 2018-03-09 | Selectively storing data into allocations areas using streams |
JP2019548590A JP7097379B2 (ja) | 2017-03-09 | 2018-03-09 | ストリームを使用するデータの割り振りエリアへの選択的記憶 |
US16/940,448 US11409448B2 (en) | 2017-03-09 | 2020-07-28 | Selectively storing data into allocation areas using streams |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/453,949 US10761750B2 (en) | 2017-03-09 | 2017-03-09 | Selectively storing data into allocation areas using streams |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/940,448 Continuation US11409448B2 (en) | 2017-03-09 | 2020-07-28 | Selectively storing data into allocation areas using streams |
Publications (2)
Publication Number | Publication Date |
---|---|
US20180260154A1 (en) | 2018-09-13 |
US10761750B2 (en) | 2020-09-01 |
Family
ID=61832594
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/453,949 Active 2037-10-14 US10761750B2 (en) | 2017-03-09 | 2017-03-09 | Selectively storing data into allocation areas using streams |
US16/940,448 Active US11409448B2 (en) | 2017-03-09 | 2020-07-28 | Selectively storing data into allocation areas using streams |
Country Status (5)
Country | Link |
---|---|
US (2) | US10761750B2 (ko) |
EP (1) | EP3593238A1 (ko) |
JP (1) | JP7097379B2 (ko) |
CN (1) | CN110612511B (ko) |
WO (1) | WO2018165502A1 (ko) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10776023B2 (en) * | 2016-11-07 | 2020-09-15 | Gaea LLC | Data storage device with configurable policy-based storage device behavior |
US10338842B2 (en) * | 2017-05-19 | 2019-07-02 | Samsung Electronics Co., Ltd. | Namespace/stream management |
US10877691B2 (en) * | 2017-12-29 | 2020-12-29 | Intel Corporation | Stream classification based on logical regions |
US11461023B1 (en) * | 2018-01-31 | 2022-10-04 | EMC IP Holding Company LLC | Flexible expansion of data storage capacity |
CN111078144A (zh) * | 2019-11-30 | 2020-04-28 | 苏州浪潮智能科技有限公司 | 一种提高自动分层效率的方法、系统、终端及存储介质 |
KR20210083448A (ko) * | 2019-12-26 | 2021-07-07 | 삼성전자주식회사 | 비지도 학습 기법을 사용하는 스토리지 장치 및 그것의 메모리 관리 방법 |
GB2607642A (en) * | 2020-04-01 | 2022-12-14 | Mobileye Vision Technologies Ltd | Flow control integrity |
KR102691862B1 (ko) * | 2020-04-09 | 2024-08-06 | 에스케이하이닉스 주식회사 | 데이터 저장 장치 및 그 동작 방법 |
CN111782632B (zh) * | 2020-06-28 | 2024-07-09 | 百度在线网络技术(北京)有限公司 | 数据处理方法、装置、设备和存储介质 |
CN112118262B (zh) * | 2020-09-21 | 2022-07-29 | 武汉中元华电科技股份有限公司 | 一种基于动态内存分配实现数据排序与合并的系统及方法 |
US11947803B2 (en) * | 2020-10-26 | 2024-04-02 | EMC IP Holding Company LLC | Effective utilization of different drive capacities |
US11405456B2 (en) * | 2020-12-22 | 2022-08-02 | Red Hat, Inc. | Policy-based data placement in an edge environment |
KR20230097866A (ko) * | 2021-12-24 | 2023-07-03 | 삼성전자주식회사 | 메모리 컨트롤러를 포함하는 스토리지 장치 및 스토리지 장치의 동작 방법 |
Family Cites Families (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7529849B2 (en) * | 2006-07-27 | 2009-05-05 | International Business Machines Corporation | Reduction of message flow between bus-connected consumers and producers |
KR100855467B1 (ko) | 2006-09-27 | 2008-09-01 | 삼성전자주식회사 | 이종 셀 타입을 지원하는 비휘발성 메모리를 위한 맵핑장치 및 방법 |
JP4697146B2 (ja) | 2007-01-19 | 2011-06-08 | Tdk株式会社 | メモリコントローラ及びメモリコントローラを備えるフラッシュメモリシステム、並びにフラッシュメモリの制御方法 |
KR101498673B1 (ko) | 2007-08-14 | 2015-03-09 | 삼성전자주식회사 | 반도체 드라이브, 그것의 데이터 저장 방법, 그리고 그것을포함한 컴퓨팅 시스템 |
US9134917B2 (en) | 2008-02-12 | 2015-09-15 | Netapp, Inc. | Hybrid media storage system architecture |
US8074043B1 (en) * | 2009-01-30 | 2011-12-06 | Symantec Corporation | Method and apparatus to recover from interrupted data streams in a deduplication system |
CN102939765B (zh) * | 2009-12-01 | 2016-05-04 | 博马里斯网络公司 | 动态服务群组发现 |
US10102117B2 (en) * | 2012-01-12 | 2018-10-16 | Sandisk Technologies Llc | Systems and methods for cache and storage device coordination |
US8751725B1 (en) | 2012-01-27 | 2014-06-10 | Netapp, Inc. | Hybrid storage aggregate |
US9106721B2 (en) * | 2012-10-02 | 2015-08-11 | Nextbit Systems | Application state synchronization across multiple devices |
US10592106B2 (en) * | 2013-03-20 | 2020-03-17 | Amazon Technologies, Inc. | Replication target service |
US9471585B1 (en) * | 2013-12-20 | 2016-10-18 | Amazon Technologies, Inc. | Decentralized de-duplication techniques for largescale data streams |
GB2528333A (en) * | 2014-07-15 | 2016-01-20 | Ibm | Device and method for determining a number of storage devices for each of a plurality of storage tiers and an assignment of data to be stored in the plurality |
US10552085B1 (en) * | 2014-09-09 | 2020-02-04 | Radian Memory Systems, Inc. | Techniques for directed data migration |
US9632927B2 (en) * | 2014-09-25 | 2017-04-25 | International Business Machines Corporation | Reducing write amplification in solid-state drives by separating allocation of relocate writes from user writes |
US9904480B1 (en) * | 2014-12-18 | 2018-02-27 | EMC IP Holding Company LLC | Multiplexing streams without changing the number of streams of a deduplicating storage system |
US10540107B2 (en) * | 2015-01-02 | 2020-01-21 | Reservoir Labs, Inc. | Systems and methods for energy proportional scheduling |
US10437671B2 (en) * | 2015-06-30 | 2019-10-08 | Pure Storage, Inc. | Synchronizing replicated stored data |
US10257258B2 (en) * | 2016-10-31 | 2019-04-09 | International Business Machines Corporation | Transferring data between block and file storage systems |
2017
- 2017-03-09 US US15/453,949 patent/US10761750B2/en active Active
2018
- 2018-03-09 CN CN201880028884.8A patent/CN110612511B/zh active Active
- 2018-03-09 JP JP2019548590A patent/JP7097379B2/ja active Active
- 2018-03-09 EP EP18714658.4A patent/EP3593238A1/en active Pending
- 2018-03-09 WO PCT/US2018/021659 patent/WO2018165502A1/en unknown
2020
- 2020-07-28 US US16/940,448 patent/US11409448B2/en active Active
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008134165A1 (en) | 2007-04-23 | 2008-11-06 | Microsoft Corporation | Hints model for optimization of storage devices connected to host and write optimization schema for storage devices |
US7853759B2 (en) * | 2007-04-23 | 2010-12-14 | Microsoft Corporation | Hints model for optimization of storage devices connected to host and write optimization schema for storage devices |
US20110107042A1 (en) | 2009-11-03 | 2011-05-05 | Andrew Herron | Formatting data storage according to data classification |
US8959284B1 (en) * | 2010-06-28 | 2015-02-17 | Western Digital Technologies, Inc. | Disk drive steering write data to write cache based on workload |
US20160313943A1 (en) * | 2015-04-24 | 2016-10-27 | Kabushiki Kaisha Toshiba | Storage device that secures a block for a stream or namespace and system having the storage device |
US20170242788A1 (en) * | 2016-02-19 | 2017-08-24 | International Business Machines Corporation | Regrouping data during relocation to facilitate write amplification reduction |
US20170374147A1 (en) * | 2016-06-22 | 2017-12-28 | Tektronix Texas, Llc | Method and system for dynamic handling in real time of data streams with variable and unpredictable behavior |
US20180059988A1 (en) * | 2016-08-29 | 2018-03-01 | Samsung Electronics Co., Ltd. | STREAM IDENTIFIER BASED STORAGE SYSTEM FOR MANAGING AN ARRAY OF SSDs |
US20180276118A1 (en) * | 2017-03-23 | 2018-09-27 | Toshiba Memory Corporation | Memory system and control method of nonvolatile memory |
US20180307596A1 (en) * | 2017-04-25 | 2018-10-25 | Samsung Electronics Co., Ltd. | Garbage collection - automatic data placement |
US20180307598A1 (en) * | 2017-04-25 | 2018-10-25 | Samsung Electronics Co., Ltd. | Methods for multi-stream garbage collection |
Non-Patent Citations (1)
Title |
---|
Int. Search Report/Written Opinion cited in PCT Application No. PCT/US2018/021659 dated Jun. 11, 2018, 14 pgs. |
Also Published As
Publication number | Publication date |
---|---|
US20180260154A1 (en) | 2018-09-13 |
CN110612511B (zh) | 2023-10-03 |
EP3593238A1 (en) | 2020-01-15 |
JP7097379B2 (ja) | 2022-07-07 |
CN110612511A (zh) | 2019-12-24 |
US11409448B2 (en) | 2022-08-09 |
WO2018165502A1 (en) | 2018-09-13 |
JP2020511714A (ja) | 2020-04-16 |
US20200356288A1 (en) | 2020-11-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11409448B2 (en) | Selectively storing data into allocation areas using streams | |
US11620064B2 (en) | Asynchronous semi-inline deduplication | |
US10769024B2 (en) | Incremental transfer with unused data block reclamation | |
US20220083247A1 (en) | Composite aggregate architecture | |
EP4139802B1 (en) | Methods for managing input-ouput operations in zone translation layer architecture and devices thereof | |
CN109313538B (zh) | 内联去重 | |
US20170046095A1 (en) | Host side deduplication | |
US20150312337A1 (en) | Mirroring log data | |
US12086116B2 (en) | Object and sequence number management | |
US11709603B2 (en) | Multi-tier write allocation | |
US20190034092A1 (en) | Methods for managing distributed snapshot for low latency storage and devices thereof | |
US20150213047A1 (en) | Coalescing sequences for host side deduplication |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: NETAPP INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DRONAMRAJU, RAVIKANTH;STERLING, KYLE DIGGS;BHATTACHARJEE, MRINAL K.;AND OTHERS;SIGNING DATES FROM 20170221 TO 20170227;REEL/FRAME:041517/0738 |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |