TECHNICAL FIELD
The present disclosure relates generally to distributed storage systems. More specifically, but not by way of limitation, this disclosure relates to pooling storage nodes of a distributed storage system that have specialized hardware, such as hardware with built-in compression or encryption capabilities.
BACKGROUND
Distributed storage systems can include storage nodes in communication with each other over a network for synchronizing, coordinating, and storing data. The storage nodes can work together so that the distributed storage system behaves as one storage system. Distributed storage systems can implement block storage, file storage, or object storage techniques.
There are numerous advantages to using distributed storage systems, such as improved scalability, redundancy, and performance. In particular, distributed storage systems can be easily scaled horizontally, in the sense that they can combine many storage nodes into a single, shared storage system. Distributed storage systems can also store many copies of the same data for high availability, backup, and disaster recovery purposes. Additionally, some distributed storage systems can execute compute workloads on the same storage nodes that are also used to store data, thereby yielding a hyper-converged infrastructure that is highly efficient.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of an example of a distributed storage system according to some aspects of the present disclosure.
FIG. 2 is a sequence diagram of an example of a process for enabling specialized functionality on a storage node according to some aspects of the present disclosure.
FIG. 3 is a block diagram of an example of data migration in a distributed storage system according to some aspects of the present disclosure.
FIG. 4 is a block diagram of another example of a distributed storage system according to some aspects of the present disclosure.
FIG. 5 is a flow chart of an example of a process for pooling distributed storage nodes according to some aspects of the present disclosure.
DETAILED DESCRIPTION
A distributed storage system can include storage nodes formed from relatively heterogeneous servers in communication with each other over a network, such as a local area network. Some of the storage nodes may have specialized hardware. Specialized hardware is a piece of hardware that is preconfigured by a manufacturer to implement specialized functionality, prior to receipt of the specialized hardware by an end user. For example, the specialized hardware can include specialized circuitry configured to implement the specialized functionality. The specialized circuitry may include a custom or semi-custom integrated circuit with a special hardware design or firmware specifically tailored for performing the specialized functionality, where the specialized circuitry is built into the specialized hardware by a manufacturer during a manufacturing process. Thus, specialized hardware is different from generic hardware on which aftermarket software is installed by the end user to implement the specialized functionality. The specialized functionality can be functionality that is supplemental to the main purpose of the piece of hardware. By way of example, the specialized hardware can include a storage adapter, a storage device, an input/output (I/O) interface, or a processing device, where the specialized hardware has appropriate circuitry built-in to implement specialized functionality such as data encryption, data compression, and checksum calculation capabilities. As one particular example, the specialized hardware can be a self-encrypting hard drive with inline data-encryption capabilities. Using specialized hardware to perform the specialized functionality can be faster and more robust than executing aftermarket software on generic hardware.
While some storage nodes may have the specialized hardware, other storage nodes may not. In such situations, the distributed storage system may operate as if all of the storage nodes lack the specialized hardware, since distributed storage systems generally operate on the basis of the lowest common denominator. As a result, the distributed storage system may not implement the specialized functionality of the specialized hardware, even though the specialized functionality could improve the distributed storage system.
Some examples of the present disclosure can overcome one or more of the abovementioned problems by identifying storage nodes in a distributed storage system that have specialized hardware with specialized functionality, enabling the specialized functionality on the storage nodes, grouping the storage nodes together into a pool of storage nodes (“node pool”) that have the specialized functionality enabled, and using the node pool to utilize the specialized functionality in relation to data requests. The data requests can include read requests for reading data, write requests for writing data, or both. Grouping the storage nodes in this way can prevent the data requests from being divided up among some storage nodes that have the specialized hardware and other storage nodes that lack the specialized hardware, so that the data requests arrive at a node pool in which all of the storage node members have the specialized hardware or none of the storage node members have the specialized hardware. With the data requests arriving at a node pool in which all of the storage nodes have the specialized hardware, the distributed storage system can leverage the specialized functionality to achieve various improvements, such as improvements to data security and throughput.
One particular example can involve a distributed storage system, such as Ceph Storage by Red Hat®. The distributed storage system can include hundreds or thousands of storage nodes. Each storage node can determine if it has specialized hardware capable of performing specialized functionality, such as data encryption, data compression, or both. Each storage node may determine if it has the specialized hardware by scanning its hardware. For example, a storage node can analyze its hardware upon booting up to determine if the specialized hardware is connected. As another example, a storage node may periodically analyze its hardware at predefined intervals to determine if the specialized hardware is connected. As yet another example, a storage node may analyze its hardware in response to an event to determine if the specialized hardware is connected. After scanning their hardware, the storage nodes can then transmit status communications indicating whether or not they have the specialized hardware capable of performing the specialized functionality.
A centralized management node of the distributed storage system can receive the status communications from the storage nodes. Based on the status communications, the management node can determine a subset of the storage nodes that have the specialized hardware. The management node may then transmit communications to the subset of storage nodes for causing the storage nodes to enable the specialized functionality. Alternatively, the storage nodes may automatically enable the specialized functionality upon discovering that they are connected to the specialized hardware. Either way, the specialized functionality can be enabled on the storage nodes.
Next, the management node can assign the storage nodes in the subset to the same node pool. As a result, the node pool may only contain storage nodes with specialized hardware capable of performing the specialized functionality. The node pool can then be used to service data requests, so that the specialized functionality can be performed in relation to the data requests. Such node pools may be considered higher-tiered pools with better performance or data protection, given their specialized functionality. Thus, a service provider that is selling access to the distributed storage system may charge higher fees for using the node pool than for other node pools, such as node pools that lack specialized hardware with the specialized functionality.
These illustrative examples are given to introduce the reader to the general subject matter discussed here and are not intended to limit the scope of the disclosed concepts. The following sections describe various additional features and examples with reference to the drawings in which like numerals indicate like elements but, like the illustrative examples, should not be used to limit the present disclosure.
FIG. 1 is a block diagram of an example of a distributed storage system 100 according to some aspects of the present disclosure. The distributed storage system 100 includes storage nodes 102 a-e. The storage nodes 102 a-e may be physical servers for storing data.
Some storage nodes 102 a-c can have specialized hardware 108 a-c, while other storage nodes 102 d-e may lack the specialized hardware. Examples of the specialized hardware 108 a-c can include a storage adapter for interacting with storage devices, a storage device such as a hard disk or solid-state drive, a processing device such as a central processing unit (CPU) or graphics processing unit (GPU), or an I/O interface such as a peripheral component interconnect (PCI) card. The specialized hardware 108 a-c may be internal to the storage nodes 102 a-c or external and coupled to the storage nodes 102 a-c. The specialized hardware 108 a-c is configured to implement specialized functionality. Examples of the specialized functionality can include data compression/decompression capabilities, data encryption/decryption capabilities, checksum calculation capabilities, or any combination of these. The specialized functionality may be selectively enabled and disabled.
In some examples, the storage nodes 102 a-e can each analyze their hardware to determine if they have a corresponding piece of specialized hardware. For example, storage nodes 102 a-c may analyze their hardware and determine that the specialized hardware 108 a-c is attached. And storage nodes 102 d-e may analyze their hardware and determine that they lack the specialized hardware. The storage nodes 102 a-e may analyze their hardware to determine if they have the specialized hardware in response to any suitable event. For example, the storage nodes 102 a-e may each analyze their hardware to detect the presence of a piece of specialized hardware in response to a boot up event. As another example, the storage nodes 102 a-e may each analyze their hardware to detect the presence of the specialized hardware in response to the passage of a predefined amount of time, such as one hour. As yet another example, the storage nodes 102 a-e may each analyze their hardware to detect the presence of the specialized hardware in response to a request 110 from a management node 112 of the distributed storage system 100. One example of such a request 110 is shown in FIG. 1, but the management node 112 can transmit similar types of requests to many or all of the storage nodes 102 a-e.
After analyzing their hardware, the storage nodes 102 a-e can generate respective status information indicating whether or not they have the specialized hardware. Each of the storage nodes 102 a-e can then transmit a status communication that includes the respective status information over a network to the management node 112. The network may be a local area network or the Internet. One example of such a status communication 114 with status information 116 is shown in FIG. 1, but many or all of the storage nodes 102 a-e may transmit similar types of status communications to the management node 112.
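As a rough illustration of the status reporting described above, the following Python sketch shows how a storage node might derive status information from a hardware inventory and wrap it in a status communication. The inventory format, the capability names, the function names, and the JSON message layout are all assumptions made for illustration; the disclosure does not prescribe any of them.

```python
import json

# Hypothetical capability labels; the disclosure does not define a naming scheme.
SPECIALIZED_CAPABILITIES = {"encryption", "compression", "checksum"}


def build_status_information(node_id, inventory):
    """Summarize which specialized capabilities a node's hardware offers.

    `inventory` is an assumed list of dicts such as
    {"device": "sda", "capabilities": ["encryption"]}, standing in for
    whatever a real hardware scan would return.
    """
    found = set()
    for device in inventory:
        found.update(SPECIALIZED_CAPABILITIES.intersection(device.get("capabilities", [])))
    return {
        "node_id": node_id,
        "has_specialized_hardware": bool(found),
        "capabilities": sorted(found),
    }


def build_status_communication(status_information):
    """Serialize the status information for transmission to the management node."""
    return json.dumps({"type": "status", "status_information": status_information})


# Example: node 102a has a self-encrypting drive; node 102d has only generic hardware.
status_a = build_status_information("102a", [{"device": "sda", "capabilities": ["encryption"]}])
status_d = build_status_information("102d", [{"device": "sda", "capabilities": []}])
message = build_status_communication(status_a)
```

In this sketch a node without specialized hardware still reports, so the management node can distinguish "lacks the hardware" from "has not responded".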
The management node 112 is configured to manage one or more aspects of the distributed storage system 100. For example, the management node 112 can generate node pools and manage which virtual storage units are mapped to which nodes. The management node 112 may also manage which storage nodes 102 a-e have the specialized functionality enabled or disabled, as described below.
In some examples, the management node 112 can receive the status communications from the storage nodes 102 a-e and determine a subset of storage nodes 102 a-c that have the specialized hardware 108 a-c based on the status information in the status communications. The specialized functionality of the specialized hardware 108 a-c can then be enabled on the subset of storage nodes 102 a-c. For example, the management node 112 can transmit signals, such as signal 118, to the subset of storage nodes 102 a-c for causing the storage nodes 102 a-c to enable the specialized functionality. Each signal can include a command or other information configured to cause a corresponding storage node to enable the specialized functionality. For instance, the storage nodes 102 a-c may be in a first state in which the specialized functionality is disabled. To enable the specialized functionality on the storage nodes 102 a-c, the management node 112 can transmit signals with activation commands to the storage nodes 102 a-c. The storage nodes 102 a-c can receive the signals, detect the activation commands in the signals, and responsively switch from the first state to a second state in which the specialized functionality is enabled. In other examples, at least some of the storage nodes 102 a-c may automatically enable the specialized functionality in response to determining that they have the specialized hardware 108 a-c. In still other examples, a system administrator may manually enable the specialized functionality on at least some of the storage nodes 102 a-c based on determining that the storage nodes 102 a-c have specialized hardware 108 a-c. By using one or more of the above techniques, the specialized functionality can be enabled on the storage nodes 102 a-c.
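The switch from the first state to the second state can be pictured with a small Python sketch. The class names, the dictionary-based signal, and the "activate" command string are hypothetical; a real system would define its own signal format and would drive the actual hardware rather than flip a flag.

```python
from enum import Enum


class FunctionalityState(Enum):
    DISABLED = "first state"   # specialized functionality disabled
    ENABLED = "second state"   # specialized functionality enabled


class StorageNode:
    def __init__(self, node_id, has_specialized_hardware):
        self.node_id = node_id
        self.has_specialized_hardware = has_specialized_hardware
        self.state = FunctionalityState.DISABLED

    def handle_signal(self, signal):
        """Switch to the enabled state when an activation command is detected."""
        if signal.get("command") != "activate":
            return self.state  # ignore signals without an activation command
        if self.has_specialized_hardware:
            self.state = FunctionalityState.ENABLED
        return self.state


def enable_on_subset(nodes):
    """Management-node side: send an activation signal to each node in the subset."""
    signal = {"command": "activate"}  # hypothetical signal payload
    return {node.node_id: node.handle_signal(signal) for node in nodes}


node_a = StorageNode("102a", has_specialized_hardware=True)
node_d = StorageNode("102d", has_specialized_hardware=False)
result = enable_on_subset([node_a])
```

A node that lacks the hardware stays in the first state even if it receives an activation command, which is one possible way to guard against a misdirected signal.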
With the specialized functionality engaged on the subset of storage nodes 102 a-c, the management node 112 can assign the subset of storage nodes 102 a-c to a node pool 122. A node pool can be a defined group of storage nodes configured to implement storage functionality to service one or more read/write requests. In this example, the node pool 122 only includes the storage nodes 102 a-c that have the specialized hardware 108 a-c. The node pool 122 may be designated as a higher-tiered pool, since it may have better performance characteristics than another node pool (e.g., formed from storage nodes 102 d-e) that lacks the specialized hardware or specialized functionality.
A user can obtain access to the node pool 122 for storing data. For example, a user may purchase a subscription to the node pool 122, allowing the user to store and retrieve data therefrom by submitting data requests. Upon the user being granted such access, the distributed storage system 100 can cause the node pool 122 to perform the specialized functionality in relation to the data requests submitted by the user. The specialized functionality may yield better performance or security than is otherwise possible absent the specialized functionality.
In some examples, storage nodes may have more than one type of specialized hardware or may have specialized hardware capable of implementing more than one type of specialized functionality. In some such examples, the management node 112 can group together storage nodes having the same amounts and/or types of specialized functionality into the same node pool. For example, the distributed storage system 100 can include a first set of storage nodes with specialized hardware for implementing both data encryption and data compression. The distributed storage system 100 can also include a second set of storage nodes with specialized hardware for implementing data encryption but not data compression. And the distributed storage system 100 can include a third set of storage nodes with specialized hardware for implementing data compression but not data encryption. As a result, the first set of storage nodes may be grouped into a first node pool capable of implementing both data encryption and data compression. The second set of storage nodes may be grouped into a second node pool capable of implementing data encryption but not data compression. And the third set of storage nodes may be grouped into a third node pool capable of implementing data compression but not data encryption. A user may purchase a subscription to whichever of the node pools meets the user's needs, where the first node pool may be more expensive than the second node pool or the third node pool.
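One way to picture this grouping is to key each node pool by the exact set of specialized functionality its members share. The Python sketch below does this with a frozenset as the pool key; the node identifiers and capability names are illustrative assumptions, not part of the disclosure.

```python
from collections import defaultdict


def group_into_pools(node_capabilities):
    """Group node IDs by their exact set of specialized functionality.

    `node_capabilities` maps a node ID to an iterable of capability names.
    Nodes with no specialized functionality land in a pool keyed by the empty set.
    """
    pools = defaultdict(list)
    for node_id, capabilities in node_capabilities.items():
        pools[frozenset(capabilities)].append(node_id)
    return dict(pools)


pools = group_into_pools({
    "n1": ["encryption", "compression"],
    "n2": ["compression", "encryption"],  # same set as n1, different order
    "n3": ["encryption"],
    "n4": ["compression"],
    "n5": [],
})
```

Because the key is a set, the order in which a node reports its capabilities does not matter, and a node offering only encryption is never mixed into the pool that offers both encryption and compression.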
It will be appreciated that FIG. 1 is intended to be illustrative and non-limiting. Other examples may include more components, fewer components, different components, or a different arrangement of the components shown in FIG. 1. For instance, although the distributed storage system 100 includes five storage nodes in the example of FIG. 1, the distributed storage system 100 may have hundreds or thousands of storage nodes in other examples.
FIG. 2 is a sequence diagram of an example of a process for enabling specialized functionality on distributed storage nodes according to some aspects of the present disclosure. Although the example shown in FIG. 2 includes a certain sequence of steps, other examples may involve more steps, fewer steps, different steps, or a different order of the steps shown in FIG. 2.
The process begins with a management node 112 of a distributed storage system transmitting a request for status information to a storage node 102. The storage node 102 can receive the request and responsively determine if specialized hardware is coupled to the storage node. In this example, the storage node 102 has determined that the specialized hardware is coupled to the storage node 102. Next, the storage node 102 can transmit a response to the request, where the response is in the form of a status communication with status information indicating that the storage node 102 is coupled to the specialized hardware. The management node 112 can receive the status communication and determine that the storage node 102 is coupled to the specialized hardware based on the status information. The management node 112 may then transmit a signal to the storage node 102 for causing the storage node 102 to enable the specialized functionality of the specialized hardware. The storage node 102 can receive the signal and responsively enable the specialized functionality.
In some examples, the distributed storage system can use the storage node 102 to service a data request. The data request may be a higher-priority data request for which the specialized functionality may be desirable. A data request may be higher-priority if it is more critical, demanding, or higher cost than other data requests. For example, a user may pay a premium for the user's data requests to be deemed higher priority.
FIG. 3 is a block diagram of an example of a migration process in a distributed storage system 100 according to some aspects of the present disclosure. In this example, the distributed storage system 100 includes a first node pool 302 for servicing data requests 308 from a client device, such as a laptop computer, desktop computer, server, or mobile device. The client device may be external to the distributed storage system 100. Within the first node pool 302 is a storage node 102 d that lacks specialized hardware. As a result, the storage node 102 d may be incapable of implementing the specialized functionality corresponding to the specialized hardware. The storage node 102 d may include any number of virtual storage units (VSU) 306 d. Virtual storage units can be logical devices that are mapped to physical storage devices for reading and writing data associated with data requests. The node locations and physical storage-device mappings of the VSUs in the distributed storage system 100 may be adjustable by the management node 112.
It may be desirable for the data requests 308 to be serviced by storage nodes that have the specialized functionality enabled, for example to obtain the performance or security improvements associated with the specialized functionality. For example, if the data requests 308 are higher priority, it may be desirable to service the data requests 308 using higher-performance storage nodes that have the specialized functionality enabled. To that end, the management node 112 can generate a second node pool 304 using the techniques described above, where the second node pool 304 includes storage nodes 102 a-b on which the specialized functionality is enabled. The specialized functionality may be enabled on the storage nodes 102 a-b based on the storage nodes 102 a-b having corresponding specialized hardware 108 a-b. The management node 112 can determine that VSU 306 d_1 corresponds to (e.g., is a destination for) the data requests 308, and then migrate the VSU 306 d_1 from storage node 102 d to storage node 102 a. This migration is represented in FIG. 3 by a dashed arrow. The management node 112 can also transmit a communication to the client device from which the data requests 308 originated, to notify the client device of the change in location of the VSU 306 d_1. As a result, the client device can direct future data requests 308 corresponding to VSU 306 d_1 to storage node 102 a, so that the data requests 308 can be serviced by the second node pool 304 using the specialized functionality.
As one particular example, the management node 112 can determine that the data requests 308 have a particular priority level, such as a high priority level associated with sensitive data. Different data requests 308 may have different priority levels assigned by a user or the system. The management node 112 can also determine that a VSU 306 d_1 associated with the data requests 308 is located on a particular storage node 102 d of the distributed storage system 100. The management node 112 can communicate with the particular storage node 102 d to determine that the particular storage node 102 d lacks a specialized storage adapter with inline data-encryption capabilities. For example, the management node 112 can receive status information from the particular storage node 102 d indicating that the particular storage node 102 d lacks the specialized storage adapter. Since the particular storage node 102 d lacks the specialized storage adapter, the storage node 102 d is likely incapable of implementing the inline data encryption. Based on determining that (i) the data requests 308 have the particular priority level and (ii) the particular storage node 102 d having the VSU 306 d_1 associated with the data requests 308 does not have the specialized hardware, the management node 112 can migrate the VSU 306 d_1 from the particular storage node 102 d to another storage node 102 a that has the specialized hardware 108 a. As a result, the data requests 308 can be serviced by the other storage node 102 a using the specialized functionality moving forward, given the presence of the specialized hardware 108 a.
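The two-part condition above, (i) the priority level of the data requests and (ii) the absence of specialized hardware on the node holding the VSU, can be sketched in Python as follows. The function names, the string priority levels, and the rule for picking a target node are assumptions for illustration; the disclosure does not specify how a destination node is chosen.

```python
def choose_migration_target(request_priority, vsu_location, nodes_with_hardware,
                            required_priority="high"):
    """Return the node a VSU should move to, or None if no migration is needed.

    Migration is chosen only when (i) the data requests have the required
    priority level and (ii) the node currently holding the VSU lacks the
    specialized hardware.
    """
    if request_priority != required_priority:
        return None
    if vsu_location in nodes_with_hardware:
        return None
    if not nodes_with_hardware:
        return None  # nowhere suitable to migrate to
    return sorted(nodes_with_hardware)[0]  # arbitrary but deterministic pick


def migrate_vsu(vsu_locations, vsu_id, target_node):
    """Return an updated copy of the VSU-to-node map after the migration."""
    updated = dict(vsu_locations)
    updated[vsu_id] = target_node
    return updated


vsu_locations = {"306d_1": "102d"}
target = choose_migration_target("high", vsu_locations["306d_1"], {"102a", "102b"})
if target is not None:
    vsu_locations = migrate_vsu(vsu_locations, "306d_1", target)
```

The updated map is the kind of information the management node could send to the client device so that future data requests are directed to the new location.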
It will be appreciated that FIG. 3 is intended to be illustrative and non-limiting. Other examples may include more components, fewer components, different components, or a different arrangement of the components shown in FIG. 3. For instance, although the distributed storage system 100 includes three storage nodes in two node pools in the example of FIG. 3, the distributed storage system 100 may have any number of storage nodes spread across any number of node pools.
FIG. 4 is a block diagram of another example of a distributed storage system 400 according to some aspects of the present disclosure. The distributed storage system 400 includes a management node 112 and storage nodes 102 a-e, of which storage nodes 102 a-c have specialized hardware 108 a-c.
In this example, the management node 112 includes a processor 402 communicatively coupled with a memory 404. The processor 402 can include one processor or multiple processors. Non-limiting examples of the processor 402 include a Field-Programmable Gate Array (FPGA), an application-specific integrated circuit (ASIC), a microprocessor, etc. The processor 402 can execute instructions 406 stored in the memory 404 to perform operations. The instructions 406 can include processor-specific instructions generated by a compiler or an interpreter from code written in any suitable computer-programming language, such as C, C++, C#, etc.
The memory 404 can include one memory or multiple memories. Non-limiting examples of the memory 404 can include electrically erasable and programmable read-only memory (EEPROM), flash memory, or any other type of non-volatile memory. At least some of the memory 404 includes a non-transitory computer-readable medium from which the processor 402 can read the instructions 406. The non-transitory computer-readable medium can include electronic, optical, magnetic, or other storage devices capable of providing the processor 402 with computer-readable instructions or other program code. Examples of the non-transitory computer-readable medium can include magnetic disks, memory chips, ROM, random-access memory (RAM), an ASIC, optical storage, or any other medium from which a computer processor can read the instructions 406.
The processor 402 of the management node 112 can execute the instructions 406 to perform operations. In particular, the processor 402 can determine a subset of storage nodes 102 a-c that include specialized hardware 108 a-c based on status information 410 received from a plurality of storage nodes 102 a-e of the distributed storage system 400. For example, the processor 402 can receive the status information 410 from the plurality of storage nodes 102 a-e. The status information 410 can indicate whether each storage node in the plurality of storage nodes 102 a-e has (e.g., is coupled to) a corresponding piece of specialized hardware. For example, the status information 410 can indicate that storage nodes 102 a-c have specialized hardware 108 a-c, and that storage nodes 102 d-e do not have the specialized hardware. Based on the status information 410, the processor 402 can determine a subset of storage nodes 102 a-c, from among the plurality of storage nodes 102 a-e, having the specialized hardware 108 a-c. The processor 402 can then generate a node pool 122 that includes the subset of storage nodes 102 a-c. The specialized functionality 408 a-c can be enabled on the subset of storage nodes 102 a-c. The node pool 122 can be configured to perform the specialized functionality 408 a-c in relation to a data request 412.
In some examples, the processor 402 can implement some or all of the steps shown in FIG. 5. Other examples can include more steps, fewer steps, different steps, or a different order of the steps than is shown in FIG. 5. The steps of FIG. 5 are discussed below with reference to the components discussed above in relation to FIG. 4.
In block 502, a processor 402 determines a subset of storage nodes 102 a-c that include specialized hardware 108 a-c. The specialized hardware 108 a-c can be preconfigured with specialized functionality 408 a-c for compressing data or encrypting data, in some examples. The processor 402 can determine the subset of storage nodes 102 a-c based on status information 410 received from the plurality of storage nodes 102 a-e of a distributed storage system 400. The status information 410 can indicate whether each storage node in the plurality of storage nodes 102 a-e has a corresponding piece of specialized hardware.
In block 504, the processor 402 generates a node pool 122 that includes the subset of storage nodes 102 a-c with the specialized hardware 108 a-c. In some examples, generating the node pool 122 may involve transmitting one or more commands to an application programming interface (API) of the distributed storage system 400 for causing the distributed storage system 400 to assign the storage nodes into the node pool 122.
After the node pool 122 is generated, the node pool 122 can perform the specialized functionality in relation to a data request 412. For example, data request 412 can correspond to a particular VSU that is part of the node pool 122. As a result, the data request 412 can be serviced by the node pool 122 using the specialized functionality.
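Blocks 502 and 504, followed by the check that a data request targets a VSU in the node pool, might be sketched in Python as shown below. The boolean status map, the dictionary used to represent a node pool, and the VSU identifier are in-memory stand-ins chosen for illustration; an actual implementation would instead issue commands to the API of the distributed storage system, which is not modeled here.

```python
def determine_subset(status_information):
    """Block 502: pick the nodes whose status information reports specialized hardware."""
    return [node_id for node_id, has_hw in status_information.items() if has_hw]


def generate_node_pool(name, subset):
    """Block 504: create a node pool from the subset (an in-memory stand-in)."""
    return {"name": name, "nodes": list(subset), "vsus": set()}


def pool_services_request(pool, request):
    """Check whether a data request targets a VSU that is part of the pool."""
    return request["vsu"] in pool["vsus"]


status_information = {"102a": True, "102b": True, "102c": True, "102d": False, "102e": False}
subset = determine_subset(status_information)
node_pool = generate_node_pool("pool-122", subset)
node_pool["vsus"].add("vsu-1")  # hypothetical VSU identifier
```

A request for a VSU outside the pool is simply reported as not serviceable by this pool; routing it elsewhere is left out of the sketch.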
The foregoing description of certain examples, including illustrated examples, has been presented only for the purpose of illustration and description and is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Numerous modifications, adaptations, and uses thereof will be apparent to those skilled in the art without departing from the scope of the disclosure. For instance, any examples described herein can be combined with any other examples to yield further examples.