CN117992211A - Calculation memory unit and method for operating a calculation memory unit - Google Patents
- Publication number
- CN117992211A (application CN202311230559.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
Abstract
A computing storage unit and a method of operating a computing storage unit are described. The computing storage unit may include a first resource of a first type and a second resource of the first type. A table may map a user identifier (UID) of a user to a number of resources of the first type.
Description
The present application claims the benefit of U.S. Provisional Patent Application No. 63/422,918, filed on November 4, 2022, which is incorporated herein by reference for all purposes.
The present application claims the benefit of U.S. Non-Provisional Patent Application Ser. No. 18/094,342, filed on January 6, 2023, which is incorporated herein by reference for all purposes.
Technical Field
The disclosure relates generally to computing storage and, more particularly, to managing resources in computing storage.
Background
The computing storage unit may provide near-data processing. A user may request that an execution engine execute a program on the user's behalf. These programs may utilize resources of the computing storage unit, such as memory and/or program slots.
There remains a need to manage the use of computing storage unit resources.
Disclosure of Invention
The disclosed embodiments include a computing storage unit that may limit the resources allocated to a user. A table may map a user identifier to a Service Level Agreement (SLA). The SLA may identify how many resources of the computing storage unit may be allocated to the user.
In one general aspect, a computing storage unit includes: a first resource of a first type; a second resource of the first type; and a table mapping a user identifier (UID) of a user to a number of resources of the first type.
In one general aspect, a method of operating a computing storage unit includes: receiving, at the computing storage unit, a request from a user device of a user to use the computing storage unit, the request identifying a first type of resource of the computing storage unit; determining a number of resources of the first type that the user device should be able to access; and limiting the user device to a maximum number of resources of the first type in the computing storage unit.
In one general aspect, an article of manufacture includes a non-transitory storage medium having instructions stored thereon that, when executed by a machine, result in: receiving, at a computing storage unit, a request from a user device of a user to use the computing storage unit, the request identifying a first type of resource of the computing storage unit; determining a number of resources of the first type that the user device should be able to access; and limiting the user device to a maximum number of resources of the first type in the computing storage unit.
Drawings
The drawings described below are examples of how the disclosed embodiments may be implemented and are not intended to limit the disclosed embodiments. Various disclosed embodiments may include elements not shown in a particular drawing and/or may omit elements shown in a particular drawing. The drawings are intended to provide an illustration and are not necessarily drawn to scale.
FIG. 1 illustrates a system including a computing storage unit that can limit resources used by a user in accordance with a disclosed embodiment.
FIG. 2 shows details of the machine of FIG. 1, according to a disclosed embodiment.
FIG. 3 illustrates details of the computational storage unit of FIG. 1, in accordance with the disclosed embodiments.
FIG. 4 illustrates a table used by the computing storage unit of FIG. 1 to determine a Service Level Agreement (SLA) for a user, in accordance with a disclosed embodiment.
FIG. 5 illustrates a table used by the computing storage unit of FIG. 1 to track resources used by a user in accordance with a disclosed embodiment.
FIG. 6 illustrates a data structure created or used by the computing storage unit of FIG. 1 in response to an open session request from a user, in accordance with a disclosed embodiment.
FIG. 7 depicts a high-level flow chart of how the compute storage unit of FIG. 1 responds to resource requests from a user in accordance with a disclosed embodiment.
FIG. 8 depicts a high-level flow chart of how the compute storage unit of FIG. 1 reclaims resources in accordance with a disclosed embodiment.
FIG. 9 illustrates a flowchart of an example process for the compute storage unit of FIG. 1 to process the request of FIG. 6 from a user in accordance with a disclosed embodiment.
FIG. 10 illustrates a flowchart of an example process for the computing storage unit of FIG. 1 to limit resources allocated to a user in accordance with a disclosed embodiment.
FIG. 11 illustrates a flowchart of an example process for the computing storage unit of FIG. 1 to determine resource limitations of a user, in accordance with a disclosed embodiment.
FIG. 12 illustrates a flowchart of an example process for the computing storage unit of FIG. 1 to track resources allocated to a user in accordance with a disclosed embodiment.
FIG. 13 illustrates a flowchart of an example process for the computing storage unit of FIG. 1 to reclaim resources, in accordance with a disclosed embodiment.
FIG. 14 illustrates a flowchart of an example process for defining or updating the table of FIG. 4 for the compute storage unit of FIG. 1, in accordance with a disclosed embodiment.
Detailed Description
Reference will now be made in detail to the disclosed embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the disclosure. However, it will be understood by those of ordinary skill in the art that the disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first module may be referred to as a second module, and similarly, a second module may be referred to as a first module, without departing from the scope of the disclosure.
The terminology used in the description of the disclosure herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in the description of the disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or groups thereof. The components and features of the drawings are not necessarily to scale.
A Computational Storage Unit (CSU) may provide execution of programs proximate to the storage device storing the data. An application (which may be executing on behalf of a user) may request that the CSU perform one or more operations. By providing near-data execution of programs, the CSU may access data more efficiently: the CSU may avoid the need to transfer data from the storage device to main memory, and may avoid using host processor cycles for execution.
CSU resources, however, may be limited. That is, the number of execution engines, programming slots, and/or the amount of memory within the CSU may be finite. A single application might request enough resources to prevent other applications from being able to use the CSU. Such monopolization of resources may occur maliciously or accidentally.
The disclosed embodiments address these problems by assigning a Service Level Agreement (SLA) to each user. The SLA may specify the number of resources that the user (and any applications executing on behalf of the user) may use. By limiting each user to a maximum number of resources, the remaining resources may be kept available to other users.
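As an illustrative sketch only (the names `ResourceLimits`, `sla_limits`, and `user_sla`, and the specific limit values, are assumptions, not taken from this disclosure), the UID-to-SLA-to-limits mapping described above might look like:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResourceLimits:
    memory_bytes: int        # maximum memory 305 the user may hold
    program_slots: int       # maximum program slots 315
    execution_engines: int   # maximum execution engines 320

# One set of limits per SLA level; any number of users may share a level.
sla_limits = {
    0: ResourceLimits(memory_bytes=1 << 20, program_slots=1, execution_engines=1),
    1: ResourceLimits(memory_bytes=8 << 20, program_slots=2, execution_engines=2),
}

# UID -> SLA level; users without an entry fall back to the default level 0.
user_sla = {"JDOE": 1}

def limits_for(uid: str) -> ResourceLimits:
    return sla_limits[user_sla.get(uid, 0)]
```

Under this sketch, changing one entry of `sla_limits` changes the limits for every user assigned to that level.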
The disclosed embodiments enable multiple users to have the same SLA. Allowing multiple users to be assigned to the same SLA can simplify management of resource limits: the limits of all users assigned to an SLA may be changed at once by changing the limits associated with the SLA, rather than the limits associated with each individual user. Note that in some disclosed embodiments, each user may have his or her own set of resources: that two users have the same resource limit does not necessarily mean that the two users combined are limited to that amount of resources (although the disclosed embodiments may implement resource limits in this manner).
The disclosed embodiments may track the resources used by a user. If a request for additional resources would exceed the user's resource limit, the disclosed embodiments may return an error.
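A minimal sketch of that check, using hypothetical names (`request_resources`, `ResourceLimitExceeded`) not present in this disclosure:

```python
class ResourceLimitExceeded(Exception):
    """Raised when a request would push a user past a resource limit."""

def request_resources(allocated: dict, uid: str, count: int, limit: int) -> None:
    """Grant `count` more resources of a single type to `uid`, or raise."""
    in_use = allocated.get(uid, 0)
    if in_use + count > limit:
        # The request would exceed the user's limit: return an error and
        # leave the tracked allocation unchanged.
        raise ResourceLimitExceeded(f"{uid}: {in_use} + {count} > {limit}")
    allocated[uid] = in_use + count
```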
The disclosed embodiments may periodically check whether the resources allocated to a user are still in use. If the user is not currently using a resource, the disclosed embodiments may reclaim the unused resource.
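Such a periodic pass could be sketched as follows; this is an assumed illustration (session identifiers and the `reclaim` helper are invented for the example), not the patent's implementation:

```python
def reclaim(allocations: dict, live_sessions: set) -> list:
    """Remove resources held by sessions that are no longer live.

    `allocations` maps a session id to the resources it holds; the
    reclaimed resources are returned so they can be reallocated.
    """
    reclaimed = []
    for session in list(allocations):
        if session not in live_sessions:
            reclaimed.extend(allocations.pop(session))
    return reclaimed
```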
FIG. 1 illustrates a system including a computing storage unit that can limit resources used by a user in accordance with a disclosed embodiment. In fig. 1, machine 105 (machine 105 may also be referred to as a host or system) may include a processor 110, a memory 115, and a storage 120. The processor 110 may be any kind of processor. (processor 110 and other components discussed below are shown as external to the machine; the disclosed embodiments may include these components within the machine.) while FIG. 1 shows a single processor 110, machine 105 may include any number of processors, each of which may be single-core or multi-core processors, each of which may implement a Reduced Instruction Set Computer (RISC) architecture or a Complex Instruction Set Computer (CISC) architecture (and other possibilities), and may be mixed in any desired combination.
The processor 110 may be coupled to a memory 115. The memory 115 may be any kind of memory, such as flash memory, Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Persistent Random Access Memory (PRAM), Ferroelectric Random Access Memory (FRAM), or Non-Volatile Random Access Memory (NVRAM), such as Magnetoresistive Random Access Memory (MRAM). The memory 115 may be volatile or non-volatile memory, as well as any desired combination of different memory types, and may be managed by a memory controller 125.
The processor 110 and memory 115 may also support an operating system under which various applications may run. These applications may issue requests (which may also be referred to as commands) to read data from or write data to the memory 115 or the storage device 120. When the storage device 120 is used to support applications reading or writing data via some sort of file system, the storage device 120 may be accessed using the device driver 130-1. Although FIG. 1 shows one storage device 120, any number of storage devices may be present in machine 105. The storage device 120 may support any desired protocol or protocols, including, for example, the Non-Volatile Memory Express (NVMe) protocol. Different storage devices 120 may support different protocols and/or interfaces. For example, the storage device 120 may support a cache-coherent interconnect protocol, which may support both block-level (or any other higher-granularity) access and byte-level (or any other lower-granularity) access to data on the storage device 120. An example of such a cache-coherent interconnect protocol is the Compute Express Link (CXL) protocol, which supports access to data in blocks using the CXL.io protocol and access to data in bytes using the CXL.mem protocol. In this way, data on a CXL storage device may be accessed as either block-level data (like a Solid State Drive (SSD)) or byte-level data (like memory): that is, a CXL storage device may be used to extend system memory. In some disclosed embodiments, the CXL storage device may be used solely to extend system memory; in other disclosed embodiments, the CXL storage device may be used both to extend system memory and to act as a storage device (that is, to handle file system requests to access data on the storage device).
Although fig. 1 uses the generic term "storage," the disclosed embodiments may include any storage format that may benefit from the use of computing storage units, examples of which may include hard disk drives and SSDs. Any reference below to an "SSD" should be understood to include such other embodiments of the disclosure. Furthermore, different types of storage devices may be mixed. For example, one storage device 120 may be a hard disk drive and another storage device 120 may be an SSD.
Machine 105 may be connected to a network (not shown in fig. 1). The network may be any kind of network. The network may be a wired network or a wireless network. The network may be a Local Area Network (LAN), wide Area Network (WAN), metropolitan Area Network (MAN) or a global network such as the internet, among other possibilities. The network may also include portions that may be different types of networks. For example, the network may include a wired part and a wireless part, or the network may include various LANs connected through the internet.
For interfacing with a network, machine 105 may have components (not shown in FIG. 1) that interface with the network. The component may be, for example, a network interface card.
Machine 105 may also include a computing storage unit 135 (which may also be referred to as computational storage 135, among other terminology). The computing storage unit 135 may provide processing capabilities beyond those provided by the processor 110, and may provide any desired functionality. For example, in some disclosed embodiments, the computing storage unit 135 may offload processing from the processor 110, freeing the processor 110 to perform other tasks. Additionally, in some disclosed embodiments, the computing storage unit 135 may be used for near-data processing, accessing data on the storage device 120 rather than requiring the data to be loaded from the storage device 120 into the memory 115 before the processor 110 can process it. In some disclosed embodiments, as shown, the computing storage unit 135 may be separate from the storage device 120; in other disclosed embodiments, the computing storage unit 135 and the storage device 120 may be combined into a single component. The computing storage unit 135 may be implemented in any desired manner, including, for example, as a Field Programmable Gate Array (FPGA), an Application-Specific Integrated Circuit (ASIC), a Graphics Processing Unit (GPU), a General-Purpose GPU (GPGPU), a Tensor Processing Unit (TPU), a Neural Processing Unit (NPU), or a Central Processing Unit (CPU) running appropriate software, among other possibilities.
Just as the storage device 120 may be accessed using the device driver 130-1, the computing storage unit 135 may be accessed using the device driver 130-2. (Device drivers 130-1 and 130-2 may be collectively referred to as device drivers 130 or drivers 130.) A device driver 130 may provide a mechanism for an operating system of machine 105 to send requests to particular devices, such as storage device 120 and/or computing storage unit 135. In some disclosed embodiments, the device driver 130 may be implemented in a manner that enables a single device driver 130 to communicate with multiple components: in such disclosed embodiments, the device drivers 130-1 and 130-2 may be the same device driver 130. The computing storage unit is discussed further below with reference to FIG. 3.
FIG. 2 shows details of machine 105 of FIG. 1, according to a disclosed embodiment. Typically, machine 105 includes one or more processors 110, which may include a memory controller 125 and a clock 205 that may be used to coordinate the operation of the machine's components. The processors 110 may also be coupled to a memory 115, which may include Random Access Memory (RAM), Read-Only Memory (ROM), or other state-preserving media, as examples. The processors 110 may also be coupled to the storage device 120 and to a network connector 210, which may be, for example, an Ethernet connector or a wireless connector. The processors 110 may also be connected to a bus 215, to which may be attached, among other components, input/output (I/O) interface ports and a user interface 220, which may be managed using an I/O engine 225.
FIG. 3 shows details of the computing storage unit 135 of FIG. 1, according to the disclosed embodiments. In FIG. 3, the computing storage unit 135 may include a memory 305. Similar to the memory 115 of FIG. 1, the memory 305 may be any desired form of memory, including, for example, DRAM or SRAM. The memory 305 may have any size: for example, the memory 305 might be 1 megabyte (MB) or 8 gigabytes (GB).
The memory 305 may be divided into regions. For example, in FIG. 3, the memory 305 is shown divided into three regions 310-1, 310-2, and 310-3 (which may be collectively referred to as regions 310). As can be seen, region 310-1 is relatively large (about 44% of memory 305), region 310-2 is smaller (about 25% of memory 305), and region 310-3 is smaller still (about 13% of memory 305). In general, each region 310 of memory 305 may have any desired size, which may differ from the size of any other region 310: the only limitation is that the sum of the sizes of the regions 310 may not exceed the total size of the memory 305. (In some disclosed embodiments, a portion of memory 305 may be reserved for use by the computing storage unit 135, in which case the combined size of the regions 310 would always be smaller than the total size of the memory 305.) Although FIG. 3 shows memory 305 divided into three regions 310, the disclosed embodiments may support dividing memory 305 into any number (zero or more) of regions 310 (for example, when no open session is using the computing storage unit 135, there may be no regions 310).
FIG. 3 also shows the regions 310 as contiguous: that is, there is no gap between adjacent regions. For example, region 310-1 may end at one address and region 310-2 may begin at the next address, with regions 310-2 and 310-3 similarly adjacent. In some disclosed embodiments, however, the memory 305 may be allocated into discontiguous regions 310. This situation may arise by design or by accident. For example, the computing storage unit 135 may allocate regions 310-1 and 310-3 in response to successive requests from users and leave region 310-2 unallocated. Or the computing storage unit 135 may allocate regions 310-1, 310-2, and 310-3 in response to successive requests, but the application that requested region 310-2 may have completed and deallocated it, leaving a gap between regions 310-1 and 310-3.
When a user (or perhaps an application running at the user's request) requests memory from the computing storage unit 135, the computing storage unit 135 (or a controller of the computing storage unit 135, not shown in FIG. 3) may allocate a region 310 of the memory 305 to the user. That is, a new region 310 may be established in the memory 305 and allocated to the user.
It may happen that a user who is already using a region 310 of the memory 305 in the computing storage unit 135 requests additional memory. If the user is permitted to access more of memory 305, then in some disclosed embodiments the user's existing region 310 may be extended; in other disclosed embodiments, a second region 310 may be allocated to the user. Thus, for example, the regions 310 may each be assigned to a different user, or regions 310-1 and 310-3 may be assigned to one user and region 310-2 to a different user. (If regions 310-1 and 310-2 are both assigned to the same user, it may be simpler to combine them into a single contiguous region.)
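One way the discontiguous allocation described above could be implemented is a first-fit scan over the existing regions; the sketch below is purely illustrative (the function name and the (start, length) representation are assumptions, not the patent's method):

```python
def first_fit(regions: list, total: int, size: int):
    """Find a start offset for a new region of `size` bytes.

    `regions` holds (start, length) pairs for regions already allocated
    out of a memory of `total` bytes; returns None if no gap is large
    enough (gaps may exist where earlier regions were deallocated).
    """
    prev_end = 0
    for start, length in sorted(regions):
        if start - prev_end >= size:   # gap before this region fits
            return prev_end
        prev_end = start + length
    if total - prev_end >= size:       # room left at the tail of memory
        return prev_end
    return None
```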
Memory 305 may act as local memory within computing storage unit 135 that may be used by processes executing on computing storage unit 135. Memory is but one potential resource within computing storage unit 135. Another potential resource of compute storage unit 135 (or alternatively, compute storage unit resources or compute storage device resources) is program slots 315-1 through 315-3 (program slots 315-1 through 315-3 may be collectively referred to as program slots 315). Program slots 315 may represent locations where programs may be loaded into computing storage unit 135. For example, an application may have a particular program running on data stored in memory 305 (more specifically, region 310 of memory 305): the application may download the program into program slot 315 and request its execution. As such, program slots 315 may represent additional memory locations, but for programs rather than data. The programs loadable into program slots 315 may be custom programs requested by the application, or they may be standard programs available within computing storage unit 135 (and loadable into program slots 315 from other storage devices (not shown in FIG. 3) within computing storage unit 135).
FIG. 3 shows three program slots 315, but the disclosed embodiments may include any number (zero or more) of program slots. (For example, if the computing storage unit 135 only allows execution of pre-installed programs that run directly from their storage locations rather than being loaded into program slots 315, the computing storage unit 135 may have no program slots 315, although the pre-installed locations themselves might be considered program slots 315.) Additionally, while FIG. 3 shows the program slots 315 as all the same size, some disclosed embodiments may support program slots 315 of different sizes, in which case the program slots 315 may be allocated to users based on a desired size, or the program slots 315 may be allocated similarly to the regions 310 of the memory 305.
Another potential resource within the computing storage unit 135 is execution engines 320-1 through 320-3 (which may be collectively referred to as execution engines 320). An execution engine 320 may be a processor (or processing core) operable to execute programs, such as the programs that may be loaded into the program slots 315. FIG. 3 shows three execution engines 320, but the disclosed embodiments may include any number (zero or more) of execution engines 320. (For example, if the computing storage unit 135 only allows execution of pre-installed programs that run directly from their storage locations rather than on the execution engines 320, the computing storage unit 135 may have no execution engines 320, although the pre-installed locations themselves might be considered execution engines 320.) Additionally, while FIG. 3 suggests a one-to-one correspondence between program slots 315 and execution engines 320, in some disclosed embodiments the numbers of program slots 315 and execution engines 320 may differ: for example, if some execution engines 320 are dedicated to pre-installed programs rather than downloadable programs.
Resources such as the memory 305, program slots 315, or execution engines 320 of FIG. 3 may be considered to be organized into types. For example, memory 305 is one type of resource, as are program slots 315 and execution engines 320. Memory 305, program slots 315, and execution engines 320 are merely examples of the types of resources that may be provided by the computing storage unit 135: other types of resources may be provided in addition to, or in lieu of, some or all of them. In general, it may not matter which particular instance of a resource is allocated to a user, as long as the instance is of the correct type. Thus, for example, a user may not care whether he or she is allocated memory region 310-1, 310-2, or 310-3, as long as the correct amount of memory 305 is allocated. Similarly, the user may not care whether he or she is assigned program slot 315-1, 315-2, or 315-3, or execution engine 320-1, 320-2, or 320-3. On the other hand, if he or she requested a program slot 315 but was instead allocated an execution engine 320, the user may well care, because the allocated resource would be of the wrong type.
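The type-based (instance-agnostic) allocation described above can be sketched as follows, with assumed names for the pool and function:

```python
def allocate_any(free_pool: dict, rtype: str):
    """Return any free instance of the requested type, or None.

    `free_pool` maps a resource type (e.g. "program_slot") to a set of
    free instance identifiers; which instance is handed out does not
    matter, only that its type is correct.
    """
    instances = free_pool.get(rtype)
    if not instances:
        return None
    return instances.pop()
```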
In some embodiments disclosed, there may be differences between specific instances of a resource. For example, program slot 315-1 may provide more storage for programs than program slot 315-2. In this case, the user may request a particular instance of the resource (either by identifying the particular resource in some manner, or specifying an attribute (such as size) of the desired resource) to ensure that the user receives the appropriate resource, or the resource may be further subdivided within each type. For example, rather than allocating memory 305 to accommodate a user's request, memory 305 may be subdivided in advance into regions of various sizes (the various sizes may be fixed sizes), and each size may be described as a different type that may be requested by the user.
The compute storage unit 135 may also include a mapping table 325, a tracking table 330, and a reclamation unit 335. Mapping table 325 may be used to determine resource limitations for a user. The resource limits of users may be established independently for each user, or may be based on their Service Level Agreements (SLAs), where all users with a common SLA have the same resource limit. The mapping table is discussed further below with reference to fig. 4.
The tracking table 330 may be used to track the resources allocated to a particular user. For example, for a given user, the tracking table 330 may track how much of memory 305 (and potentially which regions 310) has been allocated to the user, how many program slots 315 (and which slots) have been allocated to the user, and/or how many execution engines 320 (and which engines) have been allocated to the user. This information may be used, for example, when the user issues a request for additional resources (to ensure that the user does not exceed his or her resource limit), and/or to identify resources that may be reclaimed if the user does not properly deallocate them. The tracking table 330 is discussed further below with reference to FIG. 5.
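A per-user tracking table along these lines might be sketched as below; the structure and helper names are assumptions for illustration only:

```python
from collections import defaultdict

# uid -> resource type -> set of instance ids currently held by the user
tracking = defaultdict(lambda: defaultdict(set))

def allocate(uid: str, rtype: str, instance: str) -> None:
    tracking[uid][rtype].add(instance)

def deallocate(uid: str, rtype: str, instance: str) -> None:
    tracking[uid][rtype].discard(instance)

def count(uid: str, rtype: str) -> int:
    # Used, e.g., to check a new request against the user's limit.
    return len(tracking[uid][rtype])
```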
The reclamation unit 335 may be used to reclaim resources that have been allocated to users but are no longer in use. For example, a user may be running an application that uses some of the resources of computing storage unit 135. But if the application does not release the resource when it is no longer needed (as it should) or if the application is accidentally terminated (e.g., if the process does not respond and is killed), the resource may remain allocated to users that are no longer using the resource. The reclamation unit 335 may be used to identify such resources and reclaim them so that they may be re-allocated to other users.
FIG. 4 illustrates tables used by the computing storage unit 135 of FIG. 1 to determine a Service Level Agreement (SLA) for a user, in accordance with a disclosed embodiment. In FIG. 4, table 325 is shown. In FIG. 4, table 325 is implemented as two tables 405 and 410, but as discussed further below, some embodiments of the disclosure may combine tables 405 and 410 into a single table. Because the function of the tables is the same, any reference below to tables 405 and 410, or to mapping table 325 of FIG. 3, may be understood to include a reference to the other: only the structure of the table or tables differs.
Table 405 may map user identifiers to SLA identifiers. Table 405 may include a user identifier 415 and an SLA identifier 420. Entries 425-1 to 425-5 (entries 425-1 to 425-5 may be collectively referred to as entries 425) may map respective user identifiers 415 to SLA identifiers 420. For example, entry 425-1 may map a default user identifier to SLA level 0, entry 425-2 may map a user identifier "JDOE" to SLA level 1, and so on, until entry 425-5, entry 425-5 may map a user identifier "root" to SLA level 5.
Note that table 405 includes two special user identifiers: "default" (in entry 425-1) and "root" (in entry 425-5). Users that do not have a specific associated SLA level may be limited to the default resource limits. That is, if table 405 does not include an entry 425 for a particular user identifier 415, then the default entry 425-1 may be used for that user.
"Root" (in entry 425-5) may represent an administrator user identifier. That is, the "root" may be a user identifier associated with an administrator of the system. When administrators log into the system, they can use the root account. The root account may have special permissions that are not associated with other user accounts: such as the capabilities of configuration tables 405 and 410 as described below with reference to fig. 14.
Although fig. 4 shows table 405 as including a textual user identifier 415, the disclosed embodiments may use other ways to identify a user. For example, user identifier 415 may be a numeric identifier used by the operating system. Additionally, while FIG. 4 shows the table 405 as including five entries 425, the disclosed embodiments may have a table 405 including any number (zero or more) of entries 425.
Table 410 may map SLA identifiers to resource limits. Table 410 may include an SLA identifier 420, a memory size 430, a number of program slots 435, and a number of execution engines 440. Entries 445-1 through 445-4 (which may be collectively referred to as entries 445) may map individual SLA identifiers 420 to the resource limits for that SLA level. For example, entry 445-1 may map SLA level 0 to 64 MB of memory, one program slot, and one execution engine, while entry 445-4 may map SLA level 5 to 4 GB of memory, 5 program slots, and 5 execution engines.
Although FIG. 4 shows table 410 as including a memory size 430, a number of program slots 435, and a number of execution engines 440, the disclosed embodiments may use other resource limits, depending on the resources offered by computing storage unit 135 of FIG. 1. For example, table 410 may include only a subset of these resources. Additionally, while FIG. 4 shows table 410 as including four entries 445, the disclosed embodiments may have table 410 including any number (zero or more) of entries 445.
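The two-table scheme can be illustrated with a minimal sketch. All names, SLA levels, and limit values below are illustrative assumptions, not data taken from the disclosure; the point is only the two-step lookup with a fallback to the default entry:

```python
# Hypothetical sketch of tables 405 and 410 as Python dicts.

# Table 405 analogue: user identifier -> SLA identifier ("default" plays
# the role of entry 425-1).
SLA_USER_MAP = {
    "default": 0,
    "JDoe": 1,
    "root": 5,
}

# Table 410 analogue: SLA identifier -> resource limits
# (memory in MB, program slots, execution engines).
SLA_QUOTA = {
    0: {"memory_mb": 64,   "program_slots": 1, "execution_engines": 1},
    1: {"memory_mb": 512,  "program_slots": 2, "execution_engines": 2},
    5: {"memory_mb": 4096, "program_slots": 5, "execution_engines": 5},
}

def resource_limits(user_id):
    """Map a user to resource limits, falling back to the default SLA level."""
    level = SLA_USER_MAP.get(user_id, SLA_USER_MAP["default"])
    return SLA_QUOTA[level]
```

A user absent from the first table silently receives the level-0 quota, matching the behavior described for the "default" entry.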
By using both tables 405 and 410, resource management may be simplified. Each user may have a particular SLA level, which in turn may specify the particular resource limitations applicable. The number of SLA levels may be a relatively small number compared to the number of users of the system.
For example, assume that the system has 1000 users, that each user identifier 415 requires 4 bytes (32 bits), and that the system supports six different SLA levels. Each SLA identifier 420 may be represented using 4 bits. If each SLA level includes a maximum allocated memory size 430 (which may require 16 bits to represent the number of megabytes), a number of program slots 435 (which may require 4 bits), and a number of execution engines 440 (which may require another 4 bits), then each entry in table 410 may require 4+16+4+4=28 bits. Thus, the total storage required for table 410 may be 28×6=168 bits (21 bytes), and the total storage required for table 405 may be (32+4)×1000=36,000 bits (4500 bytes). On the other hand, if the two tables were combined into one (each user identifier 415 directly associated with a maximum allocated memory size 430, a number of program slots 435, and a number of execution engines 440), the total storage of the combined table would be (32+16+4+4)×1000=56,000 bits (7000 bytes). Thus, using table 405 to map user identifiers 415 to SLA identifiers 420, and table 410 to map SLA identifiers 420 to individual resource limits, may reduce the total storage required. Combining tables 405 and 410 into a single table may, however, allow for greater flexibility: for example, different user identifiers 415 associated with the same SLA level could nevertheless have different resource limits.
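The storage comparison above can be verified with a short calculation (using the bit widths assumed in the example):

```python
# Check of the table-size arithmetic: two-table scheme vs. one combined table.
USERS, LEVELS = 1000, 6
UID_BITS, SLA_BITS = 32, 4      # user identifier 415, SLA identifier 420
LIMIT_BITS = 16 + 4 + 4         # memory size 430, slots 435, engines 440

table_405 = (UID_BITS + SLA_BITS) * USERS        # user -> SLA level
table_410 = (SLA_BITS + LIMIT_BITS) * LEVELS     # SLA level -> limits
combined  = (UID_BITS + LIMIT_BITS) * USERS      # user -> limits directly

print(table_405, table_410, combined)    # 36000 168 56000
print(table_405 + table_410 < combined)  # True: two tables need less storage
```

The savings grow with the number of users, since the per-level limits are stored once per SLA level rather than once per user.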
FIG. 5 illustrates a table used by the computing storage unit 135 of FIG. 1 to track resources used by a user, in accordance with a disclosed embodiment. In FIG. 5, tracking table 330 is shown (tracking table 330 may also be referred to as a usage table). Tracking table 330 may include a user identifier 415, a memory usage 505, a number of program slots in use 510, and a number of execution engines in use 515. Tracking table 330 may also include entries, such as entries 520-1 and 520-2 (which may be collectively referred to as entries 520). Each entry 520 may track what resources a particular user is using. Thus, for example, entry 520-1 indicates that user "JDoe" is currently using 54 MB of memory 305 of FIG. 3, one program slot 315 of FIG. 3, and two execution engines 320 of FIG. 3, while entry 520-2 indicates that user "MBrown" is currently using 12 MB of memory 305 of FIG. 3, one program slot 315 of FIG. 3, and one execution engine 320 of FIG. 3.
Although FIG. 5 shows tracking table 330 as including two entries 520, the disclosed embodiments may support tracking table 330 including any number (zero or more) of entries 520. For example, if an entry 520 is deleted when a user finishes using computing storage unit 135 of FIG. 1, then tracking table 330 may have no entries 520 when no user is currently using computing storage unit 135 of FIG. 1.
There are some points worth noting with respect to tracking table 330. First, note that user "JDoe" is using the maximum number of only one type of resource: in this case, execution engines 320 of FIG. 3 (assuming the resource limits shown in tables 405 and 410 of FIG. 4 apply to user "JDoe"). Every other resource is used at less than its maximum number/size, which shows that users need not always use resources to the maximum extent. Second, note that user "JDoe" is using two execution engines 320 of FIG. 3 but only one program slot 315 of FIG. 3. This may occur, for example, if the user is using a built-in program for one of the execution engines 320 of FIG. 3 (and thus does not need a program slot 315 of FIG. 3 for that execution engine 320 of FIG. 3), or if the user is using the same program in multiple execution engines 320 of FIG. 3 (which may occur if the user is running the same program on different data in parallel).
FIG. 6 illustrates data structures created or used by the computing storage unit 135 of FIG. 1 in responding to an open session request from a user, in accordance with a disclosed embodiment. In FIG. 6, a user (or an application running under the user's identifier) may issue a request 605 to open a session with computing storage unit 135 of FIG. 1. As can be seen, the request 605 is passed from user space into kernel space for processing.
Each task (such as an application running under the user's identifier) may have a task structure (such as task structure 610). The task structure 610 may include an identifier of the user and a process identifier. Note that a single user may have multiple processes executing, which may have different process identifiers, but should all have the same user identifier. The process identifier may point to the session context 615 created for the computing storage unit 135 of fig. 1. The session context 615 may also include the user identifier 415 of fig. 4. Both the task structure 610 and the user identifier 415 of fig. 4 in the session context 615 may point to the user structure 620 (the user structure 620 may be, for example, the entry 425 of fig. 4 of the table 405 of fig. 4, and the user structure 620 may include the SLA identifier 420 of fig. 4). In one embodiment, the session context 615 may indicate that the resource is used by a session of the user device of the user.
The reason that session context 615 includes a process identifier is to support reclamation of resources, as discussed further below with reference to fig. 8.
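The linkage among the task structure, session context, and user structure can be sketched as follows. This is a hedged analogue in Python, not the disclosure's actual kernel implementation; all field and function names are assumptions chosen to mirror FIG. 6:

```python
# Hypothetical sketch of the FIG. 6 structures and their pointers.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UserStructure:            # cf. user structure 620 (per-user record)
    user_id: str
    sla_level: int
    usage: dict = field(default_factory=lambda: {
        "memory_mb": 0, "program_slots": 0, "execution_engines": 0})

@dataclass
class SessionContext:           # cf. session context 615
    pid: int                    # process identifier, kept to enable reclamation
    user: UserStructure         # pointer to the shared per-user structure

@dataclass
class TaskStructure:            # cf. task structure 610 (task_struct analogue)
    uid: str
    pid: int
    session: Optional[SessionContext] = None

def open_session(task: TaskStructure, users: dict) -> SessionContext:
    """Create a session context for a task and link it to the user structure."""
    # Multiple processes of one user share a single user structure.
    user = users.setdefault(task.uid, UserStructure(task.uid, sla_level=0))
    task.session = SessionContext(pid=task.pid, user=user)
    return task.session
```

Two processes with different PIDs but the same UID end up pointing at the same user structure, which is what lets resource usage be tracked per user rather than per process.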
Fig. 7 depicts a high-level flow chart of how the computing storage unit 135 of fig. 1 responds to a resource request from a user in accordance with a disclosed embodiment. In fig. 7, a user (or an application running under the user's identifier) may issue a request 705 to request the resources of computing storage unit 135 of fig. 1. The request 705 may be a request for additional resources beyond those already allocated to the user, or the request 705 may be a request for resources that are needed as part of the request 605 of fig. 6 (i.e., as part of an initial request using the computing storage unit 135 of fig. 1). As can be seen, the request 705 is passed from the user space into the kernel space for processing.
At block 710, the computing storage unit 135 of FIG. 1 may determine whether the requested resources may be made available to the user. In addition to information about the resources requested in request 705, block 710 may also take into account SLA description 715 (which may indicate the resource limits applicable to the user: that is, the user's entry 445 of FIG. 4 in table 410 of FIG. 4) and tracking table 330 (indicating what resources the user is currently using). Of course, if request 705 is part of request 605 of FIG. 6 (that is, request 705 is a request for resources associated with a new session), tracking table 330 may indicate that no resources are currently allocated to the user.
Block 710 may compare SLA description 715 with the combination of the resources requested in request 705 and the resources currently allocated to the user in tracking table 330. If that combination exceeds SLA description 715 in any respect, computing storage unit 135 of FIG. 1 may return an error 720 indicating that the user may not be granted the requested resources. Otherwise, block 710 may update tracking table 330 and allocate the requested resources to the user.
As described above, a request might exceed the user's resource limits for only some resource types. Consider, for example, user "MBrown". As shown in table 330 of FIG. 5, user "MBrown" is currently using 12 MB of memory 305 of FIG. 3, one program slot 315 of FIG. 3, and one execution engine 320 of FIG. 3. If request 705 requests an additional 10 MB of memory 305 of FIG. 3, that part of the request may be granted, because user "MBrown" would not exceed the resource limits in tables 405 and 410 of FIG. 4. But if request 705 also requests three execution engines 320 of FIG. 3, then user "MBrown" would be allocated four execution engines 320 of FIG. 3, which would exceed the user's resource limits according to tables 405 and 410 of FIG. 4. In other words, after the resources requested in request 705 are allocated, the user should still not exceed any resource limit: if the user would exceed any resource limit, request 705 should result in error 720.
In some embodiments of the disclosure, request 705 may result in error 720 being returned if any resource limit would be exceeded. In such embodiments, if request 705 would be even partially disallowed, no resources may be allocated at all. In other embodiments of the disclosure, request 705 may result in resources being allocated to the extent that they do not exceed the user's resource limits: error 720 may still be returned, but may indicate only that not all requested resources were allocated. Continuing the example above, user "MBrown" might be allocated the additional 10 MB of memory 305 of FIG. 3 and two execution engines 320 of FIG. 3, but not the third requested execution engine 320 of FIG. 3.
While the above discussion focuses on the resources requested by the user and the resource limits applicable to the user, there is another factor to consider: what resources are actually available. For example, a request 705 for 10 MB of memory 305 of FIG. 3 may be within the user's resource limits. But if memory 305 of FIG. 3 has already been fully (or nearly fully) allocated, 10 MB of memory 305 of FIG. 3 may not be available to allocate to the user. In that case, block 710 may return error 720, not because request 705 would exceed the user's resource limits, but because the requested resources are not available. Thus, block 710 may also take into account what resources are available to allocate to the user: that is, what resources are not currently allocated to any other user.
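The block-710 check can be sketched as a single predicate over the request, the user's current allocation, the user's limits, and the device's free capacity. This is a minimal illustrative sketch; the resource names and numbers are assumptions echoing the "MBrown" example, not the disclosure's actual interface:

```python
# Hedged sketch of the block-710 admission check: a request is granted only
# if (already allocated + requested) stays within the user's SLA limits AND
# the device still has the requested resources free.

def check_request(requested, allocated, limits, available):
    """Return True if every requested resource fits both quota and capacity."""
    for resource, amount in requested.items():
        if allocated.get(resource, 0) + amount > limits.get(resource, 0):
            return False        # would exceed the user's resource limit
        if amount > available.get(resource, 0):
            return False        # device does not have the resource free
    return True

limits    = {"memory_mb": 512, "execution_engines": 3}   # user's SLA limits
allocated = {"memory_mb": 12,  "execution_engines": 1}   # cf. user "MBrown"
available = {"memory_mb": 100, "execution_engines": 4}   # free on the device

print(check_request({"memory_mb": 10}, allocated, limits, available))         # True
print(check_request({"execution_engines": 3}, allocated, limits, available))  # False
```

An all-or-nothing embodiment would reject the whole request on the first failing resource, as above; a partial-allocation embodiment would instead grant each resource that passes both tests individually.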
Request 705 may also free resources previously allocated to the user (deallocate resources previously allocated to the user). By returning the resources to the computing storage unit 135 of fig. 1, the resources may be allocated to another user. When resources are being deallocated, there is no need to check whether the user is exceeding his or her resource limit: the user is decreasing rather than increasing his or her resource allocation.
Request 705 may also mix requests for new resources with deallocation of existing resources. In such cases, some embodiments of the disclosure may deallocate the resources first and allocate the new resources afterward, which may reduce the likelihood that the user would exceed his or her resource limits. In other disclosed embodiments, however, computing storage unit 135 of FIG. 1 may allocate the new resources first and deallocate the existing resources later.
By checking whether request 705 would cause the user to exceed his or her resource limits, a given user may be prevented from consuming too many of the resources of computing storage unit 135 of FIG. 1. By preventing users from consuming excessive resources, a malicious user may be prevented from denying other users the use of computing storage unit 135 of FIG. 1 (as in a denial-of-service attack).
FIG. 8 depicts a high-level flow chart of how the compute storage unit 135 of FIG. 1 reclaims resources in accordance with a disclosed embodiment. In fig. 8, when a user issues a request 605 to open a session, the request may trigger reclamation unit 335 of fig. 3. As shown in block 805, when reclamation unit 335 of FIG. 3 is triggered, reclamation unit 335 of FIG. 3 may schedule resource reclamation. When the scheduled time for resource reclamation is reached, the reclamation unit may attempt to reclaim the resources at block 810. Finally, at block 815, reclamation unit 335 of fig. 3 may determine whether any open sessions still exist. If there is still an open session, reclamation unit 335 of FIG. 3 may schedule another attempted reclamation of resources; otherwise (since no resources should be in use if there is no active session), the reclamation unit 335 of fig. 3 may (at least temporarily) end its processing. In this manner, reclamation unit 335 of FIG. 3 may periodically check to see what resources may be reclaimed. Any period may be used to schedule reclamation unit 335 of fig. 3: for example every 30 seconds.
The question may arise as to why resource reclamation is necessary. In general, resource reclamation should not be necessary. For example, when a user (or an application running on behalf of the user) finishes using computing storage unit 135 of FIG. 1, it should deallocate (release) its resources back to computing storage unit 135 of FIG. 1 so that the resources can be used by other applications. But this assumes that the application is well behaved and properly returns its resources. That assumption is not always correct: some applications may fail to return resources. The application may also terminate unexpectedly: an error may have terminated the application, or the user may have killed the application's process. In such cases, the application may be unable to deallocate its resources back to computing storage unit 135 of FIG. 1.
It is for these reasons that reclamation unit 335 of FIG. 3 may be present. Reclamation unit 335 of FIG. 3 may use the process identifier (or user identifier) of the application to check whether that process identifier (or user identifier) is still in use by the operating system. If the operating system indicates that the process identifier or user identifier is no longer in use, then no application/user can still be actively using the resources. The resources may then be reclaimed by computing storage unit 135 of FIG. 1 and allocated to other users.
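A reclamation pass of this kind can be sketched as a scan over session contexts that probes whether each owning process still exists. This is a hedged user-space analogue, not the disclosure's kernel implementation; it relies on the POSIX convention that `kill(pid, 0)` checks process existence without sending a signal:

```python
# Hedged sketch of the reclamation worker of FIGS. 3 and 8: scan session
# contexts and reclaim those whose owning process has died.
import os

def pid_alive(pid):
    """Probe whether a process with this PID currently exists."""
    try:
        os.kill(pid, 0)          # signal 0: existence check only, no signal sent
        return True
    except ProcessLookupError:
        return False             # no such process: its resources are reclaimable
    except PermissionError:
        return True              # process exists but belongs to another user

def reclaim(sessions):
    """Remove sessions whose process is gone; return the reclaimed sessions."""
    dead = [s for s in sessions if not pid_alive(s["pid"])]
    for s in dead:
        sessions.remove(s)       # resources of s would be returned to the pool here
    return dead
```

In the scheme of FIG. 8, `reclaim` would be invoked on a timer (for example, every 30 seconds) for as long as any session context remains open.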
FIG. 9 illustrates a flowchart of an example process for computing storage unit 135 of FIG. 1 to process request 605 of FIG. 6 from a user, in accordance with a disclosed embodiment. In FIG. 9, at block 905, computing storage unit 135 of FIG. 1 may receive request 605 of FIG. 6 from a user (or an application running for the user). Request 605 of FIG. 6 may identify the resources of computing storage unit 135 of FIG. 1 that the user wants to use. The resources may be of any type, such as memory 305, program slots 315, or execution engines 320 (and, of course, request 605 may request more than one instance of a particular resource type and/or may request multiple different resource types). At block 910, computing storage unit 135 of FIG. 1 may determine the maximum number of resources of the requested type (such as the maximum allocated memory size 430 of FIG. 4, the number of program slots 435 of FIG. 4, and/or the number of execution engines 440 of FIG. 4) that may be allocated to the user, given the user's SLA level in tables 405 and 410 of FIG. 4. Finally, at block 915, computing storage unit 135 of FIG. 1 may limit the user to no more than that maximum number of resources.
As discussed above, as the user's session with computing storage unit 135 of FIG. 1 continues, the user may issue subsequent requests for additional resources of computing storage unit 135 of FIG. 1 (such as request 705 of FIG. 7). In such cases, computing storage unit 135 of FIG. 1 may revisit whether the user is entitled to the additional resources, aggregating request 605 of FIG. 6 and request 705 of FIG. 7 for the purpose of determining whether the user may be granted the resources requested in request 705 of FIG. 7. How computing storage unit 135 of FIG. 1 performs this determination is discussed below with reference to FIG. 10.
Fig. 10 illustrates a flowchart of an example process by which the computing storage unit 135 of fig. 1 limits resources allocated to a user, in accordance with a disclosed embodiment. Fig. 10 details the process performed in block 915 of fig. 9.
In FIG. 10, at block 1005, the computing storage unit 135 of FIG. 1 may determine the resources requested by the user in the request 605 of FIG. 6 and/or the request 705 of FIG. 7. At block 1010, the calculation storage unit 135 of fig. 1 may determine what resources, as stored in the tracking table 330 of fig. 3, have been allocated to the user. If the request being processed is the request 605 of FIG. 6, then block 1010 may be expected to determine that resources have not been allocated to the user (because the tracking table 330 of FIG. 3 should not indicate that any resources were allocated to the user before the user has opened the session). At block 1015, the computing storage unit 135 of FIG. 1 may determine the resource constraints applicable to the user as stored in tables 405 and 410 of FIG. 4. At block 1020, the computing storage unit 135 of fig. 1 may compare the resource limit as determined in block 1015 to the requested and previously allocated resources as determined in blocks 1005 and 1010. If allocation of the resources requested in request 605 of FIG. 6 and/or request 705 of FIG. 7 would result in exceeding the user's resource limit, then at block 1025, computing storage unit 135 of FIG. 1 may return error 720 of FIG. 7; otherwise, at block 1030, the computing storage unit 135 of fig. 1 may allocate additional resources to the user (and the tracking table 330 of fig. 3 may be updated as discussed further below with reference to fig. 12).
FIG. 11 illustrates a flowchart of an example process for computing storage unit 135 of FIG. 1 to determine a user's resource limitations, in accordance with a disclosed embodiment. Fig. 11 details the process performed in block 910 of fig. 9.
In FIG. 11, at block 1105, computing storage unit 135 of FIG. 1 may identify the user's entry 425 of FIG. 4 in table 405 of FIG. 4. From that entry 425 of FIG. 4, computing storage unit 135 of FIG. 1 may determine the user's SLA identifier 420 of FIG. 4. Then, at block 1110, computing storage unit 135 of FIG. 1 may use SLA identifier 420 of FIG. 4 to locate the entry 445 of FIG. 4 for that SLA identifier 420 of FIG. 4; the maximum allocated memory size 430 of FIG. 4, the number of program slots 435 of FIG. 4, and/or the number of execution engines 440 of FIG. 4 may then be determined from that entry 445 of FIG. 4 as the user's resource limits.
FIG. 12 illustrates a flowchart of an example process for computing storage unit 135 of FIG. 1 to track resources allocated to a user, in accordance with a disclosed embodiment. In fig. 12, at block 1205, the computing storage unit 135 of fig. 1 may update the tracking table 330 of fig. 3 based on the resources requested in the request 605 of fig. 6 (and assuming that the user did not request more resources than were allowed in the request 605 of fig. 6). At block 1210, the computing storage unit 135 of FIG. 1 may receive the request 705 of FIG. 7 for a user requesting additional resources of the computing storage unit 135 of FIG. 1. At block 1215, the computing storage unit 135 of fig. 1 may again update the tracking table 330 of fig. 3 based on the resources requested in the request 705 of fig. 7 (and assuming that the user would not exceed his or her resource limits in aggregating the request 605 of fig. 6 and the request 705 of fig. 7).
As can be seen by the dashed arrows 1220 and 1225, either of blocks 1210 and 1215 may be omitted or may be repeated as desired.
While the above description focuses on request 705 of FIG. 7 as requesting additional resources, request 705 of FIG. 7 may also release resources. For example, a user may need two execution engines for processing data and a region of memory for storing the results. Once the results are complete, the user may need to perform further processing, but may need only one execution engine to do so. Request 705 of FIG. 7 may then release the execution engine that is no longer needed so that another user may use it. If request 705 deallocates resources, computing storage unit 135 of FIG. 1 may update tracking table 330 of FIG. 3 to reduce the resources allocated to the user. Request 705 may also mix the deallocation of one resource with a request to allocate a new resource.
FIG. 13 illustrates a flowchart of an example process for computing storage unit 135 of FIG. 1 to reclaim resources, in accordance with a disclosed embodiment. In fig. 13, at block 1305, the computing storage unit 135 of fig. 1 may determine that the user (or at least the application/process requesting the resources of the computing storage unit 135 of fig. 1) is no longer active. At block 1310, the reclamation unit 335 of FIG. 3 may then reclaim the resources allocated to the user that are no longer being used (because the user or process is no longer active).
Fig. 14 shows a flowchart of an example process for defining or updating the table of fig. 4 for the compute storage unit 135 of fig. 1, in accordance with a disclosed embodiment. In fig. 14, at block 1405, the calculation storage unit 135 of fig. 1 may receive a request to define the table 325 of fig. 3 (i.e., to initialize or add to the user entries 425 and/or 445 of fig. 4 indicating how many of each type of resource may be allocated to the user) or to update the table 325 of fig. 3 (i.e., to change the number of resources that may be allocated to the user in entries 425 and/or 445 of fig. 4). At block 1410, computing storage unit 135 of FIG. 1 may check to see if the request originated from an administrator of the system (e.g., a "root" user account). If so, at block 1415, the table 325 of FIG. 3 may be defined or updated upon request; otherwise, at block 1420, the request may be denied.
FIGS. 9-14 show some embodiments of the disclosure. Those skilled in the art will recognize that other embodiments of the disclosure are possible, by changing the order of the blocks, by omitting blocks, or by including links not shown in the drawings. All such variations of the flowcharts, whether or not explicitly described, are considered disclosed embodiments.
The disclosed embodiments may include a computing storage unit that may limit the resources allocated to a user. A user identifier may be mapped to resource limits (which may be achieved by mapping the user identifier to a Service Level Agreement (SLA) identifier, which in turn may be mapped to the resource limits). The computing storage unit may then allocate resources up to the user's resource limits and return an error if the user attempts to request resources that would exceed those limits. The disclosed embodiments provide a technical advantage by preventing a single user (or a small group of users) from claiming all available resources and denying other users the use of the computing storage unit.
Computational storage devices (CSDs) are designed to offload processing from the host central processing unit (CPU) to the storage device. But while a generic interface (the computational storage command set) may be defined, there is a lack of consideration for monopolization attacks that can preempt the available computational storage (CS) resources and result in denial of CS service to other applications.
In some cases, an attacker may obtain all available CS resources of a CSD through a malicious CS application. As a result, ordinary users may be unable to allocate CS resources of the CSD and thus may be unable to use computational storage (CS) services.
The disclosed embodiments may address this problem by providing a way to set a quota for each CS resource, identify a relationship between the CS application and the user, track and manage CS resource usage based on the user, and reclaim unreleased CS resources from dead/killed CS applications.
SLA (service level agreement) and CS resource quota
To address monopolization attacks, the disclosed embodiments may set a CS resource quota for each resource. The CSD may expose various types of CS resources, such as memory regions (which may be used as input/output data buffers or temporary buffers for computation), program slots (which may be used to hold downloaded or fixed program code), and execution engines (which may be used to run downloaded/fixed programs to offload the host's CPU overhead).
The disclosed embodiments may support different options for defining quotas for CS resources. For example, a predetermined fixed amount may be used as the quota for each application. But this approach may not support CS applications that require more CS resources than the quota allows. Another approach is to set resource quotas based on Service Level Agreement (SLA) levels. A CS resource quota for each SLA level may be defined between the CS service provider and the user, and the CS service provider may assign an SLA level to each user based on its charging policy. It is then the responsibility of the service provider to guarantee the user the predefined CS resources of the system, and the responsibility of the user to estimate the required CS resources and to select, from the CS service provider, the SLA level that provides the estimated required CS resources.
The disclosed embodiments may include an interface for setting an SLA quota description table, which may describe the quota for each CS resource type at each SLA level, and an SLA-user mapping table, which may describe the SLA level value for each user. Users not otherwise described in the table may use a default SLA level (such as SLA level 0).
To protect the table from non-privileged users, the disclosed embodiments may check that the process context belongs to the root (administrator) and only allow access by the root (administrator).
Session (resource) context and user
The disclosed embodiments may provide a way to identify tenants (applications) using a Process ID (PID) of a process context (task_struct) and use the PID to isolate CS resources and grant access to CS resources. Internally, the session (resource) context may track CS resources allocated by the CS application.
To track CS resource usage on a per-user basis, the disclosed embodiments may use the login user ID (UID) from the process context (task_struct) and search for, or create, a user structure that may hold the user's current CS resource usage. As mentioned previously, the disclosed embodiments may extract the UID from the process context and search for a user object with the given UID. If the UID is not found, the disclosed embodiments may create a user structure with the UID and set its service level according to the SLA-user mapping table. If the UID is not described in the SLA-user mapping table, the disclosed embodiments may set the default level (0) for the user. A pointer to the user structure may then be set within the session context.
Tracking CS resource usage and preventing monopoly attacks
For a CS request that allocates or releases the CS resources of a CS application, the disclosed embodiments may find the process context (task_struct) by its PID, together with the user's session context, and may then check whether the user has available resources by comparing the user's CS resource usage against the CS resource quota for the user's SLA level in the SLA quota description table. If the user would exceed his/her CS resource quota, the disclosed embodiments may respond to the request with a failure. If the user has sufficient headroom within the CS resource quota, the disclosed embodiments may allocate the resources, track them in the session context, and then update the user's CS resource usage.
Reclamation of CS resources of dead/killed CS applications
In general, a CS application may close its session, and the disclosed embodiments may then clear the CS resources allocated by the CS application and reduce the CS resource usage in the user structure. But a CS application may terminate without the usual cleanup call: for example, the CS application may be forcibly killed, or a bug may cause a catastrophic failure that leads the system to kill the CS application. In such cases, the session context and CS resources would remain dangling. To address this issue, the disclosed embodiments may include a reclamation worker that may be scheduled when the first new session context is created and may be periodically rescheduled until all session contexts have been removed from the system.
When the reclamation worker is triggered, it may loop through the session contexts and identify dead/killed CS applications by searching for a process context (task_struct) with the PID of each session context. If the reclamation worker finds a dead/killed CS application, it may reclaim the CS resources of that session context and then update the CS resource usage in the user structure.
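The reclamation pass might look like the following sketch, where `live_pids` stands in for the task_struct/PID lookup and all names are assumptions:

```python
# Sketch of a periodic reclamation pass: a session whose PID no longer maps
# to a live process is treated as dead/killed and its resources reclaimed.
# "live_pids" stands in for the task_struct lookup; all names are assumptions.
def reclaim_dead_sessions(sessions, live_pids, usage_by_uid):
    remaining = []
    for session in sessions:
        if session["pid"] in live_pids:
            remaining.append(session)  # application still alive: keep session
        else:
            # Dead/killed application: return its allocation to the user count.
            usage_by_uid[session["uid"]] -= session["allocated"]
    return remaining  # surviving sessions; worker reschedules while non-empty
```

Returning the surviving sessions mirrors the scheduling rule above: the worker keeps rescheduling itself as long as the list is non-empty.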
The CS resources of an application may thus be isolated by PID and managed with the session context, while CS resource usage is tracked through the user structure. The quota of CS resources may be set by updating the SLA quota description table, and the SLA level of a user may be set by an administrator (root) by updating the SLA-user mapping table. The reclamation worker may be scheduled while sessions remain open and may find dead/killed CS applications and reclaim their CS resources.
The following discussion is intended to provide a brief, general description of a suitable machine or machines in which the particular aspects disclosed may be implemented. One or more machines may be controlled, at least in part, by input from conventional input devices, such as a keyboard, mouse, etc., as well as by instructions received from another machine, interactions with a Virtual Reality (VR) environment, biometric feedback, or other input signals. As used herein, the term "machine" is intended to broadly encompass a single machine, a virtual machine, or a system of communicatively coupled machines, multiple virtual machines, or devices operating together. Exemplary machines include computing devices (such as personal computers, workstations, servers, portable computers, hand-held devices, telephones, tablets, etc.) and transportation devices (such as private or public transportation (e.g., automobiles, trains, taxis, etc.)).
One or more machines may include an embedded controller (such as a programmable or non-programmable logic device or array, an Application Specific Integrated Circuit (ASIC), an embedded computer, a smart card, etc.). One or more machines may utilize one or more connections to one or more remote machines (such as through a network interface, modem, or other communication coupling). The machines may be interconnected by physical and/or logical networks, such as an intranet, the internet, a local area network, a wide area network, etc. Those skilled in the art will appreciate that network communication may utilize a variety of wired and/or wireless short-range or long-range carriers and protocols, including Radio Frequency (RF), satellite, microwave, Institute of Electrical and Electronics Engineers (IEEE) 802.11, light, infrared, cable, laser, etc.
The disclosed embodiments may be described by reference to or in conjunction with associated data including functions, procedures, data structures, applications, etc., which when accessed by a machine results in the machine performing tasks or defining abstract data types or low-level hardware contexts. The associated data may be stored in, for example, volatile and/or nonvolatile memory (e.g., RAM, ROM, etc.), or in other storage devices and their associated storage media (including hard disk drives, floppy disks, optical storage, magnetic tape, flash memory, memory sticks, digital video disks, biological storage, etc.). The associated data may be transmitted over a transmission environment (including physical and/or logical networks) in the form of packets, serial data, parallel data, propagated signals, etc., and may be used in a compressed or encrypted format. The associated data may be used in a distributed environment and stored locally and/or remotely for machine access.
The disclosed embodiments may include a tangible, non-transitory machine-readable medium comprising instructions executable by one or more processors, the instructions comprising instructions for executing the disclosed elements as described herein.
The various operations of the methods described above may be performed by any suitable device capable of performing the operations, such as various hardware and/or one or more software components, circuits, and/or one or more modules. The software may comprise an ordered listing of executable instructions for implementing logical functions, and can be embodied in any "processor-readable medium" for use by or in connection with an instruction execution system, apparatus, or device, such as a single-core or multi-core processor or a system that includes a processor.
The blocks or steps of a method or algorithm and the functions described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. If implemented in software, the functions may be stored on, or transmitted as one or more instructions or code over, a tangible, non-transitory computer-readable medium. A software module may reside in Random Access Memory (RAM), flash memory, Read-Only Memory (ROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Having described and illustrated the principles of the disclosure with reference to the illustrated embodiments, it will be recognized that the illustrated embodiments can be modified in arrangement and detail without departing from such principles, and can be combined in any desired manner. Moreover, while the foregoing discussion has focused on particular embodiments, other configurations are contemplated. In particular, even though expressions such as "in accordance with the disclosed embodiments" and the like are used herein, these phrases are generally intended to refer to embodiment possibilities and are not intended to limit the disclosure to particular embodiment configurations. As used herein, these terms may refer to the same or different embodiments that are combinable into other embodiments.
The foregoing illustrative embodiments should not be construed as limiting the disclosure thereof. Although a few embodiments have been described, those skilled in the art will readily appreciate that many modifications are possible in the embodiments without materially departing from the novel teachings and advantages of this disclosure. Accordingly, all such modifications are intended to be included within the scope of this disclosure as defined in the claims.
The disclosed embodiments extend to the following statements, without limitation:
Statement 1: the disclosed embodiments include a computing storage unit comprising:
a first resource of a first type;
a second resource of the first type; and
a table mapping a User Identifier (UID) of a user to a number of resources of the first type.
Statement 2: The disclosed embodiments include a computing storage unit according to statement 1, wherein the computing storage unit is configured to limit a user to the number of resources of the first type.
Statement 3: The disclosed embodiments include a computing storage unit according to statement 1, wherein the resource is an execution engine, a program slot, or a memory region.
Statement 4: The disclosed embodiments include a computing storage unit according to statement 1, wherein:
the first resource is a first execution engine;
the second resource is a second execution engine;
the computing storage unit further includes:
a first program slot;
a second program slot; and
a memory; and
the table is configured to map the UID of the user to a number of execution engines, a second number of program slots, and a size of an area of the memory.
Statement 5: The disclosed embodiments include the computing storage unit according to statement 1, further comprising a device driver.
Statement 6: The disclosed embodiments include a computing storage unit according to statement 5, wherein the device driver includes the table.
Statement 7: The disclosed embodiments include a computing storage unit according to statement 1, wherein the table includes:
a first table mapping the UID of the user to a Service Level Agreement (SLA) identifier (SLA ID); and
a second table mapping the SLA ID to the number of resources of the first type.
Statement 8: The disclosed embodiments include a computing storage unit according to statement 1, further comprising a session context indicating that the first resource is used by a session of the user.
Statement 9: The disclosed embodiments include a computing storage unit according to statement 8, further comprising a second table mapping the UID to a number of resources of the first type used in the computing storage unit, based at least in part on the session context.
Statement 10: The disclosed embodiments include a computing storage unit according to statement 9, wherein the computing storage unit is configured to limit the number of resources of the first type used to no more than the number of resources of the first type.
Statement 11: The disclosed embodiments include a computing storage unit according to statement 1, wherein the computing storage unit is configured to add an entry to the table mapping the UID to the number of resources of the first type, based at least in part on receiving a request from the user to access the computing storage unit.
Statement 12: The disclosed embodiments include a computing storage unit according to statement 1, further comprising a reclamation unit to reclaim the resource based at least in part on the user being inactive.
Statement 13: The disclosed embodiments include a computing storage unit according to statement 12, wherein the reclamation unit is configured to execute periodically.
Statement 14: the disclosed embodiments include a method comprising:
Receiving, at a computing storage unit, a request from a user to use the computing storage unit, the request identifying a first type of resource of the computing storage unit;
determining the number of resources of the first type that the user should be able to access; and
limiting the user to no more than the number of resources of the first type in the computing storage unit.
Statement 15: The disclosed embodiments include a method according to statement 14, wherein the resource is an execution engine, a program slot, or a memory region.
Statement 16: The disclosed embodiments include a method according to statement 14, wherein:
the first type of resource is a first execution engine;
the computing storage unit includes a second execution engine, a first program slot, a second program slot, and a memory; and
the step of limiting the user to no more than the number of resources of the first type in the computing storage unit comprises: limiting the user to no more than a first number of execution engines, a second number of program slots, and a size of a region of the memory.
Statement 17: The disclosed embodiments include a method according to statement 14, wherein the step of determining the number of resources of the first type that the user should be able to access comprises: accessing, from a table, the number of resources of the first type that the user should be able to access.
Statement 18: The disclosed embodiments include a method according to statement 17, wherein the step of accessing from the table the number of resources of the first type that the user should be able to access comprises:
accessing, from the table, a Service Level Agreement (SLA) identifier (SLA ID) associated with a User Identifier (UID) of the user; and
accessing, from a second table, the number of resources of the first type associated with the SLA ID.
Statement 19: The disclosed embodiments include the method according to statement 17, further comprising: initializing the table with the number of resources of the first type that the user should be able to access.
Statement 20: The disclosed embodiments include a method according to statement 14, wherein initializing the table comprises: initializing the table based at least in part on a second request from an administrator.
Statement 21: The disclosed embodiments include a method according to statement 14, wherein limiting the user to no more than the number of resources of the first type in the computing storage unit comprises:
determining a second number of resources of the first type requested in the request;
determining a third number of resources of the first type used; and
comparing the second number of resources of the first type and the third number of resources of the first type used with the number of resources of the first type.
Statement 22: The disclosed embodiments include a method according to statement 21, wherein the step of limiting the user to no more than the number of resources of the first type in the computing storage unit further comprises: allocating the second number of resources of the first type to the user based at least in part on the second number of resources of the first type and the third number of resources of the first type used not exceeding the number of resources of the first type.
Statement 23: The disclosed embodiments include a method according to statement 21, wherein the step of limiting the user to no more than the number of resources of the first type in the computing storage unit further comprises: reporting an error based at least in part on the second number of resources of the first type and the third number of resources of the first type used exceeding the number of resources of the first type.
Statement 24: The disclosed embodiments include the method according to statement 14, further comprising: updating a table of resources used by the user based at least in part on the request.
Statement 25: The disclosed embodiments include the method according to statement 24, further comprising:
receiving, at the computing storage unit, a second request from the user to use the computing storage unit; and
updating the table of the resources used by the user based at least in part on the second request.
Statement 26: The disclosed embodiments include a method according to statement 25, wherein updating the table of resources used by the user based at least in part on the second request comprises:
determining a second number of resources of the first type requested in the second request;
determining a third number of resources of the first type used; and
comparing the second number of resources of the first type and the third number of resources of the first type used with the number of resources of the first type.
Statement 27: The disclosed embodiments include a method according to statement 26, wherein updating the table of resources used by the user based at least in part on the second request further comprises: allocating the second number of resources of the first type to the user based at least in part on the second number of resources of the first type and the third number of resources of the first type used not exceeding the number of resources of the first type.
Statement 28: The disclosed embodiments include a method according to statement 26, wherein updating the table of resources used by the user based at least in part on the second request comprises: reporting an error based at least in part on the second number of resources of the first type and the third number of resources of the first type used exceeding the number of resources of the first type.
Statement 29: The disclosed embodiments include the method according to statement 14, further comprising: determining that the user is inactive.
Statement 30: The disclosed embodiments include the method according to statement 29, further comprising: reclaiming the resource based at least in part on the user being inactive.
Statement 31: The disclosed embodiments include a method according to statement 29, wherein the step of determining that the user is inactive comprises: periodically determining that the user is inactive.
Statement 32: The disclosed embodiments include a method according to statement 14, wherein:
the step of receiving, at the computing storage unit, a request from the user to use the computing storage unit comprises: receiving, at the computing storage unit, a request from a process of the user to use the computing storage unit; and
the method further comprises:
determining that the process is inactive; and
reclaiming the resources based at least in part on the process being inactive.
Statement 33: The disclosed embodiments include a method according to statement 32, wherein the step of determining that the process is inactive comprises: periodically determining that the process is inactive.
Statement 34: The disclosed embodiments include a method according to statement 14, wherein limiting the user to no more than the number of resources of the first type in the computing storage unit comprises: reporting an error.
Statement 35: The disclosed embodiments include an article of manufacture comprising a non-transitory storage medium having instructions stored thereon that, when executed by a machine, result in:
Receiving, at a computing storage unit, a request from a user to use the computing storage unit, the request identifying a first type of resource of the computing storage unit;
determining the number of resources of the first type that the user should be able to access; and
The user is limited to no more than the number of resources of the first type in the computational storage unit.
Statement 36: The disclosed embodiments include an article of manufacture according to statement 35, wherein the resource is an execution engine, a program slot, or a memory region.
Statement 37: The disclosed embodiments include an article of manufacture according to statement 35, wherein:
the first type of resource is a first execution engine;
the computing storage unit includes a second execution engine, a first program slot, a second program slot, and a memory; and
the step of limiting the user to no more than the number of resources of the first type in the computing storage unit comprises: limiting the user to no more than a first number of execution engines, a second number of program slots, and a size of a region of the memory.
Statement 38: The disclosed embodiments include an article of manufacture according to statement 35, wherein the step of determining the number of resources of the first type that the user should be able to access comprises: accessing, from a table, the number of resources of the first type that the user should be able to access.
Statement 39: The disclosed embodiments include an article of manufacture according to statement 38, wherein the step of accessing from the table the number of resources of the first type that the user should be able to access comprises:
accessing, from the table, a Service Level Agreement (SLA) identifier (SLA ID) associated with a User Identifier (UID) of the user; and
accessing, from a second table, the number of resources of the first type associated with the SLA ID.
Statement 40: The disclosed embodiments include the article of manufacture according to statement 38, the non-transitory storage medium having stored thereon further instructions that, when executed by the machine, result in initializing the table with the number of resources of the first type that the user should be able to access.
Statement 41: The disclosed embodiments include an article of manufacture according to statement 35, wherein initializing the table comprises: initializing the table based at least in part on a second request from an administrator.
Statement 42: The disclosed embodiments include an article of manufacture according to statement 35, wherein limiting the user to no more than the number of resources of the first type in the computing storage unit comprises:
determining a second number of resources of the first type requested in the request;
determining a third number of resources of the first type used; and
comparing the second number of resources of the first type and the third number of resources of the first type used with the number of resources of the first type.
Statement 43: The disclosed embodiments include an article of manufacture according to statement 42, wherein the step of limiting the user to no more than the number of resources of the first type in the computing storage unit further comprises: allocating the second number of resources of the first type to the user based at least in part on the second number of resources of the first type and the third number of resources of the first type used not exceeding the number of resources of the first type.
Statement 44: The disclosed embodiments include an article of manufacture according to statement 42, wherein the step of limiting the user to no more than the number of resources of the first type in the computing storage unit further comprises: reporting an error based at least in part on the second number of resources of the first type and the third number of resources of the first type used exceeding the number of resources of the first type.
Statement 45: The disclosed embodiments include the article of manufacture according to statement 35, the non-transitory storage medium having stored thereon further instructions that, when executed by the machine, result in updating a table of resources used by the user based at least in part on the request.
Statement 46: The disclosed embodiments include the article of manufacture according to statement 45, the non-transitory storage medium having stored thereon further instructions that, when executed by the machine, result in:
receiving, at the computing storage unit, a second request from the user to use the computing storage unit; and
updating the table of resources used by the user based at least in part on the second request.
Statement 47: The disclosed embodiments include an article of manufacture according to statement 46, wherein the step of updating the table of resources used by the user based at least in part on the second request comprises:
determining a second number of resources of the first type requested in the second request;
determining a third number of resources of the first type used; and
comparing the second number of resources of the first type and the third number of resources of the first type used with the number of resources of the first type.
Statement 48: The disclosed embodiments include an article of manufacture according to statement 47, wherein the step of updating the table of resources used by the user based at least in part on the second request further comprises: allocating the second number of resources of the first type to the user based at least in part on the second number of resources of the first type and the third number of resources of the first type used not exceeding the number of resources of the first type.
Statement 49: The disclosed embodiments include an article of manufacture according to statement 47, wherein the step of updating the table of resources used by the user based at least in part on the second request comprises: reporting an error based at least in part on the second number of resources of the first type and the third number of resources of the first type used exceeding the number of resources of the first type.
Statement 50: The disclosed embodiments include the article of manufacture according to statement 35, the non-transitory storage medium having stored thereon further instructions that, when executed by the machine, result in determining that the user is inactive.
Statement 51: The disclosed embodiments include the article of manufacture according to statement 50, the non-transitory storage medium having stored thereon further instructions that, when executed by the machine, result in reclaiming the resource based at least in part on the user being inactive.
Statement 52: The disclosed embodiments include an article of manufacture according to statement 50, wherein the step of determining that the user is inactive comprises: periodically determining that the user is inactive.
Statement 53: The disclosed embodiments include an article of manufacture according to statement 35, wherein:
the step of receiving, at the computing storage unit, a request from the user to use the computing storage unit comprises: receiving, at the computing storage unit, a request from a process of the user to use the computing storage unit; and
the non-transitory storage medium has stored thereon further instructions that, when executed by the machine, result in:
determining that the process is inactive; and
reclaiming the resources based at least in part on the process being inactive.
Statement 54: The disclosed embodiments include an article of manufacture according to statement 53, wherein the step of determining that the process is inactive comprises: periodically determining that the process is inactive.
Statement 55: The disclosed embodiments include an article of manufacture according to statement 35, wherein limiting the user to no more than the number of resources of the first type in the computing storage unit comprises: reporting an error.
In view of the wide variety of arrangements of the embodiments described herein, such detailed description and accompanying material are intended to be illustrative only, and should not be taken as limiting the scope of the disclosure. What is claimed as the disclosure, therefore, is all such modifications as may come within the scope and spirit of the following claims and equivalents thereto.
Claims (20)
1. A computing storage unit comprising:
a first resource of a first type;
A second resource of the first type;
a table mapping a user identifier UID of the user to a number of resources of the first type.
2. The computing storage unit of claim 1, wherein the table is configured to limit a user device of the user to the number of resources of the first type.
3. The computing storage unit of claim 1, wherein the table comprises:
A first table mapping the UID of the user to a service level agreement SLA identifier SLA ID; and
A second table mapping SLA IDs to the number of resources of the first type.
4. The computing storage unit of claim 1, further comprising a session context indicating that the first resource is used by a session of a user device of the user.
5. The computing storage unit of claim 4, further comprising a second table mapping the UID to a number of resources of the first type used in the computing storage unit, based at least in part on the session context.
6. The computing storage unit of any one of claims 1 to 5, further comprising: a reclamation unit to reclaim the resources based at least in part on an activity state associated with the user device of the user.
7. A method of operating a computing storage unit, comprising:
Receiving, at a computing storage unit, a request from a user device of a user to use the computing storage unit, the request identifying a first type of resource of the computing storage unit;
Determining a number of resources of a first type that are accessible to the user device; and
limiting the user device to the number of resources of the first type in the computing storage unit.
8. The method of claim 7, wherein determining the number of resources of the first type that are accessible to the user device comprises: the number of resources of the first type that the user device is able to access is accessed from the table.
9. The method of claim 8, wherein accessing from the table the number of resources of the first type that are accessible to the user device comprises:
Accessing from the table a service level agreement SLA identifier SLA ID associated with a user identifier UID of the user; and
The number of resources of the first type associated with the SLA ID is accessed from the second table.
10. The method of claim 8, further comprising: the table is initialized with the number of resources of the first type that are accessible to the user device.
11. The method of claim 7, wherein limiting the user device to the number of resources of the first type in the computing storage unit comprises:
Determining a second number of resources of the first type requested in the request;
determining a third number of resources of the first type used; and
The second number of resources of the first type and the third number of resources of the first type used are compared with the number of resources of the first type.
12. The method of claim 7, further comprising: a table of resources used by the user device is updated based at least in part on the request.
13. The method of claim 12, further comprising:
receiving, at the computing storage unit, a second request from the user device to use the computing storage unit; and
A table of resources used by the user device is updated based at least in part on the second request.
14. The method of claim 13, wherein updating the table of resources used by the user device based at least in part on the second request comprises:
determining a second number of resources of the first type requested in the second request;
Determining a third number of resources of the first type used; and
The second number of resources of the first type and the third number of resources of the first type used are compared with the number of resources of the first type.
15. The method of any one of claims 7 to 14, further comprising: an activity state associated with the user device is determined.
16. The method of claim 15, further comprising: reclaiming resources based at least in part on the activity state associated with the user device.
17. The method of claim 15, wherein determining an activity state associated with the user device comprises: an activity state associated with the user device is periodically determined.
18. The method of claim 7, wherein:
The step of receiving a request from a user device of a user at a computing storage unit to use the computing storage unit comprises: receiving, at the computing storage unit, a request from a process of the user device to use the computing storage unit; and
The method further comprises the steps of:
determining an activity state associated with the process; and
Resources are reclaimed based at least in part on the activity state associated with the process.
19. An article of manufacture comprising a non-transitory storage medium having instructions stored thereon that, when executed by a machine, result in:
Receiving, at a computing storage unit, a request from a user device of a user to use the computing storage unit, the request identifying a first type of resource of the computing storage unit;
Determining a number of resources of a first type that are accessible to the user device; and
limiting the user device to the number of resources of the first type in the computing storage unit.
20. The article of manufacture of claim 19, wherein determining the number of resources of the first type that are accessible to the user device comprises: the number of resources of the first type that the user device is able to access is accessed from the table.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US63/422,918 | 2022-11-04 | ||
US18/094,342 US20240152397A1 (en) | 2022-11-04 | 2023-01-06 | Computational storage resource quota management |
US18/094,342 | 2023-01-06 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117992211A true CN117992211A (en) | 2024-05-07 |
Family
ID=90895186
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311230559.9A Pending CN117992211A (en) | 2022-11-04 | 2023-09-22 | Calculation memory unit and method for operating a calculation memory unit |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117992211A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||