CN112131166A - Lightweight bridge circuit and method of operating the same - Google Patents


Info

Publication number
CN112131166A
Authority
CN
China
Prior art keywords: lwb, exposed, physical, endpoint, statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010582886.0A
Other languages
Chinese (zh)
Inventor
Ramdas P. Kachare
Oscar P. Pinto
Stephen Fischer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US16/846,271 external-priority patent/US11809799B2/en
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Publication of CN112131166A publication Critical patent/CN112131166A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4027 Coupling between buses using bus bridges
    • G06F13/4063 Device-to-bus coupling
    • G06F13/4068 Electrical coupling

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Information Transfer Systems (AREA)
  • Bus Control (AREA)

Abstract

Lightweight bridge circuits and methods of operating the same are disclosed. An endpoint of a lightweight bridge (LWB) may expose a plurality of Physical Functions (PFs) to a host. The root port of the LWB may be connected to a device and may determine the PFs and Virtual Functions (VFs) exposed by the device. An application layer-endpoint (APP-EP) and an application layer-root port (APP-RP) may translate between the PFs exposed by the endpoint and the PFs/VFs exposed by the device. The APP-EP and APP-RP may implement the mapping between the PFs exposed by the endpoint and the PFs/VFs exposed by the device.

Description

Lightweight bridge circuit and method of operating the same
This application claims the benefit of United States Provisional Patent Application No. 62/865,962, filed June 24, 2019, and United States Provisional Patent Application No. 62/964,114, filed January 21, 2020, both of which are incorporated herein by reference for all purposes.
Technical Field
The inventive concept relates generally to storage devices and, more particularly, to emulating a Peripheral Component Interconnect Express (PCIe) Virtual Function (VF) as a PCIe Physical Function (PF).
Background
Devices such as Peripheral Component Interconnect Express (PCIe) devices expose various functions that may be accessed by other components in the computer system. For example, a host processor may perform various operations within a Solid State Drive (SSD) using functions exposed by the SSD. These functions may operate on data stored on the SSD or on data provided by the host. Typically, the functions exposed by a device relate to the normal operation of the device, but such a limitation is not required: for example, while SSDs are traditionally used to store data, if an SSD includes a processor, that processor may be used to offload processing from the host processor.
The functions exposed by a device may be discovered by the host machine when the device is enumerated at startup (or, if hot installation of the device is supported, when it is installed). As part of discovery, the host machine may query the device for any exposed functions, which may then be added to the list of available functions for the device.
Functions are divided into two categories: physical and virtual. A Physical Function (PF) may be implemented using hardware within the device. The resources of a PF may be managed and configured independently of any other PF provided by the device. Virtual Functions (VFs) are lightweight functions that can be thought of as virtualized functions. Unlike a PF, a VF is typically associated with a particular PF and typically shares resources with its associated PF (and possibly with other VFs associated with that PF as well).
Because PFs are independent of each other, and because host access to VFs may require Single Root Input/Output Virtualization (SR-IOV), it is desirable for devices to expose PFs. But because PFs are stand-alone, they may require separate hardware, increasing the space required within the device and the power consumption of the device. The SR-IOV protocol may also impose complexity on the host system software stack, particularly in virtualization usage scenarios.
There remains a need to provide PFs for a device without imposing the cost requirements of PFs, and in a way that reduces system software complexity.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
An aspect of an embodiment relates to a lightweight bridge (LWB) circuit, comprising: an endpoint for connecting to a host, the endpoint exposing a plurality of Physical Functions (PFs); a root port for connecting to a device, the device exposing at least one PF and at least one Virtual Function (VF) to the root port; and an application layer-endpoint (APP-EP) and an application layer-root port (APP-RP) for translating between the plurality of PFs exposed to the host and the at least one PF and the at least one VF exposed by the device. The APP-EP and the APP-RP implement a mapping between the plurality of PFs exposed by the endpoint and the at least one PF and the at least one VF exposed by the device.
Another aspect of the embodiments relates to a method, comprising: enumerating at least one Physical Function (PF) exposed by a device using a root port of a lightweight bridge (LWB); enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB; generating a plurality of PFs at an endpoint of the LWB for exposure to a host; and mapping the plurality of PFs at the endpoint of the LWB to the at least one PF and the at least one VF exposed by the device using an application layer-endpoint (APP-EP) and an application layer-root port (APP-RP) of the LWB.
Yet another aspect of the embodiments relates to an article comprising a non-transitory storage medium having instructions stored thereon that, when executed by a machine, cause: enumerating at least one Physical Function (PF) exposed by a device using a root port of a lightweight bridge (LWB); enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB; generating a plurality of PFs at an endpoint of the LWB for exposure to a host; and mapping the plurality of PFs at the endpoint of the LWB to the at least one PF and the at least one VF exposed by the device using an application layer-endpoint (APP-EP) and an application layer-root port (APP-RP) of the LWB.
Other features and aspects will be apparent from the following detailed description, the accompanying drawings, and the claims.
Drawings
Fig. 1 illustrates a machine including a lightweight bridge (LWB) capable of emulating, using the Virtual Functions (VFs) of a Solid State Drive (SSD), the Physical Functions (PFs) exposed by the LWB, according to an embodiment of the inventive concept.
Fig. 2 shows additional detail of the machine of Fig. 1.
Fig. 3 shows details of the SSD of Fig. 1.
Figs. 4A-4C illustrate the LWB of Fig. 1 according to various embodiments of the inventive concept.
Fig. 5 shows details of the configuration manager of Figs. 4A-4C.
Fig. 6 shows a mapping between the PFs exposed by the LWB of Fig. 1 and the PFs/VFs exposed by the SSD of Fig. 1.
Fig. 7 illustrates the LWB of Fig. 1 processing a configuration write request from the host of Fig. 1.
Fig. 8 illustrates the LWB of Fig. 1 processing a configuration read request from the host of Fig. 1.
Fig. 9 illustrates the application layer-endpoint (APP-EP) and application layer-root port (APP-RP) of Figs. 4A-4C performing address translation within the LWB.
Fig. 10 shows the mapping of Fig. 6 being changed within the LWB of Fig. 1.
Fig. 11 illustrates the PFs exposed by the LWB of Fig. 1 having associated quality of service (QoS) policies.
Figs. 12A-12B illustrate the LWB of Fig. 1 performing bandwidth throttling.
Fig. 13 illustrates the LWB of Fig. 1 issuing credits to the SSD of Fig. 1.
Figs. 14A-14B illustrate a flowchart of an example process in which the LWB of Fig. 1 identifies the PFs/VFs exposed by the SSD of Fig. 1, exposes PFs from the LWB, and generates a mapping between the PFs exposed by the LWB and the PFs/VFs exposed by the SSD, according to an embodiment of the inventive concept.
Figs. 15A-15B illustrate a flowchart of an example process by which the LWB of Fig. 1 receives and processes requests from the host of Fig. 1.
Fig. 16 shows a flowchart of an example process by which the APP-EP and APP-RP of Figs. 4A-4C translate addresses between the host of Fig. 1 and the SSD of Fig. 1.
Fig. 17 illustrates a flowchart of an example process by which the LWB of Fig. 1 issues credits to the SSD of Fig. 1.
Fig. 18 illustrates a flowchart of an example process by which the LWB of Fig. 1 processes a configuration write request.
Fig. 19 illustrates a flowchart of an example process by which the LWB of Fig. 1 processes a configuration read request.
Fig. 20 illustrates a flowchart of an example process by which the LWB of Fig. 1 associates QoS policies with the PFs exposed by the LWB.
Fig. 21 illustrates a flowchart of an example process by which the LWB of Fig. 1 dynamically changes the mapping from the PFs exposed by the LWB to the PFs/VFs of the SSD of Fig. 1.
Figs. 22A-22B illustrate a flowchart of an example process by which the LWB of Fig. 1 performs bandwidth throttling.
Detailed Description
Reference will now be made in detail to embodiments of the invention, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail as not to unnecessarily obscure aspects of the embodiments.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first module may be termed a second module, and, similarly, a second module may be termed a first module, without departing from the scope of the invention.
The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term "and/or" as used herein means and includes any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The components and features of the drawings are not necessarily to scale.
Embodiments of the inventive concept include methods and systems for emulating multiple Physical Functions (PFs) in a device using Virtual Functions (VFs) supported by the device, such as a Solid State Drive (SSD). An SSD may have multiple Non-Volatile Memory Express (NVMe) controllers represented by multiple PCIe Physical Functions (PFs) or PCIe Virtual Functions (VFs). Each PF or VF essentially represents an NVMe controller that can be used by the host NVMe driver to perform data storage functions. A PF has actual physical resources, while virtual functions share resources with a PF. Embodiments of the inventive concept may expose multiple PFs to a host using a set of VFs in an SSD controller. That is, multiple PFs may be emulated by the device while VFs are used internally. From the host's perspective, the host system software stack sees multiple PCIe PFs and the NVMe controllers behind those PFs. Embodiments of the inventive concept may internally map and translate all host accesses to a PF into accesses to a VF.
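As a rough illustration, the mapping described above can be sketched as a table from host-visible PF numbers to the device's internal functions. The function name and numbering below are hypothetical, not taken from the patent:

```python
# Hypothetical sketch: mapping host-visible PFs onto a device's PFs plus a
# set of VFs. Names and numbering are illustrative only.

def build_pf_map(num_device_pfs, num_device_vfs):
    """Assign one host-visible PF to each function the device exposes."""
    mapping = {}
    host_pf = 0
    for pf in range(num_device_pfs):
        mapping[host_pf] = ("PF", pf)
        host_pf += 1
    for vf in range(num_device_vfs):
        mapping[host_pf] = ("VF", vf)
        host_pf += 1
    return mapping

# An SSD exposing 1 PF and 15 VFs appears to the host as 16 PFs.
pf_map = build_pf_map(1, 15)
print(len(pf_map))            # 16
print(pf_map[0], pf_map[15])  # ('PF', 0) ('VF', 14)
```

The host only ever sees the keys of this table; which entry is backed by real hardware and which by a shared-resource VF is invisible to it.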
A PCIe PF is independent of any other function (physical or virtual). That is, PFs have their own dedicated physical resources (such as memory buffers). For devices that support a large number of PFs, this increases the logic area, power consumption, and complexity of the underlying device. Therefore, to reduce cost and complexity on the device side, the PCIe specification provides support for Virtual Functions (VFs). VFs share physical resources with a PF and depend on the PF for all physical aspects. These physical aspects include PCIe link control, device control, power management, and the like.
Although VFs reduce cost and complexity on the device side, they increase complexity on the system software side. The system software needs to support the Single Root Input/Output Virtualization (SR-IOV) protocol to be able to communicate with VFs. This added functionality sometimes degrades I/O performance in the form of additional latency. Therefore, from a system software perspective, it is desirable to have PFs.
Embodiments of the inventive concept allow a device to emulate PFs using VFs in an SSD controller. That is, from the host's perspective, the SSD appears to have multiple PFs. But on the device side, those PFs may be emulated using a set of VFs. To reduce cost, a lightweight bridge (LWB) may be used to expose the functionality of a device (such as an SSD) without implementing the various functions as PFs in the SSD controller ASIC.
In an example embodiment of the LWB, a total of 16 functions may be exposed by the LWB as if they were all physical functions, even though the underlying device may implement some or most of them as VFs instead of PFs. The LWB acts as a bridge between the host machine and the SSD controller itself: the LWB may be implemented as part of the overall SSD device, or as a separate component. The device may be an SSD with its functions (1 PF and 15 VFs) implemented as part of the endpoint of the SSD controller. (An SSD may also conventionally include a Host Interface Layer (HIL), a Flash Translation Layer (FTL), and a Flash Controller (FC) to access the flash memory.)
The LWB may communicate with the host machine using four lanes of a third-generation (Gen3) PCIe bus, while communication inside the LWB may use 16 lanes of a Gen3 PCIe bus. Embodiments of the inventive concept may use any particular version of the PCIe bus (or another bus type) and may support any desired speed, lane count, or bandwidth both with the host machine and within the LWB, without limitation. The PCIe lane widths and speeds in this description are examples only; any combination may be implemented using the same concepts.
Instead of the SSD controller communicating with the root port or root complex on the host machine, the SSD controller may communicate with the root port of the LWB. The SSD controller may not be aware of this change and may treat the communication from the root port of the LWB as a communication from the host machine. Similarly, the host machine may communicate with the endpoints of the LWB without knowing that the endpoints are not SSD controllers (implementing the disclosed functionality). In the case of an SSD controller or host machine, the party with which it communicates (LWB in embodiments of the inventive concept) may be considered a black box.
The endpoint of the LWB may expose the same number of functions as the endpoint of the SSD controller. However, the endpoint of the LWB may expose all of the functions as PFs, rather than some of them as VFs. The LWB may also include a PCIe application layer-endpoint (PAPP-EP) and a PCIe application layer-root port (PAPP-RP). The PAPP-EP and PAPP-RP may manage the mapping from the PFs exposed by the endpoint of the LWB to the functions (physical or virtual) exposed by the endpoint of the SSD controller. The PAPP-RP may include a configuration translation table to assist in mapping the PCIe Configuration Space of the PFs exposed to the host to the PCIe Configuration Space of the PFs and/or VFs of the SSD controller endpoint. This table may also indicate which PF exposed by the LWB endpoint maps to which function exposed by the SSD controller endpoint, along with other information about the mapping (e.g., what address in the SSD controller may store data for the exposed function). The PCIe configuration features and capabilities provided by the endpoint of the LWB may be (and often are) different from those provided by the endpoint of the SSD controller: the configuration translation table may help manage these differences. The PAPP-RP may also handle translation between the memory Base Address Register (BAR) addresses of the PFs exposed to the host and the BAR addresses of the PFs and/or VFs of the SSD controller endpoint. The PAPP-EP and/or PAPP-RP may also include other tables, as desired. The mapping between the PFs exposed to the host and the SSD controller's internal PFs/VFs may be flexible and dynamic in nature. The mapping may change during runtime based on certain events and/or policy changes issued by a management entity, such as the host and/or a Baseboard Management Controller (BMC). Some examples of such events are Virtual Machine (VM) migration, changes in a Service Level Agreement (SLA), power/performance throttling, date, time, and the like.
While the PAPP-EP and PAPP-RP may be separate components in some embodiments of the inventive concept, other embodiments of the inventive concept may implement these components (and potentially the endpoints, root ports, and configuration manager of the LWB) in a single implementation. For example, the LWB may be implemented using a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC), just to name two examples of possible implementations. The PAPP-EP and PAPP-RP may communicate using any desired mechanism. For example, an intermediate data format may be defined that may be used to exchange data between the PAPP-EP and the PAPP-RP to invoke a particular function and return a result.
The LWB may also include a configuration manager. The configuration manager may enumerate the functions (both PFs and VFs) provided by the endpoint of the SSD controller. The configuration manager may then use this information both to "define" the functions exposed by the endpoint of the LWB and to help construct the configuration space translation table in the PAPP-RP. Typically, the configuration manager is used primarily during startup. The host machine may change the PCIe configuration space of a PF (such as the interrupts used for communicating with the endpoint of the LWB for various functions), or the host machine may enable or disable particular functions. The configuration manager may be used to facilitate these changes as needed. The LWB may also need to ensure that similar PCIe configuration space changes are propagated to the SSD controller as appropriate. That is, the LWB performs PCIe configuration mirroring between the PFs exposed to the host and the PFs/VFs exposed by the internal SSD controller endpoint. In this way, the LWB may also manage the functions exposed by the SSD controller.
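The startup enumeration step might be sketched as follows; the device description format here is an assumption made for illustration:

```python
# Sketch of the enumeration a configuration manager might perform at startup:
# discover the device's functions, then size the set of PFs the LWB endpoint
# will expose. The input format is hypothetical.

def enumerate_functions(device):
    """device: dict with 'pfs' and 'vfs_per_pf' as reported at enumeration."""
    funcs = []
    for pf in range(device["pfs"]):
        funcs.append(("PF", pf))
        for vf in range(device["vfs_per_pf"][pf]):
            funcs.append(("VF", pf, vf))
    return funcs

funcs = enumerate_functions({"pfs": 1, "vfs_per_pf": [15]})
print(len(funcs))  # 16: the LWB endpoint would expose 16 PFs to the host
```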
Because there are two endpoints in the overall device (one in the LWB and one in the SSD controller), each endpoint maintains its own configuration space. These configuration spaces should be synchronized to avoid potential problems or conflicts. When the host sends a configuration write command, the LWB may update the configuration space in the LWB endpoint. The configuration write command may also be forwarded to the SSD's endpoint (via the PAPP-EP, PAPP-RP, and root port) so that the configuration space of the SSD controller's endpoint may also be updated. Not all host-initiated configuration write commands need be reflected to the SSD controller; that is, some host configuration space changes may not be reflected to the back-end SSD controller as-is. For example, the host may make power management changes to the LWB endpoint that the LWB does not apply (in the same or a similar form) to the SSD controller endpoint.
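A minimal sketch of this mirroring behavior, with a filtered register class, might look like the following. The register offsets and the choice of what to filter are assumptions, not taken from the patent:

```python
# Sketch of configuration-write handling: the LWB endpoint's config space is
# always updated, and most writes are mirrored to the SSD controller endpoint,
# but some (here, assumed power-management registers) are filtered out.
# Offsets are placeholders, not real PCIe config-space offsets.

PM_REGISTERS = {0x44}  # assumed power-management offsets, not mirrored

def config_write(lwb_config, ssd_config, offset, value):
    lwb_config[offset] = value          # always update the LWB endpoint
    if offset not in PM_REGISTERS:
        ssd_config[offset] = value      # mirror to the SSD controller EP

def config_read(lwb_config, offset):
    # Reads are satisfied from the LWB endpoint; since the spaces are kept
    # in sync, forwarding the read to the device is unnecessary.
    return lwb_config.get(offset, 0)

lwb, ssd = {}, {}
config_write(lwb, ssd, 0x10, 0x1000_0000)     # BAR-like write: mirrored
config_write(lwb, ssd, 0x44, 0x3)             # PM write: not mirrored
print(config_read(lwb, 0x44), ssd.get(0x44))  # 3 None
```

The `config_read` path also illustrates the point made below about read commands terminating at the LWB.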
When the host sends a configuration read command, the endpoint of the LWB may satisfy the command. The endpoint of the LWB may forward the configuration read command to the PAPP-EP, but the PAPP-EP may terminate the command: when the two configuration spaces are synchronized, the data in each configuration space should be the same. (Alternatively, the endpoint of the LWB may respond to the configuration read command and terminate it at that point, without forwarding it to the PAPP-EP.)
When the LWB receives a memory read or write transaction from the host via the endpoint of the LWB, or from the SSD controller via the root port of the LWB, the LWB may forward the transaction to the other party (host or SSD controller, as appropriate). The LWB may perform the appropriate address translation using BAR tables in the PAPP-EP and/or the PAPP-RP. One example of such address translation for a transaction received from the host: the PAPP-EP may subtract the PF BAR base address from the received address, and the PAPP-RP may add the VF BAR base address to the resulting offset, arriving at the correct address in the SSD controller.
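That subtract-then-add translation is simple arithmetic; a sketch with invented BAR base addresses:

```python
# The translation step described above, as arithmetic: subtract the host-side
# PF BAR base to get an offset within the BAR, then add the device-side VF
# BAR base. All addresses are invented for the example.

HOST_PF_BAR0 = 0x1010_0000   # BAR base the host sees for this PF
DEV_VF_BAR0  = 0x9010_0000   # BAR base of the backing VF in the SSD controller

def host_to_device(addr):
    offset = addr - HOST_PF_BAR0     # PAPP-EP strips the host BAR base
    return DEV_VF_BAR0 + offset      # PAPP-RP adds the device BAR base

print(hex(host_to_device(0x1010_0040)))  # 0x90100040
```

The reverse path (device to host) would apply the same two steps with the bases swapped.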
Embodiments of the inventive concept may also include an optional intermediate element between the PAPP-EP and PAPP-RP that may be used for mapping, bandwidth control, and the like. This optional element may also be used to variably "throttle" bandwidth for a variety of reasons. For example, different functions exposed by the endpoint of the LWB may have different quality of service (QoS) requirements or Service Level Agreements (SLAs). PFs with low-bandwidth QoS may be throttled to ensure that there is sufficient bandwidth for PFs with higher QoS bandwidth requirements. Other reasons to throttle bandwidth may include power or temperature. The LWB may perform bandwidth throttling on a per-PF basis, or for all PFs, based on configuration settings or policy settings of a management entity (such as the host and/or BMC). A temperature or power throttling decision may be based on two thresholds, high_limit and low_limit. When power consumption or temperature exceeds high_limit, bandwidth throttling may be applied to all PFs or to selected PFs based on policy. Bandwidth throttling may be applied until the power or temperature drops below the corresponding low_limit threshold.
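The two-threshold decision above can be sketched as a small state machine with hysteresis; the threshold values below are arbitrary:

```python
# Sketch of the high_limit/low_limit throttling decision. Between the two
# thresholds the prior state is kept, giving hysteresis. Values are arbitrary.

class Throttler:
    def __init__(self, high_limit, low_limit):
        self.high_limit = high_limit
        self.low_limit = low_limit
        self.throttling = False

    def update(self, reading):
        """Call with the current power or temperature reading."""
        if reading > self.high_limit:
            self.throttling = True       # start throttling affected PFs
        elif reading < self.low_limit:
            self.throttling = False      # resume full bandwidth
        return self.throttling           # between limits: keep prior state

t = Throttler(high_limit=85, low_limit=70)
print(t.update(90))  # True  (exceeded high_limit)
print(t.update(75))  # True  (still above low_limit: hysteresis)
print(t.update(65))  # False (dropped below low_limit)
```

Using two thresholds rather than one avoids rapid on/off oscillation when the reading hovers near a single limit.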
In the example embodiments of the inventive concept above, the endpoint of the SSD controller is described as providing 1 PF and 15 VFs, which may be mapped to 16 different PFs at the endpoint of the LWB. These numbers are arbitrary: the endpoint of the SSD controller may provide any number of PFs and VFs, which may map to 16 PFs, or to more or fewer PFs, at the endpoint of the LWB.
If desired, VFs on the SSD controller may be remapped within the LWB, with any appropriate changes made within the LWB and/or the SSD controller. Thus, the mapping from the PFs exposed by the endpoint of the LWB to the functions exposed by the SSD controller can be modified at runtime; the mapping is flexible.
Communication within the LWB may be performed using any desired technique. For example, information exchanged between the endpoint, the PAPP-EP, the PAPP-RP, and the root port may use the Transaction Layer Packet (TLP) protocol.
The device itself may have any form factor. For example, SSDs may be packaged using U.2 or Full Height Half Length (FHHL) form factors, among other possibilities. Other types of devices may similarly be packaged in any desired form factor.
In another embodiment of the inventive concept, multiple SSD controllers may be included, each providing any number of functions: for example, each SSD controller may provide a total of 8 functions. Behind the endpoint of the LWB may be a multiplexer/demultiplexer that can direct data related to a particular exposed PF to the appropriate PAPP-EP/PAPP-RP/root port/SSD controller for eventual execution. In this way, a single LWB may expose more functions than any single SSD controller provides.
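One possible (hypothetical) demultiplexing rule is to split the host-visible PF number into a controller index and a per-controller function index:

```python
# Sketch of the demultiplexing step: a host-visible PF number selects which
# internal slice (PAPP-EP/PAPP-RP/root port/SSD controller) services it.
# Eight functions per controller is the example used in the text; the
# numbering rule itself is an assumption.

FUNCS_PER_CONTROLLER = 8

def route(host_pf):
    controller = host_pf // FUNCS_PER_CONTROLLER
    local_fn = host_pf % FUNCS_PER_CONTROLLER
    return controller, local_fn

# Sixteen host PFs spread across two 8-function SSD controllers:
print(route(3))   # (0, 3): controller 0, its function 3
print(route(11))  # (1, 3): controller 1, its function 3
```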
Each SSD controller may communicate with a separate root port in the LWB, each with its own slice of the PAPP-EP, PAPP-RP, and configuration translation table. In this way, operations involving one SSD controller need not affect operations involving another SSD controller inside the LWB.
Embodiments of the inventive concept such as this may be used, for example, when the endpoint of a single SSD controller does not meet all of the requirements of the host machine. For example, if an SSD controller supports only 8 PCIe lanes but the host wants the bandwidth of 16 lanes, multiple SSD controllers may be used to supply the required bandwidth. Or, if the SSD controller supports only eight functions in total and the host wants 16 functions, multiple SSD controllers may be used to supply the required set of functions. That is, using a suitable number of PAPP-EP/PAPP-RP/root-port slices, the LWB may be used to connect any number of SSD controllers to a host as a single multi-function PCIe storage device.
As with other embodiments of the inventive concept, the numbers used are exemplary and not limiting. Thus, the endpoint of an LWB may use any type of bus, any version of that bus, and any number of lanes on the bus; the endpoint of the LWB may expose any number of PFs; the number of PFs exposed by each SSD controller may vary; the internal bus type, version, and speed may vary, as may the connections to each SSD controller (individually); the number of SSD controllers connected to the LWB may vary (and they may be connected to different or the same root ports in the LWB); and the like.
Fig. 1 illustrates a machine including a lightweight bridge (LWB) capable of emulating, using the Virtual Functions (VFs) of a device, the Physical Functions (PFs) exposed by the LWB, according to an embodiment of the inventive concept. In Fig. 1, a machine 105, which may also be referred to as a host, is shown. Machine 105 may include a processor 110. Processor 110 may be any kind of processor: an Intel Xeon, Celeron, Itanium, or Atom processor, an AMD Opteron processor, an ARM processor, and so on. Although Fig. 1 shows a single processor 110 in machine 105, machine 105 may include any number of processors, each of which may be single-core or multi-core, mixed in any desired combination.
Machine 105 may also include a memory 115. The memory 115 may be any of various memories such as flash memory, Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), persistent random access memory, Ferroelectric Random Access Memory (FRAM), or non-volatile random access memory (NVRAM) such as Magnetoresistive Random Access Memory (MRAM), etc. Memory 115 may also be any desired combination of different memory types. Machine 105 may also include a memory controller 120, where memory controller 120 may be used to manage access to memory 115.
Machine 105 may also include a Solid State Drive (SSD) 125. SSD 125 may be used to store data, and may expose both Physical Functions (PFs) and Virtual Functions (VFs) that processor 110 (and software executing thereon) may use to invoke various functions of SSD 125. Although Fig. 1 illustrates an SSD 125, embodiments of the inventive concept may include any type of device that provides PFs and VFs. For example, other types of storage devices (such as hard disk drives) that provide PFs and VFs may be used instead of SSD 125, and devices whose basic functionality is something other than data storage may also be substituted for SSD 125. In the remainder of this document, any reference to SSD 125 is intended to include other types of devices that can provide PFs/VFs. In some embodiments of the inventive concept, SSD 125 may use a Peripheral Component Interconnect Express (PCIe) bus and provide PCIe PFs and VFs, although embodiments of the inventive concept may use other interfaces. Processor 110 may execute a device driver 130, which may support access to SSD 125.
Located between SSD 125 and the rest of machine 105 may be a lightweight bridge (LWB) 135. The LWB may act as a bridge between SSD 125 and the rest of machine 105, and may expose the functionality of SSD 125, but as PFs instead of VFs.
Although fig. 1 depicts machine 105 as a server (which may be a stand-alone server or a rack server), embodiments of the inventive concept may include any desired type of machine 105 without limitation. For example, machine 105 may be replaced with a desktop or laptop computer or any other machine that may benefit from embodiments of the inventive concepts. Machines 105 may also include special purpose portable computing machines, tablet computers, smart phones, and other computing machines. Additionally, applications that may access data from SSD125 may be located in another machine that is separate from machine 105 and accesses machine 105 via a network connection that traverses one or more networks of any type (wired, wireless, global, etc.).
Fig. 2 shows additional detail of machine 105 of fig. 1. In fig. 2, generally, machine 105 includes one or more processors 110, which may include a memory controller 215 and a clock 205 that may be used to coordinate the operation of the components of machine 105. Processor 110 may also be coupled to memory 115, which may include, by way of example, Random Access Memory (RAM), Read Only Memory (ROM), or other state-retaining media. Processor 110 may also be coupled to storage device 125 and to a network connector 210, which may be, for example, an Ethernet connector or a wireless connector. Processor 110 may be connected to a bus 215, among other components, to which input/output (I/O) interface ports (which may be managed using an input/output engine 225) and user interfaces 220 may be attached.
Fig. 3 shows details of SSD 125 of fig. 1. In fig. 3, SSD 125 may include Host Interface Logic (HIL) 305, an SSD controller 310, and various flash memory chips 315-1 through 315-8 (also referred to as "flash memory"), which may be organized into various channels 320-1 through 320-4. Host interface logic 305 may manage communications between SSD 125 and other components, such as processor 110 of fig. 1 and LWB 135 of fig. 1. Host interface logic 305 may also manage communication with devices remote from SSD 125 (i.e., devices that are not considered part of machine 105 but that communicate with SSD 125, such as over one or more network connections). These communications may include read requests for reading data from SSD 125, write requests for writing data to SSD 125, and delete requests for deleting data from SSD 125. Host interface logic 305 may manage an interface through only a single port, or host interface logic 305 may manage interfaces through multiple ports. Alternatively, SSD 125 may include multiple ports, each of which may have separate host interface logic 305 to manage the interface through that port. Embodiments of the inventive concept may also mix these possibilities (e.g., an SSD with three ports may have a first host interface logic for managing one port and a second host interface logic for managing the other two ports).
SSD controller 310 may use a flash controller to manage read and write operations on flash chips 315-1 through 315-8, as well as garbage collection and other operations. SSD controller 310 may include flash translation layer 325, flash controller 330, and endpoint 335. Flash translation layer 325 may manage the mapping of Logical Block Addresses (LBAs) (as used by host 105 of fig. 1) to the Physical Block Addresses (PBAs) at which the data is actually stored on SSD 125. By using flash translation layer 325, host 105 of fig. 1 need not be notified when data is moved from one block to another within SSD 125.
The flash controller 330 may manage writing of data to the flash memory chips 315-1 to 315-8 and reading of data from the flash memory chips 315-1 to 315-8. Endpoint 335 may serve as an endpoint for SSD125, and endpoint 335 may be connected to a root port on another device (such as host 105 of fig. 1 or LWB 135 of fig. 1).
Although fig. 3 illustrates SSD 125 as including eight flash chips 315-1 through 315-8 organized into four channels 320-1 through 320-4, embodiments of the inventive concept may support any number of flash chips organized into any number of channels. Similarly, while fig. 3 illustrates the structure of an SSD, other storage devices (e.g., hard disk drives) may be implemented using different structures (but with similar potential benefits).
Figs. 4A-4C illustrate LWB 135 of fig. 1, according to various embodiments of the inventive concept. In fig. 4A, LWB 135 is shown to include an endpoint 405, an application layer-endpoint (APP-EP) 410, an application layer-root port (APP-RP) 415, and a root port 420. Endpoint 405 may be used to communicate with root ports of other devices, such as host 105 of fig. 1. In fig. 4A, endpoint 405 is shown disclosing 16 PFs (PF0-PF15) to upstream devices and communicating with those upstream devices using a PCIe generation 3 interface with four lanes (Gen3 x4). Thus, as shown by interface 425, the interface to host 105 of fig. 1 may be a PCIe generation 3 interface with four lanes. Similarly, root port 420 may communicate with endpoints of other devices (such as SSD 125) using a PCIe generation 3 interface with four lanes. Thus, as shown by interface 430, the interface to SSD 125 may also be a PCIe generation 3 interface with four lanes. In fig. 4A, root port 420 is shown as having enumerated one PF and 15 VFs from SSD 125, and endpoint 405 is shown as disclosing 16 PFs: one PF is disclosed by endpoint 405 for each function (physical or virtual) disclosed by SSD 125.
Endpoint 405 may communicate with APP-EP 410, and APP-EP 410 may in turn communicate with APP-RP 415. APP-EP 410 and APP-RP 415 may manage the translation of information between endpoint 405 and root port 420. Because any desired communication mechanism may be used, fig. 4A does not show details of how endpoint 405 communicates with APP-EP 410, how APP-EP 410 communicates with APP-RP 415, or how APP-RP 415 communicates with root port 420. For example, endpoint 405, APP-EP 410, APP-RP 415, and root port 420 may communicate using PCIe or another bus, such as an Advanced eXtensible Interface (AXI) bus, at any desired version and with any desired data bus width. Alternatively, endpoint 405, APP-EP 410, APP-RP 415, and root port 420 may communicate using a proprietary messaging scheme over any desired bus/interface. Embodiments of the inventive concept may also include other mechanisms of communication between endpoint 405, APP-EP 410, APP-RP 415, and root port 420. There need be no relationship between the manner of communication internal to LWB 135 and the manner of communication from LWB 135 to host 105 or SSD 125 of fig. 1.
APP-RP 415 may include a configuration table 435. Configuration table 435 may store configuration information regarding LWB 135. Examples of such configuration information may include a mapping between the PFs disclosed by endpoint 405 and the PFs/VFs provided by SSD 125 (and determined via enumeration through root port 420). The configuration information stored in configuration table 435 may also include information regarding quality of service (QoS) policies (which may also be referred to as Service Level Agreements (SLAs)) that may be associated with a particular PF disclosed by endpoint 405. Configuration manager 440 may be used to configure LWB 135, storing that information in configuration table 435. Configuration manager 440 may also be used to determine information about the functions disclosed by SSD 125, and may program endpoint 405 to provide similar (or identical) functionality (but using only PFs instead of PFs and VFs). The details of what configuration manager 440 does may depend on the specific functionality provided by SSD 125 (since configuration manager 440 may configure endpoint 405 to provide a set of PFs that match the PFs and VFs provided by SSD 125), but the process may generally be summarized as: determine the configuration of each PF/VF disclosed by SSD 125, establish the appropriate number of PFs to be disclosed by endpoint 405, and configure those PFs to match the configurations of the PFs/VFs disclosed by SSD 125.
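The configuration process just summarized can be sketched in outline. The following Python sketch (function and field names are hypothetical, not from the patent) builds a table mapping each PF disclosed by the endpoint onto one of the PFs/VFs enumerated from one or more downstream devices:

```python
# Hypothetical sketch of the LWB configuration table: each PF disclosed by the
# endpoint maps to one (device, function) pair enumerated through a root port.
# A function is identified here as ("PF", n) or ("VF", parent_pf, n).

def build_config_table(device_functions):
    """Map disclosed PF numbers 0..N-1 onto the aggregated device functions."""
    table = {}
    disclosed_pf = 0
    for device_id, functions in device_functions.items():
        for fn in functions:
            table[disclosed_pf] = {"device": device_id, "function": fn}
            disclosed_pf += 1
    return table

# One SSD providing 1 PF and 3 VFs yields 4 PFs disclosed by the endpoint.
table = build_config_table(
    {"ssd0": [("PF", 0), ("VF", 0, 1), ("VF", 0, 2), ("VF", 0, 3)]}
)
```

A device with 1 PF and 15 VFs would yield 16 disclosed PFs, matching the fig. 4A example.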
Although fig. 4A shows APP-EP 410 and APP-RP 415 as separate components, embodiments of the inventive concept may combine these two components into a single component (responsible for handling the functionality of both APP-EP 410 and APP-RP 415 as described). Indeed, the entirety of LWB 135 may be implemented as a single unit, rather than as separate components in communication with each other. LWB 135 (as well as endpoint 405, APP-EP 410, APP-RP 415, root port 420, and configuration manager 440) may be implemented in any desired manner. Embodiments of the inventive concept may use, among other possibilities, general purpose processors, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Graphics Processing Units (GPUs), and General Purpose GPUs (GPGPUs) with suitable software to implement LWB 135 (and/or components thereof), individually and/or collectively.
In fig. 4A (and similarly in figs. 4B-4C below), any particular number shown is meant to be an example only, and embodiments of the inventive concept may include other numbers. For example, root port 420 is shown as having enumerated 1 PF and 15 VFs from SSD 125, but SSD 125 may disclose any number (zero or more) of PFs and VFs (although, since any disclosed VF depends on the hardware of some PF, there must be at least one disclosed PF to provide hardware for any disclosed VF). Similarly, endpoint 405 is shown disclosing 16 PFs, but endpoint 405 may disclose any number (zero or more) of PFs to host 105 of fig. 1. Note that while endpoint 405 discloses the same total number of functions as SSD 125, there need not be a one-to-one correspondence between the functions disclosed by SSD 125 and the functions disclosed by endpoint 405. For example, some of the functions disclosed by SSD 125 may not have a corresponding PF disclosed by endpoint 405, or a single PF disclosed by endpoint 405 may map to multiple functions (physical and/or virtual) disclosed by SSD 125. As an example of the former, write requests might be directed to only one SSD at a time, in which case the functions that support write requests on the non-selected SSDs would not have corresponding PFs disclosed by endpoint 405. As an example of the latter, in embodiments of the inventive concept where LWB 135 is connected to multiple SSDs (such as shown in fig. 4C below), a request by host 105 of fig. 1 to perform garbage collection may be sent to all SSDs connected to LWB 135, even though endpoint 405 may disclose only a single PF to host 105 of fig. 1 for requesting garbage collection.
In a similar manner, any version of a particular product shown in fig. 4A is intended to be an example only, and embodiments of the inventive concepts may use other products or other versions of products. For example, fig. 4A illustrates LWB 135 communicating with host 105 of fig. 1 (via interface 425) and with SSD125 (via interface 430) using PCIe generation 3 with four lanes. Other embodiments of the inventive concept may use other generations of PCIe buses/interfaces, or other numbers of lanes, or even additional types of buses, such as an advanced extensible interface (AXI) bus. For example, FIG. 4B is similar to FIG. 4A except for the fact that interface 425 and interface 430 use PCIe generation 3 (Gen 3X 8) with eight lanes instead of four lanes.
In some embodiments of the inventive concept, a single SSD125 may not provide sufficient capability for the purposes of host 105 of fig. 1. For example, if the SSD125 of fig. 4A-4B only provides 1 PF and 7 VFs, and thus does not provide the particular functionality desired by the host 105 of fig. 1, it would be possible to replace the SSD125 with another device that provides the desired functionality. However, LWB 135 may alternatively be connected to multiple devices, as shown in fig. 4C.
In FIG. 4C, LWB 135 is shown connected to SSD 125-1 and SSD 125-2 via interfaces 430-1 and 430-2, respectively. Because SSDs 125-1 and 125-2 each provide 1 PF and 7 VFs (as can be seen in root ports 420-1 and 420-2), neither of SSDs 125-1 and 125-2 can provide 16 functions individually. But SSDs 125-1 and 125-2 together provide 16 functions. Thus, by aggregating the resources (such as the disclosed functions) of SSDs 125-1 and 125-2, LWB 135 may provide a total of 16 PFs without a single SSD providing all of the functions.
As can be seen in fig. 4C, LWB 135 includes only one endpoint 405, but includes two APP-EPs 410-1 and 410-2, two APP-RPs 415-1 and 415-2, and two root ports 420-1 and 420-2. In this way, LWB 135 may expose all 16 functions from a single endpoint (endpoint 405), aggregating the functions provided by SSDs 125-1 and 125-2. LWB 135 may also include a multiplexer/demultiplexer (MDM) 445, because different PFs disclosed by endpoint 405 may be mapped to functions on either of SSDs 125-1 and 125-2. Multiplexer/demultiplexer 445 may route requests associated with a particular disclosed PF of endpoint 405 to the appropriate APP-EP 410-1 or 410-2, whose communication path leads to the SSD 125-1 or 125-2 that provides the corresponding PF or VF. Although figs. 4A-4B do not show multiplexer/demultiplexer 445 (since only one SSD 125 is connected to LWB 135 in those figures), in embodiments of the inventive concept where LWB 135 is connected to only one SSD 125, LWB 135 may still include multiplexer/demultiplexer 445: multiplexer/demultiplexer 445 may not add any functionality, but there is also no practical cost to including it. (However, if LWB 135 can only connect to a single device, including a multiplexer/demultiplexer adds no benefit.)
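The routing decision made by the multiplexer/demultiplexer can be sketched as a simple table lookup. The Python sketch below is hypothetical (the names and the 8/8 split between the two SSDs are illustrative assumptions matching the fig. 4C example):

```python
# Hypothetical sketch of MDM 445: route a request on a disclosed PF to the
# APP-EP/APP-RP path of the SSD that provides the backing PF or VF.

def route_request(config_table, disclosed_pf):
    """Return the downstream device whose APP-EP path should receive the request."""
    entry = config_table.get(disclosed_pf)
    if entry is None:
        raise ValueError(f"no function mapped to disclosed PF {disclosed_pf}")
    return entry["device"]

# PFs 0-7 backed by one SSD and PFs 8-15 by the other, as in fig. 4C.
config_table = {pf: {"device": "ssd1" if pf < 8 else "ssd2"} for pf in range(16)}
```

With this table, a request on PF 3 would be routed toward the first SSD's path and a request on PF 12 toward the second's.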
Note that embodiments of the inventive concept may aggregate resources other than just the disclosed functions. For example, in fig. 4C, interfaces 430-1 and 430-2 may each include eight lanes of a PCIe generation 3 bus/interface. But because SSDs 125-1 and 125-2 may be used in parallel (i.e., both SSDs 125-1 and 125-2 may process requests simultaneously), LWB 135 may provide 16 lanes of a PCIe generation 3 bus/interface (Gen3 x16) to host 105 of fig. 1 via interface 425. In the same manner, the bandwidth provided by each of SSDs 125-1 and 125-2 may be aggregated, such that the LWB may advertise a bandwidth equal to the sum of the bandwidths provided by SSDs 125-1 and 125-2. Embodiments of the inventive concept may also aggregate other resources besides functions, PCIe lanes, or bandwidth.
As with fig. 4A-4B, any number or version shown in fig. 4C is merely used as an example, and embodiments of the inventive concepts may support other numbers or versions. For example, SSDs 125-1 and 125-2 need not provide the same number of exposed functions (physical and/or virtual), nor do they necessarily require the use of the same generation or number of PCIe bus/interface lanes.
In figs. 4A-4C, LWB 135 is shown separate from SSD 125. Embodiments of the inventive concept may combine these two devices into a single unit. That is, SSD 125 may include LWB 135 within its housing, ahead of host interface logic 305 of fig. 3 (or at least ahead of SSD controller 310 of fig. 3). Embodiments of the inventive concept are also not limited to including LWB 135 within a single SSD: a single housing containing multiple SSDs may also contain LWB 135, implementing the embodiment shown in fig. 4C. For purposes of this document, the term "connect" and similar terms are intended to cover any device with which LWB 135 communicates, whether LWB 135 physically shares hardware with that device or is physically distinct from SSD 125 and connected through some interface.
As described above, one of the problems with devices that disclose VFs is that host 105 of fig. 1 may need to implement SR-IOV software to access the disclosed VFs. The SR-IOV software increases the latency of communicating with the device. But since the SR-IOV sequence for managing access to the device's VFs is known in advance, the SR-IOV sequence may be implemented using a state machine, relieving host 105 of fig. 1 of having to use SR-IOV software to access the device's VFs. Additionally, because the state machine may be implemented in hardware, using a state machine within LWB 135 instead of SR-IOV software within host 105 of fig. 1 may also reduce the latency of processing requests. Using a state machine to implement the SR-IOV sequence within LWB 135 essentially offloads the entire burden of the SR-IOV protocol from the host and hides it within the device. In other words, host 105 of fig. 1 may be agnostic to any implementation of SR-IOV within the device with which host 105 of fig. 1 is communicating.
Fig. 5 shows details of configuration manager 440 of figs. 4A-4C. In fig. 5, configuration manager 440 may include a Single Root Input/Output Virtualization (SR-IOV) sequence 505 and a state machine 510. SR-IOV sequence 505 may be the sequence that would otherwise be implemented in SR-IOV software within host 105 of fig. 1, and may be executed using state machine 510. SR-IOV sequence 505 may be stored in Read Only Memory (ROM) within configuration manager 440. Like LWB 135 itself and its components shown in figs. 4A-4C, state machine 510 may be implemented using a general purpose processor, FPGA, ASIC, GPU, or GPGPU, among other possibilities.
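Because the sequence is fixed and known in advance, stepping through it with a state machine is straightforward. The Python sketch below is a minimal illustration of that idea; the state names are invented for illustration and are not the actual PCIe SR-IOV register sequence:

```python
# Hypothetical sketch of running a fixed SR-IOV enable sequence with a simple
# state machine, as configuration manager 440 might do in place of host software.
# The step names are illustrative only.

SR_IOV_SEQUENCE = [
    "READ_SRIOV_CAPABILITY",   # locate the device's SR-IOV capability
    "SET_NUM_VFS",             # program the number of VFs to enable
    "ENABLE_VFS",              # turn the VFs on
    "WAIT_VF_READY",           # allow the VFs to settle before first access
]

class SrIovStateMachine:
    def __init__(self, sequence):
        self.sequence = sequence
        self.index = 0

    def step(self):
        """Execute the current step and advance; return the step just run."""
        if self.done():
            raise RuntimeError("sequence already complete")
        state = self.sequence[self.index]
        self.index += 1
        return state

    def done(self):
        return self.index >= len(self.sequence)
```

Since every transition is predetermined, the same logic maps naturally onto a hardware state machine with the sequence held in ROM, as the patent describes.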
Fig. 6 shows a mapping between the PFs disclosed by LWB 135 of fig. 1 and the PFs/VFs disclosed by SSD 125 of fig. 1 (or SSDs 125-1 and 125-2 of fig. 4C). In fig. 6, PFs 605 are the PFs disclosed by LWB 135 of fig. 1 (more specifically, by endpoint 405 of figs. 4A-4C). PFs/VFs 610, on the other hand, are the functions (both physical and virtual) disclosed by SSD 125 of fig. 1. Mapping 615 may then represent a mapping between PFs 605 and PFs/VFs 610. The mapping shown in fig. 6 is for illustration purposes and is essentially arbitrary; configuration manager 440 of figs. 4A-4C may use any desired mechanism to decide which PF of PFs 605 maps to which PF/VF 610. The mapping may be implemented taking into account guidance and/or policies set by host 105 of fig. 1 (or a Baseboard Management Controller (BMC) within host 105 of fig. 1) or by SSD 125 of fig. 1.
FIG. 7 illustrates the LWB 135 of FIG. 1 processing a configuration write request from the host 105 of FIG. 1. In fig. 7, the LWB 135 of fig. 1 (more specifically, the endpoint 405) may receive a configuration write request 705. The endpoint 405 may then process the configuration write request 705 locally to make any changes included in the configuration write request 705.
Note that configuration write request 705 may not require changes to SSD 125. For example, the QoS policy established by host 105 of fig. 1 may only affect the PFs disclosed by endpoint 405 and does not require any changes within SSD 125. However, in some cases, configuration write request 705 may also require modification of SSD 125. For example, the configuration write request may change data about the underlying functions on SSD125, rather than data that may only be managed by LWB 135 of fig. 1. In such a case, endpoint 405 may propagate configuration write request 705 to SSD125 via root port 420 (as shown by dashed arrows 710 and 715). The configuration write request 705 may be forwarded to the root port 420 via the APP-EP 410 of FIGS. 4A-4C and the APP-RP 415 of FIGS. 4A-4C, or through the configuration manager 440 of FIGS. 4A-4C. The configuration table 435 of fig. 4A-4C may be used to determine whether the configuration write request 705 should also be applied to the SSD125 (to ensure that the configurations of the endpoint 405 and the SSD125 mirror each other).
As with figs. 4A-4C, the details of how endpoint 405 (and possibly SSD 125) is configured depend largely on the details of configuration write request 705. But once the details of configuration write request 705 are known, the changes to endpoint 405 (and possibly to SSD 125) may be carried out in a straightforward manner.
As described above, it is desirable that the configurations of endpoint 405 and SSD125 mirror each other. However, as noted above, there may be some configurations that only apply to endpoint 405 (or may not matter to SSD125 because endpoint 405 may handle those changes). In the event that configuration changes will not matter to SSD125, those configuration changes need not be delivered to SSD 125.
In fig. 7, configuration write request 705 is depicted as originating from host 105 of fig. 1. Embodiments of the inventive concept may also support SSD 125 requesting a configuration change in endpoint 405 of figs. 4A-4C. That is, SSD 125 of fig. 1 may dynamically request changes to the configuration, features, or capabilities of the PFs disclosed to host 105 of fig. 1 by endpoint 405 of figs. 4A-4C during runtime. For example, SSD 125 of fig. 1 may increase or decrease the number of interrupt vectors of the PFs disclosed by endpoint 405 of figs. 4A-4C. Such changes may result from changes in application requirements that host 105 of fig. 1 communicates to SSD 125 of fig. 1 (via LWB 135 of fig. 1) using a higher-level storage communication protocol, such as Non-Volatile Memory Express (NVMe).
FIG. 8 illustrates the LWB 135 of FIG. 1 processing a configuration read request from the host 105 of FIG. 1. In fig. 8, endpoint 405 may receive a configuration read request 805 from host 105 of fig. 1. Endpoint 405 may then read the requested configuration information and return it as configuration information 810 to host 105 of fig. 1.
Just as with configuration write request 705 of fig. 7, configuration read request 805 may be delivered to SSD 125 to read configuration information from SSD 125. Thus, configuration read request 805 may be delivered to root port 420 and SSD 125 as configuration read requests 815 and 820, respectively, with configuration information 810 returned as configuration information 825 and 830. But since the configuration of endpoint 405 should mirror that of SSD 125, configuration read request 805 may not need to be delivered to SSD 125 to determine the requested configuration information. Additionally, just as with configuration write request 705 of fig. 7, configuration read request 805 may be delivered to root port 420 via APP-EP 410 of figs. 4A-4C and APP-RP 415 of figs. 4A-4C, or through configuration manager 440 of figs. 4A-4C.
Fig. 9 illustrates application layer-endpoint (APP-EP) 410 of figs. 4A-4C and application layer-root port (APP-RP) 415 of figs. 4A-4C handling address translation within LWB 135. In fig. 9, address 905 may be an address received from host 105 of fig. 1 (e.g., the address of an NVMe register that host 105 of fig. 1 has requested to read). APP-EP 410 may then subtract the host Base Address Register (BAR) base address 910 of the PF invoked by host 105 of fig. 1. APP-RP 415 may then add the SSD BAR base address 915 of the PF/VF to which the requested PF of endpoint 405 of figs. 4A-4C maps, resulting in SSD address 920. In this manner, APP-EP 410 and APP-RP 415 may translate an address as seen by the host into an address that may be handled by SSD 125 of fig. 1.
Note that this process may be reversed to translate SSD addresses back into addresses as seen by the host. Thus, APP-RP 415 may subtract the SSD BAR base address 915 from SSD address 920, and APP-EP 410 may add the host BAR base address 910, resulting in host address 905.
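The subtract-then-add translation described above amounts to rebasing an offset from one BAR window onto another. A minimal Python sketch (the BAR base values are illustrative assumptions, not from the patent):

```python
# Sketch of the fig. 9 address translation: APP-EP subtracts the host BAR base
# of the disclosed PF; APP-RP adds the SSD BAR base of the backing PF/VF.
# The base addresses below are illustrative placeholders.

HOST_BAR_BASE = 0xA000_0000  # base of the disclosed PF's BAR in host address space
SSD_BAR_BASE = 0x1000_0000   # base of the backing function's BAR on the SSD

def host_to_ssd(host_addr):
    offset = host_addr - HOST_BAR_BASE  # APP-EP: strip the host BAR base
    return SSD_BAR_BASE + offset        # APP-RP: rebase onto the SSD BAR

def ssd_to_host(ssd_addr):
    offset = ssd_addr - SSD_BAR_BASE    # APP-RP: strip the SSD BAR base
    return HOST_BAR_BASE + offset       # APP-EP: rebase onto the host BAR
```

The two directions are exact inverses, which is what allows completions flowing back from the SSD to be retranslated into host addresses.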
Fig. 10 shows the mapping of fig. 6 being changed within LWB 135 of fig. 1. There are a number of reasons mapping 615 may change. For example, storage parameter 1005 may trigger a change to mapping 615: for example, if the available capacity on a particular SSD falls below a threshold amount of free space, new write requests may be directed to a different SSD (such as SSD 125-2 of fig. 4C) connected to LWB 135 of fig. 1. Alternatively, a day 1010 or a time of day 1015 may trigger a change to mapping 615: for example, the peak time for a particular request at a particular host 105 may be between 6:30 am and 8:00 am and between 5:00 pm and 9:00 pm, so the bandwidth associated with the PF processing such requests may be increased during those times and decreased at other times. Even if not specified using a conventional or known day/time value, day 1010 and time of day 1015 may be generalized to any trigger related to day and/or time. For example, at the end of a sporting event, it may be expected that fans heading home will request directions from a GPS application on their smart phones, even though the specific time at which such requests will begin may not be known (the duration of a sporting event is generally known, but it may run shorter or longer depending on the event itself). Bandwidth usage 1020 itself may also trigger changes to mapping 615: for example, to balance the load on the SSDs connected to LWB 135 of fig. 1. Finally, a change 1025 in QoS policy may trigger a change to mapping 615: for example, the addition of a new QoS policy may require a change in how the functions are mapped. Embodiments of the inventive concept may also include other triggers for changes to mapping 615. Regardless of the trigger, as a result of the change, mapping 615 may be replaced by mapping 1030, which still maps the PFs disclosed by endpoint 405 of figs. 4A-4C to the PFs/VFs disclosed by SSD 125 of fig. 1, but possibly in a different arrangement.
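The free-space trigger just described can be sketched concretely. The Python below is a hypothetical illustration (the threshold value, names, and remapping rule are assumptions, not from the patent):

```python
# Hypothetical sketch of one fig. 10 trigger: when an SSD's free capacity falls
# below a threshold, remap the PFs it serves onto another SSD with space left.

FREE_SPACE_THRESHOLD = 0.10  # remap when less than 10% capacity remains (illustrative)

def maybe_remap(mapping, free_fraction):
    """Return a new disclosed-PF -> SSD mapping, moving PFs off low-space SSDs."""
    healthy = [ssd for ssd, free in free_fraction.items()
               if free >= FREE_SPACE_THRESHOLD]
    if not healthy:
        return dict(mapping)  # nowhere better to point; keep the mapping unchanged
    return {pf: (ssd if free_fraction[ssd] >= FREE_SPACE_THRESHOLD else healthy[0])
            for pf, ssd in mapping.items()}

# ssd1 is nearly full, so the PFs it served are redirected to ssd2.
mapping = {0: "ssd1", 1: "ssd1", 2: "ssd2"}
new_mapping = maybe_remap(mapping, {"ssd1": 0.05, "ssd2": 0.40})
```

The other triggers (time of day, bandwidth usage, QoS changes) would feed the same kind of remapping decision with different inputs.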
Fig. 11 shows that the PF disclosed by LWB 135 of fig. 1 has an associated quality of service (QoS) policy. In fig. 11, one PF "PF 1" (as disclosed by endpoint 405 of fig. 4A-4C) is shown, with a single QoS policy 1105 associated with the disclosed PF. Embodiments of the inventive concept may include a disclosed PF having any number of associated QoS policies. For example, some PFs disclosed by the endpoint 405 of fig. 4A-4C may not have an associated QoS policy; other PFs disclosed by the endpoint 405 of fig. 4A-4C may have two or more associated QoS policies.
Given QoS policy 1105 associated with a PF disclosed by endpoint 405 of figs. 4A-4C, LWB 135 of fig. 1 may enforce the QoS policy according to its specification. Because such specifications may vary widely, the details of how LWB 135 of fig. 1 implements a particular QoS policy are not set forth herein. But by way of example, QoS policy 1105 may specify a maximum (and/or minimum) bandwidth for a particular PF (thus ensuring that a particular PF does not prevent other PFs from receiving their fair share of bandwidth, or ensuring that a particular PF receives its specified share of bandwidth). LWB 135 of fig. 1 may then ensure that QoS policy 1105 is satisfied by preventing communications using the corresponding PF/VF on SSD 125 of fig. 1 from exceeding the maximum bandwidth, or by limiting communications using PFs/VFs other than the PF/VF corresponding to that PF on SSD 125 of fig. 1 so as to guarantee the minimum bandwidth of the disclosed PF.
While LWB 135 of fig. 1 may enforce QoS policy 1105, ensuring that the policy is appropriate is the responsibility of host 105 of fig. 1 (or of SSD 125 of fig. 1, if SSD 125 of fig. 1 requests that the policy be applied). For example, assume that SSD 125 of fig. 1 has a maximum bandwidth of 100 GB/sec and supports a total of 16 functions (PFs and VFs). If the host were to assign a QoS policy guaranteeing a minimum bandwidth of 10 GB/sec to each PF disclosed by LWB 135, the total allocated bandwidth would be 160 GB/sec, which exceeds the capabilities of SSD 125 of fig. 1. As a result, some PFs would not satisfy their associated QoS policies. Thus, while LWB 135 of fig. 1 may enforce QoS policy 1105, LWB 135 may not check whether QoS policy 1105, in combination with the other QoS policies assigned to the various PFs, can all be enforced at the same time.
Thus, there is a difference between enforcing QoS policies and ensuring that QoS policies are always enforceable. LWB 135 of fig. 1 may perform the former; it may not perform the latter. However, the fact that a set of QoS policies cannot all be enforced simultaneously does not necessarily mean that any individual QoS policy will fail to be satisfied. For example, consider an SSD 125 of fig. 1 that provides a total of 10 GB/sec of bandwidth, and assume that host 105 of fig. 1 specifies QoS policies for two different PFs disclosed by endpoint 405 of figs. 4A-4C, guaranteeing a minimum bandwidth of 6 GB/sec to each PF. In general, since each PF is "guaranteed" 6 GB/sec of the available 10 GB/sec bandwidth of SSD 125 of fig. 1, the two QoS policies could not both be satisfied simultaneously. But if it turns out to be logically (or physically) impossible to invoke both functions at the same time (e.g., if one function involves reading data from or writing data to SSD 125 of fig. 1, and the other function involves SSD 125 of fig. 1 performing an internal consistency check that prevents SSD 125 of fig. 1 from responding to any other function until the consistency check is complete), host 105 of fig. 1 may specify both QoS policies: even though they cumulatively exceed the bandwidth provided by SSD 125 of fig. 1, the two policies will never be exercised at the same time, so there is only a nominal conflict, not an actual (or even potential) conflict.
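The oversubscription arithmetic in the two examples above reduces to a simple admission check, which (per the patent) would be performed by the host rather than the LWB. A minimal sketch, with hypothetical names:

```python
# Sketch of the oversubscription check discussed above, which the host (not the
# LWB) would need to perform: sum the guaranteed minimum bandwidths and compare
# against the device's total bandwidth.

def policies_oversubscribed(min_bandwidths_gb, device_total_gb):
    """True if the guaranteed minimums collectively exceed the device bandwidth."""
    return sum(min_bandwidths_gb) > device_total_gb

# 16 PFs each guaranteed 10 GB/sec against a 100 GB/sec SSD: 160 > 100.
oversub = policies_oversubscribed([10] * 16, 100)
```

Note that, as the text explains, a nominally oversubscribed set of policies (e.g., two 6 GB/sec guarantees on a 10 GB/sec device) may still be acceptable when the covered functions can never run concurrently; this check flags only the arithmetic conflict.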
As noted above, at times LWB 135 of fig. 1 may need to perform bandwidth throttling: for example, to prevent a specific PF disclosed by endpoint 405 of figs. 4A-4C from using so much bandwidth that other functions of SSD 125 of fig. 1 are overly limited. Figs. 12A-12B illustrate LWB 135 of fig. 1 performing bandwidth throttling. In fig. 12A, LWB 135 is shown receiving a "high" or "large" bandwidth 1205 from host 105 of fig. 1. If this bandwidth is too large for any reason, LWB 135 may throttle the bandwidth to SSD 125 of fig. 1, as shown by "low" or "small" bandwidth 1210. In fig. 12B, the opposite case is shown: LWB 135 may have a "high" or "large" bandwidth 1215 from SSD 125 of fig. 1, but may throttle bandwidth 1220 toward host 105 of fig. 1. Thus, LWB 135 may throttle bandwidth when the available bandwidth of SSD 125 of fig. 1 is higher than the maximum bandwidth of the SLA set for the PF.
Embodiments of the inventive concept may perform bandwidth throttling for any number of reasons. In addition to QoS policy 1105 of fig. 11 (if the bandwidth for a particular PF disclosed by endpoint 405 of figs. 4A-4C is too large, QoS policy 1105 may lead to throttling of the bandwidth associated with that PF), other reasons for throttling bandwidth may include temperature or power consumption. For example, if the temperature of LWB 135 (or SSD 125 of fig. 1) becomes too high (i.e., exceeds a certain threshold), LWB 135 may throttle bandwidth to lower the temperature. Once the temperature has dropped sufficiently (which may mean a drop below the original threshold, or below another threshold that may be lower than the original threshold), LWB 135 may stop throttling bandwidth. Power consumption may trigger bandwidth throttling in the same manner as temperature: exceeding a power consumption threshold may trigger bandwidth throttling, and dropping below a (possibly lower) power consumption threshold may remove it. These thresholds may be stored somewhere within LWB 135. Yet another reason bandwidth may be throttled is the priorities established for the PFs disclosed by endpoint 405 of figs. 4A-4C: when different PFs are in use, the bandwidth of a PF with lower priority may be throttled in favor of the bandwidth of a PF with higher priority.
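The two-threshold behavior described above (throttle above one threshold, release only below another, lower one) is a form of hysteresis. A minimal Python sketch, with illustrative threshold values that are assumptions rather than values from the patent:

```python
# Hypothetical sketch of temperature-triggered throttling with hysteresis:
# start throttling above one threshold, stop only once temperature drops below
# a second, lower threshold. Values are illustrative.

THROTTLE_ON_C = 85   # start throttling above this temperature
THROTTLE_OFF_C = 75  # stop throttling only once temperature drops below this

def update_throttle(throttling, temperature_c):
    """Return the new throttling state given the current temperature reading."""
    if temperature_c > THROTTLE_ON_C:
        return True
    if temperature_c < THROTTLE_OFF_C:
        return False
    return throttling  # between the thresholds: keep the current state
```

Using two thresholds rather than one avoids rapid oscillation in and out of throttling when the temperature hovers near the trigger point; the same structure applies to power consumption.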
Another reason to throttle bandwidth may be to manage QoS or SLAs. For example, host 105 of fig. 1 may have paid only for a level of bandwidth lower than the bandwidth LWB 135 is capable of providing. Thus, even though host 105 of fig. 1 and LWB 135 may be capable of communicating at a higher overall bandwidth, LWB 135 may limit the bandwidth of host 105 of fig. 1 so that the service provided does not exceed what has been guaranteed to host 105 of fig. 1. In other words, LWB 135 may limit or restrict the maximum storage bandwidth of a PF based on an SLA or pricing plan.
Note that while figs. 12A-12B specifically discuss bandwidth, other resources involved in communication with host 105 of fig. 1 or SSD 125 of fig. 1 may be similarly monitored and throttled. For example, latency may be controlled to some extent, favoring a PF with a lower target latency over a PF with a higher target latency.
The LWB 135 may separately measure and monitor the bandwidth (or other resources) consumed by each PF exposed by the endpoint 405 of figs. 4A-4C. Bandwidth can be measured and monitored in two directions: host-to-device (host write operations) and device-to-host (host read operations). Throttling the bandwidth may also be performed independently in either direction. Bandwidth (or other resource) throttling may be performed as a result of both the measured bandwidth and the QoS policy settings.
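As an illustrative sketch only (the class and method names below are hypothetical and not taken from the patent), per-PF, per-direction bandwidth accounting of the kind described above might look like:

```python
import time
from collections import defaultdict

class BandwidthMonitor:
    """Illustrative per-PF bandwidth monitor (all names hypothetical).

    Tracks bytes moved in each direction independently, mirroring the
    host-to-device (write) / device-to-host (read) split described above.
    """
    def __init__(self):
        # counters[pf_id]["h2d" | "d2h"] = bytes observed since the window began
        self.counters = defaultdict(lambda: {"h2d": 0, "d2h": 0})
        self.window_start = time.monotonic()

    def record(self, pf_id, direction, nbytes):
        assert direction in ("h2d", "d2h")
        self.counters[pf_id][direction] += nbytes

    def bandwidth(self, pf_id, direction):
        """Average bytes/second for one PF in one direction."""
        elapsed = max(time.monotonic() - self.window_start, 1e-9)
        return self.counters[pf_id][direction] / elapsed
```

Because each (PF, direction) pair has its own counter, a QoS policy could throttle host writes on one PF without affecting host reads on the same PF, consistent with the independent-direction throttling mentioned above.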
FIG. 13 illustrates LWB 135 of fig. 1 issuing credits (credits) 1305 to SSD 125 of fig. 1. Credits are one way LWB 135 can manage how much bandwidth SSD 125 uses. Given a particular request, LWB 135 may issue a number of credits 1305, each credit 1305 representing a particular amount of data that SSD 125 may transfer. Thus, the number of credits 1305 issued by LWB 135 may control the bandwidth of SSD 125: LWB 135 may issue just enough (or almost just enough, or any other desired number of) credits 1305 to cover the data transfer. When SSD 125 transfers data, the data transfer uses credits 1305: if SSD 125 does not have any available credits, SSD 125 may not transfer any data until new credits are issued.
The credits 1305 may be issued in any desired manner. In some embodiments of the inventive concept, LWB 135 may send credits 1305 to SSD 125 via a message (which may be, for example, a proprietary message). In other embodiments of the inventive concept, LWB 135 may write credits 1305 into a particular address on SSD 125: e.g., a reserved address in the NVMe address space. In still other embodiments of the inventive concept, LWB 135 may write credits 1305 into an address (again, possibly a reserved address) within LWB 135: SSD 125 may then read that address to see what credits 1305 are available. The use of credits 1305 is one way to throttle the bandwidth of SSD 125. By limiting the number of credits 1305 issued to SSD 125, SSD 125 may be prevented from transferring all of the data required for a particular transaction in a given unit of time. By reducing the bandwidth of SSD 125, SSD 125 may experience reduced power consumption and/or lower temperatures. Eventually, if power consumption and/or temperature drops to an acceptable level (which may be different from the level at which bandwidth throttling began, as discussed above with reference to figs. 12A-12B), LWB 135 may stop throttling the bandwidth of SSD 125.
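The credit mechanism described above can be sketched as follows. This is a minimal model, not the patent's implementation: the credit granularity, class name, and throttle factor are all assumptions made for illustration.

```python
class CreditManager:
    """Illustrative sketch of credit-based transfer pacing (names hypothetical).

    Each credit permits the device to move a fixed amount of data; the
    bridge throttles bandwidth by limiting how many credits it issues.
    """
    BYTES_PER_CREDIT = 4096  # assumed credit granularity

    def __init__(self):
        self.available = 0

    def issue(self, transfer_bytes, throttle_factor=1.0):
        # Issue just enough credits to cover the transfer, scaled down
        # when throttling (throttle_factor < 1.0 slows the device).
        needed = -(-transfer_bytes // self.BYTES_PER_CREDIT)  # ceiling division
        granted = max(1, int(needed * throttle_factor))
        self.available += granted
        return granted

    def consume(self, transfer_bytes):
        # The device may transfer only while credits remain; otherwise it
        # must wait for new credits, which is what throttles its bandwidth.
        needed = -(-transfer_bytes // self.BYTES_PER_CREDIT)
        if needed > self.available:
            return False
        self.available -= needed
        return True
```

Whether the credit count is delivered by message, written into the device, or read back from the bridge (the three options above) does not change this accounting; only the transport differs.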
In another embodiment, LWB 135 may insert idle periods between data transfer packets to achieve bandwidth throttling. LWB 135 may implement the desired bandwidth limit for an individual PF, or for all PFs in total, by adjusting the inter-packet gaps of data packets based on the measured bandwidth (or other resources) and the QoS policy settings.
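One way such an inter-packet gap could be computed is sketched below. The formula is a straightforward rate calculation assumed for illustration, not a gap-sizing rule stated in the patent:

```python
def inter_packet_gap(packet_bytes, measured_bps, target_bps):
    """Sketch of idle-time insertion for bandwidth throttling (hypothetical).

    Returns the idle period (in seconds) to insert after a packet so the
    sustained rate approaches target_bps. Zero gap when traffic is already
    at or below the QoS target.
    """
    if measured_bps <= target_bps:
        return 0.0
    # Time the packet "should" occupy at the target rate, minus the time
    # it actually takes on the wire at the measured rate.
    return packet_bytes / target_bps - packet_bytes / measured_bps
```

For example, a link running at twice its target rate would idle for as long as each packet takes to send, halving the sustained bandwidth.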
Figs. 14A-14B illustrate a flowchart of an example process by which the LWB 135 of fig. 1 identifies the PFs/VFs exposed by the SSD 125 of fig. 1, exposes PFs from the LWB 135, and generates a mapping between the PFs exposed by the LWB 135 of fig. 1 and the PFs/VFs exposed by the SSD 125 of fig. 1, according to embodiments of the inventive concept. In fig. 14A, at block 1405, the LWB 135 of fig. 1 may enumerate the PFs and VFs exposed by the SSD 125 of fig. 1. At block 1410, the LWB 135 of fig. 1 may generate the PFs to be exposed by the LWB 135 of fig. 1 (more specifically, the PFs to be exposed by the endpoint 405 of figs. 4A-4C). Note that the number of PFs generated by the LWB 135 of fig. 1 may not necessarily equal the number of PFs and VFs exposed by the SSD 125 of fig. 1: certain functions exposed by the SSD 125 of fig. 1 may not be given a corresponding PF exposed by the LWB 135 of fig. 1, and a single PF exposed by the LWB 135 of fig. 1 may map to multiple PFs/VFs of the SSD 125 of fig. 1. At block 1415, the LWB 135 of fig. 1 may check to see if there are any other connected devices to enumerate (such as SSD 125-2 of fig. 4C). If so, processing may return to block 1405 to enumerate the PFs and VFs of the next device.
Assuming all connected devices have been enumerated, at block 1420 (fig. 14B), the LWB 135 of fig. 1 may map the PFs exposed by the endpoint 405 of figs. 4A-4C to the PFs and VFs exposed by the SSD 125 of fig. 1 (and other connected devices). The LWB 135 of fig. 1 may determine other resources (such as bandwidth) of the SSD 125 of fig. 1 and any other connected devices at block 1425 (this determination may also be made as part of block 1405 of fig. 14A), and the LWB 135 of fig. 1 may aggregate the resources of all connected devices at block 1430. In this manner, the LWB 135 of fig. 1 may appear to provide greater overall resources than any of the individual devices.
At block 1435, the LWB 135 of fig. 1 may receive an enumeration request from the host 105 of fig. 1. At block 1440, the LWB 135 of fig. 1 may respond to the enumeration request of the host 105 of fig. 1 by exposing the PFs of the endpoint 405 of figs. 4A-4C.
Figs. 15A-15B illustrate a flowchart of an example process by which the LWB 135 of fig. 1 receives and processes requests from the host 105 of fig. 1. In fig. 15A, at block 1505, the LWB 135 of fig. 1 may receive a request from the host 105 of fig. 1 for a certain PF exposed by the endpoint 405 of figs. 4A-4C: which PF the request is directed to, and what function that PF represents, is irrelevant to the analysis. At block 1510, the LWB 135 of fig. 1 may map the PF to which the request is directed to a PF or VF of the SSD 125 of fig. 1 (this mapping may be performed using, for example, the configuration table 435 of figs. 4A-4C). At block 1515, the LWB 135 may select the SSD 125 of fig. 1 as the device that includes the PF or VF to which the PF exposed by the endpoint 405 of figs. 4A-4C maps. Note that, as shown in fig. 4C, block 1515 may be significant only if the LWB 135 of fig. 1 is connected to multiple devices: if there may be multiple devices, the multiple APP-EPs 410-1 and 410-2 of fig. 4C may receive the request. In embodiments of the inventive concept where only one device may be connected to the LWB 135 of fig. 1 (because the LWB 135 includes only one APP-EP 410 of figs. 4A-4B, or because only one SSD 125 of fig. 1 is connected to the LWB 135 of fig. 1), block 1515 may be omitted, as shown by dashed line 1520.
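As an illustration of the kind of lookup a configuration table such as table 435 of figs. 4A-4C might support, the sketch below maps a host-visible PF number to a (device, PF/VF) pair. The table contents and all identifiers here are hypothetical:

```python
# Illustrative configuration table: host-side PF id -> (device, backend function).
# Entries are made up for the example; a real table would be built during
# enumeration of the connected device(s).
CONFIG_TABLE = {
    0: ("ssd0", "PF0"),
    1: ("ssd0", "VF0"),
    2: ("ssd0", "VF1"),
    3: ("ssd1", "VF0"),  # a second device, as in the fig. 4C arrangement
}

def route_request(host_pf):
    """Return the device and PF/VF a host request should be forwarded to."""
    try:
        return CONFIG_TABLE[host_pf]
    except KeyError:
        raise ValueError(f"no mapping for host PF {host_pf}")
```

Note that the table also answers the device-selection question of block 1515: when two host PFs map to functions on different devices, the first element of the pair identifies which device should receive the translated request.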
Once the appropriate device and the appropriate destination PF/VF have been identified, the LWB 135 of fig. 1 may translate the request into a request for the identified PF/VF at block 1525. This may involve address translation, for example, as described above with reference to fig. 9 (and as described below with reference to fig. 16). At block 1530 (fig. 15B), the LWB 135 of fig. 1 may send the translated request to the identified PF/VF on the identified device.
At block 1535, the LWB 135 of fig. 1 may receive a response from the identified PF/VF of the identified device. At block 1540, the LWB 135 of fig. 1 may map the PF/VF of the identified device to a PF exposed by the endpoint 405 of figs. 4A-4C (this mapping may be performed using, for example, the configuration table 435 of figs. 4A-4C). At block 1545, the LWB 135 of fig. 1 may translate the response into a response to the host 105 of fig. 1. Finally, at block 1550, the LWB 135 of fig. 1 may send the response back to the host from the PF exposed by the endpoint 405 of figs. 4A-4C.
FIG. 16 shows a flowchart of an example process by which the APP-EP and APP-RP of figs. 4A-4C translate addresses between the host 105 of fig. 1 and the SSD 125 of fig. 1. In fig. 16, at block 1605, the APP-EP 410 of figs. 4A-4C may subtract the address in the host BAR from the address included in the request from the host 105 of fig. 1, and at block 1610, the APP-RP 415 of figs. 4A-4C may add the address in the device BAR to the address determined in block 1605.
Note that the reverse address translation, from the SSD 125 of fig. 1 to the host 105 of fig. 1, may be omitted. The SSD 125 of fig. 1 (like any such device) may receive the full address used by the host 105 of fig. 1. Thus, an address provided by the SSD 125 of fig. 1 may be used to access a particular address within the host 105 of fig. 1 without translation. But if the SSD 125 of fig. 1 does not have the full address of the host 105 of fig. 1, address translation may be performed in reverse, with the APP-RP 415 of figs. 4A-4C subtracting the address in the device BAR from the address included in the request from the SSD 125 of fig. 1, and the APP-EP 410 of figs. 4A-4C adding the address in the host BAR to that address.
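The subtract-then-add translation of blocks 1605 and 1610, and its optional reverse, reduce to simple arithmetic on BAR base addresses. A sketch (the BAR values in the usage below are made up for illustration):

```python
def host_to_device_address(host_addr, host_bar, device_bar):
    """BAR-based forward translation as described above.

    APP-EP subtracts the host BAR base to get an offset (block 1605);
    APP-RP adds the device BAR base to form the device-side address
    (block 1610).
    """
    offset = host_addr - host_bar
    return device_bar + offset

def device_to_host_address(device_addr, host_bar, device_bar):
    """Optional reverse translation: subtract device BAR, add host BAR."""
    return host_bar + (device_addr - device_bar)
```

Because the two functions are inverses, translating an address forward and then backward returns the original host address, which is why the reverse path can be omitted when the device already holds full host addresses.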
Fig. 17 illustrates a flowchart of an example process by which the LWB 135 of fig. 1 issues the credits 1305 of fig. 13 to the SSD 125 of fig. 1. In fig. 17, at block 1705, the LWB 135 of fig. 1 may measure or monitor the bandwidth consumed by each PF exposed by the endpoint 405 of figs. 4A-4C. At block 1710, the LWB 135 of fig. 1 may determine the credits 1305 of fig. 13 to issue to the SSD 125 of fig. 1. These credits may be determined based on the bandwidth consumed by the PFs exposed by the endpoint 405 of figs. 4A-4C and the amount of data the SSD 125 of fig. 1 may need to transfer. The SSD 125 of fig. 1 may use these credits to transfer data to or from the SSD 125 of fig. 1. There are a variety of options for delivering the credits 1305 of fig. 13 to the SSD 125 of fig. 1. At block 1715, the LWB 135 of fig. 1 may write the credits 1305 of fig. 13 to an address in the SSD 125 of fig. 1. Alternatively, at block 1720, the LWB 135 may write the credits 1305 of fig. 13 to an address in the LWB 135 of fig. 1, and the SSD 125 of fig. 1 may read the credits 1305 of fig. 13 from that address in the LWB 135 of fig. 1. Alternatively, at block 1725, the LWB 135 of fig. 1 may send a message to the SSD 125 of fig. 1, which message may include the credits 1305 of fig. 13.
FIG. 18 illustrates a flowchart of an example process by which the LWB 135 of fig. 1 processes the configuration write request 705 of fig. 7. In fig. 18, at block 1805, the LWB 135 of fig. 1 may receive the configuration write request 705 of fig. 7: the configuration write request 705 of fig. 7 may be received from the host 105 of fig. 1 or the SSD 125 of fig. 1. At block 1810, the LWB 135 of fig. 1 (more specifically, the configuration manager 440 of figs. 4A-4C) may configure the PF exposed by the endpoint 405 of figs. 4A-4C as specified by the configuration write request 705 of fig. 7. At block 1815, the LWB 135 of fig. 1 may perform the configuration using the state machine 510 of fig. 5 and the SR-IOV sequence 505 of fig. 5. Finally, at block 1820, the LWB 135 of fig. 1 may ensure that the configuration of the PFs/VFs exposed by the SSD 125 of fig. 1 is mirrored by the configuration of the PFs exposed by the endpoint 405 of figs. 4A-4C: this may involve sending the configuration write request 705 of fig. 7 to the SSD 125 of fig. 1, as indicated by dashed arrows 710 and 715. Since the configuration write request 705 of fig. 7 may not need to be sent to the SSD 125 of fig. 1, block 1820 is optional, as indicated by dashed line 1825.
FIG. 19 illustrates a flowchart of an example process by which the LWB 135 of fig. 1 processes the configuration read request 805 of fig. 8. In fig. 19, at block 1905, the LWB 135 of fig. 1 may receive the configuration read request 805 of fig. 8: the configuration read request 805 of fig. 8 may be received from the host 105 of fig. 1 or the SSD 125 of fig. 1 (although the configuration read request 805 of fig. 8 may typically be received from the host 105 of fig. 1). At this point, there are several alternatives. At block 1910, the LWB 135 of fig. 1 (more specifically, the configuration manager 440 of figs. 4A-4C) may determine the configuration of the PF exposed by the endpoint 405 of figs. 4A-4C. In embodiments of the inventive concept using block 1910, since the configuration of the PFs exposed by the SSD 125 of fig. 1 should mirror the configuration of the PFs exposed by the endpoint 405 of figs. 4A-4C, the SSD 125 of fig. 1 may not need to be queried for the configuration of the PFs/VFs it exposes (although the SSD 125 of fig. 1 may also be queried for that configuration). Alternatively, at block 1915, the LWB 135 of fig. 1 may deliver the configuration read request 805 of fig. 8 to the SSD 125 of fig. 1 (as illustrated by dashed arrows 815 and 820 of fig. 8), and may receive the configuration information 810 of fig. 8 in return (as illustrated by dashed arrows 825 and 830 of fig. 8). Either way, at block 1920, the LWB 135 of fig. 1 may return the configuration information 810 of fig. 8 to the requestor.
Fig. 20 illustrates a flowchart of an example process by which the LWB 135 of fig. 1 associates the QoS policies 1105 of fig. 11 with the PFs exposed by the LWB 135 of fig. 1. In fig. 20, at block 2005, the LWB 135 of fig. 1 may receive the QoS policy 1105 of fig. 11 from a source, which may be the host 105 of fig. 1 or the SSD 125 of fig. 1. Then, at block 2010, the LWB 135 of fig. 1 may associate the policy with the identified PF exposed by the endpoint 405 of figs. 4A-4C and enforce the policy as appropriate.
FIG. 21 shows a flowchart of an example process by which the LWB 135 of fig. 1 dynamically changes the mapping from the PFs exposed by the LWB 135 of fig. 1 to the PFs/VFs of the SSD 125 of fig. 1. In fig. 21, at block 2105, the LWB 135 of fig. 1 may identify a change in the storage parameter 1005 of fig. 10, the day 1010 of fig. 10, the time of day 1015 of fig. 10, the bandwidth usage 1020 of fig. 10, or the QoS policy change 1025 of fig. 10. At block 2110, the LWB 135 of fig. 1 may dynamically change the mapping 615 of fig. 6 between the PFs 605 of fig. 6 and the PFs/VFs 610 of fig. 6 in response to the detected change.
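A dynamic remapping of this kind can be sketched as a conditional table update. The trigger condition (bandwidth usage crossing a threshold) and the idea of a spare VF are assumptions chosen for illustration; the patent lists several possible triggers without prescribing a policy:

```python
def maybe_remap(mapping, host_pf, bandwidth_usage, threshold, spare_vf):
    """If a PF's bandwidth usage crosses a threshold, point it at a spare VF.

    mapping is a dict from host-visible PF id to backend function name.
    Returns (old, new) when a remap occurs, or None when nothing changes.
    """
    if bandwidth_usage > threshold:
        old = mapping[host_pf]
        mapping[host_pf] = spare_vf
        return old, spare_vf
    return None

# Hypothetical usage: PF 0 is busy, so it is moved to a spare function.
mapping = {0: "VF0"}
result = maybe_remap(mapping, 0, bandwidth_usage=900, threshold=500, spare_vf="VF7")
```

The same update could equally be driven by a day/time schedule or a QoS policy change (items 1010, 1015, and 1025 of fig. 10); only the condition guarding the table write would differ.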
Figs. 22A-22B illustrate a flowchart of an example process by which the LWB 135 of fig. 1 performs bandwidth throttling. In fig. 22A, at block 2205, the LWB 135 of fig. 1 may determine the QoS policy 1105 of fig. 11, or the temperature or power consumption of the LWB 135 of fig. 1 or the SSD 125 of fig. 1. At block 2210, the LWB 135 of fig. 1 may determine whether bandwidth throttling applies: for example, by comparing the temperature or power consumption of the LWB 135 of fig. 1 or the SSD 125 of fig. 1 to a threshold stored in the LWB 135 of fig. 1.
If bandwidth throttling applies, at block 2215, the LWB 135 of fig. 1 may throttle the bandwidth of the SSD 125 of fig. 1. This throttling may be accomplished by limiting the number of credits 1305 of fig. 13 issued to the SSD 125 of fig. 1. Then, at block 2220 (fig. 22B), the LWB 135 of fig. 1 may determine the QoS policy 1105 of fig. 11 (which may be the same QoS policy 1105 of fig. 11 that triggered bandwidth throttling in block 2210 of fig. 22A, or a different QoS policy), or the temperature or power consumption of the LWB 135 of fig. 1 or the SSD 125 of fig. 1. At block 2225, the LWB 135 of fig. 1 may determine whether the QoS policy 1105 of fig. 11, or the new temperature or power consumption of the LWB 135 of fig. 1 or the SSD 125 of fig. 1, has reached a level at which bandwidth throttling is no longer needed. If bandwidth throttling still applies, control may return to block 2220 to again check whether a change that may end bandwidth throttling has occurred. Otherwise, at block 2230, the LWB 135 of fig. 1 may stop bandwidth throttling.
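Because the start and stop thresholds may differ (as noted for temperature and power above), the loop of figs. 22A-22B behaves like a hysteresis controller. The sketch below assumes temperature as the trigger and picks arbitrary threshold values purely for illustration:

```python
def update_throttle(throttling, temperature, start_c=85.0, stop_c=75.0):
    """Hysteresis sketch of the figs. 22A-22B loop (threshold values assumed).

    Throttling begins when temperature exceeds start_c and ends only once
    it falls below a (possibly lower) stop_c, so the bridge does not
    oscillate between the two states near a single threshold.
    """
    if not throttling and temperature > start_c:
        return True   # block 2215: start throttling
    if throttling and temperature < stop_c:
        return False  # block 2230: stop throttling
    return throttling # blocks 2220/2225: keep checking, state unchanged
```

The same two-threshold structure applies unchanged if power consumption or a QoS policy value is substituted for temperature.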
In fig. 14A-22B, some embodiments of the invention are shown. Those skilled in the art will recognize that other embodiments of the invention are possible by changing the order of the blocks, by omitting blocks, or by including connections not shown in the figures. All such variations of the flow diagrams, whether explicitly described or not, are considered embodiments of the invention.
Embodiments of the inventive concept include technical advantages over conventional implementations. A lightweight bridge (LWB) may expose Physical Functions (PFs) to a host machine, which eliminates the need for the host machine to execute single root input/output virtualization (SR-IOV) software to access the Virtual Functions (VFs) of a device, such as a Solid State Drive (SSD). Since the LWB may be a single device, and the device does not require specific hardware to support multiple PFs, the device also does not necessarily require any hardware modifications. Indeed, even if a single housing contains both the LWB and the device, and even if the LWB and the device are implemented on a shared printed circuit board, the inclusion of the LWB may not require any hardware changes to the device, thus allowing existing SSDs (and other devices) to be used with the LWB, and allowing future SSDs (and other devices) to continue to provide VFs rather than including hardware to support multiple PFs. (A device may require some firmware updates, for example, to support the use of credits to manage data transfers, but such updates may be performed easily, even on existing devices already in the field.) Additionally, because the LWB may aggregate the resources of multiple devices, the LWB may provide better overall performance than any individual device could provide.
The following discussion is intended to provide a brief, general description of one or more suitable machines in which certain aspects of the invention may be implemented. One or more machines may be controlled, at least in part, by input from conventional input devices, such as a keyboard, mouse, etc., as well as by instructions received from another machine, interaction with a Virtual Reality (VR) environment, biometric feedback, or other input signals. As used herein, the term "machine" is intended to broadly encompass a single machine, virtual machine, or system of communicatively connected machines, virtual machines, or devices operating together. Exemplary machines include computing devices (such as personal computers, workstations, servers, portable computers, handheld devices, telephones, tablets, etc.) and transportation devices (such as private or public transportation (e.g., cars, trains, taxis, etc.)).
One or more machines may include embedded controllers (such as programmable or non-programmable logic devices or arrays, Application Specific Integrated Circuits (ASICs), embedded computers, smart cards, etc.). One or more machines may utilize one or more connections to one or more remote machines, such as through a network interface, modem, or other communication coupling. The machines may be interconnected by way of a physical and/or logical network (e.g., intranet, internet, local area network, wide area network, etc.). Those skilled in the art will appreciate that network communications may utilize various wired and/or wireless short or long range carriers and protocols, including: radio Frequency (RF), satellite, microwave, Institute of Electrical and Electronics Engineers (IEEE)802.11, bluetooth, optical, infrared, cable, laser, etc.
Embodiments of the invention may be described by reference to or in conjunction with associated data including functions, programs, data structures, applications, etc., which when accessed by a machine result in the machine performing tasks or defining abstract data types or low-level hardware contexts. The associated data may be stored, for example, in volatile and/or non-volatile memory (e.g., RAM, ROM, etc.), or in other storage devices and their associated storage media (including hard drives, floppy disks, optical storage, tapes, flash memory, memory sticks, digital video disks, biological memory, etc.). The associated data may be transmitted over a transmission environment (including physical and/or logical networks) in the form of packets, serial data, parallel data, propagated signals, etc., and may be used in a compressed or encrypted format. The associated data may be used in a distributed environment and stored locally and/or remotely for machine access.
Embodiments of the invention may include a tangible, non-transitory, machine-readable medium that may include instructions executable by one or more processors, the instructions including instructions for performing elements of the invention as described herein.
Having described and illustrated the principles of the invention with reference to illustrated embodiments, it will be recognized that the illustrated embodiments can be modified in arrangement and detail, and can be combined in any desired manner, without departing from such principles. Also, while the foregoing discussion has focused on particular embodiments, other configurations are contemplated. In particular, although expressions such as "embodiments in accordance with the inventive concept" or the like are used herein, these phrases are intended to refer to embodiment possibilities generally and are not intended to limit the invention to particular embodiment configurations. As used herein, these terms may refer to the same or different embodiments that are combinable into other embodiments.
The foregoing illustrative embodiments should not be construed as limiting the invention. Although a few embodiments have been described, those skilled in the art will readily appreciate that many modifications are possible in those embodiments without materially departing from the novel teachings and advantages of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of this invention as defined in the claims.
Embodiments of the inventive concept may extend to the following statements, without limitation:
statement 1, an embodiment of the inventive concept includes a lightweight bridge (LWB) circuit comprising:
an endpoint for connecting to a host, the endpoint exposing a plurality of Physical Functions (PFs);
a root port for connecting to a device, the device exposing at least one PF and at least one Virtual Function (VF) to the root port; and
an application layer-endpoint (APP-EP) and an application layer-root port (APP-RP) for converting between the plurality of PFs exposed to the host and the at least one PF and the at least one VF exposed by the device,
wherein the APP-EP and the APP-RP implement a mapping between the plurality of PFs exposed by the endpoint and the at least one PF and the at least one VF exposed by the device.
Statement 2, embodiments of the inventive concept include the LWB circuit of statement 1, wherein the device includes a Solid State Drive (SSD).
Statement 3, embodiments of the inventive concepts include the LWB circuit of statement 1, wherein a device includes the LWB circuit.
Statement 4, embodiments of the inventive concepts include the LWB circuit of statement 1, wherein the LWB circuit is connected to a device.
Statement 5, embodiments of the inventive concept include the LWB circuit of statement 1, wherein the APP-EP and the APP-RP may be implemented as a single component.
Statement 6, embodiments of the inventive concepts include the LWB circuit of statement 1, wherein the LWB circuit includes at least one of a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a general purpose processor, a Graphics Processor (GPU), and a general purpose GPU (gpgpu).
Statement 7, an embodiment of the inventive concept includes the LWB circuit of statement 1, wherein:
the plurality of PFs includes a plurality of peripheral component interconnect express (PCIe) PFs;
the at least one PF comprises at least one PCIe PF; and
The at least one Virtual Function (VF) comprises at least one PCIe VF.
Statement 8, an embodiment of the inventive concept includes the LWB circuit of statement 7, wherein:
APP-EP includes PCIe APP-EP (PAPP-EP); and
APP-RP includes PCIe APP-RP (PAPP-RP).
Statement 9, embodiments of the inventive concepts include the LWB circuit of statement 1, further comprising: a configuration manager for configuring the APP-EP and the APP-RP.
Statement 10, embodiments of the inventive concept include the LWB circuit of statement 9, wherein the APP-RP may include a configuration table for storing the mapping between the plurality of PFs exposed by the endpoint and the at least one PF and the at least one VF exposed by the device.
Statement 11, embodiments of the inventive concepts include the LWB circuit of statement 9, wherein the configuration manager may enumerate the at least one PF and the at least one VF exposed by the device and generate the plurality of PFs exposed by the endpoint based at least in part on the at least one PF and the at least one VF exposed by the device.
Statement 12, an embodiment of the inventive concept includes the LWB circuit of statement 9, wherein the configuration manager may configure the device.
Statement 13, an embodiment of the inventive concept includes the LWB circuit of statement 12, wherein the configuration manager can configure the device based at least in part on a configuration write request that can be received at the endpoint from the host.
Statement 14, embodiments of the inventive concept include the LWB circuit of statement 9, wherein the APP-EP can intercept a configuration read request from the host and return configuration information to the host.
Statement 15, embodiments of the inventive concepts include the LWB circuit of statement 9, wherein the configuration manager can ensure that the first configuration of the device mirrors the second configuration of the endpoint.
Statement 16, embodiments of the inventive concepts include the LWB circuit of statement 9, wherein the configuration manager includes: a Read Only Memory (ROM) and a state machine to implement a single root input/output virtualization (SR-IOV) sequence.
Statement 17, an embodiment of the inventive concept includes the LWB circuit of statement 1, wherein:
the APP-EP may subtract the Base Address Register (BAR) of the PF exposed by the endpoint from the address received in the request from the host; and
The APP-RP may add the BAR of the VF exposed by the device to the address received from the APP-EP.
Statement 18, embodiments of the inventive concepts include the LWB circuit of statement 1, further comprising:
a second APP-EP and a second APP-RP;
a second device exposing at least one second PF and at least one second VF; and
a multiplexer/demultiplexer arranged to be connected to the endpoint and to each of the APP-EP and the second APP-EP,
wherein the second APP-EP and the second APP-RP implement a second mapping between the plurality of PFs exposed by the endpoint and the at least one second PF and the at least one second VF exposed by the second device.
Statement 19, an embodiment of the inventive concept includes the LWB circuit of statement 18, wherein the LWB circuit may provide the aggregated resources of the device and the second device.
Statement 20, an embodiment of the inventive concept includes the LWB circuit of statement 1, wherein the mapping between the plurality of PFs exposed by the endpoint and the at least one PF and the at least one VF exposed by the device may be dynamically changed.
Statement 21, embodiments of the inventive concept include the LWB circuit of statement 20, wherein the mapping between the plurality of PFs exposed by the endpoint and the at least one PF and the at least one VF exposed by the device may change based at least in part on a change in a quality of service (QoS) policy for at least one PF of the plurality of PFs exposed by the endpoint, a storage parameter, a day, a time of day, or a bandwidth usage.
Statement 22, embodiments of the inventive concepts include the LWB circuit of statement 1, wherein the LWB circuit can achieve bandwidth throttling.
Statement 23, embodiments of the inventive concept include the LWB circuit of statement 22, wherein the LWB circuit may measure the bandwidth used by at least one PF of the plurality of PFs exposed by the endpoint.
Statement 24, embodiments of the inventive concepts include the LWB circuit of statement 22, wherein the LWB circuit can implement bandwidth throttling based at least in part on a policy set by a host, a policy set by a device, a temperature of the LWB circuit, a power consumption of the LWB circuit, a temperature of an SSD, or a power consumption of an SSD.
Statement 25, embodiments of the inventive concept include the LWB circuit of statement 24, wherein the LWB circuit further comprises: a memory for a threshold for triggering bandwidth throttling.
Statement 26, an embodiment of the inventive concept includes the LWB circuit of statement 25, wherein the LWB circuit further comprises: a memory for a second threshold for ending bandwidth throttling.
Statement 27, an embodiment of the inventive concept includes the LWB circuit of statement 22, wherein the LWB circuit can issue credits to a device for data transfer.
Statement 28, an embodiment of the inventive concept includes the LWB circuit of statement 27, wherein the LWB circuit can write credits to an address in a device.
Statement 29, embodiments of the inventive concepts include the LWB circuit of statement 27, wherein a device can read credits from an address in the LWB circuit.
Statement 30, embodiments of the inventive concepts include the LWB circuit of statement 27, wherein the LWB circuit can send credits to a device in a message.
Statement 31, embodiments of the inventive concepts include the LWB circuit of statement 22, wherein the LWB circuit can achieve bandwidth throttling by introducing an inter-packet gap between a first data packet and a second data packet.
Statement 32, embodiments of the inventive concept include the LWB circuit of statement 1, wherein the LWB circuit may enforce a QoS policy on a PF of the plurality of PFs exposed by the endpoint.
Statement 33, an embodiment of the inventive concept includes the LWB circuit of statement 32, wherein the policy may be set by one of the host, a Baseboard Management Controller (BMC), and the device.
Statement 34, embodiments of the inventive concept include the LWB circuit of statement 1, wherein the LWB circuit may configure a capability of at least one PF of the plurality of PFs exposed by the endpoint based at least in part on a configuration write request received from the device.
Statement 35, an embodiment of the inventive concept includes a method comprising:
enumerating at least one Physical Function (PF) exposed by a device using a root port of a lightweight bridge (LWB);
enumerating at least one Virtual Function (VF) exposed by the device using a root port of the LWB;
generating a plurality of PFs at an endpoint of the LWB to be exposed to a host; and
mapping the plurality of PFs at the endpoint of the LWB to the at least one PF and the at least one VF exposed by the device using an application layer-endpoint (APP-EP) and an application layer-root port (APP-RP) of the LWB.
Statement 36, an embodiment of the inventive concept includes the method of statement 35, wherein:
the step of enumerating at least one Physical Function (PF) exposed by the device using a root port of a lightweight bridge (LWB) comprises: enumerating the at least one PF exposed by the device using the root port of the LWB at start-up of the LWB; and
the step of enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB comprises: enumerating the at least one VF exposed by the device using the root port of the LWB at start-up of the LWB.
Statement 37, an embodiment of the inventive concept includes the method of statement 35, wherein:
the step of enumerating at least one Physical Function (PF) exposed by the device using a root port of a lightweight bridge (LWB) comprises: enumerating the at least one PF exposed by the device using the root port of the LWB when the device is connected to the LWB; and
the step of enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB comprises: enumerating the at least one VF exposed by the device using the root port of the LWB when the device is connected to the LWB.
Statement 38, an embodiment of the inventive concept includes the method of statement 35, wherein the apparatus comprises a Solid State Drive (SSD).
Statement 39, an embodiment of the inventive concept includes the method of statement 35, wherein:
the plurality of PFs includes a plurality of peripheral component interconnect express (PCIe) PFs;
the at least one Physical Function (PF) comprises at least one PCIe PF; and
The at least one Virtual Function (VF) comprises at least one PCIe VF.
Statement 40, an embodiment of the inventive concept includes the method of statement 39, wherein:
the APP-EP includes a PCIe APP-EP (PAPP-EP); and
the APP-RP includes a PCIe APP-RP (PAPP-RP).
Statement 41, an embodiment of the inventive concept includes the method of statement 35, further comprising:
receiving an enumeration from a host; and
exposing the plurality of PFs at the endpoints of the LWB to the host.
Statement 42, an embodiment of the inventive concept includes the method of statement 35, wherein:
the step of enumerating at least one Physical Function (PF) exposed by the device using a root port of a lightweight bridge (LWB) comprises: enumerating at least one second PF exposed by a second device using a second root port of the LWB;
the step of enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB comprises: enumerating at least one second VF exposed by the second device using the second root port of the LWB; and
the step of mapping the plurality of PFs to the at least one PF and the at least one VF comprises: mapping the plurality of PFs to the at least one PF, the at least one VF, the at least one second PF, and the at least one second VF.
Statement 43, an embodiment of the inventive concept includes the method of statement 42, further comprising:
determining resources of the device and the second device; and
aggregating the resources of the device and the second device.
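The aggregation of statements 42-43 can be sketched as follows; the resource names and the simple summing policy are hypothetical illustrations only, since the statements do not specify which resources are aggregated or how:

```python
# Hypothetical sketch of statements 42-43: the LWB enumerates functions from
# two devices through two root ports and aggregates their resources so that
# the host sees one pooled capacity behind the endpoint PFs.

def aggregate_resources(devices):
    """Sum per-device resources (e.g. capacity, queue pairs) into one pool."""
    pool = {}
    for dev in devices:
        for name, amount in dev.items():
            pool[name] = pool.get(name, 0) + amount
    return pool

pool = aggregate_resources([{"capacity_gb": 960, "queues": 64},
                            {"capacity_gb": 480, "queues": 32}])
# pool == {"capacity_gb": 1440, "queues": 96}
```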
Statement 44, an embodiment of the inventive concept includes the method of statement 35, further comprising:
receiving, at a PF of the plurality of PFs exposed by an endpoint, a request from a host;
mapping a PF of the plurality of PFs exposed by the endpoints of the LWB to a VF of the at least one VF exposed by the device;
converting the request from a PF of the plurality of PFs exposed by the endpoint to a VF of the at least one VF exposed by the device; and
sending the request to a VF of the at least one VF exposed by the device.
Statement 45, an embodiment of the inventive concept includes the method of statement 44, further comprising:
receiving a response to the request from a VF of the at least one VF exposed by the device;
mapping a VF of the at least one VF exposed by the device to a PF of the plurality of PFs exposed by an endpoint of the LWB;
converting a response from a response of a VF of the at least one VF exposed by the device to a response of a PF of the plurality of PFs exposed by the endpoint; and
sending a response to the host via a PF of the plurality of PFs exposed by the endpoints of the LWB.
Statement 46, an embodiment of the inventive concept includes the method of statement 45, wherein the step of translating the response from the response of the VF of the at least one VF exposed by the device to the response of the PF of the plurality of PFs exposed by the endpoint comprises: modifying an address in the response based on a first Base Address Register (BAR) of a PF of the plurality of PFs exposed by the endpoint and a second BAR of a VF of the at least one VF exposed by the device.
Statement 47, an embodiment of the inventive concept includes the method of statement 45, wherein mapping the VF of the at least one VF exposed by the device to a PF of the PFs exposed by the endpoints of the LWB comprises: mapping a VF of the at least one VF exposed by the device to a PF of the plurality of PFs exposed by the endpoints of the LWB using a configuration table in the APP-RP of the LWB.
Statement 48, an embodiment of the inventive concept includes the method of statement 44, wherein the step of converting the request from a PF of the plurality of PFs exposed by the endpoint to a VF of the at least one VF exposed by the device comprises: modifying an address in the request based on a first Base Address Register (BAR) of the PF of the plurality of PFs exposed by the endpoint and a second BAR of the VF of the at least one VF exposed by the device.
Statement 49, an embodiment of the inventive concept includes the method of statement 44, wherein the step of mapping a PF of the plurality of PFs exposed by the endpoints of the LWB to a VF of the at least one VF exposed by the device comprises: mapping the PF of the plurality of PFs exposed by the endpoints of the LWB to the VF of the at least one VF exposed by the device using a configuration table in the APP-RP of the LWB.
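The BAR-based address rewriting of statements 46 and 48 amounts to an offset-preserving rebase between two BAR windows. The BAR bases and window size below are invented for illustration:

```python
# Hypothetical sketch of statements 46/48: an address inside the endpoint
# PF's BAR window is rebased into the device VF's BAR window by keeping the
# offset within the window unchanged.

def translate_address(addr, endpoint_bar_base, device_bar_base, bar_size):
    """Rebase addr from the endpoint PF's BAR into the device VF's BAR."""
    offset = addr - endpoint_bar_base
    if not 0 <= offset < bar_size:
        raise ValueError("address outside the endpoint PF's BAR window")
    return device_bar_base + offset

# Host targets offset 0x10 in the endpoint PF's BAR; the LWB forwards the
# access to the same offset in the device VF's BAR (all values invented).
assert translate_address(0x9000_0010, 0x9000_0000, 0x4000_0000, 0x1000) == 0x4000_0010
```

The response path of statement 46 would apply the same rebase with the two BAR bases swapped.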
Statement 50, an embodiment of the inventive concept includes the method of statement 44, wherein the step of sending the request to a VF of the at least one VF exposed by the device comprises:
selecting between the device and a second device exposing at least one second PF and at least one second VF, based at least in part on the mapping of a PF of the plurality of PFs exposed by the endpoints of the LWB to a VF of the at least one VF exposed by the device; and
sending the request to an APP-EP associated with the device.
Statement 51, an embodiment of the inventive concept includes the method of statement 44, further comprising:
determining a credit for a data transfer involving the device; and
issuing the credit to the device.
Statement 52, an embodiment of the inventive concept includes the method of statement 51, wherein the step of issuing the credit to the device comprises: writing the credit to an address in the device.
Statement 53, an embodiment of the inventive concept includes the method of statement 51, wherein the step of issuing the credit to the device comprises: the device reading the credit from an address in the LWB.
Statement 54, an embodiment of the inventive concept includes the method of statement 51, wherein the step of issuing the credit to the device comprises: sending a message from the LWB to the device, the message including the credit.
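The three credit-delivery variants of statements 52-54 can be sketched as follows. The class, the per-chunk credit policy, and the mechanism names are assumptions made for illustration; the statements do not define how a credit is computed:

```python
# Hypothetical sketch of statements 51-54: the LWB determines a credit for a
# pending data transfer and issues it by one of three delivery mechanisms.

class CreditIssuer:
    def __init__(self):
        self.lwb_credit_register = 0      # read by the device (statement 53)

    def determine_credit(self, transfer_bytes, chunk=4096):
        # Illustrative policy only: one credit per chunk of the transfer.
        return -(-transfer_bytes // chunk)   # ceiling division

    def issue(self, credit, mechanism, device=None):
        if mechanism == "write":          # statement 52: write into the device
            device["credit_addr"] = credit
        elif mechanism == "register":     # statement 53: device reads from LWB
            self.lwb_credit_register = credit
        elif mechanism == "message":      # statement 54: message carries credit
            return {"type": "CREDIT", "value": credit}

issuer = CreditIssuer()
credit = issuer.determine_credit(10000)   # 10000 bytes -> 3 credits of 4 KiB
```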
Statement 55, an embodiment of the inventive concept includes the method of statement 35, further comprising:
receiving a configuration write request; and
configuring the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request.
Statement 56, an embodiment of the inventive concept includes the method of statement 55, further comprising: configuring the at least one PF and the at least one VF exposed by the device to mirror a configuration of the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request.
Statement 57, an embodiment of the inventive concept includes the method of statement 55, wherein the step of receiving the configuration write request comprises: receiving the configuration write request from the host.
Statement 58, an embodiment of the inventive concept includes the method of statement 55, wherein the step of receiving the configuration write request comprises: receiving the configuration write request from the device.
Statement 59, an embodiment of the inventive concept includes the method of statement 58, wherein the step of configuring the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request comprises: configuring capabilities of at least one PF of the plurality of PFs exposed by the endpoints based at least in part on the configuration write request received from the device.
Statement 60, an embodiment of the inventive concept includes the method of statement 55, wherein the step of configuring the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request comprises: performing a single root input/output virtualization (SR-IOV) sequence using a state machine.
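The state-machine-driven SR-IOV sequence of statement 60 can be sketched as follows. The state names are a simplified illustration, not the PCIe-defined set of configuration steps:

```python
# Hypothetical sketch of statement 60: a small state machine that walks an
# SR-IOV configuration sequence as the host's configuration writes arrive.

SRIOV_SEQUENCE = ["IDLE", "READ_CAPS", "SET_NUM_VFS", "ENABLE_VFS", "DONE"]

class SriovStateMachine:
    def __init__(self):
        self.state = "IDLE"

    def step(self):
        """Advance one state per triggering configuration write."""
        i = SRIOV_SEQUENCE.index(self.state)
        if i < len(SRIOV_SEQUENCE) - 1:
            self.state = SRIOV_SEQUENCE[i + 1]
        return self.state

sm = SriovStateMachine()
while sm.state != "DONE":
    sm.step()
# sm.state == "DONE" after four steps
```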
Statement 61, an embodiment of the inventive concept includes the method of statement 35, further comprising:
receiving a configuration read request;
determining a configuration of the at least one PF and the at least one VF of the device; and
responding to the configuration read request with the configuration of the at least one PF and the at least one VF of the device.
Statement 62, an embodiment of the inventive concept includes the method of statement 61, wherein the step of determining the configuration of the at least one PF and the at least one VF of the device comprises: determining a configuration of the plurality of PFs exposed by the endpoints of the LWB.
Statement 63, an embodiment of the inventive concept includes the method of statement 61, wherein the step of determining the configuration of the at least one PF and the at least one VF of the device comprises: querying the device for the configuration of the at least one PF and the at least one VF of the device.
Statement 64, an embodiment of the inventive concept includes the method of statement 61, wherein the step of determining the configuration of the at least one PF and the at least one VF of the device comprises: not delivering the configuration read request to the device.
Statement 65, an embodiment of the inventive concept includes the method of statement 35, further comprising: dynamically changing the mapping of the plurality of PFs at the endpoints of the LWB to the at least one PF and the at least one VF exposed by the device using the application layer-endpoint (APP-EP) and the application layer-root port (APP-RP) of the LWB.
Statement 66, an embodiment of the inventive concept includes the method of statement 65, wherein the step of dynamically changing the mapping of the plurality of PFs at the endpoints of the LWB to the at least one PF and the at least one VF exposed by the device using the application layer-endpoint (APP-EP) and the application layer-root port (APP-RP) of the LWB comprises: dynamically changing the mapping based at least in part on a change in a quality of service (QoS) policy, a storage parameter, a day, a time of day, or a bandwidth usage for at least one PF of the plurality of PFs exposed by the endpoints of the LWB.
Statement 67, an embodiment of the inventive concept includes the method of statement 35, further comprising: throttling a bandwidth of the device.
Statement 68, an embodiment of the inventive concept includes the method of statement 67, further comprising: measuring a bandwidth used by at least one PF of the plurality of PFs exposed by the endpoints.
Statement 69, an embodiment of the inventive concept includes the method of statement 67, wherein the step of throttling the bandwidth of the device comprises: introducing an inter-packet gap between a first data packet and a second data packet.
Statement 70, an embodiment of the inventive concept includes the method of statement 67, wherein throttling the bandwidth of the device comprises: throttling a bandwidth of the device based at least in part on one of a QoS policy set by the host, a QoS policy set by the device, a temperature of the LWB, a power consumption of the LWB, a temperature of the SSD, and a power consumption of the SSD.
Statement 71, an embodiment of the inventive concept includes the method of statement 70, wherein throttling the bandwidth of the device based at least in part on one of the QoS policy set by the host, the QoS policy set by the device, the temperature of the LWB, the power consumption of the LWB, the temperature of the SSD, and the power consumption of the SSD comprises: throttling a bandwidth of the device based at least in part on one of a temperature of the LWB, a power consumption of the LWB, a temperature of the SSD, and a power consumption of the SSD exceeding a threshold.
Statement 72, an embodiment of the inventive concept includes the method of statement 71, further comprising: ending bandwidth throttling of the device based at least in part on one of the temperature of the LWB, the power consumption of the LWB, the temperature of the SSD, and the power consumption of the SSD falling below a second threshold.
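The threshold/second-threshold behavior of statements 71-72 amounts to throttling with hysteresis: throttle when a monitored value exceeds one threshold, and stop only when it falls below a second, lower threshold. The sketch below uses temperature with invented threshold values; the same shape applies to power consumption:

```python
# Hypothetical sketch of statements 70-72: hysteresis-based throttling of
# device bandwidth driven by a monitored value such as LWB temperature.

class ThrottleController:
    def __init__(self, start_threshold=85.0, stop_threshold=75.0):
        self.start_threshold = start_threshold   # statement 71: begin throttling
        self.stop_threshold = stop_threshold     # statement 72: end throttling
        self.throttling = False

    def update(self, temperature):
        """Return whether the device's bandwidth should be throttled."""
        if temperature > self.start_threshold:
            self.throttling = True
        elif temperature < self.stop_threshold:
            self.throttling = False
        return self.throttling

tc = ThrottleController()
tc.update(90.0)   # above the first threshold: throttling begins
tc.update(80.0)   # between thresholds: throttling continues (hysteresis)
tc.update(70.0)   # below the second threshold: throttling ends
```

Using two thresholds instead of one avoids rapid oscillation between the throttled and unthrottled states when the monitored value hovers near a single cutoff.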
Statement 73, an embodiment of the inventive concept includes the method of statement 35, further comprising:
receiving a QoS policy of a PF of the plurality of PFs exposed by the endpoints of the LWB; and
enforcing the QoS policy of the PF of the plurality of PFs exposed by the endpoints of the LWB.
Statement 74, an embodiment of the inventive concept includes the method of statement 73, wherein the step of receiving the QoS policy of the PF of the plurality of PFs exposed by the endpoints of the LWB comprises: receiving the QoS policy of the PF of the plurality of PFs exposed by the endpoints of the LWB from one of the host and the device.
Statement 75, an embodiment of the inventive concept includes an article comprising a non-transitory storage medium having instructions stored thereon that, when executed by a machine, cause:
enumerating at least one Physical Function (PF) exposed by a device using a root port of a lightweight bridge (LWB);
enumerating at least one Virtual Function (VF) exposed by the device using a root port of the LWB;
generating a plurality of PFs at endpoints of the LWB to be exposed to a host; and
mapping the plurality of PFs at the endpoints of the LWB to the at least one PF and the at least one VF exposed by the device using an application layer-endpoint (APP-EP) and an application layer-root port (APP-RP) of the LWB.
Statement 76, an embodiment of the inventive concept includes the article according to statement 75, wherein:
the step of enumerating at least one Physical Function (PF) exposed by the device using a root port of a lightweight bridge (LWB) comprises: enumerating the at least one PF exposed by the device using the root port of the LWB at start-up of the LWB; and
the step of enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB comprises: enumerating the at least one VF exposed by the device using the root port of the LWB at start-up of the LWB.
Statement 77, an embodiment of the inventive concept includes the article according to statement 75, wherein:
the step of enumerating at least one Physical Function (PF) exposed by the device using a root port of a lightweight bridge (LWB) comprises: enumerating the at least one PF exposed by the device using the root port of the LWB when the device is connected to the LWB; and
the step of enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB comprises: enumerating the at least one VF exposed by the device using the root port of the LWB when the device is connected to the LWB.
Statement 78, an embodiment of the inventive concept includes the article according to statement 75, wherein the device comprises a Solid State Drive (SSD).
Statement 79, an embodiment of the inventive concept includes the article according to statement 75, wherein:
the plurality of PFs includes a plurality of peripheral component interconnect express (PCIe) PFs;
the at least one Physical Function (PF) comprises at least one PCIe PF; and
the at least one Virtual Function (VF) comprises at least one PCIe VF.
Statement 80, an embodiment of the inventive concept includes the article according to statement 79, wherein:
the APP-EP includes a PCIe APP-EP (PAPP-EP); and
the APP-RP includes a PCIe APP-RP (PAPP-RP).
Statement 81, an embodiment of the inventive concept includes the article according to statement 75, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in:
receiving an enumeration from a host; and
exposing the plurality of PFs at the endpoints of the LWB to the host.
Statement 82, an embodiment of the inventive concept includes the article according to statement 75, wherein:
the step of enumerating at least one Physical Function (PF) exposed by the device using a root port of a lightweight bridge (LWB) comprises: enumerating at least one second PF exposed by a second device using a second root port of the LWB;
the step of enumerating at least one Virtual Function (VF) exposed by the device using the root port of the LWB comprises: enumerating at least one second VF exposed by the second device using the second root port of the LWB; and
the step of mapping the PFs to the at least one PF and the at least one VF comprises: mapping the PFs to the at least one PF, the at least one VF, the at least one second PF, and the at least one second VF.
Statement 83, an embodiment of the inventive concept includes an article according to statement 82, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, cause:
determining resources of the device and the second device; and
aggregating the resources of the device and the second device.
Statement 84, an embodiment of the inventive concept includes an article according to statement 75, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in:
receiving, at a PF of the plurality of PFs exposed by an endpoint, a request from a host;
mapping a PF of the plurality of PFs exposed by the endpoints of the LWB to a VF of the at least one VF exposed by the device;
converting the request from a PF of the plurality of PFs exposed by the endpoint to a VF of the at least one VF exposed by the device; and
sending the request to a VF of the at least one VF exposed by the device.
Statement 85, an embodiment of the inventive concept includes the article according to statement 84, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in:
receiving a response to the request from a VF of the at least one VF exposed by the device;
mapping a VF of the at least one VF exposed by the device to a PF of the plurality of PFs exposed by an endpoint of the LWB;
converting the response from a response of a VF of the at least one VF exposed by the device to a response of a PF of the plurality of PFs exposed by the endpoint; and
sending a response to the host via a PF of the plurality of PFs exposed by the endpoints of the LWB.
Statement 86, an embodiment of the inventive concept includes the article according to statement 85, wherein the step of converting the response from the response of a VF of the at least one VF exposed by the device to the response of a PF of the plurality of PFs exposed by the endpoint comprises: modifying an address in the response based on a first Base Address Register (BAR) of the PF of the plurality of PFs exposed by the endpoint and a second BAR of the VF of the at least one VF exposed by the device.
Statement 87, an embodiment of the inventive concept includes the article according to statement 85, wherein the step of mapping a VF of the at least one VF exposed by the device to a PF of the plurality of PFs exposed by the endpoints of the LWB comprises: mapping the VF of the at least one VF exposed by the device to the PF of the plurality of PFs exposed by the endpoints of the LWB using a configuration table in the APP-RP of the LWB.
Statement 88, embodiments of the inventive concepts include an article according to statement 84, wherein the step of converting a request from a PF of the plurality of PFs exposed by an endpoint to a VF of the at least one VF exposed by a device comprises: modifying the address in the request based on a first Base Address Register (BAR) of a PF of the plurality of PFs exposed by the endpoint and a second BAR of a VF of the at least one VF exposed by the device.
Statement 89, an embodiment of the inventive concept includes the article according to statement 84, wherein the step of mapping a PF of the plurality of PFs exposed by the endpoints of the LWB to a VF of the at least one VF exposed by the device comprises: mapping the PF of the plurality of PFs exposed by the endpoints of the LWB to the VF of the at least one VF exposed by the device using a configuration table in the APP-RP of the LWB.
Statement 90, an embodiment of the inventive concept includes the article according to statement 84, wherein the step of sending the request to a VF of the at least one VF exposed by the device comprises:
selecting between the device and a second device exposing at least one second PF and at least one second VF, based at least in part on the mapping of a PF of the plurality of PFs exposed by the endpoints of the LWB to a VF of the at least one VF exposed by the device; and
sending the request to an APP-EP associated with the device.
Statement 91, an embodiment of the inventive concept includes an article according to statement 84, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in:
determining a credit for a data transfer involving the device; and
issuing the credit to the device.
Statement 92, an embodiment of the inventive concept includes the article according to statement 91, wherein the step of issuing the credit to the device comprises: writing the credit to an address in the device.
Statement 93, an embodiment of the inventive concept includes the article according to statement 91, wherein the step of issuing the credit to the device comprises: the device reading the credit from an address in the LWB.
Statement 94, an embodiment of the inventive concept includes the article according to statement 91, wherein the step of issuing the credit to the device comprises: sending a message from the LWB to the device, the message including the credit.
Statement 95, embodiments of the inventive concepts include an article according to statement 75, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in:
receiving a configuration write request; and
configuring the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request.
Statement 96, an embodiment of the inventive concept includes an article according to statement 95, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in: configuring the at least one PF and the at least one VF exposed by the device to mirror a configuration of the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request.
Statement 97, an embodiment of the inventive concept includes the article according to statement 95, wherein the step of receiving the configuration write request comprises: receiving the configuration write request from the host.
Statement 98, an embodiment of the inventive concept includes the article according to statement 95, wherein the step of receiving the configuration write request comprises: receiving the configuration write request from the device.
Statement 99, an embodiment of the inventive concept includes the article according to statement 98, wherein the step of configuring the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request comprises: configuring capabilities of at least one PF of the plurality of PFs exposed by the endpoints based at least in part on the configuration write request received from the device.
Statement 100, an embodiment of the inventive concept includes the article according to statement 95, wherein the step of configuring the plurality of PFs exposed by the endpoints of the LWB based at least in part on the configuration write request comprises: performing a single root input/output virtualization (SR-IOV) sequence using a state machine.
Statement 101, an embodiment of the inventive concept includes an article according to statement 75, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in:
receiving a configuration read request;
determining a configuration of the at least one PF and the at least one VF of the device; and
responding to the configuration read request with the configuration of the at least one PF and the at least one VF of the device.
Statement 102, an embodiment of the inventive concept includes the article according to statement 101, wherein the step of determining the configuration of the at least one PF and the at least one VF of the device comprises: determining a configuration of the plurality of PFs exposed by the endpoints of the LWB.
Statement 103, an embodiment of the inventive concept includes the article according to statement 101, wherein the step of determining the configuration of the at least one PF and the at least one VF of the device comprises: querying the device for the configuration of the at least one PF and the at least one VF of the device.
Statement 104, an embodiment of the inventive concept includes the article according to statement 101, wherein the step of determining the configuration of the at least one PF and the at least one VF of the device comprises: not delivering the configuration read request to the device.
Statement 105, an embodiment of the inventive concept includes the article according to statement 75, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in: dynamically changing the mapping of the plurality of PFs at the endpoints of the LWB to the at least one PF and the at least one VF exposed by the device using the application layer-endpoint (APP-EP) and the application layer-root port (APP-RP) of the LWB.
Statement 106, an embodiment of the inventive concept includes the article according to statement 105, wherein the step of dynamically changing the mapping of the plurality of PFs at the endpoints of the LWB to the at least one PF and the at least one VF exposed by the device using the application layer-endpoint (APP-EP) and the application layer-root port (APP-RP) of the LWB comprises: dynamically changing the mapping based at least in part on a change in a quality of service (QoS) policy, a storage parameter, a day, a time of day, or a bandwidth usage for at least one PF of the plurality of PFs exposed by the endpoints of the LWB.
Statement 107, an embodiment of the inventive concept includes the article according to statement 75, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in: throttling a bandwidth of the device.
Statement 108, an embodiment of the inventive concept includes the article according to statement 107, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in: measuring a bandwidth used by at least one PF of the plurality of PFs exposed by the endpoints.
Statement 109, an embodiment of the inventive concept includes the article according to statement 107, wherein the step of throttling the bandwidth of the device comprises: introducing an inter-packet gap between a first data packet and a second data packet.
Statement 110, an embodiment of the inventive concept includes the article according to statement 107, wherein the step of throttling the bandwidth of the device comprises: throttling the bandwidth of the device based at least in part on one of a QoS policy set by the host, a QoS policy set by the device, a temperature of the LWB, a power consumption of the LWB, a temperature of the SSD, and a power consumption of the SSD.
Statement 111, an embodiment of the inventive concept includes the article according to statement 110, wherein the step of throttling the bandwidth of the device based at least in part on one of the QoS policy set by the host, the QoS policy set by the device, the temperature of the LWB, the power consumption of the LWB, the temperature of the SSD, and the power consumption of the SSD comprises: throttling the bandwidth of the device based at least in part on one of the temperature of the LWB, the power consumption of the LWB, the temperature of the SSD, and the power consumption of the SSD exceeding a threshold.
Statement 112, an embodiment of the inventive concept includes an article according to statement 111, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in: ending bandwidth throttling of the device based at least in part on one of the temperature of the LWB, the power consumption of the LWB, the temperature of the SSD, and the power consumption of the SSD falling below a second threshold.
Statement 113, an embodiment of the inventive concept includes the article according to statement 75, the non-transitory storage medium having stored thereon further instructions that, when executed by a machine, result in:
receiving a QoS policy of a PF of the plurality of PFs exposed by the endpoints of the LWB; and
enforcing the QoS policy of the PF of the plurality of PFs exposed by the endpoints of the LWB.
Statement 114, an embodiment of the inventive concept includes the article according to statement 113, wherein the step of receiving the QoS policy of the PF of the plurality of PFs exposed by the endpoints of the LWB comprises: receiving the QoS policy of the PF of the plurality of PFs exposed by the endpoints of the LWB from one of the host and the device.
In view of the wide variety of permutations to the embodiments described herein, the present detailed description and accompanying materials are intended to be illustrative only, and should not be taken as limiting the scope of the invention. What is claimed as the invention, therefore, is all such modifications as may come within the scope and spirit of the following claims and equivalents thereto.

Claims (20)

1. A lightweight bridge circuit, comprising:
an endpoint for connecting to a host, the endpoint exposing a plurality of physical functions;
a root port for connecting to a device, the device exposing at least one physical function and at least one virtual function to the root port; and
an application layer-endpoint and an application layer-root port for translating between the plurality of physical functions exposed to the host and the at least one physical function and the at least one virtual function exposed by the device,
wherein the application layer-endpoint and the application layer-root port implement a mapping between the plurality of physical functions exposed by the endpoint and the at least one physical function and the at least one virtual function exposed by the device.
2. A lightweight bridge circuit according to claim 1, further comprising: a configuration manager for configuring the application layer-endpoint and the application layer-root port.
3. A lightweight bridge circuit according to claim 2, wherein the application layer-root port comprises a configuration table for storing a mapping between the plurality of physical functions exposed by the endpoints and the at least one physical function and the at least one virtual function exposed by the apparatus.
4. A lightweight bridge circuit according to claim 2, wherein the configuration manager enumerates the at least one physical function and the at least one virtual function exposed by the device, and generates the plurality of physical functions exposed by the endpoints based at least in part on the at least one physical function and the at least one virtual function exposed by the device.
5. A lightweight bridge circuit according to claim 2, wherein the configuration manager ensures that a first configuration of the device mirrors a second configuration of the endpoint.
6. A lightweight bridge circuit according to claim 2, wherein the configuration manager comprises: a read-only memory and a state machine for implementing a single root input/output virtualization sequence.
7. A lightweight bridge circuit according to any of claims 1 to 6, further comprising:
a second application layer-endpoint and a second application layer-root port;
a second device exposing at least one second physical function and at least one second virtual function; and
a multiplexer and a demultiplexer connected to the endpoint and to each of the application layer-endpoint and the second application layer-endpoint,
wherein the second application layer-endpoint and the second application layer-root port implement a second mapping between the plurality of physical functions exposed by the endpoint and the at least one second physical function and the at least one second virtual function exposed by the second device.
8. A lightweight bridge circuit according to claim 7, wherein the lightweight bridge circuit provides aggregated resources for the device and the second device.
9. A lightweight bridge circuit according to claim 1, wherein the lightweight bridge circuit implements bandwidth throttling.
10. A lightweight bridge circuit according to claim 9, wherein the lightweight bridge circuit implements bandwidth throttling based at least in part on at least one of a policy set by a host, a policy set by the device, a temperature of the lightweight bridge circuit, a power consumption of the lightweight bridge circuit, a temperature of the device, and a power consumption of the device.
11. The lightweight bridge circuit of claim 1, wherein the lightweight bridge circuit implements a QoS policy on a physical function of the plurality of physical functions exposed by the endpoint.
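As an illustration only, a per-PF QoS policy as in claim 11 can be sketched as a simple limit looked up by host-visible PF number. The IOPS-cap form of the policy and all numbers are assumptions:

```python
# Hypothetical per-PF QoS policy: each host-visible PF gets its own IOPS cap,
# and a request is admitted only while the PF is under its cap.
qos_policy = {
    0: {"max_iops": 100_000},  # high-priority PF
    1: {"max_iops": 20_000},   # background PF
}

def admit(pf, current_iops):
    """Return True if a new request on `pf` stays within its QoS cap."""
    return current_iops < qos_policy[pf]["max_iops"]

print(admit(1, 10_000))  # -> True
```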
12. A method of operating a lightweight bridge, comprising:
enumerating at least one physical function exposed by a device using a root port of the lightweight bridge;
enumerating at least one virtual function exposed by the device using the root port of the lightweight bridge;
generating a plurality of physical functions at an endpoint of the lightweight bridge to be exposed to a host; and
mapping the plurality of physical functions at the endpoint of the lightweight bridge to the at least one physical function and the at least one virtual function exposed by the device using an application layer-endpoint and an application layer-root port of the lightweight bridge.
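The four steps of claim 12 can be sketched as a single bring-up flow (an editorial illustration; the function name and data shapes are assumptions, not part of the patent):

```python
# Hypothetical sketch of claim 12: enumerate the device's functions, generate
# one host-visible PF per function, and record the mapping between them.
def bring_up_bridge(device_pfs, device_vfs):
    # Steps 1-2: enumerate the device's PF(s) and VF(s) via the root port.
    functions = [("PF", n) for n in device_pfs] + [("VF", n) for n in device_vfs]
    # Step 3: generate one host-visible PF per enumerated device function.
    host_pfs = list(range(len(functions)))
    # Step 4: map each host-visible PF to its backing device function.
    return {pf: fn for pf, fn in zip(host_pfs, functions)}

# A device with one PF and three VFs yields four PFs exposed to the host:
mapping = bring_up_bridge(device_pfs=[0], device_vfs=[0, 1, 2])
print(mapping[3])  # -> ('VF', 2)
```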
13. The method of claim 12, wherein:
enumerating the at least one physical function exposed by the device using the root port of the lightweight bridge comprises enumerating at least one second physical function exposed by a second device using a second root port of the lightweight bridge;
enumerating the at least one virtual function exposed by the device using the root port of the lightweight bridge comprises enumerating at least one second virtual function exposed by the second device using the second root port of the lightweight bridge; and
mapping the plurality of physical functions to the at least one physical function and the at least one virtual function comprises mapping the plurality of physical functions to the at least one physical function, the at least one virtual function, the at least one second physical function, and the at least one second virtual function.
14. The method of claim 13, further comprising:
determining resources of the device and the second device; and
aggregating the resources of the device and the second device.
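As an illustration of claims 13-14 (not part of the claims), aggregating the resources of two devices behind the bridge can be sketched as a per-resource sum; the resource names and values are assumptions:

```python
# Hypothetical sketch of claim 14: the bridge discovers each device's resources
# and presents their aggregate to the host as one pool.
def aggregate_resources(dev_a, dev_b):
    keys = set(dev_a) | set(dev_b)
    return {k: dev_a.get(k, 0) + dev_b.get(k, 0) for k in keys}

total = aggregate_resources(
    {"capacity_gb": 512, "queues": 8},
    {"capacity_gb": 1024, "queues": 8},
)
print(total["capacity_gb"])  # -> 1536
```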
15. The method of any of claims 12 to 14, further comprising:
receiving, from the host, a request for a physical function of the plurality of physical functions exposed by the endpoint;
mapping the physical function of the plurality of physical functions exposed by the endpoint of the lightweight bridge to a virtual function of the at least one virtual function exposed by the device;
translating the request from a request for the physical function of the plurality of physical functions exposed by the endpoint to a request for the virtual function of the at least one virtual function exposed by the device; and
sending the request to the virtual function of the at least one virtual function exposed by the device.
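The translation path of claim 15 can be sketched as rewriting the function a request targets (an editorial illustration; the request format and mapping values are assumptions):

```python
# Hypothetical sketch of claim 15: a host request addressed to one of the
# endpoint's PFs is rewritten to target the device VF that PF maps to.
def translate_request(request, pf_to_vf):
    vf = pf_to_vf[request["function"]]
    return {**request, "function": vf, "function_kind": "VF"}

# Host-visible PF 0 is backed by the device's own PF; PFs 1-3 map to VFs 0-2.
pf_to_vf = {1: 0, 2: 1, 3: 2}
host_req = {"function": 2, "function_kind": "PF", "op": "read", "lba": 0x100}
dev_req = translate_request(host_req, pf_to_vf)
print(dev_req["function"])  # -> 1
```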
16. The method of claim 15, further comprising:
determining a credit for a data transfer involving the device; and
issuing the credit to the device.
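As an illustration only, the credit mechanism of claim 16 can be sketched as a counter: the bridge issues credits, each data transfer consumes one, and the device cannot transfer without credits. The details are assumptions:

```python
# Hypothetical sketch of claim-16 credit handling between bridge and device.
class CreditManager:
    def __init__(self, initial_credits):
        self.credits = initial_credits

    def issue(self, n=1):
        # Bridge grants more credits, e.g. as buffer space frees up.
        self.credits += n

    def consume(self):
        # Each data transfer spends one credit; none left means the
        # device must wait for the bridge to issue more.
        if self.credits == 0:
            raise RuntimeError("no credits available")
        self.credits -= 1

cm = CreditManager(initial_credits=2)
cm.consume()
cm.consume()
cm.issue(1)
print(cm.credits)  # -> 1
```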
17. The method of claim 12, further comprising throttling a bandwidth of the device.
18. The method of claim 17, wherein throttling the bandwidth of the device comprises: throttling a bandwidth of the device based at least in part on at least one of a QoS policy set by the host, a QoS policy set by the device, a temperature of the lightweight bridge, a power consumption of the lightweight bridge, a temperature of the device, and a power consumption of the device.
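The throttling decision of claims 17-18 can be sketched as follows (an editorial illustration; the thresholds, units, and halving policy are assumptions, not taken from the patent):

```python
# Hypothetical sketch of claim 18: bandwidth is capped by the QoS policy and
# further reduced when a thermal or power limit is exceeded.
def allowed_bandwidth(requested_mbps, qos_limit_mbps, temp_c, power_w,
                      temp_limit_c=85, power_limit_w=25):
    bw = min(requested_mbps, qos_limit_mbps)   # enforce the QoS policy cap
    if temp_c > temp_limit_c or power_w > power_limit_w:
        bw //= 2                               # back off under thermal/power stress
    return bw

print(allowed_bandwidth(1000, 800, temp_c=90, power_w=20))  # -> 400
```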
19. An article comprising a non-transitory storage medium having stored thereon instructions that, when executed by a machine, cause:
enumerating at least one physical function exposed by a device using a root port of a lightweight bridge;
enumerating at least one virtual function exposed by the device using the root port of the lightweight bridge;
generating a plurality of physical functions at an endpoint of the lightweight bridge to be exposed to a host; and
mapping the plurality of physical functions at the endpoint of the lightweight bridge to the at least one physical function and the at least one virtual function exposed by the device using an application layer-endpoint and an application layer-root port of the lightweight bridge.
20. The article of claim 19, wherein:
enumerating the at least one physical function exposed by the device using the root port of the lightweight bridge comprises enumerating at least one second physical function exposed by a second device using a second root port of the lightweight bridge;
enumerating the at least one virtual function exposed by the device using the root port of the lightweight bridge comprises enumerating at least one second virtual function exposed by the second device using the second root port of the lightweight bridge; and
mapping the plurality of physical functions to the at least one physical function and the at least one virtual function comprises mapping the plurality of physical functions to the at least one physical function, the at least one virtual function, the at least one second physical function, and the at least one second virtual function.
CN202010582886.0A 2019-06-24 2020-06-23 Lightweight bridge circuit and method of operating the same Pending CN112131166A (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201962865962P 2019-06-24 2019-06-24
US62/865,962 2019-06-24
US202062964114P 2020-01-21 2020-01-21
US62/964,114 2020-01-21
US16/846,271 2020-04-10
US16/846,271 US11809799B2 (en) 2019-06-24 2020-04-10 Systems and methods for multi PF emulation using VFs in SSD controller

Publications (1)

Publication Number Publication Date
CN112131166A true CN112131166A (en) 2020-12-25

Family

ID=73851757

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010582886.0A Pending CN112131166A (en) 2019-06-24 2020-06-23 Lightweight bridge circuit and method of operating the same

Country Status (2)

Country Link
JP (1) JP7446167B2 (en)
CN (1) CN112131166A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160154756A1 (en) * 2014-03-31 2016-06-02 Avago Technologies General Ip (Singapore) Pte. Ltd Unordered multi-path routing in a pcie express fabric environment
US20160306580A1 (en) * 2015-04-17 2016-10-20 Samsung Electronics Co., Ltd. System and method to extend nvme queues to user space
CN106484492A (en) * 2015-08-28 2017-03-08 杭州华为数字技术有限公司 The method and system of configuration interface
CN109656473A (en) * 2017-10-11 2019-04-19 三星电子株式会社 The method that bridge-set and offer are calculated close to storage

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7836238B2 (en) 2006-12-19 2010-11-16 International Business Machines Corporation Hot-plug/remove of a new component in a running PCIe fabric
CN105721357B (en) 2016-01-13 2019-09-03 华为技术有限公司 Switching equipment, peripheral parts interconnected High Speed System and its initial method
US20170212579A1 (en) 2016-01-25 2017-07-27 Avago Technologies General Ip (Singapore) Pte. Ltd. Storage Device With Power Management Throttling


Also Published As

Publication number Publication date
JP2021002348A (en) 2021-01-07
JP7446167B2 (en) 2024-03-08

Similar Documents

Publication Publication Date Title
CN107690622B9 (en) Method, equipment and system for realizing hardware acceleration processing
US20200278880A1 (en) Method, apparatus, and system for accessing storage device
US11809799B2 (en) Systems and methods for multi PF emulation using VFs in SSD controller
US9467512B2 (en) Techniques for remote client access to a storage medium coupled with a server
JP2019153304A (en) Acceleration module based on fpga for ssd, system, and method for operating the same
TWI772279B (en) Method, system and apparauts for qos-aware io management for pcie storage system with reconfigurable multi -ports
CN110727617B (en) Method and system for simultaneously accessing a two-wire SSD device via a PCIe EP and a network interface
AU2013388031A1 (en) Data processing system and data processing method
US20220300448A1 (en) Peripheral component interconnect express device and method of operating the same
JP2023524665A (en) Utilizing coherently attached interfaces in the network stack framework
US20230051825A1 (en) System supporting virtualization of sr-iov capable devices
US9753883B2 (en) Network interface device that maps host bus writes of configuration information for virtual NIDs into a small transactional memory
US20220300442A1 (en) Peripheral component interconnect express device and method of operating the same
US11347512B1 (en) Substitution through protocol to protocol translation
US20150142995A1 (en) Determining a direct memory access data transfer mode
US11003618B1 (en) Out-of-band interconnect control and isolation
JP6760579B2 (en) Network line card (LC) integration into host operating system (OS)
US9146693B2 (en) Storage control device, storage system, and storage control method
CN112131166A (en) Lightweight bridge circuit and method of operating the same
JP2022087808A (en) Method, system and computer program of notifying endpoint of storage area network congestion
JP2023528782A (en) How network adapters process data and network adapters
US10936219B2 (en) Controller-based inter-device notational data movement system
CN117591450B (en) Data processing system, method, equipment and medium
US8856481B1 (en) Data processing system having host-controlled provisioning of data storage resources
US11513690B2 (en) Multi-dimensional I/O service levels

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination