US20230071775A1 - Method and apparatus for performing power analytics of a storage system - Google Patents

Method and apparatus for performing power analytics of a storage system

Info

Publication number
US20230071775A1
Authority
US
United States
Prior art keywords
power
storage device
local service
storage
service processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/055,331
Inventor
Ramdas P. Kachare
Wentao Wu
Sompong Paul Olarig
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/975,463 external-priority patent/US11481016B2/en
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Priority to US18/055,331 priority Critical patent/US20230071775A1/en
Assigned to SAMSUNG ELECTRONICS CO., LTD. reassignment SAMSUNG ELECTRONICS CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KACHARE, RAMDAS P., OLARIG, SOMPONG PAUL, WU, WENTAO
Publication of US20230071775A1 publication Critical patent/US20230071775A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234Power saving characterised by the action undertaken
    • G06F1/325Power saving in peripheral device
    • G06F1/3268Power saving in hard disk drive
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234Power saving characterised by the action undertaken
    • G06F1/325Power saving in peripheral device
    • G06F1/3275Power saving in memory, e.g. RAM, cache
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206Monitoring of events, devices or parameters that trigger a change in power modality
    • G06F1/3215Monitoring of peripheral devices
    • G06F1/3221Monitoring of peripheral devices of disk drive devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3206Monitoring of events, devices or parameters that trigger a change in power modality
    • G06F1/3215Monitoring of peripheral devices
    • G06F1/3225Monitoring of peripheral devices of memory devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234Power saving characterised by the action undertaken
    • G06F1/3287Power saving characterised by the action undertaken by switching off individual functional units in the computer system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3034Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3058Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations
    • G06F11/3062Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations where the monitored property is the power consumption
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466Performance evaluation by tracing or monitoring
    • G06F11/3476Data logging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/263Arrangements for using multiple switchable power supplies, e.g. battery and AC
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/28Supervision thereof, e.g. detecting power-supply failure by out of limits supervision
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/30Means for acting in the event of power-supply failure or interruption, e.g. power-supply fluctuations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2015Redundant power supplies
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/81Threshold
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/50Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Definitions

  • U.S. application Ser. No. 16/167,306 is also a continuation-in-part of U.S. patent application Ser. No. 15/975,463, filed May 9, 2018, entitled “METHOD AND APPARATUS FOR SELF-REGULATING POWER USAGE AND POWER CONSUMPTION IN ETHERNET SSD STORAGE SYSTEMS”, which claims priority to and the benefit of U.S. Provisional Application No. 62/638,035, filed Mar. 2, 2018, the entire contents of both of which are incorporated herein by reference.
  • eSSD: Ethernet-attached solid state drive
  • NVMe: Non-Volatile Memory Express
  • NVMe-oF: NVMe over Fabrics
  • Cloud-based storage providers typically charge users for storing their data on a monthly or annual basis based on the total storage space allocated to the user and either the average cost of energy consumed by all users or the maximum power that the user is capable of consuming on the system. For example, for two users who have purchased the same amount of cloud storage space, a user who stores only a small amount of data relative to the total purchased storage space and only stores data on an infrequent basis will be charged the same as a user who is regularly removing and adding new data and using the majority of his/her purchased storage space. Ideally, users should be charged for storage based on the energy resources actually consumed. However, there is no accurate method for calculating the power consumption of individual users, or calculating power consumption in real time.
  • aspects of embodiments of the present invention are directed to a storage system, and a method of operating the same, capable of managing (e.g., optimizing) operation of the power supplies of the system by dynamically monitoring their operation and ensuring that active power supplies operate in their high power-efficiency range.
  • aspects of embodiments of the present invention are directed to a storage system, and a method of operating the same, capable of managing (e.g., optimizing) power usage of storage devices of a storage bank by dynamically adjusting their maximum power caps based on the workload of the storage bank.
  • a storage system comprising: one or more storage devices; a plurality of power supplies configured to supply power to the storage device; a processor; and a memory having stored thereon instructions that, when executed by the processor, cause the processor to perform: determining whether multiple power supplies of the plurality of power supplies are active; in response to determining that multiple power supplies are active: determining a total power consumption of the one or more storage devices; in response to determining that the total power consumption is less than a first percentage threshold of a load of active ones of the power supplies, deactivating the active ones of the power supplies one by one until the total power consumption is equal to or greater than the first percentage threshold of a load of each of the active ones of the power supplies; and in response to determining that the total power consumption is equal to or greater than a second percentage threshold of a load of each of the active ones of the power supplies, activating deactivated ones of the power supplies one by one until the total power consumption is less than the second percentage threshold of the load of each of the active
  • the determining the total power consumption of the one or more storage devices comprises: obtaining an actual power consumption of each storage device of the one or more storage devices from the storage device or a corresponding power meter; and summing the actual power consumption of each storage device to obtain the total power consumption.
  • the obtaining the actual power consumption of each storage device comprises: retrieving power measurement information from a power log corresponding to the storage device, wherein the power measurement information is measured, and recorded in the power log, by the corresponding power meter.
  • the corresponding power meter is internal to the storage device.
  • the corresponding power meter is external to and coupled to the storage device.
  • the first percentage threshold of the load of each of the active ones of the power supplies is 40% of the load of each of the active ones of the power supplies.
  • the second percentage threshold of the load of each of the active ones of the power supplies is 90% of the load of each of the active ones of the power supplies.
  • the instructions further cause the processor to perform: determining whether only one power supply of the plurality of power supplies is in a high-availability mode; and in response to determining that only one power supply of the plurality of power supplies is in a high-availability mode, generating a warning message indicating that the one power supply is in high-availability mode.
  • the deactivating the active ones of the power supplies one by one comprises: deactivating an active power supply of the active ones of the power supplies; determining that the total power consumption of the one or more storage devices is less than the first percentage threshold of a load of the active ones of the power supplies; and in response to the determining, deactivating an other active power supply of the active ones of the power supplies.
  • the activating the deactivated ones of the power supplies one by one comprises: activating a deactivated power supply of the power supplies; determining that the total power consumption of the one or more storage devices is equal to or greater than the second percentage threshold of a load of the active ones of the power supplies; and in response to the determining, enabling an other deactivated power supply of the power supplies.
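  • As an illustration only, the power-supply balancing described above can be sketched in Python. The function name `rebalance_power_supplies`, the assumption that every supply has the same rated capacity `rated_w`, and the reading of "a load of the active ones of the power supplies" as their combined rated capacity are assumptions made for this sketch and do not come from the specification. The 40% and 90% defaults follow the example thresholds given above.

```python
def rebalance_power_supplies(device_power_w, active, inactive, rated_w,
                             low=0.40, high=0.90):
    """Return (active, inactive) supply counts after one rebalancing pass.

    device_power_w: actual power readings in watts, one per storage device.
    rated_w: rated capacity of a single supply (assumed identical for all).
    """
    # Total consumption is the sum of the per-device readings.
    total_w = sum(device_power_w)

    # Deactivate supplies one by one while the total stays below the
    # first (low) threshold of the active supplies' capacity.
    # Keeping at least one supply active is an assumption of this sketch.
    while active > 1 and total_w < low * active * rated_w:
        active -= 1
        inactive += 1

    # Activate supplies one by one while the total is at or above the
    # second (high) threshold of the active supplies' capacity.
    while inactive > 0 and total_w >= high * active * rated_w:
        active += 1
        inactive -= 1

    return active, inactive
```

  • With two active 1000 W supplies and devices drawing 350 W in total, the sketch sheds one supply, since 350 W is below 40% of 2000 W. With devices drawing 1900 W, it brings a spare supply online, since 1900 W is at or above 90% of 2000 W.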
  • a method of managing a storage system comprising one or more storage devices and a plurality of power supplies configured to supply power to the storage device, the method comprising: determining, by a processor of the storage device, whether multiple power supplies of the plurality of power supplies are active; in response to determining that multiple power supplies are active: determining, by the processor, a total power consumption of the one or more storage devices; in response to determining that the total power consumption is less than a first percentage threshold of a load of active ones of the power supplies, deactivating, by the processor, the active ones of the power supplies one by one until the total power consumption is equal to or greater than the first percentage threshold of a load of each of the active ones of the power supplies; and in response to determining that the total power consumption is equal to or greater than a second percentage threshold of a load of each of the active ones of the power supplies, activating, by the processor, deactivated ones of the power supplies one by one until the total power consumption is less than the second percentage threshold of
  • a storage system comprising: a plurality of storage devices, each storage device of the plurality of storage devices being configured to measure a power consumption of the storage device; a processor in communication with the plurality of storage devices; and a memory having stored thereon instructions that, when executed by the processor, cause the processor to perform: determining whether one or more first storage devices of the plurality of storage devices are idle or are in an idle state; in response to determining that the one or more first storage devices are in an idle state, instructing the one or more first storage devices to operate at lower power caps; determining whether one or more second storage devices of the plurality of storage devices are consuming power under a threshold power level; and in response to determining that the one or more second storage devices are consuming power under the threshold power level, instructing the one or more second storage devices to operate at or below the threshold power level.
  • the determining whether the one or more first storage devices are in an idle state comprises: obtaining power consumption of each storage device of the plurality of storage devices by retrieving a corresponding power log from the storage device; comparing the power consumption of each storage device with an idle power level; and determining whether the one or more first storage devices have power consumptions that are at or below the idle power level.
  • the power log stores actual power consumption of the corresponding storage device as measured by a corresponding power meter.
  • instructing the one or more first storage devices to operate at the lower power caps comprises: instructing the one or more first storage devices to change power states to a power state having a lower maximum power rating.
  • determining whether the one or more second storage devices of the plurality of storage devices are consuming power under a threshold power level comprises: obtaining power consumption of each storage device of the plurality of storage devices by retrieving a corresponding power log from the storage device; comparing the power consumption of each storage device with the threshold power level; and determining whether the one or more first storage devices have power consumptions that are below the threshold power level.
  • instructing the one or more second storage devices to operate at or below the threshold power level comprises: instructing the one or more second storage devices to change power states to a power state having a maximum power rating corresponding to the threshold power level.
  • the instructions further cause the processor to perform: determining whether one or more storage slots are not occupied by any storage device; and in response to determining that the one or more storage slots are not occupied by any storage device: identifying one or more power meters associated with the one or more storage slots; and instructing the identified one or more power meters to operate at a lower power cap.
  • instructing the identified one or more power meters to operate at a lower power cap comprises: instructing the one or more power meters to operate at a lowest power state.
  • instructing the identified one or more power meters to operate at a lower power cap comprises: instructing the one or more power meters to deactivate.
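  • The storage-device power-cap logic summarized above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `plan_power_caps`, the dictionary-based interface, and the `lowest_cap_w` parameter are hypothetical, and a real system would apply the caps by instructing each device to change power states rather than by returning watt values.

```python
def plan_power_caps(power_w_by_device, idle_w, threshold_w, lowest_cap_w):
    """Return {device_id: new_cap_w} for devices whose cap should change.

    power_w_by_device: latest power-log reading per device, in watts.
    idle_w: consumption at or below this level counts as idle.
    threshold_w: consumption below this level counts as light load.
    lowest_cap_w: hypothetical cap applied to idle devices.
    """
    caps = {}
    for dev, watts in power_w_by_device.items():
        if watts <= idle_w:
            # Idle device: instruct it to operate at a lower power cap.
            caps[dev] = lowest_cap_w
        elif watts < threshold_w:
            # Lightly loaded device: cap it at the threshold power level.
            caps[dev] = threshold_w
        # Devices at or above the threshold keep their current cap.
    return caps
```

  • For example, with an idle level of 3 W and a threshold of 12 W, a device drawing 2 W would receive the lowest cap, a device drawing 8 W would be capped at 12 W, and a device drawing 20 W would be left unchanged.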
  • FIG. 1 is an internal block diagram of a storage device according to an embodiment of the present invention.
  • FIG. 2 is a flow chart of a method for collecting power consumption measurements from a power measurement unit in the storage device of FIG. 1.
  • FIG. 3 is a schematic diagram of a storage system incorporating multiple storage devices that are capable of providing power measurements.
  • FIG. 4 is a block diagram of an embodiment of the storage system of FIG. 3 in which a PCIe switch is used.
  • FIG. 5 is a diagram depicting an embodiment in which power measurements are transferred to a local service processor based on a query from the local service processor.
  • FIG. 6 is a diagram depicting an embodiment in which power measurements are set by the local service processor.
  • FIG. 7 shows an example of a power policy which can be used by the local service processor 50 to control power consumption of a storage device.
  • FIG. 8 is a diagram depicting an embodiment in which power measurements are stored in a controller memory buffer until fetched by the local service processor.
  • FIG. 9 is a diagram depicting an embodiment in which power measurements taken by a power measurement unit are directly accessible to the local service processor.
  • FIG. 10 is an example of a power log according to an embodiment of the present invention.
  • FIG. 11 is an illustrative method of how a storage system manages the power reporting of multiple storage devices in its chassis using the power log of FIG. 10.
  • FIG. 12 is a block diagram illustrating a storage system utilizing a storage bank and a power distribution unit, according to some exemplary embodiments of the present invention.
  • FIGS. 13A-13D illustrate histograms of power consumption of a storage system as generated by the local service processor, according to some exemplary embodiments of the present invention.
  • FIG. 14 is a flow diagram illustrating a process of managing the operation of the power supplies of a storage system, according to some exemplary embodiments of the present invention.
  • FIG. 15 is a flow diagram illustrating a process of managing the storage devices of the storage system, according to some exemplary embodiments of the present invention.
  • the term “substantially,” “about,” and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art. Further, the use of “may” when describing embodiments of the present invention refers to “one or more embodiments of the present invention.” As used herein, the terms “use,” “using,” and “used” may be considered synonymous with the terms “utilize,” “utilizing,” and “utilized,” respectively. Also, the term “exemplary” is intended to refer to an example or illustration.
  • the electronic or electric devices and/or any other relevant devices or components according to embodiments of the present invention described herein may be implemented utilizing any suitable hardware, firmware (e.g. an application-specific integrated circuit), software, or a combination of software, firmware, and hardware.
  • the various components of these devices may be formed on one integrated circuit (IC) chip or on separate IC chips.
  • the various components of these devices may be implemented on a flexible printed circuit film, a tape carrier package (TCP), a printed circuit board (PCB), or formed on one substrate.
  • the various components of these devices may be a process or thread, running on one or more processors, in one or more computing devices, executing computer program instructions and interacting with other system components for performing the various functionalities described herein.
  • the computer program instructions are stored in a memory which may be implemented in a computing device using a standard memory device, such as, for example, a random access memory (RAM).
  • the computer program instructions may also be stored in other non-transitory computer readable media such as, for example, a CD-ROM, flash drive, or the like.
  • a person of skill in the art should recognize that the functionality of various computing devices may be combined or integrated into a single computing device, or the functionality of a particular computing device may be distributed across one or more other computing devices without departing from the spirit and scope of the exemplary embodiments of the present invention.
  • Embodiments of the present invention include a storage device, such as an SSD (e.g., NVMe or NVMe-oF SSD), that is capable of reporting its actual power consumption to the local service processor, for example, a baseboard management controller (BMC).
  • BMC: baseboard management controller
  • the storage device can report to the local service processor or BMC via a system management bus (SMBus) or a Peripheral Component Interconnect Express (PCIe) bus, and can report using one of various protocols, such as the Management Component Transport Protocol (MCTP) or the NVMe Management Interface (NVMe-MI) protocol for NVMe SSD storage devices.
  • the storage system may be an NVMe-oF based system.
  • Further embodiments include a storage system including several storage devices in which each storage device is capable of reporting its actual power consumption to the local service processor. In such a system, the local service processor can provide power profiles and analytics of the storage system and individual storage devices in the system.
  • FIG. 1 depicts an internal block diagram of a storage device 10 according to an embodiment of the present invention. While the diagram depicts features relevant to the illustrated embodiment of the invention, the storage device 10 may include additional components.
  • the storage device 10 may be an SSD, an Ethernet SSD (eSSD), an NVMe SSD, an NVMe-oF SSD, or a SAS or SATA SSD.
  • the storage device 10 includes internal components, including a controller 11 , a memory 12 , flash dies 13 , a power metering unit (PMU) 14 and a connector 15 .
  • the controller 11 , also known as the processor, implements firmware to retrieve and store data in the memory 12 and flash dies 13 and to communicate with a host computer.
  • the controller 11 may be an SSD controller, an ASIC SSD controller, or an NVMe-oF/EdgeSSD controller.
  • the memory 12 can be a random access memory such as DRAM or MRAM and the flash dies 13 may be NAND flash memory devices, though the invention is not limited thereto.
  • the controller 11 can be connected to the memory 12 via memory channel 22 and can be connected to the flash dies 13 via flash channels 23 .
  • the controller 11 can communicate with a host computer via a host interface 20 that connects the controller 11 to the host computer through the connector 15 .
  • the host interface 20 may be a PCIe connection, an Ethernet connection or other suitable connection.
  • the connector 15 may be U.2/M.2 connectors or other suitable connector(s).
  • the PMU 14 allows the storage device 10 to support power management capabilities by measuring actual power consumption of the storage device 10 .
  • the storage device 10 is supplied power through the connector 15 via power rails or pins 30 .
  • the pins 30 may be 12 V and 3.3 V pins.
  • the pins 30 may be 5 V and 12 V pins (an NVMe SSD may only use the 12 V pin, while a SAS or SATA SSD may use both rails).
  • Power rails 30 supply power to the various components of the storage device 10 .
  • the power rail may supply power to the various components of the storage device 10 via the PMU 14 and various intermediary voltage rails. An embodiment of this is shown in FIG.
  • the power rails 30 supply power to the PMU 14 , which then distributes power to other components of the storage device 10 .
  • the PMU 14 drives power to the flash dies 13 via flash voltage rails 33 .
  • the PMU 14 may similarly drive all power rails to the memory 12 via memory voltage rails 32 .
  • Power can be supplied to the controller 11 by the PMU 14 through multiple voltage rails, such as, for example, a core voltage rail 34 , an I/O voltage rail 35 and one or more other voltage rails 36 .
  • Additional voltage rails, such as an additional voltage rail 37 may be included to connect other various components that may be included in the storage device 10 .
  • the various voltage rails 30 , 33 , 34 , 35 , 36 , 37 used in the storage device 10 can be in a range of from 12V down to 0.6V, including 12V and/or 3.3V rails, for example, when the storage device 10 is an NVMe SSD. While, in the embodiments shown in FIG. 1 , the voltage regulators are built into (or integrated with) the PMU 14 , embodiments of the present invention are not limited thereto, and the voltage regulators may be external to the PMU 14 .
  • power supply rails 20 are provided by the PMU 14 inside the storage device 10 to generate power consumption measurements (“power measurements”) of the various voltage rails used by the components of the storage device 10 , for example, used by components such as the controller 11 , the flash dies 13 , the memory 12 and other various components that may be included in the storage device 10 .
  • the PMU 14 can be programmed to support get/set Power State by Power Info from the host computer or BMC.
  • the PMU 14 can measure the amount of current drawn on various voltage rails it is driving, for example, voltage rails 32 , 33 , 34 , 35 , 36 and 37 .
  • the PMU can output power measurements including the average, minimum and maximum voltage usage by the voltage rails 32 , 33 , 34 , 35 , 36 and 37 of the storage device 10 .
  • the PMU 14 can meter each voltage rail 32 , 33 , 34 , 35 , 36 and 37 individually, with the summation of all voltage rails 32 , 33 , 34 , 35 , 36 and 37 used by the storage device 10 being the total power consumed by the storage device 10 .
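  • The per-rail metering described above amounts to computing the power on each rail from its voltage and measured current, then summing across rails. The following is a minimal sketch with hypothetical rail names and readings; the function name `total_power_w` and the `(volts, amps)` tuple layout are assumptions for illustration, not an interface defined by the specification.

```python
def total_power_w(rails):
    """Sum the power drawn on every metered rail.

    rails: mapping of rail name -> (volts, amps), where amps is the
    current the PMU measured on that rail.
    """
    # Power on one rail is voltage times current; the device total is
    # the sum over all rails.
    return sum(volts * amps for volts, amps in rails.values())


# Hypothetical readings for three rails of one storage device.
example_rails = {
    "core": (1.0, 2.0),   # 2.0 W
    "io": (2.0, 0.5),     # 1.0 W
    "flash": (3.0, 1.0),  # 3.0 W
}
device_total_w = total_power_w(example_rails)
```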
  • the power measurements metered at the PMU 14 can be read by the controller 11 using a PMU/controller interface 41 .
  • the PMU/controller interface 41 may be an I2C/SMBus.
  • the controller 11 can then provide these power measurements to a local service processor 50 (see FIG. 3 ), such as a BMC, via either the host interface 20 or a separate controller/host interface 42 . If a separate controller/host interface or side band bus 42 is used, that interface may be an I2C/SMBus. If the controller/host interface 42 is a PCIe connection, the controller 11 can provide power measurements to the local service processor 50 via NVMe-MI or MCTP protocols, as shown in FIG. 4 . The PMU 14 can report/output the power measurements periodically as specified by the local service processor 50 or passively keep track via internal counters which are accessible to the local service processor 50 .
  • FIG. 2 is a flow chart of a method for collecting power consumption measurements from the PMU 14 of the storage device 10 .
  • power measurements can be read at predetermined intervals.
  • the power measurements can be read from the PMU 14 of the storage device 10 at a user-configurable frequency, such as every 1 second, every 5 seconds, at intervals longer than 5 seconds, or every few minutes.
  • the storage device 10 can read the power measurements only as needed (see, e.g., FIGS. 8 and 9 ), for example, at the completion of a specific job.
  • the frequency at which the power measurements are read is hereinafter called a time unit.
  • For every time unit, the controller 11 prepares (S 1 ) to receive power measurements from the PMU 14 for the various voltage rails 30 , 33 , 34 , 35 , 36 , 37 .
  • the power measurement is then annotated with a timestamp (S 4 ) and a Host ID (S 5 ).
  • the received power measurement is then saved (S 6 ) to a power log.
  • the power log may include internal register(s) or may be included as part of the PMU's embedded non-volatile memory.
  • the PMU 14 is again queried (S 7 ) until all power measurements are received from the various voltage rails 30 , 33 , 34 , 35 , 36 , 37 . Once all power measurements are complete and the annotated power measurements are saved in the power log, these power measurements persist (S 8 ) in the power log through resets and power cycles.
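  • The collection flow of FIG. 2 can be sketched in Python as follows. This is only a minimal model under stated assumptions: `read_rail` is a hypothetical callable standing in for the PMU query, `PowerLogEntry` is an illustrative record layout, and an in-memory list stands in for the power log, so persistence across resets and power cycles (S 8 ) is not modeled.

```python
import time
from dataclasses import dataclass


@dataclass
class PowerLogEntry:
    rail: int          # voltage rail the measurement belongs to
    watts: float       # measured power for that rail
    timestamp: float   # annotation added in S4
    host_id: str       # annotation added in S5


def collect_time_unit(read_rail, rails, host_id, power_log, now=time.time):
    """Run one time unit of the FIG. 2 flow and append entries to power_log.

    read_rail is a hypothetical stand-in for querying the PMU for one rail.
    """
    for rail in rails:                       # keep querying until every rail is read (S7)
        watts = read_rail(rail)              # receive the measurement for this rail
        entry = PowerLogEntry(rail, watts, timestamp=now(), host_id=host_id)  # S4, S5
        power_log.append(entry)              # save to the power log (S6)
    return power_log
```

  • Passing the clock as the `now` parameter is a design choice made here so that the sketch can be exercised with a fixed timestamp.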
  • the power log pages can also include any or all of the following: Namespace ID, NVM Set, read I/Os, write I/Os, SQ ID, Stream ID, and other suitable parameters.
  • the controller 11 also implements actual power (AP) registers which are accessible by the local service processor 50 . This allows a variety of parameters associated with the storage device and the power measurements to be mapped with fine granularity.
  • the power log can be implemented as proprietary or vendor-defined log pages.
  • the power log can be read by the local service processor 50 using existing standard protocols through either the host interface 20 or the separate controller/host interface or side-band bus 42 , whichever is used.
  • the power log can be read by a BMC using the NVMe-MI protocol via the controller/host interface 42 , which may be a SMBus or PCIe.
  • the above method provides dynamic, real-time output of actual power consumption measurements without affecting the I/O of the storage device.
  • the local service processor can implement power budgets and allocate power to the storage device based on its actual power usage.
  • the local service processor can implement power budgets similar to existing industry standards for allocated power budget registers.
  • the storage device can report real time power consumption to system management software, such as Samsung's DCP or Redfish.
  • FIG. 3 is a block diagram of a storage system 100 incorporating multiple storage devices 10 .
  • the storage system 100 includes the local service processor 50 attached to multiple storage devices 10 .
  • Each storage device 10 has a PMU 14 to measure power consumption as described above with respect to FIGS. 1 and 2 .
  • the storage devices 10 provide power measurements to the local service processor 50 via the controller/host interface 42 .
  • the controller/host interface 42 may be an I2C/SMBus or PCIe bus.
  • the power measurements may be transferred to the local service processor 50 using NVMe protocols, such as NVMe-MI, MCTP over PCI-e, or I2C Bus protocols. If the storage device 10 is connected via a SMBus/I2C connection, the local service processor 50 can even access the power log during a power failure using these existing standard protocols.
  • FIG. 4 is a block diagram of an embodiment of the storage system 100 of FIG. 3 in which a PCIe switch 60 is used.
  • the storage devices 10 are connected to the local service processor 50 via the PCIe switch 60 .
  • the power measurements may be transferred to the local service processor 50 via the PCIe switch 60 using suitable protocols such as, for example, NVMe-MI and/or MCTP.
  • the local service processor 50 and the multiple storage devices 10 can be housed within the same chassis allowing the local service processor 50 to process the power measurements of the multiple storage devices 10 according to chassis power management requirements; however, the invention is not limited thereto. For example, power measurements can also be processed at the individual storage device level.
  • NVMe specifications can define power measurements and their process mechanism. Based on this mechanism, the storage devices 10 (e.g., an NVMe SSD) can support power management either queried by the local service processor 50 ( FIG. 5 ) or set by the local service processor 50 ( FIG. 6 ).
  • FIG. 5 is a diagram depicting an embodiment in which power measurements are transferred to the local service processor 50 based on a query from the local service processor 50 .
  • the controller's firmware then fetches (S 11 ) the power measurement information from the PMU 14 .
  • the firmware of the controller 11 receives the information and sends (S 12 ) that information via direct memory access (DMA) to the local service processor 50 .
  • the controller's firmware then sends (S 13 ) a completion notice to the local service processor 50 to signal completion of the query.
  • This embodiment allows for real-time retrieval of power measurements from the storage device 10 .
  • FIG. 6 is a diagram depicting an embodiment in which power measurements are set by the local service processor 50 .
  • the controller's firmware then uses DMA to request (S 21 ) the power measurement budget from the local service processor 50 .
  • the firmware of the controller 11 receives the information and sets (S 22 ) the power measurement budget of the PMU 14 . In response, the controller's firmware processes the new power state transaction.
  • the controller's firmware queries the current power state job in the PMU 14 to ensure that all tasks that rely on the current power state are fully completed successfully. Then, the firmware changes the current power state from the current one to the next one required by the power measurement budget. The controller's firmware starts to process new tasks which rely on the power state using the allocated power measurement budget. The controller's firmware then sends (S 23 ) a completion notice to the local service processor 50 to signal that the new power state has been set.
  • the local service processor 50 can control and throttle the power consumption of a particular storage device 10 to meet an allocated power budget of the local service processor 50 .
  • the controller 11 can enforce the power budget allocations programmed by the local service processor 50 . If the actual power consumption exceeds the set threshold, the controller 11 can throttle the I/O performance for that parameter in order to minimize power consumption and to stay within the allocated power budget.
  • the controller 11 can, for example, self-adjust by lowering the internal power state automatically when exceeding the allocated power budget.
  • the controller 11 can then report back to the local service processor 50 so that the local service processor 50 can reallocate the available power to some other devices which may need additional power.
  • the controller 11 may also collect statistics about such performance throttling on a fine granularity.
  • FIG. 7 shows an example of a power policy which can be used by the local service processor 50 to control power consumption of a storage device 10 .
  • the local service processor 50 can manage the power policy by monitoring each storage device 10 in the storage system and instructing each storage device 10 to maintain its respective allocated power budget. For example, if a storage device 10 changes from operating at normal 61 to operating at greater than 90% of its allocated power budget, as shown at 62 , the controller 11 may throttle I/O performance by, for example, introducing additional latency of a small percentage (e.g., 10% or 20% of idle or overhead).
  • the controller 11 may introduce a much bigger latency (e.g., 50% or larger) or may introduce delays to NAND cycles, etc., in order to throttle the storage device 10 to meet its allocated budget. If the storage device 10 continues to exceed its allocated budget despite the introduced latencies, the local service processor 50 may execute shutdown instructions 64 to shut down the device 10 , or the controller 11 may shut itself down.
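  • The escalation in the FIG. 7 power policy can be sketched as a small decision function. The 90% level comes from the description above; the level at which the heavier throttling begins is not stated in this passage, so `over_fraction=1.00` is an assumption, as are the function name and the action labels.

```python
def policy_action(actual_w, budget_w, still_over_after_throttle=False,
                  warn_fraction=0.90, over_fraction=1.00):
    """Map a device's usage to an action in the spirit of the FIG. 7 policy.

    over_fraction is an assumed level; only the 90% level is given in the text.
    """
    usage = actual_w / budget_w
    if usage > over_fraction:
        if still_over_after_throttle:
            return "shutdown"          # shutdown instructions 64
        return "throttle_heavy"        # e.g., 50% or larger added latency
    if usage > warn_fraction:
        return "throttle_light"        # e.g., 10% or 20% added latency
    return "normal"                    # normal operation 61
```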
  • the local service processor 50 can also monitor and detect thermal load increases (temperature rises) or operate the resource during peak utility rate such as hot day times or during brown-out periods to ensure that each storage device 10 is behaving as intended performance-wise.
  • the above feature makes the storage device capable of autonomously optimizing power vs. performance vs. assigned power budget/state.
  • FIG. 8 is a diagram depicting a further embodiment in which power measurements are stored in the controller memory buffer until fetched by the local service processor 50 .
  • the controller 11 can store the power measurements locally in its own memory 12 until requested by the local service processor 50 .
  • the controller 11 could store the power measurement information in a controller memory buffer of the memory 12 in an embodiment in which the storage device 10 is an NVMe SSD.
  • the NVMe specification defines the controller memory buffer (CMB), which is a portion of the storage device's memory, but is assigned by the host/local service processor and logically owned by the host/local service processor.
  • the firmware of the controller 11 can fetch power measurement information from the PMU 14 and store it in the controller memory buffer of the memory 12 .
  • the controller memory buffer can be updated at any designated time unit.
  • the local service processor 50 can then query the power measurement information by reading the power measurements directly from the controller memory buffer of the memory 12 .
  • the power measurements can be read from the controller memory buffer via the controller/host interface 42 . If the controller/host interface 42 is PCIe, the power measurement information can be read over PCIe by directly processing memRd/memWr transactions based on the BAR configuration. In other embodiments, the power measurement information can go through a side band such as SMBus or I2C to directly access the controller memory buffer.
  • the storage device 10 can be configured so that the PMU 14 is directly accessible by the local service processor 50 in order for the local service processor to be able to access the power measurement information when desired/needed and in real-time.
  • FIG. 9 is a diagram depicting an embodiment in which power measurements taken by the PMU 14 are directly accessible to the local service processor 50 .
  • the storage device 10 can be configured with an assistant bus, such as, for example, I2C or AXI, to allow direct access to the PMU 14 by the local service processor 50 .
  • FIG. 10 is an example of a power log 70 according to an embodiment of the present invention.
  • a storage device 10 may have, for example, up to 32 Power States (PowerState) 71 , which are recorded in the power log 70 .
  • PowerState 71 has predefined performance information, a Maximum Power (MP) 72 capable of being utilized in that Power State 71 and an Actual Power (AP) 73 actually being used at that PowerState.
  • AP 73 is measured over a period according to the time unit (e.g., 1 minute) and Workload/QoS.
  • each row in the power log 70 represents a power state which has been defined in the NVMe Specifications 1.3. For example, there are a total of 32 Power States defined in the NVMe Specifications.
  • a vendor-specific definition can be used for each PowerState 71 .
  • the power log 70 can include in its table entries the various PowerStates 71 and each PowerState's respective MP 72 , AP 73 and additional information for identifying the power measurements and a relationship among Max Power/Power State, Actual Power, and QoS.
  • QoS information can include, for example, current Entry Latency (ENTLAT), current Exit Latency (EXTLAT), RRT (Relative Read Throughput), RWT (Relative Write Throughput) and other suitable variables.
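  • One possible in-memory layout for a row of the power log 70 is sketched below. The field names, the microsecond latency units, and the helper functions are assumptions chosen for illustration; the description only lists which quantities a row can hold.

```python
from dataclasses import dataclass


@dataclass
class PowerStateRow:
    power_state: int        # PowerState 71 (0..31)
    max_power_w: float      # Maximum Power (MP) 72
    actual_power_w: float   # Actual Power (AP) 73
    entlat_us: int          # current Entry Latency (units assumed)
    exlat_us: int           # current Exit Latency (units assumed)
    rrt: int                # Relative Read Throughput
    rwt: int                # Relative Write Throughput


def headroom_w(row: PowerStateRow) -> float:
    """Unused power in a row: maximum power minus actual power."""
    return row.max_power_w - row.actual_power_w


def rows_over_budget(log):
    """Power states whose actual power exceeds their maximum power."""
    return [r.power_state for r in log if r.actual_power_w > r.max_power_w]
```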
  • An expected power state (i.e., power measurement budget) can be set by the local service processor 50 through the SetFeature (FeatureID 0x2), as discussed with respect to FIG. 6 .
  • Other power-related information can be managed by the local service processor 50 through a VUCmd (Vendor Unique Cmd) or directly accessed through the local service processor 50 . For example, if the user would like to get power measurement information which is not defined in the NVMe specification, a VUCmd can be used to allow the host to retrieve such non-standard power information, similar to LogPage.
  • FIG. 11 is an illustrative method of how a storage system 200 manages the power reporting of multiple storage devices 10 in its chassis.
  • each PMU 14 of each storage device 10 measures the current AP 73 and stores the information in the power log 70 , which is queried and/or retrieved (S 50 ) by the local service processor 50 .
  • the local service processor 50 then updates/uploads (S 51 ) the power log 70 from the local service processor 50 to the storage system 200 .
  • Various applications 80 in the storage system 200 can analyze (S 52 ) the power logs 70 of the storage devices 10 in the chassis at the local service processor 50 .
  • the results of these analyses can determine how to allocate power for better performance, e.g., whether more power needs to be allocated to a particular PowerState 71 or whether power should be reallocated from one PowerState 71 to another to meet QoS demands.
  • the local service processor 50 can request (S 53 ) that the storage device 10 , as illustrated with respect to the center storage device 10 shown in FIG. 10 , transfer Max Power State, in this example, from PowerState 3 to PowerState 0.
  • the local service processor 50 can then either assign a new MP 72 to the storage devices 10 or can request (S 54 ) a power distribution unit (PDU) 90 to assign a new MP 72 budget to the storage devices 10 , i.e. redistributing power allocations.
  • the PDU 90 will then assign (S 55 ) the new MP 72 to the storage devices 10 .
  • the PDU 90 may be an independent component located in the chassis and may be responsible for distributing MP to each storage device 10 .
  • the local service processor 50 then updates (S 56 ) the power log 70 with the changes.
  • the local service processor 50 can then use that information to create graphs or histograms to trend projections and to run diagnostics.
  • Embodiments of the present invention also enable the local service processor to provide individual actual power profiles of each storage device in the system to software developers, cloud service providers, users and others, allowing them to know the actual power consumed by their workloads on each storage device. This provides the ability for software developers/users to optimize performance based on the actual cost of energy and also allows cloud service providers to provide more accurate billing of storage system users based on actual power consumption. Embodiments of the present invention can also provide better policing and tracking of storage devices violating an allocated power budget.
  • Embodiments of the present invention may be used in a variety of areas.
  • the embodiments of the present invention provide building blocks of crucial information that may be used for analysis purposes for artificial intelligence software, such as Samsung's DCP.
  • the embodiments also provide information that may be useful to an ADRC (Active Disturbance Rejection) High Efficient Thermal control based system.
  • FIG. 12 is a block diagram illustrating a storage system 300 utilizing a storage bank 302 and a power distribution unit (PDU) 90 , according to some exemplary embodiments of the present invention.
  • the storage bank (e.g., an Ethernet SSD chassis or Just-a-bunch-of-flashes (JBOF)) 302 includes a plurality of storage devices 10 .
  • the PDU 90 includes a plurality of power supply units (PSUs or power supplies) 304 for supplying power to the storage devices 10 of the storage bank 302 under the direction of the local service processor (or BMC) 50 .
  • the PSUs 304 are interchangeable, that is, each may have the same form factor and the same power supply capacity (e.g., have same output wattage); however, embodiments of the present invention are not limited thereto, and one or more of the PSUs 304 may have a power supply capacity that is different from other PSUs 304 .
  • the plurality of PSUs 304 may be in an N+1 configuration in which N (an integer greater than or equal to 1) PSUs are sufficient to service the power needs of the storage bank 302 , and an additional PSU 304 is provided as redundancy, which may be activated in the event that any of the PSUs experiences a failure.
  • the PSUs 304 may be coupled together using a switch network (e.g., a FET network) 305 , rather than directly connected to the power bus 306 , in order to protect the power bus 306 from electrical short circuits and transients when other PSUs 304 are connected.
  • the switch network may include a plurality of switches (e.g., transistors) that are connected to the plurality of PSUs 304 , on one end, and connected to the power bus 306 , at the other end.
  • the switches are independently controlled by the local service provider (BMC) 50 , so that any one of the PSUs 304 may be connected to, or disconnected from, the power bus 306 , based on a control signal from the local service provider 50 .
  • each storage device 10 is configured to report its actual power consumption to the local service processor 50 via, for example, SMBus or PCI-e, and by, for example, NVMe-MI or MCTP protocols.
  • the actual power consumption is measured by the PMU (i.e., power meter) 14 , which may be internal to (e.g., integrated within) the storage device 10 (as shown in FIG. 12 ) or be external to, but coupled to, the storage device 10 .
  • the power consumption reporting enables the local service processor 50 to provide power profiles and perform analytics on the storage bank 302 , which can in turn be used for diagnostics as well as offering value added services. This also allows each storage device 10 to more flexibly manage its own power usage as dictated by the system administrator 308 , via the local service processor 50 .
  • FIGS. 13 A- 13 D illustrate histograms of power consumption of a storage system as generated by the local service processor 50 , according to some exemplary embodiments of the present invention.
  • the local service processor 50 reads the power measurements periodically from the storage devices 10 .
  • local service processor 50 may use NVMe-MI protocol over SMBus or PCIe to read the power log 70 pages, according to some examples.
  • the local service processor 50 may then process the read power data to generate power usage trends, such as whole power usage of the storage bank 302 over time (e.g., per hour, during day time, night time, weekdays, or weekends, etc.), each storage device's 10 power consumption over time, relative power consumption of the storage devices 10 in a storage bank 302 , and/or the like.
  • the local service processor 50 may generate many derivative/additional graphs to learn about the power consumption behavior with respect to time, user, activity, etc.
  • the local service processor 50 may also utilize such data for diagnostics purposes, power provisioning, future needs, cooling, and planning, etc.
  • FIG. 13 A illustrates the power consumption of a single storage device 10 over time.
  • the Y axis represents power consumption in terms of Watts
  • the X axis represents time in terms of hours.
  • the local service processor 50 manages host access policies, and receives raw power data and host IDs of active storage devices.
  • the local service processor 50 is cognizant/aware of which host or application is accessing each storage device 10 at any given time, and is able to combine this information with power usage metrics to profile the power consumption by various hosts or applications. Such information can provide deeper insights into storage power needs to various applications and can be used to calculate the storage costs per host or application more accurately.
  • FIG. 13 B illustrates power consumption by different hosts or applications.
  • the Y axis represents average power consumption in Watts over a period of time (e.g., per hour, day, etc.)
  • the X axis represents the host ID or application ID.
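  • The per-host profile of FIG. 13 B amounts to grouping power samples by host or application ID and averaging them. A minimal sketch follows, assuming the samples are available as `(host_id, watts)` pairs; that input shape and the function name are illustrative assumptions.

```python
from collections import defaultdict


def average_power_by_host(samples):
    """samples: iterable of (host_id, watts) pairs; returns host_id -> mean watts."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for host_id, watts in samples:
        totals[host_id] += watts
        counts[host_id] += 1
    return {host: totals[host] / counts[host] for host in totals}
```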
  • the local service processor 50 is capable of using power usage metrics for diagnostic purposes.
  • the local service processor 50 may alert the storage administrator 308 .
  • the abnormal power consumption may be a result of a fault within the storage device 10 , or may be due to anomalous activity of the host or application that is accessing the storage device 10 .
  • the faults may be a result of flash die or flash channel failures, which may initiate RAID like recovery mechanism consuming excess power; or higher bit rate errors in the media or volatile memory, which may cause error correction algorithms not to converge and spend more time and energy on a process.
  • the local service processor 50 may query storage device health and status logs, such as SMART Logs, as well as proprietary diagnostic logs to assess abnormal behavior. Based on the policies set by the administrator 308 , some of the abnormal behavior may be alerted to the administrator 308 for further action.
  • FIG. 13 C illustrates a potential fault detected in a storage device 10 when the power consumption per hour suddenly spikes from about normal levels (e.g., 3-10 W/hr) to close to maximum values (e.g., around 25 W).
  • the Y axis represents average power consumption in Watts
  • the X axis represents time in terms of hours.
  • the criterion for fault detection may be the derivative of power consumption being greater than a set threshold.
  • the actual power consumption may be measured against storage device performance to determine if a fault has occurred or not.
  • the fault detection criteria/policy may be set by the administrator 308 .
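  • The derivative-based criterion can be approximated with a finite difference over hourly averages. In the sketch below, `hourly_watts` and `max_rise_w_per_hour` are hypothetical names, and the threshold value is left to the caller since the description leaves the policy to the administrator 308 .

```python
def detect_power_spikes(hourly_watts, max_rise_w_per_hour):
    """Return the hours at which power rose faster than the set threshold."""
    spikes = []
    for hour in range(1, len(hourly_watts)):
        # Finite-difference estimate of the derivative of power consumption.
        rise = hourly_watts[hour] - hourly_watts[hour - 1]
        if rise > max_rise_w_per_hour:
            spikes.append(hour)
    return spikes
```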
  • FIG. 13 D illustrates an example, in which a potential fault is detected in a storage device 10 (e.g., the storage device in slot #8).
  • the storage device 10 may be expected to consume a maximum power of about 25 W at 1 MIOPS (one million input/output operations per second) of performance.
  • the local service processor may tag the storage device in slot #8 as potentially faulty or at least a good candidate for further fault analysis.
  • aspects of the present invention provide the building blocks of crucial information for other artificial intelligence software to analyze.
  • it also provides useful information for an ADRC (active disturbance rejection control), high-efficiency, thermal-control based system to take advantage of.
  • FIG. 14 is a flow diagram illustrating the process 400 of managing operations of the PDU 90 , according to some exemplary embodiments of the present invention.
  • the local service provider 50 manages (e.g., optimizes) operations of the PDU 90 by dynamically monitoring the operation of the PSUs 304 of the PDU 90 and ensuring that active PSUs 304 operate in their high power-efficiency range. In so doing, the local service provider 50 determines (S 100 ) whether the PDU 90 includes multiple active PSUs 304 or not.
  • the active PSUs 304 may be connected to the power bus 306 through the switch network (i.e., have the corresponding switches turned on), and the deactivated PSUs 304 may be disconnected from the power bus 306 (e.g., by having the corresponding switches turned off).
  • the local service provider 50 determines the status of each PSU 304 in the PDU 90 through a bus (e.g., SMBus/PMBus), and is thus able to determine the number of PSUs 304 at the PDU 90 .
  • the local service processor 50 reads the PSU status register of each PSU 304 present in the PDU 90 to determine its status (i.e., active/enabled or deactivated/disabled). If only one active PSU 304 is present, the local service processor 50 proceeds to determine (S 114 ) whether the active PSU 304 is the only PSU 304 present and the storage system 300 is in HA mode (more on this below).
  • the local service provider 50 determines (S 102 ) whether the total power consumption of the storage bank 302 is less than a first percentage threshold (e.g., 40% or a value between 30% to 50%) of the load of each of the active PSUs 304 .
  • the local service processor 50 does so by obtaining the actual power consumption of each storage device 10 , as measured by the corresponding PMU 14 , and adding together the actual power consumptions.
  • the local service provider 50 may obtain the actual power consumption of each storage device 10 by querying/retrieving the power log 70 from the storage device 10 or the PMU 14 corresponding to the storage device 10 (which may be internal to or external to the storage device 10 ).
  • the local service provider 50 disables an active PSU 304 (S 104 ), waits (S 106 ) for a period of time (e.g., seconds or minutes), and rechecks (S 102 ) whether the total power consumption of the storage bank 302 is still less than the first percentage threshold of the load of each of the active PSUs 304 . If so, the loop continues and the local service provider 50 continues to disable the active PSUs 304 one by one until the total power consumption is equal to or greater than the first percentage threshold of the load of each of the active PSUs 304 .
  • the local service provider 50 proceeds to determine (S 108 ) whether the total power consumption of the storage bank 302 is greater than a second percentage threshold (e.g., about 90% or a value between 85% and 95%) of the load of each of the active PSUs 304 . If so, the active PSUs 304 may be operating in high-power state, which may be detrimental to the longevity of the PSUs 304 if prolonged.
  • the local service processor 50 enables (i.e., activates) a disabled (i.e., a deactivated) PSU 304 (S 110 ), waits (S 112 ) for a period of time (e.g., seconds or minutes), and rechecks (S 108 ) whether the total power consumption of the storage bank 302 is still equal to or greater than the second percentage threshold of the load of each of the active PSUs 304 . If so, the loop continues and the local service processor 50 continues to enable the deactivated PSUs 304 one by one until the total power consumption is less than the second percentage threshold of the load of each of the active PSUs 304 .
  • the local service provider 50 proceeds to determine (S 114 ) if only one PSU 304 is present in the PDU 90 while the storage system 300 is in high availability (HA) mode, which indicates multi-path IO mode and N+1 redundant PSUs.
  • HA mode the storage system 300 is in multi-path IO mode and N+1 redundant PSUs are present to ensure that there is no single point of failure.
  • the local service provider 50 issues a warning (e.g., a critical warning) message (S 116 ) to the system administrator 308 to install another redundant PSU 304 in the PDU 90 . Otherwise, the system is operating normally and no warning message is sent to the system administrator 308 .
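  • The PSU management loop of FIG. 14 can be sketched as follows. This is a simplified model, not the claimed method itself: it assumes interchangeable PSUs of equal capacity that share the load evenly, interprets "percentage threshold of the load of each of the active PSUs" as total power divided by the combined capacity of the active PSUs, and omits the wait steps (S 106 , S 112 ). The function and parameter names are hypothetical.

```python
def rebalance_psus(total_power_w, psu_capacity_w, active, installed,
                   low=0.40, high=0.90):
    """Return how many PSUs should be active for the given total power draw."""
    def load_fraction(n):
        # Assumed even load sharing across n identical active PSUs.
        return total_power_w / (n * psu_capacity_w)

    # S102-S106: disable PSUs one by one while they are lightly loaded,
    # but never disable the last active PSU.
    while active > 1 and load_fraction(active) < low:
        active -= 1
    # S108-S112: enable deactivated PSUs one by one while heavily loaded.
    while active < installed and load_fraction(active) >= high:
        active += 1
    return active


def needs_redundancy_warning(installed, ha_mode):
    """S114-S116: warn when HA mode is set but only one PSU is present."""
    return ha_mode and installed == 1
```

  • For example, with three active 1000 W PSUs and a 850 W total load, the sketch sheds one PSU (load rises from about 28% to about 43% per PSU) and then stops, since the remaining PSUs are no longer below the first threshold.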
  • FIG. 15 is a flow diagram illustrating a process 500 of managing the storage devices 10 of the storage system 300 , according to some exemplary embodiments of the present invention.
  • the local service provider 50 manages (e.g., optimizes) storage devices 10 by dynamically adjusting (e.g., lowering) their maximum power range or power cap based on the current workload of the storage bank 302 .
  • the local service provider 50 identifies (S 118 ) which storage devices 10 of the storage bank 302 are in an idle state or consume near-idle power.
  • an idle state may refer to an operational state in which a storage device 10 does not have any active or outstanding host commands such as read or write in its command queue for a period of time. That is to say that the host command queues of the storage device controller have been empty for a period of time, which may be programmable (e.g., by the system administrator 308 ).
  • Near-idle power may be any power consumption that is below a set threshold, which may be programmable (e.g., by the system administrator 308 ).
  • the local service processor 50 obtains the actual power consumption of each storage device 10 , which is measured by the corresponding PMU 14 , by querying/retrieving the power log 70 from the storage device 10 .
  • the local service provider 50 compares the actual power consumption with an idle power level. If consumed power of the storage device 10 is at or below the idle power level, the storage device is identified as being in an idle state.
  • the local service provider 50 then instructs (S 120 ) the identified storage devices 10 to operate at a lower power cap.
  • the local service processor 50 may instruct each of the identified storage devices 10 to change power states to a power state having a lower maximum power rating (e.g., change from PowerState 2 to PowerState 5). This may be done based on a power policy that is implemented by the local service provider 50 (and is, e.g., defined by the system administrator 308 ), which associates each power state to a range of actual power consumption.
  • the local service provider 50 identifies (S 122 ) which storage devices 10 consume power at a level less than a threshold power level.
  • the threshold may be set at 75% of maximum power, which may be 25 W, or 75 W, etc., depending on the kind of PSUs and/or power connectors used.
  • the local service processor 50 obtains the actual power consumption of each storage device 10 , which is measured by the corresponding PMU 14 , by querying/retrieving the power log 70 from the storage device 10 .
  • the local service provider 50 compares the actual power consumption with threshold power level to determine if consumed power of the storage device 10 is below the threshold power level.
  • the local service provider 50 then dynamically instructs the identified storage devices 10 to operate at a power cap corresponding to the first level (e.g., at 75% or 80% of maximum power), as opposed to the default power cap of 100% maximum power.
  • lowering the power cap of the storage devices 10 may bring down the overall power usage of the storage bank 302 , thus allowing the PSU to operate at a lower power level and at a higher (e.g., peak) power efficiency range. This may be particularly desirable in large data centers, where overall power usage and cooling is a great concern.
  • the local service processor 50 may dynamically instruct each of the identified storage devices 10 to operate at a lower power cap by instructing them to change their power state to one where the maximum power corresponds to (e.g., is at or less than) the threshold power level (e.g., the power states may be changed from PowerState 0 to PowerState 1).
  • the local service processor 50 identifies (S 126 ) which storage device slots are empty (i.e., not occupied by, or connected to, any storage device 10 ).
  • each storage device 10 may have a presence pin on the slot connector 15 , which is used by the local service processor 50 to determine whether the slot is empty or occupied by a storage device 10 .
  • the local service processor 50 instructs (S 128 ) that these PMUs 14 operate at lower power caps (e.g., operate at the lowest power state, PowerState 31 ) or be disabled/deactivated altogether. This will allow the storage bank 302 to eliminate or reduce unnecessary power usage.
  • while operations S 118 -S 120 , S 122 -S 124 , and S 126 -S 128 are ordered in a particular sequence in FIG. 15 , embodiments of the present invention are not limited thereto.
  • the operations S 118 -S 120 can be performed after either or both of operations S 122 -S 124 and S 126 -S 128 .
  • operations S 126 -S 128 may be performed before either or both of operations S 118 -S 120 and S 122 -S 124 .
  • the operations performed by the local service processor 50 may be described in terms of a software routine executed by one or more processors in the local service processor 50 based on computer program instructions stored in memory.
  • the routine may be executed via hardware, firmware (e.g. via an ASIC), or in a combination of software, firmware, and/or hardware.
  • the sequence of steps of the process is not fixed, but may be altered into any desired sequence as recognized by a person of skill in the art.
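  • By way of a non-limiting illustration, the three checks of FIG. 15 (idle devices, devices consuming less than the threshold power level, and empty slots) might be sketched in Python as follows. The `Slot` record, the action labels, and the 2 W idle level are hypothetical and are used only for illustration; the 25 W maximum and the 75% threshold are the example values given above. Because each slot is classified independently, the order of the checks across slots is not significant, consistent with the statement that the sequence of operations is not fixed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slot:
    slot_id: int
    present: bool                # state of the presence pin (hypothetical field)
    measured_w: Optional[float]  # latest power-log reading in watts; None if empty


def plan_power_caps(slots: List[Slot],
                    max_power_w: float = 25.0,
                    idle_w: float = 2.0,          # assumed idle power level
                    threshold_frac: float = 0.75) -> Dict[int, str]:
    """Return a hypothetical action label for every slot."""
    threshold_w = threshold_frac * max_power_w
    actions: Dict[int, str] = {}
    for s in slots:
        if not s.present:
            # S126-S128: empty slot, so its PMU can be lowered or turned off
            actions[s.slot_id] = "pmu_lowest_state_or_off"
        elif s.measured_w <= idle_w:
            # S118-S120: idle device, move to a lower-rated power state
            actions[s.slot_id] = "lower_power_state"
        elif s.measured_w < threshold_w:
            # S122-S124: device already under the threshold, cap it there
            actions[s.slot_id] = "cap_at_threshold"
        else:
            actions[s.slot_id] = "no_change"
    return actions
```

  • With these assumed values, a device reading 10 W would be capped at the threshold, while a device reading 22 W would be left at its default cap.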

Abstract

A storage system comprises one or more storage devices, power supplies supplying power to the storage device, a processor that performs in response to determining that the total power consumption of the one or more storage devices is less than a first percentage threshold of a load of the active power supplies, deactivating one or more of the active power supplies until the total power consumption is equal to or greater than the first percentage threshold of a load of each of the active power supplies, and in response to determining that the total power consumption is equal to or greater than a second percentage threshold of a load of each of the active power supplies, activating one or more of the deactivated ones of the power supplies until the total power consumption is less than the second percentage threshold of the load of each of the active power supplies.

Description

    CROSS-REFERENCE TO RELATED APPLICATION(S)
  • This application is a continuation of U.S. patent application Ser. No. 16/167,306, filed Oct. 22, 2018, which claims the benefit of U.S. Provisional Patent Application Ser. No. 62/713,466, filed Aug. 1, 2018, the content of which is hereby incorporated by reference in its entirety.
  • U.S. application Ser. No. 16/167,306 is also a continuation-in-part of U.S. patent application Ser. No. 15/975,463, filed May 9, 2018, entitled “METHOD AND APPARATUS FOR SELF-REGULATING POWER USAGE AND POWER CONSUMPTION IN ETHERNET SSD STORAGE SYSTEMS”, which claims priority to and the benefit of U.S. Provisional Application No. 62/638,035, filed Mar. 2, 2018, the entire contents of both of which are incorporated herein by reference.
  • BACKGROUND
  • Many companies provide cloud-based storage to end users so that end users will have the ability to remotely access their stored data. Such companies generally take advantage of Ethernet-attached solid state drives (eSSDs) for their storage requirements. In particular, Ethernet-attached NVMe (Non-Volatile Memory Express) SSDs (e.g., NVMe Over Fabrics [NVMe-oF] storage devices) are considered an emerging and disruptive technology in this area.
  • Cloud-based storage providers typically charge users for storing their data on a monthly or annual basis based on the total storage space allocated to the user and either the average cost of energy consumed by all users or the maximum power consumption capable of being consumed by the user based on the system. For example, for two users who have purchased the same amount of cloud storage space, a user who stores only a small amount of data relative to the total purchased storage space and only stores data on an infrequent basis will be charged the same as a user who is regularly removing and adding new data and using the majority of his/her purchased storage space. Ideally, users should be charged for storage based on the energy resources actually consumed. However, there is no accurate method for calculating the power consumption of individual users, or calculating power consumption in real time.
  • The above information disclosed in this Background section is only for enhancement of understanding of the background of the disclosure and therefore it may contain information that does not constitute prior art.
  • SUMMARY
  • Aspects of embodiments of the present invention are directed to a storage system, and a method of operating the same, capable of managing (e.g., optimizing) operation of the power supplies of the system by dynamically monitoring their operation and ensuring that active power supplies operate in their high power-efficiency range.
  • Aspects of embodiments of the present invention are directed to a storage system, and a method of operating the same, capable of managing (e.g., optimizing) power usage of storage devices of a storage bank by dynamically adjusting their maximum power caps based on the workload of the storage bank.
  • According to some embodiments of the present invention, there is provided a storage system comprising: one or more storage devices; a plurality of power supplies configured to supply power to the storage device; a processor; and a memory having stored thereon instructions that, when executed by the processor, cause the processor to perform: determining whether multiple power supplies of the plurality of power supplies are active; in response to determining that multiple power supplies are active: determining a total power consumption of the one or more storage devices; in response to determining that the total power consumption is less than a first percentage threshold of a load of active ones of the power supplies, deactivating the active ones of the power supplies one by one until the total power consumption is equal to or greater than the first percentage threshold of a load of each of the active ones of the power supplies; and in response to determining that the total power consumption is equal to or greater than a second percentage threshold of a load of each of the active ones of the power supplies, activating deactivated ones of the power supplies one by one until the total power consumption is less than the second percentage threshold of the load of each of the active ones of the power supplies.
  • In some embodiments, the determining the total power consumption of the one or more storage devices comprises: obtaining an actual power consumption of each storage device of the one or more storage devices from the storage device or a corresponding power meter; and summing the actual power consumption of each storage device to obtain the total power consumption.
  • In some embodiments, the obtaining the actual power consumption of each storage device comprises: retrieving power measurement information from a power log corresponding to the storage device, wherein the power measurement information is measured, and recorded in the power log, by the corresponding power meter.
  • In some embodiments, the corresponding power meter is internal to the storage device.
  • In some embodiments, the corresponding power meter is external to and coupled to the storage device.
  • In some embodiments, the first percentage threshold of the load of each of the active ones of the power supplies is 40% of the load of each of the active ones of the power supplies.
  • In some embodiments, the second percentage threshold of the load of each of the active ones of the power supplies is 90% of the load of each of the active ones of the power supplies.
  • In some embodiments, the instructions further cause the processor to perform: determining whether only one power supply of the plurality of power supplies is in a high-availability mode; and in response to determining that only one power supply of the plurality of power supplies is in a high-availability mode, generating a warning message indicating that the one power supply is in high-availability mode.
  • In some embodiments, the deactivating the active ones of the power supplies one by one comprises: deactivating an active power supply of the active ones of the power supplies; determining that the total power consumption of the one or more storage devices is less than the first percentage threshold of a load of the active ones of the power supplies; and in response to the determining, deactivating an other active power supply of the active ones of the power supplies.
  • In some embodiments, the activating the deactivated ones of the power supplies one by one comprises: activating a deactivated power supply of the power supplies; determining that the total power consumption of the one or more storage devices is equal to or greater than the second percentage threshold of a load of the active ones of the power supplies; and in response to the determining, enabling an other deactivated power supply of the power supplies.
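  • By way of a non-limiting illustration, the one-by-one deactivation and activation described above might be sketched in Python as follows. The sketch assumes that the "load" is the total power consumption divided by the combined rated capacity of the active power supplies, that all power supplies have the same rated capacity, and that at least one power supply always remains active; these assumptions, and the function and parameter names, are hypothetical. The 40% and 90% defaults are the example thresholds given above.

```python
def rebalance_psus(total_power_w: float,
                   active: int,
                   total_psus: int,
                   psu_capacity_w: float,
                   low: float = 0.40,
                   high: float = 0.90) -> int:
    """Return the number of power supplies that should be active."""

    def load(n: int) -> float:
        # Fraction of the combined rated capacity of n active supplies in use
        return total_power_w / (n * psu_capacity_w)

    # Deactivate one by one while lightly loaded, keeping at least one active
    while active > 1 and load(active) < low:
        active -= 1

    # Activate one by one while the active supplies are heavily loaded
    while active < total_psus and load(active) >= high:
        active += 1

    return active
```

  • Under these assumptions, removing one of n supplies raises the load by a factor of at most two (when n is 2), so a load below 40% cannot jump to 90% or more in a single deactivation, and the two loops do not undo each other.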
  • According to some embodiments of the present invention, there is provided a method of managing a storage system comprising one or more storage devices and a plurality of power supplies configured to supply power to the storage device, the method comprising: determining, by a processor of the storage device, whether multiple power supplies of the plurality of power supplies are active; in response to determining that multiple power supplies are active: determining, by the processor, a total power consumption of the one or more storage devices; in response to determining that the total power consumption is less than a first percentage threshold of a load of active ones of the power supplies, deactivating, by the processor, the active ones of the power supplies one by one until the total power consumption is equal to or greater than the first percentage threshold of a load of each of the active ones of the power supplies; and in response to determining that the total power consumption is equal to or greater than a second percentage threshold of a load of each of the active ones of the power supplies, activating, by the processor, deactivated ones of the power supplies one by one until the total power consumption is less than the second percentage threshold of the load of each of the active ones of the power supplies.
  • According to some embodiments of the present invention, there is provided a storage system comprising: a plurality of storage devices, each storage device of the plurality of storage devices being configured to measure a power consumption of the storage device; a processor in communication with the plurality of storage devices; and a memory having stored thereon instructions that, when executed by the processor, cause the processor to perform: determining whether one or more first storage devices of the plurality of storage devices are idle or are in an idle state; in response to determining that the one or more first storage devices are in an idle state, instructing the one or more first storage devices to operate at lower power caps; determining whether one or more second storage devices of the plurality of storage devices are consuming power under a threshold power level; and in response to determining that the one or more second storage devices are consuming power under the threshold power level, instructing the one or more second storage devices to operate at or below the threshold power level.
  • In some embodiments, the determining whether one or more first storage devices are in an idle state comprises: obtaining power consumption of each storage device of the plurality of storage devices by retrieving a corresponding power log from the storage device; comparing the power consumption of each storage device with an idle power level; and determining whether the one or more first storage devices have power consumptions that are at or below the idle power level.
  • In some embodiments, the power log stores actual power consumption of the corresponding storage device as measured by a corresponding power meter.
  • In some embodiments, instructing the one or more first storage devices to operate at the lower power caps comprises: instructing the one or more first storage devices to change power states to a power state having a lower maximum power rating.
  • In some embodiments, determining whether the one or more second storage devices of the plurality of storage devices are consuming power under a threshold power level comprises: obtaining power consumption of each storage device of the plurality of storage devices by retrieving a corresponding power log from the storage device; comparing the power consumption of each storage device with the threshold power level; and determining whether the one or more second storage devices have power consumptions that are below the threshold power level.
  • In some embodiments, instructing the one or more second storage devices to operate at or below the threshold power level comprises: instructing the one or more second storage devices to change power states to a power state having a maximum power rating corresponding to the threshold power level.
  • In some embodiments, the instructions further cause the processor to perform: determining whether one or more storage slots are not occupied by any storage device; and in response to determining that the one or more storage slots are not occupied by any storage device: identifying one or more power meters associated with the one or more storage slots; and instructing the identified one or more power meters to operate at a lower power cap.
  • In some embodiments, instructing the identified one or more power meters to operate at a lower power cap comprises: instructing the one or more power meters to operate at a lowest power state.
  • In some embodiments, instructing the identified one or more power meters to operate at a lower power cap comprises: instructing the one or more power meters to deactivate.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Further features and aspects will become apparent and will be best understood by reference to the following detailed description reviewed in conjunction with the drawings. In the drawings:
  • FIG. 1 is an internal block diagram of a storage device according to an embodiment of the present invention.
  • FIG. 2 is a flow chart of a method for collecting power consumption measurements from a power measurement unit in the storage device of FIG. 1 .
  • FIG. 3 is a schematic diagram of a storage system incorporating multiple storage devices that are capable of providing power measurements.
  • FIG. 4 is a block diagram of an embodiment of the storage system of FIG. 3 in which a PCIe switch is used.
  • FIG. 5 is a diagram depicting an embodiment in which power measurements are transferred to a local service processor based on a query from the local service processor.
  • FIG. 6 is a diagram depicting an embodiment in which power measurements are set by the local service processor.
  • FIG. 7 shows an example of a power policy which can be used by the local service processor 50 to control power consumption of a storage device.
  • FIG. 8 is a diagram depicting an embodiment in which power measurements are stored in a controller memory buffer until fetched by the local service processor.
  • FIG. 9 is a diagram depicting an embodiment in which power measurements taken by a power measurement unit are directly accessible to the local service processor.
  • FIG. 10 is an example of a power log according to an embodiment of the present invention.
  • FIG. 11 is an illustrative method of how a storage system manages the power reporting of multiple storage devices in its chassis using the power log of FIG. 10 .
  • FIG. 12 is a block diagram illustrating a storage system utilizing a storage bank and a power distribution unit, according to some exemplary embodiments of the present invention.
  • FIGS. 13A-13D illustrate histograms of power consumption of a storage system as generated by the local service processor, according to some exemplary embodiments of the present invention.
  • FIG. 14 is a flow diagram illustrating a process of managing the operation of the power supplies of a storage system, according to some exemplary embodiments of the present invention.
  • FIG. 15 is a flow diagram illustrating a process of managing the storage devices of the storage system, according to some exemplary embodiments of the present invention.
  • DETAILED DESCRIPTION
  • Hereinafter, example embodiments will be described in more detail with reference to the accompanying drawings, in which like reference numbers refer to like elements throughout. The present invention, however, may be embodied in various different forms, and should not be construed as being limited to only the illustrated embodiments herein. Rather, these embodiments are provided as examples so that this disclosure will be thorough and complete, and will fully convey the aspects and features of the present invention to those skilled in the art. Accordingly, processes, elements, and techniques that are not necessary to those having ordinary skill in the art for a complete understanding of the aspects and features of the present invention may not be described. Unless otherwise noted, like reference numerals denote like elements throughout the attached drawings and the written description, and thus, descriptions thereof will not be repeated. In the drawings, the relative sizes of elements, layers, and regions may be exaggerated for clarity.
  • It will be understood that when an element or layer is referred to as being “on,” “connected to,” or “coupled to” another element or layer, it can be directly on, connected to, or coupled to the other element or layer, or one or more intervening elements or layers may be present. In addition, it will also be understood that when an element or layer is referred to as being “between” two elements or layers, it can be the only element or layer between the two elements or layers, or one or more intervening elements or layers may also be present.
  • The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the present invention. As used herein, the singular forms “a” and “an” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes,” and “including,” when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.
  • As used herein, the term “substantially,” “about,” and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art. Further, the use of “may” when describing embodiments of the present invention refers to “one or more embodiments of the present invention.” As used herein, the terms “use,” “using,” and “used” may be considered synonymous with the terms “utilize,” “utilizing,” and “utilized,” respectively. Also, the term “exemplary” is intended to refer to an example or illustration.
  • The electronic or electric devices and/or any other relevant devices or components according to embodiments of the present invention described herein may be implemented utilizing any suitable hardware, firmware (e.g. an application-specific integrated circuit), software, or a combination of software, firmware, and hardware. For example, the various components of these devices may be formed on one integrated circuit (IC) chip or on separate IC chips. Further, the various components of these devices may be implemented on a flexible printed circuit film, a tape carrier package (TCP), a printed circuit board (PCB), or formed on one substrate. Further, the various components of these devices may be a process or thread, running on one or more processors, in one or more computing devices, executing computer program instructions and interacting with other system components for performing the various functionalities described herein. The computer program instructions are stored in a memory which may be implemented in a computing device using a standard memory device, such as, for example, a random access memory (RAM). The computer program instructions may also be stored in other non-transitory computer readable media such as, for example, a CD-ROM, flash drive, or the like. Also, a person of skill in the art should recognize that the functionality of various computing devices may be combined or integrated into a single computing device, or the functionality of a particular computing device may be distributed across one or more other computing devices without departing from the spirit and scope of the exemplary embodiments of the present invention.
  • Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and/or the present specification, and should not be interpreted in an idealized or overly formal sense, unless expressly so defined herein.
  • Embodiments of the present invention include a storage device, such as an SSD (e.g., NVMe or NVMe-oF SSD), that is capable of reporting its actual power consumption to the local service processor, for example, a baseboard management controller (BMC). This enables the local service processor to provide power profiles and consumption of the storage device. In some embodiments, the storage device can report to the local service processor or BMC via a system management bus (SMBus) or a Peripheral Component Interconnect Express (PCIe) bus, and can report by one of various protocols, such as by a Management Component Transport Protocol (MCTP) or by an NVMe Management Interface protocol for NVMe SSD storage devices. In some embodiments, the storage system may be an NVMe-oF based system. Further embodiments include a storage system including several storage devices in which each storage device is capable of reporting its actual power consumption to the local service processor. In such a system, the local service processor can provide power profiles and analytics of the storage system and individual storage devices in the system.
  • FIG. 1 depicts an internal block diagram of a storage device 10 according to an embodiment of the present invention. While the diagram depicts features relevant to the illustrated embodiment of the invention, the storage device 10 may include additional components. In some embodiments, the storage device 10 may be an SSD, an Ethernet SSD (eSSD), an NVMe SSD, an NVMe-oF SSD, or a SAS or SATA SSD.
  • The storage device 10 includes internal components, including a controller 11, a memory 12, flash dies 13, a power metering unit (PMU) 14 and a connector 15. The controller 11, also known as the processor, implements firmware to retrieve and store data in the memory 12 and flash dies 13 and to communicate with a host computer. In some embodiments, the controller 11 may be an SSD controller, an ASIC SSD controller, or an NVMe-oF/EdgeSSD controller. The memory 12 can be a random access memory such as DRAM or MRAM and the flash dies 13 may be NAND flash memory devices, though the invention is not limited thereto. The controller 11 can be connected to the memory 12 via memory channel 22 and can be connected to the flash dies 13 via flash channels 23. The controller 11 can communicate with a host computer via a host interface 20 that connects the controller 11 to the host computer through the connector 15. In some embodiments, the host interface 20 may be a PCIe connection, an Ethernet connection or other suitable connection. The connector 15 may be U.2/M.2 connectors or other suitable connector(s). The PMU 14 allows the storage device 10 to support power management capabilities by measuring actual power consumption of the storage device 10.
  • The storage device 10 is supplied power through the connector 15 via power rails or pins 30. In examples in which the connector 15 is a PCIe connector, the pins 30 may be 12 V and 3.3 V pins. In examples in which the connector 15 is a U.2 connector, the pins 30 may be 5 V and 12 V pins (an NVMe SSD may only use the 12 V pin, while a SAS or SATA SSD may use both rails). Power rails 30 supply power to the various components of the storage device 10. For example, the power rail may supply power to the various components of the storage device 10 via the PMU 14 and various intermediary voltage rails. An embodiment of this is shown in FIG. 1 , in which the power rails 30 supply power to the PMU 14, which then distributes power to other components of the storage device 10. For example, the PMU 14 drives power to the flash dies 13 via flash voltage rails 33. The PMU 14 may similarly drive all power rails to the memory 12 via memory voltage rails 32. Power can be supplied to the controller 11 by the PMU 14 through multiple voltage rails, such as, for example, a core voltage rail 34, an I/O voltage rail 35 and one or more other voltage rails 36. Additional voltage rails, such as an additional voltage rail 37, may be included to connect other various components that may be included in the storage device 10. The various voltage rails 30, 33, 34, 35, 36, 37 used in the storage device 10 can be in a range of from 12V down to 0.6V, including 12V and/or 3.3V rails, for example, when the storage device 10 is an NVMe SSD. While, in the embodiments shown in FIG. 1 , the voltage regulators are built into (or integrated with) the PMU 14, embodiments of the present invention are not limited thereto, and the voltage regulators may be external to the PMU 14.
  • In addition to supplying power to the storage device 10, the power rails 30 are used by the PMU 14 inside the storage device 10 to generate power consumption measurements (“power measurements”) of the various voltage rails used by the components of the storage device 10, for example, used by components such as the controller 11, the flash dies 13, the memory 12 and other various components that may be included in the storage device 10. In some embodiments, the PMU 14 can be programmed to support get/set Power State by Power Info from the host computer or BMC.
  • The PMU 14 can measure the amount of current drawn on various voltage rails it is driving, for example, voltage rails 32, 33, 34, 35, 36 and 37. The PMU 14 can output power measurements including the average, minimum and maximum power usage by the voltage rails 32, 33, 34, 35, 36 and 37 of the storage device 10. In some embodiments, the PMU 14 can meter each voltage rail 32, 33, 34, 35, 36 and 37 individually, with the summation of the power drawn on all voltage rails 32, 33, 34, 35, 36 and 37 used by the storage device 10 being the total power consumed by the storage device 10. The power measurements metered at the PMU 14 can be read by the controller 11 using a PMU/controller interface 41. In some embodiments, the PMU/controller interface 41 may be an I2C/SMBus. The controller 11 can then provide these power measurements to a local service processor 50 (see FIG. 3 ), such as a BMC, via either the host interface 20 or a separate controller/host interface 42. If a separate controller/host interface or side band bus 42 is used, that interface may be an I2C/SMBus. If the controller/host interface 42 is a PCIe connection, the controller 11 can provide power measurements to the local service processor 50 via NVMe-MI or MCTP protocols, as shown in FIG. 4 . The PMU 14 can report/output the power measurements periodically as specified by the local service processor 50 or passively keep track via internal counters which are accessible to the local service processor 50.
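  • By way of a non-limiting illustration, the per-rail metering and summation described above might be sketched in Python as follows. The sketch assumes that the power on each rail is computed as the rail voltage multiplied by the measured current; the rail labels and the voltage and current values are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, Tuple


def rail_power_w(voltage_v: float, current_a: float) -> float:
    """Power drawn on one voltage rail, assuming P = V * I."""
    return voltage_v * current_a


def total_device_power_w(rails: Dict[str, Tuple[float, float]]) -> float:
    """Sum the per-rail power to obtain the total power of the device."""
    return sum(rail_power_w(v, i) for v, i in rails.values())


# Hypothetical (voltage in volts, current in amperes) readings per rail
example_rails = {
    "memory_rail_32": (1.2, 0.5),
    "flash_rail_33": (3.3, 1.0),
    "core_rail_34": (0.9, 2.0),
}
```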
  • FIG. 2 is a flow chart of a method for collecting power consumption measurements from the PMU 14 of the storage device 10. As shown in FIG. 2 , power measurements can be read at predetermined intervals. For example, the power measurements can be read from the PMU 14 of the storage device 10 at a user-configurable frequency such as 1 second, 5 seconds, more than 5 seconds, or every few minutes. In other embodiments, the storage device 10 can read the power measurements only as needed (see, e.g., FIGS. 8 and 9 ), for example, at the completion of a specific job. The frequency at which the power measurements are read is hereinafter called a time unit.
  • For every time unit, the controller 11 prepares (S1) to receive power measurements from the PMU 14 for the various voltage rails 30, 33, 34, 35, 36, 37. The controller 11 queries (S2) the PMU 14 to determine if power measurements from all rails have been completed. If no, then a read request (S3) is sent to a DC-DC regulator at the PMU 14 corresponding to a voltage rail for which power measurements have not been received (the PMU 14 may include a number of DC-DC regulators each corresponding to a unique voltage rail). This read request may be sent via an I2C protocol via the PMU/controller interface 41. When the power measurement is received from the PMU 14, the power measurement is then annotated with a timestamp (S4) and a Host ID (S5). The received power measurement is then saved (S6) to a power log. The power log may include internal register(s) or may be included as part of the PMU's embedded non-volatile memory.
  • Once the received power measurement is saved, the PMU 14 is again queried (S7) until all power measurements are received from the various voltage rails 30, 33, 34, 35, 36, 37. Once all power measurements are complete and the annotated power measurements are saved in the power log, these power measurements persist (S8) in the power log through resets and power cycles.
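  • By way of a non-limiting illustration, the collection loop of FIG. 2 for one time unit might be sketched in Python as follows. The `read_rail` callable stands in for the read request sent to a DC-DC regulator and is hypothetical, as are the field names of the log entries. The power log is modeled here as an in-memory list, so the persistence through resets and power cycles (S8) is not modeled.

```python
import time
from typing import Callable, Dict, List

# Voltage rail reference numerals as listed in the text
RAILS = [30, 33, 34, 35, 36, 37]


def collect_time_unit(read_rail: Callable[[int], float],
                      host_id: str,
                      power_log: List[Dict],
                      now: Callable[[], float] = time.time) -> List[Dict]:
    """Read every rail once, annotate each reading, and append it to the log."""
    pending = list(RAILS)                 # S1: prepare to receive measurements
    while pending:                        # S2/S7: any rail still outstanding?
        rail = pending.pop(0)
        watts = read_rail(rail)           # S3: read request for this rail
        entry = {
            "rail": rail,
            "watts": watts,
            "timestamp": now(),           # S4: annotate with a timestamp
            "host_id": host_id,           # S5: annotate with a Host ID
        }
        power_log.append(entry)           # S6: save to the power log
    return power_log
```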
  • In addition to the above annotations, the power log pages can also include any or all of the following: Namespace ID, NVM Set, read I/Os, write I/Os, SQ ID, Stream ID, and other suitable parameters. The controller 11 also implements actual power (AP) registers which are accessible by the local service processor 50. This allows a variety of parameters associated with the storage device and the power measurements to be mapped with fine granularity.
  • In some embodiments, the power log can be special proprietary or vendor-defined log pages. The power log can be read by the local service processor 50 using existing standard protocols through either the host interface 20 or the separate controller/host interface or side-band bus 42, whichever is used. For example, the power log can be read by a BMC using the NVMe-MI protocol via the controller/host interface 42, which may be an SMBus or PCIe.
  • The above method provides dynamic, real-time output of actual power consumption measurements without affecting the I/O of the storage device. With the power measurement information, the local service processor can implement power budgets and allocate power to the storage device based on its actual power usage. For example, the local service processor can implement power budgets similar to existing industry standards for allocated power budget registers. Also, the storage device can report real time power consumption to system management software, such as Samsung's DCP or Redfish.
  • FIG. 3 is a block diagram of a storage system 100 incorporating multiple storage devices 10. The storage system 100 includes the local service processor 50 attached to multiple storage devices 10. Each storage device 10 has a PMU 14 to measure power consumption as described above with respect to FIGS. 1 and 2 . In the illustrated embodiment, the storage devices 10 provide power measurements to the local service processor 50 via the controller/host interface 42. In some embodiments, the controller/host interface 42 may be an I2C/SMBus or PCIe bus. The power measurements may be transferred to the local service processor 50 using NVMe protocols, such as NVMe-MI, MCTP over PCI-e, or I2C Bus protocols. If the storage device 10 is connected via a SMBus/I2C connection, the local service processor 50 can even access the power log during a power failure using these existing standard protocols.
  • FIG. 4 is a block diagram of an embodiment of the storage system 100 of FIG. 3 in which a PCIe switch 60 is used. In this embodiment, the storage devices 10 are connected to the local service processor 50 via the PCIe switch 60. The power measurements may be transferred to the local service processor 50 via the PCIe switch 60 using suitable protocols such as, for example, NVMe-MI and/or MCTP.
  • In the embodiments of FIGS. 3 and 4 , the local service processor 50 and the multiple storage devices 10 can be housed within the same chassis allowing the local service processor 50 to process the power measurements of the multiple storage devices 10 according to chassis power management requirements; however, the invention is not limited thereto. For example, power measurements can also be processed at the individual storage device level.
  • In embodiments in which the power measurements are transferred to the local service processor 50 using NVMe protocols, NVMe specifications can define power measurements and their process mechanism. Based on this mechanism, the storage devices 10 (e.g., an NVMe SSD) can support power management either queried by the local service processor 50 (FIG. 5 ) or set by the local service processor 50 (FIG. 6 ).
  • FIG. 5 is a diagram depicting an embodiment in which power measurements are transferred to the local service processor 50 based on a query from the local service processor 50. In this embodiment, the local service processor 50 queries the power measurement information by sending a GetFeature command (S10), for example, FeatureID=0x2, to the firmware of the controller 11 for the storage device 10 from which the local service processor 50 is seeking power measurement information. The controller's firmware then fetches (S11) the power measurement information from the PMU 14. The firmware of the controller 11 receives the information and sends (S12) that information via direct memory access (DMA) to the local service processor 50. The controller's firmware then sends (S13) a completion notice to the local service processor 50 to signal completion of the query. This embodiment allows for real-time retrieval of power measurements from the storage device 10.
  • FIG. 6 is a diagram depicting an embodiment in which power measurements are set by the local service processor 50. In this embodiment, the local service processor 50 sets the power measurement information (hereinafter, called the power measurement budget) by sending a SetFeature command (S20), for example, FeatureID=0x2, to the firmware of the controller 11 for the storage device 10 for which the local service processor 50 intends to set the power measurement budget. The controller's firmware then uses DMA to request (S21) the power measurement budget from the local service processor 50. The firmware of the controller 11 receives the information and sets (S22) the power measurement budget of the PMU 14. In response, the controller's firmware processes the new power state transaction. In order to process the new power transaction, the controller's firmware queries the current power state job in the PMU 14 to ensure that all tasks that rely on the current power state are fully completed successfully. Then, the firmware changes the current power state from the current one to the next one required by the power measurement budget. The controller's firmware starts to process new tasks which rely on the power state using the allocated power measurement budget. The controller's firmware then sends (S23) a completion notice to the local service processor 50 to signal that the new power state has been set.
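  • The ordering of the SetFeature handling in FIG. 6 (request the budget, finish work that depends on the current power state, switch states, then signal completion) can be modeled as follows. This is a minimal sketch under stated assumptions: the `Controller` class, its `pending_tasks` list, and the `fetch_budget` callback standing in for the DMA request are hypothetical and are not part of any NVMe API.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical model of the controller firmware's power-state handling."""
    power_state: int
    pending_tasks: list = field(default_factory=list)
    trace: list = field(default_factory=list)

    def set_power_budget(self, fetch_budget):
        # S21: request the new budget; fetch_budget stands in for the DMA read.
        new_state = fetch_budget()
        self.trace.append(("S21", new_state))
        # Complete every task that relies on the current power state first.
        while self.pending_tasks:
            task = self.pending_tasks.pop(0)
            task()
        # S22: move from the current power state to the requested one.
        self.power_state = new_state
        self.trace.append(("S22", new_state))
        # S23: completion notice back to the local service processor.
        self.trace.append(("S23", "complete"))
        return "complete"
```

  • Draining the pending tasks before changing state mirrors the requirement that all tasks relying on the current power state complete before the transition.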
  • By enabling this SetFeature function, the local service processor 50 can control and throttle the power consumption of a particular storage device 10 to meet an allocated power budget of the local service processor 50. The controller 11 can enforce the power budget allocations programmed by the local service processor 50. If the actual power consumption exceeds the set threshold, the controller 11 can throttle the I/O performance for that parameter in order to minimize power consumption and to stay within the allocated power budget. The controller 11 can, for example, self-adjust by lowering the internal power state automatically when exceeding the allocated power budget. The controller 11 can then report back to the local service processor 50 so that the local service processor 50 can reallocate the available power to some other devices which may need additional power. The controller 11 may also collect statistics about such performance throttling on a fine granularity.
  • FIG. 7 shows an example of a power policy which can be used by the local service processor 50 to control power consumption of a storage device 10. The local service processor 50 can manage the power policy by monitoring each storage device 10 in the storage system and instructing each storage device 10 to maintain its respective allocated power budget. For example, if a storage device 10 changes from operating at normal 61 to operating at greater than 90% of its allocated power budget, as shown at 62, the controller 11 may throttle I/O performance by, for example, introducing additional latency of a small percentage (e.g., 10% or 20% of idle or overhead). However, if the current state is greater than 100% of its allocated power budget, as shown at 63, the controller 11 may introduce a much bigger latency (e.g., 50% or larger) or may introduce delays to NAND cycles, etc., in order to throttle the storage device 10 to meet its allocated budget. If the storage device 10 continues to exceed its allocated budget despite the introduced latencies, the local service processor 50 may execute shutdown instructions 64 to shut down the device 10, or the controller 11 may shut itself down.
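  • The tiers of the FIG. 7 policy can be expressed as a small decision function. The 90% and 100% thresholds come from the description; the function name, the action labels, and the `still_over_after_throttle` flag are hypothetical choices made for this sketch.

```python
def policy_action(actual_watts, budget_watts, still_over_after_throttle=False):
    """Map a device's measured power to one of the FIG. 7 policy tiers."""
    ratio = actual_watts / budget_watts
    if ratio > 1.0:
        # Over budget: heavy throttling first, shutdown if throttling did not help.
        return "shutdown" if still_over_after_throttle else "throttle_heavy"
    if ratio > 0.9:
        # Approaching the budget: add a small amount of latency.
        return "throttle_light"
    return "normal"
```

  • For example, a device drawing 19 W against a 20 W budget falls in the light-throttling tier, while 21 W triggers heavy throttling.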
  • In further embodiments, the local service processor 50 can also monitor and detect thermal load increases (temperature rises), or manage operation during peak utility rate periods, such as hot daytime hours, or during brown-out periods, to ensure that each storage device 10 is performing as intended.
  • The above feature makes the storage device capable of autonomously optimizing power versus performance versus assigned power budget/state.
  • FIG. 8 is a diagram depicting a further embodiment in which power measurements are stored in the controller memory buffer until fetched by the local service processor 50. In this embodiment, the controller 11 can store the power measurements locally in its own memory 12 until requested by the local service processor 50. For example, the controller 11 could store the power measurement information in a controller memory buffer of the memory 12 in an embodiment in which the storage device 10 is an NVMe SSD. The NVMe specification defines the controller memory buffer (CMB), which is a portion of the storage device's memory but is assigned by, and logically owned by, the host/local service processor.
  • The firmware of the controller 11 can fetch power measurement information from the PMU 14 and store it in the controller memory buffer of the memory 12. The controller memory buffer can be updated at any designated time unit. The local service processor 50 can then query the power measurement information by reading the power measurements directly from the controller memory buffer of the memory 12. The power measurements can be read from the controller memory buffer via the controller/host interface 42. If the controller/host interface 42 is PCIe, the power measurement information can go through the PCIe to directly process memRd/memWr based on the BAR configuration in order to read from the controller memory buffer. In other embodiments, the power measurement information can go through a side band such as SMBus or I2C to directly access the controller memory buffer.
  • Alternative to FIG. 8 , the storage device 10 can be configured so that the PMU 14 is directly accessible by the local service processor 50 in order for the local service processor to be able to access the power measurement information when desired/needed and in real-time.
  • FIG. 9 is a diagram depicting an embodiment in which power measurements taken by the PMU 14 are directly accessible to the local service processor 50. In this embodiment, the storage device 10 can be configured with an assistant bus, such as, for example, I2C or AXI, to allow direct access to the PMU 14 by the local service processor 50. This allows the local service processor 50 to be able to process the power measurement information by accessing the PMU 14 directly and allows for retrieval of power measurements in real-time.
  • FIG. 10 is an example of a power log 70 according to an embodiment of the present invention. As illustrated in this embodiment, a storage device 10 may have, for example, up to 32 Power States (PowerState) 71, which are recorded in the power log 70. Each PowerState 71 has predefined performance information, a Maximum Power (MP) 72 capable of being utilized in that PowerState 71 and an Actual Power (AP) 73 actually being used at that PowerState. AP 73 is measured periodically according to the time unit (e.g., 1 minute) and Workload/QoS. In the current embodiment, each row in the power log 70 represents a power state which has been defined in the NVMe Specifications 1.3. For example, a total of 32 Power States are defined in the NVMe Specifications. In some embodiments, a vendor-specific definition can be used for each PowerState 71.
  • The power log 70 can include in its table entries the various PowerStates 71 and each PowerState's respective MP 72, AP 73 and additional information for identifying the power measurements and a relationship among Max Power/Power State, Actual Power, and QoS. QoS information can include, for example, current Entry Latency (ENTLAT), current Exit Latency (EXTLAT), RRT (Relative Read Throughput), RWT (Relative Write Throughput) and other suitable variables.
  • FIG. 10 illustrates a Power State_3 with a defined Max Power=20 W. However, the storage device 10 at this Power State currently consumes an Actual Power=19 W. Current QoS is shown in other columns such as RRT=2, RWT=2, ENTLAT=20 us and EXTLAT=30 us. If applications 80 run on the storage system 200 expect the best QoS (such as the best RRT & RWT), those applications 80 could instruct the local service processor 50 to give more power to the storage device 10 by transferring from Power State_3 to Power State_0.
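  • One way to picture a row of the power log 70 is as a record holding the power state, its MP and AP values, and the QoS fields. The sketch below uses the Power State_3 values quoted above; the `PowerStateRow` type, its field names, and the `headroom_w` helper are hypothetical and do not reflect an actual log page layout.

```python
from dataclasses import dataclass


@dataclass
class PowerStateRow:
    """Hypothetical in-memory form of one row of the power log 70."""
    power_state: int
    max_power_w: float      # MP 72
    actual_power_w: float   # AP 73
    rrt: int                # Relative Read Throughput
    rwt: int                # Relative Write Throughput
    entlat_us: int          # Entry Latency, microseconds
    extlat_us: int          # Exit Latency, microseconds

    @property
    def headroom_w(self):
        # Power still available before the state's maximum is reached.
        return self.max_power_w - self.actual_power_w


# Values for Power State_3 as given in the description of FIG. 10.
row = PowerStateRow(3, 20.0, 19.0, 2, 2, 20, 30)
```

  • With these values the device has 1 W of headroom in Power State_3, which is the kind of signal an application could use when asking for a move to Power State_0.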
  • The current PowerState 71 is retrieved by the local service processor 50 through the GetFeature (FeatureID=0x2), as discussed with respect to FIG. 5 . An expected power state (i.e., power measurement budget) can be set by the local service processor 50 through the SetFeature (FeatureID=0x2), as discussed with respect to FIG. 6 . Other power-related information can be managed by the local service processor 50 through a VUCmd (Vendor Unique Cmd) or directly accessed through the local service processor 50. For example, if the user would like to get power measurement information which is not defined in the NVMe specification, a VUCmd can be used to allow the host to retrieve such non-standard power information, similar to a LogPage.
  • FIG. 11 is an illustrative method of how a storage system 200 manages the power reporting of multiple storage devices 10 in its chassis. According to this method, each PMU 14 of each storage device 10 measures the current AP 73 and stores the information in the power log 70, which is queried and/or retrieved (S50) by the local service processor 50. The local service processor 50 then updates/uploads (S51) the power log 70 from the local service processor 50 to the storage system 200. Various applications 80 in the storage system 200 can analyze (S52) the power logs 70 of the storage devices 10 in the chassis at the local service processor 50. The results of these analyses can determine how to allocate power for better performance, e.g., whether more power needs to be allocated to a particular PowerState 71 or whether power should be reallocated from one PowerState 71 to another to meet QoS demands. For example, the local service processor 50 can request (S53) that the storage device 10, as illustrated with respect to the center storage device 10 shown in FIG. 10 , transfer Max Power State, in this example, from PowerState 3 to PowerState 0. The local service processor 50 can then either assign a new MP 72 to the storage devices 10 or can request (S54) a power distribution unit (PDU) 90 to assign a new MP 72 budget to the storage devices 10, i.e., redistributing power allocations. If the PDU 90 is used, the PDU will then assign (S55) the new MP 72 to the storage devices 10. The PDU 90 may be an independent component located in the chassis and may be responsible for distributing MP to each storage device 10. The local service processor 50 then updates (S56) the power log 70 with the changes.
  • As discussed above, once the local service processor 50 has access and can read the power measurements, the local service processor 50 can then use that information to create graphs or histograms to trend projections and to run diagnostics.
  • Embodiments of the present invention also enable the local service processor to provide individual actual power profiles of each storage device in the system to software developers, cloud service providers, users and others by allowing them to know the actual power that their workloads consume on each storage device. This provides the ability for software developers/users to optimize performance based on the actual cost of energy and also allows cloud service providers to provide more accurate billing of storage system users based on actual power consumption. Embodiments of the present invention can also provide better policing and tracking of storage devices violating an allocated power budget.
  • Embodiments of the present invention may be used in a variety of areas. For example, the embodiments of the present invention provide building blocks of crucial information that may be used for analysis purposes for artificial intelligence software, such as Samsung's DCP. The embodiments also provide information that may be useful to an ADRC (active disturbance rejection control), high-efficiency, thermal-control based system.
  • Although exemplary embodiments of the present invention have been described, it is understood that the present invention should not be limited to these exemplary embodiments but various changes and modifications can be made by one of ordinary skill in the art within the spirit and scope of the present invention as hereinafter claimed by appended claims and equivalents thereof.
  • FIG. 12 is a block diagram illustrating a storage system 300 utilizing a storage bank 302 and a power distribution unit (PDU) 90, according to some exemplary embodiments of the present invention.
  • In some embodiments, the storage bank (e.g., an Ethernet SSD chassis or Just-a-bunch-of-flashes (JBOF)) 302 includes a plurality of storage devices 10, and the PDU 90 includes a plurality of power supply units (PSUs or power supplies) 304 for supplying power to the storage devices 10 of the storage bank 302 under the direction of the local service processor (or BMC) 50. In some embodiments, the PSUs 304 are interchangeable, that is, each may have the same form factor and the same power supply capacity (e.g., have same output wattage); however, embodiments of the present invention are not limited thereto, and one or more of the PSUs 304 may have a power supply capacity that is different from other PSUs 304. In some examples, the plurality of PSUs 304 may be in an N+1 configuration in which N (an integer greater than or equal to 1) PSUs are sufficient to service the power needs of the storage bank 302, and an additional PSU 304 is provided as redundancy, which may be activated in the event that any of the PSUs experiences a failure.
  • As shown in FIG. 12 , in some embodiments, the PSUs 304 may be coupled together using a switch network (e.g., a FET network) 305, rather than directly connected to the power bus 306, in order to protect the power bus 306 from electrical short circuits and transients when other PSUs 304 are connected. The switch network may include a plurality of switches (e.g., transistors) that are connected to the plurality of PSUs 304, on one end, and connected to the power bus 306, at the other end. According to some embodiments, the switches are independently controlled by the local service processor (BMC) 50, so that any one of the PSUs 304 may be connected to, or disconnected from, the power bus 306, based on a control signal from the local service processor 50.
  • According to some embodiments, each storage device 10 is configured to report its actual power consumption to the local service processor 50 via, for example, SMBus or PCI-e, and by, for example, NVMe-MI or MCTP protocols. The actual power consumption is measured by the PMU (i.e., power meter) 14, which may be internal to (e.g., integrated within) the storage device 10 (as shown in FIG. 12 ) or be external to, but coupled to, the storage device 10. The power consumption reporting enables the local service processor 50 to provide power profiles and perform analytics on the storage bank 302, which can in turn be used for diagnostics as well as offering value added services. This also allows each storage device 10 to more flexibly manage its own power usage as dictated by the system administrator 308, via the local service processor 50.
  • FIGS. 13A-13D illustrate histograms of power consumption of a storage system as generated by the local service processor 50, according to some exemplary embodiments of the present invention.
  • According to some embodiments, the local service processor 50 reads the power measurements periodically from the storage devices 10. In so doing, local service processor 50 may use NVMe-MI protocol over SMBus or PCIe to read the power log 70 pages, according to some examples. The local service processor 50 may then process the read power data to generate power usage trends, such as whole power usage of the storage bank 302 over time (e.g., per hour, during day time, night time, weekdays, or weekends, etc.), each storage device's 10 power consumption over time, relative power consumption of the storage devices 10 in a storage bank 302, and/or the like. In addition, the local service processor 50 may generate many derivative/additional graphs to learn about the power consumption behavior with respect to time, user, activity, etc. The local service processor 50 may also utilize such data for diagnostics purposes, power provisioning, future needs, cooling, and planning, etc.
  • As an example, FIG. 13A illustrates the power consumption of a single storage device 10 over time. In FIG. 13A, the Y axis represents power consumption in terms of Watts, and the X axis represents time in terms of hours.
  • In some embodiments, the local service processor 50 manages host access policies, and receives raw power data and host IDs of active storage devices. Thus, according to some embodiments, the local service processor 50 is cognizant/aware of which host or application is accessing each storage device 10 at any given time, and is able to combine this information with power usage metrics to profile the power consumption by various hosts or applications. Such information can provide deeper insights into storage power needs to various applications and can be used to calculate the storage costs per host or application more accurately.
  • As an example, FIG. 13B illustrates power consumption by different hosts or applications. In FIG. 13B, the Y axis represents average power consumption in Watts over a period of time (e.g., per hour, day, etc.), and the X axis represents the host ID or application ID.
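  • The per-host profile plotted in FIG. 13B amounts to grouping power samples by host or application ID and averaging them. The following sketch assumes equally spaced samples delivered as `(host_id, watts)` pairs; the function name and the input format are hypothetical.

```python
from collections import defaultdict


def average_power_by_host(samples):
    """Average power per host from (host_id, watts) samples taken at equal intervals."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for host_id, watts in samples:
        totals[host_id] += watts
        counts[host_id] += 1
    # One bar per host ID, as on the X axis of FIG. 13B.
    return {host_id: totals[host_id] / counts[host_id] for host_id in totals}
```

  • Averages of this kind could feed the per-host or per-application storage cost calculation mentioned above.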
  • According to some embodiments, the local service processor 50 is capable of using power usage metrics for diagnostic purposes. In some embodiments, when abnormal power consumption is observed for a storage device 10, the local service processor 50 may alert the storage administrator 308. The abnormal power consumption may be a result of a fault within the storage device 10, or may be due to anomalous activity of the host or application that is accessing the storage device 10. For example, the faults may be a result of flash die or flash channel failures, which may initiate a RAID-like recovery mechanism consuming excess power; or higher bit error rates in the media or volatile memory, which may cause error correction algorithms not to converge and to spend more time and energy on a process. The local service processor 50 may query storage device health and status logs, such as SMART Logs, as well as proprietary diagnostic logs to assess abnormal behavior. Based on the policies set by the administrator 308, some of the abnormal behavior may be alerted to the administrator 308 for further action.
  • For example, FIG. 13C illustrates a potential fault detected in a storage device 10 when the power consumption per hour suddenly spikes above normal levels (e.g., 3-10 W/hr) to close to maximum values (e.g., around 25 W). In FIG. 13C, the Y axis represents average power consumption in Watts, and the X axis represents time in terms of hours. Thus, in some embodiments, the criterion for fault detection may be the derivative of power consumption being greater than a set threshold. However, embodiments of the present invention are not limited thereto, and the actual power consumption may be measured against storage device performance to determine if a fault has occurred or not. In some examples, the fault detection criteria/policy may be set by the administrator 308.
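  • The derivative criterion can be approximated on hourly samples by a first difference. The sketch below flags every hour whose rise over the previous hour exceeds a threshold; the function name and the threshold parameter are hypothetical, and an administrator-defined policy could use a different criterion.

```python
def find_power_spikes(hourly_watts, max_rise_w):
    """Return the indices of samples that rose more than max_rise_w over the previous one."""
    return [
        i
        for i in range(1, len(hourly_watts))
        if hourly_watts[i] - hourly_watts[i - 1] > max_rise_w
    ]
```

  • For instance, a series that sits between 5 W and 7 W and then jumps to 25 W is flagged at the hour of the jump when the allowed rise is 10 W.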
  • Further, FIG. 13D illustrates an example, in which a potential fault is detected in a storage device 10 (e.g., the storage device in slot #8). In this example, the storage device 10 may be expected to consume a maximum power of about 25 W at 1 MIOPS (one million input/output operations per second) of performance. However, if the average power consumption of storage device in slot #8 reaches the maximum power of about 25 W, but the average performance is much lower than 1 MIOPs, then the local service processor may tag the storage device in slot #8 as potentially faulty or at least a good candidate for further fault analysis.
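  • The power-versus-performance check of FIG. 13D can be sketched as a comparison of measured power and measured IOPS against the device's expected operating point. The 25 W and 1 MIOPS figures come from the example above; the function name and the two fraction parameters that define "near maximum power" and "much lower performance" are assumptions chosen only for illustration.

```python
def is_fault_candidate(avg_watts, avg_iops, max_watts=25.0,
                       expected_iops=1_000_000,
                       power_frac=0.95, perf_frac=0.5):
    """Flag a device that draws near-maximum power while delivering far less performance."""
    near_max_power = avg_watts >= power_frac * max_watts
    low_performance = avg_iops < perf_frac * expected_iops
    return near_max_power and low_performance
```

  • A device in slot #8 averaging 25 W at only 200,000 IOPS would be tagged for further fault analysis under these assumed fractions.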
  • Accordingly, aspects of the present invention provide the building block of crucial information for other artificial intelligence SW to analyze. In addition, it also provides useful information for an ADRC (active disturbance rejection control), high-efficiency, thermal-control based system to take advantage of.
  • FIG. 14 is a flow diagram illustrating the process 400 of managing operations of the PDU 90, according to some exemplary embodiments of the present invention.
  • According to some embodiments, the local service processor 50 manages (e.g., optimizes) operations of the PDU 90 by dynamically monitoring the operation of the PSUs 304 of the PDU 90 and ensuring that active PSUs 304 operate in their high power-efficiency range. In so doing, the local service processor 50 determines (S100) whether the PDU 90 includes multiple active PSUs 304 or not. The active PSUs 304 may be connected to the power bus 306 through the switch network (i.e., have the corresponding switches turned on), and the deactivated PSUs 304 may be disconnected from the power bus 306 (e.g., by having the corresponding switches turned off). In some embodiments, the local service processor 50 determines the status of each PSU 304 in the PDU 90 through a bus (e.g., SMBus/PMBus), and is thus able to determine the number of PSUs 304 at the PDU 90. In some examples, the local service processor 50 reads the PSU status register of each PSU 304 present in the PDU 90 to determine its status (i.e., active/enabled or deactivated/disabled). If only one active PSU 304 is present, the local service processor 50 proceeds to determine (S114) if the active PSU 304 is the only PSU 304 present and the system is in HA mode (more on this below). Otherwise, the local service processor 50 determines (S102) whether the total power consumption of the storage bank 302 is less than a first percentage threshold (e.g., 40% or a value between 30% and 50%) of the load of each of the active PSUs 304. In some embodiments, the local service processor 50 does so by obtaining the actual power consumption of each storage device 10, as measured by the corresponding PMU 14, and adding together the actual power consumptions. In some examples, the local service processor 50 may obtain the actual power consumption of each storage device 10 by querying/retrieving the power log 70 from the storage device 10 or the PMU 14 corresponding to the storage device 10 (which may be internal to or external to the storage device 10).
  • If the total power consumption is less than the first percentage threshold of the load of each of the active PSUs 304, the active PSUs 304 may be operating in low power efficiency mode, which may be undesirable. As such, the local service processor 50 disables an active PSU 304 (S104), waits (S106) for a period of time (e.g., seconds or minutes), and rechecks (S102) whether the total power consumption of the storage bank 302 is still less than the first percentage threshold of the load of each of the active PSUs 304. If so, the loop continues and the local service processor 50 continues to disable the active PSUs 304 one by one until the total power consumption is equal to or greater than the first percentage threshold of the load of each of the active PSUs 304.
  • At that point, the local service processor 50 proceeds to determine (S108) whether the total power consumption of the storage bank 302 is greater than a second percentage threshold (e.g., about 90% or a value between 85% and 95%) of the load of each of the active PSUs 304. If so, the active PSUs 304 may be operating in a high-power state, which may be detrimental to the longevity of the PSUs 304 if prolonged. As such, the local service processor 50 enables (i.e., activates) a disabled (i.e., a deactivated) PSU 304 (S110), waits (S112) for a period of time (e.g., seconds or minutes), and rechecks (S108) whether the total power consumption of the storage bank 302 is still equal to or greater than the second percentage threshold of the load of each of the active PSUs 304. If so, the loop continues and the local service processor 50 continues to enable the disabled PSUs 304 one by one until the total power consumption is less than the second percentage threshold of the load of each of the active PSUs 304.
  • At that point, the local service processor 50 proceeds to determine (S114) if only one PSU 304 is present in the PDU 90 while the storage system 300 is in high availability (HA) mode. Generally, in HA mode, the storage system 300 is in multi-path IO mode and N+1 redundant PSUs are present to ensure that there is no single point of failure. As such, when only one PSU 304 is present in the PDU 90 while the system 300 is in HA mode, the local service processor 50 issues a warning (e.g., a critical warning) message (S116) to the system administrator 308 to install another redundant PSU 304 in the PDU 90. Otherwise, the system is operating normally and no warning message is sent to the system administrator 308.
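  • The control flow of process 400 can be condensed into a short function. This sketch rests on several assumptions that are not stated in the text: all PSUs have the same capacity, the "load of each of the active PSUs" is read as the total consumption shared equally across the active PSUs, the wait steps (S106, S112) are omitted, and the last active PSU is never disabled. The function and parameter names are hypothetical.

```python
def manage_psus(total_watts, psu_capacity_w, active, installed, ha_mode,
                low=0.40, high=0.90):
    """Return (number of active PSUs, whether to warn the administrator)."""

    def load(n):
        # Assumed reading: total draw shared equally by n identical PSUs.
        return total_watts / (n * psu_capacity_w)

    if active > 1:                                          # S100
        # S102/S104: shed PSUs while they run below the efficient range.
        while active > 1 and load(active) < low:
            active -= 1
        # S108/S110: bring PSUs back while the active ones run too hot.
        while active < installed and load(active) > high:
            active += 1
    # S114/S116: a single installed PSU in HA mode has no redundancy.
    warn = installed == 1 and ha_mode
    return active, warn
```

  • With 1900 W drawn from two active 1000 W units out of four installed, the sketch enables a third PSU, bringing the per-PSU load to roughly 63%.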
  • FIG. 15 is a flow diagram illustrating a process 500 of managing the storage devices 10 of the storage system 300, according to some exemplary embodiments of the present invention.
  • According to some embodiments, the local service processor 50 manages (e.g., optimizes) storage devices 10 by dynamically adjusting (e.g., lowering) their maximum power range or power cap based on the current workload of the storage bank 302.
  • In some embodiments, the local service processor 50 identifies (S118) which storage devices 10 of the storage bank 302 are in an idle state or consume near-idle power. Herein, an idle state may refer to an operational state in which a storage device 10 does not have any active or outstanding host commands such as read or write in its command queue for a period of time. That is to say that the host command queues of the storage device controller have been empty for a period of time, which may be programmable (e.g., by the system administrator 308). Near-idle power may be any power consumption that is below a set threshold, which may be programmable (e.g., by the system administrator 308). In some embodiments, the local service processor 50 obtains the actual power consumption of each storage device 10, which is measured by the corresponding PMU 14, by querying/retrieving the power log 70 from the storage device 10. The local service processor 50 then compares the actual power consumption with an idle power level. If the consumed power of the storage device 10 is at or below the idle power level, the storage device is identified as being in an idle state. The local service processor 50 then instructs (S120) the identified storage devices 10 to operate at a lower power cap. For example, the local service processor 50 may instruct each of the identified storage devices 10 to change power states to a power state having a lower maximum power rating (e.g., change from PowerState 2 to PowerState 5). This may be done based on a power policy that is implemented by the local service processor 50 (and is, e.g., defined by the system administrator 308), which associates each power state to a range of actual power consumption.
  • According to some embodiments, the local service provider 50 identifies (S122) which storage devices 10 consume power at a level less than a threshold power level. In some examples, the threshold may be set at 75% of maximum power, which may be 25 W, or 75 W, etc., depending on the kind of PSUs and/or power connectors used.
  • In some embodiments, the local power processor 50 obtains the actual power consumption of each storage device 10, which is measured by the corresponding PMU 14, by querying/retrieving the power log 70 from the storage device 10. The local service provider 50 then compares the actual power consumption with threshold power level to determine if consumed power of the storage device 10 is below the threshold power level. The local service provider 50 then dynamically instructs the identified storage devices 10 to operate at a power cap corresponding to the first level (e.g., at 75% or 80% of maximum power), as opposed to the default power cap of 100% maximum power. Because the power efficiency of a PSU drops as it reaches its maximum load capacity, lowering the power cap of the storage devices 10 may bring down the overall power usage of the storage bank 302, thus allowing the PSU to operate at a lower power level and at a higher (e.g., peak) power efficiency range. This may be particularly desirable in large data centers, where overall power usage and cooling is a great concern.
  • In some examples, the local service provider 50 may dynamically instruct each of the identified storage devices 10 to operate at a lower power cap by instructing them to change their power state to one where the maximum power corresponds to (e.g., is at or less than) the threshold power level (e.g., the power states may be changed from PowerState 0 to PowerState 1).
  • In some embodiments, the local service processor 50 identifies (S126) which storage device slots are empty (i.e., not occupied by, or connected to, any storage device 10). In some examples, each storage device 10 may have a presence pin on the slot connector 15, which is used by the local service processor 50 to determine whether the slot is empty or occupied by a storage device 10. If any of the empty slots have corresponding PMUs 14 that are external to (i.e., not integrated with, and outside of) their corresponding storage devices 10 (e.g., PMUs that may be at a power distribution board or at a mid-plane of the storage chassis), the local service processor 50 instructs (S128) these PMUs 14 to operate at lower power caps (e.g., operate at the lowest power state, PowerState 31) or to be disabled/deactivated altogether. This allows the storage bank 302 to eliminate or reduce unnecessary power usage.
  • While operations S118-S120, S122-S124, and S126-S128 are ordered in a particular sequence in FIG. 15, embodiments of the present invention are not limited thereto. For example, the operations S118-S120 can be performed after either or both of operations S122-S124 and S126-S128, and operations S126-S128 may be performed before either or both of operations S118-S120 and S122-S124.
  • The operations performed by the local service processor 50 (e.g., processes 400 and 500) may be described in terms of a software routine executed by one or more processors in the local service processor 50 based on computer program instructions stored in memory. A person of skill in the art should recognize, however, that the routine may be executed via hardware, firmware (e.g., via an ASIC), or a combination of software, firmware, and/or hardware. Furthermore, the sequence of steps of the process is not fixed, but may be altered into any desired sequence as recognized by a person of skill in the art.
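
As an illustration only, operations S118-S120, S122-S124, and S126-S128 might be sketched in Python as follows. This is not the implementation described in the specification: the Slot record, the apply_power_policy function, the 2 W idle level, and the choice of target power states are all hypothetical. The sketch also assumes that a higher-numbered power state has a lower maximum power rating, consistent with the PowerState 2 to PowerState 5 and PowerState 31 examples above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical, programmable policy values (not taken from the specification).
IDLE_POWER_W = 2.0         # idle power level used in S118
THRESHOLD_FRACTION = 0.75  # threshold as a fraction of maximum power (S122)
IDLE_STATE = 5             # target power state for idle devices
REDUCED_STATE = 1          # target power state for lightly loaded devices
LOWEST_STATE = 31          # lowest power state, used for PMUs of empty slots


@dataclass
class Slot:
    present: bool            # presence pin: True if a storage device occupies the slot
    external_pmu: bool       # True if the PMU sits outside the storage device
    measured_w: float = 0.0  # actual power consumption read from the power log
    max_w: float = 25.0      # maximum power of the device
    power_state: int = 0     # current device power state
    pmu_state: int = 0       # current power state of the PMU itself


def apply_power_policy(slots: List[Slot]) -> List[Slot]:
    """Lower power caps for idle devices, lightly loaded devices, and empty slots."""
    for slot in slots:
        if not slot.present:
            # S126-S128: an empty slot with an external PMU needs no power budget.
            if slot.external_pmu:
                slot.pmu_state = LOWEST_STATE
            continue
        if slot.measured_w <= IDLE_POWER_W:
            # S118-S120: idle device, move to a state with a lower maximum power.
            slot.power_state = max(slot.power_state, IDLE_STATE)
        elif slot.measured_w < THRESHOLD_FRACTION * slot.max_w:
            # S122-S124: below the threshold, cap the device at the reduced level.
            slot.power_state = max(slot.power_state, REDUCED_STATE)
    return slots
```

Using max() keeps the sketch from ever raising a cap that is already lower than the target, which matches the stated goal of only moving devices to lower power caps. The three branches are independent of one another, so their order can be changed, as noted above.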

Claims (1)

What is claimed is:
1. A storage system comprising:
one or more storage devices;
a plurality of power supplies configured to supply power to the storage device;
a processor; and
a memory having stored thereon instructions that, when executed by the processor, cause the processor to perform:
determining whether multiple power supplies of the plurality of power supplies are active;
in response to determining that multiple power supplies are active:
determining a total power consumption of the one or more storage devices;
in response to determining that the total power consumption is less than a first percentage threshold of a load of active ones of the power supplies,
deactivating one or more of the active ones of the power supplies until the total power consumption is equal to or greater than the first percentage threshold of a load of each of the active ones of the power supplies; and
in response to determining that the total power consumption is equal to or greater than a second percentage threshold of a load of each of the active ones of the power supplies,
activating one or more of the deactivated ones of the power supplies until the total power consumption is less than the second percentage threshold of the load of each of the active ones of the power supplies.
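
For illustration only, the deactivation and activation steps recited above might be sketched in Python as follows. The sketch rests on several assumptions that are not stated in the claim: the active power supplies are identical and share the load equally, each "percentage threshold" is read as a fraction of a single supply's rated capacity, and at least one supply always stays active. The function name, parameter names, and the 40% and 90% default thresholds are hypothetical.

```python
def rebalance_power_supplies(total_w: float, active: int, installed: int,
                             capacity_w: float,
                             low_pct: float = 0.40,
                             high_pct: float = 0.90) -> int:
    """Return how many power supplies should be active for the given total load."""
    if active < 2:
        # The recited steps apply only when multiple power supplies are active.
        return active
    # Deactivate supplies while each active supply is loaded below the first threshold.
    while active > 1 and total_w / active < low_pct * capacity_w:
        active -= 1
    # Activate supplies while each active supply is loaded at or above the second threshold.
    while active < installed and total_w / active >= high_pct * capacity_w:
        active += 1
    return active
```

With four installed 1000 W supplies, a 1500 W total load would leave three supplies active under these assumptions, each carrying 500 W, which lies between the two hypothetical thresholds.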

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/055,331 US20230071775A1 (en) 2018-03-02 2022-11-14 Method and apparatus for performing power analytics of a storage system

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201862638035P 2018-03-02 2018-03-02
US15/975,463 US11481016B2 (en) 2018-03-02 2018-05-09 Method and apparatus for self-regulating power usage and power consumption in ethernet SSD storage systems
US201862713466P 2018-08-01 2018-08-01
US16/167,306 US11500439B2 (en) 2018-03-02 2018-10-22 Method and apparatus for performing power analytics of a storage system
US18/055,331 US20230071775A1 (en) 2018-03-02 2022-11-14 Method and apparatus for performing power analytics of a storage system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/167,306 Continuation US11500439B2 (en) 2018-03-02 2018-10-22 Method and apparatus for performing power analytics of a storage system

Publications (1)

Publication Number Publication Date
US20230071775A1 true US20230071775A1 (en) 2023-03-09

Family

ID=67768612

Family Applications (4)

Application Number Title Priority Date Filing Date
US16/167,306 Active US11500439B2 (en) 2018-03-02 2018-10-22 Method and apparatus for performing power analytics of a storage system
US17/112,933 Pending US20210089102A1 (en) 2018-03-02 2020-12-04 Method and apparatus for performing power analytics of a storage system
US17/233,303 Pending US20210232198A1 (en) 2018-03-02 2021-04-16 Method and apparatus for performing power analytics of a storage system
US18/055,331 Pending US20230071775A1 (en) 2018-03-02 2022-11-14 Method and apparatus for performing power analytics of a storage system

Country Status (3)

Country Link
US (4) US11500439B2 (en)
KR (1) KR102385766B1 (en)
CN (1) CN110221946A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050028017A1 (en) * 2003-07-29 2005-02-03 Gopalakrishnan Janakiraman Supplying power to at least one electrical device based on an efficient operating point of a power supply
US20170187186A1 (en) * 2014-09-15 2017-06-29 Sma Solar Technology Ag Method and apparatus for the operation of a power station of fluctuating performance connected, besides a system former and at least one load, to a limited ac system
US11500439B2 (en) * 2018-03-02 2022-11-15 Samsung Electronics Co., Ltd. Method and apparatus for performing power analytics of a storage system

Also Published As

Publication number Publication date
CN110221946A (en) 2019-09-10
KR102385766B1 (en) 2022-04-12
US20190272012A1 (en) 2019-09-05
KR20190104867A (en) 2019-09-11
US11500439B2 (en) 2022-11-15
US20210089102A1 (en) 2021-03-25
US20210232198A1 (en) 2021-07-29

Legal Events

Date Code Title Description
AS Assignment

Owner name: SAMSUNG ELECTRONICS CO., LTD., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KACHARE, RAMDAS P.;WU, WENTAO;OLARIG, SOMPONG PAUL;REEL/FRAME:061832/0123

Effective date: 20181019

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED