WO2021169259A1 - Dynamic power management system - Google Patents

Dynamic power management system

Info

Publication number
WO2021169259A1
Authority
WO
WIPO (PCT)
Prior art keywords
bmc
cmc
psu
signal
cpld
Prior art date
Application number
PCT/CN2020/117020
Other languages
English (en)
French (fr)
Inventor
韩齐
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2021169259A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26: Power supply means, e.g. regulation thereof
    • G06F1/266: Arrangements to supply power to external peripherals either directly from the computer or under computer control, e.g. supply of power through the communication port, computer controlled power-strips
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26: Power supply means, e.g. regulation thereof
    • G06F1/30: Means for acting in the event of power-supply failure or interruption, e.g. power-supply fluctuations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16: Error detection or correction of the data by redundancy in hardware
    • G06F11/20: Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2015: Redundant power supplies

Definitions

  • The invention relates to the technical field of power management, and in particular to a dynamic power management system.
  • High-density servers have the advantages of a small footprint and a more centralized system.
  • In a high-density server, multiple server nodes in the same chassis share power supplies and fans.
  • More processors and more I/O (Input/Output) expansion capability are integrated into a smaller physical space. This places higher demands on the power supply, and multiple power supplies must be used.
  • Power supplies in high-density servers usually adopt a redundant design. For example, when four power supplies are configured, 3+1 redundancy is supported: if one power supply fails, the remaining three can still meet the power requirements of the entire system. Because high-density server configurations are flexible, the redundancy design becomes more complicated.
  • In the prior-art BMC-based cold-redundancy control scheme, the BMC drives the CPU throttling protection pin to trigger CPU frequency reduction, which quickly lowers whole-system power consumption and avoids a system power overload caused by abnormal output of a redundant power supply.
  • The scheme also controls the CPU or the system node management controller so that whole-system power consumption does not exceed a preset value. This realizes cold-redundancy control of the server power supplies, meets customers' cold-redundancy specification requirements in high-configuration server applications, and reduces power supply design complexity and cost.
  • An embodiment of the present invention provides a dynamic power management system in which a BMC and a CPLD jointly implement the control logic. It addresses the prior-art problems that redundancy across power supplies of different specifications cannot be managed and that power supply alarms cannot be answered quickly, and it makes the control method more flexible.
  • the present invention provides a dynamic power management system, including two or more PSUs.
  • The system also includes a mid-backplane, CMC boards, and nodes (Node).
  • The CMC boards are a redundant design. A CMC board performs presence detection and power consumption detection on the PSUs and controls the Power capping trigger mechanism. Each node limits its own power consumption, and the mid-backplane supplies power to the system.
  • The system presets one of the CMC boards as the master CMC board and the other as the slave CMC board; the slave CMC board is the redundant board.
  • When the system is powered on, both the master CMC board and the slave CMC board perform presence detection and power consumption detection on the PSUs. If there is an abnormal PSU, the master CMC board controls the Power capping trigger mechanism.
  • If the master CMC board fails, the system promotes the slave CMC board to master CMC board.
  • Each CMC board includes a first BMC and a first CPLD.
  • The first BMC is connected to all PSUs and all nodes.
  • The first CPLD is connected to all PSUs and all nodes.
  • The first CPLD is connected to the first BMC.
  • The first BMC performs presence detection and power consumption detection on the PSUs.
  • The first CPLD controls the Power capping trigger mechanism.
  • The GPIO module of the first BMC is connected to the presence signal PSU_PRESENT of each PSU, the I2C module of the first BMC is connected to the PMBUS signal of each PSU, the GPIO_OVERRIDE signal of the first BMC is connected to the first CPLD, and the CMC_I2C signal of the first BMC is connected to each node.
  • When the system is powered on, the first BMC uses the presence signal PSU_PRESENT to determine whether each PSU is present. If a PSU is present, its alarm signal PSU_Alert is low; if a PSU is abnormal, its alarm signal PSU_Alert is high.
  • The first BMC reads the power consumption of the present PSUs through the PMBUS signal and determines whether the sum of their power consumption is greater than the system power consumption threshold preset by the BMC. If so, the GPIO_OVERRIDE signal output by the first BMC is low and the Power capping mechanism is not triggered. If not, the GPIO_OVERRIDE signal output by the first BMC is high and the Power capping mechanism is triggered.
  • The GPIO module of the first CPLD is connected to the GPIO_OVERRIDE signal of the first BMC and to the alarm signal PSU_Alert of each PSU. All PSU_Alert signals feed the inputs of NAND gate 1 inside the CPLD.
  • The output of NAND gate 1 and the GPIO_OVERRIDE signal feed the inputs of NAND gate 2 inside the CPLD, and the output of NAND gate 2 is connected to each node as the CMC_CAPPING_N signal of the first CPLD.
  • When there is an abnormal PSU and the first BMC determines that the sum of the power consumption of the present PSUs is greater than the system power consumption threshold preset by the BMC, the GPIO_OVERRIDE signal output by the first BMC is low, the CMC_CAPPING_N output of the first CPLD is high, and the Power capping mechanism is not triggered. Otherwise, the CMC_CAPPING_N output of the first CPLD is low and the Power capping mechanism is triggered.
  • Each node includes a second BMC, a second CPLD, a CPU, and a PCH.
  • The GPIO module of the second CPLD is connected to the CMC_CAPPING_N signal of the first CPLD.
  • The PROC_Hot_Throttle and MEM_Hot_Throttle signals of the second CPLD are connected to the CPU.
  • The BMC_GPIO_Throttle signal of the second CPLD is connected to the second BMC.
  • The I2C module of the second BMC is connected to the CMC_I2C signal.
  • The BMC_PCH_ME signal of the second BMC is connected to the PCH.
  • The PCH_ME_N signal of the PCH is connected to the CPU.
  • After the Power capping mechanism is triggered, the second CPLD receives the low level of the CMC_CAPPING_N signal of the first CPLD. The second CPLD then sends the PROC_Hot_Throttle and MEM_Hot_Throttle signals to the CPU and sends the BMC_GPIO_Throttle signal to the second BMC.
  • After the CPU receives the PROC_Hot_Throttle and MEM_Hot_Throttle signals, it enters the hardware low-power mode.
  • After the second BMC receives the BMC_GPIO_Throttle signal, it learns through the CMC_I2C signal that the sum of the power consumption of the present PSUs is less than the system power consumption threshold preset by the BMC, and it informs the PCH through the BMC_PCH_ME signal that the PSU power consumption is low.
  • The ME module of the PCH informs the CPU through the PCH_ME_N signal to limit its power consumption.
  • each PSU in the system is any one of 800W, 1300W, 1600W or 2000W, and at least one PSU in the system is a redundant power supply.
  • The dynamic power management system uses a BMC and a CPLD to jointly implement the control logic.
  • When the system is powered on, the first BMC on the CMC board determines through the presence signal PSU_PRESENT whether each PSU is present. If a PSU is present, the first BMC reads its power consumption through the PMBUS signal and determines whether the sum of the power consumption of the present PSUs is greater than the system power consumption threshold preset by the BMC. If so, the Power capping mechanism is not triggered; if not, it is triggered. After the Power capping mechanism is triggered, the second CPLD on the node receives the low level of the CMC_CAPPING_N signal of the first CPLD and the CPU enters the hardware low-power mode. The second BMC on the node learns through the CMC_I2C signal that the sum of the power consumption of the present PSUs is less than the system power consumption threshold preset by the BMC, negotiates a power reduction with the PCH, and notifies the CPU to limit its power consumption.
  • The dynamic power management system provided by the present invention uses a redundant design of two CMC boards.
  • The hardware structure and external connection structure of the two CMC boards are the same.
  • The system presets one of the CMC boards as the master CMC board and the other as the slave CMC board; the slave CMC board is the redundant board.
  • The redundant design of the CMC boards improves the reliability and stability of the system.
  • FIG. 1 is a structural block diagram of a BMC-based server power supply cold-redundancy control scheme in the prior art.
  • FIG. 2 is a block diagram of the system structure provided by an embodiment of the present invention.
  • FIG. 3 is a structural block diagram of a CMC board connection provided by an embodiment of the present invention.
  • FIG. 4 is another structural block diagram of the CMC board connection provided by an embodiment of the present invention.
  • FIG. 5 is a structural block diagram of a Node connection provided by an embodiment of the present invention.
  • FIG. 6 is another structural block diagram of a Node connection provided by an embodiment of the present invention.
  • As shown in FIG. 2, the system includes two or more PSUs (Power Supply Units), a mid-backplane, CMC (Chassis Management Controller) boards, and nodes (Node). There are two CMC boards and one or more nodes. The PSUs are mounted on the mid-backplane and are connected in parallel for external output. Each CMC board is connected to all PSUs on the mid-backplane through a CMC connector, and each node is connected to the mid-backplane through a node connector.
  • The CMC boards are a redundant design.
  • A CMC board performs presence detection and power consumption detection on the PSUs.
  • A CMC board controls the Power capping trigger mechanism.
  • Each node limits its own power consumption, and the mid-backplane supplies power to the system.
  • The hardware structure and external connection structure of the two CMC boards are the same.
  • The system presets one of the CMC boards as the master CMC board and the other as the slave CMC board; the slave CMC board is the redundant board.
  • When the system is powered on, both the master CMC board and the slave CMC board perform presence detection and power consumption detection on the PSUs. If there is an abnormal PSU, the master CMC board controls the Power capping trigger mechanism.
  • FIG. 3 is a structural block diagram of the CMC board connection provided by an embodiment of the present invention.
  • Each CMC board includes a first BMC (Baseboard Management Controller; model AST2500A2-GP in this embodiment) and a first CPLD (Complex Programmable Logic Device; model LCMXO2-2000HC-4FTG256I in this embodiment).
  • The GPIO (general-purpose input/output) module of the first BMC is connected to the presence signal PSU_PRESENT of each PSU.
  • The I2C (Inter-Integrated Circuit, a two-wire serial bus) module of the first BMC is connected to the PMBUS signal of each PSU.
  • The GPIO_OVERRIDE signal of the first BMC is connected to the GPIO module of the first CPLD, the CMC_I2C signal of the first BMC is connected to each node, and the GPIO module of the first CPLD is connected to the alarm signal PSU_Alert of each PSU.
  • All PSU_Alert signals feed the inputs of NAND gate 1. The output of NAND gate 1 and the GPIO_OVERRIDE signal feed the inputs of NAND gate 2, and the output of NAND gate 2 is connected to each node as the CMC_CAPPING_N signal of the first CPLD.
  • The working principle of the CMC board is as follows.
  • When the system is powered on, the first BMC on the CMC board determines through the presence signal PSU_PRESENT whether each PSU is present. If all PSUs are present, the PSU alarm signal PSU_Alert is low and the first BMC reads the power consumption of each PSU through the PMBUS signal.
  • When there is an abnormal PSU, its alarm signal PSU_Alert is high, and the first BMC determines whether the sum of the power consumption of the PSUs in the system is greater than the system power consumption threshold preset by the BMC. If so, the GPIO_OVERRIDE signal output by the first BMC is low, the CMC_CAPPING_N output of the first CPLD is high, and the Power capping mechanism is not triggered. If not, the GPIO_OVERRIDE signal output by the first BMC is high, the CMC_CAPPING_N output of the first CPLD is low, and the Power capping mechanism is triggered.
  • FIG. 4 is another structural block diagram of the CMC board connection provided by an embodiment of the present invention. This structure has 8 nodes connected in parallel in the system.
  • The management control information and feedback of the 8 nodes are exchanged with the CMC board through the CMC_I2C_N* signal of each node.
  • FIG. 5 is a structural block diagram of the node connection provided by an embodiment of the present invention.
  • Each node includes a second BMC, a second CPLD, a CPU (central processing unit), and a PCH (Platform Controller Hub, Intel's integrated south bridge; model EY82C627 in this embodiment).
  • The GPIO module of the second CPLD is connected to the CMC_CAPPING_N signal of the first CPLD.
  • The PROC_Hot_Throttle and MEM_Hot_Throttle signals of the second CPLD are connected to the CPU.
  • The BMC_GPIO_Throttle signal of the second CPLD is connected to the second BMC.
  • The I2C module of the second BMC is connected to the CMC_I2C signal.
  • The BMC_PCH_ME signal of the second BMC is connected to the PCH.
  • The PCH_ME_N signal of the PCH is connected to the CPU.
  • The working principle of the node is as follows. After the Power capping mechanism is triggered, the second CPLD receives the low level of the CMC_CAPPING_N signal of the first CPLD and sends the PROC_Hot_Throttle and MEM_Hot_Throttle signals to the CPU.
  • The second CPLD sends the BMC_GPIO_Throttle signal to the second BMC.
  • After the CPU receives the PROC_Hot_Throttle and MEM_Hot_Throttle signals, it enters the hardware low-power mode.
  • After the second BMC receives the BMC_GPIO_Throttle signal, it learns through the CMC_I2C signal that the sum of the power consumption of the PSUs in the system is less than the system power consumption threshold preset by the BMC. The second BMC informs the PCH through the BMC_PCH_ME signal that the PSU power consumption is low, and the ME module of the PCH informs the CPU through the PCH_ME_N signal to limit its power consumption.
  • FIG. 6 is another structural block diagram of the node connection provided by an embodiment of the present invention.
  • In this structure, the two CMC boards are connected to the nodes in exactly the same way, and the two CMC boards form a redundant structure.
  • At least one PSU in the system is a redundant power supply, and the power rating of each PSU is any one of 800W, 1300W, 1600W or 2000W.
  • The invention also includes a fan board, which plugs into the mid-backplane and controls the fans and the network.

Abstract

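A dynamic power management system includes two or more PSUs. The system also includes a mid-backplane, CMC boards, and nodes (Node). There are two CMC boards and one or more nodes. The PSUs are mounted on the mid-backplane and are connected in parallel for external output. Each CMC board is connected to all PSUs on the mid-backplane through a CMC connector, and each node is connected to the mid-backplane through a node connector. The CMC boards are a redundant design: a CMC board performs presence detection and power consumption detection on the PSUs and controls the Power capping trigger mechanism, each node limits its own power consumption, and the mid-backplane supplies power to the system. The invention uses a BMC and a CPLD to jointly implement the control logic. It addresses the prior-art problems that redundancy across power supplies of different specifications cannot be managed and that power supply alarms cannot be answered quickly, and it makes the control method more flexible.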

Description

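Dynamic power management system
This application claims priority to Chinese patent application No. 202010132986.3, entitled "Dynamic power management system", filed with the China Patent Office on February 29, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of power management, and in particular to a dynamic power management system.
Background
With the development of technology, servers are used more and more widely. High-density servers have advantages such as a small footprint and a more centralized system. In a high-density server, multiple server nodes in the same chassis share power supplies and fans, and more processors and I/O (Input/Output) expansion capability are integrated into a smaller physical space. This places higher demands on the power supply, and multiple power supplies must be used.
To improve server reliability, power supplies in high-density servers usually adopt a redundant design. For example, when four power supplies are configured, 3+1 redundancy is supported: if one power supply fails, the remaining three can still meet the power requirements of the entire system. Because high-density server configurations are flexible, the redundancy design becomes more complicated.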
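The 3+1 example can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the patent: the function name `survives_single_failure` and the worst-case rule (assume the largest PSU is the one that fails) are illustrative assumptions.

```python
def survives_single_failure(psu_ratings_w, system_load_w):
    """Return True if the load is still covered after any one PSU fails.

    Assumption (not from the patent): the worst case is losing the PSU
    with the largest rating, so that is the case checked here.
    """
    if len(psu_ratings_w) < 2:
        # With fewer than two PSUs there is nothing left after a failure.
        return False
    remaining_w = sum(psu_ratings_w) - max(psu_ratings_w)
    return remaining_w >= system_load_w


# Four 1600 W supplies in a 3+1 arrangement leave 4800 W after one failure.
print(survives_single_failure([1600, 1600, 1600, 1600], 4800))
```

Mixed ratings make the same check less obvious, which is one way to read the remark that flexible configurations complicate the redundancy design.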
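FIG. 1 shows the current BMC-based server power supply cold-redundancy control scheme. The BMC drives the CPU throttling protection pin to trigger CPU frequency reduction, which quickly lowers whole-system power consumption and avoids a system power overload caused by abnormal output of a redundant power supply. It also controls the CPU or the system node management controller so that whole-system power consumption does not exceed a preset value. This realizes cold-redundancy control of the server power supplies, meets customers' cold-redundancy specification requirements in high-configuration server applications, and reduces power supply design complexity and cost.
In the prior art, a method that implements server power supply redundancy control through the BMC alone cannot manage redundancy across power supplies of different specifications in a high-density server, cannot respond quickly to power supply alarms, and is not flexible enough.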
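Summary of the Invention
An embodiment of the present invention provides a dynamic power management system in which a BMC and a CPLD jointly implement the control logic. It addresses the prior-art problems that redundancy across power supplies of different specifications cannot be managed and that power supply alarms cannot be answered quickly, and it makes the control method more flexible.
The embodiments of the present invention disclose the following technical solutions:
The present invention provides a dynamic power management system including two or more PSUs. The system also includes a mid-backplane, CMC boards, and nodes (Node). There are two CMC boards and one or more nodes. The PSUs are mounted on the mid-backplane and are connected in parallel for external output. Each CMC board is connected to all PSUs on the mid-backplane through a CMC connector, and each node is connected to the mid-backplane through a node connector;
The CMC boards are a redundant design. A CMC board performs presence detection and power consumption detection on the PSUs and controls the Power capping trigger mechanism. Each node limits its own power consumption, and the mid-backplane supplies power to the system.
Further, the hardware structure and external connection structure of the two CMC boards are the same. The system presets one of the CMC boards as the master CMC board and the other as the slave CMC board; the slave CMC board is the redundant board;
When the system is powered on, both the master CMC board and the slave CMC board perform presence detection and power consumption detection on the PSUs. If there is an abnormal PSU, the master CMC board controls the Power capping trigger mechanism.
Further, if the master CMC board fails, the system promotes the slave CMC board to master CMC board.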
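The master/slave arrangement can be illustrated with a small selection routine. This is only a sketch under stated assumptions: the text does not say how a board failure is detected, so the `healthy` flag, the `CmcBoard` class, and the `select_master` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CmcBoard:
    name: str
    healthy: bool = True  # hypothetical health flag; detection method is not in the text


def select_master(preset_master: CmcBoard, slave: CmcBoard) -> Optional[CmcBoard]:
    """Pick the board that should control the Power capping trigger.

    The preset master keeps the role while it is healthy. If it fails,
    the slave (the redundant board) is promoted. Returns None if both
    boards have failed.
    """
    if preset_master.healthy:
        return preset_master
    if slave.healthy:
        return slave
    return None


cmc_a = CmcBoard("CMC-A")
cmc_b = CmcBoard("CMC-B")
cmc_a.healthy = False               # simulate a master failure
print(select_master(cmc_a, cmc_b).name)
```

Because both boards run presence and power detection from power-on, the promoted board would already hold current PSU state, which is presumably why the text has the slave monitoring in parallel rather than idling.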
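Further, each CMC board includes a first BMC and a first CPLD. The first BMC is connected to all PSUs and all nodes, the first CPLD is connected to all PSUs and all nodes, and the first CPLD is connected to the first BMC. The first BMC performs presence detection and power consumption detection on the PSUs, and the first CPLD controls the Power capping trigger mechanism.
Further, the GPIO module of the first BMC is connected to the presence signal PSU_PRESENT of each PSU, the I2C module of the first BMC is connected to the PMBUS signal of each PSU, the GPIO_OVERRIDE signal of the first BMC is connected to the first CPLD, and the CMC_I2C signal of the first BMC is connected to each node;
When the system is powered on, the first BMC determines through the presence signal PSU_PRESENT whether each PSU is present. If a PSU is present, its alarm signal PSU_Alert is low; if a PSU is abnormal, its alarm signal PSU_Alert is high;
The first BMC reads the power consumption of the present PSUs through the PMBUS signal and determines whether the sum of their power consumption is greater than the system power consumption threshold preset by the BMC. If so, the GPIO_OVERRIDE signal output by the first BMC is low and the Power capping mechanism is not triggered. If not, the GPIO_OVERRIDE signal output by the first BMC is high and the Power capping mechanism is triggered.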
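The first BMC's decision can be written as a short function. This is a minimal sketch, assuming the presence bits and PMBUS power readings are already available as Python lists; the function name `gpio_override_level` and its parameters are illustrative, and the signal polarity is copied from the text as written rather than inferred.

```python
def gpio_override_level(psu_present, psu_power_w, threshold_w):
    """Return the GPIO_OVERRIDE level (0 = low, 1 = high).

    psu_present: list of bools, one per PSU slot (from PSU_PRESENT).
    psu_power_w: list of power readings in watts (as read over PMBUS).
    threshold_w: system power consumption threshold preset by the BMC.

    Only PSUs that are present contribute to the sum. The polarity
    follows the text: sum above the threshold gives low (no capping),
    otherwise high (capping triggered).
    """
    total_w = sum(p for present, p in zip(psu_present, psu_power_w) if present)
    return 0 if total_w > threshold_w else 1


# Slot 3 is empty, so its reading is ignored: 500 + 600 = 1100 W.
print(gpio_override_level([True, True, False], [500, 600, 999], 1000))
```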
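Further, the GPIO module of the first CPLD is connected to the GPIO_OVERRIDE signal of the first BMC and to the alarm signal PSU_Alert of each PSU. All PSU_Alert signals feed the inputs of NAND gate 1 inside the CPLD. The output of NAND gate 1 and the GPIO_OVERRIDE signal feed the inputs of NAND gate 2 inside the CPLD, and the output of NAND gate 2 is connected to each node as the CMC_CAPPING_N signal of the first CPLD;
When there is an abnormal PSU and the first BMC determines that the sum of the power consumption of the present PSUs is greater than the system power consumption threshold preset by the BMC, the GPIO_OVERRIDE signal output by the first BMC is low, the CMC_CAPPING_N output of the first CPLD is high, and the Power capping mechanism is not triggered. Otherwise, the CMC_CAPPING_N output of the first CPLD is low and the Power capping mechanism is triggered.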
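The two-gate wiring inside the first CPLD can be modeled directly. The sketch below implements only the gate network as literally described (all PSU_Alert lines into NAND gate 1, its output and GPIO_OVERRIDE into NAND gate 2); the helper names `nand` and `cmc_capping_n` are illustrative, and no behavior beyond the stated wiring is assumed.

```python
def nand(*inputs):
    """N-input NAND on 0/1 levels: output is 0 only when every input is 1."""
    return 0 if all(inputs) else 1


def cmc_capping_n(psu_alerts, gpio_override):
    """Model the CPLD gate network that produces CMC_CAPPING_N.

    psu_alerts: list of PSU_Alert levels (0 = low, 1 = high), one per PSU.
    gpio_override: GPIO_OVERRIDE level from the first BMC (0 or 1).
    """
    gate1_out = nand(*psu_alerts)          # NAND gate 1: all alert lines
    return nand(gate1_out, gpio_override)  # NAND gate 2: gate 1 output + override


# GPIO_OVERRIDE low forces CMC_CAPPING_N high (capping not triggered).
print(cmc_capping_n([0, 0, 0, 0], 0))
# GPIO_OVERRIDE high with no alerts raised drives CMC_CAPPING_N low.
print(cmc_capping_n([0, 0, 0, 0], 1))
```

Because NAND gate 2 outputs high whenever either input is low, a low GPIO_OVERRIDE keeps CMC_CAPPING_N high regardless of the alert lines, which matches the "not triggered" case in the text.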
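Further, each node includes a second BMC, a second CPLD, a CPU, and a PCH. The GPIO module of the second CPLD is connected to the CMC_CAPPING_N signal of the first CPLD, the PROC_Hot_Throttle and MEM_Hot_Throttle signals of the second CPLD are connected to the CPU, the BMC_GPIO_Throttle signal of the second CPLD is connected to the second BMC, the I2C module of the second BMC is connected to the CMC_I2C signal, the BMC_PCH_ME signal of the second BMC is connected to the PCH, and the PCH_ME_N signal of the PCH is connected to the CPU;
After the Power capping mechanism is triggered, the second CPLD receives the low level of the CMC_CAPPING_N signal of the first CPLD. The second CPLD then sends the PROC_Hot_Throttle and MEM_Hot_Throttle signals to the CPU and sends the BMC_GPIO_Throttle signal to the second BMC. After the CPU receives the PROC_Hot_Throttle and MEM_Hot_Throttle signals, it enters the hardware low-power mode;
After the second BMC receives the BMC_GPIO_Throttle signal, it learns through the CMC_I2C signal that the sum of the power consumption of the present PSUs is less than the system power consumption threshold preset by the BMC. The second BMC informs the PCH through the BMC_PCH_ME signal that the PSU power consumption is low, and the ME module of the PCH informs the CPU through the PCH_ME_N signal to limit its power consumption.
Further, the power rating of each PSU in the system is any one of 800W, 1300W, 1600W or 2000W, and at least one PSU in the system is a redundant power supply.
The effects described in this summary are only those of the embodiments, not all effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
1) The dynamic power management system provided by the present invention uses a BMC and a CPLD to jointly implement the control logic. When the system is powered on, the first BMC on the CMC board determines through the presence signal PSU_PRESENT whether each PSU is present. If a PSU is present, the first BMC reads its power consumption through the PMBUS signal and determines whether the sum of the power consumption of the present PSUs is greater than the system power consumption threshold preset by the BMC. If so, the Power capping mechanism is not triggered; if not, it is triggered. After the Power capping mechanism is triggered, the second CPLD on the node receives the low level of the CMC_CAPPING_N signal of the first CPLD and the CPU enters the hardware low-power mode. At the same time, the second BMC on the node learns through the CMC_I2C signal that the sum of the power consumption of the present PSUs is less than the system power consumption threshold preset by the BMC, negotiates a power reduction with the PCH, and notifies the CPU to limit its power consumption. The present invention thus manages redundancy across power supplies of different specifications, responds quickly to power supply alarms, and makes the control method more flexible.
2) The dynamic power management system provided by the present invention uses a redundant design of two CMC boards. The hardware structure and external connection structure of the two CMC boards are the same. The system presets one of the CMC boards as the master CMC board and the other as the slave CMC board; the slave CMC board is the redundant board. The redundant design of the CMC boards improves the reliability and stability of the system.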
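Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a structural block diagram of a BMC-based server power supply cold-redundancy control scheme in the prior art;
FIG. 2 is a block diagram of the system structure provided by an embodiment of the present invention;
FIG. 3 is a structural block diagram of a CMC board connection provided by an embodiment of the present invention;
FIG. 4 is another structural block diagram of the CMC board connection provided by an embodiment of the present invention;
FIG. 5 is a structural block diagram of a node (Node) connection provided by an embodiment of the present invention;
FIG. 6 is another structural block diagram of the node (Node) connection provided by an embodiment of the present invention.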
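Detailed Description
To explain the technical features of this solution clearly, the present invention is described in detail below through specific embodiments and with reference to the drawings. The following disclosure provides many different embodiments or examples for implementing different structures of the present invention. To simplify the disclosure, the components and arrangements of specific examples are described below. In addition, the present invention may repeat reference numerals and/or letters in different examples. This repetition is for simplicity and clarity and does not in itself indicate a relationship between the various embodiments and/or arrangements discussed. Note that the components illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components, processing techniques, and processes are omitted to avoid unnecessarily limiting the present invention.
FIG. 2 is a block diagram of the system structure provided by an embodiment of the present invention. The system includes two or more PSUs (Power Supply Units), a mid-backplane, CMC (Chassis Management Controller) boards, and nodes (Node). There are two CMC boards and one or more nodes. The PSUs are mounted on the mid-backplane and are connected in parallel for external output. Each CMC board is connected to all PSUs on the mid-backplane through a CMC connector, and each node is connected to the mid-backplane through a node connector.
The CMC boards are a redundant design. A CMC board performs presence detection and power consumption detection on the PSUs and controls the Power capping trigger mechanism. Each node limits its own power consumption, and the mid-backplane supplies power to the system.
The hardware structure and external connection structure of the two CMC boards are the same. The system presets one of the CMC boards as the master CMC board and the other as the slave CMC board; the slave CMC board is the redundant board;
When the system is powered on, both the master CMC board and the slave CMC board perform presence detection and power consumption detection on the PSUs. If there is an abnormal PSU, the master CMC board controls the Power capping trigger mechanism.
If the master CMC board fails, the system promotes the slave CMC board to master CMC board.
FIG. 3 is a structural block diagram of the CMC board connection provided by an embodiment of the present invention. Each CMC board includes a first BMC (Baseboard Management Controller; model AST2500A2-GP in this embodiment) and a first CPLD (Complex Programmable Logic Device; model LCMXO2-2000HC-4FTG256I in this embodiment). The GPIO (general-purpose input/output) module of the first BMC is connected to the presence signal PSU_PRESENT of each PSU. The I2C (Inter-Integrated Circuit, a two-wire serial bus) module of the first BMC is connected to the PMBUS signal of each PSU. The GPIO_OVERRIDE signal of the first BMC is connected to the GPIO module of the first CPLD, and the CMC_I2C signal of the first BMC is connected to each node. The GPIO module of the first CPLD is connected to the alarm signal PSU_Alert of each PSU. All PSU_Alert signals feed the inputs of NAND gate 1; the output of NAND gate 1 and the GPIO_OVERRIDE signal feed the inputs of NAND gate 2; and the output of NAND gate 2 is connected to each node as the CMC_CAPPING_N signal of the first CPLD.
The working principle of the CMC board is as follows:
When the system is powered on, the first BMC on the CMC board determines through the presence signal PSU_PRESENT whether each PSU is present. If all PSUs are present, the PSU alarm signal PSU_Alert is low and the first BMC reads the power consumption of each PSU through the PMBUS signal;
When there is an abnormal PSU, its alarm signal PSU_Alert is high, and the first BMC determines whether the sum of the power consumption of the PSUs in the system is greater than the system power consumption threshold preset by the BMC. If so, the GPIO_OVERRIDE signal output by the first BMC is low, the CMC_CAPPING_N output of the first CPLD is high, and the Power capping mechanism is not triggered. If not, the GPIO_OVERRIDE signal output by the first BMC is high, the CMC_CAPPING_N output of the first CPLD is low, and the Power capping mechanism is triggered.
FIG. 4 is another structural block diagram of the CMC board connection provided by an embodiment of the present invention. This structure has 8 nodes connected in parallel in the system. The management control information and feedback of the 8 nodes are exchanged with the CMC board through the CMC_I2C_N* signal of each node.
FIG. 5 is a structural block diagram of the node connection provided by an embodiment of the present invention. Each node includes a second BMC, a second CPLD, a CPU (central processing unit), and a PCH (Platform Controller Hub, Intel's integrated south bridge; model EY82C627 in this embodiment). The GPIO module of the second CPLD is connected to the CMC_CAPPING_N signal of the first CPLD, the PROC_Hot_Throttle and MEM_Hot_Throttle signals of the second CPLD are connected to the CPU, the BMC_GPIO_Throttle signal of the second CPLD is connected to the second BMC, the I2C module of the second BMC is connected to the CMC_I2C signal, the BMC_PCH_ME signal of the second BMC is connected to the PCH, and the PCH_ME_N signal of the PCH is connected to the CPU.
The working principle of the node is as follows:
After the Power capping mechanism is triggered, the second CPLD receives the low level of the CMC_CAPPING_N signal of the first CPLD. The second CPLD then sends the PROC_Hot_Throttle and MEM_Hot_Throttle signals to the CPU and sends the BMC_GPIO_Throttle signal to the second BMC. After the CPU receives the PROC_Hot_Throttle and MEM_Hot_Throttle signals, it enters the hardware low-power mode;
After the second BMC receives the BMC_GPIO_Throttle signal, it learns through the CMC_I2C signal that the sum of the power consumption of the PSUs in the system is less than the system power consumption threshold preset by the BMC. The second BMC informs the PCH through the BMC_PCH_ME signal that the PSU power consumption is low, and the ME module of the PCH informs the CPU through the PCH_ME_N signal to limit its power consumption.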
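The node-side sequence has a fast hardware path (CPLD to CPU) and a slower management path (BMC to PCH to CPU). The sketch below lists the steps in order as a plain trace. It is illustrative only: the function name `node_response` is hypothetical, and treating the "sum is less than the threshold" statement as a condition that gates the management path is an assumption, since the text states it as a fact the second BMC obtains.

```python
def node_response(cmc_capping_n, psu_power_sum_w, threshold_w):
    """Return the ordered list of actions a node takes for one capping event.

    cmc_capping_n: level of CMC_CAPPING_N seen by the second CPLD (0 = low).
    psu_power_sum_w: PSU power sum the second BMC obtains over CMC_I2C.
    threshold_w: system power consumption threshold preset by the BMC.
    """
    actions = []
    if cmc_capping_n != 0:
        # CMC_CAPPING_N is active low; a high level means no capping event.
        return actions

    # Hardware path: second CPLD throttles the CPU directly.
    actions.append("CPLD2 -> CPU: PROC_Hot_Throttle, MEM_Hot_Throttle")
    actions.append("CPU: enter hardware low-power mode")

    # Management path: second CPLD notifies the second BMC.
    actions.append("CPLD2 -> BMC2: BMC_GPIO_Throttle")
    if psu_power_sum_w < threshold_w:  # assumed gating condition
        actions.append("BMC2 -> PCH: BMC_PCH_ME (PSU power is low)")
        actions.append("PCH ME -> CPU: PCH_ME_N (limit power consumption)")
    return actions


for step in node_response(0, 3000, 4800):
    print(step)
```

Splitting the response this way lets the CPLD react at signal speed while the BMC and ME settle on a sustained power limit, which fits the stated goal of responding quickly to power supply alarms.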
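FIG. 6 is another structural block diagram of the node connection provided by an embodiment of the present invention. In this structure, the two CMC boards are connected to the nodes in exactly the same way, and the two CMC boards form a redundant structure.
At least one PSU in the system is a redundant power supply, and the power rating of each PSU is any one of 800W, 1300W, 1600W or 2000W.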
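These configuration constraints are simple to validate in software. The following sketch checks a proposed PSU configuration against the stated ratings and the requirement for at least one redundant supply; the function name `validate_psu_config` and the `redundant_count` parameter are illustrative assumptions, not names from the patent.

```python
ALLOWED_RATINGS_W = {800, 1300, 1600, 2000}  # ratings listed in the text


def validate_psu_config(ratings_w, redundant_count):
    """Return True if the configuration satisfies the stated constraints.

    ratings_w: rating in watts of each installed PSU.
    redundant_count: how many of those PSUs are treated as redundant.
    """
    if len(ratings_w) < 2:
        return False  # the system includes two or more PSUs
    if redundant_count < 1 or redundant_count >= len(ratings_w):
        return False  # at least one redundant PSU, and at least one working PSU
    return all(r in ALLOWED_RATINGS_W for r in ratings_w)


print(validate_psu_config([1600, 1600, 1600, 1600], 1))
print(validate_psu_config([1600, 1200], 1))
```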
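The present invention further includes a fan board, which plugs into the mid-backplane and controls the fans and the network. The above is only a preferred embodiment of the present invention. A person of ordinary skill in the art may make improvements and refinements without departing from the principle of the present invention, and such improvements and refinements also fall within the scope of protection of the present invention.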

Claims (8)

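  1. A dynamic power management system, characterized in that the system includes two or more PSUs and further includes a mid-backplane, CMC boards, and nodes (Node); there are two CMC boards and one or more nodes; the PSUs are mounted on the mid-backplane and are connected in parallel for external output; each CMC board is connected to all PSUs on the mid-backplane through a CMC connector, and each node is connected to the mid-backplane through a node connector;
    the CMC boards are a redundant design, a CMC board is used to perform presence detection and power consumption detection on the PSUs and to control the Power capping trigger mechanism, each node is used to limit the power consumption of the node, and the mid-backplane is used to supply power to the system.
  2. The dynamic power management system according to claim 1, characterized in that the hardware structure and external connection structure of the two CMC boards are the same, the system presets one of the CMC boards as the master CMC board and the other as the slave CMC board, and the slave CMC board is the redundant board;
    when the system is powered on, the master CMC board is used to perform presence detection and power consumption detection on the PSUs, the slave CMC board is used to perform presence detection and power consumption detection on the PSUs, and if there is an abnormal PSU, the master CMC board is used to control the Power capping trigger mechanism.
  3. The dynamic power management system according to claim 2, characterized in that if the master CMC board fails, the system promotes the slave CMC board to master CMC board.
  4. The dynamic power management system according to claim 2, characterized in that each CMC board includes a first BMC and a first CPLD, the first BMC is connected to all PSUs and all nodes, the first CPLD is connected to all PSUs and all nodes, and the first CPLD is connected to the first BMC; the first BMC is used to perform presence detection and power consumption detection on the PSUs, and the first CPLD is used to control the Power capping trigger mechanism.
  5. The dynamic power management system according to claim 4, characterized in that the GPIO module of the first BMC is connected to the presence signal PSU_PRESENT of each PSU, the I2C module of the first BMC is connected to the PMBUS signal of each PSU, the GPIO_OVERRIDE signal of the first BMC is connected to the first CPLD, and the CMC_I2C signal of the first BMC is connected to each node;
    when the system is powered on, the first BMC determines through the presence signal PSU_PRESENT whether each PSU is present; if a PSU is present, its alarm signal PSU_Alert is low, and if a PSU is abnormal, its alarm signal PSU_Alert is high;
    the first BMC reads the power consumption of the present PSUs through the PMBUS signal and determines whether the sum of their power consumption is greater than the system power consumption threshold preset by the BMC; if so, the GPIO_OVERRIDE signal output by the first BMC is low and the Power capping mechanism is not triggered; if not, the GPIO_OVERRIDE signal output by the first BMC is high and the Power capping mechanism is triggered.
  6. The dynamic power management system according to claim 4, characterized in that the GPIO module of the first CPLD is connected to the GPIO_OVERRIDE signal of the first BMC and to the alarm signal PSU_Alert of each PSU; all PSU_Alert signals feed the inputs of NAND gate 1 inside the CPLD, the output of NAND gate 1 and the GPIO_OVERRIDE signal feed the inputs of NAND gate 2 inside the CPLD, and the output of NAND gate 2 is connected to each node as the CMC_CAPPING_N signal of the first CPLD;
    when there is an abnormal PSU and the first BMC determines that the sum of the power consumption of the present PSUs is greater than the system power consumption threshold preset by the BMC, the GPIO_OVERRIDE signal output by the first BMC is low, the CMC_CAPPING_N output of the first CPLD is high, and the Power capping mechanism is not triggered; otherwise, the CMC_CAPPING_N output of the first CPLD is low and the Power capping mechanism is triggered.
  7. The dynamic power management system according to claim 1, characterized in that each node includes a second BMC, a second CPLD, a CPU, and a PCH; the GPIO module of the second CPLD is connected to the CMC_CAPPING_N signal of the first CPLD, the PROC_Hot_Throttle and MEM_Hot_Throttle signals of the second CPLD are connected to the CPU, the BMC_GPIO_Throttle signal of the second CPLD is connected to the second BMC, the I2C module of the second BMC is connected to the CMC_I2C signal, the BMC_PCH_ME signal of the second BMC is connected to the PCH, and the PCH_ME_N signal of the PCH is connected to the CPU;
    after the Power capping mechanism is triggered, the second CPLD receives the low level of the CMC_CAPPING_N signal of the first CPLD, the second CPLD sends the PROC_Hot_Throttle and MEM_Hot_Throttle signals to the CPU and sends the BMC_GPIO_Throttle signal to the second BMC, and after the CPU receives the PROC_Hot_Throttle and MEM_Hot_Throttle signals, the CPU enters the hardware low-power mode;
    after the second BMC receives the BMC_GPIO_Throttle signal, the second BMC learns through the CMC_I2C signal that the sum of the power consumption of the present PSUs is less than the system power consumption threshold preset by the BMC, the second BMC informs the PCH through the BMC_PCH_ME signal that the PSU power consumption is low, and the ME module of the PCH informs the CPU through the PCH_ME_N signal to limit its power consumption.
  8. The dynamic power management system according to claim 1, characterized in that the power rating of each PSU in the system is any one of 800W, 1300W, 1600W or 2000W, and at least one PSU in the system is a redundant power supply.
PCT/CN2020/117020 2020-02-29 2020-09-23 Dynamic power management system WO2021169259A1 (zh)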
PCT/CN2020/117020 2020-02-29 2020-09-23 一种动态电源管理系统 WO2021169259A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010132986.3A CN111367392B (zh) 2020-02-29 Dynamic power management system
CN202010132986.3 2020-02-29

Publications (1)

Publication Number Publication Date
WO2021169259A1 (zh) 2021-09-02

Family

ID=71206491

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/117020 WO2021169259A1 (zh) 2020-02-29 2020-09-23 Dynamic power management system

Country Status (2)

Country Link
CN (1) CN111367392B (zh)
WO (1) WO2021169259A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
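CN111367392B (zh) * 2020-02-29 2021-08-24 苏州浪潮智能科技有限公司 Dynamic power management system
CN111857316B (zh) * 2020-07-21 2023-01-06 苏州浪潮智能科技有限公司 Method and device for implementing automatic threshold configuration for IPMI power sensors
CN112306209A (zh) * 2020-10-28 2021-02-02 苏州浪潮智能科技有限公司 Separated redundant power supply circuit for a server and control method thereof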

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107589955A (zh) * 2017-09-19 2018-01-16 郑州云海信息技术有限公司 Method and system for upgrading dual-CMC firmware versions
CN110147155A (zh) * 2019-05-21 2019-08-20 苏州浪潮智能科技有限公司 BMC-based server power supply cold-redundancy control method and device, and BMC
CN110609760A (zh) * 2019-08-14 2019-12-24 苏州浪潮智能科技有限公司 System for preventing false triggering of frequency reduction in a server
US20200012334A1 (en) * 2017-07-14 2020-01-09 Cisco Technology, Inc. Dynamic power capping of multi-server nodes in a chassis based on real-time resource utilization
CN111367392A (zh) * 2020-02-29 2020-07-03 苏州浪潮智能科技有限公司 Dynamic power management system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9075595B2 (en) * 2012-08-30 2015-07-07 Dell Products L.P. Power excursion warning system
CN103995575B (zh) * 2014-05-27 2018-02-02 浪潮(北京)电子信息产业有限公司 Server startup method and server
CN104794033A (zh) * 2015-04-29 2015-07-22 浪潮电子信息产业股份有限公司 BMC-based method and device for locating CPU low-frequency faults
CN209560479U (zh) * 2019-04-16 2019-10-29 苏州浪潮智能科技有限公司 Power supply board

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200012334A1 (en) * 2017-07-14 2020-01-09 Cisco Technology, Inc. Dynamic power capping of multi-server nodes in a chassis based on real-time resource utilization
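CN107589955A (zh) * 2017-09-19 2018-01-16 郑州云海信息技术有限公司 Method and system for upgrading dual-CMC firmware versions
CN110147155A (zh) * 2019-05-21 2019-08-20 苏州浪潮智能科技有限公司 BMC-based server power supply cold-redundancy control method and device, and BMC
CN110609760A (zh) * 2019-08-14 2019-12-24 苏州浪潮智能科技有限公司 System for preventing false triggering of frequency reduction in a server
CN111367392A (zh) * 2020-02-29 2020-07-03 苏州浪潮智能科技有限公司 Dynamic power management system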

Also Published As

Publication number Publication date
CN111367392B (en) 2021-08-24
CN111367392A (en) 2020-07-03

Similar Documents

Publication Publication Date Title
WO2021169259A1 (zh) Dynamic power management system
US11150165B2 (en) System and method for configuration drift detection and remediation
US11907148B2 (en) OCP adapter card and computer device
CN105700969B (zh) Server system
US20080162691A1 (en) Blade server management system
US7325149B2 (en) Power-on management for remote power-on signals to high density server module
US9223394B2 (en) Rack and power control method thereof
US20190286590A1 (en) Cpld cache application in a multi-master topology system
WO2010094170A1 (zh) Method and apparatus for managing power supply, and power supply system
US20120324088A1 (en) Multi-service node management system, device and method
CN112000501A (zh) Management system for multi-node partitioned server access to I2C devices
US20040133819A1 (en) System and method for providing a persistent power mask
CN111209241A (zh) Management system for a whole-rack server
CN110609760A (zh) System for preventing false triggering of frequency reduction in a server
CN107179804B (zh) Cabinet device
JP2019102078A (ja) System power management method and computer system
CN114442787B (zh) Method and system for restoring whole-machine power consumption after a server enters power capping
TWI777058B (zh) Server power protection device
EP2759905A2 (en) Information processing apparatus, method of controlling power consumption, and storage medium
CN105471652A (zh) Big data all-in-one machine and redundancy management unit thereof
CN117111693A (zh) Server chassis system, and design method and device for a server chassis system
TW201729097A (zh) Cabinet device
US11733762B2 (en) Method to allow for higher usable power capacity in a redundant power configuration
CN212966168U (zh) Security-enhanced Loongson computing mainboard device supporting high-density servers
US7464257B2 (en) Mis-configuration detection methods and devices for blade systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 20921056; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 20921056; Country of ref document: EP; Kind code of ref document: A1
122 Ep: pct application non-entry in european phase. Ref document number: 20921056; Country of ref document: EP; Kind code of ref document: A1