CN115858292A - IPMI management method and system based on MCU - Google Patents

Info

Publication number
CN115858292A
Authority
CN
China
Prior art keywords
management unit
slave
ipmi
master
power
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211581139.0A
Other languages
Chinese (zh)
Inventor
王琳
刘战朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jusontech Co ltd
Original Assignee
Beijing Jusontech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jusontech Co ltd filed Critical Beijing Jusontech Co ltd
Priority to CN202211581139.0A priority Critical patent/CN115858292A/en
Publication of CN115858292A publication Critical patent/CN115858292A/en
Pending legal-status Critical Current

Abstract

The invention discloses an MCU-based IPMI management method and system. The method is applied to a chassis comprising a master device and a plurality of slave devices, wherein the master device is connected with each slave device through an IPMB, an MCU-based master management unit is arranged in the master device, and an MCU-based slave management unit is arranged in each slave device. The method comprises the following steps: the master management unit reads the power-on state of each in-place slave device and determines the powered-on target slave device according to the power-on state; the master management unit sends a first IPMI request message to the target slave management unit corresponding to the target slave device; the master management unit determines the operating environment data of the target slave device according to a first IPMI response message returned by the target slave management unit; and if abnormal data exists in the operating environment data of the target slave device, the master management unit controls the target slave device according to the abnormal data. IPMI management of the server is thereby performed accurately while reducing cost.

Description

IPMI management method and system based on MCU
Technical Field
The present application relates to the field of computer technologies, and in particular, to an IPMI management method and system based on an MCU.
Background
For the sake of server safety, IPMI (Intelligent Platform Management Interface) management needs to be performed on each device in a server chassis. IPMI is an industry standard for managing peripheral devices in enterprise systems based on the Intel architecture; with IPMI, a user can monitor the physical health characteristics of a server, such as temperature, voltage, cooling fan operating state, and power state.
In the prior art, a BMC (baseboard management controller) chip is usually used for IPMI management. However, BMC chips are expensive, and the cost is especially high when a chassis holds a large number of independent plug-in card devices, each of which needs its own control unit. In addition, a BMC chip needs to run a Linux system, so its overhead and power consumption are relatively high.
Therefore, how to accurately perform IPMI management on a server while reducing cost is a technical problem to be solved at present.
Disclosure of Invention
The embodiment of the application provides an IPMI management method and system based on an MCU (microcontroller unit), which are used for accurately performing IPMI management on a server while reducing cost.
In one aspect, an IPMI management method based on an MCU is provided, where the method is applied to an enclosure including a master device and a plurality of slave devices, the master device and each of the slave devices are connected through an IPMB, a master management unit based on an MCU is disposed in the master device, and a slave management unit based on an MCU is disposed in the slave device, and the method includes:
the master management unit reads the power-on state of each in-place slave device and determines the powered-on target slave device according to the power-on state;
the master management unit sends a first IPMI request message to a target slave management unit corresponding to the target slave equipment;
the master management unit determines the operating environment data of the target slave equipment according to a first IPMI response message returned by the target slave management unit;
if abnormal data exist in the operating environment data of the target slave equipment, the master management unit controls the target slave equipment according to the abnormal data;
the first IPMI response packet is generated by encapsulating the operating environment data of the target slave device after the target slave management unit receives the first IPMI request packet.
In some embodiments, after the master management unit reads a power-on state of each in-place slave device and determines a target slave device that has been powered on according to the power-on state, the method further includes:
if the master management unit monitors a power-on request IPMI message sent by an in-place slave management unit corresponding to the in-place slave equipment, verifying the power-on request IPMI message, and returning an IPMI message agreeing to power on to the in-place slave management unit after the verification is passed so that the in-place slave management unit powers on a CPU of the in-place slave equipment;
the power-on request IPMI message is generated by the in-place slave management unit after detecting that the power-on environment of the in-place slave device meets the preset condition.
In some embodiments, after the master management unit reads a power-on state of each in-place slave device and determines a target slave device that has been powered on according to the power-on state, the method further includes:
if the master management unit monitors an alarm IPMI message sent by the target slave management unit, analyzing the alarm IPMI message, and controlling the target slave equipment according to an analysis result;
the alarm IPMI message is generated when the target slave management unit detects that abnormal data exists in the operating environment data of the target slave device.
In some embodiments, before the master management unit reads the power-on state of each in-place slave device and determines a target slave device that has been powered on according to the power-on state, the method further includes:
the main management unit responds to a starting instruction input by a user to carry out hardware initialization operation on the main equipment;
the main management unit detects a power-on environment of the main equipment and controls a CPU of the main equipment to be powered on when the power-on environment meets a preset condition;
and the main management unit performs case initialization operation according to the case information.
In some embodiments, after the main management unit performs the chassis initialization operation according to the chassis information, the method further includes:
the main management unit circularly obtains the operating environment data of the main equipment and controls the main equipment according to the operating environment data of the main equipment.
In some embodiments, the main management unit and the CPU of the main device communicate with each other through a preset serial communication private protocol. In the preset serial communication private protocol, the request message and the response message each include a message header, a message length, an I2C address, a command type, a specific command, specific data, and a check bit, and the response message further includes a completion code between the specific data and the check bit.
In some embodiments, the chassis further includes a standby master device, where the standby master device continuously performs information interaction with the master device during the operation of the master device, and if the number of times that the master device does not respond reaches a preset number of times, it is determined that the master device is abnormal, and the standby master device is reinitialized as a new master device.
On the other hand, an IPMI management system based on MCU is provided, where the system includes a master device and a plurality of slave devices, where the master device and each of the slave devices are connected through an IPMB, a master management unit based on MCU is provided in the master device, a slave management unit based on MCU is provided in the slave device, and the master management unit is configured to:
reading the power-on state of each in-place slave device, and determining a powered-on target slave device according to the power-on state;
sending a first IPMI request message to a target slave management unit corresponding to the target slave device;
determining the operating environment data of the target slave equipment according to a first IPMI response message returned by the target slave management unit;
if abnormal data exist in the operating environment data of the target slave equipment, controlling the target slave equipment according to the abnormal data;
the first IPMI response packet is generated by encapsulating the operating environment data of the target slave device after the target slave management unit receives the first IPMI request packet.
In some embodiments, the master management unit is further configured to:
if a power-on request IPMI message sent by an in-place slave management unit corresponding to the in-place slave device is monitored, verifying the power-on request IPMI message, and returning an IPMI message agreeing to power on to the in-place slave management unit after the verification is passed, so that the in-place slave management unit powers on a CPU of the in-place slave device;
the power-on request IPMI message is generated by the in-place slave management unit after detecting that the power-on environment of the in-place slave device meets the preset condition.
In some embodiments, the master management unit is further configured to:
if the alarm IPMI message sent by the target slave management unit is monitored, analyzing the alarm IPMI message, and controlling the target slave equipment according to the analysis result;
the alarm IPMI message is generated when the target slave management unit detects that abnormal data exists in the operating environment data of the target slave device.
By applying the above technical scheme, in a chassis comprising a master device and a plurality of slave devices, the master device is connected with each slave device through an IPMB, an MCU-based master management unit is arranged in the master device, and an MCU-based slave management unit is arranged in each slave device. The master management unit reads the power-on state of each in-place slave device and determines the powered-on target slave device according to the power-on state; the master management unit sends a first IPMI request message to the target slave management unit corresponding to the target slave device; the master management unit determines the operating environment data of the target slave device according to a first IPMI response message returned by the target slave management unit; and if abnormal data exists in the operating environment data of the target slave device, the master management unit controls the target slave device according to the abnormal data. The first IPMI response message is generated by the target slave management unit by encapsulating the operating environment data of the target slave device after it receives the first IPMI request message. Because MCUs rather than BMC chips serve as the master management unit and the slave management units, the cost is reduced.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart illustrating an IPMI management method based on MCU according to an embodiment of the present invention;
Fig. 2 is a diagram illustrating an in-chassis device topology according to an embodiment of the present invention;
Fig. 3 illustrates a serial communication private protocol frame format in an embodiment of the present invention;
Fig. 4 shows an example of the master device private protocol command types in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
The embodiment of the application provides an IPMI (Intelligent Platform Management Interface) management method based on an MCU (Microcontroller Unit), which is applied to a chassis comprising a master device and a plurality of slave devices, wherein the master device and each slave device are connected through an IPMB (Intelligent Platform Management Bus), an MCU-based master management unit is arranged in the master device, and an MCU-based slave management unit is arranged in each slave device.
In this embodiment, a chassis contains a plurality of independent devices, each of which needs IPMI management. As shown in fig. 2, the independent devices include a master device 100 and a plurality of slave devices 200. The master device 100 and each of the slave devices 200 are connected through an IPMB, an MCU-based master management unit 110 is disposed in the master device 100, and an MCU-based slave management unit 210 is disposed in each slave device 200. In hardware, the master management unit 110 and the slave management unit 210 support powering the CPU of their device on and off by changing the level timing of a pin. The master management unit 110 and the slave management unit 210 need to monitor sensor data of the devices to which they belong, such as temperature and voltage, and perform corresponding processing after an abnormality occurs, for example, adjusting the rotation speed of the cooling fan 300 or the number of running cooling fans 300, or turning off the power supply 400 to power off the master device or the slave device.
As shown in fig. 1, the method comprises the steps of:
step S101, the master management unit reads the power-on state of each in-place slave device, and determines the powered-on target slave device according to the power-on state.
In this embodiment, the in-place slave devices are the slave devices inserted into the chassis. The master management unit reads the power-on state of each in-place slave device; the power-on state may be powered on or powered off, and the powered-on target slave devices can be determined according to the power-on state.
Optionally, the master management unit may maintain a preset state table of the slave devices. After an in-place slave device is powered on, the master management unit updates the power-on state of that slave device in the state table, so that the master management unit can determine the power-on state of each in-place slave device by reading the state table.
In order to ensure that the in-place slave devices are powered on reliably, in some embodiments of the present application, after the master management unit reads a power-on state of each in-place slave device and determines a powered-on target slave device according to the power-on state, the method further includes:
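As an illustration of step S101, the state table could be sketched as follows. This is a minimal sketch only: the class name SlaveStateTable, its methods, and the use of slot numbers as keys are assumptions made for the example and are not defined in the application.

```python
from enum import Enum


class PowerState(Enum):
    OFF = 0
    ON = 1


class SlaveStateTable:
    """Hypothetical per-slot state table kept by the master management unit."""

    def __init__(self, slot_count):
        # None means no slave device is inserted in that slot (not in place).
        self._states = {slot: None for slot in range(1, slot_count + 1)}

    def mark_in_place(self, slot):
        # A newly inserted slave device starts in the powered-off state.
        self._states[slot] = PowerState.OFF

    def mark_powered_on(self, slot):
        if self._states.get(slot) is None:
            raise ValueError(f"slot {slot} has no in-place slave device")
        self._states[slot] = PowerState.ON

    def target_slots(self):
        # Step S101: only in-place slaves that are powered on become targets.
        return sorted(s for s, st in self._states.items() if st is PowerState.ON)


table = SlaveStateTable(slot_count=4)
table.mark_in_place(1)
table.mark_in_place(3)
table.mark_powered_on(3)
print(table.target_slots())  # prints [3]
```

Here slot 1 is in place but not yet powered on, so only slot 3 is treated as a target slave device.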
if the master management unit monitors a power-on request IPMI message sent by an in-place slave management unit corresponding to the in-place slave equipment, verifying the power-on request IPMI message, and returning an IPMI message agreeing to power on to the in-place slave management unit after the verification is passed so that the in-place slave management unit powers on a CPU of the in-place slave equipment;
the power-on request IPMI message is generated by the in-place slave management unit after detecting that the power-on environment of the in-place slave device meets the preset condition.
In this embodiment, on one hand, before powering on, the in-place slave management unit of the in-place slave device detects whether a power-on environment of the in-place slave device meets a preset condition, where the preset condition may include that a temperature of the in-place slave device is within a preset temperature range, a power voltage of the in-place slave device is within a preset voltage range, and no protection condition that power-on is not allowed exists, and if the preset condition is met, the in-place slave management unit may generate a power-on request IPMI message and send the power-on request IPMI message to the master management unit of the master device. On the other hand, when receiving the power-on request IPMI message, the master management unit checks, for example, whether the format of the message is correct, whether the in-place slave device to be powered on belongs to a legal device, and the like, and returns the IPMI message agreeing to power on to the in-place slave management unit after the check is passed, and the in-place slave management unit can power on the CPU of the in-place slave device.
In order to ensure the reliability of the in-place slave device, in some embodiments of the present application, before detecting whether a power-on environment of the in-place slave device meets a preset condition, the in-place slave management unit performs an initialization operation on the in-place slave device, where the initialization operation may include initializing a serial port, initializing a local IPMB address, and the like.
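The power-on handshake described above can be sketched as two checks, one on the slave side and one on the master side. The numeric limits, the dictionary-based message, and all function names below are hypothetical; the application only states that the ranges are preset and that the master checks the message format and the legality of the device.

```python
from dataclasses import dataclass


@dataclass
class PowerOnEnvironment:
    temperature_c: float
    supply_voltage: float
    protection_active: bool  # True if a condition forbidding power-on exists


# Hypothetical limits; the source only says the ranges are preset.
TEMP_RANGE_C = (0.0, 70.0)
VOLTAGE_RANGE = (11.4, 12.6)


def environment_ok(env):
    """Slave side: check the preset conditions before requesting power-on."""
    return (
        TEMP_RANGE_C[0] <= env.temperature_c <= TEMP_RANGE_C[1]
        and VOLTAGE_RANGE[0] <= env.supply_voltage <= VOLTAGE_RANGE[1]
        and not env.protection_active
    )


def master_verify_request(request, legal_slots):
    """Master side: check the message format and that the slot is a legal device."""
    well_formed = isinstance(request, dict) and request.get("type") == "power_on_request"
    return well_formed and request.get("slot") in legal_slots


def slave_power_on(slot, env, legal_slots):
    """Return True if the slave management unit may power on its CPU."""
    if not environment_ok(env):
        return False  # no power-on request is sent at all
    request = {"type": "power_on_request", "slot": slot}  # stand-in for the IPMI message
    return master_verify_request(request, legal_slots)
```

In this sketch the slave never sends a request when its own environment check fails, which mirrors the order given in the text: the request is generated only after the preset condition is met.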
Step S102, the master management unit sends a first IPMI request message to a target slave management unit corresponding to the target slave device.
In this embodiment, the first IPMI request message may be generated by the master management unit according to a query instruction of a user: when the user needs to know the operating environment data of the target slave device, the user sends a query instruction to the master management unit, so that the master management unit generates the first IPMI request message. The first IPMI request message may also be generated periodically and automatically by the master management unit, which can thereby keep monitoring the operating environment data of the target slave device by periodically sending the first IPMI request message to the target slave management unit.
Step S103, the master management unit determines the operating environment data of the target slave device according to the first IPMI response message returned by the target slave management unit.
In this embodiment, after receiving the first IPMI request message, the target slave management unit verifies the first IPMI request message, and after the verification is passed, the target slave management unit encapsulates the operating environment data of the target slave device to generate a first IPMI response message, where the operating environment data may include sensor data such as temperature and voltage, and then the target slave management unit sends the first IPMI response message to the master management unit, and the master management unit analyzes the first IPMI response message to determine the operating environment data of the target slave device.
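The application does not spell out the byte layout of the first IPMI request message. As a hedged illustration, the sketch below builds and verifies a request using the general IPMB framing (responder address, NetFn/LUN, header checksum, requester address, sequence/LUN, command, data, and a second checksum), where each checksum is a two's complement so that the covered bytes sum to zero modulo 256. The slave address 0x82, the master address 0x20, and the choice of Get Sensor Reading (NetFn 0x04, command 0x2D) are example values, not taken from the application.

```python
def checksum(data):
    """IPMB two's-complement checksum: data plus checksum sums to 0 mod 256."""
    return (-sum(data)) & 0xFF


def build_ipmb_request(rs_sa, net_fn, cmd, rq_sa, rq_seq, data=b"", rs_lun=0, rq_lun=0):
    """Frame one IPMB request addressed to the responder at rs_sa."""
    header = bytes([rs_sa, (net_fn << 2) | rs_lun])
    body = bytes([rq_sa, (rq_seq << 2) | rq_lun, cmd]) + bytes(data)
    return header + bytes([checksum(header)]) + body + bytes([checksum(body)])


def verify_ipmb(frame):
    """Check both checksums of a received IPMB frame."""
    return len(frame) >= 7 and sum(frame[:3]) % 256 == 0 and sum(frame[3:]) % 256 == 0


# Get Sensor Reading for sensor number 1; both addresses are hypothetical.
frame = build_ipmb_request(rs_sa=0x82, net_fn=0x04, cmd=0x2D,
                           rq_sa=0x20, rq_seq=1, data=b"\x01")
print(frame.hex())  # prints 82106e20042d01ae
```

A slave management unit could apply the same verify_ipmb check as the verification step before encapsulating its operating environment data into the response.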
Step S104, if abnormal data exists in the operating environment data of the target slave device, the master management unit controls the target slave device according to the abnormal data.
The master management unit judges whether abnormal data exists in the operating environment data of the target slave device. The abnormal data may include temperature abnormal data exceeding a preset temperature range or voltage abnormal data exceeding a preset voltage range. If abnormal data exists, the master management unit can control the target slave device according to the abnormal data.
For example, if the abnormal data is temperature abnormal data, the master management unit may adjust the rotation speed or the number of running cooling fans of the target slave device; or, when the temperature is higher than the protection threshold, the master management unit can directly power off the target slave device to ensure its safety. If the abnormal data is voltage abnormal data, the master management unit can power off the target slave device.
Optionally, when the master management unit controls the target slave device according to the abnormal data, the master management unit stores the abnormal data in a log, so that a user can view specific abnormal data through the log.
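The control decision of step S104, together with the optional logging, can be sketched as a small policy function. The threshold values and the names Action and decide_action are assumptions for illustration; the application only refers to preset ranges and a protection value.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    ADJUST_FAN = "adjust_fan"
    POWER_OFF = "power_off"


# Hypothetical limits; the source only refers to preset ranges and a protection value.
TEMP_RANGE_C = (0.0, 70.0)
TEMP_PROTECT_C = 85.0
VOLTAGE_RANGE = (11.4, 12.6)


def decide_action(temperature_c, voltage, log):
    """Map the operating environment data of a target slave to a control action."""
    if not VOLTAGE_RANGE[0] <= voltage <= VOLTAGE_RANGE[1]:
        log.append(("voltage", voltage))  # keep the abnormal value for the user
        return Action.POWER_OFF
    if temperature_c > TEMP_PROTECT_C:
        log.append(("temperature", temperature_c))
        return Action.POWER_OFF
    if not TEMP_RANGE_C[0] <= temperature_c <= TEMP_RANGE_C[1]:
        log.append(("temperature", temperature_c))
        return Action.ADJUST_FAN
    return Action.NONE
```

Voltage is checked first in this sketch so that a voltage fault always leads to power-off, even when the temperature alone would only call for a fan adjustment.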
In order to improve the reliability of the slave devices, in some embodiments of the present application, after the master management unit reads a power-on state of each in-place slave device, and determines a target slave device that has been powered on according to the power-on state, the method further includes:
if the master management unit monitors an alarm IPMI message sent by the target slave management unit, analyzing the alarm IPMI message, and controlling the target slave equipment according to an analysis result;
the alarm IPMI message is generated when the target slave management unit detects that abnormal data exists in the operating environment data of the target slave device.
In this embodiment, on one hand, the target slave management unit monitors the operating environment data of the target slave device in real time, such as its temperature and voltage data. If it detects that abnormal data exists in the operating environment data of the target slave device, the target slave management unit generates an alarm IPMI message including the corresponding alarm information and sends the alarm IPMI message to the master management unit, where the abnormal data may include temperature abnormal data exceeding a preset temperature range or voltage abnormal data exceeding a preset voltage range. On the other hand, the master management unit cyclically monitors whether an alarm IPMI message has been sent by any slave management unit; if an alarm IPMI message sent by the target slave management unit is monitored, the master management unit analyzes the alarm IPMI message and controls the target slave device according to the analysis result.
For example, if a temperature alarm exists in the analysis result, the master management unit may adjust the rotation speed or the running number of the cooling fans of the target slave device, or perform power-off processing on the target slave device; if the voltage alarm exists in the analysis result, the main management unit can perform power-off processing on the target slave equipment so as to ensure the safety of the target slave equipment.
Optionally, when the master management unit controls the target slave device according to the analysis result, the master management unit stores the analysis result in a log, so that a user can view specific alarm information through the log.
In order to ensure the reliability of the master device, in some embodiments of the present application, before the master management unit reads a power-on state of each in-place slave device and determines a target slave device that has been powered on according to the power-on state, the method further includes:
the main management unit responds to a starting instruction input by a user to carry out hardware initialization operation on the main equipment;
the main management unit detects a power-on environment of the main equipment and controls a CPU of the main equipment to be powered on when the power-on environment meets a preset condition;
and the main management unit performs case initialization operation according to the case information.
In this embodiment, the main management unit is started according to a start instruction input by a user and, in response to the start instruction, performs a hardware initialization operation on the main device, which may include initializing a serial port, initializing an IPMB address, and the like. The main management unit then detects the power-on environment (such as temperature and voltage) of the main device and controls the CPU of the main device to power on when the power-on environment meets a preset condition. The preset condition may include that the temperature of the main device is within a preset temperature range, that the supply voltage of the main device is within a preset voltage range, and that no protection condition forbidding power-on exists. The main management unit then performs a chassis initialization operation according to the chassis information, which may include initializing variables such as the chassis slot addresses and mapping each slot number to its slot address.
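The start-up order and the slot-number-to-slot-address mapping can be sketched as follows. The addressing rule (a base address of 0x82 with a stride of 2) is purely hypothetical, since the application does not give real slot addresses, and the sketch assumes that chassis initialization is skipped while the power-on environment check fails.

```python
# Hypothetical addressing rule: the source does not give real slot addresses.
IPMB_BASE_ADDRESS = 0x82
ADDRESS_STRIDE = 2  # 8-bit IPMB slave addresses are even values


def init_chassis(slot_count):
    """Build the slot number -> slot address table used for later IPMB requests."""
    return {slot: IPMB_BASE_ADDRESS + ADDRESS_STRIDE * (slot - 1)
            for slot in range(1, slot_count + 1)}


def master_startup(env_ok, slot_count):
    """Follow the start-up order: hardware init, environment check, CPU power-on, chassis init."""
    steps = ["hardware_init"]
    if not env_ok:
        # Assumption: the CPU stays off and chassis init waits until the condition is met.
        return steps, {}
    steps.append("cpu_power_on")
    steps.append("chassis_init")
    return steps, init_chassis(slot_count)


steps, slot_table = master_startup(env_ok=True, slot_count=3)
```

With three slots this yields the addresses 0x82, 0x84 and 0x86 for slots 1 to 3.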
In order to ensure the reliability of the master device, in some embodiments of the present application, after the master management unit performs a chassis initialization operation according to chassis information, the method further includes:
the main management unit circularly obtains the operating environment data of the main equipment and controls the main equipment according to the operating environment data of the main equipment.
In this embodiment, the operating environment data of the main device may include sensor information such as temperature and voltage of the main device, and if the temperature of the main device exceeds a preset temperature range, a cooling fan of the main device is adjusted or the main device is powered off; if the voltage of the main equipment exceeds the preset voltage range, the main equipment can be powered off.
In order to improve the reliability of the device, in some embodiments of the present application, after the main management unit performs the chassis initialization operation according to the chassis information, the method further includes:
the main management unit compares the temperature values of all the powered-on slave devices in the chassis to find the highest one, adjusts the rotation speed of the cooling fan according to the threshold range in which the highest temperature value falls, and powers off any powered-on slave device whose temperature exceeds the highest threshold.
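The chassis-level fan policy above can be sketched as a lookup over threshold ranges. The threshold values, the fan duty percentages, and the function name chassis_fan_policy are assumptions for illustration only.

```python
# Hypothetical thresholds (degrees C) and fan duty cycles (percent).
FAN_STEPS = [(40.0, 30), (55.0, 60), (70.0, 100)]
MAX_THRESHOLD_C = 85.0


def chassis_fan_policy(slave_temps):
    """slave_temps maps slot -> temperature of a powered-on slave device."""
    # Slaves above the highest threshold are powered off rather than only cooled.
    to_power_off = sorted(s for s, t in slave_temps.items() if t > MAX_THRESHOLD_C)
    hottest = max(slave_temps.values(), default=0.0)
    duty = 100  # fall back to full speed above the last step
    for upper_bound, step_duty in FAN_STEPS:
        if hottest <= upper_bound:
            duty = step_duty
            break
    return duty, to_power_off


print(chassis_fan_policy({1: 38.0, 2: 52.5}))  # prints (60, [])
```

The fan speed follows the hottest slave device, so one hot card is enough to raise cooling for the whole chassis.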
In order to perform complete IPMI management in the CPU of the main device, in some embodiments of the present application, the main management unit and the CPU of the main device communicate with each other through a preset serial communication private protocol. In the preset serial communication private protocol, both the request message and the response message include a message header, a message length, an I2C address, a command type, a specific command, specific data, and a check bit, and the response message further includes a completion code between the specific data and the check bit.
In this embodiment, the CPU of the main device and the main management unit communicate with each other through the preset serial communication private protocol. Fig. 3 shows the frame format of the serial communication private protocol in the embodiment of the present invention: both the Request message and the Response message include a header head, a message length, an I2C address I2caddr, a command type cmdtype, a specific command cmd, specific data, and a check bit crc16, and the Response message further includes a completion code completion between the specific data and the check bit.
Specifically, the header head is 3 bytes of 0xc0; the message length is 1 byte; the I2C address I2caddr is the I2C address of the slave device from which the master device wants to acquire information; the command type cmdtype is extended from the IPMI standard and may include the types shown in fig. 4; the specific command cmd uses the command values defined by the IPMI standard, so that the main management unit can package a received message directly into the IPMI frame format; the specific data is a fixed 8 bytes in the Request message and a fixed 100 bytes in the Response message; the check bit crc16 is 2 bytes and is used to check the integrity of the message; the completion code completion is 1 byte, and its definition is consistent with the IPMI standard, that is, the main management unit extracts the completion code from the IPMI message sent by the slave management unit and fills it into the completion code field.
In addition, in the Request message, the specific data mainly carries set-type messages, for example, setting the cooling fan rotation speed or powerlevel; only the first byte of the data field carries data, and the rest is filled with 0. For other device requests, such as restart and power-off, the data field is filled entirely with 0.
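The frame layout described above can be sketched as a builder for Request frames and a parser for Response frames. Several details are not stated in the application and are assumed here: one byte each for I2caddr, cmdtype and cmd; a length field that counts the whole frame; a CRC computed over everything before the crc16 field using CRC-16/CCITT-FALSE (via the standard library function binascii.crc_hqx); and big-endian CRC byte order. The example field values 0x82, 0x01 and 0x30 are hypothetical.

```python
import binascii

HEAD = b"\xc0\xc0\xc0"   # 3-byte header of 0xC0, as stated in the text
REQUEST_DATA_LEN = 8     # fixed data size in a Request
RESPONSE_DATA_LEN = 100  # fixed data size in a Response


def crc16(data):
    # Assumption: CRC-16/CCITT-FALSE; the source does not name the CRC variant.
    return binascii.crc_hqx(data, 0xFFFF)


def build_request(i2c_addr, cmd_type, cmd, data=b""):
    if len(data) > REQUEST_DATA_LEN:
        raise ValueError("request data is limited to 8 bytes")
    payload = bytes(data).ljust(REQUEST_DATA_LEN, b"\x00")  # unused bytes are 0
    # Assumption: length counts the whole frame, including head and crc16.
    length = len(HEAD) + 1 + 3 + REQUEST_DATA_LEN + 2
    body = HEAD + bytes([length, i2c_addr, cmd_type, cmd]) + payload
    return body + crc16(body).to_bytes(2, "big")


def parse_response(frame):
    expected = len(HEAD) + 1 + 3 + RESPONSE_DATA_LEN + 1 + 2
    if len(frame) != expected or frame[:3] != HEAD:
        raise ValueError("malformed response frame")
    if crc16(frame[:-2]) != int.from_bytes(frame[-2:], "big"):
        raise ValueError("CRC mismatch")
    return {"i2c_addr": frame[4], "cmd_type": frame[5], "cmd": frame[6],
            "data": frame[7:107], "completion": frame[107]}


# A set-type request: only the first data byte carries a value (here 0x50).
request = build_request(i2c_addr=0x82, cmd_type=0x01, cmd=0x30, data=b"\x50")
```

Under these assumptions a Request frame is 17 bytes long and a Response frame is 110 bytes long, so the 1-byte length field is sufficient for both.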
The encapsulation of the specific data in the Response message of the private protocol follows two rules:
1. For a command whose data can be obtained with a single interaction in the IPMI standard (for example, a slave device DeviceID command), the specific data is encapsulated according to the IPMI standard;
2. For a command whose data can only be obtained through multiple interactions in the IPMI standard (for example, acquiring device sensor data), the data is encapsulated based on a private definition. The private definition includes: for specific data that the IPMI standard records without floating point numbers (such as temperature sensor data), each record occupies three bytes, where the first byte represents the sensor type as defined by the IPMI standard, the second byte represents the sensor number, and the third byte is the sensor value; for specific data that needs floating point recording (such as voltage), each record occupies four bytes, where the integer part and the decimal part of the sensor value each occupy one byte. The data is packed in order of sensor number, and the unused part is filled with 0.
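The private sensor encoding in rule 2 can be sketched as follows. The text fixes the record sizes (three and four bytes) but not every detail, so two points are assumptions: that the first two bytes of a four-byte record mirror the three-byte layout (sensor type, sensor number), and that the decimal byte holds two decimal digits. The sensor type codes 0x01 (temperature) and 0x02 (voltage) follow the IPMI sensor type codes.

```python
def pack_integer_sensor(sensor_type, sensor_number, value):
    """Three-byte record: sensor type, sensor number, integer reading."""
    return bytes([sensor_type, sensor_number, value])


def pack_fractional_sensor(sensor_type, sensor_number, value, decimals=2):
    """Four-byte record; the layout of bytes 1-2 and the number of decimal
    digits kept in the last byte are assumptions of this sketch."""
    scaled = round(value * 10 ** decimals)
    integer_part, fraction = divmod(scaled, 10 ** decimals)
    return bytes([sensor_type, sensor_number, integer_part, fraction])


def pack_response_data(records, size=100):
    """Concatenate records in sensor-number order and zero-fill the rest."""
    ordered = sorted(records, key=lambda r: r[1])  # byte 1 is the sensor number
    payload = b"".join(ordered)
    if len(payload) > size:
        raise ValueError("records do not fit in the data field")
    return payload.ljust(size, b"\x00")


data = pack_response_data([
    pack_fractional_sensor(0x02, 2, 3.3),  # voltage sensor number 2: 3.30 V
    pack_integer_sensor(0x01, 1, 45),      # temperature sensor number 1: 45 degrees
])
```

The temperature record comes first in the packed data because its sensor number is lower, and the remaining 93 bytes of the 100-byte data field are zero.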
To keep IPMI management reliable if the master device becomes abnormal, in some embodiments of the present application the chassis further includes a standby master device. While the master device is running, the standby master device continuously exchanges information with it; if the number of times the master device fails to respond reaches a preset number, the master device is determined to be abnormal and the standby master device is reinitialized as the new master device.
In this embodiment, the standby master device also includes a master management unit, so that it can take over as the new master device once the master device is determined to be abnormal, which improves the reliability of IPMI management.
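The takeover rule can be sketched as a simple miss counter. This is a minimal illustration, not the patented implementation: the text does not say whether the missed responses must be consecutive, so the sketch assumes they are and resets the counter on any successful exchange. The class and method names are hypothetical.

```python
class StandbyMaster:
    """Counts unanswered exchanges with the master and takes over at a limit."""

    def __init__(self, max_missed: int):
        self.max_missed = max_missed   # the "preset number" of missed responses
        self.missed = 0
        self.is_master = False

    def record_exchange(self, master_responded: bool) -> bool:
        """Record one exchange; return True once this unit acts as master."""
        if self.is_master:
            return True
        if master_responded:
            self.missed = 0            # assumption: only consecutive misses count
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                # The master is deemed abnormal; reinitialization as the new
                # master would happen here.
                self.is_master = True
        return self.is_master
```

In use, the standby unit would call `record_exchange` after each periodic exchange and run its master initialization sequence when the call first returns `True`.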
With this technical solution, in a chassis comprising a master device and a plurality of slave devices, the master device is connected to each slave device through an IPMB, an MCU-based master management unit is arranged in the master device, and an MCU-based slave management unit is arranged in each slave device. The master management unit reads the power-on state of each in-place slave device and determines the powered-on target slave device according to the power-on state; the master management unit sends a first IPMI request message to the target slave management unit corresponding to the target slave device; the master management unit determines the operating environment data of the target slave device according to the first IPMI response message returned by the target slave management unit; and if abnormal data exists in the operating environment data of the target slave device, the master management unit controls the target slave device according to the abnormal data. The first IPMI response message is generated by the target slave management unit, after it receives the first IPMI request message, by encapsulating the operating environment data of the target slave device. Using MCUs as the master and slave management units reduces cost compared with using BMC chips.
In order to further illustrate the technical idea of the present invention, the technical solution of the present invention will now be described with reference to specific application scenarios.
An embodiment of the present application provides an IPMI management method based on an MCU. As shown in FIG. 2, the method is applied to a chassis including a master device 100 and a plurality of slave devices 200. The master device 100 is connected to each slave device 200 through an IPMB, an MCU-based master management unit 110 is disposed in the master device 100, and an MCU-based slave management unit 210 is disposed in each slave device 200. In hardware, the master management unit 110 and the slave management unit 210 support powering the CPU of their own device on and off by changing the level timing of pins. The master management unit 110 and the slave management unit 210 monitor the sensor data of the devices to which they belong, such as temperature and voltage, and take corresponding action when an abnormality occurs, for example adjusting the rotation speed of the cooling fan 300 or the number of cooling fans 300 in use, or turning off the power supply 400 to power off the master device or the slave device.
The workflow of the master management unit 110 is as follows:
Step S201, in response to a start instruction from a user, initializing necessary information such as the serial port and the local IPMB address;
Step S202, reading the sensor information (such as temperature and voltage) of the master device, powering on the CPU of the master device once the power-on environment meets a preset condition, and lighting the POWER indicator;
Step S203, reading the chassis information, initializing variables such as the chassis slot addresses according to the chassis information, and mapping slot numbers to slot addresses;
Step S204, acquiring the in-place slave devices of the current chassis and their power-on states, reading the temperature information of the powered-on slave devices, comparing the readings to obtain the highest temperature value among all powered-on slave devices in the chassis, adjusting the cooling fan speed according to the threshold range in which the highest temperature value falls, and powering off a powered-on slave device if its temperature exceeds the highest threshold;
Step S205, starting a thread for processing IPMI request messages from the slave devices, which cyclically listens for IPMI messages sent by the slave management units and performs the corresponding processing after successfully verifying a power-on request IPMI message or an alarm IPMI message sent by a slave management unit; and starting a thread for monitoring the operating environment data of the master device, which cyclically reads the sensor information (temperature, voltage and the like) of the master device and performs the corresponding processing if the sensor information is outside a preset range.
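The temperature handling in step S204 can be sketched as follows. The threshold values, the fan duty levels, and the function name are all hypothetical; the text only states that the fan speed follows the threshold range of the highest temperature and that a slave device above the highest threshold is powered off.

```python
# Hypothetical bands: (upper temperature limit in deg C, fan duty in percent).
FAN_BANDS = [(40, 30), (55, 60), (70, 100)]
SHUTDOWN_THRESHOLD_C = 85   # hypothetical "highest threshold"


def plan_cooling(temps_by_slot: dict):
    """Return (fan duty, slots to power off) for the powered-on slave devices.

    temps_by_slot maps a slot number to that device's temperature reading.
    """
    if not temps_by_slot:
        return FAN_BANDS[0][1], []
    hottest = max(temps_by_slot.values())
    duty = FAN_BANDS[-1][1]                 # above every band: run at full speed
    for upper_limit, band_duty in FAN_BANDS:
        if hottest <= upper_limit:
            duty = band_duty
            break
    too_hot = sorted(slot for slot, temp in temps_by_slot.items()
                     if temp > SHUTDOWN_THRESHOLD_C)
    return duty, too_hot
```

The master management unit would then apply the returned duty to the cooling fans and send a power-off request to each slot in the returned list.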
After the slave device is inserted into the chassis, the slave management unit 210 needs to complete the following steps:
Step S301, initializing necessary information such as the serial port and the local IPMB address;
Step S302, reading the sensor information (such as temperature and voltage) of the slave device, sending an IPMI power-on request message to the master management unit once the power-on environment meets a preset condition, going through the IPMI power-on process, powering on the CPU of the slave device after the master management unit agrees to the power-on, and lighting the POWER indicator after the power-on succeeds;
Step S303, starting a thread for processing IPMI request messages from the master device, which cyclically checks whether an IPMI request message sent by the master management unit has arrived and, after successful verification, packages the corresponding reply to the master management unit; and starting a thread for monitoring the operating environment of the device, which cyclically reads the sensor information (temperature, voltage and the like) of the slave device and sends an alarm IPMI message to the master management unit when the sensor information exceeds a threshold, whereupon the master management unit performs the corresponding processing.
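The slave-side behaviour in steps S302 and S303 can be sketched as a small state holder. This is an illustration only: messages are modelled as plain tuples rather than IPMI frames, the limits are hypothetical, and the class, method and message names are not taken from the text.

```python
class SlaveManagementUnit:
    """Minimal model of the power-on handshake and alarm reporting."""

    def __init__(self, send_to_master, limits: dict):
        self.send = send_to_master     # callable that delivers one message
        self.limits = limits           # sensor name -> (low, high), hypothetical
        self.cpu_on = False
        self.power_led = False

    def _out_of_range(self, readings: dict) -> dict:
        return {name: value for name, value in readings.items()
                if not (self.limits[name][0] <= value <= self.limits[name][1])}

    def try_power_on(self, readings: dict) -> bool:
        """S302: request power-on only when the power-on environment is acceptable."""
        if self._out_of_range(readings):
            return False
        self.send(("POWER_ON_REQUEST",))
        return True

    def on_power_on_reply(self, approved: bool) -> None:
        """S302: power on the CPU and light the indicator once the master agrees."""
        if approved:
            self.cpu_on = True
            self.power_led = True

    def monitor_once(self, readings: dict) -> None:
        """S303: one pass of the monitoring loop; report each abnormal reading."""
        for name, value in sorted(self._out_of_range(readings).items()):
            self.send(("ALARM", name, value))
```

In a real unit `monitor_once` would run in the monitoring thread, and `send_to_master` would wrap each message in an IPMI frame on the IPMB.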
An embodiment of the present application further provides an IPMI management system based on an MCU. As shown in FIG. 2, the system includes a master device 100 and a plurality of slave devices 200, the master device 100 is connected to each of the slave devices 200 through an IPMB, an MCU-based master management unit 110 is disposed in the master device 100, an MCU-based slave management unit 210 is disposed in each slave device 200, and the master management unit 110 is configured to:
read the power-on state of each in-place slave device, and determine a powered-on target slave device according to the power-on state;
send a first IPMI request message to a target slave management unit corresponding to the target slave device;
determine the operating environment data of the target slave device according to a first IPMI response message returned by the target slave management unit;
if abnormal data exists in the operating environment data of the target slave device, control the target slave device according to the abnormal data;
wherein the first IPMI response message is generated by the target slave management unit, after it receives the first IPMI request message, by encapsulating the operating environment data of the target slave device.
In a specific application scenario, the master management unit 110 is further configured to:
if a power-on request IPMI message sent by an in-place slave management unit corresponding to an in-place slave device is detected, verify the power-on request IPMI message and, after the verification passes, return an IPMI message agreeing to the power-on to the in-place slave management unit, so that the in-place slave management unit powers on the CPU of the in-place slave device;
wherein the power-on request IPMI message is generated by the in-place slave management unit after it detects that the power-on environment of the in-place slave device meets the preset condition.
In a specific application scenario, the master management unit 110 is further configured to:
if an alarm IPMI message sent by the target slave management unit is detected, parse the alarm IPMI message and control the target slave device according to the parsing result;
wherein the alarm IPMI message is generated when the target slave management unit detects that abnormal data exists in the operating environment data of the target slave device.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. An IPMI management method based on an MCU, which is applied to a chassis comprising a master device and a plurality of slave devices, wherein the master device and each slave device are connected through an IPMB, a master management unit based on the MCU is arranged in the master device, and a slave management unit based on the MCU is arranged in the slave device, the method comprising:
the master management unit reads the power-on state of each in-place slave device and determines a powered-on target slave device according to the power-on state;
the master management unit sends a first IPMI request message to a target slave management unit corresponding to the target slave device;
the master management unit determines the operating environment data of the target slave device according to a first IPMI response message returned by the target slave management unit;
if abnormal data exists in the operating environment data of the target slave device, the master management unit controls the target slave device according to the abnormal data;
wherein the first IPMI response message is generated by the target slave management unit, after it receives the first IPMI request message, by encapsulating the operating environment data of the target slave device.
2. The method of claim 1, wherein after the master management unit reads the power-on status of each in-place slave device and determines a target slave device that has been powered on according to the power-on status, the method further comprises:
if the master management unit detects a power-on request IPMI message sent by an in-place slave management unit corresponding to an in-place slave device, verifying the power-on request IPMI message and, after the verification passes, returning an IPMI message agreeing to the power-on to the in-place slave management unit, so that the in-place slave management unit powers on a CPU of the in-place slave device;
wherein the power-on request IPMI message is generated by the in-place slave management unit after it detects that the power-on environment of the in-place slave device meets the preset condition.
3. The method of claim 1, wherein after the master management unit reads a power-on state of each in-place slave device and determines a powered-on target slave device according to the power-on state, the method further comprises:
if the master management unit detects an alarm IPMI message sent by the target slave management unit, parsing the alarm IPMI message and controlling the target slave device according to the parsing result;
wherein the alarm IPMI message is generated when the target slave management unit detects that abnormal data exists in the operating environment data of the target slave device.
4. The method of claim 1, wherein before the master management unit reads a power-on state of each in-place slave device and determines a powered-on target slave device according to the power-on state, the method further comprises:
the master management unit performs a hardware initialization operation on the master device in response to a start instruction input by a user;
the master management unit detects the power-on environment of the master device and controls a CPU of the master device to power on when the power-on environment meets a preset condition;
and the master management unit performs a chassis initialization operation according to chassis information.
5. The method of claim 4, wherein after the master management unit performs the chassis initialization operation according to the chassis information, the method further comprises:
the master management unit cyclically obtains the operating environment data of the master device and controls the master device according to the operating environment data of the master device.
6. The method according to claim 1, wherein the master management unit communicates with the CPU of the master device through a preset private serial communication protocol, in which the request message and the response message each include a packet header, a message length, an I2C address, a command type, a specific command, specific data, and a check field, and the response message further includes a completion code between the specific data and the check field.
7. The method of claim 1, wherein the chassis further includes a standby master device, the standby master device continuously performs information interaction with the master device during the operation of the master device, and if the number of times that the master device does not respond reaches a preset number of times, it is determined that the master device is abnormal, and the standby master device is reinitialized as a new master device.
8. An IPMI management system based on MCU, the system includes a master device and a plurality of slave devices, the master device and each slave device are connected through IPMB, a master management unit based on MCU is set in the master device, a slave management unit based on MCU is set in the slave device, the master management unit is used for:
reading the power-on state of each in-place slave device, and determining a powered-on target slave device according to the power-on state;
sending a first IPMI request message to a target slave management unit corresponding to the target slave device;
determining the operating environment data of the target slave device according to a first IPMI response message returned by the target slave management unit;
if abnormal data exists in the operating environment data of the target slave device, controlling the target slave device according to the abnormal data;
wherein the first IPMI response message is generated by the target slave management unit, after it receives the first IPMI request message, by encapsulating the operating environment data of the target slave device.
9. The system of claim 8, wherein the master management unit is further to:
if a power-on request IPMI message sent by an in-place slave management unit corresponding to an in-place slave device is detected, verifying the power-on request IPMI message and, after the verification passes, returning an IPMI message agreeing to the power-on to the in-place slave management unit, so that the in-place slave management unit powers on a CPU of the in-place slave device;
wherein the power-on request IPMI message is generated by the in-place slave management unit after it detects that the power-on environment of the in-place slave device meets the preset condition.
10. The system of claim 8, wherein the master management unit is further to:
if an alarm IPMI message sent by the target slave management unit is detected, parsing the alarm IPMI message and controlling the target slave device according to the parsing result;
wherein the alarm IPMI message is generated when the target slave management unit detects that abnormal data exists in the operating environment data of the target slave device.
CN202211581139.0A 2022-12-09 2022-12-09 IPMI management method and system based on MCU Pending CN115858292A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211581139.0A CN115858292A (en) 2022-12-09 2022-12-09 IPMI management method and system based on MCU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211581139.0A CN115858292A (en) 2022-12-09 2022-12-09 IPMI management method and system based on MCU

Publications (1)

Publication Number Publication Date
CN115858292A true CN115858292A (en) 2023-03-28

Family

ID=85671618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211581139.0A Pending CN115858292A (en) 2022-12-09 2022-12-09 IPMI management method and system based on MCU

Country Status (1)

Country Link
CN (1) CN115858292A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101026480A (en) * 2006-02-20 2007-08-29 华为技术有限公司 IPMI subsystem and single chip power on method
CN101119224A (en) * 2006-08-01 2008-02-06 上海未来宽带技术及应用工程研究中心有限公司 ATCA frame based FRU debugging and testing device
CN101344807A (en) * 2007-07-13 2009-01-14 环达电脑(上海)有限公司 Fan control structure
WO2017004908A1 (en) * 2015-07-07 2017-01-12 中兴通讯股份有限公司 Communication method and apparatus for intelligent platform management interface device, and communication device
CN106484578A (en) * 2016-10-14 2017-03-08 苏州国芯科技有限公司 A kind of check system based on trusted computer hardware

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230328