CN117056114A - IPMI command processing method, device, system and electronic equipment - Google Patents

Info

Publication number
CN117056114A
Authority
CN
China
Prior art keywords
bmc
ipmi
bios
signal
interrupt signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311071917.6A
Other languages
Chinese (zh)
Inventor
仇广东
芦飞
陈鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202311071917.6A
Publication of CN117056114A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G06F11/0754 Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757 Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation, the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0793 Remedial or corrective actions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The application provides an IPMI command processing method, device, system and electronic equipment. The method comprises: acquiring state information of the BMC; judging, according to the state information of the BMC, whether the BMC is currently in a fault state; when the BMC is determined to be currently in a fault state, triggering an interrupt signal and transmitting the interrupt signal to the BIOS, so that the BIOS modifies the IPMI retry number of the BMC to a preset minimum IPMI retry number; and exiting the system management mode when the number of IPMI command retries reaches the preset minimum IPMI retry number. By detecting the fault state of the BMC and lowering the IPMI retry number to the preset minimum IPMI retry number when the BMC is determined to be faulty, the method reduces the number of times the IPMI command is resent, shortens the time the server runs in the system management mode, and improves the system performance of the server.

Description

IPMI command processing method, device, system and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to an IPMI command processing method, device, system, and electronic apparatus.
Background
The baseboard management controller (BMC) is a dedicated controller that is independent of the CPU of the server. It communicates with the CPU through the Intelligent Platform Management Interface (IPMI for short) and does not depend on the processor, BIOS or operating system of the server to work. Error information of the server's CPU is sent to the BMC through IPMI commands for recording, and displaying that information also depends on the BMC.
In the prior art, after the BMC receives an IPMI command normally, it feeds back a response signal, and the IPMI command is resent repeatedly until the response signal is received.
However, the BMC itself may fail, and a failed BMC cannot feed back the response signal, so the IPMI command is resent many times. While the IPMI command is being resent, the server stays in the system management mode; the more times the command is resent, the longer the server runs in the system management mode, which reduces the system performance of the server.
Disclosure of Invention
The application provides an IPMI command processing method, device, system and electronic equipment, aiming to overcome the defect in the prior art that the system performance of the server is reduced when the BMC fails.
The first aspect of the present application provides an IPMI command processing method, including:
acquiring state information of the BMC;
judging whether the BMC is in a fault state currently according to the state information of the BMC;
triggering an interrupt signal and transmitting the interrupt signal to a BIOS under the condition that the BMC is determined to be in a fault state currently, so that the IPMI retry number of the BMC is modified to be a preset minimum IPMI retry number based on the BIOS;
and when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries, exiting the system management mode.
Optionally, the BMC is provided with a watchdog timer, and the obtaining the state information of the BMC includes:
determining timing information of a watchdog timer of the BMC based on GPIO connected with the BMC;
and determining the state information of the BMC according to the timing information of the watchdog timer of the BMC.
Optionally, under normal working conditions the BMC performs a dog feeding operation before the watchdog timer times out, and when the watchdog timer times out the GPIO signal is triggered to be a low-level signal; the judging, according to the state information of the BMC, whether the BMC is currently in a fault state includes:
judging whether the GPIO signal is a low-level signal or not according to the state information of the BMC;
and under the condition that the GPIO signal is a low-level signal, determining that the BMC is in a fault state currently.
Optionally, the triggering an interrupt signal and transmitting the interrupt signal to the BIOS when determining that the BMC is currently in a fault state, so as to modify the IPMI retry number of the BMC to a preset minimum IPMI retry number based on the BIOS, includes:
generating a system control interrupt signal based on a correspondence between the low level signal and a BMC fault event signal under the condition that the BMC is determined to be in a fault state currently;
sending the system control interrupt signal to an operating system;
calling an ASL code processing function based on the operating system;
triggering a corresponding system management interrupt signal according to the system control interrupt signal based on the ASL code processing function;
sending the system management interrupt signal to a BIOS;
invoking a system management interrupt handling function based on the BIOS;
based on the system management interrupt processing function, modifying the IPMI retry number of the BMC to be a preset minimum IPMI retry number;
the preset minimum IPMI retry number is smaller than a default value of the IPMI retry number.
Optionally, the method further comprises:
when the BMC is determined to be changed from a fault state to a normal state, triggering a restore signal, and transmitting the restore signal to a BIOS, so that the IPMI retry number of the BMC is modified to be a default value based on the BIOS.
A second aspect of the present application provides an IPMI command processing apparatus, comprising:
the acquisition module is used for acquiring the state information of the BMC;
the judging module is used for judging whether the BMC is in a fault state currently according to the state information of the BMC;
the triggering module is used for triggering an interrupt signal and transmitting the interrupt signal to the BIOS under the condition that the BMC is determined to be in a fault state currently, so that the IPMI retry number of the BMC is modified to be a preset minimum IPMI retry number based on the BIOS;
and the processing module is used for exiting the system management mode when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries.
A third aspect of the present application provides an IPMI command processing system, comprising: BMC, CPU and BIOS;
the BMC is used for reporting state information to the CPU;
the CPU is used for judging whether the BMC is in a fault state currently according to the state information of the BMC; triggering an interrupt signal and transmitting the interrupt signal to the BIOS under the condition that the BMC is determined to be in a fault state currently;
the BIOS is used for modifying the IPMI retry number of the BMC to a preset minimum IPMI retry number after receiving the interrupt signal;
and the CPU is used for exiting the system management mode when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries.
Optionally, the BMC is provided with a watchdog timer, and the BMC and the CPU are connected through GPIO;
when the BMC is under a normal working condition, the BMC performs a dog feeding operation before the watchdog timer is overtime;
and under the condition that the watchdog timer is overtime, the BMC triggers the GPIO signal to be a low-level signal and transmits the low-level signal to the CPU.
A fourth aspect of the present application provides an electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored by the memory such that the at least one processor performs the method as described above in the first aspect and the various possible designs of the first aspect.
A fifth aspect of the application provides a computer readable storage medium having stored therein computer executable instructions which when executed by a processor implement the method as described above for the first aspect and the various possible designs of the first aspect.
The technical scheme of the application has the following advantages:
The application provides an IPMI command processing method, device, system and electronic equipment. The method comprises: acquiring state information of the BMC; judging, according to the state information of the BMC, whether the BMC is currently in a fault state; when the BMC is determined to be currently in a fault state, triggering an interrupt signal and transmitting the interrupt signal to the BIOS, so that the BIOS modifies the IPMI retry number of the BMC to a preset minimum IPMI retry number; and exiting the system management mode when the number of IPMI command retries reaches the preset minimum IPMI retry number. According to the method provided by the scheme, the fault state of the BMC is detected, and when the BMC is determined to be faulty the IPMI retry number is modified to the preset minimum IPMI retry number, so that the number of times the IPMI command is resent is reduced, the time the server runs in the system management mode is shortened, and the system performance of the server is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief description will be given below of the drawings required for the embodiments or the prior art descriptions, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings for a person having ordinary skill in the art.
FIG. 1 is a flowchart illustrating an IPMI command processing method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an IPMI command processing apparatus according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an interaction flow of an IPMI command processing system according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an IPMI command processing system according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. These drawings and the written description are not intended to limit the scope of the disclosed concept in any way, but to illustrate the inventive concept to those skilled in the art by reference to specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. In the following description of the embodiments, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
The BMC is a dedicated controller that is independent of the CPU of the server and communicates with the CPU through the IPMI interface. It does not depend on the processor, BIOS or operating system of the server to work; it is a standalone, agentless management subsystem that can start working as long as the BMC and the IPMI firmware are present. The BMC is usually a separate board installed on the server motherboard, and the server motherboard also provides support for IPMI. In operation, all IPMI functions are carried out by sending commands to the BMC using the instructions specified in the IPMI specification; the BMC receives event messages, records them in the system event log, and maintains sensor data records that describe the sensors in the system. As the management core of the server, the BMC monitors and records the states of many devices; for example, error information of the CPU can be sent to the BMC through an IPMI command for recording, and displaying it also depends on the BMC. However, the BMC itself may fail. If a system error report then triggers an SMI, all cores of the CPU enter SMM mode and the BIOS sends the error message to the BMC through IPMI; because the message cannot be delivered, sending is retried a certain number of times. The larger this number, the longer the time spent in SMM, so the system appears to hang and system performance is seriously affected.
In order to solve the above problems, the embodiments of the present application provide an IPMI command processing method, device, system and electronic equipment. The method comprises: acquiring state information of the BMC; judging, according to the state information of the BMC, whether the BMC is currently in a fault state; when the BMC is determined to be currently in a fault state, triggering an interrupt signal and transmitting the interrupt signal to the BIOS, so that the BIOS modifies the IPMI retry number of the BMC to a preset minimum IPMI retry number; and exiting the system management mode when the number of IPMI command retries reaches the preset minimum IPMI retry number. According to the method provided by the scheme, the fault state of the BMC is detected, and when the BMC is determined to be faulty the IPMI retry number is modified to the preset minimum IPMI retry number, so that the number of times the IPMI command is resent is reduced, the time the server runs in the system management mode is shortened, and the system performance of the server is improved.
The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
The embodiment of the application provides an IPMI command processing method which detects the fault state of the BMC and, when the BMC is determined to be faulty, reduces the number of IPMI command retransmissions so as to shorten the time the server runs in the system management mode. The execution body of the embodiment of the application is an electronic device, such as a server, a desktop computer, a notebook computer, a tablet computer, or another electronic device capable of performing this detection and adjustment.
FIG. 1 is a flowchart of an IPMI command processing method according to an embodiment of the present application. The method includes:
Step 101, obtaining state information of the BMC.
The BMC is provided with a watchdog timer.
Specifically, the state information of the BMC may be determined according to the feeding condition of the BMC to the watchdog timer.
Specifically, in one embodiment, the timing information of the watchdog timer of the BMC may be determined based on the GPIO connected to the BMC; and determining the state information of the BMC according to the timing information of the watchdog timer of the BMC.
Specifically, whether the watchdog timer is overtime can be determined according to timing information of the watchdog timer, and state information of the BMC can be determined according to an overtime determination result of the watchdog timer.
Step 102, judging whether the BMC is in a fault state currently according to the state information of the BMC.
When the BMC fails, for example when it crashes or hangs, it cannot communicate with external devices or perform the dog feeding operation in time, so the watchdog timer times out. The dog feeding operation resets the watchdog timer at regular intervals. Under normal working conditions the BMC feeds the dog before the watchdog timer times out; when the watchdog timer times out, the GPIO signal is triggered to be a low level signal.
Specifically, after the BMC starts normally, it pulls the GPIO high and starts the watchdog timer. The timer value can be set to be reloaded every 1 minute, which implements the dog feeding operation of the watchdog timer. If the BMC program runs normally, the counter is reset before it times out and no timeout occurs; otherwise a timeout interrupt is triggered, and the BMC pulls the GPIO low in the interrupt handler. To cover the case where the watchdog times out but the pull-down operation cannot be executed, the GPIO selected for the BMC should be one whose default level is low: when the watchdog timer times out, the BMC system is reset and the GPIO returns to its default low level, which likewise triggers the interrupt.
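The BMC-side behavior described above can be sketched as a small simulation. This is only a minimal sketch under stated assumptions, not the firmware of the application: the class name `SimulatedBmcWatchdog`, its methods and the use of plain numbers for time are hypothetical. Only the 1-minute reload interval, the pull-high at start-up and the low level after a timeout are taken from the description.

```python
GPIO_LOW = 0
GPIO_HIGH = 1


class SimulatedBmcWatchdog:
    """Toy model (hypothetical) of the BMC watchdog and its fault-indication GPIO."""

    def __init__(self, timeout_s=60.0):
        self.timeout_s = timeout_s      # 1-minute reload interval from the text
        self.last_feed_s = 0.0
        self.gpio_level = GPIO_LOW      # the selected pin defaults to low level

    def start(self, now_s):
        # After a normal start the BMC pulls the GPIO high and starts the timer.
        self.gpio_level = GPIO_HIGH
        self.last_feed_s = now_s

    def feed(self, now_s):
        # "Feeding the dog": reload the counter before it expires.
        self.last_feed_s = now_s

    def tick(self, now_s):
        # On timeout the GPIO ends up low, either through the timeout handler
        # or because a reset returns the pin to its default level.
        if now_s - self.last_feed_s >= self.timeout_s:
            self.gpio_level = GPIO_LOW
        return self.gpio_level


wd = SimulatedBmcWatchdog()
wd.start(now_s=0.0)
wd.feed(now_s=30.0)                          # a healthy BMC feeds in time
level_while_healthy = wd.tick(now_s=80.0)    # 50 s since the last feed
level_after_hang = wd.tick(now_s=95.0)       # 65 s since the last feed
```

In this sketch the level stays high as long as feeding continues and drops to low once 60 seconds pass without a feed, which is the condition the CPU side later interprets as a BMC fault.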
Specifically, in an embodiment, whether the GPIO signal is a low level signal may be determined according to the state information of the BMC; and under the condition that the GPIO signal is a low level signal, determining that the BMC is in a fault state currently.
Specifically, when the watchdog timer times out, a timeout signal may be triggered, and the GPIO is pulled down based on a preset timeout processing function, so that the GPIO signal is a low level signal. When the received state information of the BMC indicates that the GPIO signal is a low-level signal, determining that the BMC is in a fault state currently.
It should be noted that a watchdog timer (WDT) is a timer circuit. It generally has one input, the "feed dog" operation (kicking the dog or servicing the dog), and one output connected to the RST pin of the MCU. When the MCU works normally, it outputs a signal at intervals to feed the dog and clear the WDT. If the dog is not fed within the prescribed time (generally because the program has run away), the WDT times out and sends a reset signal to the MCU, so that the MCU is reset. The function of the watchdog is to prevent the program from entering an endless loop or running away. To monitor the running state of a single-chip microcomputer in real time, chips dedicated to monitoring the running state of its program have been produced, commonly known as watchdog (MAX 9) integrated circuits. Such a circuit latches a fault indication in response to the loss of an input pulse stream. The circuit may be used to monitor a fan (through its rotational speed output), an oscillating circuit, or the execution of microprocessor software.
Step 103, triggering an interrupt signal and transmitting the interrupt signal to the BIOS under the condition that the BMC is determined to be in a fault state currently, so as to modify the IPMI retry number of the BMC based on the BIOS to be a preset minimum IPMI retry number.
Specifically, when it is determined that the BMC is currently in a fault state, an interrupt signal indicating that the BMC is faulty may be triggered based on preset interrupt signal triggering logic, and the interrupt signal is transmitted to the BIOS. After receiving the interrupt signal, the BIOS updates the variable that stores the IPMI retry number, so as to modify the IPMI retry number of the BMC to the preset minimum IPMI retry number.
Step 104, when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries, exiting the system management mode.
When the CPU fails, the server enters the system management mode (SMM mode), and the BIOS sends an IPMI command to the BMC through IPMI to notify the BMC that the failure has occurred. Based on the method of the embodiment of the application, when the BMC is determined to be currently in a fault state, the IPMI command needs to be resent at most the preset minimum IPMI retry number of times. The number of IPMI command retries is thus reduced, which shortens the time the server runs in the system management mode.
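The bounded resend behavior of step 104 can be illustrated with a short sketch. It is an assumption-laden illustration rather than the BIOS code of the application: the function `send_with_retry_limit`, the callback `send_once` and the small retry limits used here are hypothetical names and values chosen for readability.

```python
from typing import Callable, Tuple


def send_with_retry_limit(send_once: Callable[[], bool], retry_limit: int) -> Tuple[bool, int]:
    """Send until acknowledged or until retry_limit attempts have been made.

    Returns (acknowledged, attempts). When acknowledged is False the caller
    stops sending and exits the system management mode.
    """
    attempts = 0
    while attempts < retry_limit:
        attempts += 1
        if send_once():
            return True, attempts
    return False, attempts


def dead_bmc() -> bool:
    # A faulty BMC never feeds back a response signal.
    return False


acked, attempts = send_with_retry_limit(dead_bmc, retry_limit=5)
```

With a BMC that never answers, the loop runs exactly `retry_limit` times and then gives up, so the time spent resending is bounded by the retry limit that is in force when the fault occurs.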
It should further be noted that a BMC failure is a serious problem for server monitoring. If a system operation error causes the CPU to enter SMM mode while the BMC is faulty, the longer the SMM runs, the greater the impact on OS performance. Cloud service providers today process data at high speed, and a delay of a few seconds can affect a large amount of data, so their requirements on OS performance and stability are extremely high. Adding a function that automatically updates the IPMI command retry number when the BMC fails shortens the SMM running time and reduces the impact on the OS.
On the basis of the foregoing embodiment, in one implementation, the triggering an interrupt signal and transmitting the interrupt signal to the BIOS when it is determined that the BMC is currently in a fault state, so as to modify the IPMI retry number of the BMC to a preset minimum IPMI retry number based on the BIOS, includes:
step 1031, in the case where it is determined that the BMC is currently in a fault state, generating a system control interrupt signal based on a correspondence between the low level signal and the BMC fault event signal;
step 1032, sending a system control interrupt signal to the operating system;
step 1033, calling ASL code processing functions based on the operating system;
step 1034, based on the ASL code processing function, triggering a corresponding system management interrupt signal according to the system control interrupt signal;
step 1035, sending a system management interrupt signal to the BIOS;
step 1036, based on the BIOS, invoking a system management interrupt handling function;
step 1037, based on the system management interrupt handling function, modifying the IPMI retry number of the BMC to a preset minimum IPMI retry number.
The preset minimum IPMI retry number is smaller than the default value of the IPMI retry number; for example, the default value is 50000 and the preset minimum IPMI retry number is 1000. During boot, the BIOS creates a variable that stores the IPMI retry number used for error reporting, with a default value of 50000, and creates an SMI handling function (system management interrupt handling function).
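Steps 1031 to 1037 form a chain of hand-offs, which the following sketch models with plain function calls. It is a simplified model under stated assumptions: the classes `Bios` and `OperatingSystem`, the function `on_bmc_fault_gpio_low` and the `trace` list are hypothetical, and real SCI and SMI delivery is done by hardware and firmware, not by Python method calls. The values 50000 and 1000 are the example values given in the description.

```python
DEFAULT_IPMI_RETRIES = 50000   # example default from the description
MIN_IPMI_RETRIES = 1000        # example preset minimum from the description


class Bios:
    def __init__(self, trace):
        self.trace = trace
        # Variable created during boot to hold the IPMI retry number.
        self.ipmi_retry_number = DEFAULT_IPMI_RETRIES

    def smi_handler(self):
        # Steps 1036-1037: the SMI handling function lowers the retry number.
        self.trace.append("SMI handler")
        self.ipmi_retry_number = MIN_IPMI_RETRIES


class OperatingSystem:
    def __init__(self, bios, trace):
        self.bios = bios
        self.trace = trace

    def on_sci(self):
        # Steps 1032-1033: the OS receives the SCI and calls the ASL handler.
        self.trace.append("SCI")
        self.asl_handler()

    def asl_handler(self):
        # Steps 1034-1035: the ASL code raises an SMI towards the BIOS.
        self.trace.append("ASL handler")
        self.bios.smi_handler()


def on_bmc_fault_gpio_low(os_):
    # Step 1031: the low-level GPIO signal maps to a BMC fault event (SCI).
    os_.on_sci()


trace = []
bios = Bios(trace)
on_bmc_fault_gpio_low(OperatingSystem(bios, trace))
```

After the call, `trace` records the order SCI, ASL handler, SMI handler, and the retry number held by the hypothetical `Bios` object has dropped from the default to the preset minimum.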
Specifically, a GPIO between the BMC and the CPU is used to indicate that the BMC has failed. In the BIOS, this GPIO needs to be configured as an input with falling-edge triggering in System Control Interrupt (SCI) mode, and a configuration register maps the GPIO signal to the BMC fault event signal (GPE). When the GPIO is pulled low, the GPE is triggered, and the operating system running on the CPU then calls and enters the ASL code processing function.
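Falling-edge triggering means that the event fires on the transition from high to low, not on every sample that reads low. The sketch below shows this on a list of sampled levels; the function name `falling_edges` and the sample sequence are hypothetical and only illustrate the idea, since real edge detection is done by the GPIO hardware.

```python
from typing import List, Sequence


def falling_edges(samples: Sequence[int]) -> List[int]:
    """Return the sample indices at which the level drops from 1 to 0."""
    return [
        i for i in range(1, len(samples))
        if samples[i - 1] == 1 and samples[i] == 0
    ]


# 1 = high (BMC healthy), 0 = low (watchdog timed out)
sci_events = falling_edges([1, 1, 0, 0, 1, 0])
```

Only the two high-to-low transitions produce an event, so a GPIO that simply stays low does not raise the interrupt again.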
It should be noted that the system control interrupt (System Control Interrupt, SCI for short) is an active-low, shareable, level interrupt. When hardware generates certain ACPI events, the SCI is used to notify the operating system (OS). The ACPI specification describes two types of events that produce SCIs: Fixed-Feature Events and GPEs (General-Purpose Events). SCIs generated by Fixed-Feature Events are typically handled by the OS's built-in driver. The embodiment of the application uses the GPE approach. A GPE is generally handled by a level-triggered method in the ACPI table, usually named _Lxx, where xx corresponds to a bit of the GPE_STS register. The configuration is therefore done with reference to the Intel PCH specification: the GPIO sets the corresponding GPE_STS bit, and when the SCI is triggered, the _Lxx function in the ASL is executed.
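The _Lxx naming can be made concrete with a small helper that derives the method name from a GPE status bit number. This is an illustrative sketch: the function `gpe_method_name` and the bit number 0x1A are hypothetical, and the actual bit assigned to the BMC fault GPIO depends on the platform configuration. The `_E` prefix for edge-triggered events follows the ACPI naming convention.

```python
def gpe_method_name(gpe_bit: int, level_triggered: bool = True) -> str:
    """Build the ACPI GPE method name for a GPE status bit (two hex digits)."""
    if not 0 <= gpe_bit <= 0xFF:
        raise ValueError("GPE bit must fit in two hexadecimal digits")
    prefix = "_L" if level_triggered else "_E"
    return f"{prefix}{gpe_bit:02X}"


name = gpe_method_name(0x1A)   # hypothetical bit number, for illustration only
```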
Further, based on the ASL code processing function, an SMI interface may be invoked to trigger the corresponding system management interrupt (SMI) signal. The system management interrupt signal is sent to the BIOS. After receiving it, the BIOS calls the system management interrupt handling function, which updates the variable of the IPMI retry number through the SMI interface, so as to modify the IPMI retry number of the BMC to the preset minimum IPMI retry number.
It should further be noted that when an error occurs in the CPU, an SMI is triggered and all cores enter SMM mode. The OS is not aware of the SMI; from the point of view of the OS, its execution is simply interrupted. The SMI handling function transfers the error information to the BMC through IPMI. Because the BMC cannot respond at this time, the BIOS tries 1000 times (the preset minimum IPMI retry number); if communication is still unsuccessful, it stops sending the IPMI command, exits SMM mode and returns to the OS. The shorter the time spent in SMM, the less noticeable it is to the OS and the smaller the influence on OS behavior.
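The effect of the lower retry number on the worst-case time spent resending can be estimated with simple arithmetic. In the sketch below, the per-attempt cost of 100 microseconds is purely a hypothetical figure chosen for illustration; the description gives no timing data. Only the retry numbers 50000 and 1000 come from the description.

```python
def worst_case_smm_seconds(retry_limit: int, per_attempt_us: int) -> float:
    """Upper bound on the time spent resending when every attempt fails."""
    return retry_limit * per_attempt_us / 1_000_000


PER_ATTEMPT_US = 100   # hypothetical cost of one failed attempt, in microseconds

default_bound = worst_case_smm_seconds(50000, PER_ATTEMPT_US)
reduced_bound = worst_case_smm_seconds(1000, PER_ATTEMPT_US)
reduction_factor = 50000 // 1000
```

Whatever the real per-attempt cost is, the bound scales linearly with the retry number, so going from 50000 to 1000 retries shortens the worst case by a factor of 50.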
Further, in an embodiment, when it is determined that the BMC transitions from the fault state to the normal state, a restore signal is triggered and transmitted to the BIOS to modify the IPMI retry number of the BMC to a default value based on the BIOS.
Specifically, after the BMC returns to normal, it pulls the GPIO high again, that is, the GPIO signal becomes a high level signal. The OS side can detect the state of the GPIO to determine whether the BMC has recovered. If so, a restore signal is triggered; after receiving the restore signal, the BIOS calls the SMI handling function again to restore the IPMI command retry number internally, so as to modify the IPMI retry number of the BMC back to the default value. At the same time, the interrupt mode is changed to falling-edge triggering.
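The lower-then-restore cycle can be sketched as a small state holder driven by sampled GPIO levels. This is a minimal sketch under stated assumptions: the class `RetryNumberPolicy` and its method are hypothetical stand-ins for the BIOS variable and the SMI handling function, and polling a list of levels stands in for the interrupt and restore signals. The values 50000 and 1000 are the example values from the description.

```python
DEFAULT_IPMI_RETRIES = 50000   # example default from the description
MIN_IPMI_RETRIES = 1000        # example preset minimum from the description


class RetryNumberPolicy:
    """Hypothetical holder for the BIOS retry-number variable."""

    def __init__(self):
        self.retry_number = DEFAULT_IPMI_RETRIES
        self.last_level = 1   # the GPIO is high while the BMC is healthy

    def on_gpio_sample(self, level: int) -> None:
        if self.last_level == 1 and level == 0:
            # Fault detected: lower the retry number.
            self.retry_number = MIN_IPMI_RETRIES
        elif self.last_level == 0 and level == 1:
            # BMC recovered: the restore signal puts the default back.
            self.retry_number = DEFAULT_IPMI_RETRIES
        self.last_level = level


policy = RetryNumberPolicy()
history = []
for level in [1, 0, 0, 1]:
    policy.on_gpio_sample(level)
    history.append(policy.retry_number)
```

The recorded history shows the retry number staying at the default while the GPIO is high, dropping to the preset minimum while it is low, and returning to the default once the GPIO goes high again.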
Among these, SMI generation generally has two approaches, one is via the external SMI# pin, just like an external interrupt. This is triggered mainly by the GPIO of an external device, e.g. a BMC. Another approach that software may use is that X86 provides a port 0xB2 to provide a channel for software to trigger SMIs, e.g., system software may perform BIOS upgrades on-line, etc. The embodiment of the application uses software. SMM is a mode that is invisible to software for performing some very important operations, one of which is error handling. Many of the RAS characteristics of X86 are done in SMM, and many of the error types of X86 will allow NMI or SMI to be configured as a report mechanism. As above, SMM is a mode that is not visible to system software, and if an SMI storm occurs, it can cause the system to become slow or get stuck. Only then is the OS and software unaware what happens, because SMM is not at all within the control range of the software
According to the IPMI command processing method provided by the embodiment of the application, state information of the BMC is acquired; whether the BMC is currently in a fault state is judged according to the state information of the BMC; when the BMC is determined to be currently in a fault state, an interrupt signal is triggered and transmitted to the BIOS, so that the BIOS modifies the IPMI retry number of the BMC to a preset minimum IPMI retry number; and when the number of IPMI command retries reaches the preset minimum IPMI retry number, the system management mode is exited. In this method, fault detection is performed on the BMC and, when the BMC is determined to be faulty, the IPMI retry number is reduced to the preset minimum IPMI retry number. This reduces the number of times the IPMI command is repeatedly sent, shortens the time the server spends in system management mode, and improves the system performance of the server.
The embodiment of the application provides an IPMI command processing device, which is used for executing the IPMI command processing method provided by the embodiment.
Fig. 2 is a schematic structural diagram of an IPMI command processing apparatus according to an embodiment of the present application. The IPMI command processing apparatus 20 includes: the device comprises an acquisition module 201, a judgment module 202, a triggering module 203 and a processing module 204.
The acquisition module is used for acquiring the state information of the BMC; the judging module is used for judging whether the BMC is currently in a fault state according to the state information of the BMC; the triggering module is used for triggering an interrupt signal and transmitting the interrupt signal to the BIOS when the BMC is determined to be currently in a fault state, so that the BIOS modifies the IPMI retry number of the BMC to a preset minimum IPMI retry number; and the processing module is used for exiting the system management mode when the number of IPMI command retries reaches the preset minimum IPMI retry number.
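The division into four modules can be illustrated with a single class whose methods correspond to modules 201 to 204. This is only a sketch: the class and method names, the `read_gpio` and `try_send` callbacks, and the default retry number passed to the constructor are all hypothetical, and a low GPIO level is assumed to indicate a BMC fault.

```python
from typing import Callable

PRESET_MIN_IPMI_RETRIES = 1000  # preset minimum retry number from the text


class IpmiCommandProcessor:
    """Hypothetical model of the apparatus; one method per module."""

    def __init__(self, read_gpio: Callable[[], int], default_retries: int):
        self.read_gpio = read_gpio          # returns 1 (high) or 0 (low)
        self.retry_limit = default_retries
        self.in_smm = False

    def acquire_state(self) -> int:
        # Acquisition module 201: obtain the BMC state information.
        return self.read_gpio()

    def is_faulty(self, state: int) -> bool:
        # Judging module 202: a low level is assumed to mean a fault.
        return state == 0

    def trigger_interrupt(self) -> None:
        # Triggering module 203: stands in for the interrupt that makes
        # the BIOS lower the retry number.
        self.retry_limit = PRESET_MIN_IPMI_RETRIES

    def process(self, try_send: Callable[[], bool]) -> int:
        # Processing module 204: retry inside SMM, then exit SMM.
        self.in_smm = True
        attempts = 0
        while attempts < self.retry_limit:
            attempts += 1
            if try_send():
                break
        self.in_smm = False
        return attempts

    def handle_error(self, try_send: Callable[[], bool]) -> int:
        if self.is_faulty(self.acquire_state()):
            self.trigger_interrupt()
        return self.process(try_send)


processor = IpmiCommandProcessor(read_gpio=lambda: 0, default_retries=50_000)
attempts_when_faulty = processor.handle_error(lambda: False)
```

Keeping the four responsibilities in separate methods matches the statement that the division is a logical one; a real implementation could place them in different components.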
With respect to the IPMI command processing apparatus in the present embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiments regarding the method, and will not be described in detail herein.
The IPMI command processing apparatus provided by the embodiments of the present application is configured to execute the IPMI command processing method provided by the foregoing embodiments, and the implementation manner and principle of the IPMI command processing apparatus are the same and are not repeated.
The embodiment of the application provides an IPMI command processing system for executing the IPMI command processing method provided by the embodiment.
Fig. 3 is a schematic diagram of an interaction flow of the IPMI command processing system according to an embodiment of the present application. The IPMI command processing system includes: BMC, CPU and BIOS.
The BMC is used for reporting state information to the CPU; the CPU is used for judging whether the BMC is in a fault state currently according to the state information of the BMC; triggering an interrupt signal and transmitting the interrupt signal to the BIOS under the condition that the BMC is determined to be in a fault state currently; the BIOS is used for modifying the IPMI retry number of the BMC to a preset minimum IPMI retry number after receiving the interrupt signal; and the CPU is used for exiting the system management mode when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries.
Specifically, in an embodiment, Fig. 4 is a schematic structural diagram of an IPMI command processing system according to an embodiment of the present application. As shown in Fig. 4, a watchdog timer is provided in the BMC, the BMC and the CPU are connected through a GPIO, and a PCH is provided in the CPU.
When the BMC is operating normally, the BMC feeds the watchdog (resets the watchdog timer) before the timer expires; when the watchdog timer expires, the BMC drives the GPIO signal to a low level and transmits the low-level signal to the CPU.
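The watchdog behavior can be sketched with a simulated clock, as below. The class `BmcWatchdog`, its methods, and the 5-second timeout are hypothetical; time is passed in explicitly so the example does not depend on a real timer, and recovery of the GPIO level is not modeled.

```python
class BmcWatchdog:
    """Hypothetical model of the BMC watchdog timer driving a GPIO."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_feed = 0.0
        self.gpio_level = 1  # 1 = high (normal), 0 = low (fault)

    def feed(self, now: float) -> None:
        # A healthy BMC calls this before the timer expires.
        self.last_feed = now

    def tick(self, now: float) -> int:
        # Called periodically; drives the GPIO low once the timer expires.
        # The level stays low afterwards because recovery is not modeled.
        if now - self.last_feed >= self.timeout_s:
            self.gpio_level = 0
        return self.gpio_level


watchdog = BmcWatchdog(timeout_s=5.0)
watchdog.feed(now=0.0)
level_while_healthy = watchdog.tick(now=3.0)   # fed recently, GPIO stays high
level_after_hang = watchdog.tick(now=6.0)      # no feed for 6 s, GPIO goes low
```

The CPU side only needs to observe the GPIO level, which is why a hung BMC can be detected without any IPMI traffic.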
The detailed manner in which the respective components perform operations in relation to the IPMI command processing system of the present embodiment has been described in detail in relation to the embodiment of the method, and will not be described in detail herein.
The IPMI command processing system provided by the embodiment of the present application is configured to execute the IPMI command processing method provided by the foregoing embodiment, and its implementation manner and principle are the same and are not repeated.
The embodiment of the application provides an electronic device for executing the IPMI command processing method provided by the embodiment.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 50 includes: at least one processor 51 and a memory 52.
The memory stores computer-executable instructions; at least one processor executes computer-executable instructions stored in the memory to cause the at least one processor to perform the IPMI command processing method as provided in the above embodiments.
The electronic device provided by the embodiment of the present application is configured to execute the IPMI command processing method provided by the foregoing embodiments; its implementation manner and principle are the same and are not repeated.
The embodiment of the application provides a computer readable storage medium, wherein computer execution instructions are stored in the computer readable storage medium, and when a processor executes the computer execution instructions, the IPMI command processing method provided by any embodiment is realized.
The storage medium containing computer executable instructions in the embodiments of the present application may be used to store the computer executable instructions of the IPMI command processing method provided in the foregoing embodiments, and the implementation manner and principle of the storage medium are the same, and are not repeated.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.
The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above. The specific working process of the above-described device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (10)

1. An IPMI command processing method, comprising:
acquiring state information of the BMC;
judging whether the BMC is in a fault state currently according to the state information of the BMC;
triggering an interrupt signal and transmitting the interrupt signal to a BIOS under the condition that the BMC is determined to be in a fault state currently, so that the IPMI retry number of the BMC is modified to be a preset minimum IPMI retry number based on the BIOS;
and when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries, exiting the system management mode.
2. The method according to claim 1, wherein the BMC is provided with a watchdog timer, and the obtaining state information of the BMC includes:
determining timing information of a watchdog timer of the BMC based on GPIO connected with the BMC;
and determining the state information of the BMC according to the timing information of the watchdog timer of the BMC.
3. The method according to claim 2, wherein when the BMC is in a normal working condition, the BMC performs a feeding operation before the watchdog timer times out, and triggers the GPIO signal to be a low level signal when the watchdog timer times out, and the determining whether the BMC is in a fault state currently according to the state information of the BMC includes:
judging whether the GPIO signal is a low-level signal or not according to the state information of the BMC;
and under the condition that the GPIO signal is a low-level signal, determining that the BMC is in a fault state currently.
4. The method of claim 3, wherein triggering an interrupt signal and transmitting the interrupt signal to a BIOS to modify the number of IPMI retries of the BMC to a preset minimum number of IPMI retries based on the BIOS if it is determined that the BMC is currently in a fault state comprises:
generating a system control interrupt signal based on a correspondence between the low level signal and a BMC fault event signal under the condition that the BMC is determined to be in a fault state currently;
sending the system control interrupt signal to an operating system;
calling an ASL code processing function based on the operating system;
triggering a corresponding system management interrupt signal according to the system control interrupt signal based on the ASL code processing function;
sending the system management interrupt signal to a BIOS;
invoking a system management interrupt handling function based on the BIOS;
based on the system management interrupt processing function, modifying the IPMI retry number of the BMC to be a preset minimum IPMI retry number;
the preset minimum IPMI retry number is smaller than a default value of the IPMI retry number.
5. The method as recited in claim 1, further comprising:
when the BMC is determined to be changed from a fault state to a normal state, triggering a restore signal, and transmitting the restore signal to a BIOS, so that the IPMI retry number of the BMC is modified to be a default value based on the BIOS.
6. An IPMI command processing apparatus, comprising:
the acquisition module is used for acquiring the state information of the BMC;
the judging module is used for judging whether the BMC is in a fault state currently according to the state information of the BMC;
the triggering module is used for triggering an interrupt signal and transmitting the interrupt signal to the BIOS under the condition that the BMC is determined to be in a fault state currently, so that the IPMI retry number of the BMC is modified to be a preset minimum IPMI retry number based on the BIOS;
and the processing module is used for exiting the system management mode when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries.
7. An IPMI command processing system, comprising: BMC, CPU and BIOS;
the BMC is used for reporting state information to the CPU;
the CPU is used for judging whether the BMC is in a fault state currently according to the state information of the BMC; triggering an interrupt signal and transmitting the interrupt signal to the BIOS under the condition that the BMC is determined to be in a fault state currently;
the BIOS is used for modifying the IPMI retry number of the BMC to a preset minimum IPMI retry number after receiving the interrupt signal;
and the CPU is used for exiting the system management mode when the number of the IPMI command retries reaches the preset minimum number of the IPMI retries.
8. The system according to claim 7, wherein the BMC is provided with a watchdog timer, and the BMC and the CPU are connected through a GPIO;
when the BMC is under a normal working condition, the BMC performs a dog feeding operation before the watchdog timer is overtime;
and under the condition that the watchdog timer is overtime, the BMC triggers the GPIO signal to be a low-level signal and transmits the low-level signal to the CPU.
9. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing computer-executable instructions stored in the memory causes the at least one processor to perform the method of any one of claims 1 to 5.
10. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor implement the method of any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311071917.6A CN117056114A (en) 2023-08-24 2023-08-24 IPMI command processing method, device, system and electronic equipment

Publications (1)

Publication Number Publication Date
CN117056114A true CN117056114A (en) 2023-11-14

Family

ID=88658724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311071917.6A Pending CN117056114A (en) 2023-08-24 2023-08-24 IPMI command processing method, device, system and electronic equipment

Country Status (1)

Country Link
CN (1) CN117056114A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination