CN116225812A - Baseboard management controller system operation method, device, equipment and storage medium - Google Patents

Info

Publication number
CN116225812A
Authority
CN
China
Prior art keywords
bmc
module
software module
management controller
controller system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310509382.XA
Other languages
Chinese (zh)
Other versions
CN116225812B (en
Inventor
张贞雷
邹晓峰
李拓
满宏涛
刘同强
周玉龙
王贤坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Original Assignee
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority to CN202310509382.XA
Publication of CN116225812A
Application granted
Publication of CN116225812B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/26 Functional testing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Hardware Redundancy (AREA)

Abstract

The application relates to the technical field of communication, and discloses a method, a device, equipment and a storage medium for operating a baseboard management controller system, wherein the method comprises the following steps: acquiring a detection instruction for performing function detection on the baseboard management controller system; if the detection instruction is to detect the BMC hardware module, detecting the validity of the BMC hardware module by utilizing a BMC software module, and if the detection result is that the BMC hardware module fails, controlling the BMC software module to take over the task of the BMC hardware module; and if the detection object of the detection instruction is the BMC software module, detecting the validity of the BMC software module by using the BMC hardware module, and if the detection result is that the BMC software module fails, controlling the BMC hardware module to take over the task of the BMC software module. The stability and the safety of the operation of the server baseboard management controller system can be improved.

Description

Baseboard management controller system operation method, device, equipment and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a method, an apparatus, a device, and a storage medium for operating a baseboard management controller system.
Background
In the related art, the baseboard management controller system runs on the server motherboard and is mainly responsible for monitoring the temperature of the host CPU, the voltage of each power rail on the motherboard, the fan speed and other information about the motherboard, so that a user can respond promptly when the motherboard has a problem. However, whether in software or in hardware, abnormalities inevitably occur while the baseboard management controller system is running. Under abnormal conditions the system runs unstably and is less safe, and catastrophic, irrecoverable losses such as a burnt motherboard or a burnt server host CPU may result.
Therefore, how to improve the stability and safety of the operation of the server baseboard management controller system is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a method, apparatus, device and storage medium for operating a baseboard management controller system, which can improve the stability and security of the operation of the baseboard management controller system of a server. The specific scheme is as follows:
A first aspect of the present application provides a baseboard management controller system operation method, including:
acquiring a detection instruction for performing function detection on the baseboard management controller system;
if the detection instruction is to detect the BMC hardware module, detecting the validity of the BMC hardware module by utilizing a BMC software module, and if the detection result is that the BMC hardware module fails, controlling the BMC software module to take over the task of the BMC hardware module;
and if the detection object of the detection instruction is the BMC software module, detecting the validity of the BMC software module by using the BMC hardware module, and if the detection result is that the BMC software module fails, controlling the BMC hardware module to take over the task of the BMC software module.
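The three steps above can be sketched as a small dispatch routine. This is only an illustrative sketch: the names `Target`, `run_detection`, `software_checks_hardware` and `hardware_checks_software` are hypothetical and do not appear in the application, and the two checks are passed in as callables so that the sketch makes no assumption about how each check is implemented.

```python
from enum import Enum
from typing import Callable


class Target(Enum):
    """Detection object named by the detection instruction (hypothetical type)."""
    HARDWARE = "hardware"
    SOFTWARE = "software"


def run_detection(target: Target,
                  software_checks_hardware: Callable[[], bool],
                  hardware_checks_software: Callable[[], bool]) -> str:
    """Return which module takes over: 'software', 'hardware' or 'none'."""
    if target is Target.HARDWARE:
        # The hardware module is the object under test, so the software
        # module performs the check and takes over if the check fails.
        return "none" if software_checks_hardware() else "software"
    if target is Target.SOFTWARE:
        # Symmetric case: hardware checks software and takes over on failure.
        return "none" if hardware_checks_software() else "hardware"
    raise ValueError(f"unknown detection target: {target!r}")
```

The module under test never judges itself; the verdict always comes from the other side, which is also the side that takes over.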
Optionally, the detecting the validity of the BMC hardware module by using a BMC software module includes:
adding simulation processing logic in the BMC software module; the simulation processing logic is used for simulating the data processing process of the BMC hardware module at a software layer;
presetting a memory space in the BMC hardware module and adding data writing logic;
and executing the data writing logic and the simulation processing logic, and detecting the validity of the BMC hardware module according to a logic execution result.
Optionally, executing the data write logic includes:
controlling a general input/output port in the BMC hardware module to simultaneously send the acquired original data to a bus interface in the BMC hardware module and the memory space; and a mapping relation exists between the writing address in the memory space and the original data written in the bus interface.
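As a rough illustration of the data writing logic, the following sketch models the general input/output port sending the same original data to a bus interface and to the memory space, with a mapping from interface to write address. The class `HardwareModel`, its fields and the address `0x1000` are assumptions made for illustration only; the real logic is implemented in hardware.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HardwareModel:
    """Toy model of the GPIO data writing logic (all names hypothetical)."""
    # Mapping relation: bus interface name -> write address in memory space.
    address_map: Dict[str, int]
    bus_input: Dict[str, bytes] = field(default_factory=dict)
    memory: Dict[int, bytes] = field(default_factory=dict)

    def gpio_send(self, interface: str, raw: bytes) -> int:
        """Send raw data to the bus interface and to memory in one step."""
        address = self.address_map[interface]
        self.bus_input[interface] = raw   # copy handed to the bus interface
        self.memory[address] = raw        # unprocessed copy kept for comparison
        return address


hw = HardwareModel(address_map={"IIC0": 0x1000})
hw.gpio_send("IIC0", b"\x2a\x07")
```

Because the memory copy bypasses the bus interface, it stays available as a reference even if the interface later corrupts or drops the data.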
Optionally, executing the simulation processing logic includes:
the BMC software module is controlled to acquire first processed data from the bus interface, and the original data is read from the memory space; the first processed data is data obtained after the bus interface processes the original data;
and controlling the BMC software module to execute the same processing mode as the bus interface on the read original data to obtain second processed data, and detecting whether the bus interface is effective or not by comparing the first processed data with the second processed data.
Optionally, the controlling the BMC software module to obtain the first processed data from the bus interface and read the original data from the memory space includes:
controlling the BMC software module to acquire the first processed data from the bus interface in a polling mode, and reading the original data from the write address of the memory space according to the mapping relation.
Optionally, the detecting whether the bus interface is valid by comparing the first processed data with the second processed data includes:
if the first processed data and the second processed data are consistent, judging that the bus interface is valid;
and if the first processed data and the second processed data are inconsistent, judging that the bus interface is invalid.
Optionally, the controlling the BMC software module to take over the task of the BMC hardware module includes:
and determining second processed data obtained by the BMC software module as output data obtained by processing the original data by the bus interface, and sending the second processed data to a remote client.
Optionally, if the detection result is that the BMC hardware module fails, the method further includes:
generating first alarm information and sending the first alarm information to the remote client.
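The comparison, takeover and alarm steps can be sketched together as follows. The function name `check_and_take_over`, the message dictionaries and the order in which the alarm and the data are sent are assumptions for illustration; the application only states that the second processed data is used as the output and that first alarm information is sent to the remote client.

```python
from typing import Callable, List


def check_and_take_over(first_processed: bytes,
                        second_processed: bytes,
                        send_to_client: Callable[[dict], None]) -> bool:
    """Compare both results, report to the remote client, return validity."""
    if first_processed == second_processed:
        # Bus interface judged valid: forward its own output.
        send_to_client({"type": "data", "payload": first_processed})
        return True
    # Bus interface judged failed: send the first alarm and let the
    # software result stand in for the interface output.
    send_to_client({"type": "alarm", "text": "bus interface failed"})
    send_to_client({"type": "data", "payload": second_processed})
    return False


sent: List[dict] = []
valid = check_and_take_over(b"\x00\x00", b"\x12\x34", sent.append)
```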
Optionally, the detecting, by using the BMC hardware module, the validity of the BMC software module includes:
running an agreed algorithm in the BMC software module to obtain a first operation result, and sending the first operation result to a software state detection module preset in the BMC hardware module;
and running the agreed algorithm in the software state detection module to obtain a second operation result, and controlling the software state detection module to detect whether the BMC software module is valid by comparing the first operation result with the second operation result.
Optionally, before running the agreed algorithm in the BMC software module to obtain the first operation result, the method further includes:
controlling the BMC software module to read a pre-stored random number from a register so as to run the agreed algorithm using the pre-stored random number; the agreed algorithm is a CRC algorithm, and the pre-stored random number is a random number stored in the register in advance for executing the CRC algorithm;
correspondingly, before running the agreed algorithm in the software state detection module to obtain the second operation result, the method further includes:
controlling the software state detection module to read the pre-stored random number from the register.
Optionally, the controlling the software state detection module to detect whether the BMC software module is valid by comparing the first operation result and the second operation result includes:
if the first operation result is consistent with the second operation result, judging that the BMC software module is valid;
and if the first operation result and the second operation result are inconsistent, judging that the BMC software module fails.
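A minimal sketch of this software check follows. The application only says that the agreed algorithm is a CRC algorithm; the choice of CRC-32 (via Python's `zlib.crc32`), the 32-bit register width and the names `agreed_algorithm`, `software_module_valid` and `REGISTER_VALUE` are assumptions for illustration.

```python
import zlib


def agreed_algorithm(random_number: int) -> int:
    """CRC over the pre-stored random number (CRC-32 is an assumed variant)."""
    return zlib.crc32(random_number.to_bytes(4, "big"))


def software_module_valid(register_value: int, first_result: int) -> bool:
    """Hardware-side check: recompute the CRC and compare both results."""
    second_result = agreed_algorithm(register_value)
    return first_result == second_result


REGISTER_VALUE = 0x5A17C0DE  # stand-in for the pre-stored random number
healthy = software_module_valid(REGISTER_VALUE, agreed_algorithm(REGISTER_VALUE))
# A corrupted first operation result (one bit flipped) must be rejected.
faulty = software_module_valid(REGISTER_VALUE, agreed_algorithm(REGISTER_VALUE) ^ 1)
```

Both sides read the same register value, so any disagreement points to the software side having computed or delivered a wrong result.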
Optionally, the method for operating the baseboard management controller system further includes:
presetting a threshold marking module in the BMC hardware module, so that the threshold marking module receives and stores the threshold range of each supervision parameter of each supervised device, as issued by the BMC software module while it is in a valid state;
correspondingly, if the detection result is that the BMC software module fails, the method further comprises the following steps:
and performing a corresponding regulation operation on each supervised device based on the threshold ranges stored in the threshold marking module.
Optionally, the performing of a corresponding regulation operation on each supervised device based on the threshold ranges stored in the threshold marking module includes:
acquiring, in real time, the real-time supervision parameter value of a target supervised device by using a self-regulating module preset in the BMC hardware module, and reading the target threshold range of the target supervision parameter of the target supervised device stored in the threshold marking module;
and judging whether the real-time supervision parameter value of the target supervised device is within the target threshold range; if not, generating second alarm information and sending the second alarm information to the server host side so that the server host side performs a corresponding regulation operation, and then judging again whether the real-time supervision parameter value of the target supervised device is within the target threshold range.
Optionally, after sending the second alarm information to the server host side, the method further includes:
and if the real-time supervision parameter value of the target supervised device is still not within the target threshold range, performing an active power-down operation on the server motherboard by using the self-regulating module.
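The escalation described above (check, alarm, re-check, power down) can be sketched as follows. The function `self_regulate`, its callable parameters and the returned status strings are hypothetical; the sketch also assumes a single re-check after the alarm, whereas the application does not say how long the hardware waits before re-checking.

```python
from typing import Callable, Tuple


def self_regulate(read_value: Callable[[], float],
                  threshold: Tuple[float, float],
                  send_second_alarm: Callable[[], None],
                  power_down: Callable[[], None]) -> str:
    """Check a supervised parameter and escalate if it stays out of range."""
    low, high = threshold
    if low <= read_value() <= high:
        return "ok"
    # Out of range: ask the server host side to regulate.
    send_second_alarm()
    if low <= read_value() <= high:
        return "recovered"
    # Still out of range after the host had its chance: power down.
    power_down()
    return "powered_down"
```

For example, with a hypothetical temperature range of 0 to 85, readings of 95 and then 97 would trigger the alarm followed by a power-down, while readings of 95 and then 70 would stop after the alarm.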
Optionally, the method for operating the baseboard management controller system further includes:
and recording the real-time supervision parameter value and the execution process of the regulation operation in a register by utilizing the self-regulation module.
Optionally, the method for operating the baseboard management controller system further includes:
A log generation module is preset in the BMC hardware module;
and reading the related information recorded by the self-regulating module from the register by utilizing the log generating module, analyzing the read related information to generate corresponding log information, and transmitting the log information to a remote client.
Optionally, the sending the log information to a remote client includes:
and framing the log information into Ethernet frames by using a network module preset in the BMC hardware module, and sending the framed log information to the remote client.
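To illustrate what framing log information into Ethernet frames involves, the sketch below builds an Ethernet II frame around a log payload. The application does not specify the frame layout, addresses or EtherType; the function `build_frame`, the use of the IEEE local experimental EtherType 0x88B5 and the example addresses are assumptions, and the frame check sequence is omitted because it is normally appended by the MAC hardware.

```python
import struct

ETHERTYPE_LOCAL_EXPERIMENTAL = 0x88B5  # assumed; reserved for local experiments
MIN_PAYLOAD = 46                       # minimum Ethernet payload in bytes
MAX_PAYLOAD = 1500                     # maximum payload of a standard frame


def build_frame(dst_mac: bytes, src_mac: bytes, payload: bytes) -> bytes:
    """Build an Ethernet II frame (without FCS) carrying one log record."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload does not fit into a single frame")
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_LOCAL_EXPERIMENTAL)
    # Short payloads are zero-padded up to the minimum payload size.
    return header + payload.ljust(MIN_PAYLOAD, b"\x00")


frame = build_frame(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01", b"temp=95")
```

Because short payloads are padded, a real log format would also need to carry its own length so that the remote client can strip the padding.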
Optionally, if the detection result is that the BMC software module fails, the method further includes:
and raising an alarm at the server host side, according to a preset alarm mode, by using an alarm module preset in the BMC hardware module.
A second aspect of the present application provides a baseboard management controller system operation device, including:
the instruction acquisition module is used for acquiring a detection instruction for performing function detection on the baseboard management controller system;
the first detection and take-over module is used for detecting the validity of the BMC hardware module by utilizing the BMC software module if the detection instruction is for detecting the BMC hardware module, and controlling the BMC software module to take over the task of the BMC hardware module if the detection result is that the BMC hardware module fails;
and the second detection and take-over module is used for detecting the validity of the BMC software module by using the BMC hardware module if the detection object of the detection instruction is the BMC software module, and controlling the BMC hardware module to take over the task of the BMC software module if the detection result is that the BMC software module fails.
A third aspect of the present application provides an electronic device comprising a processor and a memory; wherein the memory is configured to store a computer program that is loaded and executed by the processor to implement the aforementioned baseboard management controller system operation method.
A fourth aspect of the present application provides a computer readable storage medium having stored therein computer executable instructions that, when loaded and executed by a processor, implement the foregoing baseboard management controller system operation method.
In the present application, a detection instruction for performing function detection on the baseboard management controller system is first acquired; if the detection object of the detection instruction is the BMC hardware module, the validity of the BMC hardware module is detected by using the BMC software module, and if the detection result is that the BMC hardware module fails, the BMC software module is controlled to take over the task of the BMC hardware module; if the detection object of the detection instruction is the BMC software module, the validity of the BMC software module is detected by using the BMC hardware module, and if the detection result is that the BMC software module fails, the BMC hardware module is controlled to take over the task of the BMC software module. Thus, when the hardware is to be detected, the BMC software module is used to detect the validity of the BMC hardware module; when the software is to be detected, the BMC hardware module is used to detect the validity of the BMC software module; and when a hardware or software failure is detected, task takeover ensures that tasks are not interrupted, which safeguards the server motherboard and improves the stability and safety of the server baseboard management controller system.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a baseboard management controller system operation method according to an embodiment of the present application;
Fig. 2 is a flowchart of a baseboard management controller system operation method according to an embodiment of the present application;
Fig. 3 is a flowchart of a baseboard management controller system operation method according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a baseboard management controller system operation architecture according to an embodiment of the present application;
Fig. 5 is a flowchart of a baseboard management controller system operation method according to an embodiment of the present application;
Fig. 6 is a flowchart of a baseboard management controller system operation method according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a baseboard management controller system operation architecture according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a baseboard management controller system operation device according to an embodiment of the present application;
Fig. 9 is a block diagram of an electronic device for baseboard management controller system operation according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
While an existing baseboard management controller system is running, abnormalities inevitably occur in either software or hardware. Under abnormal conditions the system runs unstably and is less safe, and catastrophic, irrecoverable losses such as a burnt motherboard or a burnt server host CPU may result. To address this technical defect, the present application provides a baseboard management controller system operation scheme in which the BMC software module is used to detect the validity of the BMC hardware module when the hardware is to be detected, the BMC hardware module is used to detect the validity of the BMC software module when the software is to be detected, and task takeover ensures that tasks are not interrupted when a hardware or software failure is detected, thereby safeguarding the server motherboard and improving the stability and safety of the server baseboard management controller system.
Fig. 1 is a flowchart of an operation method of a baseboard management controller system according to an embodiment of the present application. Referring to fig. 1, the method for operating the baseboard management controller system includes:
S11: Acquiring a detection instruction for performing function detection on the baseboard management controller system.
In this embodiment, a detection instruction for performing function detection on the baseboard management controller system is first acquired. The baseboard management controller (BMC) system is mainly used to monitor and manage the state of a server (temperature, fans, host CPU operating condition and the like) and includes a BMC hardware module and a BMC software module. The BMC hardware module is integrated on the BMC chip and includes its inherent bus interfaces (IIC, UART and other interfaces) as well as the hardware modules preset in the embodiments of the present application. The bus interfaces in the BMC hardware module mainly collect information about the server motherboard and the host CPU. In a multi-socket host the number of CPUs may be 2, 4, 8 or 16, and in a blade server several server motherboards may even share one baseboard management controller system; after each bus interface collects the relevant information, the BMC CPU reads it. The BMC software module runs on the BMC CPU and is used to generate alarms and logs from the motherboard information that has been read, so as to notify the user to deal with the problem in time; some BMC software modules can even perform emergency exception handling automatically.
S12: if the detection instruction is to detect the BMC hardware module, detecting the validity of the BMC hardware module by utilizing a BMC software module, and if the detection result is that the BMC hardware module fails, controlling the BMC software module to take over the task of the BMC hardware module.
In this embodiment, if the detection instruction is to detect the BMC hardware module, the BMC software module is used to detect the validity of the BMC hardware module, and if the detection result is that the BMC hardware module fails, the BMC software module is controlled to take over the task of the BMC hardware module. In other words, the BMC software is optimized so that, when some hardware interfaces of the BMC hardware module fail, it takes over part of the functions of the BMC hardware module and can still handle motherboard abnormalities, which improves the stability and safety of the server motherboard. Specifically, this includes the following steps (Fig. 2):
S121: Adding simulation processing logic in the BMC software module; the simulation processing logic is logic for simulating the data processing process of the BMC hardware module at a software layer.
S122: Presetting a memory space in the BMC hardware module and adding data writing logic.
S123: Executing the data writing logic and the simulation processing logic, and detecting the validity of the BMC hardware module according to a logic execution result.
In this embodiment, corresponding logic needs to be added to both the BMC software module and the BMC hardware module. Specifically, simulation processing logic is added to the BMC software module, where the simulation processing logic simulates the data processing process of the BMC hardware module at a software layer; at the same time, a memory space is preset in the BMC hardware module and data writing logic is added. On this basis, the data writing logic and the simulation processing logic are executed, and the validity of the BMC hardware module is detected according to the logic execution result.
It can be understood that the BMC hardware module contains a large number of IIC/UART interfaces (as many as 10 may be required). Concentrating so many IIC, UART and similar interfaces in the BMC chip layout introduces considerable risk in back-end synthesis constraints, placement and routing, packaging and manufacturing. Once an interface is physically damaged, key information about the server cannot be delivered to the user side, which greatly increases the probability of damage to the server motherboard and the host CPU and can cause irrecoverable losses with extremely serious consequences. It is therefore necessary to detect the validity of the BMC hardware module.
S13: If the detection object of the detection instruction is the BMC software module, detecting the validity of the BMC software module by using the BMC hardware module, and if the detection result is that the BMC software module fails, controlling the BMC hardware module to take over the task of the BMC software module.
In this embodiment, if the detection object of the detection instruction is the BMC software module, the BMC hardware module is used to detect the validity of the BMC software module, and if the detection result is that the BMC software module fails, the BMC hardware module is controlled to take over the task of the BMC software module. In other words, the BMC hardware module is optimized so that, when the BMC software fails, it takes over part of the software functions and fully takes over the monitoring, processing and storage of information from the key monitoring points of the server motherboard, which improves the stability and safety of the server motherboard.
It can be understood that, during use, the BMC software may run abnormally and become unable to read motherboard information for reasons such as out-of-range values or out-of-bounds memory accesses, reading memory or data too quickly, the system failing to respond in time, or strong external interference. In addition, during normal operation a server often needs its BMC software upgraded, because versions are iterated frequently to fix bugs in earlier versions, add new software functions and so on. During a BMC software upgrade the server is not allowed to power down: the new version of the BMC software is written to the flash of the BMC chip by remote upgrade, the BMC chip is then restarted, the relevant flash content is read, and the BMC software is copied to DDR so that the new version of the BMC software system can run. During this remote upgrade there is a gap in monitoring and management: because of the restart, abnormalities on the server motherboard cannot be handled by the BMC software and are not visible to the user, so critical problems such as an abnormal rise in motherboard voltage that burns the motherboard within a short time may occur. It is therefore also necessary to detect the validity of the BMC software module.
Therefore, in the embodiment of the present application, a detection instruction for performing function detection on the baseboard management controller system is first acquired; if the detection instruction is to detect the BMC hardware module, the validity of the BMC hardware module is detected by using the BMC software module, and if the detection result is that the BMC hardware module fails, the BMC software module is controlled to take over the task of the BMC hardware module; if the detection object of the detection instruction is the BMC software module, the validity of the BMC software module is detected by using the BMC hardware module, and if the detection result is that the BMC software module fails, the BMC hardware module is controlled to take over the task of the BMC software module. In the embodiment of the present application, the BMC software module is used to detect the validity of the BMC hardware module when the hardware is to be detected, the BMC hardware module is used to detect the validity of the BMC software module when the software is to be detected, and task takeover ensures that tasks are not interrupted when a hardware or software failure is detected, thereby safeguarding the server motherboard and improving the stability and safety of the server baseboard management controller system.
Fig. 3 is a flowchart of a specific operation method of a baseboard management controller system according to an embodiment of the present application. Referring to fig. 3, the baseboard management controller system operation method includes:
S21: Acquiring a detection instruction for performing function detection on the baseboard management controller system.
In this embodiment, for the specific process of step S21, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
S22: controlling a general input/output port in the BMC hardware module to simultaneously send the acquired original data to a bus interface and a memory space in the BMC hardware module; and a mapping relation exists between the writing address in the memory space and the original data written in the bus interface.
In this embodiment, the data writing logic is executed mainly by controlling the general-purpose input/output (GPIO) port in the BMC hardware module to send the collected original data simultaneously to the bus interface and to the memory space in the BMC hardware module, where a mapping relation exists between the write address in the memory space and the original data written to the bus interface. The memory space may be DDR (Double Data Rate) memory, which is not limited in this embodiment; the related architecture is shown in Fig. 4.
S23: Controlling the BMC software module to acquire first processed data from the bus interface and to read the original data from the memory space; the first processed data is the data obtained after the bus interface processes the original data.
S24: Controlling the BMC software module to apply the same processing as the bus interface to the read original data to obtain second processed data, and detecting whether the bus interface is valid by comparing the first processed data with the second processed data.
In this embodiment, when the simulation processing logic is executed, the BMC software module is controlled to acquire the first processed data from the bus interface by polling, and to read the original data from the write address of the memory space according to the mapping relation. The BMC software module is then controlled to apply the same processing as the bus interface to the read original data to obtain second processed data, and whether the bus interface is valid is detected by comparing the first processed data with the second processed data. If the two are consistent, the bus interface is judged to be valid; if they are inconsistent, the bus interface is judged to have failed.
Specifically, the BMC software module is modified so that it polls the relevant hardware interface (such as IIC0) and the corresponding DDR address, and performs the data comparison at the software layer. The BMC hardware is modified at the same time: the relevant GPIO port not only transmits data to the relevant hardware interface module, but also has DDR write logic added, so that the same data is written to the agreed DDR address. Functions that simulate the processing of hardware interfaces such as IIC are added to the BMC software, for example performing the same parsing on the original data. The BMC software polls the data obtained from the hardware interface module and the original data read from the agreed DDR address, applies the corresponding processing to the original data, and compares the two to judge whether the relevant hardware interface has failed; any such failure is ultimately attributed to failure of the BMC hardware module.
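As a rough illustration of the dual write and the software-side comparison, the sketch below models the DDR memory space as a dictionary and the bus interface as a function. The address `0x1000`, the names `ADDRESS_MAP`, `parse_frame`, `gpio_write` and `check_interface`, and the choice of a big-endian 16-bit parse as the shared processing are all assumptions made for illustration; the application does not specify them.

```python
from typing import Callable, Dict, Tuple

# Stand-in for the memory space (for example DDR) inside the BMC hardware module.
DDR: Dict[int, bytes] = {}

# Hypothetical mapping from a bus interface to its agreed DDR write address.
ADDRESS_MAP: Dict[str, int] = {"IIC0": 0x1000}


def parse_frame(raw: bytes) -> int:
    """Assumed processing shared by the bus interface and its software simulation:
    interpret the first two bytes as a big-endian unsigned value."""
    return int.from_bytes(raw[:2], "big")


def gpio_write(interface: str, raw: bytes, hw_process: Callable[[bytes], int]) -> int:
    """Data writing logic: send the raw data to the bus interface and to DDR."""
    DDR[ADDRESS_MAP[interface]] = raw
    return hw_process(raw)  # first processed data, produced by the interface


def check_interface(interface: str, first_processed: int) -> Tuple[bool, int]:
    """Simulation processing logic: re-process the raw data in software and compare."""
    raw = DDR[ADDRESS_MAP[interface]]
    second_processed = parse_frame(raw)
    return first_processed == second_processed, second_processed
```

A healthy interface returns the same value as the software simulation, so `check_interface` reports it as valid; an interface that corrupts the data is reported as failed, and the second value remains available for take-over.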
S25: If the bus interface is detected to be invalid, determining the second processed data obtained by the BMC software module as the output data obtained by processing the original data by the bus interface, and sending the second processed data to a remote client.
In this embodiment, if the polled data of an interface such as IIC is consistent with the processed DDR data, the corresponding interface hardware is judged to be normal. If the data are inconsistent, the corresponding interface hardware is judged to have failed, and the BMC software module needs to take over the corresponding function: the second processed data obtained by the BMC software module is used as the output data that the bus interface would have produced from the original data, and is sent to the remote client.
In this embodiment, first alarm information may also be generated at the same time and sent to the remote client. The alarm promptly informs the user that a certain interface is abnormal and that the BMC chip needs to be replaced. In this way, even when some hardware interfaces of the BMC chip have failed, the user side can still obtain the information of the key monitoring points of the server mainboard and carry out the relevant processing, which guarantees the stability and safety of the server baseboard management controller system.
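The take-over decision and the first alarm can be sketched as a single selection step. The function name `select_output` and the alarm wording are hypothetical; the sketch only assumes that the two processed values have already been obtained as described above.

```python
from typing import Optional, Tuple


def select_output(interface: str, first: int, second: int) -> Tuple[int, Optional[str]]:
    """Return the value to send to the remote client and an optional first alarm."""
    if first == second:
        # Interface hardware is judged normal: forward its own output, no alarm.
        return first, None
    # Interface hardware is judged failed: the software result takes over.
    alarm = f"{interface} abnormal: BMC software has taken over, replace the BMC chip"
    return second, alarm
```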
Fig. 5 is a flowchart of a specific operation method of a baseboard management controller system according to an embodiment of the present application. Referring to fig. 5, the baseboard management controller system operation method includes:
S31: Acquiring a detection instruction for performing function detection on the baseboard management controller system.
In this embodiment, regarding the specific process of step S31, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
S32: Running an agreed algorithm in the BMC software module to obtain a first operation result, and sending the first operation result to a software state detection module preset in the BMC hardware module.
S33: Running the agreed algorithm in the software state detection module to obtain a second operation result, and controlling the software state detection module to detect whether the BMC software module is valid by comparing the first operation result with the second operation result.
In this embodiment, the architecture of the BMC hardware module is modified by adding a software state detection module (SOFT_STATE_CHECK), which is responsible for running an agreed algorithm; the validity of the BMC software module is detected by means of this agreed algorithm. Specifically, the agreed algorithm is run in the BMC software module to obtain a first operation result, which is sent to the software state detection module preset in the BMC hardware module. At the same time, the agreed algorithm is run in the software state detection module to obtain a second operation result, and the software state detection module detects whether the BMC software module is valid by comparing the first operation result with the second operation result. If the two results are consistent, the BMC software module is judged to be valid; if they are inconsistent, the BMC software module is judged to have failed.
In this embodiment, when the agreed algorithm is a CRC algorithm, the software state detection module also needs to generate a random number and store it in a register. Before the agreed algorithm is run, the BMC software module is therefore controlled to read the pre-stored random number from the register, and the software state detection module is likewise controlled to read the pre-stored random number from the register; both then run the agreed algorithm on this random number. That is, the BMC software module and the software state detection module perform the CRC calculation at the same time, and when the BMC software module finishes, it writes its result back to the software state detection module, which compares the two results. If the results are inconsistent, or no calculated value is written back within the agreed time, the BMC software is judged to have failed; if the results are consistent, the BMC software is judged to be functioning normally. This check reveals whether the BMC software has run away or is in the monitoring-management blind period of a restart, and both cases are uniformly treated as BMC software failure. Notably, the interval between successive comparisons by the software state detection module can be set by the user according to the specific application scenario; for example, a shorter interval can be set while the BMC software is being upgraded, so as to ensure the stability and safety of the baseboard management controller system.
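The challenge, write-back and timeout logic can be sketched as follows. The application only says that the agreed algorithm may be a CRC algorithm; the use of CRC-32 via Python's `zlib.crc32`, the 4-byte random number, and the names `new_challenge`, `agreed_algorithm` and `software_is_valid` are assumptions for illustration.

```python
import secrets
import zlib
from typing import Optional


def new_challenge(size: int = 4) -> bytes:
    """Random number the software state detection module stores in the register."""
    return secrets.token_bytes(size)


def agreed_algorithm(seed: bytes) -> int:
    """Agreed algorithm run on both sides; CRC-32 is an assumed choice."""
    return zlib.crc32(seed) & 0xFFFFFFFF


def software_is_valid(
    seed: bytes,
    written_back: Optional[int],  # result written back by BMC software, None if absent
    elapsed_s: float,             # time taken for the write-back
    deadline_s: float,            # agreed time limit
) -> bool:
    """Comparison performed by the software state detection module."""
    if written_back is None or elapsed_s > deadline_s:
        return False  # nothing written back within the agreed time
    return written_back == agreed_algorithm(seed)
```

Drawing a fresh random number for each comparison means a hung or restarting software module cannot pass the check by leaving a stale result in place.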
S34: If the detection result is that the BMC software module has failed, executing a corresponding regulation operation on each supervised device based on the threshold parameters stored in the threshold flag module.
In this embodiment, if the detection result is that the BMC software module has failed, a corresponding regulation operation is performed on each supervised device based on the threshold parameters stored in the threshold flag module. To this end, a threshold flag module (THRESHOLD_MARK) and a self-regulating module (SELF_ADJUST) are also added when the BMC hardware module architecture is modified.
The threshold flag module preset in the BMC hardware module receives and stores the threshold range of the supervision parameter of each supervised device, as issued by the BMC software module while it is in the valid state. That is, the threshold flag module receives the normal threshold range of each monitoring management interface (supervised device) of the BMC chip, issued by the BMC software during normal operation. For example, the IIC0 interface is responsible for monitoring the temperature of the host CPU; the higher-temperature threshold set by the user is 80 degrees, and the very dangerous threshold is 90 degrees. As another example, IIC3 monitors the voltage of the PSU (power supply unit) module, where 5 V is the normal value, 6 V is the higher threshold, and 7 V is the very dangerous threshold. When the BMC software fails, the thresholds stored by this module are used to manage each monitoring management interface autonomously.
The self-regulating module is responsible for regulating the key monitoring points of the server mainboard when the BMC software has failed; this embodiment mainly handles temperature and voltage monitoring points. The regulation operation mainly includes the following steps (see Fig. 6):
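One way to picture the threshold flag module is as a small table keyed by monitoring interface. In the sketch below the class names `ThresholdRange` and `ThresholdMark`, and the decision to treat a value equal to a threshold as already exceeding it, are assumptions; the numeric thresholds are the examples given in the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ThresholdRange:
    high: float    # "higher" threshold
    danger: float  # "very dangerous" threshold

    def classify(self, value: float) -> str:
        """Classify a reading; boundary values count as exceeded (assumption)."""
        if value >= self.danger:
            return "danger"
        if value >= self.high:
            return "high"
        return "normal"


class ThresholdMark:
    """Stand-in for the threshold flag module: filled while BMC software is valid."""

    def __init__(self) -> None:
        self._ranges: Dict[str, ThresholdRange] = {}

    def store(self, interface: str, limits: ThresholdRange) -> None:
        self._ranges[interface] = limits

    def lookup(self, interface: str) -> ThresholdRange:
        return self._ranges[interface]
```

With the example values, `ThresholdRange(high=80, danger=90)` for IIC0 classifies a reading of 85 as `"high"`.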
S341: Collecting the real-time supervision parameter value of the target supervised device in real time by using the self-regulating module preset in the BMC hardware module, and reading the target threshold range of the target supervision parameter of the target supervised device stored in the threshold flag module.
S342: Judging whether the real-time supervision parameter value of the target supervised device is within the target threshold range; if not, generating second alarm information and sending it to the server host side so that the server host side executes a corresponding regulation operation, and then judging again whether the real-time supervision parameter value of the target supervised device is within the target threshold range.
S343: If the real-time supervision parameter value of the target supervised device is still not within the target threshold range, executing an active power-down operation on the server mainboard by using the self-regulating module.
In this embodiment, the self-regulating module preset in the BMC hardware module collects the real-time supervision parameter value of the target supervised device in real time and reads the target threshold range of the target supervision parameter of that device from the threshold flag module. On this basis, it judges whether the real-time value is within the target threshold range. If not, second alarm information is generated and sent to the server host side so that the server host side executes a corresponding regulation operation, after which it is judged again whether the real-time value is within the target threshold range. If the value is still not within the target threshold range, the self-regulating module executes an active power-down operation on the server mainboard.
Taking temperature monitoring of the server host CPUs as an example, a server may currently contain 1, 2, 4, 6, 8, 16 or another number of CPUs. The self-regulating module first needs to know the positional relationship between the CPUs and the corresponding fans, that is, which fan's rotation speed needs to be adjusted when the temperature of a given CPU is abnormal. When the BMC software fails, the SELF_ADJUST module is responsible for monitoring the information of the corresponding monitoring points. For temperature information, it acts according to the threshold range configured in the THRESHOLD_MARK module and the positional relationship between the CPUs and the fans: when the temperature is within the normal range, no measures are taken. When the temperature of a CPU is in the higher-temperature range, the rotation speed of the corresponding fan is adjusted automatically, and the temperature abnormality information is transmitted to the software of the server host through the LPC interface used for interaction between the BMC chip and the host. The host software is modified accordingly so that, on receiving the temperature abnormality warning from the BMC chip, it takes measures such as reducing the CPU load. The SELF_ADJUST module continues to monitor in real time, and if the CPU temperature keeps rising into the dangerous threshold range, the mainboard is actively powered down to ensure the safety of the CPU. Similarly, when the monitored voltage information is abnormal, the corresponding voltage control circuit is controlled to reduce the voltage, and if the voltage keeps rising, the mainboard is actively powered down to ensure the safety of the server mainboard.
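The temperature example can be sketched as a simple decision per sample. The CPU-to-fan table, the action strings, and the function names `regulate_step` and `run` are hypothetical; the thresholds of 80 and 90 are the example values from the text, and the real module would act on hardware rather than return strings.

```python
from typing import Dict, Iterable, List

# Hypothetical positional relationship between CPUs and their fans.
CPU_TO_FAN: Dict[str, str] = {"CPU0": "FAN0", "CPU1": "FAN1"}

HIGH = 80.0    # higher-temperature threshold from the example
DANGER = 90.0  # very dangerous threshold from the example


def regulate_step(cpu: str, temperature: float) -> List[str]:
    """Actions the self-regulating module takes for one temperature sample."""
    if temperature >= DANGER:
        return ["power_down_mainboard"]
    if temperature >= HIGH:
        return [
            f"raise_fan_speed:{CPU_TO_FAN[cpu]}",  # adjust the fan mapped to this CPU
            f"notify_host_over_lpc:{cpu}",         # second alarm to the host software
        ]
    return []  # normal range: no measures are taken


def run(cpu: str, samples: Iterable[float]) -> List[str]:
    """Monitor successive samples and stop once the mainboard is powered down."""
    log: List[str] = []
    for temperature in samples:
        actions = regulate_step(cpu, temperature)
        log.extend(actions)
        if "power_down_mainboard" in actions:
            break
    return log
```

The voltage case follows the same pattern, with the fan adjustment replaced by a command to the voltage control circuit.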
S344: Recording the real-time supervision parameter value and the execution process of the regulation operation in a register by using the self-regulating module.
S345: Presetting a log generation module in the BMC hardware module.
S346: Reading the related information recorded by the self-regulating module from the register by using the log generation module, parsing the read information to generate corresponding log information, and sending the log information to a remote client.
In this embodiment, the self-regulating module may further record the real-time supervision parameter value and the execution process of the regulation operation in registers, and a log generation module may be preset in the BMC hardware module. Each of the steps above should be recorded, which also requires a corresponding modification of the BMC software. For example, setting bits [1:0] of register A to 1 indicates that the temperature of CPU0 has reached the dangerous threshold, the specific temperature value is recorded in register B, and setting bit [2] of register A to 1 indicates that the mainboard was powered down because the temperature of CPU0 was too high.
In this embodiment, a log generation module (RESULT_STORE), a network module (EMAC_MDY) and an alarm module (LOCAL_WARN) are further added to the BMC hardware module; the overall architecture is shown in Fig. 7. The log generation module reads the related information recorded by the self-regulating module from the registers, parses it to generate corresponding log information, and sends the log information to a remote client. The RESULT_STORE module records the operations and processing carried out by the SELF_ADJUST module, which requires a corresponding modification of the BMC software: the software needs to know the specific meaning of the values stored in the registers. After the BMC software is updated, the RESULT_STORE module first reads the register values and parses them to form log information, which is displayed on the BMC WEB page. The user thereby learns of the abnormal conditions that occurred on the server mainboard during the BMC software failure and of the corresponding processing, which makes it easier to analyze and handle the cause of the abnormality and to prevent it from recurring. The chip pins are connected to the mainboard detection points to acquire mainboard information, which is what allows the self-regulating module to regulate the key monitoring points of the server mainboard when the BMC software has failed.
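The register example can be made concrete with a small encode/decode pair. The bit layout follows the example in the text (bits [1:0] of register A equal to 1 for the dangerous-temperature event, bit 2 of register A for the power-down, register B holding the temperature); the function names and log wording are hypothetical.

```python
from typing import List


def encode_register_a(cpu0_danger: bool, powered_down: bool) -> int:
    """Value the self-regulating module would record in register A."""
    value = 0
    if cpu0_danger:
        value |= 0b01    # bits [1:0] = 1: CPU0 temperature reached the dangerous threshold
    if powered_down:
        value |= 1 << 2  # bit 2 = 1: mainboard powered down due to CPU0 temperature
    return value


def decode_log(register_a: int, register_b: int) -> List[str]:
    """Parsing step of the log generation module: registers to log lines."""
    entries: List[str] = []
    if register_a & 0b11 == 1:
        entries.append(f"CPU0 temperature reached the dangerous threshold (value {register_b})")
    if (register_a >> 2) & 1:
        entries.append("Mainboard powered down because CPU0 temperature was too high")
    return entries
```

Because the meaning of each bit is fixed by agreement, the same table has to be known to whatever software later displays the log.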
In this embodiment, when the BMC software is abnormal, the network module preset in the BMC hardware module assembles the log information into Ethernet frames and sends the framed log information to the remote client. The EMAC_MDY module is a modification of a conventional EMAC (Ethernet media access controller) network module: it adds the ability for the EMAC hardware to assemble Ethernet frames when the BMC software is abnormal and to send the abnormality information to the remote client without passing through the BMC software layer (that is, the corresponding EMAC driver). The software of the remote client also needs to be modified accordingly: after parsing this specific network data packet, it needs to pop up warning information promptly on the BMC WEB interface of the remote client to remind the user to handle it.
In this embodiment, the alarm module preset in the BMC hardware module raises an alarm at the server host side in a preset alarm mode. The LOCAL_WARN module provides a local alarm when the BMC software is abnormal. Because the BMC software is abnormal, the user may not learn of the abnormal situation in time; the buzzer or indicator light provided in the LOCAL_WARN module then alerts the local user promptly, so that the user learns of the abnormality at the earliest opportunity.
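To show what assembling an Ethernet frame without a software driver involves, the sketch below builds an Ethernet II frame by hand. The application does not specify the frame contents: the EtherType 0x88B5 (an IEEE 802 local experimental value), the 2-byte length prefix in front of the log text, and the function name `build_frame` are assumptions, and the frame check sequence is left out on the assumption that the MAC hardware appends it.

```python
import struct

# Assumed EtherType: IEEE 802 "local experimental" value, not taken from the application.
ETHERTYPE_LOCAL_EXPERIMENTAL = 0x88B5

MIN_PAYLOAD = 46    # minimum Ethernet payload in bytes
MAX_PAYLOAD = 1500  # maximum Ethernet payload in bytes


def build_frame(dst_mac: bytes, src_mac: bytes, log_text: bytes) -> bytes:
    """Assemble destination MAC, source MAC, EtherType and a padded payload."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    # A length prefix lets the receiver strip the padding again (assumed format).
    body = struct.pack("!H", len(log_text)) + log_text
    if len(body) > MAX_PAYLOAD:
        raise ValueError("log information too long for one frame")
    body = body.ljust(MIN_PAYLOAD, b"\x00")
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_LOCAL_EXPERIMENTAL)
    return header + body
```

A remote client that recognises this EtherType can read the length prefix, recover the log text, and raise the warning on its side.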
Referring to fig. 8, the embodiment of the application further correspondingly discloses a running device of the baseboard management controller system, which includes:
an instruction acquisition module 11, configured to acquire a detection instruction for performing function detection on the baseboard management controller system;
the first detecting and taking over module 12 is configured to detect, if the detecting instruction is to detect a BMC hardware module, the validity of the BMC hardware module by using a BMC software module, and if the detecting result is that the BMC hardware module fails, control the BMC software module to take over a task of the BMC hardware module;
and the second detecting and taking over module 13 is configured to detect the validity of the BMC software module by using the BMC hardware module if the detection object of the detection instruction is the BMC software module, and if the detection result is that the BMC software module fails, control the BMC hardware module to take over the task of the BMC software module.
Therefore, in the embodiment of the application, a detection instruction for performing function detection on the baseboard management controller system is obtained first. If the detection object of the detection instruction is the BMC hardware module, the validity of the BMC hardware module is detected by using the BMC software module, and if the detection result is that the BMC hardware module has failed, the BMC software module is controlled to take over the task of the BMC hardware module. If the detection object of the detection instruction is the BMC software module, the validity of the BMC software module is detected by using the BMC hardware module, and if the detection result is that the BMC software module has failed, the BMC hardware module is controlled to take over the task of the BMC software module. In other words, hardware is checked by software and software is checked by hardware, and when a hardware or software failure is detected, the task take-over ensures that tasks are not interrupted, which protects the server mainboard and improves the operating stability and safety of the server baseboard management controller system.
In some embodiments, the first detecting and taking over module 12 specifically includes:
the first adding sub-module is used for adding simulation processing logic in the BMC software module; the simulation processing logic is used for simulating the data processing process of the BMC hardware module at a software layer;
the second adding sub-module is used for presetting a memory space in the BMC hardware module and adding data writing logic;
and the logic execution sub-module is used for executing the data writing logic and the simulation processing logic and detecting the validity of the BMC hardware module according to a logic execution result.
and the take-over sub-module is used for determining the second processed data obtained by the BMC software module as the output data obtained by processing the original data by the bus interface, and sending the second processed data to a remote client.
In some embodiments, the logic execution submodule specifically includes:
the data sending unit is used for controlling a general input/output port in the BMC hardware module to send the acquired original data to a bus interface in the BMC hardware module and the memory space simultaneously; the writing address in the memory space has a mapping relation with the original data written in the bus interface;
The data reading unit is used for controlling the BMC software module to acquire first processed data from the bus interface and read the original data from the memory space; the first processed data is data obtained after the bus interface processes the original data;
and the comparison unit is used for controlling the BMC software module to execute the same processing mode as the bus interface on the read original data to obtain second processed data, and detecting whether the bus interface is effective or not by comparing the first processed data with the second processed data.
In some embodiments, the data reading unit is specifically configured to control the BMC software module to obtain the first processed data from the bus interface by using a polling manner, and read the original data from the write address of the memory space according to the mapping relationship.
In some embodiments, the comparing unit is specifically configured to determine that the bus interface is valid if the first processed data and the second processed data are consistent; and if the first processed data and the second processed data are inconsistent, judging that the bus interface is invalid.
In some embodiments, the second detecting and taking-over module 13 specifically includes:
the data reading sub-module is used for controlling the BMC software module to read a pre-stored random number from a register so as to operate the agreed algorithm by utilizing the pre-stored random number; the agreed algorithm is a CRC algorithm, and the pre-stored random number is a random number which is pre-stored in the register and is used for executing the CRC algorithm;
the first operation sub-module is used for running the agreed algorithm in the BMC software module to obtain a first operation result and sending the first operation result to a software state detection module preset in the BMC hardware module;
the data reading sub-module is further used for controlling the software state detection module to read the prestored random number from the register;
and the second operation sub-module is used for operating the agreed algorithm in the software state detection module to obtain a second operation result and controlling the software state detection module to detect whether the BMC software module is effective or not in a mode of comparing the first operation result with the second operation result.
In some embodiments, the second operation submodule is specifically configured to determine that the BMC software module is valid if the first operation result and the second operation result are consistent; and if the first operation result and the second operation result are inconsistent, judging that the BMC software module fails.
In some embodiments, the baseboard management controller system operation device further includes:
the parameter storage module is used for receiving and storing the threshold range of the supervision parameters of each supervised device issued by the BMC software module in the effective state by utilizing a threshold value mark module preset in the BMC hardware module;
and the regulation and control module is used for executing corresponding regulation and control operation on each monitored device based on the threshold parameter stored in the threshold marking module.
In some embodiments, the regulation module specifically includes:
the acquisition sub-module is used for collecting the real-time supervision parameter value of the target supervised device in real time by using the self-regulating module preset in the BMC hardware module and reading the target threshold range of the target supervision parameter of the target supervised device stored in the threshold flag module;
the judging sub-module is used for judging whether the real-time supervision parameter value of the target supervised device is within the target threshold range, and if not, generating second alarm information and sending it to the server host side so that the server host side executes a corresponding regulation operation, and then judging again whether the real-time supervision parameter value of the target supervised device is within the target threshold range;
the power-down sub-module is used for executing an active power-down operation on the server mainboard by using the self-regulating module if the real-time supervision parameter value of the target supervised device is still not within the target threshold range;
the recording sub-module is used for recording the real-time supervision parameter value and the execution process of the regulation operation in a register by utilizing the self-regulation module;
the generation sub-module is used for presetting a log generation module in the BMC hardware module;
and the analysis sub-module is used for reading the related information recorded by the self-regulating module from the register by utilizing the log generation module, analyzing the read related information to generate corresponding log information, and sending the log information to a remote client.
Further, the embodiment of the application also provides electronic equipment. Fig. 9 is a block diagram of an electronic device 20, according to an exemplary embodiment, and nothing in the figure should be taken as limiting the scope of use of the present application.
Fig. 9 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. The memory 22 is configured to store a computer program that is loaded and executed by the processor 21 to implement relevant steps in the baseboard management controller system operation method disclosed in any one of the foregoing embodiments.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, data 223, and the like, and the storage may be temporary storage or permanent storage.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, so that the processor 21 can operate on and process the large volume of data 223 in the memory 22; it may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that can be used to perform the baseboard management controller system operation method disclosed in any one of the foregoing embodiments and executed by the electronic device 20, the computer program 222 may further include computer programs that can be used to perform other specific tasks. The data 223 may include instruction data collected by the electronic device 20.
Further, the embodiment of the application also discloses a storage medium, and the storage medium stores a computer program, and when the computer program is loaded and executed by a processor, the method steps for operating the baseboard management controller system disclosed in any one of the previous embodiments are realized.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by referring to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The method, device, equipment and storage medium for operating a baseboard management controller system provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the description of the above examples is only intended to help in understanding the method and core ideas of the present invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present invention.

Claims (21)

1. A method of operating a baseboard management controller system, comprising:
acquiring a detection instruction for performing function detection on the baseboard management controller system;
if the detection instruction is to detect the BMC hardware module, detecting the validity of the BMC hardware module by utilizing a BMC software module, and if the detection result is that the BMC hardware module fails, controlling the BMC software module to take over the task of the BMC hardware module;
and if the detection object of the detection instruction is the BMC software module, detecting the validity of the BMC software module by using the BMC hardware module, and if the detection result is that the BMC software module fails, controlling the BMC hardware module to take over the task of the BMC software module.
2. The baseboard management controller system operation method of claim 1, wherein the detecting the validity of the BMC hardware module by the BMC software module comprises:
adding simulation processing logic in the BMC software module; the simulation processing logic is used for simulating the data processing process of the BMC hardware module at a software layer;
presetting a memory space in the BMC hardware module and adding data writing logic;
and executing the data writing logic and the simulation processing logic, and detecting the validity of the BMC hardware module according to a logic execution result.
3. The baseboard management controller system operation method of claim 2, wherein executing the data write logic comprises:
controlling a general input/output port in the BMC hardware module to simultaneously send the acquired original data to a bus interface in the BMC hardware module and the memory space; and a mapping relation exists between the writing address in the memory space and the original data written in the bus interface.
4. The baseboard management controller system operation method of claim 3, wherein executing the simulation processing logic comprises:
controlling the BMC software module to acquire first processed data from the bus interface and to read the original data from the memory space; the first processed data is data obtained after the bus interface processes the original data;
and controlling the BMC software module to execute the same processing mode as the bus interface on the read original data to obtain second processed data, and detecting whether the bus interface is valid by comparing the first processed data with the second processed data.
5. The method of claim 4, wherein controlling the BMC software module to acquire first processed data from the bus interface and to read the original data from the memory space comprises:
and controlling the BMC software module to acquire the first processed data from the bus interface in a polling mode, and reading the original data from the write-in address of the memory space according to the mapping relation.
6. The method of claim 4, wherein the detecting whether the bus interface is valid by comparing the first processed data with the second processed data comprises:
if the first processed data and the second processed data are consistent, judging that the bus interface is valid;
and if the first processed data and the second processed data are inconsistent, judging that the bus interface is invalid.
7. The baseboard management controller system operation method of claim 6, wherein controlling the BMC software module to take over tasks of the BMC hardware module comprises:
and determining the second processed data obtained by the BMC software module to be the output data that the bus interface would obtain by processing the original data, and sending the second processed data to a remote client.
8. The baseboard management controller system operation method of claim 7, further comprising, if the detection result is that the BMC hardware module fails:
generating first alarm information and sending the first alarm information to the remote client.
9. The baseboard management controller system operation method of any one of claims 1 to 8, wherein said detecting the validity of the BMC software module by the BMC hardware module comprises:
running an agreed algorithm in the BMC software module to obtain a first operation result, and sending the first operation result to a software state detection module preset in the BMC hardware module;
and running the agreed algorithm in the software state detection module to obtain a second operation result, and controlling the software state detection module to detect whether the BMC software module is valid by comparing the first operation result with the second operation result.
10. The method according to claim 9, further comprising, before running the agreed algorithm in the BMC software module to obtain the first operation result:
controlling the BMC software module to read a pre-stored random number from a register so as to run the agreed algorithm by using the pre-stored random number; the agreed algorithm is a CRC algorithm, and the pre-stored random number is a random number which is pre-stored in the register and is used for executing the CRC algorithm;
correspondingly, before running the agreed algorithm in the software state detection module to obtain the second operation result, the method further comprises:
and controlling the software state detection module to read the pre-stored random number from the register.
11. The baseboard management controller system operation method of claim 9, wherein controlling the software state detection module to detect whether the BMC software module is valid by comparing the first operation result with the second operation result comprises:
if the first operation result and the second operation result are consistent, judging that the BMC software module is valid;
and if the first operation result and the second operation result are inconsistent, judging that the BMC software module fails.
12. The baseboard management controller system operation method of claim 9, further comprising:
using a threshold mark module preset in the BMC hardware module to receive and store the threshold range of the supervision parameters of each supervised device, as issued by the BMC software module while it is in a valid state;
correspondingly, the controlling the BMC hardware module to take over the task of the BMC software module further includes:
and executing a corresponding regulation operation on each supervised device based on the threshold parameters stored in the threshold mark module.
13. The baseboard management controller system operation method of claim 12, wherein the executing a corresponding regulation operation on each supervised device based on the threshold parameters stored in the threshold mark module comprises:
acquiring, in real time, the real-time supervision parameter value of a target supervised device by using a self-regulating module preset in the BMC hardware module, and reading the target threshold range of the target supervision parameter of the target supervised device stored in the threshold mark module;
and judging whether the real-time supervision parameter value of the target supervised device is within the target threshold range, and if not, generating second alarm information and sending the second alarm information to the server host side so that the server host side executes a corresponding regulation operation, and then judging again whether the real-time supervision parameter value of the target supervised device is within the target threshold range.
14. The baseboard management controller system operation method of claim 13, further comprising, after the second alarm information is sent to the server host side:
and if the real-time supervision parameter value of the target supervised device is still not within the target threshold range, executing an active power-down operation on the server motherboard by using the self-regulating module.
15. The baseboard management controller system operation method of claim 14, further comprising:
and recording the real-time supervision parameter value and the execution process of the regulation operation in a register by using the self-regulating module.
16. The baseboard management controller system operation method of claim 15, further comprising:
presetting a log generation module in the BMC hardware module;
and reading the related information recorded by the self-regulating module from the register by using the log generation module, analyzing the read related information to generate corresponding log information, and sending the log information to a remote client.
17. The baseboard management controller system operation method of claim 16, wherein said sending the log information to a remote client comprises:
and encapsulating the log information into Ethernet frames by using a network module preset in the BMC hardware module, and sending the framed log information to the remote client.
18. The method for operating a baseboard management controller system according to claim 9, further comprising, if the detection result is that the BMC software module fails:
and raising an alarm on the server host side, in a preset alarm mode, by using an alarm module preset in the BMC hardware module.
19. A baseboard management controller system operation device, comprising:
an instruction acquisition module, configured to acquire a detection instruction for performing function detection on the baseboard management controller system;
a first detection and take-over module, configured to detect the validity of the BMC hardware module by using the BMC software module if the detection instruction is to detect the BMC hardware module, and to control the BMC software module to take over the task of the BMC hardware module if the detection result is that the BMC hardware module fails;
and a second detection and take-over module, configured to detect the validity of the BMC software module by using the BMC hardware module if the detection object of the detection instruction is the BMC software module, and to control the BMC hardware module to take over the task of the BMC software module if the detection result is that the BMC software module fails.
20. An electronic device comprising a processor and a memory; wherein the memory is for storing a computer program to be loaded and executed by the processor to implement the baseboard management controller system operation method of any one of claims 1 to 18.
21. A computer readable storage medium storing computer executable instructions which when loaded and executed by a processor implement the baseboard management controller system operation method of any one of claims 1 to 18.
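
The mutual detection recited in claims 1, 4, 6 and 9 to 11 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `bus_transform` is a hypothetical stand-in for whatever processing the bus interface performs, `zlib.crc32` stands in for the CRC variant (the claims only say "a CRC algorithm"), and all function and parameter names are invented for the example.

```python
import zlib


def bus_transform(raw: bytes) -> bytes:
    """Hypothetical stand-in for the bus interface's processing (assumption)."""
    return raw[::-1]


def hardware_is_valid(raw: bytes, first_processed: bytes) -> bool:
    """Software checks hardware (claims 4 and 6).

    `raw` is the original data read back from the preset memory space and
    `first_processed` is what the bus interface actually produced. The
    software layer repeats the same processing and compares the results.
    """
    second_processed = bus_transform(raw)
    return first_processed == second_processed


def software_is_valid(seed: bytes, first_result: int) -> bool:
    """Hardware checks software (claims 9 to 11).

    `seed` is the pre-stored random number both sides read from the
    register, and `first_result` is the CRC reported by the software
    module. The detector recomputes the CRC and compares.
    """
    second_result = zlib.crc32(seed)
    return first_result == second_result


def handle_detection(target: str, *, raw: bytes, first_processed: bytes,
                     seed: bytes, first_result: int) -> str:
    """Dispatch on the detection object (claim 1).

    Returns the name of the module that carries the checked module's
    tasks afterwards: the module itself if valid, its peer if it failed.
    """
    if target == "hardware":
        ok = hardware_is_valid(raw, first_processed)
        return "hardware" if ok else "software"
    if target == "software":
        ok = software_is_valid(seed, first_result)
        return "software" if ok else "hardware"
    raise ValueError(f"unknown detection object: {target!r}")


if __name__ == "__main__":
    raw = b"\x01\x02\x03"
    seed = b"\x5a\xa5"
    # A corrupted bus output makes the software module take over.
    print(handle_detection("hardware", raw=raw, first_processed=b"\x00",
                           seed=seed, first_result=zlib.crc32(seed)))
```

Both checks follow the same pattern: the checking side recomputes a result from shared input (the mapped memory space or the register) and treats any mismatch as a failure of the checked side, so neither module has to trust the other's self-report.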
CN202310509382.XA 2023-05-08 2023-05-08 Baseboard management controller system operation method, device, equipment and storage medium Active CN116225812B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310509382.XA CN116225812B (en) 2023-05-08 2023-05-08 Baseboard management controller system operation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310509382.XA CN116225812B (en) 2023-05-08 2023-05-08 Baseboard management controller system operation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116225812A true CN116225812A (en) 2023-06-06
CN116225812B CN116225812B (en) 2023-08-04

Family

ID=86569887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310509382.XA Active CN116225812B (en) 2023-05-08 2023-05-08 Baseboard management controller system operation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116225812B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140289570A1 (en) * 2013-03-22 2014-09-25 Insyde Software Corp. Virtual baseboard management controller
CN105760241A (en) * 2014-12-18 2016-07-13 联想(北京)有限公司 Exporting method and system for memory data
CN109976959A (en) * 2019-03-27 2019-07-05 苏州浪潮智能科技有限公司 A kind of portable device and method for server failure detection
CN110543398A (en) * 2019-07-31 2019-12-06 苏州浪潮智能科技有限公司 method and system for recording fault information
CN111447079A (en) * 2020-02-28 2020-07-24 华东计算技术研究所(中国电子科技集团公司第三十二研究所) High-availability extension system and method based on SCA framework
CN111694719A (en) * 2020-06-10 2020-09-22 腾讯科技(深圳)有限公司 Server fault processing method and device, storage medium and electronic equipment
CN112486743A (en) * 2020-10-28 2021-03-12 苏州浪潮智能科技有限公司 Interactive server intelligent fault processing system and method
CN112631863A (en) * 2020-12-22 2021-04-09 苏州浪潮智能科技有限公司 BMC health state detection method, electronic device and storage medium
CN112948157A (en) * 2021-01-29 2021-06-11 苏州浪潮智能科技有限公司 Server fault positioning method, device and system and computer readable storage medium
CN113867129A (en) * 2021-10-27 2021-12-31 珠海格力电器股份有限公司 Redundancy control method, device and system, computer equipment and storage medium
CN114138567A (en) * 2021-11-26 2022-03-04 浪潮电子信息产业股份有限公司 Substrate management control module maintenance method, device, equipment and storage medium
CN114996090A (en) * 2022-05-31 2022-09-02 济南浪潮数据技术有限公司 Server abnormity detection method and device, electronic equipment and storage medium
CN115048655A (en) * 2022-06-23 2022-09-13 苏州浪潮智能科技有限公司 Method, device, equipment and medium for checking mirror image of basic input/output system
CN115114118A (en) * 2022-07-28 2022-09-27 济南浪潮数据技术有限公司 Server monitoring method and related equipment
CN115437819A (en) * 2022-08-12 2022-12-06 苏州浪潮智能科技有限公司 Error reporting method and device for server, computer equipment and storage medium
CN115562918A (en) * 2022-09-30 2023-01-03 苏州浪潮智能科技有限公司 Computer system fault testing method and device, electronic equipment and readable medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
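Liu Mingzhong; Zhu Xiangjia; Yan Shijia; Liu Caiyun: "Server design based on domestic processors" (基于国产处理器的服务器设计), 信息通信 (Information & Communications), no. 07 *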

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117806900A (en) * 2023-07-28 2024-04-02 苏州浪潮智能科技有限公司 Server management method, device, electronic equipment and storage medium
CN117806900B (en) * 2023-07-28 2024-05-07 苏州浪潮智能科技有限公司 Server management method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN116225812B (en) 2023-08-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant