CN112905377B - Method, system and medium for avoiding BMC breakdown - Google Patents


Publication number
CN112905377B
Authority
CN
China
Prior art keywords
flag
interrupt
strategy
instruction
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110181358.9A
Other languages
Chinese (zh)
Other versions
CN112905377A (en)
Inventor
黄家安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yingxin Computer Technology Co Ltd
Original Assignee
Shandong Yingxin Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yingxin Computer Technology Co Ltd
Priority to CN202110181358.9A
Publication of CN112905377A
Application granted
Publication of CN112905377B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method for avoiding a BMC (baseboard management controller) crash, which comprises the following steps: executing a monitoring strategy to monitor the CPU utilization rate and interrupt instructions; setting a flag area, a non-flag area, an interrupt parameter threshold and a flag threshold; when the CPU utilization rate reaches the flag area, if an instruction generator generates an interrupt instruction, executing an increment strategy on the interrupt parameter and the flag corresponding to that instruction generator; and when the CPU utilization rate reaches the non-flag area, executing a decrement strategy on the flag corresponding to the instruction generator. When a server instruction generator is abnormal during factory manufacturing, the method prevents the BMC from crashing and becoming unresponsive, and thereby prevents testers from misjudging the fault.

Description

Method, system and medium for avoiding BMC breakdown
Technical Field
The present invention relates to the field of server detection technologies, and in particular, to a method, a system, and a medium for avoiding a BMC crash.
Background
When a user presses the power instruction generator to start a server, the BMC writes an entry to the system event log (SEL) recording that the power instruction generator was pressed. Likewise, when the user presses the restart instruction generator or the interrupt instruction generator, the BMC writes an SEL entry recording which one was pressed. The BMC learns that an instruction generator on the server has been pressed by detecting the high-low level change on a GPIO pin, and then uses a GPIO interrupt function to quickly process the event corresponding to the pressed instruction generator.
However, when an instruction generator is faulty, it generates many interrupts in a short time. The BMC then has to process these interrupt functions in a concentrated burst, which consumes most of the BMC's CPU capacity, so upper-layer services cannot be handled promptly and effectively.
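The level-change detection described above can be illustrated with a small simulation. This is only a sketch: real BMC firmware would register a GPIO interrupt handler through its platform driver, and the function name `count_falling_edges` and the active-low button wiring are assumptions made for illustration, not details taken from the patent.

```python
def count_falling_edges(samples):
    """Count high-to-low transitions in a sequence of sampled GPIO levels.

    Assumption (not from the patent): the button is active-low, so each
    falling edge corresponds to one press and would raise one interrupt.
    """
    edges = 0
    for previous, current in zip(samples, samples[1:]):
        if previous == 1 and current == 0:
            edges += 1
    return edges


# A healthy button: one press, one interrupt.
healthy = [1, 1, 0, 0, 0, 1, 1]
# A faulty button that chatters: many interrupts in a similar time window.
faulty = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

print(count_falling_edges(healthy))  # 1
print(count_falling_edges(faulty))   # 6
```

Every extra edge from the faulty button costs the BMC one more pass through its interrupt handler, which is the load the method sets out to limit.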
Disclosure of Invention
The invention mainly solves the problem that, when a server instruction generator is abnormal, the BMC repeatedly processes the abnormal interrupts, so that the CPU capacity available to the BMC is reduced or the BMC crashes.
In order to solve the technical problems, the invention adopts a technical scheme that: a method for avoiding BMC crash is provided, which comprises the following steps:
executing a monitoring strategy, and monitoring the CPU utilization rate and an interrupt instruction of an instruction generator;
setting a flag area, a non-flag area, an interrupt parameter threshold value and a flag threshold value;
when the CPU utilization rate reaches the flag area, if the instruction generator generates the interrupt instruction, executing an increasing strategy on the interrupt parameter and the flag corresponding to the instruction generator;
and when the CPU utilization rate reaches the non-flag area, executing a decreasing strategy on the flag corresponding to the instruction generator.
Further, the step of setting the flag area, the non-flag area, the interrupt parameter threshold value and the flag threshold value further includes: the non-flag area comprises a first threshold value interval, a second threshold value interval and a buffer interval, and the flag area comprises a first parameter interval and a second parameter interval.
Further, the increment policies include a first increment policy and a second increment policy;
when the CPU utilization rate reaches the first parameter interval, the monitoring strategy monitors whether the instruction generator generates the interrupt instruction, if so, the first increment strategy is executed on the interrupt parameter according to the interrupt instruction, when the interrupt parameter reaches the interrupt parameter threshold, the first increment strategy is executed on the flag, and when the flag is larger than the flag threshold, the emergency strategy is executed.
Further, when the CPU utilization rate reaches the second parameter interval, the monitoring strategy monitors whether the instruction generator generates the interrupt instruction, if so, the second increment strategy is executed on the interrupt parameter according to the interrupt instruction, when the interrupt parameter reaches the interrupt parameter threshold, the first increment strategy is executed on the flag, and when the flag is greater than the flag threshold, the emergency strategy is executed.
Further, the step of executing the emergency policy further comprises: and closing the interrupt processing program corresponding to the instruction generator, stopping processing the interrupt instruction of the instruction generator, storing the information of closing the interrupt processing program in a system event log, and executing alarm operation.
Further, the step of executing a decrement policy on the flag corresponding to the instruction generator when the CPU utilization reaches the non-flag region further includes: and if the CPU utilization rate reaches a first threshold interval, initializing the interrupt parameters and the flag corresponding to the instruction generator.
Further, the step of executing a decrement policy on the flag corresponding to the instruction generator when the CPU utilization reaches the non-flag region further includes:
when the CPU utilization rate reaches the second threshold interval, executing the first decreasing strategy on the flag corresponding to the instruction generator, judging whether the interrupt processing program corresponding to the instruction generator is in a closed state, if the interrupt processing program of the instruction generator is in the closed state, judging whether the flag of the instruction generator is equal to the flag threshold, if so, keeping the interrupt processing program of the instruction generator in the closed state, if not, starting the interrupt processing program of the instruction generator, and storing the information of the started interrupt processing program in the system event log.
Further, the first increment strategy is to increment the interruption parameter or the flag by taking the first parameter as a base number, and when the interruption parameter or the flag reaches the interruption parameter threshold or the flag threshold, the increment is continued from an initial value;
the second increment strategy is to increment the interruption parameter or the flag by taking the second parameter as a base number, and when the interruption parameter or the flag reaches the interruption parameter threshold or the flag threshold, the increment is continued from an initial value.
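The two increment strategies differ only in their base number, so they can be sketched with one helper, shown here for the interrupt parameter. The base values 1 and 2 and the threshold 10 come from the embodiments; the function names, and reading "reaches" as greater-than-or-equal, are assumptions.

```python
def increment_with_wrap(value, base, threshold, initial=0):
    """Add `base` to `value`; on reaching `threshold`, restart from `initial`.

    Returns (new_value, reached), where `reached` tells the caller that the
    threshold was hit on this step.
    """
    value += base
    if value >= threshold:
        return initial, True
    return value, False


def steps_until_threshold(base, threshold):
    """Count how many increments it takes to hit the threshold once."""
    value, steps = 0, 0
    reached = False
    while not reached:
        value, reached = increment_with_wrap(value, base, threshold)
        steps += 1
    return steps


print(steps_until_threshold(base=1, threshold=10))  # 10
print(steps_until_threshold(base=2, threshold=10))  # 5
```

With the second strategy the threshold is reached in half as many interrupts, which is what makes it the faster response under heavy load.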
The invention also provides a system for avoiding a BMC crash, which is applied to a server and comprises: a monitoring module, a weighting module, a strategy module and a system event module;
the monitoring module is used for monitoring the CPU utilization rate of the server and an interrupt instruction of an instruction generator of the server, and setting a flag area and a non-flag area;
the strategy module stores an increasing strategy, a decreasing strategy and an emergency strategy;
when the CPU utilization rate reaches the flag area, the weighting module executes the increment strategy, executes the interrupt parameter weighting and the flag weighting according to the interrupt instruction, and executes the emergency strategy on an interrupt processing program corresponding to the instruction generator;
when the CPU utilization rate reaches a non-flag area, the weighting module executes the decrement strategy on the flag and executes the emergency strategy on an interrupt processing program corresponding to the instruction generator;
the system event module is used for storing the information of the interrupt processing program.
The invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the method steps for avoiding a BMC crash.
The invention has the beneficial effects that:
1. The method for avoiding a BMC crash prevents the BMC from crashing and becoming unresponsive when a server instruction generator is abnormal during server manufacture in a factory, so a server tester does not mistakenly conclude that the server's BMC firmware design is at fault. The abnormal instruction generator can be located through the system event log, and server research and development staff do not need to travel to the factory to confirm the cause of the abnormality.
2. The system for avoiding a BMC crash weights the interrupt instructions of an instruction generator according to the strategies of the strategy module, closes the interrupt processing program according to the interrupt threshold once weighting is complete, reopens the interrupt processing program when the reopening condition is met, and stores a system event recording that the interrupt processing program was closed or opened, which facilitates the work of maintenance personnel.
3. The medium for avoiding a BMC crash implements the increment and decrement strategies for the interrupt parameter corresponding to an instruction generator, generates a flag when the interrupt parameter reaches its threshold, uses the flag to judge whether the instruction generator is abnormal, and closes the interrupt processing program of the abnormal instruction generator, thereby reducing the load on the CPU and ensuring its normal operation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of a method for avoiding a BMC crash according to embodiment 1 of the invention;
FIG. 2 is a flowchart of a method for avoiding a BMC crash according to embodiment 1 of the invention;
FIG. 3 is a block diagram of a method for avoiding BMC crash according to embodiment 1 of the present invention;
FIG. 4 is a schematic diagram of a system for avoiding a BMC crash according to embodiment 2 of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may be fixed, removable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; it may be an internal communication between two elements; and it may be wireless or wired. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific case.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description of the present invention, BMC (Baseboard Management Controller) is a baseboard management controller, and CPU (Central Processing Unit) is a central processing unit.
Example 1
The embodiment of the invention provides a method for avoiding a BMC crash. Referring to FIG. 1 to FIG. 3, the method includes the following steps:
When a server instruction generator is abnormal, it sends interrupts to the BMC abnormally, so the CPU load of the BMC rises instantly and the BMC crashes. To avoid this, the invention first creates a monitoring strategy that continuously monitors the CPU utilization rate of the server, and sets CPU utilization thresholds and an interrupt parameter. The CPU utilization range is divided into a flag area, where the CPU utilization rate is greater than or equal to 70%, and a non-flag area, where it is less than 70%. When the CPU utilization rate is at least 70% and less than 90%, the monitoring strategy determines whether an instruction generator interrupt has occurred on the server. Each instruction generator corresponds to its own interrupt response program, and an interrupt counter counts the interrupts generated by each instruction generator. The interrupt parameter is initialized to 0 at startup. When pressing an instruction generator generates an interrupt, the interrupt parameter is incremented by 1; if no interrupt is generated, the monitoring strategy continues to monitor the CPU utilization rate.
When the monitoring strategy finds that the CPU utilization rate is greater than or equal to 90%, the CPU cannot cope with an intensive burst of interrupts, so the interrupt parameter is weighted more heavily: the monitoring strategy confirms whether an instruction generator interrupt has occurred on the server, and if so, the interrupt parameter is incremented by 2.
Incrementing the interrupt parameter by 1 or 2 is only a simple weighting relationship. The specific numbers are not the subject of the invention and do not limit its scope of protection; 1 and 2 are merely an idealized case, and more suitable weights can be selected according to the running state of the server, its application scenario, and the performance of its CPU.
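The weighting just described can be sketched as a per-generator counter whose increment depends on the CPU utilization at the moment of the interrupt. The 70% and 90% boundaries and the weights 1 and 2 are the embodiment's values; the class, function and generator names are hypothetical, and leaving interrupts uncounted below 70% is an assumption based on the increment strategy applying only in the flag area.

```python
from collections import defaultdict

FLAG_AREA_START = 70   # percent, from the embodiment
HEAVY_LOAD_START = 90  # percent, from the embodiment


def weight_for(cpu_percent):
    """Return the increment applied to the interrupt parameter."""
    if cpu_percent >= HEAVY_LOAD_START:
        return 2
    if cpu_percent >= FLAG_AREA_START:
        return 1
    return 0  # assumption: interrupts in the non-flag area are not counted


class InterruptCounters:
    """One interrupt parameter per instruction generator, starting at 0."""

    def __init__(self):
        self.params = defaultdict(int)

    def on_interrupt(self, generator, cpu_percent):
        self.params[generator] += weight_for(cpu_percent)
        return self.params[generator]


counters = InterruptCounters()
counters.on_interrupt("power", 50)   # non-flag area: not counted
counters.on_interrupt("power", 75)   # +1
counters.on_interrupt("power", 95)   # +2
counters.on_interrupt("reset", 72)   # +1, tracked separately
print(dict(counters.params))  # {'power': 3, 'reset': 1}
```

Keeping a separate parameter for each generator is what later allows a single faulty button to be shut off without affecting the others.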
When the interrupt parameter exceeds its upper limit of 10, a flag is generated. When the flag count is greater than 10, the instruction generator that produced the interrupts (for example, the shutdown instruction generator or the restart instruction generator) is judged to be abnormal and its interrupt processing program is closed: the CPU refuses to accept further interrupt instructions from that instruction generator, which protects the normal operation of the server's upper-layer services. At the same time, information about the instruction generator is recorded in the system event log (SEL). This reminds staff that the instruction generator is abnormal and is affecting the normal operation of the CPU, so that they can respond quickly.
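A minimal sketch of this escalation, from interrupt parameter to flag to closing the handler, is given below. The limits of 10 are the embodiment's values. Everything else is an assumption for illustration: the class and function names, the plain list standing in for the SEL, and the choice to raise the flag when the parameter reaches (rather than exceeds) its limit and then restart the parameter from 0.

```python
class GeneratorState:
    """Tracks one instruction generator (hypothetical structure)."""

    PARAM_LIMIT = 10     # interrupt parameter upper limit from the embodiment
    FLAG_THRESHOLD = 10  # flag threshold from the embodiment

    def __init__(self, name):
        self.name = name
        self.param = 0
        self.flag = 0
        self.handler_enabled = True


def record_interrupt(state, weight, sel_log):
    """Apply one weighted interrupt; run the emergency policy if needed."""
    if not state.handler_enabled:
        return  # interrupts from a closed generator are no longer processed
    state.param += weight
    if state.param >= state.PARAM_LIMIT:
        state.param = 0
        state.flag += 1
    if state.flag > state.FLAG_THRESHOLD:
        state.handler_enabled = False
        sel_log.append(f"interrupt handler for '{state.name}' closed")


sel_log = []
reset_button = GeneratorState("reset")
storm = 0
while reset_button.handler_enabled:  # simulate a stuck, chattering button
    record_interrupt(reset_button, weight=1, sel_log=sel_log)
    storm += 1

print(reset_button.flag)  # 11
print(sel_log)            # ["interrupt handler for 'reset' closed"]
```

The two-level counter means a brief burst only moves the flag by a step or two, while a sustained storm is what closes the handler.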
When the monitoring strategy finds that the CPU utilization rate is below 70%, that is, in the non-flag area, the CPU has enough spare capacity to process some interrupt instructions, so the closed interrupt processing program corresponding to an instruction generator is reopened as conditions allow.
When the CPU utilization rate is below 70% and greater than or equal to 60%, it is in the buffer interval. The monitoring strategy takes no action on the interrupt processing program and simply waits to see whether the CPU utilization rate rises or falls.
When the monitoring strategy finds that the CPU utilization rate is below 60% and above 50%, the flag count is decremented by one, down to a minimum of 0. Because the CPU's remaining capacity is sufficient to handle interrupt instructions, interrupt handling can be attempted again, so the monitoring strategy checks the current state of the instruction generator's interrupt processing program. If the flag count equals 10 after the decrement, that is, the original flag count was 11, the instruction generator may have a serious problem that maintenance personnel have not yet dealt with, and its interrupt processing program is kept closed. If the flag count has been decremented twice, which means the CPU utilization rate has stayed in the non-flag area and has moved back and forth at least once between the buffer interval and the 50% to 60% interval, the flag count is less than 10. The monitoring strategy then reopens the closed interrupt processing program, records the reopening in the system event log to inform maintenance personnel, and the CPU resumes processing interrupt instructions from that instruction generator.
When the monitoring strategy finds that the CPU utilization rate is less than or equal to 50%, an initialization strategy is triggered that clears the interrupt parameter and the flag count. In this state the CPU can process a large volume of services and execute a large number of interrupt instructions.
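Taken together, the embodiment divides CPU utilization into five intervals, each with its own rule. The classifier below is a sketch of that partition using the embodiment's percentages; the `Zone` names are hypothetical, and assigning exactly 60% to the buffer interval is an assumption, since the text describes that boundary in more than one way.

```python
from enum import Enum


class Zone(Enum):
    RESET = "non-flag: clear interrupt parameter and flag"
    DECREMENT = "non-flag: decrement flag"
    BUFFER = "non-flag: buffer, no action"
    WEIGHT_1 = "flag: add 1 per interrupt"
    WEIGHT_2 = "flag: add 2 per interrupt"


def classify(cpu_percent):
    """Map a CPU utilization percentage to the zone that governs it."""
    if cpu_percent >= 90:
        return Zone.WEIGHT_2
    if cpu_percent >= 70:
        return Zone.WEIGHT_1
    if cpu_percent >= 60:
        return Zone.BUFFER
    if cpu_percent > 50:
        return Zone.DECREMENT
    return Zone.RESET


for sample in (45, 50, 55, 60, 69, 70, 89, 90):
    print(sample, classify(sample).name)
```

The buffer interval between the decrement zone and the flag area acts as a dead band, so a utilization figure hovering near 70% does not alternately raise and lower the flag.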
Example 2
An embodiment of the present invention provides a system for avoiding a BMC crash, which is applied to a server. Referring to FIG. 4, the system includes:
a monitoring module, a weighting module, an instruction generator, a strategy module and a system event module.
The instruction generator issues various control instructions to the server, such as shutdown, restart and standby instructions. Each time it generates an instruction, it raises an interrupt to the server's BMC. When the instruction generator is abnormal, it sends instructions to the BMC repeatedly; the BMC's CPU receives a large number of instructions in an instant and crashes, and the BMC can no longer work normally.
The monitoring module monitors the CPU utilization rate and divides it into a flag interval and a non-flag interval. In this embodiment, according to the performance index of the CPU, the flag interval is where the CPU utilization rate is greater than or equal to 70%, and the non-flag interval is where it is less than 70%. Once the CPU utilization rate reaches 70%, the monitoring module subdivides the interval further.
When the CPU utilization rate is greater than or equal to 70% and less than 90%, the monitoring module monitors whether an instruction generator generates an interrupt instruction. If so, the weighting module and the strategy module together weight the interrupt parameter and the flag. Each instruction generator is weighted independently: each time the monitoring module detects an interrupt instruction from the shutdown instruction generator or the restart instruction generator, the weighting module applies the first increment strategy of the strategy module, adding 1 to the interrupt parameter of that instruction generator. When the interrupt parameter reaches 10, the first increment strategy is applied to the flag, so the flag increases step by step. When the flag count is greater than 10, the interrupt processing program corresponding to the instruction generator is stopped and the BMC no longer accepts interrupt instructions from that instruction generator, because such frequent interrupt instructions indicate that the instruction generator is abnormal. The stopped interrupt processing program is recorded in the system event module. This preserves the fault scene so that maintenance personnel can service the abnormal instruction generator.
Similarly, when the CPU utilization rate is greater than or equal to 90%, the CPU has very little spare capacity and a faster weighting strategy is required. Each time the monitoring module detects an interrupt instruction from the shutdown instruction generator or the restart instruction generator, the second increment strategy is applied to the interrupt parameter of that instruction generator, adding 2 each time. When the interrupt parameter reaches 10, the first increment strategy of the strategy module is applied to the flag. When the flag count is greater than 10, the interrupt processing program corresponding to the instruction generator is stopped and recorded in the system event module, exactly as in the 70% to 90% case.
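As a worked example of how the two strategies change the response time, the number of interrupts a single instruction generator must produce before its handler is stopped can be computed directly. The sketch assumes a constant CPU utilization zone throughout, that the interrupt parameter restarts from 0 each time it reaches 10, and that the flag only increases; the function name is hypothetical.

```python
import math


def interrupts_to_shutdown(step, param_limit=10, flag_threshold=10):
    """Interrupts needed before the handler is stopped, at a constant step.

    Assumes the interrupt parameter restarts from 0 each time it reaches
    `param_limit`, and the handler stops once the flag exceeds
    `flag_threshold`.
    """
    interrupts_per_flag = math.ceil(param_limit / step)
    flags_needed = flag_threshold + 1
    return interrupts_per_flag * flags_needed


print(interrupts_to_shutdown(step=1))  # 110
print(interrupts_to_shutdown(step=2))  # 55
```

Under these assumptions the handler is stopped after half as many interrupts when the CPU utilization rate is at or above 90%.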
When the CPU utilization rate falls into the non-flag area, it is again subdivided. When the CPU utilization rate is less than 70% and greater than or equal to 60%, it is in the buffer interval. Because it is not yet known whether the CPU utilization rate is stable, interrupt instructions are not counted in the buffer interval, and the interrupt processing program of an instruction generator is neither opened nor closed.
When the CPU utilization rate falls to less than 60% and more than 50%, each entry into this interval applies the flag decrement strategy to every interrupt processing program, that is, the flag of each instruction generator is decremented by 1. Because an interrupt processing program is closed only when its flag is greater than 10, that is, 11, a single decrement leaves the flag equal to 10 and the closed interrupt processing program is not reopened. The closed interrupt processing program is reopened only when the flag is less than 10, which indicates that the CPU utilization rate has fallen into this interval twice and is therefore stable. When a closed interrupt processing program is reopened, the event is sent to the system event module for storage, informing maintenance personnel that the interrupt processing program of the instruction generator has been reopened and that the instruction generator should be checked carefully.
When the CPU utilization rate is less than or equal to 50%, the interrupt parameter and the flag are initialized. Normally they are cleared to zero, but the initialization values can be set according to the performance of the CPU.
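The recovery path through the non-flag area can be traced with a short replay function. This is a sketch under stated assumptions: the function name and the event strings are hypothetical, the decrement is applied once per entry into the 50% to 60% interval, and a closed handler is left closed at or below 50% because the text does not say whether initialization reopens it.

```python
def recover(cpu_trace, flag, handler_enabled, flag_threshold=10):
    """Replay CPU samples (percent) through the non-flag rules.

    Returns the final (flag, handler_enabled, events).
    """
    events = []
    in_decrement_zone = False
    for cpu in cpu_trace:
        if cpu <= 50:
            flag = 0  # initialization; the interrupt parameter is cleared too
            in_decrement_zone = False
        elif cpu < 60:
            if not in_decrement_zone:  # decrement once per entry
                flag = max(flag - 1, 0)
                if not handler_enabled and flag < flag_threshold:
                    handler_enabled = True
                    events.append(f"handler reopened at flag={flag}")
            in_decrement_zone = True
        else:
            # Buffer interval: no action. Flag-area samples are outside the
            # scope of this sketch and are also left unchanged here.
            in_decrement_zone = False
    return flag, handler_enabled, events


# Flag is 11 and the handler is closed; CPU dips into 50-60% twice.
print(recover([65, 55, 56, 65, 58], flag=11, handler_enabled=False))
# (9, True, ['handler reopened at flag=9'])
```

The first dip lowers the flag to 10 and the handler stays closed; only the second dip, after a return to the buffer interval, brings the flag below 10 and reopens it.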
The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program that instructs the relevant hardware, and the program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk or an optical disk.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (4)

1. A method for avoiding BMC crash is characterized by comprising the following steps:
executing a monitoring strategy, and monitoring the CPU utilization rate and an interrupt instruction of an instruction generator;
setting a flag area, a non-flag area, an interruption parameter threshold and a flag threshold, wherein the non-flag area comprises a first threshold interval, a second threshold interval and a buffer interval, and the flag area comprises a first parameter interval and a second parameter interval;
when the CPU utilization rate reaches the flag area, if the instruction generator generates the interrupt instruction, executing an increasing strategy on the interrupt parameter and the flag corresponding to the instruction generator; the increment policies include a first increment policy and a second increment policy;
when the CPU utilization rate reaches the first parameter interval, the monitoring strategy monitors whether the instruction generator generates the interrupt instruction, if so, the first incremental strategy is executed on the interrupt parameter according to the interrupt instruction, when the interrupt parameter reaches the interrupt parameter threshold, the first incremental strategy is executed on the flag, and when the flag is greater than the flag threshold, the emergency strategy is executed;
when the CPU utilization rate reaches the second parameter interval, the monitoring strategy monitors whether the instruction generator generates the interrupt instruction, if so, the second incremental strategy is executed on the interrupt parameter according to the interrupt instruction, when the interrupt parameter reaches the interrupt parameter threshold, the first incremental strategy is executed on the flag, and when the flag is greater than the flag threshold, the emergency strategy is executed;
the step of executing the emergency policy includes: closing an interrupt processing program corresponding to the instruction generator, stopping processing of the interrupt instruction of the instruction generator, storing information of closing the interrupt processing program in a system event log, and executing an alarm operation;
when the CPU utilization rate reaches the non-flag area, executing a decreasing strategy on the flag corresponding to the instruction generator; if the CPU utilization rate reaches a first threshold interval, initializing the interrupt parameters and the flag corresponding to the instruction generator; when the CPU utilization rate reaches the second threshold interval, executing the first decreasing strategy on the flag corresponding to the instruction generator, judging whether the interrupt processing program corresponding to the instruction generator is in a closed state, if the interrupt processing program of the instruction generator is in the closed state, judging whether the flag of the instruction generator is equal to the flag threshold, if so, keeping the interrupt processing program of the instruction generator in the closed state, if not, starting the interrupt processing program of the instruction generator, and storing the information of the started interrupt processing program in the system event log.
2. The method for avoiding a BMC crash of claim 1, wherein: the first increment strategy increments the interrupt parameter or the flag using the first parameter as the base, and when the interrupt parameter or the flag reaches the interrupt parameter threshold or the flag threshold, respectively, incrementing continues from an initial value;
the second increment strategy increments the interrupt parameter or the flag using the second parameter as the base, and when the interrupt parameter or the flag reaches the interrupt parameter threshold or the flag threshold, respectively, incrementing continues from an initial value.
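The two increment strategies of claim 2 differ only in the base they add, so one helper can illustrate both. The following is a minimal sketch, not the patented implementation: the function name `wrapping_increment`, the default initial value of 0, and the choice to treat "reaches" as "greater than or equal to" are all assumptions made for illustration.

```python
def wrapping_increment(value, base, threshold, initial=0):
    """Add `base` to `value`; once the result reaches `threshold`,
    counting restarts from `initial` (assumed to be 0 here).

    Returns (new_value, reached), where `reached` reports whether the
    threshold was hit on this step.
    """
    value += base
    if value >= threshold:  # assumption: "reaches" means >= threshold
        return initial, True
    return value, False


# Hypothetical bases: 1 for the first strategy, 2 for the second.
FIRST_PARAMETER = 1
SECOND_PARAMETER = 2

count, reached = wrapping_increment(0, FIRST_PARAMETER, threshold=3)
count, reached = wrapping_increment(count, SECOND_PARAMETER, threshold=3)
```

With a larger base, the second strategy reaches the threshold in fewer interrupts, which matches its use in the higher CPU utilization interval.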
3. A system for avoiding a BMC crash, applied to a server and using the method for avoiding a BMC crash of claim 1 or 2, wherein the system comprises: a monitoring module, a weighting module, a strategy module and a system event module;
the monitoring module is used for monitoring the CPU utilization rate of the server and an interrupt instruction of an instruction generator of the server, and setting a flag area and a non-flag area;
the strategy module stores an increment strategy, a decrement strategy and an emergency strategy;
when the CPU utilization rate reaches the flag area, the weighting module executes the increment strategy, executes the interrupt parameter weighting and the flag weighting according to the interrupt instruction, and executes the emergency strategy on an interrupt processing program corresponding to the instruction generator;
when the CPU utilization rate reaches a non-flag area, the weighting module executes the decrement strategy on the flag and executes the emergency strategy on an interrupt processing program corresponding to the instruction generator;
the system event module is used for storing the information of the interrupt processing program.
4. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the method for avoiding a BMC crash of claim 1 or 2.
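The control flow recited in claim 1 can be sketched as a small per-generator state machine. This is an illustrative sketch under stated assumptions, not the patented implementation: the claims give no numeric values, so every constant below is hypothetical, as are the names `GeneratorState` and `on_sample`. The sketch also assumes that the interrupt parameter restarts from 0 on reaching its threshold, that the flag never drops below 0, and that the alarm operation can be modelled as a simple counter.

```python
from dataclasses import dataclass, field

# Hypothetical values; the claims name these intervals and thresholds
# but do not give numbers for them.
FLAG_AREA_START = 80.0        # CPU % where the flag area begins
SECOND_INTERVAL_START = 90.0  # CPU % where the second parameter interval begins
RESET_BELOW = 50.0            # CPU % below this is the first threshold interval
INTERRUPT_THRESHOLD = 4
FLAG_THRESHOLD = 2
FIRST_BASE, SECOND_BASE = 1, 2


@dataclass
class GeneratorState:
    """State kept for one instruction generator (interrupt source)."""
    interrupt_param: int = 0
    flag: int = 0
    handler_enabled: bool = True
    alarms: int = 0
    event_log: list = field(default_factory=list)


def on_sample(state, cpu_util, interrupt_seen):
    """Process one monitoring sample for a single instruction generator."""
    if cpu_util >= FLAG_AREA_START:
        # Flag area: count interrupts, faster in the second interval.
        if not (interrupt_seen and state.handler_enabled):
            return
        base = SECOND_BASE if cpu_util >= SECOND_INTERVAL_START else FIRST_BASE
        state.interrupt_param += base
        if state.interrupt_param >= INTERRUPT_THRESHOLD:
            state.interrupt_param = 0   # assumed restart value
            state.flag += FIRST_BASE    # the flag always uses the first strategy
        if state.flag > FLAG_THRESHOLD:
            # Emergency strategy: close the handler, log it, raise an alarm.
            state.handler_enabled = False
            state.event_log.append("handler closed")
            state.alarms += 1
    elif cpu_util < RESET_BELOW:
        # First threshold interval: initialize the counters.
        state.interrupt_param = 0
        state.flag = 0
    else:
        # Second threshold interval: decrement the flag, then decide
        # whether a closed handler may be started again.
        state.flag = max(0, state.flag - 1)
        if not state.handler_enabled and state.flag != FLAG_THRESHOLD:
            state.handler_enabled = True
            state.event_log.append("handler started")
```

Because the emergency strategy fires only when the flag exceeds the flag threshold, the first decrement after a shutdown brings the flag back to exactly the threshold and the handler stays closed; only a second decrement restarts it. This gives the handler a hysteresis band instead of toggling on every sample.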
CN202110181358.9A 2021-02-10 2021-02-10 Method, system and medium for avoiding BMC breakdown Active CN112905377B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110181358.9A CN112905377B (en) 2021-02-10 2021-02-10 Method, system and medium for avoiding BMC breakdown

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110181358.9A CN112905377B (en) 2021-02-10 2021-02-10 Method, system and medium for avoiding BMC breakdown

Publications (2)

Publication Number Publication Date
CN112905377A (en) 2021-06-04
CN112905377B (en) 2022-09-20

Family

ID=76123416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110181358.9A Active CN112905377B (en) 2021-02-10 2021-02-10 Method, system and medium for avoiding BMC breakdown

Country Status (1)

Country Link
CN (1) CN112905377B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844136A (en) * 2016-12-06 2017-06-13 北京智能管家科技有限公司 The collection method and system of a kind of program crashing information
CN110187996A (en) * 2019-05-30 2019-08-30 苏州浪潮智能科技有限公司 BMC host process method for diagnosing faults, device, equipment and readable storage medium storing program for executing

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6629252B1 (en) * 1999-10-28 2003-09-30 International Business Machines Corporation Method for determining if a delay required before proceeding with the detected interrupt and exiting the interrupt without clearing the interrupt
FR2830952B1 (en) * 2001-10-11 2004-03-12 Inst Nat Rech Inf Automat IMPROVED ALEA GENERATOR
US9298524B2 (en) * 2013-03-22 2016-03-29 Insyde Software Corp. Virtual baseboard management controller
US9529750B2 (en) * 2014-07-14 2016-12-27 American Megatrends, Inc. Service processor (SP) initiated data transaction with bios utilizing interrupt
US10705901B2 (en) * 2018-02-23 2020-07-07 Dell Products, L.P. System and method to control memory failure handling on double-data rate dual in-line memory modules via suspension of the collection of correctable read errors
CN110362326A (en) * 2019-07-12 2019-10-22 浪潮电子信息产业股份有限公司 A kind of upgrade method and relevant device of BIOS
CN111124761B (en) * 2019-12-29 2022-04-22 浪潮电子信息产业股份有限公司 Equipment restarting method, device, equipment and medium
CN111737064A (en) * 2020-06-29 2020-10-02 深圳市同泰怡信息技术有限公司 BMC system control method and device, storage medium and computer equipment

Also Published As

Publication number Publication date
CN112905377A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
CN106776099A (en) One kind service fusing shielding system and method
CN112905377B (en) Method, system and medium for avoiding BMC breakdown
CN110726936A (en) Method for judging and processing voltage sampling fault and voltage extreme value fault
CA2365427A1 (en) Internal product fault monitoring apparatus and method
CN112732064A (en) Power consumption adjusting system, method and medium
CN112865264A (en) Method for controlling energy storage device, power supply system and power utilization device
CN110990903B (en) Cloud system and cloud system protection method
WO2023056851A1 (en) Voltage monitoring method and apparatus, electronic device and storage medium
CN111277593A (en) Multi-line parallel monitoring method based on internal and external network isolation
WO2022161100A1 (en) Edge computing server resetting method and device
CN114489290B (en) Fan control method, system, equipment and medium
CN107203455B (en) Method and system for preventing over-discharge of standby power module when power-down fan of storage system is hung up
CN113360344B (en) Server monitoring method, device, equipment and computer readable storage medium
JPH10307635A (en) Computer system and temperature monitoring method applied to the same system
JP2001195267A (en) Control computer system and task control method
CN114584041A (en) Control method and device of servo driver
WO2018023881A1 (en) Parameter adjustment method and device, and computer storage medium
CN114415053A (en) Battery monitoring method and device, vehicle and computer readable storage medium
TWI678855B (en) A low voltage threshold adjusting method
CN107360019A (en) A kind of thread monitor method and system of server
CN109062718B (en) Server and data processing method
CN117666452A (en) Multiple safety control method and device for robot, electronic equipment and storage medium
CN112467815A (en) Control method and system for passive current limiting of battery management system
CN113280469B (en) Fan blade fault detection method, air conditioner and computer readable storage medium
JP2003345629A (en) System monitor device, system monitoring method used for the same, and program therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant