CN113032182B - Method and equipment for recovering computer system from abnormity - Google Patents

Method and equipment for recovering computer system from abnormity

Publication number
CN113032182B
Authority
CN
China
Prior art keywords
power
bmc
power supply
event
current system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110271428.XA
Other languages
Chinese (zh)
Other versions
CN113032182A (en)
Inventor
王兴隆
宿燕鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yingxin Computer Technology Co Ltd
Original Assignee
Shandong Yingxin Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yingxin Computer Technology Co Ltd
Priority to CN202110271428.XA
Publication of CN113032182A
Application granted
Publication of CN113032182B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1479 Generic software techniques for error detection or fault masking
    • G06F11/1482 Generic software techniques for error detection or fault masking by means of middleware or OS functionality

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)

Abstract

The invention provides a method and a device for recovering a computer system from an exception. The method comprises the following steps: in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, supplying power to the CPLD and the BMC through an additional power supply capacitor, recording the current system state via the CPLD, and sending the current system state to the BMC; determining, by the BMC, the type of the power-off event; and, in response to the power-off event being a flash interruption, restoring, by the BMC, the computer system based on the current system state. With this scheme, the computer system can be restored to its state before the power-off exception as soon as that exception ends, which reduces system downtime.

Description

Method and equipment for recovering computer system from abnormity
Technical Field
The invention relates to the field of computers, and more particularly to a method and an apparatus for recovering a computer system from an exception.
Background
During the operation of a complex computer system such as a server, a storage device, or a switch, an unstable power supply occasionally causes power to be interrupted momentarily and then restored; this is called a flash interruption. In this scenario the computer system drops from the running state to the standby state (powered but not booted). Because time is a cost in practical applications, how to return the computer system to the running state as quickly as possible is a technical problem to be solved in this field.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and a device for recovering a computer system from an exception. With the technical solution of the present invention, the computer system can be restored to its state before a power-off exception as soon as that exception ends, which reduces system downtime.
In view of the above object, one aspect of the embodiments of the present invention provides a method for recovering a computer system from an exception, including the following steps:
in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, supplying power to the CPLD and the BMC through an additional power supply capacitor, recording the current system state via the CPLD, and sending the current system state to the BMC;
determining, by the BMC, the type of the power-off event;
in response to the power-off event being a flash interruption, restoring, by the BMC, the computer system based on the current system state.
According to an embodiment of the present invention, the method further comprises:
in response to the power-off event being a non-flash interruption, storing, by the BMC, the current system state in a memory;
in response to power being restored to the computer system, reading, by the BMC, the current system state from the memory and restoring the computer system based on the current system state.
According to one embodiment of the present invention, determining the type of the power-off event by the BMC includes:
setting a power-off time threshold and a power-off event timer;
in response to the occurrence of a power-off event, starting the timer and comparing the recorded time with the time threshold in real time;
in response to the recorded time exceeding the time threshold, determining that the power-off event is a non-flash interruption;
in response to the recorded time not exceeding the time threshold, determining that the power-off event is a flash interruption.
According to one embodiment of the invention, recording the current system state via the CPLD and sending the current system state to the BMC, in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, comprises:
recording the power-on/off state of the current system and the login state of the system;
recording the name and login state of the software running in the current system.
According to an embodiment of the present invention, the method further comprises:
connecting the input end of the power supply capacitor to the standby power supply so that the standby power supply charges the power supply capacitor;
connecting the output end of the power supply capacitor to the CPLD and to the BMC respectively, so that the power supply capacitor powers the CPLD and the BMC when the standby power supply has no charge.
In another aspect of the embodiments of the present invention, there is also provided an apparatus for recovering a computer system from an exception, the apparatus including:
a recording module configured to, in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, supply power to the CPLD and the BMC through an additional power supply capacitor, record the current system state via the CPLD, and send the current system state to the BMC;
a determining module configured to determine, by the BMC, the type of the power-off event;
a recovery module configured to, in response to the power-off event being a flash interruption, restore, by the BMC, the computer system based on the current system state.
According to one embodiment of the invention, the apparatus further comprises a reading module configured to:
in response to the power-off event being a non-flash interruption, store, by the BMC, the current system state in a memory;
in response to power being restored to the computer system, read, by the BMC, the current system state from the memory and restore the computer system based on the current system state.
According to an embodiment of the invention, the determining module is further configured to:
set a power-off time threshold and a power-off event timer;
in response to the occurrence of a power-off event, start the timer and compare the recorded time with the time threshold in real time;
in response to the recorded time exceeding the time threshold, determine that the power-off event is a non-flash interruption;
in response to the recorded time not exceeding the time threshold, determine that the power-off event is a flash interruption.
According to an embodiment of the invention, the recording module is further configured to:
record the power-on/off state of the current system and the login state of the system;
record the name and login state of the software running in the current system.
According to an embodiment of the present invention, the apparatus further comprises a power supply module configured to:
connect the input end of the power supply capacitor to the standby power supply so that the standby power supply charges the power supply capacitor;
connect the output end of the power supply capacitor to the CPLD and to the BMC respectively, so that the power supply capacitor powers the CPLD and the BMC when the standby power supply has no charge.
The invention has the following beneficial technical effects: in the method for recovering a computer system from an exception provided by the embodiments of the present invention, in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, power is supplied to the CPLD and the BMC through an additional power supply capacitor, and the current system state is recorded via the CPLD and sent to the BMC; the BMC determines the type of the power-off event; and, in response to the power-off event being a flash interruption, the BMC restores the computer system based on the current system state. This technical solution allows the computer system to be restored to its state before the power-off exception as soon as that exception ends, which reduces system downtime.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other embodiments from these drawings without creative effort.
FIG. 1 is a schematic flow chart diagram of a method of computer system exception recovery in accordance with one embodiment of the present invention;
FIG. 2 is a diagram of an apparatus for computer system exception recovery, according to one embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
In view of the above objects, a first aspect of embodiments of the present invention proposes an embodiment of a method for recovering from an exception of a computer system. Fig. 1 shows a schematic flow diagram of the method.
As shown in fig. 1, the method may comprise the steps of:
S1: in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, power is supplied to the CPLD and the BMC through an additional power supply capacitor, and the current system state is recorded via the CPLD and sent to the BMC. The logic code of the CPLD can be designed to detect the event in which all power supplies are disconnected;
S2: the BMC determines the type of the power-off event. A power-off generally falls into one of two cases: power is restored within a short time, which is called a flash interruption, or power is not restored for a long time, which is called a non-flash interruption;
S3: in response to the power-off event being a flash interruption, the BMC restores the computer system based on the current system state. In a flash interruption, the main power supply resumes immediately after the system's standby power supply is exhausted, leaving the system powered but not booted; the system is then restored to its state before the power-off event according to the current system state received by the BMC.
With this technical solution, the computer system can be restored to its state before the power-off exception as soon as that exception ends, which reduces system downtime.
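As a non-limiting illustration, the control flow of steps S1 to S3 can be sketched in Python. The names `Outage` and `recovery_steps` and the action strings are hypothetical and are not part of the disclosure; the sketch only lists the actions in order and does not drive any hardware.

```python
from enum import Enum


class Outage(Enum):
    FLASH = "flash"          # power returns within a short time
    NON_FLASH = "non-flash"  # power stays off for a long time


def recovery_steps(power_off_detected, standby_level, standby_threshold, outage):
    """List the actions of S1 to S3 for one power-off event, in order."""
    steps = []
    # S1 is triggered only when a power-off event is detected AND the
    # standby supply is below its threshold.
    if not (power_off_detected and standby_level < standby_threshold):
        return steps
    steps.append("power CPLD and BMC from the supply capacitor")
    steps.append("CPLD records the current system state")
    steps.append("CPLD sends the state to the BMC")
    # S2: the BMC classifies the event.
    steps.append(f"BMC classifies the event as {outage.value}")
    # S3: a flash interruption is restored directly from the received state.
    if outage is Outage.FLASH:
        steps.append("BMC restores the system from the received state")
    return steps
```

If the standby supply is still above its threshold, the function returns an empty list, reflecting that S1 requires both conditions to hold.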
In a preferred embodiment of the present invention, the method further comprises:
in response to the power-off event being a non-flash interruption, the BMC stores the current system state in a memory;
in response to power being restored to the computer system, the BMC reads the current system state from the memory and restores the computer system based on the current system state. A non-flash interruption indicates that the system may remain without power for a long time, so the BMC needs to save the current system state in non-volatile memory; when power returns, the BMC reads the current system state from the non-volatile memory and restores the computer system based on it.
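The save-and-reload behaviour for a non-flash interruption might look like the following minimal sketch. It assumes, purely for illustration, that the non-volatile memory is exposed to the BMC as a file path and that the state is a JSON-serialisable dictionary; `save_state` and `load_state` are hypothetical names, and the disclosure does not specify a storage format.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write the state to a temporary file, then rename it into place,
    so an interrupted write does not damage a previously saved file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the data to the storage device
    os.replace(tmp_path, path)


def load_state(path):
    """Return the saved state, or None if nothing was saved."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

Writing to a temporary file first is a design choice of this sketch: the capacitor can power the BMC only for a limited time, so a write that is cut short should not leave a half-written record behind.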
In a preferred embodiment of the present invention, determining the type of the power-off event by the BMC includes:
setting a power-off time threshold and a power-off event timer;
in response to the occurrence of a power-off event, starting the timer and comparing the recorded time with the time threshold in real time;
in response to the recorded time exceeding the time threshold, determining that the power-off event is a non-flash interruption;
in response to the recorded time not exceeding the time threshold, determining that the power-off event is a flash interruption.
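The timer-based classification above can be sketched as follows. The class name `OutageTimer`, the injectable `clock` parameter, and the threshold in seconds are assumptions made for illustration; the disclosure does not give a threshold value or say how the BMC timer is implemented.

```python
import time


class OutageTimer:
    """Classifies a power-off event by comparing elapsed time with a threshold."""

    def __init__(self, threshold_s, clock=time.monotonic):
        self.threshold_s = threshold_s
        self._clock = clock  # injectable so the logic can be tested
        self._start = None

    def start(self):
        """Call when the power-off event is detected."""
        self._start = self._clock()

    def classify(self):
        """Return 'non-flash' if the threshold is exceeded, else 'flash'."""
        if self._start is None:
            raise RuntimeError("timer was not started")
        elapsed = self._clock() - self._start
        return "non-flash" if elapsed > self.threshold_s else "flash"
```

In this sketch `classify` would be polled while power is absent: the first `'non-flash'` result is final, whereas a `'flash'` result is conclusive only once power has actually returned before the threshold was reached.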
In a preferred embodiment of the present invention, recording the current system state via the CPLD and sending the current system state to the BMC, in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, comprises:
recording the power-on/off state of the current system and the login state of the system;
recording the name and login state of the software running in the current system. For the power-on/off state: if the system was powered on, it is powered on again; if it was shut down, it is kept shut down. For the login state: if the system was logged in, it is restored to the logged-in state, which requires that the account and password used for login also be recorded; if the system was not logged in, it is kept in the non-logged-in state.
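One possible shape for the recorded state, and for deriving restore actions from it, is sketched below. The classes `SystemState` and `SoftwareState`, the function `restore_plan`, and the action strings are hypothetical. Login credentials are deliberately left out of the sketch, since the disclosure says they must be recorded but does not describe how they are stored.

```python
from dataclasses import dataclass, field


@dataclass
class SoftwareState:
    name: str
    logged_in: bool


@dataclass
class SystemState:
    powered_on: bool
    logged_in: bool
    software: list = field(default_factory=list)


def restore_plan(state):
    """Derive the actions needed to return to the recorded state."""
    if not state.powered_on:
        # A system that was shut down is kept shut down.
        return ["keep system shut down"]
    plan = ["boot system"]
    if state.logged_in:
        plan.append("log in to system")
    for sw in state.software:
        plan.append(f"start {sw.name}")
        if sw.logged_in:
            plan.append(f"log in to {sw.name}")
    return plan
```

Restarting each recorded program before restoring its login state is an assumption of this sketch; the disclosure only states that the name and login state of running software are recorded.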
In a preferred embodiment of the present invention, the method further comprises:
connecting the input end of the power supply capacitor to the standby power supply so that the standby power supply charges the power supply capacitor;
connecting the output end of the power supply capacitor to the CPLD and to the BMC respectively, so that the power supply capacitor powers the CPLD and the BMC when the standby power supply has no charge. The power supply capacitor needs sufficient capacity to power the CPLD and the BMC for a certain period of time, so that the CPLD and the BMC still have enough power to perform the above functions when all power supplies of the system (including the standby power supply) have failed.
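How much capacity is "sufficient" can be estimated from the usable energy of a capacitor, E = ½·C·(V_start² − V_min²), divided by the load power. The sketch below assumes an ideal capacitor feeding a constant-power load and ignores converter losses and equivalent series resistance; the function name and all numeric values are illustrative assumptions, not figures from the disclosure.

```python
def holdup_time_s(capacitance_f, v_start, v_min, load_w):
    """Seconds an ideal capacitor can feed a constant-power load
    while its voltage falls from v_start to v_min."""
    if v_min < 0 or v_min >= v_start:
        raise ValueError("require 0 <= v_min < v_start")
    if capacitance_f <= 0 or load_w <= 0:
        raise ValueError("capacitance and load must be positive")
    # Usable energy is the difference in stored energy between the two voltages.
    usable_energy_j = 0.5 * capacitance_f * (v_start ** 2 - v_min ** 2)
    return usable_energy_j / load_w
```

For example, a 1 F capacitor discharging from 5 V to 3 V into a 4 W load holds 8 J of usable energy and therefore lasts about 2 s under these assumptions. The hold-up time would need to cover at least the power-off time threshold plus the time the BMC takes to store the state.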
With this technical solution, the computer system can be restored to its state before the power-off exception as soon as that exception ends, which reduces system downtime.
It should be noted that, as will be understood by those skilled in the art, all or part of the processes in the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program, and the above programs may be stored in a computer-readable storage medium, and when executed, the programs may include the processes of the embodiments of the methods as described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
Furthermore, the method disclosed according to an embodiment of the present invention may also be implemented as a computer program executed by a CPU, and the computer program may be stored in a computer-readable storage medium. The computer program, when executed by the CPU, performs the functions defined above in the methods disclosed in the embodiments of the present invention.
In view of the above object, according to a second aspect of the embodiments of the present invention, there is provided an apparatus for recovering a computer system from an exception. As shown in fig. 2, the apparatus 200 includes:
a recording module configured to, in response to detecting that a power-off event has occurred in the computer system and that the charge of the system's standby power supply is below a threshold, supply power to the CPLD and the BMC through an additional power supply capacitor, record the current system state via the CPLD, and send the current system state to the BMC;
a determining module configured to determine, by the BMC, the type of the power-off event;
a recovery module configured to, in response to the power-off event being a flash interruption, restore, by the BMC, the computer system based on the current system state.
In a preferred embodiment of the present invention, the apparatus further comprises a reading module configured to:
in response to the power-off event being a non-flash interruption, store, by the BMC, the current system state in a memory;
in response to power being restored to the computer system, read, by the BMC, the current system state from the memory and restore the computer system based on the current system state.
In a preferred embodiment of the present invention, the determining module is further configured to:
set a power-off time threshold and a power-off event timer;
in response to the occurrence of a power-off event, start the timer and compare the recorded time with the time threshold in real time;
in response to the recorded time exceeding the time threshold, determine that the power-off event is a non-flash interruption;
in response to the recorded time not exceeding the time threshold, determine that the power-off event is a flash interruption.
In a preferred embodiment of the present invention, the recording module is further configured to:
record the power-on/off state of the current system and the login state of the system;
record the name and login state of the software running in the current system.
In a preferred embodiment of the present invention, the apparatus further comprises a power supply module configured to:
connect the input end of the power supply capacitor to the standby power supply so that the standby power supply charges the power supply capacitor;
connect the output end of the power supply capacitor to the CPLD and to the BMC respectively, so that the power supply capacitor powers the CPLD and the BMC when the standby power supply has no charge.
The above-described embodiments, particularly any "preferred" embodiments, are possible examples of implementations, and are set forth only for a clear understanding of the principles of the invention. Many variations and modifications may be made to the above-described embodiments without departing from the spirit and principles of the technology described herein. All such modifications are intended to be included within the scope of this disclosure and protected by the following claims.

Claims (8)

1. A method for recovering from an exception in a computer system, comprising the steps of:
in response to detecting that a power-off event has occurred in the computer system and that the charge of the standby power supply of the system is below a threshold, supplying power to the CPLD and the BMC through an additional power supply capacitor, recording the current system state via the CPLD, and sending the current system state to the BMC;
determining, by the BMC, the type of the power-off event;
in response to the power-off event being a flash interruption, restoring, by the BMC, the computer system based on the current system state;
in response to the power-off event being a non-flash interruption, storing, by the BMC, the current system state in a memory;
in response to power being restored to the computer system, reading, by the BMC, the current system state from the memory and restoring the computer system based on the current system state.
2. The method of claim 1, wherein determining, by the BMC, the type of the power-off event comprises:
setting a power-off time threshold and a power-off event timer;
in response to the occurrence of a power-off event, starting the timer and comparing the recorded time with the time threshold in real time;
in response to the recorded time exceeding the time threshold, determining that the power-off event is a non-flash interruption;
in response to the recorded time not exceeding the time threshold, determining that the power-off event is a flash interruption.
3. The method of claim 1, wherein in response to detecting that a power down event has occurred for a computer system and that a standby power supply capacity of the system is below a threshold, recording a current system state via a CPLD and sending the current system state to a BMC comprises:
recording the on-off state of the current system and the login state of the system;
and recording the name and the login state of the software running in the current system.
4. The method of claim 1, further comprising:
connecting an input end of a power supply capacitor with the standby power supply so that the standby power supply charges the power supply capacitor;
and respectively connecting the output end of the power supply capacitor to the CPLD and the BMC so as to enable the power supply capacitor to supply power to the CPLD and the BMC under the condition that the standby power supply has no electric quantity.
5. An apparatus for computer system exception recovery, the apparatus comprising:
a recording module configured to, in response to detecting that a power-off event has occurred in the computer system and that the charge of the standby power supply of the system is below a threshold, supply power to the CPLD and the BMC through an additional power supply capacitor, record the current system state via the CPLD, and send the current system state to the BMC;
a determination module configured to determine, by the BMC, the type of the power-off event;
a recovery module configured to, in response to the power-off event being a flash interruption, restore, by the BMC, the computer system based on the current system state;
a read module configured to:
in response to the power-off event being a non-flash interruption, store, by the BMC, the current system state in a memory;
in response to power being restored to the computer system, read, by the BMC, the current system state from the memory and restore the computer system based on the current system state.
6. The device of claim 5, wherein the determination module is further configured to:
setting a power-off time threshold and a power-off event timer;
in response to the occurrence of a power-off event, starting the timer and comparing the recorded time with the time threshold in real time;
in response to the recorded time exceeding the time threshold, determining that the power-off event is a non-flash interruption;
in response to the recorded time not exceeding the time threshold, determining that the power-off event is a flash interruption.
7. The device of claim 5, wherein the recording module is further configured to:
recording the power-on and power-off state of the current system and the login state of the system;
and recording the name and the login state of the software running in the current system.
8. The device of claim 5, further comprising a power module configured to:
connecting an input end of a power supply capacitor with the standby power supply so that the standby power supply charges the power supply capacitor;
and respectively connecting the output end of the power supply capacitor to the CPLD and the BMC so that the power supply capacitor supplies power to the CPLD and the BMC under the condition that the standby power supply has no electric quantity.
CN202110271428.XA 2021-03-12 2021-03-12 Method and equipment for recovering computer system from abnormity Active CN113032182B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110271428.XA CN113032182B (en) 2021-03-12 2021-03-12 Method and equipment for recovering computer system from abnormity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110271428.XA CN113032182B (en) 2021-03-12 2021-03-12 Method and equipment for recovering computer system from abnormity

Publications (2)

Publication Number Publication Date
CN113032182A (en) 2021-06-25
CN113032182B (en) 2022-11-29

Family

ID=76470478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110271428.XA Active CN113032182B (en) 2021-03-12 2021-03-12 Method and equipment for recovering computer system from abnormity

Country Status (1)

Country Link
CN (1) CN113032182B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111324192A (en) * 2020-02-26 2020-06-23 苏州浪潮智能科技有限公司 System board power supply detection method, device, equipment and storage medium
CN111427722A (en) * 2020-03-18 2020-07-17 深圳震有科技股份有限公司 Data storage method and system for abnormal power failure of computer
CN112256499A (en) * 2020-08-28 2021-01-22 苏州浪潮智能科技有限公司 Power failure monitoring method and device, electronic equipment and computer readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1261844C (en) * 2003-03-14 2006-06-28 联想(北京)有限公司 Self-restoring apparatus and method for computer interruption
US10846160B2 (en) * 2018-01-12 2020-11-24 Quanta Computer Inc. System and method for remote system recovery

Also Published As

Publication number Publication date
CN113032182A (en) 2021-06-25

Similar Documents

Publication Publication Date Title
CN101493776B (en) Mobile terminal and power-on method and system thereof
CN106356097B (en) Protection method and device for preventing data loss
US20040039960A1 (en) Method and apparatus for automatic hibernation after a power failure
CN113064757B (en) Server firmware self-recovery system and server
CN102455950A (en) Firmware recovery system and method of base board management controller
CN111143132B (en) BIOS recovery method, device, equipment and readable storage medium
CN101969502A (en) Mobile terminal service recovering method and mobile terminal
CN111813753A (en) File saving method, file restoring method, device and terminal equipment
CN114003173A (en) Power-down protection system of storage device and storage device
CN109495909A (en) The mobile network's control method and device of Android device
CN113608930B (en) System chip and electronic device
CN113032182B (en) Method and equipment for recovering computer system from abnormity
CN109992437B (en) Processing method, device and equipment for hard disk flash break and storage medium
CN111475343B (en) Computer state outage restoration method and device and terminal equipment
CN112214094B (en) Method and equipment for coping with power supply jitter of hard disk
CN111427721A (en) Exception recovery method and device
JP3231561B2 (en) Backup memory control method
CN111539044A (en) Server power firmware write protection control method, device, equipment and storage medium
CN114218010B (en) Data backup and recovery method, system, terminal equipment and storage medium
JP3087650B2 (en) Automatic power recovery method
CN112463694B (en) Board hot plug control method and system
CN115629916B (en) Service program fault recovery method based on Zynq
CN113625855B (en) Power supply control method, system, medium and equipment of server system
JP2001333545A (en) Power supply, electronic device and its stopping/restoring method and recording medium
EP4131888B1 (en) Method for searching for interrupted device, slave device, master device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant