CN111124728A - Automatic service recovery method, system, readable storage medium and server - Google Patents

Automatic service recovery method, system, readable storage medium and server

Info

Publication number
CN111124728A
Authority
CN
China
Prior art keywords
operating system
server
management control
control module
restart
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911275677.5A
Other languages
Chinese (zh)
Other versions
CN111124728B (en)
Inventor
李晓龙
陈吉宝
袁迎春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Celestica Technology Consultancy Shanghai Co Ltd
Original Assignee
Celestica Technology Consultancy Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Celestica Technology Consultancy Shanghai Co Ltd filed Critical Celestica Technology Consultancy Shanghai Co Ltd
Priority to CN201911275677.5A priority Critical patent/CN111124728B/en
Publication of CN111124728A publication Critical patent/CN111124728A/en
Application granted granted Critical
Publication of CN111124728B publication Critical patent/CN111124728B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating

Abstract

The invention provides an automatic service recovery method, a system, a readable storage medium and a server. The automatic service recovery method comprises the following steps: after the server is started, guiding the running system of the server to enter a first operating system; calling a baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, calling the baseboard management control module to restart the first operating system, and judging whether the restart of the first operating system is normal; if so, calling a basic input and output system to restart the first operating system; if not, calling the basic input and output system to start a second operating system so as to automatically recover the service. Based on the server firmware, the invention realizes automatic service recovery when the operating system has completely crashed, thereby greatly shortening the time required to recover the operation service, reducing customer losses and reducing manual maintenance costs.

Description

Automatic service recovery method, system, readable storage medium and server
Technical Field
The invention belongs to the field of computing networks, relates to a recovery method and a recovery system, and particularly relates to an automatic service recovery method, an automatic service recovery system, a readable storage medium and a server.
Background
The operating system is a computer program that manages computer hardware and software resources, and is also the core and foundation of the computer system. The operating system handles basic tasks such as managing and configuring memory, prioritizing system resources, controlling input devices and output devices, operating the network, and managing the file system.
In an edge computing network environment, adequate redundant node backup is often lacking. When a client runs its operations on a single-node host, conditions such as unexpected power failure, external impact or software crash may cause the operating system to crash, interrupting the operation service with no possibility of automatic recovery. Manual maintenance then consumes a lot of time.
Therefore, what is needed is a method, a system, a readable storage medium, and a server for automatically recovering a service, so as to solve the problems of the prior art that when a host fails, the operation service is interrupted and cannot be automatically recovered.
Disclosure of Invention
In view of the above drawbacks of the prior art, an object of the present invention is to provide a method, a system, a readable storage medium, and a server for automatically recovering a service, which are used to solve the problem that the operation service is interrupted and cannot be automatically recovered when a host fails in the prior art.
In order to achieve the above and other related objects, one aspect of the present invention provides an automatic service recovery method applied to a server, where the server includes an operation module and a baseboard management control module connected to the operation module; the operation module is provided with a first operating system, a second operating system and a basic input and output system; the automatic service recovery method comprises the following steps: after the server is started, guiding the running system of the server to enter the first operating system; calling the baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, executing the next step: calling the baseboard management control module to restart the first operating system, and judging whether the restart of the first operating system is normal; if so, calling the basic input and output system to restart the first operating system; if not, calling the basic input and output system to start the second operating system so as to automatically recover the service.
In an embodiment of the present invention, the operation module is communicatively connected to a network configuration module; the first operating system and the second operating system deploy application programs.
In an embodiment of the present invention, after entering the first operating system, the automatic service recovery method further includes: the first operating system acquires an application configuration file matched with the application program from the network configuration module.
In an embodiment of the present invention, the automatic service recovery method further includes: after the second operating system is started, the second operating system acquires an application configuration file matched with the application program from the network configuration module.
In an embodiment of the present invention, in the process of calling the baseboard management control module to restart the first operating system, the automatic service recovery method further includes: handing over the system management right of the server to the basic input and output system, and the basic input and output system initiating a restart query to the baseboard management control module so as to judge whether the restart of the first operating system is normal.
In an embodiment of the invention, the operation module and the baseboard management control module are disposed on a motherboard of the server.
In an embodiment of the invention, the basic input and output system is BIOS firmware that has an operating system recovery function and is solidified on the motherboard.
Another aspect of the invention provides an automatic service recovery system applied to a server, wherein the server comprises an operation module and a baseboard management control module connected with the operation module; the operation module is provided with a first operating system, a second operating system and a basic input and output system; the automatic service recovery system comprises: a guiding unit, used for guiding the running system of the server to enter the first operating system after the server is started; a calling unit, used for calling the baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, calling the baseboard management control module to restart the first operating system, and judging through a judging unit whether the restart of the first operating system is normal; if so, calling the basic input and output system through the calling unit to restart the first operating system; if not, calling the basic input and output system through the calling unit to start the second operating system so as to automatically recover the service.
The present invention further provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the automatic service recovery method.
A final aspect of the present invention provides a server comprising: a processor and a memory; the memory is used for storing a computer program, and the processor is used for executing the computer program stored in the memory so as to enable the server to execute the automatic service recovery method; the processor comprises an operation module and a baseboard management control module connected with the operation module; the operation module is provided with a first operating system, a second operating system and a basic input and output system.
As described above, the method, system, readable storage medium and server for automatically recovering services according to the present invention have the following advantages:
the method, the system, the readable storage medium and the server realize the automatic service recovery function when the operating system is completely crashed based on the server firmware, thereby greatly shortening the time required by the recovery operation service, reducing the loss of customers and simultaneously reducing the manual maintenance cost.
Drawings
FIG. 1 is a system architecture diagram of a server according to the present invention.
Fig. 2 is a flowchart illustrating an automatic service recovery method according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an embodiment of the service automatic recovery system according to the present invention.
Description of the element reference numerals
1 Server
11 Operation module
12 Baseboard management control module
111 First operating system
112 Second operating system
13 Network configuration module
14 Main board
S21~S27 Steps
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
The technical principles of the method, the system, the readable storage medium and the server for automatically recovering the service are as follows:
Two operating systems that serve as redundant backups for each other are provided for an edge computing server host. A user application program is deployed in each operating system, and the configuration information of the user application program is stored in a network configuration module. During actual operation, the server firmware BMC actively monitors the health of the operating system in combination with the application program. Once the operating system crashes, the BMC automatically restarts the server and notifies the server boot firmware (BIOS) to switch the boot entry, so that the server boots into the backup operating system. The application program in the backup operating system then loads from the network configuration module the configuration that was in use before the crash and continues running, which improves the reliability of the whole edge computing node.
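The text does not say how the BMC decides that the operating system is unhealthy. As an illustration only, the sketch below assumes a heartbeat scheme: the operating system (or its application) reports in periodically, and the monitor treats a missed deadline as a crash. The class name `HeartbeatWatchdog`, the `timeout_s` parameter and the timestamps are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatWatchdog:
    """Hypothetical BMC-side monitor: the OS must report in within timeout_s."""

    timeout_s: float
    last_beat: float = 0.0

    def beat(self, now: float) -> None:
        # Called whenever the OS (or its application) signals that it is alive.
        self.last_beat = now

    def is_healthy(self, now: float) -> bool:
        # Healthy as long as the most recent heartbeat is recent enough.
        return (now - self.last_beat) <= self.timeout_s


watchdog = HeartbeatWatchdog(timeout_s=5.0)
watchdog.beat(now=100.0)
print(watchdog.is_healthy(now=103.0))  # last heartbeat 3 s ago
print(watchdog.is_healthy(now=110.0))  # last heartbeat 10 s ago
```

Passing the current time in explicitly keeps the sketch deterministic; a real monitor would read a hardware timer instead.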
Example one
This embodiment provides an automatic service recovery method, characterized in that the method is applied to a server, and the server comprises an operation module and a baseboard management control module connected with the operation module; the operation module is provided with a first operating system, a second operating system and a basic input and output system; the automatic service recovery method comprises the following steps:
after the server is started, guiding the running system of the server to enter the first operating system;
calling the baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, executing the next step:
calling the baseboard management control module to restart the first operating system, and judging whether the restart of the first operating system is normal; if so, calling the basic input and output system to restart the first operating system; if not, calling the basic input and output system to start the second operating system so as to facilitate automatic recovery of the service.
The automatic service recovery method provided by the present embodiment will be described in detail below with reference to the drawings. The automatic service recovery method is applied to a server. Please refer to Fig. 1, which is a schematic diagram of the system architecture of the server. As shown in Fig. 1, the server 1 includes an operation module 11 and a baseboard management control module 12 (BMC module) connected to the operation module 11 (specifically, connected through a KCS interface). The operation module 11 is communicatively connected to a network configuration module 13.
The operation module 11 and the baseboard management control module 12 are disposed on a main board 14 of the server 1. The operation module 11 is provided with a first operating system 111, a second operating system 112 and a basic input/output system, and application programs are deployed on the first operating system 111 and the second operating system 112. The application program can automatically obtain the application configuration file matched with it from the network configuration module 13.
The application program is a general edge computing MEC application program, which is designed and deployed by a user who purchases a server, for example, the application program can be automatic driving, AI operation and data acceleration, and an application configuration file of the application program is also defined according to the characteristics of the application program.
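Because the application configuration lives outside both operating systems, either system can load the same settings. The sketch below illustrates that idea with an in-memory stand-in for the network configuration module; the class `NetworkConfigModule`, its `publish` and `fetch` methods, the application name `mec-app` and the configuration keys are all hypothetical, and the patent does not specify a protocol or file format.

```python
import json


class NetworkConfigModule:
    """Hypothetical stand-in for network configuration module 13."""

    def __init__(self):
        self._store = {}

    def publish(self, app_name, config):
        # Keep the configuration as JSON text, as a file served over a network might be.
        self._store[app_name] = json.dumps(config)

    def fetch(self, app_name):
        # Raises KeyError when no configuration was published for this application.
        return json.loads(self._store[app_name])


module = NetworkConfigModule()
module.publish("mec-app", {"model": "detector-v1", "threads": 4})

# The same application deployed on either operating system sees identical settings.
config_on_first_os = module.fetch("mec-app")
config_on_second_os = module.fetch("mec-app")
print(config_on_first_os == config_on_second_os)
```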
In this embodiment, the first operating system 111 and the second operating system 112 are redundant operating systems.
In this embodiment, the basic input/output system is BIOS firmware that has an operating system recovery function and is solidified on the motherboard. During the startup process, the BIOS firmware interacts with the baseboard management control module 12 to query the operating state of the running system.
Please refer to fig. 2, which is a flowchart illustrating an exemplary embodiment of an automatic service recovery method. As shown in fig. 2, the method for automatically recovering a service specifically includes the following steps:
S21, after the server is started, guiding the running system of the server to enter the first operating system.
S22, after entering the first operating system, the first operating system obtains an application configuration file matched with the application program deployed on the first operating system from the network configuration module, so as to complete service operation.
S23, calling the baseboard management control module to monitor whether the working state of the first operating system is normal or not in real time; if yes, returning to the step S23, namely continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, go to S24.
S24, the baseboard management control module restarts the first operating system and hands over the system management right of the server to the basic input output system (BIOS firmware), and the basic input output system initiates a restart query to the baseboard management control module to determine whether the restart of the first operating system is a normal restart; if yes, go to S25; if not, go to S26.
In this embodiment, the basic input output system initiates a restart query to the baseboard management control module to learn whether the restart of the first operating system was caused by a crash of the first operating system. If so, the restart of the first operating system is an abnormal restart. If not, the restart of the first operating system is a normal restart.
In practical applications, when a server encounters conditions such as unexpected power failure, external impact or software crash, the operating system may crash.
S25, calling the basic input output system to restart the first operating system, and guiding the running system of the server to enter the first operating system.
S26, calling the basic input output system to start the second operating system, and guiding the running system of the server to enter the second operating system.
S27, after entering the second operating system, the second operating system acquires an application configuration file matched with the application program deployed on the second operating system from the network configuration module. In this embodiment, the application program in the second operating system, which serves as the backup, loads from the network configuration module the application configuration that was in use before the crash and continues running, so as to improve the reliability of the whole edge computing node.
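The decision made in steps S24 to S26 can be sketched as follows. This is a minimal model under stated assumptions, not the patented firmware: the names `ResetCause`, `Bmc`, `restart_after_crash`, `query_reset_cause` and `bios_select_boot_target` are hypothetical, and the restart query is represented by a plain method call rather than any real BMC interface.

```python
from enum import Enum


class ResetCause(Enum):
    NORMAL = "normal"      # restart not caused by an operating system crash
    OS_CRASH = "os_crash"  # BMC restarted the server after detecting a crash


class Bmc:
    """Hypothetical model of the baseboard management control module."""

    def __init__(self):
        self.last_reset_cause = ResetCause.NORMAL

    def restart_after_crash(self):
        # S24: the BMC restarts the first operating system and records why.
        self.last_reset_cause = ResetCause.OS_CRASH

    def query_reset_cause(self):
        # Answer to the restart query initiated by the BIOS.
        return self.last_reset_cause


def bios_select_boot_target(bmc):
    # S24 to S26: the BIOS asks the BMC why the restart happened and picks a system.
    if bmc.query_reset_cause() is ResetCause.OS_CRASH:
        return "second_os"  # S26: abnormal restart, boot the backup system
    return "first_os"       # S25: normal restart, boot the first system again


bmc = Bmc()
print(bios_select_boot_target(bmc))  # no crash recorded yet
bmc.restart_after_crash()
print(bios_select_boot_target(bmc))  # crash recorded by the BMC
```

Keeping the reset cause in the BMC rather than in the operating system matters here: the BMC state survives the crash, so the BIOS can still read it on the next boot.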
The automatic service recovery method of the embodiment realizes the automatic service recovery function when the operating system is completely crashed based on the server firmware, thereby greatly shortening the time required by the recovery operation service, reducing the customer loss and simultaneously reducing the manual maintenance cost.
The present embodiment also provides a readable storage medium (also referred to as computer readable storage medium) having a computer program stored thereon, which when executed by a processor implements the automatic service restoration method.
One of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be performed by hardware associated with a computer program. The aforementioned computer program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above; the aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical disks.
Example two
This embodiment provides an automatic service recovery system, characterized in that the system is applied to a server, and the server comprises an operation module and a baseboard management control module connected with the operation module; the operation module is provided with a first operating system, a second operating system and a basic input and output system; the automatic service recovery system comprises:
a guiding unit, used for guiding the running system of the server to enter the first operating system after the server is started;
a calling unit, used for calling the baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, calling the baseboard management control module to restart the first operating system, and judging through a judging unit whether the restart of the first operating system is normal; if so, calling the basic input and output system through the calling unit to restart the first operating system; if not, calling the basic input and output system through the calling unit to start the second operating system so as to automatically recover the service.
The service automatic recovery system provided by the present embodiment will be described in detail with reference to the drawings. The automatic service recovery system according to this embodiment is applied to a server shown in fig. 1. Please refer to fig. 3, which is a schematic structural diagram of an embodiment of an automatic service recovery system. As shown in fig. 3, the service automatic recovery system 3 includes a guiding unit 31, a calling unit 32, and a determining unit 33.
The guiding unit 31 is configured to guide the running system of the server to enter the first operating system after the server is started.
In this embodiment, after the booting unit 31 boots the system to enter the first operating system, the first operating system obtains, from the network configuration module, an application configuration file matched with an application program deployed on the first operating system, so as to complete business operations.
The calling unit 32 coupled to the guiding unit 31 is configured to call the baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, it continues to call the baseboard management control module to monitor the working state of the first operating system; if not, the calling unit 32 calls the baseboard management control module to restart the first operating system, so that the baseboard management control module hands over the system management right of the server to the basic input/output system (BIOS firmware), and calls the basic input/output system to initiate a restart query to the baseboard management control module.
The judging unit 33 coupled to the invoking unit 32 is configured to judge whether the reboot of the first operating system is a normal reboot; if so, calling the basic input and output system to restart the first operating system, and guiding the running system of the server to enter the first operating system; if not, calling the basic input and output system to start the second operating system, and guiding the running system of the server to enter the second operating system.
In this embodiment, the basic input output system initiates a restart query to the baseboard management control module to learn whether the restart of the first operating system was caused by a crash of the first operating system. If so, the restart of the first operating system is an abnormal restart. If not, the restart of the first operating system is a normal restart.
In an embodiment, after entering the second operating system, the second operating system obtains, from the network configuration module, an application configuration file matched with an application program deployed on the second operating system. In this embodiment, the application program in the second operating system, which serves as the backup, loads from the network configuration module the application configuration that was in use before the crash and continues running, so as to improve the reliability of the whole edge computing node.
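One way to picture how the guiding unit 31, calling unit 32 and judging unit 33 cooperate is the sketch below. It is an illustrative assumption only: the class and method names are hypothetical, health monitoring is reduced to a sequence of boolean samples, and a detected failure is always treated as a crash-induced restart.

```python
from typing import Iterable, List


class GuidingUnit:
    def boot(self, target: str) -> str:
        # Stands in for guiding the server into the named operating system.
        return target


class CallingUnit:
    def failure_detected(self, health_samples: Iterable[bool]) -> bool:
        # Monitoring continues while samples are healthy and stops at the first failure.
        return any(not healthy for healthy in health_samples)


class JudgingUnit:
    def next_target(self, restart_caused_by_crash: bool) -> str:
        # A crash-induced restart is abnormal, so the backup system is chosen.
        return "second_os" if restart_caused_by_crash else "first_os"


def run_recovery(health_samples: Iterable[bool]) -> List[str]:
    guiding, calling, judging = GuidingUnit(), CallingUnit(), JudgingUnit()
    boots = [guiding.boot("first_os")]
    if calling.failure_detected(health_samples):
        target = judging.next_target(restart_caused_by_crash=True)
        boots.append(guiding.boot(target))
    return boots


print(run_recovery([True, True, False]))  # a failure occurs on the third sample
print(run_recovery([True, True, True]))   # no failure is observed
```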
It should be noted that the division of the above system into units is only a logical division; in an actual implementation the units may be wholly or partially integrated into one physical entity, or may be physically separate. The units may all be implemented as software called by a processing element, or all implemented in hardware, or some may be implemented as software called by a processing element and the rest in hardware. For example, the guiding unit may be a separately established processing element, or may be integrated in a chip of the system. The guiding unit may also be stored in the memory of the system in the form of program code, and be called by a processing element of the system to execute its functions. The other units are implemented similarly. All or some of the units may be integrated together, or they may be implemented independently. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each of the above units may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software. The above units may be one or more integrated circuits configured to implement the above method, such as one or more Application Specific Integrated Circuits (ASICs), one or more digital signal processors (DSPs), one or more Field Programmable Gate Arrays (FPGAs), and the like. When a unit is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. These units may be integrated together and implemented in the form of a System-on-a-chip (SOC).
EXAMPLE III
This embodiment provides a server, including: a processor, a memory, a transceiver, a communication interface, and/or a system bus; the memory and the communication interface are connected with the processor and the transceiver through the system bus and communicate with each other through it, the memory is used for storing the computer program, the communication interface is used for communicating with other equipment, and the processor and the transceiver are used for running the computer program so as to enable the server to execute the steps of the automatic service recovery method according to embodiment one. In this embodiment, the processor includes an operation module, a baseboard management control module connected to the operation module, and a network configuration module connected to the operation module. The operation module is provided with a first operating system, a second operating system and a basic input and output system (BIOS firmware).
The above-mentioned system bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The communication interface is used for realizing communication between the database access device and other equipment (such as a client, a read-write library and a read-only library). The Memory may include a Random Access Memory (RAM), and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The protection scope of the method for automatically recovering a service according to the present invention is not limited to the execution sequence of the steps listed in this embodiment, and all the schemes of adding, subtracting, and replacing steps in the prior art according to the principles of the present invention are included in the protection scope of the present invention.
The present invention also provides an automatic service recovery system, which can implement the automatic service recovery method of the present invention, but the implementation apparatus of the automatic service recovery method of the present invention includes, but is not limited to, the structure of the automatic service recovery system described in this embodiment, and all structural modifications and substitutions in the prior art made according to the principle of the present invention are included in the scope of the present invention.
In summary, the method, the system, the readable storage medium and the server for automatically recovering the service of the present invention implement the automatic service recovery function when the operating system is completely crashed based on the server firmware, thereby greatly reducing the time required for recovering the operation service, reducing the customer loss, and reducing the labor maintenance cost. The invention effectively overcomes various defects in the prior art and has high industrial utilization value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (10)

1. An automatic service recovery method, characterized in that the method is applied to a server, and the server comprises an operation module and a baseboard management control module connected with the operation module; the operation module is provided with a first operating system, a second operating system and a basic input and output system; the automatic service recovery method comprises the following steps:
after the server is started, guiding the running system of the server to enter the first operating system;
calling the baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, executing the next step:
calling the baseboard management control module to restart the first operating system, and judging whether the restart of the first operating system is normal; if so, calling the basic input and output system to restart the first operating system; if not, calling the basic input and output system to start the second operating system so as to facilitate automatic recovery of the service.
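The control flow of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `BootTarget`, `choose_boot_target`, `recover_service` and the `max_polls` bound are hypothetical, and the health check and the restart performed by the baseboard management control module are passed in as plain callables.

```python
from enum import Enum
from typing import Callable


class BootTarget(Enum):
    FIRST_OS = "first operating system"
    SECOND_OS = "second operating system"


def choose_boot_target(restart_was_normal: bool) -> BootTarget:
    """Decision attributed to the basic input and output system in claim 1."""
    return BootTarget.FIRST_OS if restart_was_normal else BootTarget.SECOND_OS


def recover_service(
    first_os_healthy: Callable[[], bool],
    bmc_restart_first_os: Callable[[], bool],
    max_polls: int = 10,  # hypothetical bound so the sketch terminates
) -> BootTarget:
    """Monitor the first OS; on a fault, restart it and pick the OS to boot.

    first_os_healthy: stand-in for the monitoring done by the baseboard
        management control module.
    bmc_restart_first_os: stand-in for the restart; returns True when the
        restart is judged normal.
    """
    for _ in range(max_polls):
        if first_os_healthy():
            continue  # working state is normal: keep monitoring
        restart_ok = bmc_restart_first_os()
        return choose_boot_target(restart_ok)
    # No fault observed within the polling budget: the first OS keeps running.
    return BootTarget.FIRST_OS
```

For example, a health check that fails on the third poll combined with an abnormal restart yields `BootTarget.SECOND_OS`, which corresponds to the fallback branch of the claim.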
2. The automatic service recovery method according to claim 1, characterized in that
the computing module is in communication connection with a network configuration module;
application programs are deployed on both the first operating system and the second operating system.
3. The method according to claim 2, wherein after entering the first operating system, the method further comprises:
the first operating system acquiring, from the network configuration module, an application configuration file matching the application program.
4. The method of claim 2, further comprising:
after the second operating system is started, the second operating system acquiring, from the network configuration module, an application configuration file matching the application program.
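Claims 2 to 4 imply that the application configuration is held by the network configuration module rather than by either operating system, so whichever operating system boots can fetch the same file. A minimal sketch, assuming a hypothetical in-memory `NetworkConfigModule` and a hypothetical `on_boot` hook (neither name comes from the patent):

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NetworkConfigModule:
    """Hypothetical in-memory stand-in for the network configuration module."""
    configs: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def fetch(self, app_name: str) -> Dict[str, str]:
        # Return a copy so each operating system holds its own configuration.
        return dict(self.configs[app_name])


@dataclass
class OperatingSystem:
    name: str
    app_name: str
    app_config: Dict[str, str] = field(default_factory=dict)

    def on_boot(self, net: NetworkConfigModule) -> None:
        """After start-up, acquire the configuration matching the application."""
        self.app_config = net.fetch(self.app_name)


# Usage: both operating systems end up with the same application configuration.
net = NetworkConfigModule({"web": {"port": "8080"}})
first = OperatingSystem("first", "web")
second = OperatingSystem("second", "web")
first.on_boot(net)
second.on_boot(net)
```

Because the configuration is fetched at boot, the second operating system needs no state from the crashed first operating system to resume the service.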
5. The method according to claim 1, wherein in the process of calling the baseboard management control module to restart the first operating system, the method further comprises:
handing over the system management right of the server to the basic input and output system, the basic input and output system initiating a restart query to the baseboard management control module so as to judge whether the restart of the first operating system is normal.
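The handover in claim 5 can be pictured as a query from the basic input and output system to the baseboard management control module. The sketch below is an assumption-laden illustration: the classes `Bmc` and `Bios`, their method names, and the idea that the module stores the last restart result as a boolean are all hypothetical and are not specified by the patent.

```python
from dataclasses import dataclass


@dataclass
class Bmc:
    """Hypothetical baseboard management control module state."""
    last_restart_normal: bool = True

    def restart_first_os(self, came_up: bool) -> None:
        # Record whether the restarted first OS came up normally (assumed).
        self.last_restart_normal = came_up

    def answer_restart_query(self) -> bool:
        return self.last_restart_normal


class Bios:
    """Hypothetical basic input and output system holding the management right."""

    def __init__(self, bmc: Bmc) -> None:
        self.bmc = bmc

    def pick_boot_entry(self) -> str:
        # Restart query to the baseboard management control module.
        restart_normal = self.bmc.answer_restart_query()
        return "first_os" if restart_normal else "second_os"
```

With this shape, the boot decision stays in firmware and depends only on the answer to the query, not on anything the crashed operating system reports.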
6. The method according to claim 1, wherein the computing module and the baseboard management control module are disposed on a motherboard of the server.
7. The method according to claim 6, wherein the basic input and output system is BIOS firmware that has a running-system recovery function and is solidified on the motherboard.
8. An automatic service recovery system, characterized in that the system is applied to a server, and the server comprises a computing module and a baseboard management control module connected with the computing module; the computing module is provided with a first operating system, a second operating system and a basic input and output system; the automatic service recovery system comprises:
a guiding unit, used for booting the running system of the server into the first operating system after the server is started;
a calling unit, used for calling the baseboard management control module to monitor in real time whether the working state of the first operating system is normal; if so, continuing to call the baseboard management control module to monitor the working state of the first operating system; if not, calling the baseboard management control module to restart the first operating system, and judging through a judging unit whether the restart of the first operating system is normal; if so, calling the basic input and output system through the calling unit to restart the first operating system; if not, calling the basic input and output system through the calling unit to start the second operating system, so that the service is recovered automatically.
9. A readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the automatic service recovery method according to any one of claims 1 to 7.
10. A server, comprising: a processor and a memory;
the memory is used for storing a computer program, and the processor is used for executing the computer program stored by the memory to cause the server to execute the automatic service recovery method according to any one of claims 1 to 7;
the processor comprises a computing module and a baseboard management control module connected with the computing module; the computing module is provided with a first operating system, a second operating system and a basic input and output system.
CN201911275677.5A 2019-12-12 2019-12-12 Service automatic recovery method, system, readable storage medium and server Active CN111124728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911275677.5A CN111124728B (en) 2019-12-12 2019-12-12 Service automatic recovery method, system, readable storage medium and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911275677.5A CN111124728B (en) 2019-12-12 2019-12-12 Service automatic recovery method, system, readable storage medium and server

Publications (2)

Publication Number Publication Date
CN111124728A true CN111124728A (en) 2020-05-08
CN111124728B CN111124728B (en) 2024-02-23

Family

ID=70499955

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911275677.5A Active CN111124728B (en) 2019-12-12 2019-12-12 Service automatic recovery method, system, readable storage medium and server

Country Status (1)

Country Link
CN (1) CN111124728B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111752750A (en) * 2020-05-20 2020-10-09 西安万像电子科技有限公司 Starting method and device of operating system
CN112860477A (en) * 2020-12-31 2021-05-28 京信网络系统股份有限公司 High-reliability operation method and system for operating system, storage medium and server
CN113360347A (en) * 2021-06-30 2021-09-07 南昌华勤电子科技有限公司 Server and control method thereof
CN114968379A (en) * 2021-02-24 2022-08-30 宇瞻科技股份有限公司 Storage device recovery system
CN117033086A (en) * 2023-10-09 2023-11-10 苏州元脑智能科技有限公司 Recovery method and device of operating system, storage medium and server management chip

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030126511A1 (en) * 2001-12-28 2003-07-03 Jen-Tsung Yang Module and method for automatic restoring BIOS device
KR20040047209A (en) * 2002-11-29 2004-06-05 (주)소프트위드솔루션 Method for automatically recovering computer system in network and recovering system for realizing the same
CN101038561A (en) * 2006-03-14 2007-09-19 联想(北京)有限公司 Computer remote control method and system
KR20090120541A (en) * 2008-05-20 2009-11-25 주식회사 이노와이어리스 Method and system for automatic recovery of an embedded operating system
CN101697132A (en) * 2009-10-30 2010-04-21 北京星网锐捷网络技术有限公司 Method, device and network equipment for quickly restarting operating system
CN102855174A (en) * 2011-06-28 2013-01-02 奇智软件(北京)有限公司 Automatic-recovery target program run control method and device in automated testing
CN107678880A (en) * 2017-09-08 2018-02-09 郑州云海信息技术有限公司 A kind of minicomputer calculates partition operating system Backup and Restore device and method
CN107729183A (en) * 2017-10-12 2018-02-23 郑州云海信息技术有限公司 A kind of method and system backed up by FC agreements with recovering linux operating systems
CN108268302A (en) * 2016-12-30 2018-07-10 华为技术有限公司 Realize the method and apparatus that equipment starts
CN108804144A (en) * 2018-05-22 2018-11-13 中国科学院上海高等研究院 Control method/system, storage medium and the electronic equipment of os starting

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
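Cao Hui; Tong Chao; Niu Jianwei; Gao Yuhang: "Research on the micro-reboot self-healing architecture of Minix3" (Minix3的微重启自愈体系架构研究), Journal of Chinese Computer Systems (小型微型计算机系统), no. 03 *
Li Xia: "Implementation of a data backup and restore mechanism for large server programs based on dual-machine hot standby" (一种基于双机热备的大型服务器程序数据备份还原机制的实现), Science and Technology Innovation Herald (科技创新导报), no. 19 *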

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111752750A (en) * 2020-05-20 2020-10-09 西安万像电子科技有限公司 Starting method and device of operating system
CN112860477A (en) * 2020-12-31 2021-05-28 京信网络系统股份有限公司 High-reliability operation method and system for operating system, storage medium and server
CN114968379A (en) * 2021-02-24 2022-08-30 宇瞻科技股份有限公司 Storage device recovery system
CN113360347A (en) * 2021-06-30 2021-09-07 南昌华勤电子科技有限公司 Server and control method thereof
CN113360347B (en) * 2021-06-30 2023-08-25 南昌华勤电子科技有限公司 Server and control method thereof
CN117033086A (en) * 2023-10-09 2023-11-10 苏州元脑智能科技有限公司 Recovery method and device of operating system, storage medium and server management chip
CN117033086B (en) * 2023-10-09 2024-02-09 苏州元脑智能科技有限公司 Recovery method and device of operating system, storage medium and server management chip

Also Published As

Publication number Publication date
CN111124728B (en) 2024-02-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant