CN102394791A - Downtime recovery method and system - Google Patents


Publication number: CN102394791A
Authority: CN (China)
Prior art keywords: server, testing, machine, intranet, delaying
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201110329567XA
Other languages: Chinese (zh)
Inventor: 刘希猛
Current Assignee: Inspur Beijing Electronic Information Industry Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Inspur Beijing Electronic Information Industry Co Ltd
Priority date: 2011-10-26 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2011-10-26
Publication date: 2012-03-28
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201110329567XA
Publication of CN102394791A

Abstract

The invention provides a downtime recovery method and system, relates to the communications field, and solves the problem that manual detection cannot meet the need for fast response to system failures, which makes the system work inefficiently. The method comprises the following steps: a monitoring server periodically and automatically sends heartbeat detection messages to a plurality of detection servers in an intranet according to a preset intranet heartbeat time; the monitoring server receives the heartbeat detection results that the detection servers in the intranet return in response to the heartbeat detection messages; and when the heartbeat detection result returned by a detection server shows that the network of that detection server is down, the monitoring server sends a power-off restart command to the detection server that is down. The technical scheme provided by the invention is suitable for a multi-server network, and achieves automatic and efficient downtime detection and recovery.

Description

Downtime recovery method and system
Technical field
The present invention relates to the field of communications, and in particular to a downtime recovery method and system.
Background
The rapid development of the IT industry is driving enterprise application software toward automation and intelligence. For the developers of such complex applications, testing increasingly complex functional software well is also a difficult problem, especially for test content that must run continuously over long periods. Test programs likewise need to become more intelligent to keep up with the growing complexity of IT software and hardware functions.
As the application environments of software systems grow more complex, the probability of software errors keeps increasing, and a key requirement on software is that it can recover after the system fails. Detecting system errors promptly therefore becomes the first problem to solve.
At present, system errors are generally detected and monitored manually. For servers that run tests continuously, 24 hours a day and 7 days a week, an operator cannot watch the running state of every machine in real time. By the time a manual check finds the system error caused by a server going down, that server may already have been down for some time, which wastes a large amount of test time.
In summary, existing software application environments are increasingly complex and the probability of errors keeps rising, while manual detection cannot meet the need to respond quickly to system errors, so the system works inefficiently.
Summary of the invention
The present invention provides a downtime recovery method and system, which solve the problem that manual detection cannot meet the need to respond quickly to system errors and therefore makes the system work inefficiently.
A downtime recovery method comprises:
a monitoring server periodically and automatically sends heartbeat detection messages to a plurality of test servers in an intranet according to a preset intranet heartbeat interval;
the monitoring server receives the heartbeat detection result that each test server in the intranet returns in response to the heartbeat detection message;
when the heartbeat detection result returned by a test server indicates that the network of that test server is down, the monitoring server sends a power-off restart command to the downed test server.
Preferably, the heartbeat detection message is specifically a ping command.
Preferably, the heartbeat detection result covers two cases: the test server is running normally, or the network of the test server is down.
Preferably, the monitoring server sending the power-off restart command to the downed test server specifically comprises:
the monitoring server uses an Intelligent Platform Management Interface (IPMI) management command to send the power-off restart command to the downed test server.
Preferably, after the step in which the monitoring server sends the power-off restart command to the downed test server when the heartbeat detection result returned by the test server indicates that its network is down, the method further comprises:
the downed test server restarts itself according to the power-off restart command received from the monitoring server.
Preferably, before the step in which the monitoring server periodically and automatically sends heartbeat detection messages to the plurality of test servers in the intranet according to the preset intranet heartbeat interval, the method further comprises:
selecting, from the plurality of test servers in the intranet, one test server that is stable and lightly loaded to act as the monitoring server.
Preferably, the above downtime recovery method further comprises:
configuring the intranet heartbeat interval, so as to instruct the monitoring server to send heartbeat detection messages at this interval, the intranet heartbeat interval being longer than the time a test server needs to restart.
The present invention also provides a downtime recovery system, comprising a monitoring server and a plurality of test servers monitored by the monitoring server, where the monitoring server and the plurality of test servers are in the same intranet and are interconnected through that intranet;
the monitoring server is configured to periodically and automatically send heartbeat detection messages to the plurality of test servers in the intranet according to a preset intranet heartbeat interval, to receive the heartbeat detection result returned by each test server in the intranet, and, when the heartbeat detection result returned by a test server indicates that the network of that test server is down, to send a power-off restart command to the downed test server;
the test server is configured to receive the heartbeat detection message sent by the monitoring server and to return to the monitoring server a heartbeat detection result in response to that message.
Preferably, the monitoring server uses an IPMI management command to send the power-off restart command.
Preferably, the test server is further configured to restart itself according to the power-off restart command received from the monitoring server.
The present invention provides a downtime recovery method and system. The monitoring server periodically and automatically sends heartbeat detection messages to a plurality of test servers in the intranet according to a preset intranet heartbeat interval, receives the heartbeat detection result that each test server returns in response, and, when the result returned by a test server indicates that the network of that test server is down, sends a power-off restart command to the downed test server. Server downtime is thereby detected automatically and promptly, and the response time to server downtime in the system is shortened, which solves the problem that manual detection cannot meet the need to respond quickly to system errors and makes the system work inefficiently.
Description of the drawings
Fig. 1 is a schematic structural diagram of a downtime recovery system provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a downtime recovery method provided by an embodiment of the present invention.
Detailed description of the embodiments
At present, system errors are generally detected and monitored manually. For servers that run tests continuously, 24 hours a day and 7 days a week, an operator cannot watch the running state of every machine in real time. By the time a manual check finds the system error caused by a server going down, that server may already have been down for some time, which wastes a large amount of test time.
In summary, existing software application environments are increasingly complex and the probability of errors keeps rising, while manual detection cannot meet the need to respond quickly to system errors, so the system works inefficiently.
To solve the above problems, embodiments of the present invention provide a downtime recovery method and system. The embodiments are described in detail below with reference to the accompanying drawings. It should be noted that, provided they do not conflict, the embodiments in this application and the features in the embodiments may be combined with one another in any way.
An embodiment of the present invention provides a downtime recovery system whose structure is shown in Fig. 1. The system comprises:
a monitoring server 101 and a plurality of test servers 102 monitored by the monitoring server, where the monitoring server 101 and the plurality of test servers 102 are in the same intranet and are interconnected through that intranet;
the monitoring server 101 is configured to periodically and automatically send heartbeat detection messages to the plurality of test servers 102 in the intranet according to a preset intranet heartbeat interval, to receive the heartbeat detection result returned by each test server 102 in the intranet, and, when the heartbeat detection result returned by a test server 102 indicates that this test server 102 is down, to send a power-off restart command to the downed test server 102;
the test server 102 is configured to receive the heartbeat detection message sent by the monitoring server 101 and to return to the monitoring server 101 a heartbeat detection result in response to that message.
Preferably, the heartbeat detection message sent by the monitoring server 101 is specifically a ping command, and the heartbeat detection result covers two cases: the test server 102 is normal, or the test server 102 is down.
Preferably, the monitoring server 101 uses an IPMI management command to send the power-off restart command.
Preferably, the test server 102 is further configured to restart itself according to the power-off restart command received from the monitoring server 101.
It should be noted that the monitoring server 101 and the test servers 102 are all ordinary servers in the intranet. In general, one stable and lightly loaded server is selected from the servers in the intranet to act as the monitoring server, and the other servers are monitored by it. As the working conditions of the servers in the intranet change, another server can also be manually configured as the monitoring server.
Based on the downtime recovery system described above, an embodiment of the present invention provides a downtime recovery method. The flow of using this method to detect servers in the intranet and control their recovery is shown in Fig. 2 and comprises:
Step 201: from the plurality of test servers in the intranet, select one test server that is stable and lightly loaded to act as the monitoring server.
In this step, a server that runs continuously and stably is selected as the monitoring server.
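The selection in step 201 can be illustrated with a minimal Python sketch. The text does not say how stability or load is measured, so the `ServerStats` fields, the `min_uptime_hours` threshold and the function name below are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerStats:
    """Hypothetical per-server metrics; the patent does not define them."""
    name: str
    uptime_hours: float   # longer uninterrupted uptime is read as "more stable"
    load_average: float   # lower value is read as "lighter load"


def choose_monitoring_server(servers, min_uptime_hours=24.0):
    """Return the lightest-loaded server among those considered stable."""
    if not servers:
        raise ValueError("at least one server is required")
    stable = [s for s in servers if s.uptime_hours >= min_uptime_hours]
    # Fall back to all servers if none meets the assumed stability threshold.
    candidates = stable or list(servers)
    return min(candidates, key=lambda s: s.load_average)
```

Under these assumptions a server with a very low load but only a couple of hours of uptime is passed over in favour of a long-running server with a moderate load.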
Preferably, a baseboard management controller (BMC) network is set up in the intranet. An IPMI tool software package is installed on the monitoring server, the BMC addresses of the test servers are configured, and the IPMI service is enabled. Most servers today integrate IPMI management in the BMC on the mainboard. After the monitoring server finds that the network of a test server is down, it can use an IPMI management command to send a power-off restart instruction to the BMC of that test server, restarting the server quickly and promptly restoring the underlying hardware system on which the test service runs.
Step 202: configure the intranet heartbeat interval, so as to instruct the monitoring server to send heartbeat detection messages at this interval.
In this step, the intranet heartbeat interval is configured. The interval must be longer than the time a test server needs to restart. Otherwise, the monitoring server may detect that the network of the same test server is down again while that server is still restarting, and send the power-off restart command again, so that the same test server is power-cycled repeatedly. Making the intranet heartbeat interval longer than the restart time leaves the test server a buffer between powering off and powering back on during a restart.
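The constraint in step 202 can be checked when the interval is configured. The following Python sketch is only an illustration: the function name and the use of seconds as the unit are assumptions, and the text gives no concrete values.

```python
def validate_heartbeat_interval(heartbeat_interval_s, restart_time_s):
    """Return the buffer (in seconds) left after a restart.

    Raises ValueError if the heartbeat interval is not strictly longer than
    the restart time, because a server that is still rebooting would then be
    probed again and power-cycled a second time.
    """
    if restart_time_s <= 0:
        raise ValueError("restart time must be positive")
    if heartbeat_interval_s <= restart_time_s:
        raise ValueError(
            "heartbeat interval must be longer than the server restart time"
        )
    return heartbeat_interval_s - restart_time_s
```

For example, with an assumed restart time of 180 seconds, an interval of 300 seconds leaves a 120 second buffer, while an interval of 100 seconds is rejected.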
Preferably, the startup items through which the monitoring server monitors the test servers can also be configured. By calling, from the startup items, the resource monitoring and launch procedures specific to different test programs, different continuous test tasks can be restored intelligently.
Step 203: the monitoring server periodically and automatically sends heartbeat detection messages to the plurality of test servers in the intranet according to the preset intranet heartbeat interval.
The simplest approach is to use a ping command issued over the network at regular intervals as the heartbeat detection message.
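The periodic sending in step 203 amounts to a timed loop over the test servers. The Python sketch below is a simplified illustration, not the patented implementation: `probe` and `on_down` are hypothetical callbacks standing in for the heartbeat check and the restart action, and `rounds` bounds the loop so that the example terminates.

```python
import time


def run_heartbeat_rounds(servers, probe, on_down, interval_s, rounds,
                         sleep=time.sleep):
    """Probe every server once per round and act on the ones found down.

    probe(server) returns True when the server answered the heartbeat.
    on_down(server) is called for each server that did not answer.
    Returns, per round, the list of servers that were found down.
    """
    history = []
    for round_no in range(rounds):
        down = [s for s in servers if not probe(s)]
        for server in down:
            on_down(server)
        history.append(down)
        if round_no < rounds - 1:
            sleep(interval_s)  # wait one heartbeat interval between rounds
    return history
```

Passing the sleep function as a parameter keeps the loop testable without real waiting; a long-running monitor would simply loop without the `rounds` bound.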
Step 204: the monitoring server receives the heartbeat detection result that each test server in the intranet returns in response to the heartbeat detection message.
A heartbeat detection program runs on each test server. It responds to the heartbeat detection message it receives and returns the current heartbeat detection result. The returned result is one of two kinds: the test server is running normally, or the network of the test server is down.
When the heartbeat is detected with the ping command, this step uses a shell script to derive the working state of the remote test server automatically from the returned ping result: UP (the test server is running normally) or DOWN (the network of the test server is down).
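The text describes a shell script for this mapping; as an illustrative alternative, the Python sketch below derives UP or DOWN from the exit status of ping. It assumes the Linux iputils `ping`, where `-c` sets the number of probes, `-W` sets the reply timeout in seconds, and exit status 0 means at least one reply was received. The function names and default values are hypothetical.

```python
import subprocess


def build_ping_command(host, count=3, timeout_s=2):
    """Build a Linux iputils-style ping command (flag meanings are assumed)."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def classify_heartbeat(returncode):
    """Map a ping exit status to the two states named in the text."""
    return "UP" if returncode == 0 else "DOWN"


def check_server(host, run=subprocess.run):
    """Ping one test server and return "UP" or "DOWN"."""
    result = run(
        build_ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return classify_heartbeat(result.returncode)
```

The `run` parameter exists only so the check can be exercised without a network; in normal use `check_server("10.0.0.5")` would invoke the real ping binary against that (example) address.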
Step 205: when the heartbeat detection result returned by a test server indicates that the network of that test server is down, the monitoring server sends a power-off restart command to the downed test server.
Specifically, in this step the monitoring server sends a server restart instruction over the IPMI management network to the BMC of the downed test server, and this instruction acts directly on the power supply of the test server. Even when the test server has hung because of the test program (a software crash), it can still be restarted quickly.
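The text does not name a specific IPMI tool or command. One common way to issue such a request is the `ipmitool` command-line utility, whose `chassis power cycle` subcommand asks the BMC to power the machine off and back on. The Python sketch below assumes `ipmitool` is installed on the monitoring server and that the BMC is reachable over the `lanplus` interface; the host, user and password values in any call are placeholders.

```python
import subprocess


def build_power_cycle_command(bmc_host, user, password):
    """Build an ipmitool invocation that power-cycles a server via its BMC."""
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 over LAN (assumed to be enabled on the BMC)
        "-H", bmc_host,    # BMC address of the downed test server
        "-U", user,
        "-P", password,
        "chassis", "power", "cycle",
    ]


def power_cycle(bmc_host, user, password, run=subprocess.run):
    """Send the power-cycle request; return True if ipmitool exited with 0."""
    result = run(
        build_power_cycle_command(bmc_host, user, password),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Because the request goes to the BMC rather than to the operating system, it matches the point made above that the restart still works when the server software has hung. Passing the password on the command line is a simplification for the sketch.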
Step 206: the downed test server restarts itself according to the power-off restart command received from the monitoring server.
After the test server restarts, the resource monitoring and launch procedure of the test program, which have been added to the startup flow of the test server, run automatically. The heartbeat detection program is thus restored automatically without manual intervention, which ensures that continuous testing resumes quickly and saves the manpower, material resources and time cost of testing.
Embodiments of the present invention provide a downtime recovery method and system. The monitoring server periodically and automatically sends heartbeat detection messages to a plurality of test servers in the intranet according to a preset intranet heartbeat interval, receives the heartbeat detection result that each test server returns in response, and, when the result returned by a test server indicates that the network of that test server is down, sends a power-off restart command to the downed test server. Server downtime is thereby detected automatically and promptly, and the response time to server downtime in the system is shortened, which solves the problem that manual detection cannot meet the need to respond quickly to system errors and makes the system work inefficiently. The resource monitoring and launch procedure that the test program requires are added to the operating system startup flow of the test server. Once the server network is back to normal, the test program is restored automatically without manual intervention, which ensures that continuous testing resumes quickly and saves the manpower, material resources and time cost of testing. The key to automatic recovery is coordinating the order of resource monitoring and test program startup.
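The ordering of resource monitoring and test program startup can be sketched as follows. This Python example is only one possible reading: the resource checks, the retry count and the wait time are hypothetical, and the text does not specify which resources a test program needs.

```python
import time


def start_after_resources_ready(resource_checks, start_test_program,
                                retries=10, wait_s=5.0, sleep=time.sleep):
    """Start the test program only once every resource check passes.

    resource_checks is a list of zero-argument callables returning True when
    a required resource (for example the network) is ready.
    Returns True if the test program was started, False if retries ran out.
    """
    for attempt in range(retries):
        if all(check() for check in resource_checks):
            start_test_program()
            return True
        if attempt < retries - 1:
            sleep(wait_s)  # give the resources time to come up after boot
    return False
```

A script like this could be registered as a startup item so that, after a power-off restart, the test program resumes only after the resources it depends on are available.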
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be implemented as a computer program flow. The computer program can be stored in a computer-readable storage medium and is executed on a corresponding hardware platform (such as a system, equipment, apparatus or device); when executed, it performs one of the steps of the method embodiment or a combination of them.
Alternatively, all or part of the steps of the above embodiments can also be implemented with integrated circuits. The steps can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
Each device, functional module or functional unit in the above embodiments can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by a plurality of computing devices.
When a device, functional module or functional unit in the above embodiments is implemented as a software functional module and sold or used as an independent product, it can be stored in a computer-readable storage medium. The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be defined by the protection scope of the claims.

Claims (10)

1. A downtime recovery method, characterized by comprising:
a monitoring server periodically and automatically sends heartbeat detection messages to a plurality of test servers in an intranet according to a preset intranet heartbeat interval;
the monitoring server receives the heartbeat detection result that each test server in the intranet returns in response to the heartbeat detection message;
when the heartbeat detection result returned by a test server indicates that the network of that test server is down, the monitoring server sends a power-off restart command to the downed test server.
2. The downtime recovery method according to claim 1, characterized in that the heartbeat detection message is specifically a ping command.
3. The downtime recovery method according to claim 1 or 2, characterized in that the heartbeat detection result covers two cases: the test server is running normally, or the network of the test server is down.
4. The downtime recovery method according to claim 1, characterized in that the monitoring server sending the power-off restart command to the downed test server specifically comprises:
the monitoring server uses an Intelligent Platform Management Interface (IPMI) management command to send the power-off restart command to the downed test server.
5. The downtime recovery method according to claim 1, characterized in that, after the step in which the monitoring server sends the power-off restart command to the downed test server when the heartbeat detection result returned by the test server indicates that its network is down, the method further comprises:
the downed test server restarts itself according to the power-off restart command received from the monitoring server.
6. The downtime recovery method according to claim 1, characterized in that, before the step in which the monitoring server periodically and automatically sends heartbeat detection messages to the plurality of test servers in the intranet according to the preset intranet heartbeat interval, the method further comprises:
selecting, from the plurality of test servers in the intranet, one test server that is stable and lightly loaded to act as the monitoring server.
7. The downtime recovery method according to claim 1, characterized in that the method further comprises:
configuring the intranet heartbeat interval, so as to instruct the monitoring server to send heartbeat detection messages at this interval, the intranet heartbeat interval being longer than the time a test server needs to restart.
8. A downtime recovery system, characterized by comprising a monitoring server and a plurality of test servers monitored by the monitoring server, where the monitoring server and the plurality of test servers are in the same intranet and are interconnected through that intranet;
the monitoring server is configured to periodically and automatically send heartbeat detection messages to the plurality of test servers in the intranet according to a preset intranet heartbeat interval, to receive the heartbeat detection result returned by each test server in the intranet, and, when the heartbeat detection result returned by a test server indicates that the network of that test server is down, to send a power-off restart command to the downed test server;
the test server is configured to receive the heartbeat detection message sent by the monitoring server and to return to the monitoring server a heartbeat detection result in response to that message.
9. The downtime recovery system according to claim 8, characterized in that
the monitoring server uses an IPMI management command to send the power-off restart command.
10. The downtime recovery system according to claim 7, characterized in that
the test server is further configured to restart itself according to the power-off restart command received from the monitoring server.
CN201110329567XA 2011-10-26 2011-10-26 Downtime recovery method and system Pending CN102394791A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110329567XA CN102394791A (en) 2011-10-26 2011-10-26 Downtime recovery method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110329567XA CN102394791A (en) 2011-10-26 2011-10-26 Downtime recovery method and system

Publications (1)

Publication Number Publication Date
CN102394791A (en) 2012-03-28

Family

ID=45862002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110329567XA Pending CN102394791A (en) 2011-10-26 2011-10-26 Downtime recovery method and system

Country Status (1)

Country Link
CN (1) CN102394791A (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103701661A (en) * 2013-12-23 2014-04-02 浪潮(北京)电子信息产业有限公司 Method and system for realizing node monitoring
CN103729280A (en) * 2013-12-23 2014-04-16 国云科技股份有限公司 High availability mechanism for virtual machine
CN103825778A (en) * 2014-02-19 2014-05-28 互联网域名系统北京市工程研究中心有限公司 DNS downtime detection switching method and system based on cloud detection
CN103944755A (en) * 2014-04-02 2014-07-23 云南大学 Universal method for detection and recovery of communication dead halt of network switching device
CN104536875A (en) * 2015-01-16 2015-04-22 浪潮电子信息产业股份有限公司 Automatic server restart testing method based on IPMI
CN104714863A (en) * 2015-02-06 2015-06-17 浪潮电子信息产业股份有限公司 Method for completely storing Raid card logs on basis of Linux operation system after system crashes
CN104965727A (en) * 2015-04-29 2015-10-07 无锡天脉聚源传媒科技有限公司 Method and device for restarting server
WO2016197876A1 (en) * 2015-06-11 2016-12-15 阿里巴巴集团控股有限公司 Remote control method, remote server, management device, and terminal
CN106326042A (en) * 2016-08-19 2017-01-11 浪潮(北京)电子信息产业有限公司 Method and device for determining operating state
CN106708656A (en) * 2015-07-30 2017-05-24 北京国双科技有限公司 Method and device for recovering user operations
CN107682889A (en) * 2017-09-11 2018-02-09 北京奇安信科技有限公司 Wireless network performance method of testing, apparatus and system
CN108234184A (en) * 2016-12-22 2018-06-29 上海诺基亚贝尔股份有限公司 For the method and apparatus of managing user information
CN108847982A (en) * 2018-06-26 2018-11-20 郑州云海信息技术有限公司 A kind of distributed storage cluster and its node failure switching method and apparatus
CN109324834A (en) * 2018-09-19 2019-02-12 郑州云海信息技术有限公司 A kind of system and method that distributed storage server is restarted automatically
CN110132692A (en) * 2019-05-27 2019-08-16 福州迈新生物技术开发有限公司 Method for reducing specimen sample and consumable material loss of pathological staining system
WO2020107205A1 (en) * 2018-11-27 2020-06-04 刘馥祎 Computing device maintenance method and apparatus, storage medium and program product
CN111767178A (en) * 2020-05-20 2020-10-13 北京奇艺世纪科技有限公司 Physical machine performance testing method and device
CN113225216A (en) * 2021-05-20 2021-08-06 国网山西省电力公司太原供电公司 Method for automatically restarting data transmission exchanger and data transmission exchange device
CN113867815A (en) * 2021-09-17 2021-12-31 杭州当虹科技股份有限公司 Server suspension monitoring and automatic restarting method and server applying same
CN115102838A (en) * 2022-06-14 2022-09-23 阿里巴巴(中国)有限公司 Emergency processing method and device for server downtime risk and electronic equipment
CN115278213A (en) * 2022-07-11 2022-11-01 海南乾唐视联信息技术有限公司 Offline detection method, server, electronic device and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101299680A (en) * 2008-06-17 2008-11-05 中国移动通信集团江苏有限公司 Method for implementing quick recovery after delay of WLAN AP
CN101345663A (en) * 2008-08-22 2009-01-14 杭州华三通信技术有限公司 Heartbeat detection method and heartbeat detection apparatus
CN101355464A (en) * 2008-08-26 2009-01-28 广州鼎坚资讯科技有限公司 Electric control method and apparatus for monitoring network wire break and automatically restarting network equipment
CN101453366A (en) * 2007-11-30 2009-06-10 英业达股份有限公司 Method and system for on-line repair in real-time
CN101582787A (en) * 2008-05-16 2009-11-18 中兴通讯股份有限公司 Double-computer backup system and backup method
CN101610188A (en) * 2009-07-30 2009-12-23 迈普通信技术股份有限公司 Sip server restoring method of service process fault and sip server
CN101964729A (en) * 2009-07-22 2011-02-02 英业达股份有限公司 Test system and test method thereof
CN102026042A (en) * 2009-09-18 2011-04-20 中兴通讯股份有限公司 Keep-alive and self-healing method and device for advanced telecom computing architecture control surface

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103701661B (en) * 2013-12-23 2017-08-25 浪潮(北京)电子信息产业有限公司 A kind of method and system for realizing monitoring nodes
CN103729280A (en) * 2013-12-23 2014-04-16 国云科技股份有限公司 High availability mechanism for virtual machine
CN103701661A (en) * 2013-12-23 2014-04-02 浪潮(北京)电子信息产业有限公司 Method and system for realizing node monitoring
CN103825778A (en) * 2014-02-19 2014-05-28 互联网域名系统北京市工程研究中心有限公司 DNS downtime detection switching method and system based on cloud detection
CN103825778B (en) * 2014-02-19 2018-02-27 互联网域名系统北京市工程研究中心有限公司 DNS based on cloud detection delays machine testing switching method and system
CN103944755A (en) * 2014-04-02 2014-07-23 云南大学 Universal method for detection and recovery of communication dead halt of network switching device
CN104536875A (en) * 2015-01-16 2015-04-22 浪潮电子信息产业股份有限公司 Automatic server restart testing method based on IPMI
CN104714863A (en) * 2015-02-06 2015-06-17 浪潮电子信息产业股份有限公司 Method for completely storing RAID card logs after a system crash, based on the Linux operating system
CN104965727A (en) * 2015-04-29 2015-10-07 无锡天脉聚源传媒科技有限公司 Method and device for restarting server
CN104965727B (en) * 2015-04-29 2018-10-26 无锡天脉聚源传媒科技有限公司 Method and device for restarting server
CN106302618A (en) * 2015-06-11 2017-01-04 阿里巴巴集团控股有限公司 Remote control method, remote server, management device, and terminal
WO2016197876A1 (en) * 2015-06-11 2016-12-15 阿里巴巴集团控股有限公司 Remote control method, remote server, management device, and terminal
CN106708656A (en) * 2015-07-30 2017-05-24 北京国双科技有限公司 Method and device for recovering user operations
CN106708656B (en) * 2015-07-30 2020-05-22 北京国双科技有限公司 User operation recovery method and device
CN106326042A (en) * 2016-08-19 2017-01-11 浪潮(北京)电子信息产业有限公司 Method and device for determining operating state
CN106326042B (en) * 2016-08-19 2020-02-07 浪潮(北京)电子信息产业有限公司 Method and device for determining running state
CN108234184A (en) * 2016-12-22 2018-06-29 上海诺基亚贝尔股份有限公司 Method and apparatus for managing user information
CN108234184B (en) * 2016-12-22 2021-01-15 上海诺基亚贝尔股份有限公司 Method and apparatus for managing user information
CN107682889A (en) * 2017-09-11 2018-02-09 北京奇安信科技有限公司 Wireless network performance testing method, apparatus and system
CN108847982A (en) * 2018-06-26 2018-11-20 郑州云海信息技术有限公司 Distributed storage cluster and node fault switching method and device thereof
CN108847982B (en) * 2018-06-26 2021-11-19 郑州云海信息技术有限公司 Distributed storage cluster and node fault switching method and device thereof
CN109324834A (en) * 2018-09-19 2019-02-12 郑州云海信息技术有限公司 System and method for automatically restarting a distributed storage server
WO2020107205A1 (en) * 2018-11-27 2020-06-04 刘馥祎 Computing device maintenance method and apparatus, storage medium and program product
CN110132692A (en) * 2019-05-27 2019-08-16 福州迈新生物技术开发有限公司 Method for reducing specimen sample and consumable material loss of pathological staining system
CN110132692B (en) * 2019-05-27 2021-09-28 福州迈新生物技术开发有限公司 Method for reducing specimen sample and consumable material loss of pathological staining system
CN111767178A (en) * 2020-05-20 2020-10-13 北京奇艺世纪科技有限公司 Physical machine performance testing method and device
CN111767178B (en) * 2020-05-20 2023-09-01 北京奇艺世纪科技有限公司 Physical machine performance test method and device
CN113225216A (en) * 2021-05-20 2021-08-06 国网山西省电力公司太原供电公司 Method for automatically restarting data transmission exchanger and data transmission exchange device
CN113867815A (en) * 2021-09-17 2021-12-31 杭州当虹科技股份有限公司 Server suspension monitoring and automatic restarting method and server applying same
CN113867815B (en) * 2021-09-17 2023-08-11 杭州当虹科技股份有限公司 Method for monitoring server suspension and automatically restarting and server applying same
CN115102838A (en) * 2022-06-14 2022-09-23 阿里巴巴(中国)有限公司 Emergency processing method and device for server downtime risk and electronic equipment
CN115102838B (en) * 2022-06-14 2024-02-27 阿里巴巴(中国)有限公司 Emergency processing method and device for server downtime risk and electronic equipment
CN115278213A (en) * 2022-07-11 2022-11-01 海南乾唐视联信息技术有限公司 Offline detection method, server, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN102394791A (en) Downtime recovery method and system
CN108847982B (en) Distributed storage cluster and node fault switching method and device thereof
US8954784B2 (en) Reduced power failover
TWI495297B (en) Process, computer readable media, and system for device power management using network connections
US20060041767A1 (en) Methods, devices and computer program products for controlling power supplied to devices coupled to an uninterruptible power supply (UPS)
CN101753357A (en) Network server centralized monitoring system and method
CN102394914A (en) Cluster brain-split processing method and device
CN106155844B (en) Self-recovery method and self-recovery system for WEB server
CN102360324B (en) Failure recovery method and equipment for failure recovery
CN102880527B (en) Data recovery method of baseboard management controller
EP1745374A2 (en) Dynamic migration of virtual machine computer programs
CN102629224A (en) Method and device of integrated data disaster recovery based on cloud platform
CN102624919A (en) Distributed service integrated system for service-oriented architecture and application method thereof
CN106330523A (en) Cluster server disaster recovery system and method, and server node
CN105242980A (en) Complementary watchdog system and complementary watchdog monitoring method
CN111400104B (en) Data synchronization method and device, electronic equipment and storage medium
TW200426571A (en) Policy-based response to system errors occurring during os runtime
CN102354261A (en) Remote control system for power supply switches of machine room servers
CN105335256A (en) Method, device and system for switching backup disks in complete cabinet server
CN101207519A (en) Version server, operation maintenance unit and method for restoring failure
CN103178977A (en) Computer system and starting-up management method of same
CN102521060A (en) Pseudo halt solving method of high-availability cluster system based on watchdog local detecting technique
EP2887592A1 (en) ENUM-DNS disaster recovery method and system in IMS network
CN102957563B (en) Linux clustering fault automatic recovery method and Linux clustering fault automatic recovery system
TW200304297A (en) Clustered/fail-over remote hardware management system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20120328