CN108776625A - Service fault repair method, device, and storage medium - Google Patents
- Publication number
- CN108776625A
- Authority
- CN
- China
- Prior art keywords
- failure
- service
- fault
- solution
- knowledge library
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
Abstract
The invention discloses a service fault repair method, device, and storage medium. The method includes: monitoring the running state of a service in real time; when the service is in a fault state, determining whether the fault already exists in a fault knowledge base; and, when it is determined that the fault does not exist in the fault knowledge base, recovering from the fault according to a preset strategy. Compared with conventional manual handling, real-time monitoring and an intelligent repair process allow the service to be repaired automatically when it fails.
Description
Technical field
The present invention relates to computer technology, and in particular to a service fault repair method, device, and storage medium.
Background
In the information age, monitoring application services has become increasingly important, because the normal operation of application services generates substantial economic benefit for a company. Faults inevitably occur while an application is running, and today they are mostly resolved manually. Manual handling is labor-intensive and inefficient: some time always passes between discovering a problem and finishing its handling, which disrupts normal use of the application and has a considerable effect on the company's business efficiency.
Summary of the invention
To solve the above technical problem, the present invention provides a service fault repair method, device, and storage medium that can repair service faults intelligently.
To achieve the object of the invention, the present invention provides a service fault repair method, the method including:
monitoring the running state of a service in real time; when the service is in a fault state, determining whether the fault already exists in a fault knowledge base; and, when it is determined that the fault does not exist in the fault knowledge base, recovering from the fault according to a preset strategy.
Further, the method also includes:
when it is determined that the fault exists in the fault knowledge base, recovering from the fault according to the solution recorded in the fault knowledge base.
Further, the method also includes: recording running log information of the service.
Recovering from the fault according to the preset strategy includes:
obtaining the running log information of the service;
obtaining the cause of the fault from the log information;
querying the solutions preset in the program according to the cause of the fault;
recovering from the fault according to the solution.
Further, after recovering from the fault according to the preset strategy, the method also includes:
obtaining an evaluation result of the fault recovery and, when the evaluation result is correct, storing the cause of the fault and the solution in the fault knowledge base.
Further, when the service is in a fault state, a preset timer is started;
if the fault has still not been cleared when the timer reaches the preset time, a manual-handling alarm is raised;
the fault is then recovered according to the instructions that are input.
To achieve the object of the invention, the present invention also provides a service fault repair device, the device including a monitoring module and a recovery module, wherein:
the monitoring module is configured to monitor the running state of the service in real time;
the recovery module is configured to determine, when the service is in a fault state, whether the fault already exists in the fault knowledge base and, when it is determined that the fault does not exist in the fault knowledge base, to recover from the fault according to a preset strategy.
Further, in the device:
when the recovery module determines that the fault exists in the fault knowledge base, it recovers from the fault according to the solution recorded in the fault knowledge base.
Further, the device also includes a logging module configured to record running log information of the service.
The recovery module recovers from the fault according to the preset strategy as follows:
the recovery module obtains the running log information of the service;
the recovery module obtains the cause of the fault from the log information;
the recovery module queries the solutions preset in the program according to the cause of the fault;
the recovery module recovers from the fault according to the solution.
Further, the device also includes a storing module used after the fault has been recovered according to the preset strategy:
the storing module obtains an evaluation result of the fault recovery and, when the evaluation result is correct, stores the cause of the fault and the solution in the fault knowledge base.
To achieve the object of the invention, the present invention also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the above method are implemented.
Compared with the prior art, the present invention monitors the running state of the service in real time; when the service is in a fault state, it determines whether the fault already exists in the fault knowledge base and, when the fault does not exist there, recovers from the fault according to a preset strategy. Real-time monitoring and an intelligent repair process allow the service to be repaired automatically when it fails.
Other features and advantages of the present invention are set forth in the description that follows and will in part become apparent from the description or be understood by practicing the invention. The objects and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the drawings.
Description of the drawings
The drawings provide a further understanding of the technical solution of the present invention and form part of the specification. Together with the embodiments of this application they serve to explain the technical solution of the invention and do not limit it.
Fig. 1 is a flowchart of the service fault repair method of Embodiment 1 of the present invention;
Fig. 2 is another flowchart of the service fault repair method of Embodiment 2 of the present invention;
Fig. 3 is a structural schematic diagram of the service fault repair device of Embodiment 3 of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the drawings. Provided they do not conflict, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
Embodiment one
The present invention provides a service fault repair method. As shown in Fig. 1, the method includes steps S11 and S12:
S11: monitor the running state of the service in real time;
S12: when the service is in a fault state, determine whether the fault already exists in the fault knowledge base; when it is determined that the fault does not exist in the fault knowledge base, recover from the fault according to a preset strategy.
In this embodiment of the present invention, when the service is in a fault state, the fault is recovered according to a preset strategy. Compared with conventional manual handling, service faults can be repaired intelligently.
When the service is in a fault state, the method also includes: raising a fault alarm.
This embodiment also includes:
when it is determined that the fault exists in the fault knowledge base, recovering from the fault according to the solution recorded in the fault knowledge base.
Recovering from the fault according to the solution recorded in the fault knowledge base includes: obtaining the cause of the fault of the failed service, finding the corresponding solution in the fault knowledge base according to that cause, obtaining the repair program that matches the server state, and performing the repair automatically.
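The two recovery paths of this embodiment can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the knowledge base is modeled as a dictionary keyed by (fault cause, server state), and the names `recover`, `KnowledgeBase`, `kb`, and `fallback` are hypothetical. Python is used here only to keep the sketch short.

```python
from typing import Callable, Dict, Tuple

# Hypothetical knowledge base: (fault cause, server state) -> repair routine.
RepairProgram = Callable[[], str]
KnowledgeBase = Dict[Tuple[str, str], RepairProgram]


def recover(cause: str, server_state: str,
            knowledge_base: KnowledgeBase,
            preset_strategy: Callable[[str], str]) -> str:
    """Repair from the knowledge base if the fault is known, else use the preset strategy."""
    repair = knowledge_base.get((cause, server_state))
    if repair is not None:
        # Known fault: run the stored repair program matched to the server state.
        return repair()
    # Unknown fault: fall back to the preset strategy.
    return preset_strategy(cause)


# Illustrative data only; causes and states are made up for the example.
kb: KnowledgeBase = {
    ("process_exited", "stopped"): lambda: "restarted service",
}
fallback = lambda cause: f"preset strategy applied to {cause}"

known = recover("process_exited", "stopped", kb, fallback)
unknown = recover("disk_full", "degraded", kb, fallback)
```

Keying the lookup on both cause and server state reflects the text's requirement that the repair program match the server state; a real system might use a richer matching rule.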
This embodiment also includes: recording running log information of the service.
Recovering from the fault according to the preset strategy includes:
obtaining the running log information of the service;
obtaining the cause of the fault from the log information;
querying the solutions preset in the program according to the cause of the fault;
recovering from the fault according to the solution.
Fault causes and the solutions for those causes are preset in the program. Keyword information is extracted from the logs to analyze the fault type and fault cause, the cause is matched to a solution, and the repair is carried out automatically according to that solution.
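The keyword-based analysis described above might look like the following sketch. The rule table, the keywords, and the function name `diagnose` are assumptions made for illustration; the patent does not specify which keywords or solutions are preset.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical preset rules: log keyword -> (fault cause, solution).
PRESET_RULES: Dict[str, Tuple[str, str]] = {
    "OutOfMemoryError": ("heap exhausted", "restart service with larger heap"),
    "Address already in use": ("port conflict", "free the port and restart"),
    "Connection refused": ("dependency down", "restart the dependency"),
}


def diagnose(log_lines: List[str]) -> Optional[Tuple[str, str]]:
    """Return (cause, solution) for the most recent log line matching a preset keyword."""
    for line in reversed(log_lines):
        for keyword, diagnosis in PRESET_RULES.items():
            if keyword in line:
                return diagnosis
    return None  # no preset rule matched; the fault cause stays unknown


logs = [
    "INFO  server started",
    "ERROR java.net.BindException: Address already in use",
]
result = diagnose(logs)
```

Scanning from the newest line backwards is a design choice of this sketch: the latest error is assumed to be the most relevant to the current fault.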
Optionally, after recovering from the fault according to the preset strategy, the method also includes:
obtaining an evaluation result of the fault recovery and, when the evaluation result is correct, storing the cause of the fault and the solution in the fault knowledge base.
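A minimal sketch of this feedback step, assuming the evaluation arrives as a boolean and the knowledge base is an in-memory dictionary (both are assumptions; `RecoveryRecord` and `store_if_correct` are hypothetical names):

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RecoveryRecord:
    fault_id: str
    cause: str
    solution: str


def store_if_correct(record: RecoveryRecord, evaluation_correct: bool,
                     knowledge_base: Dict[str, RecoveryRecord]) -> bool:
    """Add the record to the knowledge base only when the recovery was evaluated as correct."""
    if not evaluation_correct:
        return False
    knowledge_base[record.fault_id] = record
    return True


kb: Dict[str, RecoveryRecord] = {}
good = RecoveryRecord("F001", "port conflict", "free the port and restart")
bad = RecoveryRecord("F002", "unknown", "reboot host")
stored_good = store_if_correct(good, True, kb)
stored_bad = store_if_correct(bad, False, kb)
```

Gating the write on the evaluation keeps incorrect repairs out of the knowledge base, so only validated solutions are reused for later occurrences of the same fault.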
In an optional embodiment, when the service is in a fault state, a preset timer is started;
if the fault has still not been cleared when the timer reaches the preset time, a manual-handling alarm is raised;
the fault is then recovered according to the instructions that are input.
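The timer-based escalation can be sketched as below. The polling loop, the parameter names, and the injectable `clock` and `sleep` functions are assumptions added so the example runs deterministically; the patent only states that a timer is started and that an alarm is raised if the fault persists at the preset time.

```python
import time
from typing import Callable, List


def repair_with_deadline(try_repair: Callable[[], bool],
                         timeout_s: float,
                         raise_manual_alarm: Callable[[], None],
                         clock: Callable[[], float] = time.monotonic,
                         sleep: Callable[[float], None] = time.sleep,
                         poll_s: float = 1.0) -> bool:
    """Retry automatic repair until it succeeds or the preset time elapses.

    Returns True if the fault cleared automatically; otherwise raises the
    manual-handling alarm once and returns False.
    """
    deadline = clock() + timeout_s  # the timer starts when the fault is detected
    while clock() < deadline:
        if try_repair():
            return True
        sleep(poll_s)
    raise_manual_alarm()
    return False


class FakeClock:
    """Deterministic clock for the demonstration; sleeping just advances time."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


fake = FakeClock()
alarms: List[str] = []
cleared = repair_with_deadline(lambda: False, 3.0,
                               lambda: alarms.append("manual handling required"),
                               clock=fake.time, sleep=fake.sleep)
```

After the alarm, an operator would supply the recovery instructions; that interactive step is outside the scope of this sketch.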
Optionally, the service is a Web service.
In this embodiment of the present invention, the running state of the service is monitored in real time; when the service is in a fault state, it is determined whether the fault already exists in the fault knowledge base and, when the fault does not exist there, the fault is recovered according to a preset strategy. Compared with conventional manual handling, the service can raise alarms in real time, the cause of the service fault can be diagnosed and analyzed, and the fault can be repaired automatically so that the service returns to normal operation. Service faults are thus resolved promptly and effectively, which reduces the impact they cause.
Embodiment two
This embodiment describes the method of the above embodiment in detail.
First, the monitoring module monitors the status information of the service in real time. When the service fails, it raises an alarm and analyzes the cause of the fault from information such as the service status and the logs. Once the cause and a solution have been located, the fault is repaired automatically so that the service runs normally again. During fault handling, information such as the fault cause, the analysis result, and the solution is saved. The user can evaluate the handling result; if the handling was correct, the program can handle the same kind of fault in the same way the next time it occurs, further shortening fault handling time.
As shown in Fig. 2, monitoring and intelligent repair of a Web service mainly includes the following steps:
(1) Monitor the status information of the Web service in real time.
(2) When the service fails, raise a fault alarm in real time; while the user is being notified, automatically proceed to fault analysis and handling.
(3) Analyze the cause of the service fault from information such as the service status and the running logs. If the same fault already exists in the fault knowledge base, repair it directly; if it is a new fault, add it to the fault knowledge base.
(4) Locate the fault cause and determine a solution. Handle the fault of the application service by repairing it in a targeted way according to the cause found in the logs, so that the service returns to normal operation.
(5) The user can evaluate how the fault was handled. If the fault cause and the repair method are correct, later occurrences of the same fault are handled directly in the way recorded in the fault knowledge base.
The present invention can be implemented in Java. The monitoring module collects the status information of the service in real time and judges from it whether the service is running normally. When the service fails, an alarm is sent to notify the user and intelligent fault analysis is performed: the analysis module analyzes information such as the service status and the running logs to find the fault cause and a solution. The analyzed fault and its solution can be archived so that the same fault can be handled directly the next time it occurs. Once the fault cause and solution have been located, the fault is repaired automatically according to its cause.
For example, when the service is found to have stopped running, it can be started again; when the service's configuration file is found to contain errors, the errors can be repaired automatically and the service started, so that the service runs normally. During fault handling, information such as the fault cause, the analysis result, and the solution is saved. The user can evaluate the handling result; if the handling was correct, the program can handle the same kind of fault in the same way the next time it occurs, further shortening fault handling time.
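The two example repairs above (restarting a stopped service, and fixing a faulty configuration before starting) can be illustrated with a small simulation. Everything here is hypothetical: the `Service` class stands in for a real process, and `DEFAULTS` is an assumed table of required settings with known-good values, including the illustrative path.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Service:
    running: bool = False
    config: Dict[str, str] = field(default_factory=dict)


# Hypothetical required settings and their known-good defaults.
DEFAULTS: Dict[str, str] = {"port": "8080", "log_dir": "/var/log/app"}


def config_errors(svc: Service) -> List[str]:
    """Return the names of required settings that are missing or empty."""
    return [key for key in DEFAULTS if not svc.config.get(key)]


def repair(svc: Service) -> List[str]:
    """Apply the two example repairs and return the actions taken, in order."""
    actions: List[str] = []
    bad_keys = config_errors(svc)
    if bad_keys:
        for key in bad_keys:
            svc.config[key] = DEFAULTS[key]
        actions.append("repaired config: " + ", ".join(bad_keys))
    if not svc.running:
        svc.running = True  # stands in for actually starting the process
        actions.append("started service")
    return actions


svc = Service(running=False, config={"port": ""})
actions = repair(svc)
```

The configuration is repaired before the service is started, matching the order given in the text; the returned action list is the kind of record that could later be saved alongside the fault cause.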
Embodiment three
To achieve the object of the invention, the present invention also provides a service fault repair device. As shown in Fig. 3, the device includes a monitoring module 31 and a recovery module 32, wherein:
the monitoring module 31 is configured to monitor the running state of the service in real time;
the recovery module 32 is configured to determine, when the service is in a fault state, whether the fault already exists in the fault knowledge base and, when it is determined that the fault does not exist in the fault knowledge base, to recover from the fault according to a preset strategy.
In this embodiment, the device also provides that:
when the recovery module 32 determines that the fault exists in the fault knowledge base, it recovers from the fault according to the solution recorded in the fault knowledge base.
In this embodiment, the device also includes a logging module 33 configured to record running log information of the service.
The recovery module 32 recovers from the fault according to the preset strategy as follows:
the recovery module 32 obtains the running log information of the service;
the recovery module 32 obtains the cause of the fault from the log information;
the recovery module 32 queries the solutions preset in the program according to the cause of the fault;
the recovery module 32 recovers from the fault according to the solution.
Optionally, the device also includes a storing module 34 used after the fault has been recovered according to the preset strategy:
the storing module 34 obtains an evaluation result of the fault recovery and, when the evaluation result is correct, stores the cause of the fault and the solution in the fault knowledge base.
An embodiment of the present invention also provides a computer storage medium storing a computer program. When the computer program is executed, the service fault repair method provided by the foregoing embodiments can be implemented, for example the method shown in Fig. 1.
Although the embodiments of the present invention are disclosed above, they are described only to make the invention easier to understand and do not limit it. Any person skilled in the art to which the invention pertains may make modifications and changes to the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention remains as defined by the appended claims.
Claims (10)
1. A service fault repair method, characterized in that the method includes:
monitoring the running state of a service in real time; when the service is in a fault state, determining whether the fault already exists in a fault knowledge base; and, when it is determined that the fault does not exist in the fault knowledge base, recovering from the fault according to a preset strategy.
2. The method according to claim 1, characterized in that the method further includes:
when it is determined that the fault exists in the fault knowledge base, recovering from the fault according to the solution recorded in the fault knowledge base.
3. The method according to claim 1, characterized in that the method further includes: recording running log information of the service;
wherein recovering from the fault according to the preset strategy includes:
obtaining the running log information of the service;
obtaining the cause of the fault from the log information;
querying the solutions preset in the program according to the cause of the fault;
recovering from the fault according to the solution.
4. The method according to claim 3, characterized in that, after recovering from the fault according to the preset strategy, the method further includes:
obtaining an evaluation result of the fault recovery and, when the evaluation result is correct, storing the cause of the fault and the solution in the fault knowledge base.
5. The method according to claim 1, characterized in that, when the service is in a fault state, a preset timer is started;
if the fault has still not been cleared when the timer reaches the preset time, a manual-handling alarm is raised;
the fault is recovered according to the instructions that are input.
6. A service fault repair device, characterized in that the device includes a monitoring module and a recovery module, wherein:
the monitoring module is configured to monitor the running state of the service in real time;
the recovery module is configured to determine, when the service is in a fault state, whether the fault already exists in a fault knowledge base and, when it is determined that the fault does not exist in the fault knowledge base, to recover from the fault according to a preset strategy.
7. The device according to claim 6, characterized in that:
when the recovery module determines that the fault exists in the fault knowledge base, it recovers from the fault according to the solution recorded in the fault knowledge base.
8. The device according to claim 6, characterized in that the device further includes a logging module configured to record running log information of the service;
wherein the recovery module recovers from the fault according to the preset strategy as follows:
the recovery module obtains the running log information of the service;
the recovery module obtains the cause of the fault from the log information;
the recovery module queries the solutions preset in the program according to the cause of the fault;
the recovery module recovers from the fault according to the solution.
9. The device according to claim 8, characterized in that the device further includes a storing module used after the fault has been recovered according to the preset strategy:
the storing module obtains an evaluation result of the fault recovery and, when the evaluation result is correct, stores the cause of the fault and the solution in the fault knowledge base.
10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the steps of the method according to any one of claims 1 to 5 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810665134.3A CN108776625A (en) | 2018-06-26 | 2018-06-26 | Service fault repair method, device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108776625A (en) | 2018-11-09 |
Family
ID=64026382
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810665134.3A Pending CN108776625A (en) | Service fault repair method, device, and storage medium | 2018-06-26 | 2018-06-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108776625A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130286859A1 (en) * | 2011-04-21 | 2013-10-31 | Huawei Technologies Co., Ltd. | Fault detection method and system |
CN103838637A (en) * | 2014-03-03 | 2014-06-04 | 江苏智联天地科技有限公司 | Terminal automatic fault diagnosis and restoration method on basis of data mining |
CN105262616A (en) * | 2015-09-21 | 2016-01-20 | 浪潮集团有限公司 | Failure repository-based automated failure processing system and method |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109474470A (en) * | 2018-11-27 | 2019-03-15 | 郑州云海信息技术有限公司 | One kind is from monitoring method and device |
CN109757771A (en) * | 2019-02-22 | 2019-05-17 | 红云红河烟草(集团)有限责任公司 | Filter-stick forming device shuts down duration calculation method and computing device |
CN110011854A (en) * | 2019-04-12 | 2019-07-12 | 苏州浪潮智能科技有限公司 | MDS fault handling method, device, storage system and computer readable storage medium |
CN110011854B (en) * | 2019-04-12 | 2022-03-04 | 苏州浪潮智能科技有限公司 | MDS fault processing method, device, storage system and computer readable storage medium |
WO2021143483A1 (en) * | 2020-01-17 | 2021-07-22 | 中兴通讯股份有限公司 | System maintenance method and apparatus, device, and storage medium |
CN112286797A (en) * | 2020-09-29 | 2021-01-29 | 长沙市到家悠享网络科技有限公司 | Service monitoring method and device, electronic equipment and storage medium |
CN112286797B (en) * | 2020-09-29 | 2024-05-03 | 长沙市到家悠享网络科技有限公司 | Service monitoring method and device, electronic equipment and storage medium |
CN112988537A (en) * | 2021-03-11 | 2021-06-18 | 山东英信计算机技术有限公司 | Server fault diagnosis method and device and related equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108776625A (en) | Service fault repair method, device, and storage medium | |
CN105337765B (en) | A kind of distribution hadoop cluster automatic fault diagnosis repair system | |
US10545807B2 (en) | Method and system for acquiring parameter sets at a preset time interval and matching parameters to obtain a fault scenario type | |
CN107612756A (en) | A kind of operation management system with intelligent trouble analyzing and processing function | |
CN104125085B (en) | A kind of data management-control method and device based on ESB | |
CN106649040A (en) | Automatic monitoring method and device for performance of Weblogic middleware | |
CN104750596B (en) | A kind of alarm information processing method and service subsystem | |
CN110581773A (en) | automatic service monitoring and alarm management system | |
CN106789306A (en) | Restoration methods and system are collected in communication equipment software fault detect | |
CN109462490B (en) | Video monitoring system and fault analysis method | |
CN107995255A (en) | A kind of method and its system of remote monitoring intelligent cabinet | |
US20210271555A1 (en) | Traffic data self-recovery processing method, readable storage medium, server and apparatus | |
CN106452811B (en) | A kind of malfunction elimination method and system | |
CN107766208A (en) | A kind of method, system and device of monitoring business system | |
CN110808856A (en) | Big data operation and maintenance method and system based on data center | |
CN113485220A (en) | Cloud cooperation method and system for simplifying field network diagnosis of operation and maintenance personnel | |
CN109032058A (en) | A kind of device management method, device, system and storage medium | |
CN107846314A (en) | A kind of intelligent operation management system | |
CN108681780A (en) | A kind of device management method, apparatus and system based on collection control big data | |
CN114493203A (en) | Method and device for safety arrangement and automatic response | |
CN110311802A (en) | Network operation method, device, electronic equipment and storage medium | |
CN113471864A (en) | Transformer substation secondary equipment field maintenance device and method | |
CN116560893B (en) | Computer application program operation data fault processing system | |
CN112803587A (en) | Intelligent inspection method for state of automatic equipment based on diagnosis decision library | |
CN113760634A (en) | Data processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181109 |