WO2020119369A1 - Intelligent IT operation and maintenance fault positioning method, apparatus and device, and readable storage medium

Info

Publication number
WO2020119369A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault point
detection
target
fault
return value
Application number
PCT/CN2019/117548
Other languages
French (fr)
Chinese (zh)
Inventor
方振宇
Original Assignee
平安普惠企业管理有限公司
Application filed by 平安普惠企业管理有限公司
Publication of WO2020119369A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/50 Testing of electric apparatus, lines, cables or components for short-circuits, continuity, leakage current or incorrect line connections

Definitions

  • This application relates to the field of computer technology, and in particular to an intelligent IT operation and maintenance fault location method, device, equipment, and readable storage medium.
  • The main purpose of this application is to provide an intelligent IT operation and maintenance fault location method, device, equipment, and readable storage medium, aiming to solve the technical problem that, for existing IT system operation and maintenance faults, fault repair nodes are located inefficiently, resulting in a long repair cycle.
  • the intelligent IT operation and maintenance fault location method includes:
  • Health detection is performed on the potential failure point, and the first detection return value is obtained
  • the target fault point and the second detection return value are output.
  • the step of performing health detection on the potential failure point and obtaining the first detection return value includes:
  • the health detection includes performing detection using the pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
  • the step of performing health detection on the potential failure point and obtaining the first detection return value further includes:
  • step of outputting the target fault point and the second detection return value includes:
  • the method includes:
  • the steps of acquiring and selecting a target emergency plan corresponding to the target fault point from a pre-stored plan database according to the fault state of the target fault point include:
  • the emergency plan with the highest passing frequency is selected from the pre-stored plan database, and the emergency plan with the highest passing frequency is set as the target emergency plan.
  • the intelligent IT operation and maintenance fault locating device includes:
  • a receiving module used to receive a failure analysis report and obtain all potential failure points in the failure analysis report
  • execution module which includes:
  • a health detection submodule configured to perform health detection on the potential failure point and obtain a first detection return value
  • a first obtaining submodule configured to obtain a new fault point specified by the first detection return value, perform continuous health detection on the new fault point until no new fault point is generated, determine the fault point at which no new fault point is generated as the target fault point, and obtain the second detection return value of the target fault point;
  • the output submodule is used to output the target fault point and the second detection return value.
  • the health detection sub-module includes:
  • a positioning unit configured to locate the operation source and operation process cited by the potential failure point, perform health detection on the operation source and operation process, and obtain the first detection return value corresponding to the operation source and each operation process;
  • the health detection includes performing detection using the pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
  • the health detection sub-module includes:
  • a first obtaining unit configured to obtain the node type of the potential failure point, and obtain a detection function corresponding to the node type from a preset tool library;
  • the second obtaining unit obtains the detection index of the detection function, and performs a sniffing test on the index data in the potential fault point according to the detection index to obtain a first detection return value of the potential fault point.
  • the intelligent IT operation and maintenance fault locating device further includes:
  • the first obtaining module is used to select the target emergency plan corresponding to the target fault point from the pre-stored plan database according to the fault state of the target fault point, and to execute the target emergency plan for the target fault point;
  • the re-detection module is used to perform health detection on the target failure point again after the execution of the target emergency plan.
  • the intelligent IT operation and maintenance fault locating device further includes:
  • a second obtaining module configured to obtain a third detection return value obtained after performing health detection on the target failure point again, and determine whether the third detection return value points to a new failure point;
  • the output module is configured to output a warning message that cannot be automatically processed if the third detection return value points to a new fault point.
  • the first obtaining module includes:
  • the second obtaining submodule is used to obtain all emergency plans corresponding to the target fault point according to the fault state of the target fault point and, if there are multiple emergency plans, to count, over a past historical time period, the frequency with which the target fault point successfully passed health detection after each emergency plan was executed;
  • a selection submodule is used to select the emergency plan with the highest passing frequency from the pre-stored plan database, and set the emergency plan with the highest passing frequency as the target emergency plan.
  • the present application also provides an intelligent IT operation and maintenance fault locating device
  • the intelligent IT operation and maintenance fault locating device includes: a memory, a processor, a communication bus, and intelligent IT operation and maintenance fault locating computer-readable instructions stored on the memory
  • the communication bus is used to realize the communication connection between the processor and the memory
  • the processor is used to execute the intelligent IT operation and maintenance fault location computer readable instructions to achieve the following steps:
  • Health detection is performed on the potential failure point, and the first detection return value is obtained
  • the target fault point and the second detection return value are output.
  • the present application also provides a readable storage medium that stores one or more computer-readable instructions, and the one or more computer-readable instructions may be executed by one or more processors to perform the following:
  • Health detection is performed on the potential failure point, and the first detection return value is obtained
  • the target fault point and the second detection return value are output.
  • This application receives a failure analysis report and obtains all potential failure points in the report. For each potential failure point, the following steps are performed: health detection is performed on the potential failure point and a first detection return value is obtained; a new fault point specified by the first detection return value is obtained, and health detection continues on each new fault point until no new fault point is generated; the fault point at which no new fault point is generated is determined as the target fault point, and a second detection return value of the target fault point is obtained; and the target fault point and the second detection return value are output.
  • In this application, potential failure points are obtained automatically, and iterative health detection is performed on them automatically rather than manually, so the target fault point is located quickly. Quick location saves positioning time and therefore repair time, and improves the experience of the operation and maintenance personnel who use the system. This solves the technical problem in the prior art that fault repair nodes are located inefficiently, which consumes valuable repair time, prolongs the repair cycle, and degrades the user experience.
  • FIG. 1 is a schematic flowchart of a first embodiment of the intelligent IT operation and maintenance fault location method of this application;
  • FIG. 2 is a detailed flow diagram of the step of performing health detection on the potential fault point and obtaining the first detection return value in the intelligent IT operation and maintenance fault locating method of the present application;
  • FIG. 3 is a schematic diagram of the device structure of the hardware operating environment involved in the method of the embodiment of the present application.
  • the present application provides an intelligent IT O&M fault location method.
  • the intelligent IT O&M fault location method includes:
  • Step S10: receive a failure analysis report and obtain all potential failure points in the failure analysis report;
  • Step S20: perform health detection on the potential failure point and obtain a first detection return value;
  • Step S30: obtain a new fault point specified by the first detection return value, and perform continuous health detection on the new fault point until no new fault point is generated; determine the fault point at which no new fault point is generated as the target fault point, and obtain the second detection return value of the target fault point;
  • Step S40: output the target fault point and the second detection return value.
  • Step S10 Receive a failure analysis report and obtain all potential failure points in the failure analysis report
  • Location detection is performed on the potential fault points that rank highest in credibility, so as to finally determine the target fault point.
  • the intelligent IT operation and maintenance fault location method is applied to the intelligent IT operation and maintenance fault location system.
  • The intelligent IT operation and maintenance fault analysis system, which communicates with the intelligent IT operation and maintenance fault location system, obtains the alarm information of all related parties within a certain period of time.
  • The analysis system analyzes the alarm information of each related party according to pre-stored fault analysis computer-readable instructions, obtains a fault analysis report, and sends the report to the intelligent IT operation and maintenance fault location system. The fault analysis report lists the potential fault points.
  • all potential fault points in the fault analysis report can be parsed and obtained.
  • For example, node A in the current system cannot call node B, and the report lists both node A and node B as potential failure points.
  • The system obtains node A and node B as the potential failure points and performs location detection on both, to determine whether node A has failed, node B has failed, or both have failed, and to further determine the specific faulty process or source within the faulty node.
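Step S10 can be illustrated with a minimal sketch. The source does not define a report format, so the JSON layout, the `potential_fault_points` key, and the function name below are assumptions made purely for illustration.

```python
import json


def extract_potential_fault_points(report_text):
    """Parse a fault analysis report and return its potential fault points.

    The JSON layout and the "potential_fault_points" key are assumptions
    made for this sketch; the source does not define a report format.
    """
    report = json.loads(report_text)
    points = []
    for point in report.get("potential_fault_points", []):
        if point not in points:  # keep the first occurrence, drop repeats
            points.append(point)
    return points


# Hypothetical report for the example above: node A cannot call node B.
report_text = json.dumps({
    "symptom": "node A cannot call node B",
    "potential_fault_points": ["node A", "node B", "node A"],
})
points = extract_potential_fault_points(report_text)
print(points)
```

Each returned point would then be passed to the health detection of step S20.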
  • Step S20 health detection is performed on the potential failure point, and a first detection return value is obtained
  • the intelligent IT operation and maintenance fault location system performs health detection on each potential fault point to obtain each first detection return value.
  • the step of performing health detection on the potential fault point and obtaining the first detection return value includes:
  • Step S21 Locate the operation source and operation process cited by the potential fault point, perform health detection on the operation source and operation process, and obtain the first detection return value corresponding to the operation source and each operation process;
  • the health detection includes performing detection using the pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
  • The potential failure node is first assumed to be a normal node. The operation source and each operation process referenced by the potential failure point are located, and health detection is performed on them; specifically, detection uses the pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
  • A detection function corresponding to the node type may be obtained from a preset tool library, and the detection index of that function may be obtained. A sniffing test is performed on the operation source and operation processes in the potential fault point according to the detection index. After the sniffing test, the pre-stored data I/O indicators of the operation source and of each operation process in a normal state are selected precisely and used for detection, which yields the first detection return value of the potential fault point and shortens the detection process.
  • For example, fault point A executes three sequential links A1, A2, and A3, that is, three operation processes.
  • The intelligent IT operation and maintenance fault location system starts from link A1. The preset starting parameters (data I/O indicators) are entered into link A1, link A1 produces an operation value, and the system judges whether the operation value is consistent with the preset result value. If they are consistent, link A1 has no problem, and the system receives a first detection return value indicating that link A1 is normal, such as a10. Otherwise, link A1 has a problem.
  • In that case the system receives a first detection return value indicating a problem in link A1, such as a11; that is, the system locates the operation source and operation process referenced in link A1 and obtains the corresponding first detection return values. The subsequent link A2 is then detected on the same principle as link A1, and finally all first detection return values in fault point A are obtained. If fault point A is entirely normal, then fault point A supplies the value of link A3 to fault point B and the error lies in fault point B; the detection return value of fault point B can then be obtained, such as b.
  • As another example, fault point A is the product order node, fault point B is the order database, and the fault condition is that fault point A cannot call the order content in the database of fault point B.
  • The intelligent IT operation and maintenance fault location system determines whether the order number at fault point A is correct, and tests the order call of fault point A with a known order number to determine whether fault point A can retrieve that order number from node B. If the order number can be retrieved from node B normally, fault point A is then used to query the record for that number; if the record can be queried by the order number, fault point A can read the order content in the record.
  • The system locates the step whose result differs from the expected result. For example, if fault point A calls the order record in fault point B and node B cannot return the corresponding record information, the system detects which step in the calling process caused the failure and returns a first detection return value representing the failure of that step.
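The link-by-link check in the A1, A2, A3 example can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the `Link` structure, the integer operations, and the extension of the `a10`/`a11` code pattern to links A2 and A3 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Link:
    name: str                  # e.g. "A1"
    run: Callable[[int], int]  # the operation process under test
    start_param: int           # pre-stored input indicator (normal state)
    expected: int              # pre-stored output indicator (normal state)


def detect_links(links: List[Link]) -> List[str]:
    """Probe each link in order and return one detection value per link.

    Following the a10/a11 pattern in the text, the value is the lower-cased
    link name plus "0" (normal) or "1" (problem); this encoding is assumed.
    """
    results = []
    for link in links:
        try:
            ok = link.run(link.start_param) == link.expected
        except Exception:
            ok = False  # a crashing link is treated as faulty
        results.append(link.name.lower() + ("0" if ok else "1"))
    return results


# Hypothetical fault point A: link A2 returns a wrong value.
links = [
    Link("A1", lambda x: x + 1, start_param=1, expected=2),
    Link("A2", lambda x: x * 2 + 1, start_param=3, expected=6),  # faulty
    Link("A3", lambda x: x - 1, start_param=5, expected=4),
]
results = detect_links(links)
print(results)
```

Probing each link with a known input and comparing against a stored output keeps the check independent of live traffic, which is what the pre-stored data I/O indicators make possible.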
  • Step S30: obtain a new fault point specified by the first detection return value, and perform continuous health detection on the new fault point until no new fault point is generated; determine the fault point at which no new fault point is generated as the target fault point, and obtain the second detection return value of the target fault point;
  • The first detection return value may point to a new fault point. The intelligent IT operation and maintenance fault location system therefore performs health detection on the new fault point to obtain a new first detection return value.
  • If the new first detection return value points to yet another fault point, the above steps are repeated until no new fault point is generated. The fault point at which no new fault point is generated is determined as the target fault point, and the second detection return value of the target fault point is obtained.
  • The intelligent IT operation and maintenance fault location system needs to detect potential fault points iteratively, that is, iteratively obtain the first detection return value of each potential fault point. If a first detection return value points to a new fault point, detection has not yet reached the end; if no new fault point is generated, the system has traversed all fault points that may currently be abnormal. After detecting the potential fault points through to the end, the system obtains the one or more target fault points that the first detection return values point to repeatedly; that is, the system has extracted the common fault points (target fault points) at the intersection of the potential fault points. There can be more than one common fault point, and each is the source of the data deviation for all of its associated fault points.
  • For example, if A fails when calling B and A fails when calling C, but B calling C produces a fault, then A, as the intersection fault point common to B and C, is the source fault point.
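The iteration in step S30 can be sketched as a loop that follows detection return values until none of them names a new fault point. The `health_check` callback signature, the cycle guard, and the node names are assumptions for this sketch; the source does not specify them.

```python
def locate_target_fault_points(potential_points, health_check):
    """Follow detection return values until no new fault point appears.

    health_check(point) is assumed to return (return_value, next_point),
    where next_point is the newly indicated fault point or None.
    Returns a dict mapping each target fault point to its return value.
    """
    targets = {}
    for start in potential_points:
        current, visited = start, set()
        while True:
            visited.add(current)
            return_value, next_point = health_check(current)
            # Stop when nothing new is indicated; the visited check is an
            # added guard against cyclic references between fault points.
            if next_point is None or next_point in visited:
                targets[current] = return_value
                break
            current = next_point
    return targets


# Hypothetical topology: both starting points lead to the same database.
probe_results = {
    "gateway": ("g1", "order-service"),
    "order-service": ("o1", "order-db"),
    "order-db": ("d1", None),
}
targets = locate_target_fault_points(
    ["gateway", "order-service"], lambda p: probe_results[p]
)
print(targets)
```

Because both potential fault points converge on the same node, the result contains a single common target fault point, which mirrors the intersection idea described above.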
  • Step S40 output the target fault point and the second detection return value.
  • After the target fault point and the second detection return value are obtained, they are output to prompt the user or the operation and maintenance personnel.
  • the present application provides another embodiment of an intelligent IT operation and maintenance fault locating method.
  • the step of performing health detection on the potential fault point and obtaining the first detection return value further includes:
  • Step S22 Obtain the node type of the potential failure point, and obtain the detection function corresponding to the node type from the preset tool library;
  • Different potential fault points have their own node types in the intelligent IT O&M fault location system, and each node type has a dedicated detection function in the system's preset function library. The node type of the potential failure point is obtained, and the detection function corresponding to that node type is obtained from the preset tool library. For example, if the potential failure point is a network communication node, the system obtains from the preset function library the network detection function mapped to the network communication node category.
  • Step S23 Obtain the detection index of the detection function, and perform a sniffing test on the index data in the potential fault point according to the detection index to obtain the first detection return value of the potential fault point.
  • For example, the detection indicators of the network detection function include network link status, data transmission rate, and so on.
  • A sniffing test is performed on the index data in the corresponding potential failure point through the detection function.
  • The sniffing test classifies and filters the index data in the potential failure point, selects the index data of the same type as the detection index, and performs traceability detection on the selected data.
  • each first detection return value or second detection return value corresponding to the potential fault point is further obtained.
  • For example, if the current detection index checks the network connection status in the potential fault point, detection may include the following: the system determines the network connection status of the two endpoints of the connection, where the connection is initiated from node A to node B; the system determines the address ip1 of node A, obtains the address ip2 of node B, and checks whether the DNS resolution service between node A and node B is working correctly, and so on.
  • Through the detection function, the input and output of all network index data involved in the network connection are tested to confirm which part of the process has a problem.
  • In this embodiment, the node type of the potential failure point is obtained, and the detection function corresponding to the node type is obtained from a preset tool library; the detection index of the detection function is obtained, and a sniffing test is performed on the index data in the potential fault point according to the detection index to obtain the first detection return value of the potential fault point. The precise sniffing test lays the foundation for locating the target fault point in an orderly and rapid manner.
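Steps S22 and S23 amount to a lookup from node type to detection function, followed by filtering the node's index data down to the indicators that function checks. The sketch below is hypothetical throughout: the `TOOL_LIBRARY` layout, the indicator names, the 1 Mbps threshold, and the return codes are assumptions chosen for illustration.

```python
def check_network(index_data):
    """Hypothetical network detection function; codes are illustrative."""
    if index_data.get("link_status") != "up":
        return "net-link-down"
    if index_data.get("transfer_rate_mbps", 0) < 1:  # assumed threshold
        return "net-rate-low"
    return "net-ok"


# Assumed tool library: node type -> detection indicators and function.
TOOL_LIBRARY = {
    "network": {
        "indicators": ("link_status", "transfer_rate_mbps"),
        "detect": check_network,
    },
}


def sniff_and_detect(node_type, index_data):
    """Keep only index data matching the detection indicators, then detect."""
    tool = TOOL_LIBRARY[node_type]
    selected = {k: v for k, v in index_data.items() if k in tool["indicators"]}
    return tool["detect"](selected)


# Hypothetical index data; disk_free_gb is filtered out by the sniffing step.
index_data = {"link_status": "up", "transfer_rate_mbps": 0.2, "disk_free_gb": 120}
first_return_value = sniff_and_detect("network", index_data)
print(first_return_value)
```

Keeping the indicators next to the function in the library means a new node type can be supported by adding one entry rather than changing the detection loop.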
  • this application provides another embodiment of the intelligent IT operation and maintenance fault locating method.
  • the step of outputting the target fault point and the second detection return value includes:
  • Step S50 Acquire and select the target emergency plan corresponding to the target fault point from the pre-stored plan database according to the fault state of the target fault point, and execute the target emergency plan for the target fault point;
  • a plan database is pre-stored, and the plan database includes various emergency plans for the node type or fault state of the target fault point, and is used to solve the fault situation of the target fault point.
  • After determining the target failure point, the system directly retrieves the corresponding target emergency plan from the system database and executes it.
  • Step S60 after executing the target emergency plan, perform health detection on the target fault point again.
  • the steps are the same as the above-mentioned health detection steps.
  • the method includes:
  • Step S70 Obtain a third detection return value obtained after performing health detection on the target fault point again, and determine whether the third detection return value points to a new fault point;
  • If a new fault point is obtained after health detection is performed on the target fault point again, the intelligent IT operation and maintenance fault location system has evidently not resolved the fault state of that target fault point.
  • Step S80 If the third detection return value points to a new fault point, a warning message that cannot be automatically processed is output.
  • Because the intelligent IT operation and maintenance fault locating system cannot complete the processing of this fault state automatically, it outputs a warning message stating that the fault cannot be processed automatically, so that the operation and maintenance personnel can handle it manually. This improves the fault tolerance of the system.
  • In this embodiment, the target emergency plan corresponding to the target fault point is selected from the pre-stored plan database according to the fault state of the target fault point, and the target emergency plan is executed on the target fault point; after the target emergency plan is executed, health detection is performed on the target fault point again. A possible mismatch between the target emergency plan and the fault state of the target fault point can thus be caught, and the fault tolerance of the intelligent IT operation and maintenance fault location system is improved.
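Steps S50 to S80 can be sketched as: run the plan, probe again, and warn if the new return value still points to a fault point. The callback signatures, the result dictionary, and the toy state below are assumptions for illustration only.

```python
def remediate(target_point, plan, health_check):
    """Execute an emergency plan, re-check the point, and report the outcome.

    health_check(point) is assumed to return (return_value, new_point),
    where new_point is None when no new fault point is indicated.
    """
    plan(target_point)                                   # step S50
    third_value, new_point = health_check(target_point)  # steps S60 and S70
    if new_point is not None:                            # step S80
        return {
            "status": "warning",
            "return_value": third_value,
            "message": "cannot be processed automatically: " + target_point,
        }
    return {"status": "resolved", "return_value": third_value, "message": ""}


# Hypothetical state: the order database is down until it is restarted.
state = {"order-db": "down"}


def health_check(point):
    if state[point] == "down":
        return "d1", "storage-volume"  # still points to a new fault point
    return "d0", None


def restart(point):
    state[point] = "up"


def noop(point):
    pass  # a plan that does not match the fault state


failed = remediate("order-db", noop, health_check)
fixed = remediate("order-db", restart, health_check)
print(failed["status"], fixed["status"])
```

The warning path hands the fault to operation and maintenance personnel instead of retrying indefinitely.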
  • this application provides another embodiment of the intelligent IT operation and maintenance fault locating method.
  • the step of selecting, from the pre-stored plan database, the target emergency plan corresponding to the target fault point according to the fault state of the target fault point includes:
  • Step S51: obtain all emergency plans corresponding to the target fault point according to the fault state of the target fault point; if there are multiple emergency plans, count, over a past historical time period, the frequency with which the target failure point successfully passed health detection after each emergency plan was executed;
  • The system counts, over the past historical time period, how often the target fault point passed health detection after each emergency plan was executed.
  • Step S52: select the emergency plan with the highest passing frequency from the pre-stored plan database, and set it as the target emergency plan.
  • The system automatically records how many times each emergency plan passed health detection directly after execution, selects the emergency plan with the highest passing frequency from the pre-stored plan database, and marks it as the highest-priority recommended plan, to be preferred in future emergency plan matching. The system therefore sets the emergency plan with the highest passing frequency as the target emergency plan.
  • In this embodiment, the frequency with which the target failure point successfully passes health detection is counted for each plan; the emergency plan with the highest passing frequency is selected from the pre-stored plan database and set as the target emergency plan. The target fault point can therefore be resolved as quickly as possible, which improves the experience of the operation and maintenance personnel, that is, the users.
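Steps S51 and S52 can be sketched as a tally over past executions. The source does not say whether "passing frequency" means a raw pass count or a pass rate; this sketch assumes the pass rate (passes divided by executions), and the history format and plan names are hypothetical.

```python
def select_target_plan(history):
    """Pick the plan with the highest pass rate.

    history is an assumed list of (plan_name, passed) tuples, one per past
    execution. Ties go to the plan that appears first in the history.
    """
    stats = {}
    for plan, passed in history:
        runs, passes = stats.get(plan, (0, 0))
        stats[plan] = (runs + 1, passes + (1 if passed else 0))
    return max(stats, key=lambda p: stats[p][1] / stats[p][0])


# Hypothetical history for one fault state.
history = [
    ("restart-service", True), ("restart-service", True), ("restart-service", False),
    ("failover", True), ("failover", True), ("failover", True), ("failover", False),
    ("clear-cache", False),
]
target_plan = select_target_plan(history)
print(target_plan)
```

If raw pass counts were intended instead, the `key` function would return `stats[p][1]` alone.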
  • FIG. 3 is a schematic diagram of a device structure of a hardware operating environment involved in a solution of an embodiment of the present application.
  • the intelligent IT operation and maintenance fault locating device in the embodiment of the present application may be a PC, or may be a terminal device such as a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 player, or a portable computer.
  • the intelligent IT operation and maintenance fault locating device may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002.
  • the communication bus 1002 is used to implement connection communication between the processor 1001 and the memory 1005.
  • the memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as disk storage.
  • the memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • the intelligent IT operation and maintenance fault locating device may further include a target user interface, a network interface, a camera, RF (radio frequency) circuits, sensors, audio circuits, WiFi modules, etc.
  • the target user interface may include a display (Display) and an input sub-module, such as a keyboard (Keyboard); optionally, the target user interface may also include a standard wired interface and a wireless interface.
  • the network interface may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the structure of the intelligent IT O&M fault locating device shown in FIG. 3 does not constitute a limitation on the device, which may include more or fewer components than illustrated, combine some components, or use a different component arrangement.
  • the memory 1005 as a computer-readable storage medium may include an operating system, a network communication module, and computer-readable instructions for intelligent IT operation and maintenance fault location.
  • the operating system is a set of computer-readable instructions that manages and controls the hardware and software resources of the intelligent IT O&M fault location equipment, and supports the operation of the intelligent IT O&M fault location computer-readable instructions and other software and/or computer-readable instructions.
  • the network communication module is used to realize communication between various components inside the memory 1005, and to communicate with other hardware and software in the intelligent IT operation and maintenance fault locating device.
  • the processor 1001 is configured to execute the intelligent IT O&M fault location computer-readable instructions stored in the memory 1005 to implement the steps of the intelligent IT O&M fault location method described in any one of the above.
  • the specific implementation of the intelligent IT operation and maintenance fault locating device of the present application is basically the same as the above-mentioned embodiments of the intelligent IT operation and maintenance fault locating method, which will not be repeated here.
  • This application also provides an intelligent IT operation and maintenance fault locating device.
  • the specific implementation of the intelligent IT operation and maintenance fault locating device in this application is basically the same as the above-mentioned embodiments of the intelligent IT operation and maintenance fault locating method, and details are not described herein again.
  • the present application provides a readable storage medium.
  • the readable storage medium may be a non-volatile readable storage medium.
  • the readable storage medium stores one or more computer-readable instructions.
  • the one or more computer-readable instructions may also be executed by one or more processors to implement the steps of the intelligent IT operation and maintenance fault location method described in any one of the above.
  • the specific implementation of the readable storage medium of the present application is basically the same as the above embodiments of the intelligent IT operation and maintenance fault locating method, which will not be repeated here.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An intelligent IT operation and maintenance fault positioning method, apparatus and device, and a readable storage medium. The method comprises: receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report (S10); executing the following steps for each potential fault point: performing health detection on the potential fault point, and obtaining a first detection return value (S20); obtaining a new fault point specified by the first detection return value, performing continuous health detection on the new fault point until no new fault point is generated, determining the fault point at which no new fault point is generated as the target fault point, and obtaining a second detection return value of the target fault point (S30); and outputting the target fault point and the second detection return value (S40). The method addresses the technical problem that, in the operation and maintenance of existing IT systems, fault repair nodes are located inefficiently and the repair period is therefore too long.

Description

Intelligent IT operation and maintenance fault location method, device, equipment and readable storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on December 13, 2018, with application number 201811530943.X and entitled "Intelligent IT operation and maintenance fault location method, device, equipment and readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to an intelligent IT operation and maintenance fault location method, device, equipment, and readable storage medium.
Background
At present, various faults inevitably occur during the operation and maintenance of IT systems. To reduce the losses they cause as far as possible, a fault usually has to be located quickly, because the corresponding repair solution can only be found once the fault has been located. Traditional fault repair tools have no efficient sub-tool for rapid location and rely on manual troubleshooting, so locating the node to be repaired is inefficient, which consumes valuable repair time, extends the repair cycle, and degrades the user experience.
Summary of the Invention
The main purpose of this application is to provide an intelligent IT operation and maintenance fault location method, device, equipment and readable storage medium, aiming to solve the technical problem that, in existing IT system operation and maintenance, fault repair nodes are located inefficiently, resulting in an overly long repair cycle.
To achieve the above purpose, the present application provides an intelligent IT operation and maintenance fault location method, which includes:
receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
for each potential fault point, performing the following steps:
performing health detection on the potential fault point, and obtaining a first detection return value;
obtaining a new fault point specified by the first detection return value, performing continuous health detection on the new fault point until no new fault point is generated, determining the fault point at which no new fault point is generated as the target fault point, and obtaining a second detection return value of the target fault point;
outputting the target fault point and the second detection return value.
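As a rough illustration of the loop described in the steps above, the following Python sketch follows probe results from one potential fault point until a probe names no further fault point. The `probe` callable, its `(return_value, next_point)` result shape, the `max_hops` limit and the cycle guard are all assumptions made for this example; the application does not specify them.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical probe result: (detection return value, next fault point or None).
ProbeResult = Tuple[str, Optional[str]]


def locate_target_fault(
    start: str,
    probe: Callable[[str], ProbeResult],
    max_hops: int = 100,
) -> Tuple[str, str]:
    """Follow probe results from `start` until no new fault point is named.

    Returns (target fault point, its detection return value).
    """
    current = start
    visited = {start}
    for _ in range(max_hops):
        return_value, next_point = probe(current)
        # Stop when the probe names nothing new (the cycle guard is an assumption).
        if next_point is None or next_point in visited:
            return current, return_value
        visited.add(next_point)
        current = next_point
    raise RuntimeError("probe chain did not terminate within max_hops")


def locate_all(
    potential_points: Iterable[str],
    probe: Callable[[str], ProbeResult],
) -> Dict[str, Tuple[str, str]]:
    """Run the loop once per potential fault point from the report."""
    return {p: locate_target_fault(p, probe) for p in potential_points}


# Made-up probe data for demonstration only: A points to B, B to C, C to nothing.
_FAKE_RESULTS: Dict[str, ProbeResult] = {
    "A": ("a-calls-B-failed", "B"),
    "B": ("b-calls-C-failed", "C"),
    "C": ("c-root-cause", None),
}


def fake_probe(point: str) -> ProbeResult:
    return _FAKE_RESULTS[point]
```

With the made-up data, starting from either `"A"` or `"C"` ends at `"C"`, whose return value plays the role of the second detection return value.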
Optionally, the step of performing health detection on the potential fault point and obtaining the first detection return value includes:
locating the operation source and the operation processes referenced by the potential fault point, and performing health detection on the operation source and the operation processes to obtain the first detection return value corresponding to the operation source and to each operation process;
wherein the health detection includes a step of detecting by using pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
Optionally, the step of performing health detection on the potential fault point and obtaining the first detection return value further includes:
obtaining the node type of the potential fault point, and obtaining a detection function corresponding to the node type from a preset tool library;
obtaining the detection index of the detection function, and performing a sniffing test on the index data in the potential fault point according to the detection index, to obtain the first detection return value of the potential fault point.
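The lookup of a detection function by node type could be sketched as follows. The tool library contents, the node type names, the indicator names and the simple upper-limit comparison are hypothetical; the application only states that a detection function and its detection index are obtained from a preset tool library.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DetectionFunction:
    indicator: str      # name of the detection index read from the node (assumed)
    upper_limit: float  # values above this are treated as unhealthy (assumed)


# Hypothetical preset tool library keyed by node type.
TOOL_LIBRARY: Dict[str, DetectionFunction] = {
    "database": DetectionFunction("query_latency_ms", 200.0),
    "web_service": DetectionFunction("error_rate", 0.05),
}


def sniff(node_type: str, index_data: Dict[str, float]) -> str:
    """Pick the detection function for the node type and test the node's index data."""
    tool = TOOL_LIBRARY[node_type]
    value = index_data[tool.indicator]
    if value <= tool.upper_limit:
        return "healthy"
    return f"unhealthy:{tool.indicator}"
```

Here the returned string stands in for the first detection return value; a real system would presumably return a richer structure.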
Optionally, after the step of outputting the target fault point and the second detection return value, the method includes:
obtaining the fault state of the target fault point, selecting, according to that fault state, the target emergency plan corresponding to the target fault point from a pre-stored plan database, and executing the target emergency plan on the target fault point;
after executing the target emergency plan, performing health detection on the target fault point again.
Optionally, after the step of performing health detection on the target fault point again after executing the target emergency plan, the method includes:
obtaining a third detection return value produced by performing health detection on the target fault point again, and determining whether the third detection return value points to a new fault point;
if the third detection return value points to a new fault point, outputting a warning message indicating that the fault cannot be handled automatically.
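A minimal sketch of the execute-then-recheck flow described above is given here. The `plan` and `probe` callables and the wording of the warning are placeholders chosen for the example and are not taken from the application.

```python
from typing import Callable, Optional, Tuple


def apply_plan_and_recheck(
    target: str,
    plan: Callable[[str], None],
    probe: Callable[[str], Tuple[str, Optional[str]]],
) -> Optional[str]:
    """Execute the emergency plan, probe the target again, and warn if needed.

    Returns a warning string when the third detection return value still
    points to a new fault point, otherwise None.
    """
    plan(target)                            # execute the target emergency plan
    third_value, new_point = probe(target)  # health detection after the plan
    if new_point is not None:
        return (
            f"WARNING: {target} cannot be handled automatically; "
            f"probe result {third_value!r} points to {new_point}"
        )
    return None
```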
Optionally, the step of obtaining the fault state of the target fault point and selecting, according to that fault state, the target emergency plan corresponding to the target fault point from the pre-stored plan database includes:
obtaining the fault state of the target fault point and, according to it, obtaining all pre-stored emergency plans corresponding to the target fault point; if there are multiple emergency plans, counting, over a past historical period, how often the target fault point successfully passed health detection after each emergency plan was executed;
selecting the emergency plan with the highest pass frequency from the pre-stored plan database, and setting it as the target emergency plan.
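Selecting the plan with the highest pass frequency might look like the sketch below. The history format, a list of `(plan, passed)` pairs, and the plan names in the usage are assumptions; ties are resolved in favor of the earlier candidate simply because that is how `max` behaves.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def select_target_plan(
    candidates: List[str],
    history: Iterable[Tuple[str, bool]],
) -> str:
    """Return the candidate plan that most often led to a passed health detection.

    `history` holds (plan name, whether the re-probe passed) pairs from a
    past period; this record format is hypothetical.
    """
    if not candidates:
        raise ValueError("no emergency plan is stored for this fault state")
    if len(candidates) == 1:
        return candidates[0]
    passes = Counter(
        plan for plan, passed in history if passed and plan in candidates
    )
    # Counter returns 0 for plans that never passed.
    return max(candidates, key=lambda plan: passes[plan])
```

For example, with candidates `["restart", "failover"]` and a history in which `failover` passed twice and `restart` once, the function returns `"failover"`.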
This application also provides an intelligent IT operation and maintenance fault location device, which includes:
a receiving module, configured to receive a fault analysis report and obtain all potential fault points in the fault analysis report;
an execution module for each potential fault point, the execution module including:
a health detection submodule, configured to perform health detection on the potential fault point and obtain a first detection return value;
a first obtaining submodule, configured to obtain a new fault point specified by the first detection return value, perform continuous health detection on the new fault point until no new fault point is generated, determine the fault point at which no new fault point is generated as the target fault point, and obtain a second detection return value of the target fault point;
an output submodule, configured to output the target fault point and the second detection return value.
Optionally, the health detection submodule includes:
a locating unit, configured to locate the operation source and the operation processes referenced by the potential fault point, and perform health detection on the operation source and the operation processes to obtain the first detection return value corresponding to the operation source and to each operation process;
wherein the health detection includes a step of detecting by using pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
Optionally, the health detection submodule includes:
a first obtaining unit, configured to obtain the node type of the potential fault point, and obtain a detection function corresponding to the node type from a preset tool library;
a second obtaining unit, configured to obtain the detection index of the detection function, and perform a sniffing test on the index data in the potential fault point according to the detection index, to obtain the first detection return value of the potential fault point.
Optionally, the intelligent IT operation and maintenance fault location device further includes:
a first obtaining module, configured to obtain the fault state of the target fault point, select, according to that fault state, the target emergency plan corresponding to the target fault point from a pre-stored plan database, and execute the target emergency plan on the target fault point;
a re-detection module, configured to perform health detection on the target fault point again after the target emergency plan has been executed.
Optionally, the intelligent IT operation and maintenance fault location device further includes:
a second obtaining module, configured to obtain a third detection return value produced by performing health detection on the target fault point again, and determine whether the third detection return value points to a new fault point;
an output module, configured to output a warning message indicating that the fault cannot be handled automatically if the third detection return value points to a new fault point.
Optionally, the first obtaining module includes:
a second obtaining submodule, configured to obtain the fault state of the target fault point and, according to it, obtain all pre-stored emergency plans corresponding to the target fault point; and, if there are multiple emergency plans, count, over a past historical period, how often the target fault point successfully passed health detection after each emergency plan was executed;
a selection submodule, configured to select the emergency plan with the highest pass frequency from the pre-stored plan database, and set it as the target emergency plan.
In addition, to achieve the above purpose, the present application also provides intelligent IT operation and maintenance fault location equipment, which includes: a memory, a processor, a communication bus, and intelligent IT operation and maintenance fault location computer-readable instructions stored in the memory,
the communication bus being used to implement a communication connection between the processor and the memory;
the processor being used to execute the intelligent IT operation and maintenance fault location computer-readable instructions to implement the following steps:
receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
for each potential fault point, performing the following steps:
performing health detection on the potential fault point, and obtaining a first detection return value;
obtaining a new fault point specified by the first detection return value, performing continuous health detection on the new fault point until no new fault point is generated, determining the fault point at which no new fault point is generated as the target fault point, and obtaining a second detection return value of the target fault point;
outputting the target fault point and the second detection return value.
In addition, to achieve the above purpose, the present application also provides a readable storage medium that stores one or more computer-readable instructions, the one or more computer-readable instructions being executable by one or more processors for:
receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
for each potential fault point, performing the following steps:
performing health detection on the potential fault point, and obtaining a first detection return value;
obtaining a new fault point specified by the first detection return value, performing continuous health detection on the new fault point until no new fault point is generated, determining the fault point at which no new fault point is generated as the target fault point, and obtaining a second detection return value of the target fault point;
outputting the target fault point and the second detection return value.
In this application, a fault analysis report is received and all potential fault points in it are obtained; for each potential fault point, health detection is performed to obtain a first detection return value; the new fault point specified by the first detection return value is obtained and probed continuously until no new fault point is generated; the fault point at which no new fault point is generated is determined as the target fault point, and its second detection return value is obtained; and the target fault point and the second detection return value are output. In other words, once the fault analysis report has been received, the potential fault points are obtained automatically and are probed iteratively and automatically rather than manually, so the target fault point is located quickly. Quick location saves location time and, accordingly, repair time, and improves the experience of the users, that is, the operation and maintenance personnel. This solves the technical problem in the prior art that fault repair nodes are located inefficiently, which consumes valuable repair time, extends the repair cycle, and degrades the user experience.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a first embodiment of the intelligent IT operation and maintenance fault location method of this application;
FIG. 2 is a detailed flowchart of the step of performing health detection on the potential fault point and obtaining the first detection return value in the intelligent IT operation and maintenance fault location method of this application;
FIG. 3 is a schematic diagram of the equipment structure of the hardware operating environment involved in the method of the embodiments of this application.
The realization of the purpose, the functional characteristics and the advantages of the present application are further described below in conjunction with the embodiments and with reference to the drawings.
Detailed Description
It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
The present application provides an intelligent IT operation and maintenance fault location method. In a first embodiment of the method, referring to FIG. 1, the intelligent IT operation and maintenance fault location method includes:
Step S10: receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
for each potential fault point, performing the following steps:
Step S20: performing health detection on the potential fault point, and obtaining a first detection return value;
Step S30: obtaining a new fault point specified by the first detection return value, performing continuous health detection on the new fault point until no new fault point is generated, determining the fault point at which no new fault point is generated as the target fault point, and obtaining a second detection return value of the target fault point;
Step S40: outputting the target fault point and the second detection return value.
The specific steps are as follows:
Step S10: receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
It should be noted that, in this embodiment, location probing is initiated for the multiple potential fault points ranked highest in credibility, so that the target fault point is finally determined by probing.
Specifically, the intelligent IT operation and maintenance fault location method is applied to an intelligent IT operation and maintenance fault location system. Before the fault analysis report is received, an intelligent IT operation and maintenance fault analysis system that communicates with the fault location system obtains the alarm information of the relevant parties at a given moment. It then analyzes this alarm information according to pre-stored fault analysis computer-readable instructions, produces a fault analysis report, and sends the report to the intelligent IT operation and maintenance fault location system. The fault analysis report lists each potential fault point.
After receiving the fault analysis report, the intelligent IT operation and maintenance fault location system parses it to obtain all potential fault points in the report. For example, suppose node A in the current system cannot call node B, and the report lists both node A and node B as potential fault points. The system then directly obtains node A and node B, that is, all potential fault points, and performs location probing on both to determine whether node A is faulty, node B is faulty, or both are faulty, and further determines the specific faulty process or fault source of the faulty node.
For each potential fault point, the following steps are performed:
Step S20: performing health detection on the potential fault point, and obtaining a first detection return value;
Among the potential fault points there must be one or several whose failure causes other nodes to fail as well. In this embodiment, the intelligent IT operation and maintenance fault location system performs health detection on each potential fault point to obtain the respective first detection return values.
Specifically, the step of performing health detection on the potential fault point and obtaining the first detection return value includes:
Step S21: locating the operation source and the operation processes referenced by the potential fault point, and performing health detection on the operation source and the operation processes to obtain the first detection return value corresponding to the operation source and to each operation process;
wherein the health detection includes a step of detecting by using pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
In this embodiment, during health detection, the potential fault node is assumed to be a normal node, and the operation source and the operation processes referenced by the potential fault point are located. Health detection is then performed on the operation source and on each operation process; specifically, they are probed using their pre-stored data I/O indicators for the normal state.
It should be noted that, in this embodiment, after the node type of the potential fault point is obtained, a detection function corresponding to the node type may also be obtained from a preset tool library, together with the detection index of that detection function. A sniffing test is performed on the operation source and the operation processes in the potential fault point according to the detection index. After the sniffing test, the pre-stored normal-state data I/O indicators of the operation source and of each operation process can be selected accurately for probing, so as to obtain the first detection return value of the potential fault point and shorten the detection flow.
A specific example illustrates this. Fault point A needs to execute three sequential stages A1, A2 and A3, that is, three operation processes. The intelligent IT operation and maintenance fault location system starts from stage A1: it feeds the preset starting parameters (data I/O indicators) into stage A1, stage A1 produces an operation value, and the system checks whether that operation value matches the preset result value. If it matches, stage A1 has no problem, and the system receives a first detection return value indicating this, such as a10. Otherwise, stage A1 has a problem, and the system receives a first detection return value indicating this, such as a11. In other words, the system locates the operation source and operation process referenced by stage A1 and thereby obtains the first detection return values corresponding to that location process. The subsequent stage A2 is then probed on the same principle as stage A1, and so on, until all first detection return values of fault point A have been obtained. If fault point A is entirely normal, then the error occurs at fault point B when fault point A passes the value of stage A3 to it; in that case a detection return value indicating that the fault point is node B, such as b, is obtained.
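The stage-by-stage check in the A1, A2, A3 example can be sketched as follows. The return value format follows the text's `a10` (no problem) and `a11` (problem) pattern, while the stage functions, preset inputs and expected outputs are invented for illustration; treating an exception inside a stage as a failed check is also an assumption.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stage description: (name, stage function, preset input, expected output).
Stage = Tuple[str, Callable[[Any], Any], Any, Any]


def probe_stages(stages: List[Stage]) -> Dict[str, str]:
    """Feed each stage its preset input and compare the result with the preset value.

    Returns one first detection return value per stage: the lowercased
    stage name followed by "0" (no problem) or "1" (problem).
    """
    return_values: Dict[str, str] = {}
    for name, func, preset_input, expected in stages:
        try:
            ok = func(preset_input) == expected
        except Exception:
            ok = False  # assumption: a crashing stage counts as a problem
        return_values[name] = name.lower() + ("0" if ok else "1")
    return return_values


# Made-up stages: A2 is deliberately wrong (3 * 2 is 6, but 7 is expected).
EXAMPLE_STAGES: List[Stage] = [
    ("A1", lambda x: x + 1, 1, 2),
    ("A2", lambda x: x * 2, 3, 7),
    ("A3", str.upper, "ok", "OK"),
]
```

Calling `probe_stages(EXAMPLE_STAGES)` marks only stage A2 as problematic.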
Further, for ease of understanding, consider the following example: fault point A is a product order node, fault point B is an order database, and the fault is that fault point A cannot retrieve the order content from the database at fault point B. The intelligent IT operation and maintenance fault location system first checks whether fault point A calls the correct order number, by testing the order call of fault point A with a known order number and judging whether fault point A can retrieve that order number in node B. If it can, the system judges whether fault point A can use the order number to look up the corresponding record. If it can, the system judges whether point A can pull the order content from the record normally. If it can, the system judges whether the pulled content has been altered. If the content is unaltered, the system judges whether point A displays the order content normally. If the result of any step in this detection flow differs from the result the step should produce in normal operation, the system locates the step node whose result differs. For example, if the system finds that node B cannot return the corresponding record information when fault point A calls the order record in fault point B, the system detects which step of the call failed and returns a first detection return value representing the failure of that step.
步骤S30,获取所述第一探测返回值指定的新的故障点,并对所述新的故障点进行持续健康探测,直至不再产生新的故障点为止,将所述不再产生新的故障点对应的故障点确定为目标故障点,得到所述目标故障点的第二探测返回值;Step S30: Obtain a new fault point specified by the first detection return value, and perform continuous health detection on the new fault point until no new fault point is generated, and then the new fault point is not generated. The fault point corresponding to the point is determined as the target fault point, and the second detection return value of the target fault point is obtained;
The first detection return value may point to a new fault point. The intelligent IT operation and maintenance fault location system therefore performs health detection again on that new fault point and obtains a new first detection return value, which may in turn point to yet another fault point. These steps are iterated until no new fault point is produced. The fault point at which no new fault point is produced is determined as the target fault point, and the second detection return value of the target fault point is obtained.
In other words, the intelligent IT operation and maintenance fault location system needs to probe the potential fault points iteratively, obtaining the first detection return value of each one. If a first detection return value points to a new fault point, the detection has not yet reached the end of the chain; if no new fault point is produced, the system has traversed all fault points that may currently be abnormal. After probing the potential fault points to the end in this way, the system obtains one or more target fault points that the first detection return values have pointed to, possibly several times. That is, the system has extracted the common fault points (target fault points) at which the potential fault points intersect. There may be more than one common fault point; each is a source of the data deviation in all of its associated fault points.
For example, if a fault occurs when A calls B and a fault occurs when A calls C, but no fault occurs when B calls C, then A, being the fault point that the faults involving B and C have in common, is the source fault point.
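The iteration of step S30 can be sketched as a loop that keeps following detection return values until none of them points to a new fault point. This is a minimal sketch under stated assumptions: the probe signature, the `TOY` data, and the guard against cyclic references are hypothetical and are not specified in the application.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# A probe returns (return_value, next_fault_point). next_fault_point is None
# when the return value does not point to any new fault point (assumed shape).
ProbeResult = Tuple[str, Optional[str]]
Probe = Callable[[str], ProbeResult]


def locate_target(start: str, probe: Probe) -> Tuple[str, str]:
    """Follow detection return values from `start` until no new fault point appears.

    Returns (target_fault_point, second_detection_return_value).
    """
    visited: Set[str] = set()
    current = start
    while True:
        visited.add(current)
        return_value, nxt = probe(current)
        # Stop when nothing new is indicated. The visited check is an added
        # assumption that keeps the loop from cycling forever.
        if nxt is None or nxt in visited:
            return current, return_value
        current = nxt


def locate_all(potential: List[str], probe: Probe) -> Dict[str, str]:
    """Probe every potential fault point to the end and collect the targets."""
    targets: Dict[str, str] = {}
    for point in potential:
        target, value = locate_target(point, probe)
        targets[target] = value
    return targets


# Toy probe results keyed by fault point (hypothetical data).
TOY = {
    "order_page": ("call to order_service failed", "order_service"),
    "report_page": ("call to order_service failed", "order_service"),
    "order_service": ("connection pool exhausted", None),
}
result = locate_all(["order_page", "report_page"], TOY.__getitem__)
```

In this toy data both potential fault points lead to the same target, which corresponds to the common fault point described above.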
Step S40: output the target fault point and the second detection return value.
After the target fault point and the second detection return value are obtained, they are output to notify the user or the operation and maintenance personnel.
In this application, a fault analysis report is received and all potential fault points in it are obtained. For each potential fault point, the following steps are performed: health detection is performed on the potential fault point and a first detection return value is obtained; the new fault point specified by the first detection return value is obtained and continuous health detection is performed on it until no further new fault point is produced; the fault point at which no new fault point is produced is determined as the target fault point, and its second detection return value is obtained; the target fault point and the second detection return value are output. That is, once the fault analysis report is received, the potential fault points are obtained automatically and are subjected to continuous, iterative health detection automatically rather than manually, so that the target fault point is located quickly. Locating the target fault point quickly saves location time and therefore repair time, which improves the experience of the users, that is, the operation and maintenance personnel. This addresses the technical problem in the prior art that fault repair nodes are located inefficiently, which consumes valuable repair time, lengthens the repair cycle, and degrades the user experience.
Further, referring to FIG. 2, this application provides another embodiment of the intelligent IT operation and maintenance fault location method. In this embodiment, the step of performing health detection on the potential fault point and obtaining the first detection return value further includes:
Step S22: obtain the node type of the potential fault point, and obtain the detection function corresponding to that node type from a preset tool library;
In this embodiment, each potential fault point has a node type in the intelligent IT operation and maintenance fault location system, and each node type has a dedicated detection function in the system's preset function library. The system obtains the node type of the potential fault point and obtains the corresponding detection function from the preset tool library. For example, if the potential fault point is a network communication node, the system obtains from the preset function library the network detection function mapped to the network communication node category.
Step S23: obtain the detection indicators of the detection function, and perform a sniffing test on the indicator data in the potential fault point according to the detection indicators, to obtain the first detection return value of the potential fault point.
Different detection functions have different detection indicators; for example, the detection indicators of the network detection function include network link status and data transmission rate. In this embodiment, the detection function performs a sniffing test on the indicator data in the corresponding potential fault point. The sniffing test classifies and filters the indicator data in the potential fault point to select the indicator data of the same type as the detection indicators, and then traces the selected data back to its source, thereby obtaining the first detection return values or second detection return values corresponding to the potential fault point.
For example, if the current detection indicator is the network connection status of the potential fault point, the detection may include the following steps: the intelligent IT operation and maintenance fault location system determines the two endpoints of the network connection; an instruction to establish a network connection is initiated from node A to node B; the system determines the address ip1 of node A, obtains the address ip2 of node B, checks whether the DNS resolution service between node A and node B is correct, and so on. The detection function tests the input and output of all network indicator data involved in the network connection, thereby identifying which link in the process has a problem.
In this embodiment, the node type of the potential fault point is obtained, and the detection function corresponding to that node type is obtained from the preset tool library; the detection indicators of the detection function are obtained, and a sniffing test is performed on the indicator data in the potential fault point according to those indicators to obtain the first detection return value of the potential fault point. Because the sniffing test is performed accurately, it lays the foundation for locating the target fault point in an orderly and rapid manner.
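Steps S22 and S23 can be sketched as a lookup of a detection function by node type, followed by a filter that keeps only the indicator data of the same type as the function's detection indicators. The library layout, indicator names, and thresholds below are hypothetical and serve only to illustrate the idea; the source-tracing part of the sniffing test is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProbeFunction:
    """A detection function with its indicators (hypothetical structure)."""
    name: str
    indicators: List[str]                           # detection indicators it covers
    is_normal: Dict[str, Callable[[float], bool]]   # per-indicator normal-state check


# Hypothetical preset tool library keyed by node type.
TOOL_LIBRARY: Dict[str, ProbeFunction] = {
    "network": ProbeFunction(
        name="network_probe",
        indicators=["link_up", "throughput_mbps"],
        is_normal={
            "link_up": lambda v: v == 1,
            "throughput_mbps": lambda v: v >= 10,
        },
    ),
}


def sniff(node_type: str, indicator_data: Dict[str, float]) -> List[str]:
    """Return the indicators of the node that deviate from the normal state."""
    probe = TOOL_LIBRARY[node_type]
    # Keep only the data whose type matches the probe's detection indicators.
    relevant = {k: v for k, v in indicator_data.items() if k in probe.indicators}
    return [k for k, v in relevant.items() if not probe.is_normal[k](v)]


# cpu_load is ignored because it is not a network detection indicator.
abnormal = sniff("network", {"link_up": 1, "throughput_mbps": 2.5, "cpu_load": 0.9})
```

Keeping the library as a mapping from node type to detection function means that supporting a new node type only requires adding an entry, without changing the sniffing routine.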
Further, this application provides another embodiment of the intelligent IT operation and maintenance fault location method. In this embodiment, after the step of outputting the target fault point and the second detection return value, the method includes:
Step S50: obtain the fault state of the target fault point and, according to it, select the target emergency plan corresponding to the target fault point from a pre-stored plan database, and execute the target emergency plan on the target fault point;
In this embodiment, a plan database is pre-stored. It contains emergency plans for the various node types and fault states of target fault points, which are used to resolve the fault at the target fault point. After determining the target fault point, the system retrieves the corresponding target emergency plan directly from the system database and executes it.
Step S60: after executing the target emergency plan, perform health detection on the target fault point again.
After executing the target emergency plan, the system performs health detection on the target fault point again to verify whether the fault has been resolved. The steps are the same as those of the health detection described above.
After the step of performing health detection on the target fault point again following execution of the target emergency plan, the method includes:
Step S70: obtain the third detection return value produced by performing health detection on the target fault point again, and determine whether the third detection return value points to a new fault point;
The third detection return value produced by the repeated health detection is obtained, and it is determined whether it points to a new fault point. In this embodiment, if performing health detection on the target fault point again yields a new fault point, the intelligent IT operation and maintenance fault location system has evidently not resolved the fault state of that target fault point.
Step S80: if the third detection return value points to a new fault point, output a warning message indicating that the fault cannot be handled automatically.
In this case the intelligent IT operation and maintenance fault location system cannot resolve the fault state automatically, so it outputs a warning message indicating that the fault cannot be handled automatically. Operation and maintenance personnel can then handle it manually, which improves the fault tolerance of the system.
In this embodiment, the fault state of the target fault point is obtained and used to select the corresponding target emergency plan from the pre-stored plan database, and the target emergency plan is executed on the target fault point; after the target emergency plan is executed, health detection is performed on the target fault point again. This guards against cases in which the target emergency plan does not match the fault state of the target fault point, and improves the fault tolerance of the intelligent IT operation and maintenance fault location system.
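Steps S50 to S80 can be sketched as: execute the plan, probe again, and emit a warning if the third detection return value still points to a fault point. This is a minimal sketch; the function `remediate`, the message texts, and the toy `state` dictionary are hypothetical and are not taken from the application.

```python
from typing import Callable, Dict, Optional, Tuple

# (return value, new fault point or None); the shape is an assumption.
ProbeResult = Tuple[str, Optional[str]]


def remediate(
    target: str,
    plan: Callable[[str], None],
    probe: Callable[[str], ProbeResult],
) -> str:
    """Apply the plan, probe again, and report the outcome as a message."""
    plan(target)
    third_value, new_point = probe(target)
    if new_point is not None:
        # The third detection return value still points to a fault point.
        return f"WARNING: {target} cannot be handled automatically ({third_value})"
    return f"OK: {target} passed health detection after the plan"


# Toy environment (hypothetical): one database node that is down.
state: Dict[str, str] = {"db": "down"}


def restart(point: str) -> None:
    """Emergency plan that actually fixes the toy fault."""
    state[point] = "up"


def noop(point: str) -> None:
    """Emergency plan that does not match the fault state."""


def probe(point: str) -> ProbeResult:
    if state[point] == "up":
        return ("healthy", None)
    return ("service not responding", "db_disk")


unmatched = remediate("db", noop, probe)    # the plan does not fix the fault
matched = remediate("db", restart, probe)   # the plan fixes the fault
```

Returning a message rather than raising an exception keeps the sketch close to the described behavior, in which the system outputs a warning and leaves the handling to operation and maintenance personnel.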
Further, this application provides another embodiment of the intelligent IT operation and maintenance fault location method. In this embodiment, the step of obtaining the fault state of the target fault point and, according to it, selecting the target emergency plan corresponding to the target fault point from the pre-stored plan database includes:
Step S51: obtain the fault state of the target fault point and, according to it, obtain all pre-stored emergency plans corresponding to the target fault point; if there are multiple emergency plans, count, for each plan, the pass frequency over a past period, that is, how often the target fault point passed health detection after that plan was executed;
In this embodiment, a target fault point may have multiple corresponding emergency plans. The system therefore counts, for each emergency plan, how often the target fault point passed health detection after the plan was executed during a past period.
Step S52: select the emergency plan with the highest pass frequency from the pre-stored plan database, and set it as the target emergency plan.
Specifically, the system automatically counts the number of times each emergency plan led directly to a passed health detection, selects the plan with the highest pass frequency from the pre-stored plan database, and marks it as the top-priority recommended plan, to be preferred in subsequent emergency plan matching. The system thus sets the emergency plan with the highest pass frequency as the target emergency plan.
In this embodiment, the fault state of the target fault point is obtained and used to obtain all pre-stored emergency plans corresponding to the target fault point; if there are multiple emergency plans, the pass frequency of each plan over a past period is counted; the plan with the highest pass frequency is selected from the pre-stored plan database and set as the target emergency plan. The fault at the target fault point can therefore be resolved as quickly as possible, which improves the experience of the operation and maintenance personnel, that is, the users.
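The selection in steps S51 and S52 can be sketched as a count of passed health detections per plan, followed by picking the maximum. The history format (pairs of plan name and pass flag), the plan names, and the tie-breaking rule are assumptions made for illustration; the application does not specify them.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def pick_target_plan(candidates: List[str], history: Iterable[Tuple[str, bool]]) -> str:
    """Return the candidate plan with the most passed re-detections in `history`.

    `history` holds (plan name, passed health detection?) pairs from a past period.
    """
    if not candidates:
        raise ValueError("no emergency plan available for this fault state")
    if len(candidates) == 1:
        return candidates[0]
    passes = Counter(plan for plan, passed in history if passed and plan in candidates)
    # Ties resolve to the earlier candidate because max() keeps the first maximum.
    return max(candidates, key=lambda plan: passes[plan])


# Hypothetical execution history for two plans.
history = [
    ("restart_service", True),
    ("restart_service", False),
    ("failover", True),
    ("failover", True),
]
chosen = pick_target_plan(["restart_service", "failover"], history)
```

Counting only passed detections follows the description of pass frequency; a variant could rank plans by pass rate instead, which would favor plans that have been run rarely but always succeeded.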
Referring to FIG. 3, FIG. 3 is a schematic structural diagram of the device in the hardware operating environment involved in the embodiments of this application.
The intelligent IT operation and maintenance fault location device of the embodiments of this application may be a PC, or a terminal device such as a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, or a portable computer.
As shown in FIG. 3, the intelligent IT operation and maintenance fault location device may include a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002. The communication bus 1002 implements the connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as disk storage. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.
Optionally, the intelligent IT operation and maintenance fault location device may further include a target user interface, a network interface, a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a WiFi module, and so on. The target user interface may include a display and an input sub-module such as a keyboard; optionally, it may also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the intelligent IT operation and maintenance fault location device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 3, the memory 1005, as a computer-readable storage medium, may include an operating system, a network communication module, and intelligent IT operation and maintenance fault location computer-readable instructions. The operating system is a set of computer-readable instructions that manages and controls the hardware and software resources of the device and supports the running of the fault location computer-readable instructions and other software and/or computer-readable instructions. The network communication module implements communication among the components inside the memory 1005, and communication with other hardware and software in the intelligent IT operation and maintenance fault location device.
In the intelligent IT operation and maintenance fault location device shown in FIG. 3, the processor 1001 executes the intelligent IT operation and maintenance fault location computer-readable instructions stored in the memory 1005 to implement the steps of the intelligent IT operation and maintenance fault location method described in any of the above embodiments.
The specific implementation of the intelligent IT operation and maintenance fault location device of this application is basically the same as the embodiments of the intelligent IT operation and maintenance fault location method described above, and is not repeated here.
This application also provides an intelligent IT operation and maintenance fault location apparatus. Its specific implementation is basically the same as the embodiments of the intelligent IT operation and maintenance fault location method described above, and is not repeated here.
This application provides a readable storage medium, which may be a non-volatile readable storage medium. The readable storage medium stores one or more computer-readable instructions, which can be executed by one or more processors to implement the steps of the intelligent IT operation and maintenance fault location method described in any of the above embodiments.
The specific implementation of the readable storage medium of this application is basically the same as the embodiments of the intelligent IT operation and maintenance fault location method described above, and is not repeated here.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims (20)

  1. An intelligent IT operation and maintenance fault location method, wherein the intelligent IT operation and maintenance fault location method comprises:
    receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
    for each potential fault point, performing the following steps:
    performing health detection on the potential fault point, and obtaining a first detection return value;
    obtaining a new fault point specified by the first detection return value, and performing continuous health detection on the new fault point until no further new fault point is produced; determining the fault point at which no new fault point is produced as a target fault point, and obtaining a second detection return value of the target fault point;
    outputting the target fault point and the second detection return value.
  2. The intelligent IT operation and maintenance fault location method according to claim 1, wherein the step of performing health detection on the potential fault point and obtaining a first detection return value comprises:
    locating the operation source and operation flows referenced by the potential fault point, and performing health detection on the operation source and operation flows to obtain the first detection return values corresponding to the operation source and to each operation flow;
    wherein the health detection comprises a step of performing detection using pre-stored data I/O indicators of the operation source and of each operation flow in the normal state.
  3. The intelligent IT operation and maintenance fault location method according to claim 1, wherein the step of performing health detection on the potential fault point and obtaining a first detection return value further comprises:
    obtaining the node type of the potential fault point, and obtaining the detection function corresponding to the node type from a preset tool library;
    obtaining the detection indicators of the detection function, and performing a sniffing test on the indicator data in the potential fault point according to the detection indicators, to obtain the first detection return value of the potential fault point.
  4. The intelligent IT operation and maintenance fault location method according to claim 1, wherein, after the step of outputting the target fault point and the second detection return value, the method comprises:
    obtaining the fault state of the target fault point and, according to it, selecting a target emergency plan corresponding to the target fault point from a pre-stored plan database, and executing the target emergency plan on the target fault point;
    after executing the target emergency plan, performing health detection on the target fault point again.
  5. The intelligent IT operation and maintenance fault location method according to claim 4, wherein, after the step of performing health detection on the target fault point again following execution of the target emergency plan, the method comprises:
    obtaining a third detection return value produced by performing health detection on the target fault point again, and determining whether the third detection return value points to a new fault point;
    if the third detection return value points to a new fault point, outputting a warning message indicating that the fault cannot be handled automatically.
  6. The intelligent IT operation and maintenance fault location method according to claim 5, wherein the step of obtaining the fault state of the target fault point and, according to it, selecting the target emergency plan corresponding to the target fault point from the pre-stored plan database comprises:
    obtaining the fault state of the target fault point and, according to it, obtaining all pre-stored emergency plans corresponding to the target fault point; if there are multiple emergency plans, counting, for each emergency plan, the pass frequency with which the target fault point passed health detection after that plan was executed during a past period;
    selecting the emergency plan with the highest pass frequency from the pre-stored plan database, and setting the emergency plan with the highest pass frequency as the target emergency plan.
  7. An intelligent IT operation and maintenance fault location apparatus, wherein the intelligent IT operation and maintenance fault location apparatus comprises:
    a receiving module, configured to receive a fault analysis report and obtain all potential fault points in the fault analysis report;
    an execution module provided for each potential fault point, the execution module comprising:
    a health detection sub-module, configured to perform health detection on the potential fault point and obtain a first detection return value;
    a first obtaining sub-module, configured to obtain a new fault point specified by the first detection return value, perform continuous health detection on the new fault point until no further new fault point is produced, determine the fault point at which no new fault point is produced as a target fault point, and obtain a second detection return value of the target fault point;
    an output sub-module, configured to output the target fault point and the second detection return value.
  8. The intelligent IT operation and maintenance fault location apparatus according to claim 7, wherein the health detection sub-module comprises:
    a locating unit, configured to locate the operation source and operation flows referenced by the potential fault point, and perform health detection on the operation source and operation flows to obtain the first detection return values corresponding to the operation source and to each operation flow;
    wherein the health detection comprises a step of performing detection using pre-stored data I/O indicators of the operation source and of each operation flow in the normal state.
  9. An intelligent IT operation and maintenance fault location device, wherein the intelligent IT operation and maintenance fault location device comprises: a memory, a processor, a communication bus, and computer-readable instructions stored in the memory,
    the communication bus being configured to implement the communication connection between the processor and the memory;
    the processor being configured to execute the computer-readable instructions to implement the following steps:
    receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
    for each potential fault point, performing the following steps:
    performing health detection on the potential fault point, and obtaining a first detection return value;
    obtaining a new fault point specified by the first detection return value, and performing continuous health detection on the new fault point until no further new fault point is produced; determining the fault point at which no new fault point is produced as a target fault point, and obtaining a second detection return value of the target fault point;
    outputting the target fault point and the second detection return value.
  10. The intelligent IT operation and maintenance fault location device according to claim 9, wherein the step of performing health detection on the potential fault point and obtaining a first detection return value comprises:
    locating the operation source and operation flows referenced by the potential fault point, and performing health detection on the operation source and operation flows to obtain the first detection return values corresponding to the operation source and to each operation flow;
    wherein the health detection comprises a step of performing detection using pre-stored data I/O indicators of the operation source and of each operation flow in the normal state.
  11. The intelligent IT operation and maintenance fault locating device according to claim 9, wherein the step of performing health detection on the potential fault point and obtaining the first detection return value further comprises:
    Obtaining the node type of the potential fault point, and obtaining a detection function corresponding to the node type from a preset tool library;
    Obtaining the detection indicators of the detection function, and performing a sniffing test on the indicator data in the potential fault point according to the detection indicators, to obtain the first detection return value of the potential fault point.
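
The lookup of a detection function by node type can be sketched as a registry keyed by node type, where each entry lists the indicators the function needs. Everything named here (`TOOL_LIBRARY`, `sniff`, the node types, the indicator names and the thresholds) is hypothetical and chosen only to illustrate the claim wording.

```python
from typing import Callable, Dict, List, Tuple

ProbeFn = Callable[[Dict[str, float]], bool]


def probe_database(m: Dict[str, float]) -> bool:
    # Healthy while the connection pool is not exhausted (illustrative rule).
    return m["active_connections"] < m["max_connections"]


def probe_host(m: Dict[str, float]) -> bool:
    # Healthy while disk usage stays below 90 percent (illustrative threshold).
    return m["disk_used_pct"] < 90.0


# Assumed "preset tool library": node type -> (detection indicators, detection function).
TOOL_LIBRARY: Dict[str, Tuple[List[str], ProbeFn]] = {
    "database": (["active_connections", "max_connections"], probe_database),
    "host": (["disk_used_pct"], probe_host),
}


def sniff(node_type: str, metrics: Dict[str, float]) -> str:
    """Run the detection function registered for node_type on its indicators."""
    indicators, probe = TOOL_LIBRARY[node_type]
    missing = [k for k in indicators if k not in metrics]
    if missing:
        return "unknown: missing " + ", ".join(missing)
    subset = {k: metrics[k] for k in indicators}
    return "healthy" if probe(subset) else "unhealthy"


print(sniff("host", {"disk_used_pct": 97.0}))  # unhealthy
```

Keeping the indicator list next to the function lets the caller fetch only the data each node type actually needs before the test runs.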
  12. The intelligent IT operation and maintenance fault locating device according to claim 9, wherein, after the step of outputting the target fault point and the second detection return value, the steps further comprise:
    Obtaining the fault state of the target fault point, selecting, according to the fault state, a target emergency plan corresponding to the target fault point from a pre-stored plan database, and executing the target emergency plan on the target fault point;
    After executing the target emergency plan, performing health detection on the target fault point again.
  13. The intelligent IT operation and maintenance fault locating device according to claim 12, wherein, after the step of performing health detection on the target fault point again after executing the target emergency plan, the steps further comprise:
    Obtaining a third detection return value produced by performing health detection on the target fault point again, and determining whether the third detection return value points to a new fault point;
    If the third detection return value points to a new fault point, outputting warning information indicating that the fault cannot be handled automatically.
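
Claims 12 and 13 together describe a remediate-then-verify step: execute the selected emergency plan, probe the target fault point again, and warn if the new return value still points to another fault point. The sketch below assumes a `(status, new_point)` return value and uses hypothetical names (`remediate`, `clear_temp_files`, `state`); what happens when no new fault point is indicated is not specified in the claims, so returning the status string is an assumption.

```python
from typing import Callable, Dict, Optional, Tuple

Probe = Callable[[str], Tuple[str, Optional[str]]]


def remediate(target: str, plan: Callable[[str], None], probe: Probe) -> str:
    """Execute an emergency plan, re-run health detection, and report the outcome."""
    plan(target)
    # The result of this second probe corresponds to the third detection return value.
    status, new_point = probe(target)
    if new_point is not None:
        return (f"WARNING: {target} cannot be handled automatically "
                f"(new fault point: {new_point})")
    # Assumed behaviour when no new fault point is indicated.
    return f"{target}: {status}"


# Toy state and plan used only for illustration.
state: Dict[str, str] = {"orders-db": "disk full"}


def clear_temp_files(point: str) -> None:
    state[point] = "ok"


def probe_state(point: str) -> Tuple[str, Optional[str]]:
    return state[point], None


print(remediate("orders-db", clear_temp_files, probe_state))  # orders-db: ok
```

The warning branch is where a real system would hand the incident over to an operator instead of retrying plans indefinitely.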
  14. The intelligent IT operation and maintenance fault locating device according to claim 13, wherein the step of obtaining the fault state of the target fault point and selecting, according to the fault state, the target emergency plan corresponding to the target fault point from the pre-stored plan database comprises:
    Obtaining the fault state of the target fault point and, according to the fault state, obtaining all pre-stored emergency plans corresponding to the target fault point; if there are multiple emergency plans, counting, for a past historical time period, the pass frequency with which the target fault point passed health detection after each emergency plan was executed;
    Selecting the emergency plan with the highest pass frequency from the pre-stored plan database, and setting the emergency plan with the highest pass frequency as the target emergency plan.
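
Selecting the plan with the highest pass frequency amounts to counting, per candidate plan, how often health detection passed after that plan was executed in the historical window. A minimal sketch, assuming the history is available as `(plan, passed)` pairs; the function name `select_target_plan`, the plan names and the tie-breaking rule are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def select_target_plan(candidates: List[str], history: Iterable[Tuple[str, bool]]) -> str:
    """Pick the candidate plan whose past executions most often passed health detection."""
    if not candidates:
        raise ValueError("no emergency plan stored for this fault state")
    if len(candidates) == 1:
        return candidates[0]
    # Count only successful runs of plans that are candidates for this fault state.
    passes = Counter(plan for plan, passed in history if passed and plan in candidates)
    # max() returns the first candidate on ties (an assumed tie-breaking rule).
    return max(candidates, key=lambda plan: passes[plan])


# Hypothetical execution history within the chosen historical time period.
history = [
    ("restart_service", True),
    ("restart_service", False),
    ("failover_to_replica", True),
    ("failover_to_replica", True),
]
print(select_target_plan(["restart_service", "failover_to_replica"], history))
# failover_to_replica
```

The claim counts pass frequency rather than pass rate, so a plan that is run more often can win even with a lower success ratio; the sketch follows the claim wording.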
  15. A readable storage medium, wherein intelligent IT operation and maintenance fault locating computer-readable instructions are stored on the readable storage medium, and the following steps are implemented when the instructions are executed by a processor:
    Receiving a fault analysis report, and obtaining all potential fault points in the fault analysis report;
    For each potential fault point, performing the following steps:
    Performing health detection on the potential fault point, and obtaining a first detection return value;
    Obtaining a new fault point specified by the first detection return value, and performing continued health detection on the new fault point until no further new fault point is produced; determining the fault point at which no further new fault point is produced as the target fault point, and obtaining a second detection return value of the target fault point;
    Outputting the target fault point and the second detection return value.
  16. The readable storage medium according to claim 15, wherein the step of performing health detection on the potential fault point and obtaining the first detection return value comprises:
    Locating the operation source and the operation processes referenced by the potential fault point, and performing health detection on the operation source and the operation processes, to obtain the first detection return value corresponding to the operation source and to each operation process;
    Wherein the health detection comprises a step of detecting by using pre-stored data I/O indicators of the operation source and of each operation process in a normal state.
  17. The readable storage medium according to claim 15, wherein the step of performing health detection on the potential fault point and obtaining the first detection return value further comprises:
    Obtaining the node type of the potential fault point, and obtaining a detection function corresponding to the node type from a preset tool library;
    Obtaining the detection indicators of the detection function, and performing a sniffing test on the indicator data in the potential fault point according to the detection indicators, to obtain the first detection return value of the potential fault point.
  18. The readable storage medium according to claim 15, wherein, after the step of outputting the target fault point and the second detection return value, the steps further comprise:
    Obtaining the fault state of the target fault point, selecting, according to the fault state, a target emergency plan corresponding to the target fault point from a pre-stored plan database, and executing the target emergency plan on the target fault point;
    After executing the target emergency plan, performing health detection on the target fault point again.
  19. The readable storage medium according to claim 18, wherein, after the step of performing health detection on the target fault point again after executing the target emergency plan, the steps further comprise:
    Obtaining a third detection return value produced by performing health detection on the target fault point again, and determining whether the third detection return value points to a new fault point;
    If the third detection return value points to a new fault point, outputting warning information indicating that the fault cannot be handled automatically.
  20. The readable storage medium according to claim 19, wherein the step of obtaining the fault state of the target fault point and selecting, according to the fault state, the target emergency plan corresponding to the target fault point from the pre-stored plan database comprises:
    Obtaining the fault state of the target fault point and, according to the fault state, obtaining all pre-stored emergency plans corresponding to the target fault point; if there are multiple emergency plans, counting, for a past historical time period, the pass frequency with which the target fault point passed health detection after each emergency plan was executed;
    Selecting the emergency plan with the highest pass frequency from the pre-stored plan database, and setting the emergency plan with the highest pass frequency as the target emergency plan.
PCT/CN2019/117548 2018-12-13 2019-11-12 Intelligent it operation and maintenance fault positioning method, apparatus and device, and readable storage medium WO2020119369A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811530943.XA CN109633351B (en) 2018-12-13 2018-12-13 Intelligent IT operation and maintenance fault positioning method, device, equipment and readable storage medium
CN201811530943.X 2018-12-13

Publications (1)

Publication Number Publication Date
WO2020119369A1 (en) 2020-06-18

Family

ID=66073827

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117548 WO2020119369A1 (en) 2018-12-13 2019-11-12 Intelligent it operation and maintenance fault positioning method, apparatus and device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN109633351B (en)
WO (1) WO2020119369A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112242938A (en) * 2020-10-14 2021-01-19 亚信科技(中国)有限公司 Detection method, detection device, electronic equipment and computer-readable storage medium
CN112433926A (en) * 2020-11-27 2021-03-02 中国建设银行股份有限公司 Fault analysis method, system, device and storage medium based on IT product
CN113537760A (en) * 2021-07-14 2021-10-22 深圳供电局有限公司 Intelligent recommendation method and system for fault handling plan
CN114294778A (en) * 2021-12-27 2022-04-08 深圳市兴特能源科技有限公司 Air circulation disinfection and purification method and system for classroom lamp
CN115237091A (en) * 2022-07-18 2022-10-25 西安交通大学 Electromechanical device fault tracing method and system
CN115857461A (en) * 2023-03-02 2023-03-28 东莞正大康地饲料有限公司 Piglet premixed feed production online monitoring method and system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109633351B (en) * 2018-12-13 2021-10-22 平安普惠企业管理有限公司 Intelligent IT operation and maintenance fault positioning method, device, equipment and readable storage medium
CN113283462B (en) * 2021-03-24 2022-09-20 国网四川省电力公司电力科学研究院 Secondary system fault positioning method based on improved IDNN model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2680494A1 (en) * 2012-06-29 2014-01-01 Alcatel-Lucent Home network trouble shooting
CN106941423A (en) * 2017-04-13 2017-07-11 腾讯科技(深圳)有限公司 Failure cause localization method and device
CN107342878A (en) * 2016-04-29 2017-11-10 中兴通讯股份有限公司 A kind of fault handling method and device
CN107612756A (en) * 2017-10-31 2018-01-19 广西宜州市联森网络科技有限公司 A kind of operation management system with intelligent trouble analyzing and processing function
CN107809322A (en) * 2016-09-06 2018-03-16 中兴通讯股份有限公司 The distribution method and device of work order
CN107862393A (en) * 2017-10-31 2018-03-30 广西宜州市联森网络科技有限公司 A kind of IT operation management system
CN108768753A (en) * 2018-06-26 2018-11-06 腾讯科技(深圳)有限公司 Localization method, device, storage medium and the electronic device of alarm source
CN109633351A (en) * 2018-12-13 2019-04-16 平安普惠企业管理有限公司 Intelligent IT O&M Fault Locating Method, device, equipment and readable storage medium storing program for executing

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008217735A (en) * 2007-03-08 2008-09-18 Nec Corp Fault analysis system, method and program
CN102306244B (en) * 2011-07-29 2014-03-05 北京航星机器制造有限公司 Fault eliminating method based on evaluation of detecting points
CN103412805A (en) * 2013-07-31 2013-11-27 交通银行股份有限公司 IT (information technology) fault source diagnosis method and IT fault source diagnosis system
CN106059813A (en) * 2016-06-14 2016-10-26 西安电子科技大学 Comprehensive detection method based on dynamic time interval
CN106789243B (en) * 2016-12-22 2019-06-14 烟台东方纵横科技股份有限公司 A kind of IT operational system with intelligent trouble analytic function

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112242938A (en) * 2020-10-14 2021-01-19 亚信科技(中国)有限公司 Detection method, detection device, electronic equipment and computer-readable storage medium
CN112242938B (en) * 2020-10-14 2022-08-19 亚信科技(中国)有限公司 Detection method, device, electronic equipment and computer readable storage medium
CN112433926A (en) * 2020-11-27 2021-03-02 中国建设银行股份有限公司 Fault analysis method, system, device and storage medium based on IT product
CN112433926B (en) * 2020-11-27 2024-03-01 中国建设银行股份有限公司 IT product-based fault analysis method, system, equipment and storage medium
CN113537760A (en) * 2021-07-14 2021-10-22 深圳供电局有限公司 Intelligent recommendation method and system for fault handling plan
CN114294778A (en) * 2021-12-27 2022-04-08 深圳市兴特能源科技有限公司 Air circulation disinfection and purification method and system for classroom lamp
CN114294778B (en) * 2021-12-27 2023-11-14 深圳市兴特能源科技有限公司 Air circulation disinfection and purification method and system for classroom lamp
CN115237091A (en) * 2022-07-18 2022-10-25 西安交通大学 Electromechanical device fault tracing method and system
CN115857461A (en) * 2023-03-02 2023-03-28 东莞正大康地饲料有限公司 Piglet premixed feed production online monitoring method and system

Also Published As

Publication number Publication date
CN109633351A (en) 2019-04-16
CN109633351B (en) 2021-10-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19896421

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.10.2021)