CN111338836B - Method, apparatus, computer device and storage medium for processing fault data

Method, apparatus, computer device and storage medium for processing fault data

Info

Publication number
CN111338836B
Authority
CN
China
Prior art keywords
fault
trained
decision tree
item
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010113452.6A
Other languages
Chinese (zh)
Other versions
CN111338836A (en)
Inventor
郑宇卿
景小琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202010113452.6A priority Critical patent/CN111338836B/en
Publication of CN111338836A publication Critical patent/CN111338836A/en
Application granted granted Critical
Publication of CN111338836B publication Critical patent/CN111338836B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751Error or fault detection not based on redundancy
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766Error or fault reporting or storing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to a method, an apparatus, a computer device and a storage medium for processing fault data. The method comprises the following steps: acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries the item identifier of the associated item and the hierarchy information of the item; inputting the fault logs of the associated items into a trained fault event positioning model, outputting the abnormality probability of each basic item from the trained fault event positioning model, and screening the abnormal item of the fault event from the basic items according to the abnormality probabilities, wherein the trained fault event positioning model is a decision tree model and the basic items are the leaf nodes of the decision tree obtained through training. A decision tree model is constructed from the fault logs of historical fault events, and the constructed decision tree model is used to locate the cause of a new fault event, thereby improving the troubleshooting efficiency for fault events.

Description

Method, apparatus, computer device and storage medium for processing fault data
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a computer device, and a storage medium for processing fault data.
Background
In the practical operation of each project, logs are very important: they help staff quickly check the processing results of the key nodes of a task, and they store fault information when a problem occurs in the project, so that the cause of the fault can be located from this information and the problem can be repaired.
As the number of associated projects grows, multiple layers of calling relations exist among the projects, and each project independently records and maintains its own log information. When a fault point needs to be located, checking the logs layer by layer is inefficient.
Disclosure of Invention
In order to solve the technical problems, the application provides a method, a device, computer equipment and a storage medium for processing fault data.
In a first aspect, the present application provides a method of processing fault data, comprising:
acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item;
inputting the fault logs of all the associated items into a trained fault event positioning model, outputting the abnormality probability of each basic item from the trained fault event positioning model, and screening the abnormal item of the fault event from the basic items according to the abnormality probabilities, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the decision tree obtained through training.
In a second aspect, the present application provides an apparatus for processing fault data, comprising:
the data acquisition module is used for acquiring a current fault event, acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item;
the fault positioning module is used for inputting the fault logs of all the associated items into the trained fault event positioning model, outputting the abnormality probability of each basic item from the trained fault event positioning model, and screening the abnormal item of the fault event from the basic items according to the abnormality probabilities, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the trained decision tree model.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of:
acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item;
inputting the fault logs of all the associated items into a trained fault event positioning model, outputting the abnormality probability of each basic item from the trained fault event positioning model, and screening the abnormal item of the fault event from the basic items according to the abnormality probabilities, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the decision tree obtained through training.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item;
inputting the fault logs of all the associated items into a trained fault event positioning model, outputting the abnormality probability of each basic item from the trained fault event positioning model, and screening the abnormal item of the fault event from the basic items according to the abnormality probabilities, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the decision tree obtained through training.
The method, the apparatus, the computer device and the storage medium for processing fault data comprise the following steps: acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries the item identifier of the associated item and the hierarchy information of the item; inputting the fault logs of all the associated items into a trained fault event positioning model, outputting the abnormality probability of each basic item from the trained fault event positioning model, and screening the abnormal item of the fault event from the basic items according to the abnormality probabilities, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the decision tree obtained through training. A decision tree model is constructed from the fault logs of fault events, and the constructed decision tree model is used to locate the cause of a new fault event, thereby improving the troubleshooting efficiency for fault events.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is an application environment diagram of a method of processing fault data in one embodiment;
FIG. 2 is a flow diagram of a method of processing fault data in one embodiment;
FIG. 3 is a schematic diagram of a micro-service architecture in one embodiment;
FIG. 4 is a block diagram of an apparatus for processing fault data in one embodiment;
fig. 5 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
FIG. 1 is an application environment diagram of a method of processing fault data in one embodiment. Referring to fig. 1, the method of processing fault data is applied to a system for processing fault data. The system for processing fault data includes a terminal 110 and a server 120, which are connected through a network. The terminal 110 or the server 120 acquires a current fault event and acquires a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries the item identifier of the associated item and the hierarchy information of the item. The fault logs of the associated items are input into a trained fault event positioning model, the abnormality probability of each basic item is output from the trained fault event positioning model, and the abnormal item of the fault event is screened from the basic items according to the abnormality probabilities. The trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the trained decision tree model.
The terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 120 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
As shown in FIG. 2, in one embodiment, a method of processing fault data is provided. The present embodiment is mainly exemplified by the application of the method to the terminal 110 (or the server 120) in fig. 1. Referring to fig. 2, the method for processing fault data specifically includes the following steps:
step S201, a current fault event is obtained, and a fault log of at least one preset association item is obtained according to the current fault event.
In this particular embodiment, the fault log for each associated item carries the item identification of the associated item and the hierarchical information of the item.
Specifically, the current fault event refers to a fault event that has occurred. A fault event refers to an event in which an abnormality occurs in a server, a terminal, an application program, a network, or the like. The preset associated items refer to items associated with the occurrence of the fault event, namely items affected by the fault event. The item identifier uniquely identifies an item, and the hierarchy information of an item identifies its level. Calling and called relations exist between different items: an item at a higher level can call one or more items at lower levels, levels can be skipped when lower-level items are called, and items at the same level can also call each other. For example, with items of four levels, a first-level item can directly call first-level, second-level, third-level and fourth-level items, and a second-level item can directly call second-level, third-level and fourth-level items; the specific item calls are set according to actual service requirements.
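The hierarchy and call relations described above can be pictured with a small sketch. The following Python snippet is purely illustrative (the Item class, the level numbering and the can_call helper are assumptions, not part of the application); it encodes the rule that an item may call items at its own level or at any lower level, including across skipped levels.

from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    item_id: str   # item identifier carried in the fault log
    level: int     # hierarchy information: 1 = first level, larger = lower level

def can_call(caller: Item, callee: Item) -> bool:
    # Per the description above: an item may call items at its own level or at
    # any lower level, so calls never go from a lower level to a higher one.
    return callee.level >= caller.level

d = Item("D", 1)           # first-level item
a = Item("A", 2)           # second-level item
print(can_call(d, a))      # True  - a first-level item may call a second-level item
print(can_call(a, d))      # False - the reverse direction is not part of the scheme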
In one embodiment, the underlying base items may be items built according to the requirements, or items invoked from a third party according to the requirements.
Step S202, inputting fault logs of all the associated items into a trained fault event positioning model, outputting abnormal probabilities of all the basic items in the trained fault event positioning model, and screening abnormal items of the fault event from the abnormal probabilities of all the basic items according to the abnormal probabilities.
In this embodiment, the trained fault event localization model is a decision tree model, and the underlying item is a leaf node of the trained decision tree model.
Specifically, the trained fault event positioning model is a decision tree model; a decision tree is a graphical method that intuitively applies probability analysis. Here the decision tree is a prediction model that represents the mapping relationship between a fault event and its fault cause. A decision tree is a tree structure (which may be a binary tree or a non-binary tree): each non-leaf node represents a test on a feature attribute, each branch represents the output of that feature attribute over a range of values, and each leaf node stores a class. The decision process using a decision tree starts from the root node, tests the corresponding feature attribute of the item to be classified, and selects the outgoing branch according to the attribute value, until a leaf node is reached; the class stored at that leaf node is taken as the decision result. The probability that the fault event originates in each base item is the abnormality probability of that base item.
After a fault event occurs, the fault logs related to the fault event are input into the trained fault event positioning model, analyzed through the nodes of the model, and a final analysis result, namely the abnormality probability of each base item, is output. The greater the abnormality probability of an item, the more likely the fault was caused by that item. The base items corresponding to the fault event are screened from all base items according to the magnitude of the abnormality probability and taken as the abnormal items; the abnormal items are then checked to determine the cause of the fault event.
In one embodiment, items with high abnormality probabilities are checked preferentially: either the base items whose abnormality probability is larger than a preset threshold probability are checked, or the items are checked one by one in descending order of abnormality probability.
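As a rough illustration of this inference and screening step, the following sketch assumes a trained decision tree whose leaves store abnormality probabilities per base item; the Node structure, the feature names and the probability values are hypothetical, not the application's implementation.

from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Node:
    feature: Optional[str] = None                                # attribute tested at a non-leaf node
    children: Dict[str, "Node"] = field(default_factory=dict)    # one branch per attribute value
    leaf_probs: Optional[Dict[str, float]] = None                # base item -> abnormality probability

def locate(node: Node, features: Dict[str, str]) -> Dict[str, float]:
    # Walk from the root, following the branch that matches each tested feature,
    # until a leaf is reached; the leaf holds the abnormality probabilities.
    while node.leaf_probs is None:
        value = features.get(node.feature, "")
        node = node.children.get(value) or next(iter(node.children.values()))
    return node.leaf_probs

def screen(probs: Dict[str, float], threshold: float = 0.5) -> List[str]:
    # Base items whose abnormality probability exceeds the preset threshold,
    # ordered from the most likely to the least likely cause.
    return sorted((item for item, p in probs.items() if p > threshold),
                  key=lambda item: probs[item], reverse=True)

# Tiny hand-built tree with invented probabilities, only to show the flow.
root = Node(feature="error_type", children={
    "timeout":  Node(leaf_probs={"qifang-basic": 0.7, "thirdparty-oa": 0.2}),
    "db_error": Node(leaf_probs={"qifang-basic": 0.1, "thirdparty-oa": 0.8}),
})
print(screen(locate(root, {"error_type": "timeout"})))   # ['qifang-basic']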
The method for processing fault data comprises the following steps: acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries the item identifier of the associated item and the hierarchy information of the item; inputting the fault logs of all the associated items into a trained fault event positioning model, outputting the abnormality probability of each basic item from the trained fault event positioning model, and screening the abnormal item of the fault event from the basic items according to the abnormality probabilities, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the decision tree obtained through training. A decision tree model is constructed from the fault logs of fault events, and the constructed decision tree model is used to locate the cause of a new fault event, thereby improving the troubleshooting efficiency for fault events.
In one embodiment, the method for processing fault data includes:
step S301, a plurality of historical fault events are acquired.
Step S302, obtaining a fault log of a real associated item of each historical fault event to obtain a first fault log.
In this specific embodiment, the first fault log carries the item identifier and the hierarchy information of each real associated item, as well as the item identifier and the corresponding hierarchy information of the item that calls that real associated item.
Specifically, a historical fault event refers to a fault event that has already occurred, and a real associated item of a fault event refers to an item whose association with that event has been confirmed. The set of fault logs of the real associated items corresponding to the fault event is taken as the first fault log. Each fault log carries the item identifier and hierarchy information of the corresponding real associated item, as well as the item identifier and corresponding hierarchy information of the item that calls it. For example, suppose the real associated items include second-level items A, B and C, where item A is called by first-level item D, and items B and C are called by first-level item F. The fault log then carries the item identifiers and hierarchy information of items A, B and C, together with the item identifiers and hierarchy information of the calling items D and F.
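One possible layout of such a first-fault-log record is sketched below for the example above (items A, B and C, called by D and F). The field names and the error text are assumptions made for illustration only.

from dataclasses import dataclass
from typing import List

@dataclass
class FaultLogEntry:
    item_id: str        # item identifier of the real associated item
    item_level: int     # hierarchy information of that item
    caller_id: str      # item identifier of the item that calls it
    caller_level: int   # hierarchy information of the calling item
    error_log: str      # raw fault text recorded by the item (sample text below is invented)

first_fault_log: List[FaultLogEntry] = [
    FaultLogEntry("A", 2, "D", 1, "timeout while handling request"),
    FaultLogEntry("B", 2, "F", 1, "database connection refused"),
    FaultLogEntry("C", 2, "F", 1, "null response from downstream"),
]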
Step S303, training a first decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained first decision tree.
Step S304, determining whether the numerical difference between the number of candidate basic items in the trained first decision tree and the preset numerical value is within the preset difference interval.
In this embodiment, the preset value is the number of real basic items.
In step S305, when the numerical difference is within the preset difference interval, the trained first decision tree is used as the trained fault event localization model.
Specifically, the preset classification tree algorithm is a predefined algorithm for constructing a classification tree, i.e., a common decision tree modeling algorithm; common algorithms for constructing a classification tree include the ID3 algorithm, the C4.5 algorithm, the CART algorithm, and the like. The ID3 algorithm uses information gain as the purity measure and splits on the feature that maximizes the information gain. Information entropy represents the complexity (uncertainty) of a random variable, conditional entropy represents the complexity (uncertainty) of a random variable under a given condition, and the information gain is the difference between the information entropy and the conditional entropy. The C4.5 algorithm splits on the feature that maximizes the information gain ratio. The CART algorithm uses the GINI value as the basis for node splitting; when CART is used as a regression tree, the minimum sample variance can be used as the splitting criterion instead. A trained first decision tree is generated using such a common classification tree algorithm. The number of candidate basic items of the trained first decision tree refers to its number of leaf nodes. Whether the number of leaf nodes matches the number of real basic items, i.e., whether the decision tree can effectively localize faults from the logs, is judged by checking whether the numerical difference between the number of candidate basic items in the trained first decision tree and the preset value lies within the preset difference interval; when it does, the trained first decision tree is taken as the trained fault event positioning model. A real basic item refers to a lowest-level item among the actual items.
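A hedged sketch of this training-and-checking step is shown below. It assumes scikit-learn's DecisionTreeClassifier (a CART-style implementation) as the preset classification tree algorithm and a feature matrix already extracted from the first fault logs; the function name and the tolerance value are illustrative, not the application's exact procedure.

from sklearn.tree import DecisionTreeClassifier

def train_first_tree(X, y, n_real_base_items, max_diff=2):
    # X: feature matrix derived from the first fault logs;
    # y: the base item labelled as the cause of each historical fault event.
    tree = DecisionTreeClassifier(criterion="gini")   # CART-style splitting
    tree.fit(X, y)
    # Candidate base items = leaf nodes; accept the tree only if their count is
    # within the preset difference interval around the number of real base items.
    diff = abs(tree.get_n_leaves() - n_real_base_items)
    return tree, diff <= max_diff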
In one embodiment, before training the first decision tree according to the fault logs of the items and the preset classification tree algorithm to obtain the trained first decision tree, the method further comprises: extracting key information from the fault logs of the items, wherein the type of key information to be extracted can be customized, namely the information to be analyzed can be defined according to requirements, such as time information, fault type, fault prompt information and the like.
In one embodiment, the method for processing fault data includes:
step S306, when the value difference is not within the preset difference interval, the trained first decision tree is adjusted according to the value difference to obtain a second decision tree.
Step S307, training a second decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained second decision tree, and taking the trained second decision tree as a trained fault event positioning model until the numerical difference between the number of candidate basic items in the trained second decision tree and the preset numerical value is within a preset difference value interval.
In one embodiment, adjusting the trained first decision tree according to the numerical difference to obtain a second decision tree comprises: pruning the trained first decision tree according to a preset pruning rule when the numerical difference is larger than the preset value, to obtain the second decision tree.
Specifically, when the numerical difference is not within the preset difference interval, the constructed decision tree deviates substantially from the decision tree that the real requirements call for. In particular, when the number of leaf nodes is greater than the number of real base items, the decision tree needs to be pruned. A common decision tree pruning algorithm is used for pruning, that is, branch and leaf nodes of the decision tree are removed to obtain a simpler decision tree. Pruning methods fall into two kinds: pre-pruning and post-pruning. If no pruning is performed while growing the tree, the tree grows until each leaf contains a single class; this fits the training set completely but performs poorly on the test set and does not generalize. It is therefore necessary to cut off some branches and leaves so that the model generalizes better.
According to when the pruning is performed, pruning methods are divided into pre-pruning and post-pruning: pre-pruning is performed while the decision tree is being generated, and post-pruning is performed after the decision tree has been generated. Common post-pruning algorithms include error-based pruning, cost-complexity-based pruning, minimal-error-based pruning, and the like. The trained first decision tree is pruned to obtain a second decision tree, the second decision tree is trained with the fault logs to obtain a trained second decision tree, and whether the numerical difference between the number of candidate basic items in the trained second decision tree and the preset value lies within the preset difference interval is judged; when it does, the trained second decision tree is taken as the trained fault event positioning model.
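The adjustment step can be sketched, for example, with cost-complexity post-pruning, one of the post-pruning methods mentioned above. The snippet below assumes scikit-learn and retrains with increasing pruning strength until the leaf count falls within the preset difference interval; it is an illustration under these assumptions, not the application's preset pruning rule.

from sklearn.tree import DecisionTreeClassifier

def prune_to_target(X, y, n_real_base_items, max_diff=2):
    # Compute the cost-complexity pruning path, then retrain with increasing
    # pruning strength (ccp_alpha) until the number of leaves (candidate base
    # items) falls within the preset difference interval.
    path = DecisionTreeClassifier().cost_complexity_pruning_path(X, y)
    for alpha in path.ccp_alphas:
        tree = DecisionTreeClassifier(ccp_alpha=alpha).fit(X, y)
        if abs(tree.get_n_leaves() - n_real_base_items) <= max_diff:
            return tree                                  # trained second decision tree
    # Fall back to the most heavily pruned tree if no alpha hits the interval.
    return DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-1]).fit(X, y)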
In a specific embodiment, the items are deployed with a micro-service architecture, that is, the services call one another, as shown in fig. 3, which shows the calling relationships between the items. For example, all the items are divided into three levels: the first level includes mp and qifang, the second level includes mcn, contract, reward and renzheng, and the third level includes qifang-mcn, qifang-contract, thirdparty-oa, thirdparty-puyu, qifang-reward, qifang-basic and thirdmaster-else, where the thirdparty services are services provided by third parties. The services called by mp include all services of the second level. The services called by qifang include qifang-mcn, qifang-contract, qifang-reward, qifang-basic and thirdmaster-else. The services called by mcn include qifang-mcn, qifang-contract, thirdparty-oa and qifang-basic. The services called by contract include qifang-contract, thirdparty-oa and qifang-basic. The services called by reward include thirdparty-puyu and qifang-reward. The services called by renzheng include qifang-basic and thirdmaster-else. The services called by qifang-mcn include qifang-mcn and qifang-contract. The services called by qifang-contract include qifang-mcn, thirdparty-oa, thirdparty-puyu and thirdmaster-else. The services called by qifang-reward include qifang-reward, qifang-basic and thirdmaster-else.
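Written out as data, the calling relationships of fig. 3 form a simple adjacency map (spellings normalized to the third-level list above). The dictionary and the downstream helper below are an illustrative representation only; the helper merely shows one way the preset associated items of a faulty service could be gathered.

# Caller -> callees, as described for FIG. 3.
CALLS = {
    "mp":              ["mcn", "contract", "reward", "renzheng"],
    "qifang":          ["qifang-mcn", "qifang-contract", "qifang-reward",
                        "qifang-basic", "thirdmaster-else"],
    "mcn":             ["qifang-mcn", "qifang-contract", "thirdparty-oa", "qifang-basic"],
    "contract":        ["qifang-contract", "thirdparty-oa", "qifang-basic"],
    "reward":          ["thirdparty-puyu", "qifang-reward"],
    "renzheng":        ["qifang-basic", "thirdmaster-else"],
    "qifang-mcn":      ["qifang-mcn", "qifang-contract"],
    "qifang-contract": ["qifang-mcn", "thirdparty-oa", "thirdparty-puyu", "thirdmaster-else"],
    "qifang-reward":   ["qifang-reward", "qifang-basic", "thirdmaster-else"],
}

def downstream(item, calls=CALLS, seen=None):
    # All services reachable from `item` through the call graph - one simple way
    # to decide which associated items' fault logs to collect for a fault event.
    seen = set() if seen is None else seen
    for callee in calls.get(item, []):
        if callee not in seen:
            seen.add(callee)
            downstream(callee, calls, seen)
    return seen

print(sorted(downstream("reward")))
# ['qifang-basic', 'qifang-reward', 'thirdmaster-else', 'thirdparty-puyu']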
In one embodiment, when the number of item levels is less than or equal to 3, the items may be left without level labels, that is, the fault logs need not include the corresponding hierarchy information.
In one embodiment, when a problem occurs, the recent logs of the related items are first screened and input into the model; the model extracts and classifies the key log entries, and the initial position where the abnormality occurred is determined from the final classification result, so that the problem location is found quickly and can be handled in time. The fault logs of each item can be generalized as needed, and the key information to be extracted is as follows:
{ time, item name (item identifier), functionName, position, error log }
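A minimal sketch of extracting this key information from a raw log line is given below. The raw-line layout, the regular expression and the sample values are assumptions for illustration; a real project would adapt the pattern to its own log format.

import re

LOG_PATTERN = re.compile(
    r"^(?P<time>\S+ \S+) \[(?P<item_name>[^\]]+)\] "
    r"(?P<functionName>\S+) \((?P<position>[^)]+)\) ERROR (?P<error_log>.*)$"
)

def extract_key_info(raw_line: str) -> dict:
    match = LOG_PATTERN.match(raw_line)
    return match.groupdict() if match else {}

sample = ("2020-02-24 10:15:32 [qifang-basic] queryUser "
          "(UserService.java:88) ERROR connection pool exhausted")
print(extract_key_info(sample))
# {'time': '2020-02-24 10:15:32', 'item_name': 'qifang-basic',
#  'functionName': 'queryUser', 'position': 'UserService.java:88',
#  'error_log': 'connection pool exhausted'}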
Combined with actual projects, this root cause analysis approach extracts and models the fault logs, which effectively improves fault handling efficiency, reduces the troubleshooting workload of technical staff, and saves time in handling online problems; in particular, for faults involving many associated projects, the causes of fault events can be located faster. Rapid fault localization accelerates the handling of reported faults and improves user satisfaction.
FIG. 2 is a flow diagram of a method of processing fault data in one embodiment. It should be understood that, although the steps in the flowchart of fig. 2 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 2 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed in sequence but may be performed in turn or alternately with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 4, there is provided an apparatus 200 for processing fault data, comprising:
the data acquisition module 201 is configured to acquire a current fault event, and acquire a fault log of at least one preset associated item according to the current fault event, where the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item.
The fault location module 202 is configured to input fault logs of each associated item to a trained fault event location model, output abnormal probabilities of each base item in the trained fault event location model, screen abnormal items of a fault event from the abnormal probabilities of each base item according to the abnormal probabilities, and the trained fault event location model is a decision tree model, and the base items are leaf nodes of the trained decision tree model.
In one embodiment, the apparatus 200 for processing fault data further includes:
the model generation module is used for generating a trained fault event positioning model, wherein the model generation module comprises:
an event acquisition unit configured to acquire a plurality of historical fault events;
a log acquisition unit, configured to acquire the fault logs of the real associated items of each historical fault event to obtain a first fault log, wherein the first fault log carries the item identifier and the hierarchy information of each real associated item, as well as the item identifier and the corresponding hierarchy information of the item that calls the real associated item;
the training unit is used for training the first decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained first decision tree;
the judging unit is used for judging whether the numerical difference between the number of candidate basic items in the trained first decision tree and a preset numerical value is in a preset difference value interval, wherein the preset numerical value is the number of real basic items;
and the model determining unit is used for taking the trained first decision tree as a trained fault event positioning model when the numerical value difference is in the preset difference value interval.
In one embodiment, the model generation module further comprises:
the model adjusting unit is used for adjusting the trained first decision tree according to the numerical value difference to obtain a second decision tree when the numerical value difference is not in the preset difference value interval;
the model determining unit is further configured to execute training the second decision tree according to the fault log of each item and a preset classification tree algorithm, to obtain a trained second decision tree, and to use the trained second decision tree as a trained fault event positioning model until a numerical difference between the number of candidate basic items in the trained second decision tree and a preset numerical value is located in a preset difference interval.
In one embodiment, the model adjustment unit is specifically configured to prune the trained first decision tree according to a preset pruning rule to obtain the second decision tree when the value difference is greater than a preset value.
FIG. 5 illustrates an internal block diagram of a computer device in one embodiment. The computer device may specifically be the terminal 110 (or the server 120) in fig. 1. As shown in fig. 5, the computer device includes a processor, a memory, a network interface, an input device and a display screen connected through a system bus. The memory includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the method of processing fault data. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the method of processing fault data. The display screen of the computer device may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device may be a touch layer covering the display screen, a key, a trackball or a touchpad arranged on the housing of the computer device, or an external keyboard, touchpad, mouse or the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 5 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution of the present application is applied; a particular computer device may include more or fewer components than shown, combine some of the components, or have a different arrangement of components.
In one embodiment, the apparatus for processing fault data provided by the present application may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 5. The memory of the computer device may store various program modules that make up the apparatus for processing fault data, such as the data acquisition module 201 and the fault location module 202 shown in fig. 4. The computer program of each program module causes a processor to execute the steps of the method of processing fault data of each embodiment of the present application described in the present specification.
For example, the computer device shown in fig. 5 may perform obtaining a current fault event through the data obtaining module 201 in the device for processing fault data shown in fig. 4, and obtain, according to the current fault event, a fault log of at least one preset associated item, where the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item. The computer device may perform inputting fault logs of each associated item to a trained fault event positioning model through the fault positioning module 202, outputting an abnormal probability of each base item in the trained fault event positioning model, screening abnormal items of the fault event from the abnormal probabilities of each base item according to the abnormal probability, where the trained fault event positioning model is a decision tree model, and the base items are leaf nodes of the trained decision tree model.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of when executing the computer program: acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item; inputting fault logs of all the associated items to a trained fault event positioning model, outputting abnormal probability of all basic items in the trained fault event positioning model, screening abnormal items of fault events from the abnormal probability of all the basic items according to the abnormal probability, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the trained decision tree model.
In one embodiment, the processor when executing the computer program further performs the steps of: generating a trained fault event localization model comprising: acquiring a plurality of historical fault events; obtaining a fault log of a real association item of each historical fault event to obtain a first fault log, wherein the first fault log carries item identification of the real association item and hierarchy information of the real association item, and invokes the item identification of the real association item and the corresponding hierarchy information; training a first decision tree according to the fault logs of each item and a preset classification tree algorithm to obtain a trained first decision tree; judging whether the numerical difference between the number of candidate basic items in the trained first decision tree and a preset numerical value is in a preset difference value interval, wherein the preset numerical value is the number of real basic items; and when the numerical difference is in the preset difference interval, taking the trained first decision tree as a trained fault event positioning model.
In one embodiment, the processor when executing the computer program further performs the steps of: when the numerical value difference is not located in the preset difference value interval, the trained first decision tree is adjusted according to the numerical value difference to obtain a second decision tree; and executing the training of the second decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained second decision tree, and taking the trained second decision tree as a trained fault event positioning model until the numerical difference between the number of candidate basic items in the trained second decision tree and the preset numerical value is in a preset difference interval.
In one embodiment, adjusting the trained first decision tree according to the numerical difference results in a second decision tree, comprising: and pruning the trained first decision tree according to a preset pruning rule when the value difference is larger than a preset value to obtain a second decision tree.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item; inputting fault logs of all the associated items to a trained fault event positioning model, outputting abnormal probability of all basic items in the trained fault event positioning model, screening abnormal items of fault events from the abnormal probability of all the basic items according to the abnormal probability, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the trained decision tree model.
In one embodiment, the computer program when executed by the processor further performs the steps of: generating a trained fault event localization model comprising: acquiring a plurality of historical fault events; obtaining a fault log of a real association item of each historical fault event to obtain a first fault log, wherein the first fault log carries item identification of the real association item and hierarchy information of the real association item, and invokes the item identification of the real association item and the corresponding hierarchy information; training a first decision tree according to the fault logs of each item and a preset classification tree algorithm to obtain a trained first decision tree; judging whether the numerical difference between the number of candidate basic items in the trained first decision tree and a preset numerical value is in a preset difference value interval, wherein the preset numerical value is the number of real basic items; and when the numerical difference is in the preset difference interval, taking the trained first decision tree as a trained fault event positioning model.
In one embodiment, the computer program when executed by the processor further performs the steps of: when the numerical value difference is not located in the preset difference value interval, the trained first decision tree is adjusted according to the numerical value difference to obtain a second decision tree; and executing the training of the second decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained second decision tree, and taking the trained second decision tree as a trained fault event positioning model until the numerical difference between the number of candidate basic items in the trained second decision tree and the preset numerical value is in a preset difference interval.
In one embodiment, adjusting the trained first decision tree according to the numerical difference results in a second decision tree, comprising: and pruning the trained first decision tree according to a preset pruning rule when the value difference is larger than a preset value to obtain a second decision tree.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, where the program may be stored in a non-volatile computer readable storage medium, and where the program, when executed, may include processes in the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is only a specific embodiment of the application to enable those skilled in the art to understand or practice the application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method of processing fault data, the method comprising:
acquiring a current fault event, and acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item;
inputting fault logs of all the associated items to a trained fault event positioning model, outputting abnormal probability of all basic items in the trained fault event positioning model, and screening abnormal items of the fault event from the abnormal probability of all the basic items according to the abnormal probability, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the decision tree model obtained through training;
wherein generating the trained fault event localization model comprises:
acquiring a plurality of historical fault events;
obtaining a fault log of a real associated item of each historical fault event to obtain a first fault log, wherein the first fault log carries an item identifier of the real associated item and hierarchy information of the real associated item, and invokes the item identifier of the real associated item and the corresponding hierarchy information;
training a first decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained first decision tree;
judging whether the numerical difference between the number of candidate basic items in the trained first decision tree and a preset numerical value is in a preset difference interval, wherein the preset numerical value is the number of real basic items;
and when the numerical difference is in the preset difference interval, taking the trained first decision tree as the trained fault event positioning model.
2. The method according to claim 1, wherein the method further comprises:
when the numerical value difference is not located in the preset difference value interval, adjusting the trained first decision tree according to the numerical value difference to obtain a second decision tree;
and executing the training of the second decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained second decision tree, and taking the trained second decision tree as the trained fault event positioning model until the numerical difference between the number of candidate basic items in the trained second decision tree and the preset numerical value is in the preset difference interval.
3. The method of claim 2, wherein said adjusting the trained first decision tree based on the numerical difference results in a second decision tree, comprising:
and pruning the trained first decision tree according to a preset pruning rule when the numerical value difference is larger than the preset numerical value to obtain the second decision tree.
4. An apparatus for processing fault data, the apparatus comprising:
the data acquisition module is used for acquiring a current fault event, acquiring a fault log of at least one preset associated item according to the current fault event, wherein the fault log of each associated item carries an item identifier of the associated item and hierarchical information of the item;
the fault positioning module is used for inputting fault logs of all the associated items into a trained fault event positioning model, outputting abnormal probability of all basic items in the trained fault event positioning model, and screening abnormal items of the fault event from the abnormal probability of all the basic items according to the abnormal probability, wherein the trained fault event positioning model is a decision tree model, and the basic items are leaf nodes of the decision tree model obtained through training;
the model generation module is used for generating the trained fault event positioning model, wherein the model generation module comprises the following components:
an event acquisition unit configured to acquire a plurality of historical fault events;
the log acquisition unit is used for acquiring a fault log of a real associated item of each historical fault event to obtain a first fault log, wherein the first fault log carries an item identifier of the real associated item and hierarchy information of the real associated item, and invokes the item identifier of the real associated item and the corresponding hierarchy information;
the training unit is used for training the first decision tree according to the fault log of each item and a preset classification tree algorithm to obtain a trained first decision tree;
the judging unit is used for judging whether the numerical difference between the number of candidate basic items in the trained first decision tree and a preset numerical value is in a preset difference value interval or not, wherein the preset numerical value is the number of real basic items;
and the model determining unit is used for taking the trained first decision tree as the trained fault event positioning model when the numerical value difference is in the preset difference value interval.
5. The apparatus of claim 4, wherein the model generation module further comprises:
the model adjusting unit is used for adjusting the trained first decision tree according to the numerical value difference to obtain a second decision tree when the numerical value difference is not located in the preset difference value interval;
the model determining unit is further configured to perform training of the second decision tree according to the fault log of each item and a preset classification tree algorithm, to obtain a trained second decision tree, and use the trained second decision tree as the trained fault event positioning model until a numerical difference between the number of candidate basic items in the trained second decision tree and the preset numerical value is located in the preset difference interval.
6. The apparatus according to claim 5, wherein the model adjustment unit is specifically configured to prune the trained first decision tree according to a preset pruning rule to obtain the second decision tree when the numerical difference is greater than the preset numerical value.
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 3 when the computer program is executed by the processor.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 3.
CN202010113452.6A 2020-02-24 2020-02-24 Method, apparatus, computer device and storage medium for processing fault data Active CN111338836B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010113452.6A CN111338836B (en) 2020-02-24 2020-02-24 Method, apparatus, computer device and storage medium for processing fault data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010113452.6A CN111338836B (en) 2020-02-24 2020-02-24 Method, apparatus, computer device and storage medium for processing fault data

Publications (2)

Publication Number Publication Date
CN111338836A CN111338836A (en) 2020-06-26
CN111338836B true CN111338836B (en) 2023-09-01

Family

ID=71183638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010113452.6A Active CN111338836B (en) 2020-02-24 2020-02-24 Method, apparatus, computer device and storage medium for processing fault data

Country Status (1)

Country Link
CN (1) CN111338836B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308126A (en) * 2020-10-27 2021-02-02 深圳前海微众银行股份有限公司 Fault recognition model training method, fault recognition device and electronic equipment
CN112606779B (en) * 2020-12-24 2023-01-06 东风汽车有限公司 Automobile fault early warning method and electronic equipment
CN113515434A (en) * 2021-01-04 2021-10-19 腾讯科技(深圳)有限公司 Abnormity classification method, abnormity classification device, abnormity classification equipment and storage medium
CN113407374A (en) * 2021-06-22 2021-09-17 未鲲(上海)科技服务有限公司 Fault processing method and device, fault processing equipment and storage medium
CN113591477B (en) * 2021-08-10 2023-09-15 平安银行股份有限公司 Fault positioning method, device, equipment and storage medium based on associated data
CN113795032B (en) * 2021-09-26 2023-12-08 中国联合网络通信集团有限公司 Method and device for judging invisible faults of indoor division, storage medium and equipment
CN115225460B (en) * 2022-07-15 2023-11-28 北京天融信网络安全技术有限公司 Fault determination method, electronic device, and storage medium
CN115426276B (en) * 2022-08-22 2024-03-12 神华准格尔能源有限责任公司 Method for monitoring 5G major equipment of strip mine and cloud server
CN115601013A (en) * 2022-10-25 2023-01-13 中广核研究院有限公司(Cn) Nuclear reactor failure determination method, device, apparatus, storage medium, and product
CN115794479B (en) * 2023-02-10 2023-05-12 深圳依时货拉拉科技有限公司 Log data processing method and device, electronic equipment and storage medium
CN117573428B (en) * 2023-11-08 2024-05-07 安徽鼎甲计算机科技有限公司 Disaster recovery backup method, device, computer equipment and storage medium
CN117278383B (en) * 2023-11-21 2024-02-20 航天科工广信智能技术有限公司 Internet of things fault investigation scheme generation system and method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109634828A (en) * 2018-12-17 2019-04-16 浪潮电子信息产业股份有限公司 Failure prediction method, device, equipment and storage medium
CN110647446A (en) * 2018-06-26 2020-01-03 中兴通讯股份有限公司 Log fault association and prediction method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10936564B2 (en) * 2017-04-19 2021-03-02 Xerox Corporation Diagnostic method and system utilizing historical event logging data

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647446A (en) * 2018-06-26 2020-01-03 中兴通讯股份有限公司 Log fault association and prediction method, device, equipment and storage medium
CN109634828A (en) * 2018-12-17 2019-04-16 浪潮电子信息产业股份有限公司 Failure prediction method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111338836A (en) 2020-06-26

Similar Documents

Publication Publication Date Title
CN111338836B (en) Method, apparatus, computer device and storage medium for processing fault data
CN109032829B (en) Data anomaly detection method and device, computer equipment and storage medium
Chien et al. Analysing semiconductor manufacturing big data for root cause detection of excursion for yield enhancement
CN110489520B (en) Knowledge graph-based event processing method, device, equipment and storage medium
CN111176990B (en) Test data generation method and device based on data decision, and computer equipment
CN110245034B (en) Service metric analysis based on structured log patterns of usage data
CN109032824A (en) Database method of calibration, device, computer equipment and storage medium
CN108563734A (en) Institutional information querying method, device, computer equipment and storage medium
CN112231224A (en) Business system testing method, device, equipment and medium based on artificial intelligence
US20230086863A1 (en) Systems and methods for intelligent cybersecurity alert similarity detection and cybersecurity alert handling
Senthan et al. Development of churn prediction model using xgboost-telecommunication industry in sri lanka
CN114679341A (en) Network intrusion attack analysis method, equipment and medium combined with ERP system
binti Oseman et al. Data mining in churn analysis model for telecommunication industry
CN110838940A (en) Underground cable inspection task configuration method and device
Eken et al. Predicting defects with latent and semantic features from commit logs in an industrial setting
CN114462859A (en) Workflow processing method and device, computer equipment and storage medium
Sun et al. An automatic test sequence generation method based on Markov chain model
CN114490415A (en) Service testing method, computer device, storage medium, and computer program product
Tamura et al. Fault identification tool based on deep learning for fault big data
EP4303777A1 (en) Method and system for exception management
CN115499289B (en) Equipment state evaluation early warning method and system
Wu et al. Revisting the Impact of Regression Models for Predicting the Number of Defects.
US11936672B2 (en) Systems and methods for intelligently generating cybersecurity contextual intelligence and generating a cybersecurity intelligence interface
CN115829543B (en) Method for determining validity of preventive test of power equipment based on fault detection interval
CN116578583B (en) Abnormal statement identification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant