CN112035286A - Method and device for determining fault cause, storage medium and electronic device - Google Patents


Publication number
CN112035286A
Authority
CN
China
Prior art keywords
program
probability
fault
network model
classification network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010866416.7A
Other languages
Chinese (zh)
Inventor
李本浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haier Uplus Intelligent Technology Beijing Co Ltd
Original Assignee
Haier Uplus Intelligent Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Haier Uplus Intelligent Technology Beijing Co Ltd filed Critical Haier Uplus Intelligent Technology Beijing Co Ltd
Priority to CN202010866416.7A priority Critical patent/CN112035286A/en
Publication of CN112035286A publication Critical patent/CN112035286A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification

Abstract

The invention provides a method and a device for determining a fault cause, a storage medium and an electronic device. The method comprises: inputting the characteristic attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the characteristic attributes corresponding to different categories, wherein the classification network model is trained through machine learning using multiple groups of data, and each group of data comprises a characteristic attribute and the probability values of the different categories corresponding to that characteristic attribute; and taking the target category corresponding to the maximum of the determined probability values as the fault cause of the program fault. This technical scheme addresses problems in the related art such as slow response when a program fault occurs and the inability to find the fault cause in time.

Description

Method and device for determining fault cause, storage medium and electronic device
Technical Field
The present invention relates to the field of communications, and in particular, to a method and an apparatus for determining a cause of a fault, a storage medium, and an electronic apparatus.
Background
For a program, the most pressing requirement is high availability: the service a program provides faces users directly, and if the program is unavailable, the user experience suffers. In the prior art, abnormal conditions such as unavailable server resources or table locking are discovered only once the program has already become abnormal, by which time the fault has occurred. In addition, most existing solutions rely on manually checking error logs and troubleshooting. The same error log can sometimes arise from different causes, and the correct cause can be found only by analyzing the current scenario. This may take a lot of time, during which the program's service is unavailable, and can greatly affect the customer experience.
No effective technical solution has yet been proposed for the problems in the related art such as slow response when a program fails and the inability to find the fault cause in time.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining a fault reason, a storage medium and an electronic device, which are used for at least solving the problems that in the related art, the response is slow when a program fails, the fault reason of the program failure cannot be found out in time and the like.
According to an embodiment of the present invention, there is provided a method for determining a cause of a failure, including: inputting the characteristic attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the characteristic attributes corresponding to different categories, wherein the classification network model is trained through machine learning using multiple groups of data, and each group of data comprises a characteristic attribute and the probability values of the different categories corresponding to that characteristic attribute; and taking the target category corresponding to the maximum of the determined probability values as the fault cause of the program fault.
Optionally, before inputting the feature attributes acquired during the program failure into the classification network model to obtain a plurality of probability values of different classes corresponding to the feature attributes, the method further includes: configuring a category library for the classification network model, wherein the category library comprises a plurality of categories corresponding to the characteristic attributes; and training the classification network model according to the corresponding characteristic attribute when the program fails and a plurality of classes selected from the class library and corresponding to the program failures.
Optionally, inputting the feature attributes acquired during the program failure into a classification network model to obtain a plurality of probability values corresponding to different classes of the feature attributes, including: a determining step, comprising: determining a first probability, a second probability and a third probability of each category according to the classification network model, wherein the first probability is the probability that the characteristic attribute exists under the condition that the target category occurs, the second probability is the probability that the target category occurs, and the third probability is the probability that the characteristic attribute exists; determining a product of the first probability and the second probability, and taking a ratio of the product to the third probability as a probability value of the target category; and circularly executing the determining steps to determine a plurality of probability values of the different classes.
Optionally, after the target category corresponding to the maximum probability value in the multiple determined probability values is used as the fault cause of the program fault, the method further includes: selecting a fault processing record corresponding to the fault reason according to a fault processing record of a program; and processing the program fault according to the fault processing record.
Optionally, processing the program fault according to the fault processing record includes: when the program fault is not successfully processed according to the fault processing record, taking the target category corresponding to the second-largest probability value among the plurality of probability values as the fault cause of the program fault.
Optionally, in a case that the program fault is successfully processed according to the fault processing record, the method further includes: storing the fault processing record and the fault cause of the program fault in correspondence in a target device running the program.
According to an embodiment of the present invention, there is provided an apparatus for determining a cause of a failure, including: a first processing module, configured to input the characteristic attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the characteristic attributes corresponding to different categories, wherein the classification network model is trained through machine learning using multiple groups of data, and each group of data comprises a characteristic attribute and the probability values of the different categories corresponding to that characteristic attribute; and a determining module, configured to take the target category corresponding to the maximum of the determined probability values as the fault cause of the program fault.
Optionally, the apparatus further comprises: the configuration module is used for configuring a category library for the classification network model, wherein the category library comprises a plurality of categories corresponding to the characteristic attributes; and training the classification network model according to the corresponding characteristic attribute when the program fails and a plurality of classes selected from the class library and corresponding to the program failures.
Optionally, the first processing module is further configured to perform a determining step, where the determining step includes: determining a first probability, a second probability and a third probability of each category according to the classification network model, wherein the first probability is the probability that the characteristic attribute exists under the condition that the target category occurs, the second probability is the probability that the target category occurs, and the third probability is the probability that the characteristic attribute exists; determining a product of the first probability and the second probability, and taking a ratio of the product to the third probability as a probability value of the target category; and circularly executing the determining steps to determine a plurality of probability values of the different classes.
Optionally, the apparatus further comprises: the selection module is used for selecting the fault processing record corresponding to the fault reason according to the fault processing record of the program; and the second processing module is used for processing the program fault according to the fault processing record.
Optionally, the second processing module is further configured to, when the program fault is not successfully processed according to the fault processing record, take the target category corresponding to the second-largest probability value among the plurality of probability values as the fault cause of the program fault.
Optionally, the second processing module is further configured to, when the program fault is successfully processed according to the fault processing record, store the fault processing record and the fault cause of the program fault in correspondence in a target device running the program.
According to another embodiment of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, the characteristic attributes acquired when a program fault occurs are input into the classification network model to obtain a plurality of probability values of the characteristic attributes corresponding to different categories, wherein the classification network model is trained through machine learning using multiple groups of data, and each group of data comprises a characteristic attribute and the probability values of the different categories corresponding to that characteristic attribute; the target category corresponding to the maximum of the probability values is then taken as the fault cause of the program fault. In other words, the fault cause is read directly from the probability values output by the classification network model, which shortens the time needed to handle the program fault.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a block diagram of a hardware configuration of a computer terminal of a method for determining a cause of a fault according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of determining a cause of a fault according to an embodiment of the invention;
FIG. 3 is a process flow diagram of a classification network model for program failure handling and prediction based on naive Bayes classification in accordance with an alternative embodiment of the present invention;
FIG. 4 is a block diagram (I) of the structure of the apparatus for determining the cause of a failure according to the embodiment of the present invention;
FIG. 5 is a block diagram (II) of the structure of the apparatus for determining the cause of a failure according to the embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method provided by the embodiment of the application can be executed in a mobile terminal, a computer terminal or a similar operation device. Taking the example of the method running on a computer terminal, fig. 1 is a hardware structure block diagram of the computer terminal of the method for determining the cause of the failure according to the embodiment of the present invention. As shown in fig. 1, the computer terminal may include one or more (only one shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data, and optionally, a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the computer terminal.
For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration with equivalent functionality to that shown in FIG. 1 or with more functionality than that shown in FIG. 1. The memory 104 may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the method for determining the cause of the failure in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to a computer terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
An embodiment of the present invention provides a method for determining a cause of a fault, which is applied to the above-mentioned computer terminal, and fig. 2 is a flowchart of the method for determining a cause of a fault according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
Step S202: inputting the characteristic attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the characteristic attributes corresponding to different categories, wherein the classification network model is trained through machine learning using multiple groups of data, and each group of data comprises a characteristic attribute and the probability values of the different categories corresponding to that characteristic attribute;
Step S204: taking the target category corresponding to the maximum of the determined probability values as the fault cause of the program fault.
Through the above steps, the characteristic attributes acquired when a program fault occurs are input into a classification network model to obtain a plurality of probability values of the characteristic attributes corresponding to different categories, wherein the classification network model is trained through machine learning using multiple groups of data, and each group of data comprises a characteristic attribute and the probability values of the different categories corresponding to that characteristic attribute; the target category corresponding to the maximum of the probability values is then taken as the fault cause of the program fault. In other words, the fault cause is read directly from the probability values output by the classification network model, which shortens the time needed to handle the program fault.
It should be noted that, the feature attributes input into the classification network model may be one or multiple, which are determined according to the training process of the classification network model and can be flexibly transformed, and this is not limited in the embodiments of the present invention.
In step S202, there are multiple implementation manners for the training manner of the classification network model, and optionally, a category library is configured for the classification network model, where the category library includes multiple categories corresponding to the feature attributes; and training the classification network model according to the corresponding characteristic attribute when the program fails and a plurality of classes selected from the class library and corresponding to the program failures.
In other words, in order to facilitate the identification of multiple classes by the classification network model and the class distinction of different program faults, when the classification network model is trained, the class library is configured for the classification network model, and then multiple classes corresponding to the program faults are obtained from the class library to train the classification network model.
Optionally, inputting the feature attributes acquired during the program failure into a classification network model to obtain a plurality of probability values corresponding to different classes of the feature attributes, including: a determining step, comprising: determining a first probability, a second probability and a third probability of each category according to the classification network model, wherein the first probability is the probability that the characteristic attribute exists under the condition that the target category occurs, the second probability is the probability that the target category occurs, and the third probability is the probability that the characteristic attribute exists; determining a product of the first probability and the second probability, and taking a ratio of the product to the third probability as a probability value of the target category; and circularly executing the determining steps to determine a plurality of probability values of the different classes.
In short, to make the determined probability values more accurate, when the probability values of the different categories corresponding to the characteristic attributes are obtained through the classification network model, the model is used to determine: a first probability that the characteristic attribute exists given that the target category occurs; a second probability that the target category occurs when the program fails; and a third probability that the characteristic attribute exists. Then, combining conditional probability with Bayes' formula, the first probability is multiplied by the second probability and the product is divided by the third probability, and the resulting ratio is taken as the probability value of the target category. This operation is executed in a loop to determine, one by one, the probability values of the different categories.
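The ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `class_posteriors`, the category names and all numbers are hypothetical, and the third probability is obtained here by summing over the categories, which assumes the categories are mutually exclusive and cover every possible cause.

```python
def class_posteriors(likelihoods, priors):
    """Return P(category | attributes) for every category via Bayes' formula.

    likelihoods[c] is the first probability, P(attributes | c).
    priors[c] is the second probability, P(c).
    The third probability, P(attributes), is computed by the law of total
    probability (an assumption: categories are exclusive and exhaustive).
    """
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    if evidence == 0:
        raise ValueError("attributes have zero probability under every category")
    # The "determining step" is repeated once per category.
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


# Hypothetical numbers, for illustration only.
likelihoods = {"insufficient_server_resources": 0.30, "program_deadlock": 0.10}
priors = {"insufficient_server_resources": 0.40, "program_deadlock": 0.60}

posteriors = class_posteriors(likelihoods, priors)
fault_cause = max(posteriors, key=posteriors.get)  # category with the largest value
```

Dividing by the third probability does not change which category is largest, but it makes the values sum to 1 so that they can be compared across different faults.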
Optionally, after the target category corresponding to the maximum probability value in the multiple determined probability values is used as the fault cause of the program fault, the method further includes: selecting a fault processing record corresponding to the fault reason according to a fault processing record of a program; and processing the program fault according to the fault processing record.
That is, to handle a program fault more efficiently once its cause has been determined, after the classification network model confirms the target category of the program fault, a fault processing record whose fault cause is the same as or similar to that of the target category is selected from the target device running the program, and the program fault is processed according to that record, which speeds up the response to the program fault.
Optionally, processing the program fault according to the fault processing record includes: when the program fault is not successfully processed according to the fault processing record, taking the target category corresponding to the second-largest probability value among the plurality of probability values as the fault cause of the program fault.
Because program faults are diverse, the selected fault processing record cannot always resolve the program fault. In that case, the target category corresponding to the second-largest probability value, that is, the value ranked immediately below the current maximum among the probability values determined by the classification network model, is taken as the fault cause of the program fault, and the fault processing record corresponding to that category is then selected from the fault processing records to process the program fault.
Optionally, in a case that the program fault is successfully processed according to the fault processing record, the method further includes: and correspondingly storing the fault processing record and the program fault reason in a target device for running a program.
That is, after the program fault has been successfully processed using the fault processing record, the fault processing record and the fault cause of the program fault are saved in correspondence in the target device running the program, so that the same program fault can be handled quickly the next time it occurs.
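The handling flow described above (try the most probable cause, fall back to the next one on failure, and store the pair that worked) might be sketched as follows. All names here (`handle_fault`, `apply_record`, `saved_records`) are hypothetical. The text only describes falling back to the second-largest value; continuing further down the ranking and returning `None` when nothing works are assumptions of this sketch.

```python
def handle_fault(posteriors, handling_records, apply_record, saved_records):
    """Try candidate causes from most to least probable until one is resolved.

    posteriors: dict mapping cause -> probability from the classifier.
    handling_records: dict mapping cause -> stored handling procedure.
    apply_record: callable(record) -> bool, True if the fault was resolved.
    saved_records: dict updated with the (cause, record) pair that worked.
    """
    ranked = sorted(posteriors, key=posteriors.get, reverse=True)
    for cause in ranked:
        record = handling_records.get(cause)
        if record is None:
            continue  # no known procedure for this cause, try the next one
        if apply_record(record):
            saved_records[cause] = record  # keep the pair for the next occurrence
            return cause
    return None  # assumption: the caller escalates to manual troubleshooting


# Hypothetical data: the first candidate fails, the second one succeeds.
posteriors = {"insufficient_server_resources": 0.55, "program_deadlock": 0.45}
records = {
    "insufficient_server_resources": "scale out",
    "program_deadlock": "restart worker",
}
saved = {}
resolved = handle_fault(posteriors, records, lambda r: r == "restart worker", saved)
```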
In order to better understand the above determining process of the failure cause, the following description is made with reference to an alternative embodiment, but is not intended to limit the technical solution of the embodiment of the present invention.
The optional embodiment of the present invention is based on the naive Bayes algorithm. Naive Bayes classification is the simplest form of Bayes classification and a commonly used classification method. Mathematically, the classification problem can be defined as follows: given the sets $C = \{y_1, y_2, \ldots, y_n\}$ and $I = \{x_1, x_2, \ldots, x_n\}$, determine a mapping rule $y = f(x)$ such that for any $x \in I$ there is one and only one $y \in C$ for which $y = f(x)$ holds. Here $C$ is called the class set, each element of which is a class; $I$ is called the item set (feature set), each element of which is an item to be classified; and $f$ is called the classifier. The task of a classification algorithm is to construct the classifier $f$.
Errors and on-site conditions occurring while a program runs (corresponding to program faults in the embodiments of the present invention) may be used as characteristic attributes of the program, and the different errors that occur each time may be used as a feature set. For example, the error log reported by each program, the time at which the error occurred, the resource information of the server when the error occurred, and the QoS (Quality of Service) of the program may all be used as characteristic attributes to form the feature set.
The specific reason why the program fails each time may be taken as a category (corresponding to a target category in the embodiment of the present invention). For example, scarce server resources, excessive concurrency and the like may form the set of categories.
To facilitate understanding of the classification network model in the embodiment of the present invention, the working process is now explained as follows:
(1) Let $D$ be the set of training tuples (equivalent to characteristic attributes in embodiments of the invention) and their associated class labels (equivalent to classes in embodiments of the invention). Each tuple is represented by an n-dimensional attribute vector $X = \{x_1, x_2, \ldots, x_n\}$.
(2) Assume that there are $m$ class labels $C_1, C_2, \ldots, C_m$. Given a tuple $X$, the classifier predicts that $X$ belongs to the class with the highest posterior probability. That is, naive Bayes classification predicts that $X$ belongs to class $C_i$ if and only if $P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le m$, $j \ne i$. The class $C_i$ for which $P(C_i \mid X)$ is largest is called the maximum a posteriori hypothesis. According to Bayes' theorem:
$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$
where $P(X \mid C_i)$ is the probability density of $X$ occurring given that the pattern belongs to class $C_i$, called the class-conditional probability density of $X$; $P(C_i)$ is the probability that class $C_i$ occurs in the recognition problem under study, also called the prior probability; and $P(X)$ is the probability density of the feature vector $X$.
(3) Since $P(X)$ is constant for all classes, only $P(X \mid C_i)\,P(C_i)$ needs to be maximized. If the prior probabilities of the classes are unknown, it is generally assumed that the classes are equally probable, i.e., $P(C_1) = P(C_2) = \ldots = P(C_m)$, and $P(X \mid C_i)$ is maximized accordingly; otherwise, $P(X \mid C_i)\,P(C_i)$ is maximized.
(4) Given a data set with many attributes, computing $P(X \mid C_i)$ is very expensive. To reduce the computational overhead, the naive assumption of class-conditional independence can be made: given the class label of a tuple, the attribute values are assumed to be conditionally independent of each other. Thus:
$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_n \mid C_i)$$
To compute $P(X \mid C_i)$, consider whether each attribute is categorical or continuous-valued, which gives the following two cases:
(a) If $A_k$ is a categorical attribute, then $P(x_k \mid C_i)$ is the number of tuples of class $C_i$ in $D$ whose attribute $A_k$ has the value $x_k$, divided by $|C_{i,D}|$, the number of tuples of class $C_i$ in $D$.
(b) If $A_k$ is a continuous-valued attribute, it is assumed to obey a Gaussian distribution with mean $\eta$ and standard deviation $\sigma$, defined by the following equation:
$$g(x, \eta, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - \eta)^2}{2\sigma^2}\right)$$
i.e. $P(x_k \mid C_i) = g(x_k, \eta_{C_i}, \sigma_{C_i})$.
(5) To predict the class label of $X$, $P(X \mid C_i)\,P(C_i)$ is computed for each class $C_i$. The classifier predicts that the class of the input tuple $X$ is $C_i$ if and only if $P(X \mid C_i)\,P(C_i) > P(X \mid C_j)\,P(C_j)$ for $1 \le j \le m$, $j \ne i$. That is, the predicted class label is the class $C_i$ for which $P(X \mid C_i)\,P(C_i)$ is largest.
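Case (b) of step (4) and the decision rule of step (5) can be illustrated with a short Python sketch for a single continuous-valued attribute. The attribute (a response time in milliseconds), the per-class means and standard deviations, and the equal priors are all hypothetical values chosen for illustration; they do not come from the patent.

```python
import math


def gaussian(x, mean, std):
    """Gaussian density g(x, mean, std), used as P(x_k | C_i) for a continuous attribute."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * std)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))


def predict(x, class_stats, priors):
    """Return the class maximizing P(x | C_i) * P(C_i) for one continuous attribute x."""
    scores = {c: gaussian(x, *class_stats[c]) * priors[c] for c in priors}
    return max(scores, key=scores.get)


# Hypothetical per-class (mean, std) of a response-time attribute, in ms.
class_stats = {
    "insufficient_server_resources": (900.0, 150.0),
    "program_deadlock": (3000.0, 500.0),
}
priors = {"insufficient_server_resources": 0.5, "program_deadlock": 0.5}

cause_fast = predict(1000.0, class_stats, priors)
cause_slow = predict(2800.0, class_stats, priors)
```

With several attributes, the per-attribute densities would be multiplied together under the independence assumption of step (4); summing logarithms instead of multiplying is a common way to avoid numerical underflow.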
An optional embodiment of the invention provides a classification network model, based on naive Bayes classification, for handling and predicting program faults. As shown in Fig. 3, the processing flow of the network model is as follows:
Step S302: preparation stage. The feature attributes of the program are determined, for example common feature attributes such as the program's error log, access time and QoS; what is to be predicted is made clear; and each feature attribute is appropriately partitioned. A portion of the data is then classified manually to form the training samples. This is the only stage of naive Bayes classification that must be completed manually, and its quality has an important influence on the whole process: the quality of the classifier is determined to a great extent by the feature attributes, the feature attribute partitions and the quality of the training samples.
Step S304: training stage. The inputs are the feature attributes and the training samples, and the output is the classifier. The classifier is generated at this stage; the main work is to calculate the occurrence frequency P(y_i) of each class in the training samples and the conditional probability of each feature attribute partition for each class.
Step S306: application stage. The classifier is used to classify new data; the inputs are the classifier and the new data, and the output is the classification result of the new data. P(x|y_i)P(y_i) is calculated for each class, and the class for which P(x|y_i)P(y_i) is largest is taken as the class to which x belongs.
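The three stages can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the attribute partitions ("oom", "timeout", "slow", "hang"), the class labels and the helper names `train` and `classify` are hypothetical and do not come from the patent.

```python
from collections import Counter, defaultdict

def train(samples):
    """Training stage: estimate P(y_i) and P(value | y_i) from labeled samples.

    samples -- list of (features, label) pairs, where features is a dict
               mapping attribute name -> partition value.
    """
    class_counts = Counter(label for _, label in samples)
    # value_counts[label][attribute][value] = number of matching samples
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in samples:
        for attr, value in features.items():
            value_counts[label][attr][value] += 1
    total = len(samples)
    priors = {c: n / total for c, n in class_counts.items()}
    cond = {
        c: {a: {v: n / class_counts[c] for v, n in vals.items()}
            for a, vals in attrs.items()}
        for c, attrs in value_counts.items()
    }
    return priors, cond

def classify(features, priors, cond):
    """Application stage: return the class with the largest P(x|y_i) P(y_i)."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for attr, value in features.items():
            score *= cond[c].get(attr, {}).get(value, 0.0)
        scores[c] = score
    return max(scores, key=scores.get)

# Preparation stage: a few manually labeled samples (made-up data).
samples = [
    ({"error_log": "oom", "access_time": "slow"}, "insufficient_resources"),
    ({"error_log": "oom", "access_time": "slow"}, "insufficient_resources"),
    ({"error_log": "timeout", "access_time": "slow"}, "insufficient_resources"),
    ({"error_log": "timeout", "access_time": "hang"}, "deadlock"),
    ({"error_log": "timeout", "access_time": "hang"}, "deadlock"),
]
priors, cond = train(samples)
result = classify({"error_log": "timeout", "access_time": "hang"}, priors, cond)
```

The sketch uses raw relative frequencies, so a value never seen with a class contributes a factor of 0 for that class.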
Note that P(x_k|C_i) is 0 when a value of an attribute does not appear in a class, and a single zero factor makes the whole product zero, even though without this zero probability a high probability that X belongs to class C_i might have been obtained. Laplace calibration (also called Laplace estimation) can be used to avoid this: it assumes that the training database D is so large that the change in the estimated probabilities caused by adding 1 to each count is negligible, which conveniently avoids probability values of 0.
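A small Python sketch of the Laplace correction follows. The source only states that 1 is added to each count; adding the number of distinct attribute values to the denominator, so that the estimates still sum to 1, is the usual form and is an assumption here, as are the function name and the counts.

```python
def laplace_estimate(count, class_total, num_values, alpha=1):
    """Smoothed estimate of P(x_k | C_i).

    count       -- tuples of class C_i whose attribute A_k has the value x_k
    class_total -- tuples of class C_i in D
    num_values  -- number of distinct values attribute A_k can take
    alpha       -- pseudo-count added to each count (1 for Laplace)
    """
    return (count + alpha) / (class_total + alpha * num_values)

# Made-up counts: 1000 tuples of one class, attribute values seen 0, 990 and 10 times.
counts = {"low": 0, "medium": 990, "high": 10}
raw = {v: n / 1000 for v, n in counts.items()}
smoothed = {v: laplace_estimate(n, 1000, len(counts)) for v, n in counts.items()}
```

Here the raw estimate for "low" is 0, while the smoothed estimate is 1/1003; the other two estimates change only slightly, which is the point of assuming that D is large.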
In order to better understand the classification network model for program fault processing and prediction based on naive Bayes classification, a practical case is described below.
After the classification network model has been trained, a new set of data is processed. Assume that there are two categories for the new set of data: C_1: insufficient server resources; C_2: program deadlock. The feature attributes of the new set of data are: A_1: error log, A_2: access time, A_3: QoS of the program.
Given that the three feature attributes A_1, A_2 and A_3 are present, the probability of category C_j, expressed as a conditional probability, is P(C_j|A_1A_2A_3). Combining this with Bayes' formula gives:
$$P(C_j \mid A_1 A_2 A_3) = \frac{P(A_1 A_2 A_3 \mid C_j)\,P(C_j)}{P(A_1 A_2 A_3)}$$
Since there are two categories for the new set of data, P(C_1|A_1A_2A_3) and P(C_2|A_1A_2A_3) need to be found; comparing which of C_1 and C_2 has the higher probability gives the classification result for the new set of data. Because the denominator P(A_1A_2A_3) in Bayes' formula is the same for both categories, the classification result is equivalent to finding the category C_j that maximizes P(A_1A_2A_3|C_j)P(C_j).
Further, assuming that the attributes A_i are mutually independent given the category, then: P(A_1A_2A_3|C_j) = P(A_1|C_j)P(A_2|C_j)P(A_3|C_j);
P(A_1|C_1) = Y_1; P(A_2|C_1) = Y_2; P(A_3|C_1) = Y_3
P(A_1|C_2) = Y_4; P(A_2|C_2) = Y_5; P(A_3|C_2) = Y_6
P(A_1A_2A_3|C_1) = Y_1 Y_2 Y_3
P(A_1A_2A_3|C_2) = Y_4 Y_5 Y_6
If Y_1 Y_2 Y_3 > Y_4 Y_5 Y_6, the category determined after the classification network model processes the new data is C_1, insufficient server resources; if Y_1 Y_2 Y_3 < Y_4 Y_5 Y_6, the category determined is C_2, program deadlock. This direct comparison of the products corresponds to equal prior probabilities P(C_1) = P(C_2); otherwise each product is first multiplied by the corresponding P(C_j).
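The comparison in this practical case can be worked through numerically. The following Python sketch uses made-up values for Y_1 to Y_6 and equal priors; none of these numbers come from the patent, and the function name `most_likely` is hypothetical.

```python
from math import prod

def most_likely(likelihoods, priors):
    """Return the category maximizing P(A_1|C_j) P(A_2|C_j) P(A_3|C_j) P(C_j)."""
    scores = {c: priors[c] * prod(ys) for c, ys in likelihoods.items()}
    return max(scores, key=scores.get), scores

# Hypothetical conditional probabilities for the three feature attributes.
likelihoods = {
    "C1": [0.6, 0.5, 0.4],  # Y_1, Y_2, Y_3: insufficient server resources
    "C2": [0.3, 0.7, 0.8],  # Y_4, Y_5, Y_6: program deadlock
}
priors = {"C1": 0.5, "C2": 0.5}  # equal priors, assumed for illustration

best, scores = most_likely(likelihoods, priors)
```

With these values Y_1 Y_2 Y_3 = 0.12 and Y_4 Y_5 Y_6 = 0.168, so the new data would be assigned to C_2, program deadlock. With unequal priors the ranking can change, which is why the prior is kept in the product.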
The optional embodiment of the invention solves problems in the related art such as slow response when a program fails and the inability to find the cause of the program fault in time. The classification network model can predict the fault that a program is likely to experience, so the fault can be responded to and confirmed quickly, which avoids lengthy analysis of the cause of the program fault and improves the efficiency of handling program faults. In addition, the classification network model structure provided by the optional embodiment can quickly handle sudden program problems, locate the point of the problem in time, respond to and handle it, and predict possible problems and feed back corresponding handling methods in time. This improves the efficiency with which developers handle abnormal program problems and improves the usability of the program code, and does not require manual extraction of feature attributes.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a device for determining a cause of a fault is also provided. The device is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the devices described in the embodiments below are preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a block diagram of the structure of an apparatus for determining a cause of a fault according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes:
(1) a first processing module 42, configured to input the feature attributes acquired when a program fault occurs into a classification network model, so as to obtain a plurality of probability values of the feature attributes corresponding to different categories, where the classification network model is trained through machine learning using a plurality of groups of data, and each group of data in the plurality of groups of data includes: a feature attribute and a plurality of probability values of different categories corresponding to the feature attribute;
(2) a determining module 44, configured to take the target category corresponding to the maximum probability value among the determined probability values as the fault cause of the program fault.
With this apparatus, the feature attributes acquired when a program fault occurs are input into the classification network model to obtain a plurality of probability values of the feature attributes corresponding to different categories, where the classification network model is trained through machine learning using a plurality of groups of data, and each group of data includes: a feature attribute and a plurality of probability values of different categories corresponding to the feature attribute; the target category corresponding to the maximum probability value is then taken as the fault cause of the program fault. That is, the fault cause is determined from the probability values output by the classification network model, which shortens the time needed to handle the program fault.
It should be noted that, the feature attributes input into the classification network model may be one or multiple, which are determined according to the training process of the classification network model and can be flexibly transformed, and this is not limited in the embodiments of the present invention.
Fig. 5 is a block diagram of another apparatus for determining a cause of a fault according to an embodiment of the present invention. As shown in Fig. 5, in addition to all the modules shown in Fig. 4, the apparatus optionally further includes: a configuration module 40, configured to configure a category library for the classification network model, where the category library includes a plurality of categories corresponding to feature attributes, and to train the classification network model according to the feature attributes present when program faults occur and the categories, selected from the category library, that correspond to those program faults.
In other words, in order to facilitate the identification of multiple classes by the classification network model and the class distinction of different program faults, when the classification network model is trained, the class library is configured for the classification network model, and then multiple classes corresponding to the program faults are obtained from the class library to train the classification network model.
Optionally, the first processing module 42 is further configured to execute the determining step, including: determining a first probability, a second probability and a third probability of each category according to the classification network model, wherein the first probability is the probability that the characteristic attribute exists under the condition that the target category occurs, the second probability is the probability that the target category occurs, and the third probability is the probability that the characteristic attribute exists; determining a product of the first probability and the second probability, and taking a ratio of the product to the third probability as a probability value of the target category; and circularly executing the determining steps to determine a plurality of probability values of the different classes.
In short, in order to make the determined probability values more accurate, when the probability values of the different categories corresponding to the feature attributes are obtained through the classification network model, the model is used to determine three quantities: the first probability, that the feature attribute exists given that the target category occurs; the second probability, that the target category occurs when the program fails; and the third probability, that the feature attribute exists. Combining conditional probability with Bayes' formula, the product of the first probability and the second probability is divided by the third probability, and the resulting ratio is taken as the probability value of the target category. This operation is executed in a loop to determine, one by one, the probability values corresponding to the different categories.
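The determining step can be sketched in Python as below. The function names and numbers are hypothetical. Computing the third probability by the law of total probability over the configured categories is an assumption of this sketch; the source does not say how the third probability is obtained.

```python
def posterior_probabilities(first, second):
    """Apply Bayes' formula once per category.

    first  -- dict: category -> P(feature attributes | category)  (first probability)
    second -- dict: category -> P(category)                       (second probability)
    """
    # Third probability P(feature attributes), assumed here to be the sum of
    # first * second over all categories (law of total probability).
    third = sum(first[c] * second[c] for c in first)
    if third == 0:
        raise ValueError("feature attributes have zero probability under every category")
    # Loop over the categories, producing one probability value for each.
    return {c: first[c] * second[c] / third for c in first}

def fault_cause(probabilities):
    """Pick the category with the maximum probability value."""
    return max(probabilities, key=probabilities.get)

# Made-up inputs for two categories.
first = {"insufficient server resources": 0.12, "program deadlock": 0.168}
second = {"insufficient server resources": 0.5, "program deadlock": 0.5}

probabilities = posterior_probabilities(first, second)
cause = fault_cause(probabilities)
```

Because the third probability is the same for every category, dividing by it does not change which category is largest; it only normalizes the values so that they sum to 1.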
Optionally, the apparatus further comprises: a selecting module 46, configured to select, from the fault processing records of a program, the fault processing record corresponding to the fault cause; and a second processing module 48, configured to process the program fault according to the fault processing record.
That is, in order to improve the efficiency of handling a program fault whose cause has been determined, after the target category of the program fault is confirmed by the classification network model, a fault processing record whose fault cause is the same as or similar to that target category is selected from the target device running the program, and the currently confirmed program fault is handled according to it, which speeds up the response to the program fault.
Optionally, the second processing module 48 is further configured to, when the program fault is not successfully processed according to the fault processing record, take the target category corresponding to the second-largest probability value among the plurality of probability values as the fault cause of the program fault.
Because program faults are diverse, the selected fault processing record cannot necessarily handle the program fault successfully. In that case, the target category corresponding to the second-largest probability value, the one just below the current maximum among the probability values determined by the classification network model, is taken as the fault cause of the program fault, and the fault processing record corresponding to that category is then selected from the fault processing records to handle the program fault.
Optionally, the second processing module 48 is further configured to, when the program fault is successfully processed according to the fault processing record, store the fault processing record in correspondence with the fault cause of the program fault in a target device running the program.
That is, after the program fault has been handled successfully using the fault processing record, the record and the fault cause of the program fault are stored in correspondence in the target device running the program, so that the same program fault can be handled quickly the next time it occurs.
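The selection, fallback and storage behaviour described above can be sketched as follows. This is a hypothetical illustration: `handle_fault`, `apply_record` and the record strings are invented names, and continuing past the second-largest probability value to lower-ranked categories is a generalization that the source does not describe.

```python
def handle_fault(probabilities, records, apply_record, saved):
    """Try categories in descending order of probability value.

    probabilities -- dict: category -> probability value from the classifier
    records       -- dict: category -> stored fault processing record
    apply_record  -- callable(record) -> bool, True if the fault was handled
    saved         -- list receiving (fault cause, record) pairs on success
    Returns the (category, record) pair that worked, or None.
    """
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    for category in ranked:
        record = records.get(category)
        if record is None:
            continue  # no processing record for this fault cause
        if apply_record(record):
            # Store the record together with the fault cause for later reuse.
            saved.append((category, record))
            return category, record
    return None

# Made-up probability values and processing records.
probabilities = {"insufficient server resources": 0.6, "program deadlock": 0.3, "other": 0.1}
records = {
    "insufficient server resources": "scale out the server",
    "program deadlock": "restart the blocked worker",
}
saved = []
# Pretend that only the deadlock record actually resolves the fault.
outcome = handle_fault(probabilities, records, lambda r: r == "restart the blocked worker", saved)
```

In this run the record for the maximum probability value fails, so the category with the second-largest probability value is taken as the fault cause and its record is applied and saved.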
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
In an exemplary embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, inputting the feature attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the feature attributes corresponding to different categories, wherein the classification network model is trained through machine learning using a plurality of groups of data, and each group of data in the plurality of groups of data includes: a feature attribute and a plurality of probability values of different categories corresponding to the feature attribute;
S2, taking the target category corresponding to the maximum probability value among the determined probability values as the fault cause of the program fault.
An embodiment of the present invention further provides a storage medium including a stored program, wherein the program executes any one of the methods described above.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, inputting the feature attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the feature attributes corresponding to different categories, wherein the classification network model is trained through machine learning using a plurality of groups of data, and each group of data in the plurality of groups of data includes: a feature attribute and a plurality of probability values of different categories corresponding to the feature attribute;
S2, taking the target category corresponding to the maximum probability value among the determined probability values as the fault cause of the program fault.
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for determining a cause of a fault, comprising:
inputting the feature attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the feature attributes corresponding to different categories, wherein the classification network model is trained through machine learning using a plurality of groups of data, and each group of data in the plurality of groups of data comprises: a feature attribute and a plurality of probability values of different categories corresponding to the feature attribute;
and taking the target category corresponding to the maximum probability value among the determined probability values as the fault cause of the program fault.
2. The method of claim 1, wherein before inputting the feature attributes collected during the program failure into the classification network model to obtain a plurality of probability values corresponding to different classes of the feature attributes, the method further comprises:
configuring a category library for the classification network model, wherein the category library comprises a plurality of categories corresponding to the characteristic attributes;
and training the classification network model according to the corresponding characteristic attribute when the program fails and a plurality of classes selected from the class library and corresponding to the program failures.
3. The method of claim 1, wherein inputting the feature attributes collected during the program failure into a classification network model to obtain a plurality of probability values corresponding to different classes of the feature attributes comprises:
a determining step, comprising: determining a first probability, a second probability and a third probability of each category according to the classification network model, wherein the first probability is the probability that the characteristic attribute exists under the condition that the target category occurs, the second probability is the probability that the target category occurs, and the third probability is the probability that the characteristic attribute exists; determining a product of the first probability and the second probability, and taking a ratio of the product to the third probability as a probability value of the target category;
and circularly executing the determining steps to determine a plurality of probability values of the different classes.
4. The method of claim 1, wherein after the determined target category corresponding to the highest probability value of the plurality of probability values is used as a fault cause of the program fault, the method further comprises:
selecting a fault processing record corresponding to the fault reason according to a fault processing record of a program;
and processing the program fault according to the fault processing record.
5. The method of claim 4, wherein processing the program fault according to the fault processing record comprises:
when the program fault is not successfully processed according to the fault processing record, taking the target category corresponding to the second-largest probability value among the plurality of probability values as the fault cause of the program fault.
6. The method of claim 5, wherein in the event that the program fault is successfully handled according to the fault handling record, the method further comprises:
and correspondingly storing the fault processing record and the program fault reason in a target device for running a program.
7. An apparatus for determining a cause of a fault, comprising:
a first processing module, configured to input the feature attributes acquired when a program fault occurs into a classification network model to obtain a plurality of probability values of the feature attributes corresponding to different categories, wherein the classification network model is trained through machine learning using a plurality of groups of data, and each group of data in the plurality of groups of data comprises: a feature attribute and a plurality of probability values of different categories corresponding to the feature attribute;
and a determining module, configured to take the target category corresponding to the maximum probability value among the determined probability values as the fault cause of the program fault.
8. The device of claim 7, further comprising a configuration module configured to configure a category library for the classification network model, wherein the category library comprises a plurality of categories corresponding to the feature attributes; and training the classification network model according to the corresponding characteristic attribute when the program fails and a plurality of classes selected from the class library and corresponding to the program failures.
9. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to carry out the method of any one of claims 1 to 6 when executed.
10. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 6.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010866416.7A CN112035286A (en) 2020-08-25 2020-08-25 Method and device for determining fault cause, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN112035286A true CN112035286A (en) 2020-12-04

Family

ID=73580091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010866416.7A Pending CN112035286A (en) 2020-08-25 2020-08-25 Method and device for determining fault cause, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN112035286A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008052376A (en) * 2006-08-22 2008-03-06 Fuji Xerox Co Ltd Image forming apparatus, failure diagnostic system and fault diagnostic program
US20150347926A1 (en) * 2014-06-02 2015-12-03 Salesforce.Com, Inc. Fast Naive Bayesian Framework with Active-Feature Ordering
CN105530122A (en) * 2015-12-03 2016-04-27 国网江西省电力公司信息通信分公司 Network failure diagnosis method based on selective hidden Naive Bayesian classifier
CN106570513A (en) * 2015-10-13 2017-04-19 华为技术有限公司 Fault diagnosis method and apparatus for big data network system
CN110555477A (en) * 2019-08-30 2019-12-10 青岛海信网络科技股份有限公司 municipal facility fault prediction method and device
CN110568286A (en) * 2019-09-12 2019-12-13 齐鲁工业大学 Transformer fault diagnosis method and system based on weighted double-hidden naive Bayes
CN111174370A (en) * 2018-11-09 2020-05-19 珠海格力电器股份有限公司 Fault detection method and device, storage medium and electronic device
US20200159601A1 (en) * 2018-11-15 2020-05-21 International Business Machines Corporation Storage mounting event failure prediction
CN111340099A (en) * 2020-02-24 2020-06-26 上海明略人工智能(集团)有限公司 Method, device, storage medium and electronic device for determining state of object

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
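Gao Xin; Diao Xinping; Liu Jing; Zhang Mi; He Yang: "Multi-classification method for smart meter faults based on adaptive model selection and fusion", Power System Technology (电网技术), no. 06, pages 105 - 111 *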

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882887A (en) * 2021-01-12 2021-06-01 昆明理工大学 Dynamic establishment method for service fault model in cloud computing environment
CN112882887B (en) * 2021-01-12 2022-08-09 昆明理工大学 Dynamic establishment method for service fault model in cloud computing environment
CN112699048A (en) * 2021-01-13 2021-04-23 腾讯科技(深圳)有限公司 Program fault processing method, device and equipment based on artificial intelligence and storage medium
CN112699048B (en) * 2021-01-13 2023-11-17 腾讯科技(深圳)有限公司 Program fault processing method, device, equipment and storage medium based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination