CN112052151A - Fault root cause analysis method, device, equipment and storage medium - Google Patents

Fault root cause analysis method, device, equipment and storage medium

Info

Publication number
CN112052151A
Authority
CN
China
Prior art keywords
analyzed
root cause
fault root
fault
component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011072717.9A
Other languages
Chinese (zh)
Other versions
CN112052151B (en)
Inventor
刘志煌
胡林红
罗朝亮
武睿彪
李冠灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202011072717.9A
Publication of CN112052151A
Application granted
Publication of CN112052151B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The application discloses a fault root cause analysis method, device, equipment and storage medium. The method comprises: obtaining original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed; determining implicit sequence pattern features based on the original time sequence information; acquiring an alarm log of each component in the component set to be analyzed within a first preset time range; determining alarm log text features corresponding to the alarm log of each component within the first preset time range; based on a root cause association probability analysis model, performing fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain fault root cause association probabilities among the components in the component set to be analyzed; and determining the fault root cause association relationship among the components according to the fault root cause association probabilities. With this technical scheme, the fault root cause association relationship among components can be determined efficiently and accurately during fault detection, which improves the reliability of fault root cause analysis.

Description

Fault root cause analysis method, device, equipment and storage medium
Technical Field
The application relates to the technical field of operation and maintenance management, in particular to a fault root cause analysis method, device, equipment and storage medium.
Background
With the continuing progress of digital transformation, the data indexes and calling relationships of various systems are becoming increasingly complex. A system is often composed of a large number of components such as servers, and a fault, once it occurs, can cause huge losses. Besides rapid detection, fault root cause analysis is therefore also needed, so that similar faults are prevented from recurring and the losses caused by faults are reduced.
In the prior art, fault root cause analysis often relies on manually specified rules or accumulated experience to construct a decision tree or establish a knowledge graph. This approach is inflexible, depends heavily on manual effort, is inefficient and error-prone, and consumes considerable time and human resources whenever the rules need to be updated. A more reliable and efficient scheme is therefore needed.
Disclosure of Invention
In order to solve the problems in the prior art, the application provides a fault root cause analysis method, a fault root cause analysis device, equipment and a storage medium. The technical scheme is as follows:
One aspect of the present application provides a fault root cause analysis method, including:
acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise indexes to be analyzed corresponding to each component in the component set to be analyzed;
determining implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed;
acquiring an alarm log of each component in the component set to be analyzed within a first preset time range;
determining alarm log text features corresponding to the alarm log of each component within the first preset time range;
based on a root cause association probability analysis model, performing fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain fault root cause association probabilities among the components in the component set to be analyzed;
and determining the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed.
Another aspect of the present application provides a fault root cause analysis apparatus, including:
an original time sequence information acquisition module, configured to acquire original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise indexes to be analyzed corresponding to each component in the component set to be analyzed;
an implicit sequence pattern feature determining module, configured to determine implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed;
an alarm log acquisition module, configured to acquire an alarm log of each component in the component set to be analyzed within a first preset time range;
a text feature determining module, configured to determine alarm log text features corresponding to the alarm log of each component within the first preset time range;
a fault root cause association probability analysis module, configured to perform, based on a root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain fault root cause association probabilities among the components in the component set to be analyzed;
and a fault root cause association relationship determining module, configured to determine the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed.
Another aspect of the present application provides a fault root cause analysis device, which includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method as described above.
Another aspect of the present application provides a computer-readable storage medium, in which at least one instruction or at least one program is stored, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the fault root cause analysis method as described above.
The fault root cause analysis method, the fault root cause analysis device, the fault root cause analysis equipment and the storage medium have the following technical effects:
In the present application, implicit sequence pattern features are determined from the original time sequence information of a plurality of indexes to be analyzed corresponding to the component set to be analyzed. An alarm log of each component in the component set to be analyzed within a first preset time range is acquired to determine the corresponding alarm log text features, which adapts to dynamically changing operation and maintenance requirements. Then, based on a root cause association probability analysis model, fault root cause association probability analysis is performed on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, so that the fault root cause association probabilities among the components in the component set to be analyzed can be obtained quickly and accurately. Finally, the fault root cause association relationship among the components in the component set to be analyzed is determined according to these fault root cause association probabilities. With the technical scheme provided by the embodiments of the present specification, the fault root cause association relationship among components can be determined quickly and accurately, which further improves the reliability of fault root cause analysis.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
In order to illustrate the technical solutions and advantages of the embodiments of the present application or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description obviously show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart of a fault root cause analysis method provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of another fault root cause analysis method provided in an embodiment of the present application;
FIG. 4 is a schematic flow chart of another fault root cause analysis method provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a root cause association probability analysis model provided in an embodiment of the present application;
FIG. 6 is a schematic flow chart of another fault root cause analysis method provided in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of another root cause association probability analysis model provided in an embodiment of the present application;
FIG. 8 is a schematic flow chart of another fault root cause analysis method provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of a fault root cause analysis device provided in an embodiment of the present application;
FIG. 10 is a hardware structure block diagram of a server for implementing a fault root cause analysis method provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, and not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present application. Examples of the embodiments are illustrated in the accompanying drawings, in which like reference numerals refer throughout to the same or similar elements or to elements having the same or similar functions.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, fig. 1 is a schematic diagram of an application environment provided by the present application, and as shown in fig. 1, the application environment may include a root cause analysis server 01 and a plurality of service components 02.
In this embodiment of the present specification, the root cause analysis server 01 may be configured to perform fault root cause analysis by combining data of the plurality of service components 02. Optionally, the root cause analysis server 01 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms.
In this embodiment, the plurality of service components 02 may generate operation data, alarm logs, and the like, so that the root cause analysis server 01 can obtain the data required for fault root cause analysis. In one embodiment, the plurality of service components 02 may include servers for implementing different functions; each may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms. In practical applications, the service components 02 may further include, but are not limited to, terminal devices such as smart phones, tablet computers, notebook computers, desktop computers, smart speakers, and smart watches, as well as network devices, firewalls, and the like.
In the embodiment of the present specification, the root cause analysis server 01 and the plurality of service components 02 may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.
FIG. 2 is a schematic flow chart of a fault root cause analysis method provided in an embodiment of the present application. The present specification provides the method operation steps as described in the embodiment or the flow chart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. In practice, a system or server product may execute the steps sequentially or in parallel (e.g., in a parallel processor or multi-threaded environment) according to the embodiments or the methods shown in the figures. Specifically, as shown in FIG. 2, the method may include:
S201: acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to the component set to be analyzed.
In an embodiment of the present specification, the set of components to be analyzed includes at least two components. In particular, the components of the set of components to be analyzed may be configured in association with actual root cause analysis requirements for a fault, and in one particular embodiment, the set of components to be analyzed may include a component that failed in an exception event and at least one component that may be associated with the failed component.
In this embodiment of the present specification, fault root cause analysis may include analyzing whether a fault correlation exists among several preset components, so that similar faults can be prevented from recurring. In combination with the actual fault root cause analysis requirement, the several preset components may be taken as the component set to be analyzed, and fault root cause association probability analysis may then be performed based on a root cause association probability analysis model to determine the fault root cause association relationship among the components in the component set to be analyzed. Operation and maintenance personnel can subsequently carry out the corresponding maintenance on that basis.
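As a rough illustration of the last step only, the following Python sketch turns pairwise association probabilities into association relationships. The passage above does not state how the relationship is derived from the probabilities, so the fixed threshold, the function name `associated_pairs`, and the probability values are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def associated_pairs(probabilities: Dict[Pair, float],
                     threshold: float = 0.5) -> List[Pair]:
    """Keep component pairs whose association probability reaches the threshold.

    The threshold rule is an assumption for illustration, not the rule
    stated in the application.
    """
    return sorted(pair for pair, p in probabilities.items() if p >= threshold)


# Hypothetical output of a root cause association probability analysis model
# for a component set {A, B, C}.
probabilities = {("A", "B"): 0.82, ("A", "C"): 0.15, ("B", "C"): 0.64}
relations = associated_pairs(probabilities)
print(relations)  # [('A', 'B'), ('B', 'C')]
```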
Specifically, the components may include, but are not limited to, terminal devices, servers for implementing different functions, network devices, firewalls, and the like; the index may be used to characterize relevant operational information of the corresponding component, and specifically, the index may include, but is not limited to, average response time, average throughput rate, number of requests, error rate, health, and processing time.
In an embodiment of the present specification, the plurality of indexes to be analyzed include the indexes to be analyzed corresponding to each component in the component set to be analyzed. Because each component may correspond to a plurality of indexes, a subset of all the indexes corresponding to a component may be selected, in combination with the actual fault root cause analysis requirement, as the indexes to be analyzed for that component. For example, if the component set to be analyzed includes a component A, a component B, and a component C, 3 of the indexes corresponding to component A may be taken as the indexes to be analyzed for component A, 5 of the indexes corresponding to component B as the indexes to be analyzed for component B, and 2 of the indexes corresponding to component C as the indexes to be analyzed for component C. These 10 indexes may then be used as the plurality of indexes to be analyzed corresponding to the component set to be analyzed.
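The selection in this example can be written down as a small Python sketch. Only the counts (3, 5, and 2) come from the example; the index names are hypothetical, loosely borrowed from the kinds of indexes listed earlier.

```python
# Hypothetical index names; only the counts 3, 5 and 2 come from the example.
selected_indexes = {
    "A": ["avg_response_time", "error_rate", "request_count"],
    "B": ["avg_response_time", "avg_throughput", "request_count",
          "error_rate", "health"],
    "C": ["processing_time", "error_rate"],
}


def indexes_to_analyze(selection):
    """Flatten per-component selections into (component, index) pairs."""
    return [(component, index)
            for component, indexes in selection.items()
            for index in indexes]


all_indexes = indexes_to_analyze(selected_indexes)
print(len(all_indexes))  # 3 + 5 + 2 = 10
```

Keeping the component name next to each index distinguishes, for instance, the error rate of component A from the error rate of component C.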
In an embodiment of the present specification, the original time sequence information of each index to be analyzed may represent how the value of that index changes over time. In one embodiment, the original time sequence information may include a two-dimensional curve that changes continuously over time, or a plurality of discrete point values over time. For example, when the indexes to be analyzed include the average throughput rate of component A, the original time sequence information of that index may be a two-dimensional curve whose abscissa is time and whose ordinate is the value of the average throughput rate. The value of the index at each time and its variation trend can be read from this original time sequence information. In practical applications, the original time sequence information of an index to be analyzed can be acquired for any time, in combination with the actual fault root cause analysis requirement, which makes the method flexible.
S203: determining implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed.
In a specific embodiment, as shown in fig. 3, the determining the implicit sequence pattern feature based on the raw timing information of the plurality of indicators to be analyzed may include:
S301: determining an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed.
In this embodiment, an index time sequence ascending and descending sequence may include a plurality of index change identifiers, and the index change identifiers may represent changes of corresponding indexes to be analyzed.
In one embodiment, when the original time sequence information includes a two-dimensional curve that changes continuously over time, the increase and decrease changes of the original time sequence information may be determined by determining the change nodes of the curve (for example, the curve is increasing before the change node and becomes decreasing after it, or the curve is decreasing before the change node and becomes increasing after it). Specifically, the second preset time range may include a plurality of preset continuous time periods, which may be determined in combination with the actual fault root cause analysis requirement. Determining the index time sequence ascending and descending sequence within the second preset time range according to the original time sequence information of the plurality of indexes to be analyzed may include:
determining, for each preset continuous time period, a corresponding index time sequence ascending and descending sequence based on the order in which the change nodes of the original time sequence information of the plurality of indexes to be analyzed appear in that period, and taking the index time sequence ascending and descending sequences corresponding to all the preset continuous time periods as the index time sequence ascending and descending sequences within the second preset time range.
In a specific embodiment, the second preset time range may include 3 preset continuous time periods: 20:00 to 23:00 on July 9, 20:00 to 23:00 on July 10, and 20:00 to 23:00 on July 11. The indexes to be analyzed include an index a, an index b, an index c, and an index d. In the preset continuous time period of 20:00 to 23:00 on July 9, a change node first appears on the curve corresponding to index b, after which that curve changes to increasing (b increase); a change node then appears on the curve corresponding to index c, after which that curve changes to increasing (c increase); a change node then appears on the curve corresponding to index a, after which that curve changes to decreasing (a decrease); finally, a change node appears on the curve corresponding to index d, after which that curve changes to increasing (d increase). The index time sequence ascending and descending sequence determined for this period is then 'b increase-c increase-a decrease-d increase', which comprises the index change identifiers b increase, c increase, a decrease, and d increase. The index time sequence ascending and descending sequences corresponding to the other 2 preset continuous time periods can be determined similarly.
Determining the corresponding index time sequence ascending and descending sequence from the order in which the change nodes appear in each preset continuous time period helps to reveal whether potential causal relationships exist among the changes of the indexes. This facilitates subsequent fault root cause analysis as needed and improves its reliability and comprehensiveness.
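As an illustration only, the ordering of change nodes described above can be sketched in Python. The sketch assumes discretely sampled points, treats a change node as the first sample at which the direction of change flips, and considers only the first change node of each index within a period. The function names and all sample values are hypothetical and are not part of the application.

```python
from typing import Dict, List, Optional, Tuple

# (time in hours, index value), sorted by time
Series = List[Tuple[float, float]]


def first_change_node(series: Series) -> Optional[Tuple[float, str]]:
    """Return (time, new_trend) at the first point where the trend flips."""
    previous_trend = None
    for (t0, v0), (_, v1) in zip(series, series[1:]):
        if v1 == v0:
            continue  # flat segment: carries no trend information
        trend = "increase" if v1 > v0 else "decrease"
        if previous_trend is not None and trend != previous_trend:
            return t0, trend  # the trend changed at the start of this segment
        previous_trend = trend
    return None


def rise_fall_sequence(period: Dict[str, Series]) -> str:
    """Order index change identifiers by when each change node appears."""
    nodes = []
    for name, series in period.items():
        node = first_change_node(series)
        if node is not None:
            time, trend = node
            nodes.append((time, f"{name} {trend}"))
    return "-".join(identifier for _, identifier in sorted(nodes))


# Hypothetical samples for the 20:00 to 23:00 period on July 9.
period_july_9 = {
    "a": [(20.0, 5.0), (21.5, 8.0), (23.0, 3.0)],
    "b": [(20.0, 9.0), (20.5, 4.0), (23.0, 7.0)],
    "c": [(20.0, 6.0), (21.0, 2.0), (23.0, 5.0)],
    "d": [(20.0, 8.0), (22.0, 1.0), (23.0, 4.0)],
}
sequence = rise_fall_sequence(period_july_9)
print(sequence)  # b increase-c increase-a decrease-d increase
```

Running the same function on the samples of each preset continuous time period would yield one sequence per period.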
In the preceding embodiment, the increase and decrease changes of the original time sequence information are determined from the change nodes of a curve, and the corresponding index time sequence ascending and descending sequence is determined from the order in which the change nodes of the plurality of indexes to be analyzed appear in each preset continuous time period. In another embodiment provided by the present specification, the index time sequence ascending and descending sequence within the second preset time range may instead be determined by setting a plurality of time intervals on the original time sequence information and comparing the value of each index to be analyzed in one time interval with its value in the corresponding previous time interval, thereby determining whether the value increases or decreases in each time interval. Specifically, in this embodiment, as shown in FIG. 4, determining the index time sequence ascending and descending sequence within the second preset time range according to the original time sequence information of the plurality of indexes to be analyzed may include:
S401: determining the time sequence ascending and descending information of the plurality of indexes to be analyzed according to the original time sequence information of the plurality of indexes to be analyzed.
Specifically, the determining the timing ascending and descending information of the plurality of indexes to be analyzed according to the original timing information of the plurality of indexes to be analyzed may include:
1) setting a plurality of time nodes, and taking the span between every two adjacent time nodes as a time interval;
2) determining, for each index to be analyzed, the increase and decrease information of its value in each time interval according to its original time sequence information;
in practical applications, this increase and decrease information can be determined by comparing the value of the index to be analyzed in each time interval with its value in the corresponding previous time interval.
3) performing time sequence ascending and descending marking according to the increase and decrease information of the value of each index to be analyzed in each time interval to obtain the time sequence ascending and descending information of that index, and integrating the time sequence ascending and descending information of the plurality of indexes to be analyzed.
For example, when the value of index a in the 1:00-2:00 interval is determined to have increased compared with its value in the 0:00-1:00 interval, the index change identifier corresponding to index a for the 1:00-2:00 interval may be marked as increase.
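The three steps above can be sketched in Python as follows. The sketch assumes each index already has one aggregated value per time interval (the text does not say how a value per interval is derived, so that is an assumption), and it adds an "unchanged" label for equal values, a case the text does not cover. All names and numbers are hypothetical.

```python
from typing import Dict, List


def label_intervals(values: List[float]) -> List[str]:
    """Label each interval against the preceding one.

    values[0] serves only as the baseline, so the result has one label
    fewer than there are values.
    """
    labels = []
    for previous, current in zip(values, values[1:]):
        if current > previous:
            labels.append("increase")
        elif current < previous:
            labels.append("decrease")
        else:
            labels.append("unchanged")  # assumption: not covered by the text
    return labels


def timing_rise_fall_info(per_index_values: Dict[str, List[float]]) -> Dict[str, List[str]]:
    """Integrate the labels of all indexes to be analyzed."""
    return {name: label_intervals(values)
            for name, values in per_index_values.items()}


# Hypothetical values for the intervals 0:00-1:00, 1:00-2:00 and 2:00-3:00.
hourly = {
    "a": [10.0, 12.0, 9.0],
    "b": [4.0, 3.0, 5.0],
}
info = timing_rise_fall_info(hourly)
print(info["a"])  # ['increase', 'decrease'] for 1:00-2:00 and 2:00-3:00
print(info["b"])  # ['decrease', 'increase']
```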
In a specific embodiment, a time node may be set every hour, and the plurality of indexes to be analyzed include index a, index b, index c, index d, index e, and index f. Taking 0:00-12:00 on July 1, 2020 as an example, the time-series ascending and descending information of the plurality of indexes to be analyzed may be as shown in Table 1:
| Time interval | Index a | Index b | Index c | Index d | Index e | Index f |
|---|---|---|---|---|---|---|
| 0:00-1:00 | a decrease | b increase | c increase | d increase | e decrease | f increase |
| 1:00-2:00 | a increase | b decrease | c increase | d decrease | e increase | f decrease |
| 2:00-3:00 | a decrease | b increase | c decrease | d increase | e increase | f decrease |
| 3:00-4:00 | a decrease | b decrease | c increase | d decrease | e decrease | f decrease |
| 4:00-5:00 | a increase | b increase | c increase | d increase | e increase | f increase |
| 5:00-6:00 | a increase | b decrease | c increase | d decrease | e decrease | f increase |
| 6:00-7:00 | a increase | b increase | c decrease | d decrease | e increase | f decrease |
| 7:00-8:00 | a increase | b decrease | c decrease | d increase | e increase | f increase |
| 8:00-9:00 | a increase | b increase | c increase | d decrease | e increase | f decrease |
| 9:00-10:00 | a decrease | b increase | c increase | d decrease | e decrease | f increase |
| 10:00-11:00 | a decrease | b decrease | c increase | d increase | e increase | f increase |
| 11:00-12:00 | a increase | b decrease | c decrease | d decrease | e decrease | f increase |
TABLE 1
The time-series ascending and descending information of the indexes to be analyzed on other dates may take a form similar to Table 1 and is not described again here.
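The interval-by-interval comparison of step 2) and the marking of step 3) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the sample values, and the "unchanged" label for equal values are assumptions, since the text does not say how ties are handled.

```python
from typing import Dict, List


def label_changes(interval_values: List[float]) -> List[str]:
    """Label each interval, from the second one on, against the preceding interval."""
    labels = []
    for previous, current in zip(interval_values, interval_values[1:]):
        if current > previous:
            labels.append("increase")
        elif current < previous:
            labels.append("decrease")
        else:
            # Assumption: the source does not specify how equal values are marked.
            labels.append("unchanged")
    return labels


def label_all(indexes: Dict[str, List[float]]) -> Dict[str, List[str]]:
    """Integrate the ascending and descending labels of several indexes."""
    return {name: label_changes(values) for name, values in indexes.items()}


# Hypothetical per-interval values for two indexes (one value per time interval).
raw = {"a": [10.0, 12.0, 11.0, 11.0], "b": [5.0, 4.0, 6.0, 7.0]}
labels = label_all(raw)
print(labels)
```

The first interval has no predecessor inside the list, so a series of k interval values yields k-1 labels; in practice the value from the interval before the analyzed window would be supplied as the first element.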
S403: constructing the index time-series ascending and descending sequence within the second preset time range according to the time-series ascending and descending information of the plurality of indexes to be analyzed.
In practical applications, the second preset time range may be set according to the actual fault root cause analysis requirements. In one embodiment, the second preset time range may be the same time period (time interval) on different dates; in another embodiment, it may be different time periods (time intervals) on the same day, which is not limited in this application. Taking the same time period on different dates as the second preset time range, the index time-series ascending and descending sequence for the 8:00-9:00 interval on July 1, 2020 and that for the 8:00-9:00 interval on July 2, 2020 can each be constructed from the generated time-series ascending and descending information of the plurality of indexes to be analyzed, as shown in Table 2:
| Date | Time interval | Index time-series ascending and descending sequence |
|---|---|---|
| 20200701 | 8:00-9:00 | a increase, b increase, c increase, d decrease, e increase, f decrease |
| 20200702 | 8:00-9:00 | a decrease, b increase, c increase, d decrease, e decrease, f decrease |
TABLE 2
Taking the index time-series ascending and descending sequence "a increase, b increase, c increase, d decrease, e increase, f decrease" as an example, it can be understood that in this interval the increase of a is accompanied by the increase of b, the increase of c is accompanied by the decrease of d, and the increase of e is accompanied by the decrease of f. Determining the time-series ascending and descending information of the plurality of indexes to be analyzed from their original time-series information, and then constructing the index time-series ascending and descending sequence within the second preset time range from that information, makes it possible to collect a large amount of index change information and to determine whether potential correlations exist among the changes of the indexes. This supports the subsequent fault root cause analysis and improves its reliability.
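A short sketch of step S403 for the "same time period on different dates" case, reproducing the two rows of Table 2. The nested-dictionary layout and the function name `build_sequences` are illustrative assumptions, not part of the source.

```python
from typing import Dict, List

# labels_by_date[date][interval][index] -> "increase" or "decrease"
LabelsByDate = Dict[str, Dict[str, Dict[str, str]]]


def build_sequences(labels_by_date: LabelsByDate, interval: str,
                    index_order: List[str]) -> Dict[str, List[str]]:
    """Build one ascending and descending sequence per date for a fixed time interval."""
    sequences = {}
    for date in sorted(labels_by_date):
        changes = labels_by_date[date][interval]
        sequences[date] = [f"{name} {changes[name]}" for name in index_order]
    return sequences


labels_by_date = {
    "20200701": {"8-9": {"a": "increase", "b": "increase", "c": "increase",
                         "d": "decrease", "e": "increase", "f": "decrease"}},
    "20200702": {"8-9": {"a": "decrease", "b": "increase", "c": "increase",
                         "d": "decrease", "e": "decrease", "f": "decrease"}},
}
sequences = build_sequences(labels_by_date, "8-9", list("abcdef"))
for date, sequence in sequences.items():
    print(date, ", ".join(sequence))
```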
S303: performing sequence pattern mining on the index time-series ascending and descending sequence to obtain an implicit sequence pattern.
In this embodiment of the present specification, sequence pattern mining may be performed on the index time-series ascending and descending sequences using the PrefixSpan (Prefix-Projected Pattern Growth) algorithm to obtain the implicit sequence pattern. Specifically, performing sequence pattern mining on the index time-series ascending and descending sequence to obtain the implicit sequence pattern may include the following steps:
1) determining the frequency of each index change identifier in the index time-series ascending and descending sequences;
Specifically, the frequency represents the number of times the index change identifier occurs across all the index time-series ascending and descending sequences.
Taking Table 2 as an example, there are two index time-series ascending and descending sequences, namely "a increase - b increase - c increase - d decrease - e increase - f decrease" and "a decrease - b increase - c increase - d decrease - e decrease - f decrease". The frequency of each index change identifier in these sequences is shown in Table 3:
| Index change identifier | a increase | a decrease | b increase | c increase | d decrease | e increase | e decrease | f decrease |
|---|---|---|---|---|---|---|---|---|
| Frequency | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 2 |
TABLE 3
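The counts in Table 3 can be reproduced with a few lines of Python. This is only an illustration using the standard-library `collections.Counter`; the variable names are assumptions.

```python
from collections import Counter

# The two index time-series ascending and descending sequences of Table 2.
sequences = [
    ["a increase", "b increase", "c increase", "d decrease", "e increase", "f decrease"],
    ["a decrease", "b increase", "c increase", "d decrease", "e decrease", "f decrease"],
]

# Count how often each index change identifier occurs across all sequences.
frequency = Counter(identifier for sequence in sequences for identifier in sequence)
for identifier, count in sorted(frequency.items()):
    print(identifier, count)
```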
2) determining, based on the frequencies, the index change identifiers that satisfy a preset minimum support threshold, taking each such index change identifier as a one-item prefix, and determining the corresponding suffix.
In an embodiment of the present specification, the preset minimum support threshold may be set according to actual application requirements. In one embodiment, the preset minimum support threshold may be determined according to the following formula:
min_sup = a × n
where min_sup represents the preset minimum support threshold, n represents the number of days included in the second preset time range, and a represents the minimum support rate. The minimum support rate may be determined according to actual application requirements; for example, it may be adjusted, for instance downward, according to the number of index time-series ascending and descending sequences. The preset minimum support threshold expresses a requirement on how often data must occur. For example, when the preset minimum support threshold is 0.5, it is satisfied when the occurrence frequency of the target data among all data is higher than 0.5: if there are 10 index time-series ascending and descending sequences, the target element satisfies the preset minimum support threshold when it occurs in more than 5 of them.
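As a worked check of the formula, the sketch below applies min_sup = a × n to the frequencies of Table 3. The strict "greater than" comparison follows the "more than 5 of 10 sequences" example in the text; the function names are hypothetical.

```python
def min_support(rate: float, n: int) -> float:
    """min_sup = a * n, where a is the minimum support rate and n the number of days."""
    return rate * n


def meets_min_support(count: int, rate: float, n: int) -> bool:
    # Strict comparison, mirroring "occurs in more than 5 of 10 sequences" in the text.
    return count > min_support(rate, n)


# Frequencies from Table 3 (two sequences, i.e. n = 2).
frequency = {
    "a increase": 1, "a decrease": 1, "b increase": 2, "c increase": 2,
    "d decrease": 2, "e increase": 1, "e decrease": 1, "f decrease": 2,
}
frequent = [identifier for identifier, count in frequency.items()
            if meets_min_support(count, rate=0.5, n=2)]
print(frequent)
```

With a = 0.5 and n = 2 the threshold is 1, so only the identifiers that occur in both sequences remain as one-item prefixes.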
When the preset minimum support threshold is 0.5, the one-item prefixes and the corresponding suffixes determined in step 2) are shown in Table 4:
TABLE 4 (provided as an image in the original publication, reference BDA0002715610520000121; not reproduced here)
3) determining, in the suffix corresponding to each one-item prefix, the single items that satisfy the preset minimum support threshold, combining each such single item with the corresponding one-item prefix to obtain two-item prefixes, and then determining the suffixes corresponding to the two-item prefixes.
When the preset minimum support threshold is 0.5, the two-item prefixes and the corresponding suffixes determined in step 3) are shown in Table 5:
TABLE 5 (provided as an image in the original publication, reference BDA0002715610520000122; not reproduced here)
4) in the same way, determining, in the suffix corresponding to each i-item prefix, the single items that satisfy the preset minimum support threshold, combining each such single item with the corresponding i-item prefix to obtain (i+1)-item prefixes, and determining the suffixes corresponding to the (i+1)-item prefixes, where i is an integer greater than 1;
and repeating step 4) until the longest prefix sequence is mined, and taking the longest prefix sequence as the implicit sequence pattern.
When the preset minimum support threshold is 0.5, the three-item prefixes and their suffixes are shown in Table 6, and the four-item prefixes and their suffixes are shown in Table 7:
TABLE 6 (provided as an image in the original publication, reference BDA0002715610520000131; not reproduced here)
| Four-item prefix | Corresponding suffix |
|---|---|
| b increase, c increase, d decrease, f decrease | None |
TABLE 7
At this point, the longest prefix sequence mined is "b increase - c increase - d decrease - f decrease"; that is, the implicit sequence pattern obtained by sequence pattern mining on the index time-series ascending and descending sequences shown in Table 2 is "b increase - c increase - d decrease - f decrease". The index time-series ascending and descending sequence within the second preset time range is determined from the original time-series information of the plurality of indexes to be analyzed, and sequence pattern mining is performed on it to obtain the implicit sequence pattern. The implicit sequence pattern may be a rule implied in the changes of the plurality of indexes to be analyzed, for example an association or causal relationship among the changes of several indexes. The implicit sequence pattern can subsequently be feature-encoded and combined with the alarm logs of the components for fault root cause analysis, which improves the reliability of the analysis. Because index data is continuously updated over time, the implicit sequence pattern also changes continuously: indexes that were not associated in a past period may become associated later. The second preset time range can therefore be adjusted as required so that the latest implicit sequence pattern is mined in real time, which provides high flexibility and improves the timeliness of fault root cause analysis.
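Steps 1) to 4) can be sketched as a simplified PrefixSpan in which every sequence element is a single item. This is an illustrative sketch under stated assumptions, not the patented implementation: support is counted as the number of sequences containing an item (which coincides with the frequencies of Table 3 in this example), the threshold is applied strictly, and the function name `prefixspan` is hypothetical.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[List[str], int]


def prefixspan(sequences: List[List[str]], min_count: float) -> List[Pattern]:
    """Return every sequential pattern whose support count exceeds min_count."""
    results: List[Pattern] = []

    def mine(prefix: List[str], projected: List[List[str]]) -> None:
        # Count, for each item, the number of projected suffixes that contain it.
        support: Dict[str, int] = {}
        for suffix in projected:
            for item in set(suffix):
                support[item] = support.get(item, 0) + 1
        for item in sorted(support):
            if support[item] > min_count:
                new_prefix = prefix + [item]
                results.append((new_prefix, support[item]))
                # Projection: keep what follows the first occurrence of the item.
                new_projected = [s[s.index(item) + 1:] for s in projected if item in s]
                mine(new_prefix, new_projected)

    mine([], sequences)
    return results


sequences = [
    ["a increase", "b increase", "c increase", "d decrease", "e increase", "f decrease"],
    ["a decrease", "b increase", "c increase", "d decrease", "e decrease", "f decrease"],
]
patterns = prefixspan(sequences, min_count=0.5 * len(sequences))
longest = max(patterns, key=lambda pattern: len(pattern[0]))
print(" - ".join(longest[0]))
```

On the two sequences of Table 2 the longest mined prefix is the four-item pattern stated in the text. A general PrefixSpan also handles itemset elements, which this sketch omits.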
S307: performing feature encoding on the implicit sequence pattern to obtain the implicit sequence pattern feature.
In this embodiment of the present specification, one-hot encoding may be applied to the implicit sequence pattern to obtain the implicit sequence pattern feature.
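One possible reading of the one-hot encoding step is sketched below: each element of the implicit sequence pattern becomes a vector with a single 1 at the position of its index change identifier. The vocabulary (every index paired with "increase" and "decrease") and the function name are assumptions; the source does not specify how the vocabulary is built.

```python
from typing import List


def one_hot_encode(pattern: List[str], vocabulary: List[str]) -> List[List[int]]:
    """Encode each identifier of the pattern as a one-hot vector over the vocabulary."""
    position = {identifier: i for i, identifier in enumerate(vocabulary)}
    vectors = []
    for identifier in pattern:
        vector = [0] * len(vocabulary)
        vector[position[identifier]] = 1
        vectors.append(vector)
    return vectors


# Assumed vocabulary: indexes a..f, each with "increase" and "decrease".
vocabulary = [f"{name} {change}" for name in "abcdef" for change in ("increase", "decrease")]
pattern = ["b increase", "c increase", "d decrease", "f decrease"]
features = one_hot_encode(pattern, vocabulary)
for identifier, vector in zip(pattern, features):
    print(identifier, vector)
```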
S205: acquiring the alarm logs of each component in the component set to be analyzed within a first preset time range.
Specifically, the first preset time range may be set according to the actual fault root cause analysis requirements. In a specific embodiment, the first preset time range may span from one hour before the fault occurrence time to one hour after it. For example, the component set to be analyzed includes component A, component B, and component C. If component A generates 4 alarm logs within the first preset time range, component B generates 3, and component C generates 3, these 10 alarm logs are acquired, and the alarm log text feature corresponding to each alarm log is then determined.
Alarm logs are semi-structured data that are real-time and information-rich. By acquiring the alarm logs of each component in the component set to be analyzed within the first preset time range, fault root cause analysis can subsequently be performed in combination with the implicit sequence pattern, which improves the reliability of the analysis.
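Selecting the alarm logs that fall within the first preset time range might look like the sketch below. The record fields (`component`, `time`, `text`), the sample logs, and the inclusive window boundaries are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def logs_in_window(logs: List[Dict], fault_time: datetime,
                   before: timedelta = timedelta(hours=1),
                   after: timedelta = timedelta(hours=1)) -> List[Dict]:
    """Keep the alarm logs whose timestamp lies in [fault_time - before, fault_time + after]."""
    start, end = fault_time - before, fault_time + after
    return [log for log in logs if start <= log["time"] <= end]


fault_time = datetime(2020, 7, 1, 8, 30)
logs = [
    {"component": "A", "time": datetime(2020, 7, 1, 8, 0), "text": "disk usage high"},
    {"component": "B", "time": datetime(2020, 7, 1, 7, 29), "text": "heartbeat lost"},
    {"component": "C", "time": datetime(2020, 7, 1, 9, 30), "text": "request timeout"},
    {"component": "A", "time": datetime(2020, 7, 1, 10, 0), "text": "disk usage normal"},
]
selected = logs_in_window(logs, fault_time)
print([(log["component"], log["text"]) for log in selected])
```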
S207: determining the alarm log text feature corresponding to the alarm logs of each component within the first preset time range.
In an embodiment of the present specification, determining the alarm log text feature corresponding to the alarm logs of each component within the first preset time range may include:
performing text vectorization on each alarm log to obtain the corresponding alarm log text feature.
In a specific embodiment, performing text vectorization on each alarm log to obtain the corresponding alarm log text feature may include:
1) obtaining a word vector for each word in the alarm log based on a preset word vector model;
In practical applications, the preset word vector model may include a Word2vec word vector model. It should be noted that when the text of the alarm log is of a preset text type, for example Chinese, word segmentation needs to be performed on the alarm log before the word vector of each word is obtained based on the preset word vector model.
2) calculating the feature weight of each word in the alarm log;
Alarm logs contain many format words that exist only to unify the alarm specification, and these words appear in many alarm logs. To reduce their influence on the text vectorization of an alarm log, the feature weight of each word in the alarm log needs to be calculated. If a word appears frequently in one alarm log and rarely in other alarm logs, the word has distinguishing capability for that alarm log and helps distinguish it from the others. In a specific embodiment, the TF-IDF (term frequency-inverse document frequency) method may be used to calculate the feature weight of each word in the alarm log, based on the following formulas:
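(The formulas for the term frequency (TF) and the inverse document frequency (IDF) are provided as images in the original publication, references BDA0002715610520000151 and BDA0002715610520000152; not reproduced here.)
TF-IDF value = term frequency (TF) × inverse document frequency (IDF)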
Specifically, in the formula for the inverse document frequency (IDF), the base of the logarithm can be set according to actual application requirements. The TF-IDF value characterizes the feature weight of the corresponding word.
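For reference, the commonly used textbook definitions of TF and IDF are given here. They are an assumption about the form of the formulas, not a transcription of the original images: $n_{w,d}$ is the number of times word $w$ occurs in alarm log $d$, $N$ is the total number of alarm logs, and many variants add a smoothing constant to the denominator of the IDF.

```latex
\mathrm{TF}(w, d) = \frac{n_{w,d}}{\sum_{w'} n_{w',d}}, \qquad
\mathrm{IDF}(w) = \log \frac{N}{\left|\{\, d : w \in d \,\}\right|}, \qquad
\mathrm{TFIDF}(w, d) = \mathrm{TF}(w, d) \times \mathrm{IDF}(w)
```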
3) performing a weighted summation of the word vectors of the words in the alarm log using the corresponding feature weights to obtain the alarm log text feature corresponding to the alarm log.
Obtaining the word vector of each word in the alarm log based on a preset word vector model, calculating the feature weight of each word, and performing a weighted summation of the word vectors with these weights to obtain the alarm log text feature reduces the influence of irrelevant words on the feature and gives appropriate weight to the words with distinguishing capability. The resulting alarm log text features are better suited to fault root cause analysis, which improves the accuracy of the analysis.
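Steps 2) and 3) can be sketched as follows. The TF and IDF definitions are the common unsmoothed textbook form, which is an assumption because the original formulas are not legible here; the tiny word vectors stand in for the output of a word vector model such as Word2vec, and summing once per distinct word is a design choice of this sketch.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_weights(log_tokens: List[str], all_logs: List[List[str]]) -> Dict[str, float]:
    """TF-IDF weight of each distinct word of one alarm log (log_tokens must be in all_logs)."""
    counts = Counter(log_tokens)
    weights = {}
    for word, count in counts.items():
        tf = count / len(log_tokens)
        containing = sum(1 for tokens in all_logs if word in tokens)
        idf = math.log(len(all_logs) / containing)  # natural log; the base is configurable
        weights[word] = tf * idf
    return weights


def log_text_feature(log_tokens: List[str], all_logs: List[List[str]],
                     word_vectors: Dict[str, List[float]]) -> List[float]:
    """Weighted sum of word vectors, using the TF-IDF value of each word as its weight."""
    weights = tfidf_weights(log_tokens, all_logs)
    dimension = len(next(iter(word_vectors.values())))
    feature = [0.0] * dimension
    for word, weight in weights.items():
        for i, component in enumerate(word_vectors[word]):
            feature[i] += weight * component
    return feature


# Hypothetical tokenized alarm logs and toy 2-dimensional word vectors.
all_logs = [["ALARM", "disk", "full"], ["ALARM", "cpu", "high"], ["ALARM", "disk", "latency"]]
word_vectors = {"ALARM": [1.0, 0.0], "disk": [0.0, 1.0], "full": [1.0, 1.0]}
weights = tfidf_weights(all_logs[0], all_logs)
feature = log_text_feature(all_logs[0], all_logs, word_vectors)
print(weights, feature)
```

The format word "ALARM" appears in every log, so its IDF and weight are zero and it does not contribute to the feature, which is the effect the text describes.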
S209: based on a root cause association probability analysis model, performing fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature, to obtain the fault root cause association probability between the components in the component set to be analyzed.
In an embodiment of the present specification, as shown in fig. 5, the root cause association probability analysis model may include a correlation mining module 510, a feature fusion layer 520, a feed-forward layer 530, and a classification layer 540.
As shown in fig. 6, performing fault root cause association probability analysis on the components in the component set to be analyzed based on the root cause association probability analysis model, according to the alarm log text features and the implicit sequence pattern feature, to obtain the fault root cause association probability between the components may include:
S601: performing correlation mining on the alarm log text features based on the correlation mining module to obtain alarm log correlation features;
Alarm logs that appear within a specified time range often have strong correlations, and these correlations are extremely important for fault root cause analysis; it is therefore necessary to perform correlation mining on the alarm log text features based on the correlation mining module.
In this embodiment of the present specification, the correlation mining module may include a Transformer model (a model based on the self-attention mechanism). In practical applications, the correlation mining module may be part of the root cause association probability analysis model, or may be an independent neural network cascaded with the root cause association probability analysis model. Compared with a CNN (Convolutional Neural Network), the Transformer model captures global information better; compared with an RNN (Recurrent Neural Network), the Transformer model trains faster and more efficiently, because the self-attention mechanism allows fast parallel computation. In a specific embodiment, the Transformer model may include, but is not limited to, a multi-head self-attention module, an add-and-normalize module, and a feed-forward module. The multi-head self-attention module may consist of a plurality of self-attention units that share the same structure but have different weight matrices, so that each self-attention unit can focus on different features. The Transformer model can thus attend to more features and avoid attending to only part of them, which enables more comprehensive correlation mining of the alarm log text features, yields more accurate alarm log correlation features, and further improves the accuracy of fault root cause analysis.
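A single self-attention unit of the kind described above can be sketched with scaled dot-product attention in plain Python. This is a generic illustration of the mechanism, not the patented model: the input rows stand in for alarm log text features, and the identity projection matrices are placeholder weights chosen only to keep the example small.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, column)) for column in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix, w_q: Matrix, w_k: Matrix, w_v: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product self-attention over the rows of x (one row per alarm log)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_transposed = [list(column) for column in zip(*k)]
    scores = matmul(q, k_transposed)
    attention = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(attention, v), attention


# Three hypothetical 2-dimensional alarm log text features.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder projection weights
output, attention = self_attention(x, identity, identity, identity)
print(attention)
```

A multi-head module would run several such units with different weight matrices and concatenate their outputs, which is how each unit can attend to different features.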
S603: performing feature fusion on the alarm log correlation features and the implicit sequence pattern feature based on the feature fusion layer to obtain a target fusion feature;
In this embodiment of the present specification, the feature fusion layer performs deep feature extraction on the alarm log correlation features and the implicit sequence pattern feature to fuse them. In one embodiment, the feature fusion layer may include a GRU (Gated Recurrent Unit) layer; compared with an LSTM (Long Short-Term Memory) network, a GRU has fewer parameters and still processes sequence information well. In another embodiment, the feature fusion layer may include a plurality of cascaded feed-forward layers, which can also effectively process and fuse the alarm log correlation features and the implicit sequence pattern feature; this application is not limited in this respect.
Referring to fig. 7, when the correlation mining module includes a Transformer model 5101 and the feature fusion layer includes a GRU layer 5102, the structure of the root cause association probability analysis model is as shown in fig. 7.
S605: performing feature processing on the target fusion feature based on the feed-forward layer to obtain a processed target fusion feature;
In the embodiments of the present specification, the feature processing of the target fusion feature based on the feed-forward layer may include, but is not limited to, feature extraction and weight configuration of the target fusion feature.
S607: calculating the fault root cause association probability from the processed target fusion feature based on the classification layer to obtain the fault root cause association probability between the components in the component set to be analyzed.
In this embodiment of the present specification, the classification layer calculates the fault root cause association probability from the processed target fusion feature and outputs the fault root cause association probability between the components in the component set to be analyzed. The fault root cause association probability represents the probability that a fault root cause association exists between the components in the component set to be analyzed.
In the embodiments of the present specification, the loss function of the root cause association probability analysis model may include, but is not limited to, cross-entropy loss and hinge loss.
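To make the last stage concrete, the sketch below shows one simple form a classification layer and a cross-entropy loss could take: a sigmoid over a linear combination of the processed feature, scored against a 0/1 label. The sigmoid output, the weights, and the function names are assumptions; the source does not state the form of the classification layer.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def classify(feature: List[float], weights: List[float], bias: float) -> float:
    """Map a processed fusion feature to a fault root cause association probability."""
    return sigmoid(sum(w * f for w, f in zip(weights, feature)) + bias)


def cross_entropy(probability: float, label: int) -> float:
    """Binary cross-entropy for one sample; label is 1 (associated) or 0 (not associated)."""
    p = min(max(probability, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Hypothetical processed target fusion feature and layer parameters.
probability = classify([0.8, -0.2, 0.5], weights=[1.5, 0.5, 1.0], bias=-0.3)
print(round(probability, 3), round(cross_entropy(probability, 1), 3))
```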
Performing fault root cause association probability analysis on the components in the component set to be analyzed based on the root cause association probability analysis model, according to the alarm log text features and the implicit sequence pattern feature, makes it possible to determine the fault root cause association probability between components accurately and efficiently, which greatly reduces the dependence on manual work and reduces resource consumption.
S211: determining the fault root cause association relationship between the components in the component set to be analyzed according to the fault root cause association probability between the components in the component set to be analyzed.
Specifically, the fault root cause association relationship between the components in the component set to be analyzed may be either that a fault root cause association exists between the components or that no fault root cause association exists between them.
Referring to fig. 8, in an embodiment of the present specification, determining the fault root cause association relationship between the components in the component set to be analyzed according to the fault root cause association probability between them may include:
S801: when the fault root cause association probability satisfies a preset condition, determining that a fault root cause association exists between the components in the component set to be analyzed.
In a specific embodiment, the fault root cause association probability satisfying the preset condition may mean that the fault root cause association probability is greater than a preset threshold. The preset threshold may be determined according to the actual fault root cause analysis requirements; for example, it may be 50% or 80%. Accordingly, when the fault root cause association probability does not satisfy the preset condition (for example, when it is less than or equal to the preset threshold), it may be determined that no fault root cause association exists between the components in the component set to be analyzed.
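The threshold rule of S801 reduces to a single strict comparison, sketched here with a hypothetical function name and the example thresholds from the text.

```python
def has_fault_root_cause_association(probability: float, threshold: float = 0.5) -> bool:
    """Association exists only if the probability is strictly greater than the threshold."""
    return probability > threshold


for probability in (0.3, 0.5, 0.7, 0.9):
    print(probability,
          has_fault_root_cause_association(probability, threshold=0.5),
          has_fault_root_cause_association(probability, threshold=0.8))
```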
Determining the fault root cause association relationship between the components in the component set to be analyzed according to the fault root cause association probability between them helps operation and maintenance personnel trace the source of a fault and identify the factors related to it, so that they can carry out the corresponding maintenance, prevent similar faults from recurring, and reduce the losses caused by faults.
An embodiment of the present specification may also include a method for training the root cause association probability analysis model, as follows:
1) acquiring the sample implicit sequence pattern features and the corresponding sample alarm log text features of a plurality of sample component sets labeled with the fault root cause association probability between components;
In the embodiments of the present specification, the sample component sets used as training samples may include sample component sets in which a fault root cause association exists between components (i.e., the fault root cause association probability between the components is high) and sample component sets in which no fault root cause association exists between components (i.e., the fault root cause association probability between the components is low).
Specifically, acquiring the sample alarm log text features corresponding to a sample component set labeled with the fault root cause association probability between components may include:
acquiring the sample alarm logs of each component in the sample component set within a third preset time range, where the third preset time range can be determined according to actual application requirements; and performing text vectorization on each sample alarm log to obtain the corresponding sample alarm log text feature. The specific process is similar to S207; reference may be made to the description of S207, which is not repeated here. The specific process of obtaining the sample implicit sequence pattern features corresponding to the plurality of sample component sets labeled with the fault root cause association probability between components is similar to S201 to S203; reference may be made to the description of S201 to S203, which is not repeated here.
2) training a preset neural network model for fault root cause association probability analysis based on the sample implicit sequence pattern features and the corresponding sample alarm log text features of the plurality of sample component sets labeled with the fault root cause association probability between components, and adjusting the model parameters of the preset neural network model during training until the preset neural network model satisfies a preset convergence condition, to obtain the root cause association probability analysis model.
Training the preset neural network model for fault root cause association probability analysis with the sample implicit sequence pattern features and the corresponding sample alarm log text features of the labeled sample component sets yields a more reliable root cause association probability analysis model and improves the reliability of fault root cause analysis.
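The "adjust parameters until a preset convergence condition is met" loop can be illustrated with a deliberately small stand-in: a single-layer logistic model trained by gradient descent on cross-entropy loss. This is not the preset neural network of the source; the toy features, labels, learning rate, and the convergence rule (loss change below a tolerance) are all assumptions.

```python
import math
from typing import List, Tuple


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    return 1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(weights, x)) + bias)))


def train(samples: List[List[float]], labels: List[int], learning_rate: float = 0.5,
          tolerance: float = 1e-6, max_epochs: int = 10000) -> Tuple[List[float], float, float]:
    """Gradient descent until the loss changes by less than the tolerance (assumed condition)."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    previous_loss = float("inf")
    loss = previous_loss
    for _ in range(max_epochs):
        grad_w, grad_b, loss = [0.0] * len(weights), 0.0, 0.0
        for x, y in zip(samples, labels):
            p = min(max(predict(weights, bias, x), 1e-12), 1.0 - 1e-12)
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            for i, value in enumerate(x):
                grad_w[i] += (p - y) * value
            grad_b += p - y
        loss /= len(samples)
        if abs(previous_loss - loss) < tolerance:  # preset convergence condition
            break
        previous_loss = loss
        weights = [w - learning_rate * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(samples)
    return weights, bias, loss


# Toy fused features; label 1 = fault root cause association, 0 = no association.
samples = [[1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [0.0, 1.0]]
labels = [1, 1, 0, 0]
weights, bias, final_loss = train(samples, labels)
print(round(predict(weights, bias, [1.0, 0.0]), 3), round(predict(weights, bias, [0.0, 1.0]), 3))
```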
As can be seen from the technical solutions provided in the embodiments of the present specification, the original time-series information of a plurality of indexes to be analyzed corresponding to the component set to be analyzed is first obtained, and the implicit sequence pattern feature is determined from it. Determining the implicit sequence pattern feature may include determining the index time-series ascending and descending sequence within the second preset time range according to the original time-series information of the plurality of indexes to be analyzed, and performing sequence pattern mining on that sequence to obtain the implicit sequence pattern. The implicit sequence pattern may be a rule implied in the changes of the plurality of indexes to be analyzed, for example an association or causal relationship among the changes of several indexes, and it can subsequently be combined with the alarm logs of the components for fault root cause analysis, which improves the reliability of the analysis. Because index data is continuously updated over time, the implicit sequence pattern also changes continuously: an index that was not associated in a past period may become associated later. The second preset time range can therefore be adjusted as required so that the latest implicit sequence pattern is mined in real time, which provides high flexibility and improves the timeliness of fault root cause analysis. The implicit sequence pattern is then feature-encoded to obtain the implicit sequence pattern feature, the alarm logs of each component in the component set to be analyzed within the first preset time range are acquired, and the alarm log text features corresponding to those alarm logs are determined. Next, based on the root cause association probability analysis model, fault root cause association probability analysis is performed on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature to obtain the fault root cause association probability between the components; the fault root cause association probability among a plurality of components can thus be determined accurately and efficiently, which greatly reduces the dependence on manual work and reduces resource consumption. Finally, the fault root cause association relationship between the components in the component set to be analyzed is determined according to the fault root cause association probability between them, which helps operation and maintenance personnel trace the source of a fault and identify the factors related to it, so that they can carry out the corresponding maintenance, prevent similar faults from recurring, and reduce the losses caused by faults.
An embodiment of the present application further provides a fault root cause analysis device, as shown in fig. 9, the device may include:
an original timing information obtaining module 910, configured to obtain original timing information of a plurality of to-be-analyzed indexes corresponding to a to-be-analyzed component set, where the plurality of to-be-analyzed indexes include to-be-analyzed indexes corresponding to each component in the to-be-analyzed component set;
an implicit sequence pattern feature determining module 920, configured to determine an implicit sequence pattern feature based on the original timing information of the multiple to-be-analyzed indicators;
an alarm log obtaining module 930, configured to obtain an alarm log of each component in the component set to be analyzed within a first preset time range;
a text feature determining module 940, configured to determine a text feature of the alarm log corresponding to the alarm log of each component within a first preset time range;
a fault root cause association probability analysis module 950, configured to perform, based on a root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature, so as to obtain the fault root cause association probability between the components in the component set to be analyzed;
a fault root cause association relationship determining module 960, configured to determine the fault root cause association relationship between the components in the component set to be analyzed according to the fault root cause association probability between the components in the component set to be analyzed.
In some embodiments, the root cause association probability analysis model may include:
a correlation mining module, a feature fusion layer, a feed-forward layer, and a classification layer.
When the root cause association probability analysis model includes a correlation mining module, a feature fusion layer, a feed-forward layer, and a classification layer, the fault root cause association probability analysis module 950 may include:
the correlation mining unit is used for performing correlation mining on the text characteristics of the alarm log based on the correlation mining module to obtain the correlation characteristics of the alarm log;
the characteristic fusion unit is used for carrying out characteristic fusion on the alarm log correlation characteristic and the implicit sequence mode characteristic based on the characteristic fusion layer to obtain a target fusion characteristic;
the feature processing unit is used for performing feature processing on the target fusion feature based on the feedforward layer to obtain a processed target fusion feature;
and the fault root cause association probability determination unit is used for calculating the fault root cause association probability of the processed target fusion characteristics based on the classification layer to obtain the fault root cause association probability among the components in the component set to be analyzed.
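The four units above can be sketched as a single forward pass. This is only an illustrative sketch, not the model of the application: the text here does not state how the correlation mining module, fusion, or classification are realized, so the sketch assumes scaled dot-product self-attention with mean pooling for correlation mining, concatenation for feature fusion, one ReLU layer for the feed-forward layer, and a sigmoid output for the classification layer. All function names and the `params` layout are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def correlation_mining(log_features):
    """Assumed stand-in: self-attention across per-component log vectors, then mean pooling."""
    d = len(log_features[0])
    attended = []
    for q in log_features:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in log_features])
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, log_features)) for i in range(d)]
        )
    n = len(attended)
    return [sum(row[i] for row in attended) / n for i in range(d)]


def feature_fusion(log_corr_feature, pattern_feature):
    """Assumed fusion: simple concatenation of the two feature vectors."""
    return list(log_corr_feature) + list(pattern_feature)


def feed_forward(x, weights, biases):
    """One dense layer with ReLU activation."""
    return [max(0.0, dot(w, x) + b) for w, b in zip(weights, biases)]


def classify(h, w, b):
    """Sigmoid output interpreted as a fault root cause association probability."""
    return 1.0 / (1.0 + math.exp(-(dot(w, h) + b)))


def association_probability(log_features, pattern_feature, params):
    """Chain the four stages: mining, fusion, feed-forward, classification."""
    fused = feature_fusion(correlation_mining(log_features), pattern_feature)
    hidden = feed_forward(fused, params["ff_w"], params["ff_b"])
    return classify(hidden, params["cls_w"], params["cls_b"])
```

With two components whose log vectors have dimension 2 and a pattern feature of dimension 3, the fused vector has dimension 5, so each row of `params["ff_w"]` needs five weights. In practice the weights would come from training rather than being set by hand.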
In some embodiments, the implicit sequence mode feature determination module 920 described above may include:
the index time sequence ascending and descending sequence determining unit is used for determining an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed;
the sequence pattern mining unit is used for performing sequence pattern mining according to the index time sequence ascending and descending sequence to obtain an implicit sequence pattern;
and the characteristic coding unit is used for carrying out characteristic coding on the implicit sequence mode to obtain the implicit sequence mode characteristics.
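The mining and coding units can be illustrated with a deliberately small sketch. It assumes that an ascending and descending sequence is a list of symbols such as `cpu_up`, that a "pattern" is an ordered pair of symbols (gaps allowed) supported by at least `min_support` sequences, and that feature coding is a multi-hot vector over a fixed pattern vocabulary. These are illustrative assumptions; a full sequence pattern mining algorithm would also handle longer patterns. The function names are hypothetical.

```python
from itertools import combinations


def mine_frequent_pairs(sequences, min_support):
    """Return {(a, b): support} for ordered pairs where a occurs before b
    in at least min_support of the input sequences."""
    support = {}
    for seq in sequences:
        seen = set()
        # combinations() preserves index order, so i < j always holds.
        for i, j in combinations(range(len(seq)), 2):
            seen.add((seq[i], seq[j]))
        for pair in seen:  # count each pair once per sequence
            support[pair] = support.get(pair, 0) + 1
    return {pair: count for pair, count in support.items() if count >= min_support}


def encode_patterns(patterns, vocabulary):
    """Multi-hot encoding: 1 if the vocabulary pattern was mined, else 0."""
    return [1 if pattern in patterns else 0 for pattern in vocabulary]
```

For example, given the sequences `["cpu_up", "latency_up", "errors_up"]`, `["cpu_up", "errors_up"]`, and `["latency_up", "cpu_up"]` with `min_support=2`, only the pair `("cpu_up", "errors_up")` is kept.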
In some embodiments, the indicator timing ascending and descending sequence determining unit may include:
the time sequence ascending and descending information determining unit is used for determining the time sequence ascending and descending information of the plurality of indexes to be analyzed according to the original time sequence information of the plurality of indexes to be analyzed;
and the time sequence ascending and descending sequence constructing unit is used for constructing the index time sequence ascending and descending sequence within the second preset time range according to the time sequence ascending and descending information of the plurality of indexes to be analyzed.
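One way to picture these two units is the sketch below. It assumes that each index is sampled at the same integer time steps, that a rise or fall is a step-to-step change larger than a hypothetical `threshold`, and that the second preset time range is the half-open window `[start, end)`. None of these details are specified in this passage.

```python
def rise_fall_events(metric_name, values, threshold=0.0):
    """Return (time_index, symbol) for every step whose change exceeds the threshold."""
    events = []
    for t in range(1, len(values)):
        delta = values[t] - values[t - 1]
        if delta > threshold:
            events.append((t, f"{metric_name}_up"))
        elif delta < -threshold:
            events.append((t, f"{metric_name}_down"))
    return events


def build_rise_fall_sequence(raw_series, start, end, threshold=0.0):
    """Merge the rise/fall events of all indexes inside [start, end), ordered by time."""
    merged = []
    for name, values in raw_series.items():
        merged.extend(
            event
            for event in rise_fall_events(name, values, threshold)
            if start <= event[0] < end
        )
    merged.sort(key=lambda event: (event[0], event[1]))
    return [symbol for _, symbol in merged]
```

Narrowing or shifting the window changes which events enter the sequence, which is how the adjustable second preset time range would affect the patterns mined afterwards.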
In some embodiments, the apparatus may further comprise:
the sample data acquisition unit is used for acquiring sample implicit sequence mode characteristics corresponding to a plurality of sample component sets marked with fault root cause association probabilities among the components and corresponding sample alarm log text characteristics;
and the model training unit is used for training a preset neural network model for fault root cause association probability analysis based on the sample implicit sequence pattern features corresponding to the plurality of sample component sets labeled with inter-component fault root cause association probabilities and the corresponding sample alarm log text features, and adjusting the model parameters of the preset neural network model during the training until the preset neural network model meets a preset convergence condition, so as to obtain the fault root cause association probability analysis model.
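The train-until-convergence loop can be sketched as follows. As a loudly stated simplification, a single-layer logistic model stands in for the preset neural network model, each sample is an already fused feature vector, the loss is binary cross-entropy against the labeled association probability, and the preset convergence condition is assumed to be "the loss changed by less than `tol` between epochs". The hyperparameters `lr`, `tol`, and `max_epochs` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Predicted fault root cause association probability for one fused feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy; y may be a soft label in [0, 1]."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def train(samples, labels, lr=0.5, tol=1e-6, max_epochs=10000):
    """Gradient descent until the loss change drops below tol (assumed convergence condition)."""
    dim = len(samples[0])
    n = len(samples)
    w, b = [0.0] * dim, 0.0
    prev_loss = float("inf")
    loss = prev_loss
    for _ in range(max_epochs):
        preds = [predict(x, w, b) for x in samples]
        loss = sum(cross_entropy(p, y) for p, y in zip(preds, labels)) / n
        if abs(prev_loss - loss) < tol:
            break  # convergence condition met
        prev_loss = loss
        # Adjust the model parameters along the negative gradient.
        for i in range(dim):
            grad_i = sum((p - y) * x[i] for p, y, x in zip(preds, labels, samples)) / n
            w[i] -= lr * grad_i
        b -= lr * sum(p - y for p, y in zip(preds, labels)) / n
    return w, b, loss
```

A real implementation would train the full model (correlation mining, fusion, feed-forward, and classification layers) with an automatic differentiation framework; the loop structure and the stopping test are the parts this sketch is meant to show.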
In some embodiments, the sample data acquiring unit may include:
the sample alarm log acquisition unit is used for respectively acquiring a sample alarm log of each component in the sample component set within a third preset time range;
and the text vectorization unit is used for respectively carrying out text vectorization on the sample alarm logs to obtain corresponding text characteristics of the sample alarm logs.
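Text vectorization of alarm logs can be illustrated with a plain bag-of-words count vector. This is an assumed, minimal stand-in: this passage does not say which vectorization technique is used, and a learned embedding would be a natural alternative. The tokenizer regex and function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(log_line):
    """Lowercase the log line and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", log_line.lower())


def build_vocabulary(logs):
    """Map every token seen in the sample logs to a fixed vector index."""
    tokens = sorted({token for log in logs for token in tokenize(log)})
    return {token: index for index, token in enumerate(tokens)}


def vectorize(log_line, vocabulary):
    """Count-based text feature; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for token, count in Counter(tokenize(log_line)).items():
        if token in vocabulary:
            vector[vocabulary[token]] = count
    return vector
```

Building the vocabulary once from the sample alarm logs and reusing it at analysis time keeps the training and inference feature spaces aligned.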
In some embodiments, the fault root cause association relationship determining module 960 may include:
and the fault root cause association determining unit is used for determining that the fault root cause association relationship among the components in the component set to be analyzed is fault root cause association when the fault root cause association probability meets a preset condition.
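As a small illustration of this determining unit, the sketch below assumes that the preset condition is "probability greater than or equal to a threshold" and that probabilities are stored per component pair. Both the threshold value and the dictionary layout are hypothetical.

```python
def determine_associations(pair_probabilities, threshold=0.5):
    """Return the component pairs whose association probability meets the assumed condition."""
    return sorted(
        pair for pair, probability in pair_probabilities.items() if probability >= threshold
    )
```

For instance, with `{("db", "cache"): 0.91, ("db", "gateway"): 0.12}` and the default threshold, only `("db", "cache")` would be reported as having a fault root cause association.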
The device embodiments and the method embodiments are based on the same application concept.
The embodiment of the present application provides a computer device, which includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method provided by the above method embodiment.
The memory may be used to store software programs and modules, and the processor may execute various functional applications and data processing by operating the software programs and modules stored in the memory. The memory can mainly comprise a program storage area and a data storage area, wherein the program storage area can store an operating system, application programs needed by functions and the like; the storage data area may store data created according to use of the apparatus, and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory may also include a memory controller to provide the processor access to the memory.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, a server, or a similar computing device; that is, the computer device may include a mobile terminal, a computer terminal, a server, or a similar computing device. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. Taking running on a server as an example, fig. 10 is a hardware structure block diagram of a server for implementing the fault root cause analysis method according to the embodiment of the present application. As shown in fig. 10, the server 1000 may vary considerably depending on configuration or performance, and may include one or more Central Processing Units (CPUs) 1010 (the processor 1010 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 1030 for storing data, and one or more storage media 1020 (e.g., one or more mass storage devices) storing applications 1023 or data 1022. The memory 1030 and the storage media 1020 may be transient or persistent storage. The program stored in the storage medium 1020 may include one or more modules, each of which may include a series of instruction operations for the server.
Still further, the central processor 1010 may be configured to communicate with the storage medium 1020 and execute, on the server 1000, the series of instruction operations in the storage medium 1020. The server 1000 may also include one or more power supplies 1060, one or more wired or wireless network interfaces 1050, one or more input-output interfaces 1040, and/or one or more operating systems 1021, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The processor 1010 may be an integrated circuit chip having signal processing capabilities, such as a general purpose processor, a Digital Signal Processor (DSP), or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like, wherein the general purpose processor may be a microprocessor or any conventional processor.
The input-output interface 1040 may be used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by a communication provider of the server 1000. In one example, the input-output interface 1040 includes a network adapter (Network Interface Controller, NIC) that may be connected to other network devices via a base station so as to communicate with the internet. In another example, the input-output interface 1040 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
The operating system 1021 may include system programs for handling various basic system services and performing hardware related tasks, such as framework layer, core library layer, driver layer, etc., for implementing various underlying services and handling hardware based tasks.
It will be understood by those skilled in the art that the structure shown in fig. 10 is merely illustrative and is not intended to limit the structure of the electronic device. For example, server 1000 may also include more or fewer components than shown in FIG. 10, or have a different configuration than shown in FIG. 10.
Embodiments of the present application further provide a computer-readable storage medium, which may be disposed in a server to store at least one instruction or at least one program for implementing a fault root cause analysis method according to the method embodiments, where the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method according to the method embodiments.
Alternatively, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
As can be seen from the embodiments of the fault root cause analysis method, device, equipment, or storage medium provided in the present application, original time sequence information of a plurality of to-be-analyzed indicators corresponding to a to-be-analyzed component set is obtained, and implicit sequence pattern features are determined based on that original time sequence information. Determining the implicit sequence pattern features may include determining an indicator time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of to-be-analyzed indicators, and performing sequence pattern mining on the indicator time sequence ascending and descending sequence to obtain an implicit sequence pattern. The implicit sequence pattern may be a rule implicit in the changes of the plurality of to-be-analyzed indicators, for example an association relationship or a causal relationship among the changes of several indicators; subsequently combining it with the alarm logs of each component for fault root cause analysis improves the reliability of the analysis. Because the data of each indicator is continuously updated over time, the implicit sequence pattern may also change continuously, and indicators that were not associated in a past period may become associated later; the second preset time range can therefore be adjusted as required to mine the latest implicit sequence pattern in real time, which provides high flexibility and improves the timeliness of fault root cause analysis. Then, feature coding is performed on the implicit sequence pattern to obtain the implicit sequence pattern features, an alarm log of each component in the to-be-analyzed component set within a first preset time range is obtained, and the alarm log text features corresponding to the alarm log of each component within the first preset time range are determined. Then, based on a root cause association probability analysis model, fault root cause association probability analysis is performed on the components in the to-be-analyzed component set according to the alarm log text features and the implicit sequence pattern features, to obtain the fault root cause association probabilities among those components, so that the fault root cause association probabilities among several components can be determined accurately and efficiently, which greatly reduces dependence on manual work and reduces resource consumption. Finally, the fault root cause association relationship among the components in the to-be-analyzed component set is determined according to the fault root cause association probabilities among those components, which helps operation and maintenance personnel trace the source of a fault and determine its related factors, so that they can subsequently perform corresponding maintenance, avoid the recurrence of similar faults, and reduce the loss caused by faults.
It should be noted that the order of the embodiments of the present application is only for description and does not represent the relative merits of the embodiments. Specific embodiments of the present specification have been described above; other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus, device, and storage medium embodiments are substantially similar to the method embodiments, their description is relatively brief, and reference may be made to the corresponding descriptions of the method embodiments for relevant points.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method of root cause analysis of a fault, the method comprising:
acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise indexes to be analyzed corresponding to each component in the component set to be analyzed;
determining implicit sequence mode characteristics based on the original time sequence information of the plurality of indexes to be analyzed;
acquiring an alarm log of each component in the component set to be analyzed within a first preset time range;
determining alarm log text characteristics corresponding to the alarm logs of each component within a first preset time range;
based on a root cause association probability analysis model, performing fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain fault root cause association probabilities among the components in the component set to be analyzed;
and determining a fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed.
2. The method of claim 1, wherein the root cause correlation probability analysis model comprises a correlation mining module, a feature fusion layer, a feed forward layer, and a classification layer;
the performing, based on the root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain the fault root cause association probabilities among the components in the component set to be analyzed comprises:
performing relevance mining on the text features of the alarm logs based on the relevance mining module to obtain the relevance features of the alarm logs;
performing feature fusion on the alarm log correlation feature and the implicit sequence mode feature based on the feature fusion layer to obtain a target fusion feature;
performing feature processing on the target fusion feature based on the feedforward layer to obtain a processed target fusion feature;
and calculating the fault root cause association probability of the processed target fusion characteristics based on the classification layer to obtain the fault root cause association probability among the components in the component set to be analyzed.
3. The method of claim 1, wherein the determining implicit sequence pattern features based on the raw timing information of the plurality of metrics to be analyzed comprises:
determining an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed;
mining a sequence mode according to the index time sequence ascending and descending sequence to obtain an implicit sequence mode;
and carrying out characteristic coding on the implicit sequence mode to obtain the implicit sequence mode characteristic.
4. The method of claim 3, wherein the determining the index timing ascending and descending sequence within a second preset time range according to the original timing information of the plurality of indexes to be analyzed comprises:
determining time sequence ascending and descending information of the plurality of indexes to be analyzed according to the original time sequence information of the plurality of indexes to be analyzed;
and constructing an index time sequence ascending and descending sequence within the second preset time range according to the time sequence ascending and descending information of the plurality of indexes to be analyzed.
5. The method of claim 1, further comprising:
acquiring sample implicit sequence mode characteristics corresponding to a plurality of sample component sets marked with fault root cause association probabilities among components and corresponding sample alarm log text characteristics;
based on the sample implicit sequence mode characteristics corresponding to the sample component sets marked with the fault root cause association probability among the components and the corresponding sample alarm log text characteristics, training of fault root cause association probability analysis is carried out on a preset neural network model, model parameters of the preset neural network model are adjusted in the training of the fault root cause association probability analysis until the preset neural network model meets a preset convergence condition, and the fault root cause association probability analysis model is obtained.
6. The method of claim 5, wherein the obtaining sample alarm log text features corresponding to a plurality of sample component sets labeled with inter-component fault root cause association probabilities comprises:
respectively obtaining a sample alarm log of each component in the sample component set within a third preset time range;
and respectively carrying out text vectorization on the sample alarm logs to obtain corresponding text characteristics of the sample alarm logs.
7. The method according to claim 1, wherein the determining the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed comprises:
and when the fault root cause association probability meets a preset condition, determining that the fault root cause association relationship among the components in the component set to be analyzed is a fault root cause association.
8. A fault root cause analysis apparatus, the apparatus comprising:
the original time sequence information acquisition module is used for acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise indexes to be analyzed corresponding to each component in the component set to be analyzed;
the implicit sequence mode characteristic determining module is used for determining implicit sequence mode characteristics based on the original time sequence information of the plurality of indexes to be analyzed;
the alarm log acquisition module is used for acquiring an alarm log of each component in the component set to be analyzed within a first preset time range;
the text characteristic determining module is used for determining the text characteristic of the alarm log corresponding to the alarm log of each component in a first preset time range;
the fault root cause association probability analysis module is used for performing, based on a root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain the fault root cause association probabilities among the components in the component set to be analyzed;
and the fault root cause association relationship determining module is used for determining the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed.
9. A fault root cause analysis device comprising a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein at least one instruction or at least one program is stored in the storage medium, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the fault root cause analysis method according to any one of claims 1 to 7.
CN202011072717.9A 2020-10-09 2020-10-09 Fault root cause analysis method, device, equipment and storage medium Active CN112052151B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011072717.9A CN112052151B (en) 2020-10-09 2020-10-09 Fault root cause analysis method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011072717.9A CN112052151B (en) 2020-10-09 2020-10-09 Fault root cause analysis method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112052151A true CN112052151A (en) 2020-12-08
CN112052151B CN112052151B (en) 2022-02-18

Family

ID=73605513

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011072717.9A Active CN112052151B (en) 2020-10-09 2020-10-09 Fault root cause analysis method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112052151B (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699005A (en) * 2020-12-30 2021-04-23 网宿科技股份有限公司 Server hardware fault monitoring method, electronic equipment and storage medium
CN112799868A (en) * 2021-02-08 2021-05-14 腾讯科技(深圳)有限公司 Root cause determination method and device, computer equipment and storage medium
CN112804079A (en) * 2020-12-10 2021-05-14 北京浪潮数据技术有限公司 Cloud computing platform alarm analysis method, device, equipment and storage medium
CN112905371A (en) * 2021-01-28 2021-06-04 清华大学 Software change checking method and device based on heterogeneous multi-source data anomaly detection
CN112905479A (en) * 2021-03-17 2021-06-04 中通天鸿(北京)通信科技股份有限公司 Cloud platform based alarm accident root cause optimal path determination method and system
CN113177584A (en) * 2021-04-19 2021-07-27 合肥工业大学 Zero sample learning-based composite fault diagnosis method
CN113240139A (en) * 2021-06-03 2021-08-10 南京中兴新软件有限责任公司 Alarm cause and effect evaluation method, fault root cause positioning method and electronic equipment
CN113255780A (en) * 2021-05-28 2021-08-13 润联软件系统(深圳)有限公司 Reduction gearbox fault prediction method and device, computer equipment and storage medium
CN113552856A (en) * 2021-09-22 2021-10-26 成都数之联科技有限公司 Process parameter root factor positioning method and related device
CN113569083A (en) * 2021-06-17 2021-10-29 南京大学 Intelligent sound box local end digital evidence obtaining system and method based on data traceability model
CN113590451A (en) * 2021-09-29 2021-11-02 阿里云计算有限公司 Root cause positioning method, operation and maintenance server and storage medium
CN113640699A (en) * 2021-10-14 2021-11-12 南京国铁电气有限责任公司 Fault judgment method, system and equipment for microcomputer control type alternating current and direct current power supply system
CN113821418A (en) * 2021-06-24 2021-12-21 腾讯科技(深圳)有限公司 Fault tracking analysis method and device, storage medium and electronic equipment
CN113821408A (en) * 2021-09-23 2021-12-21 中国建设银行股份有限公司 Server alarm processing method and related equipment
CN113872814A (en) * 2021-09-29 2021-12-31 北京金山云网络技术有限公司 Information processing method, device and system for content distribution network
CN114490303A (en) * 2022-04-07 2022-05-13 阿里巴巴达摩院(杭州)科技有限公司 Fault root cause determination method and device and cloud equipment
CN114629776A (en) * 2020-12-11 2022-06-14 中国联合网络通信集团有限公司 Fault analysis method and device based on graph model
WO2023011618A1 (en) * 2021-08-06 2023-02-09 International Business Machines Corporation Predicting root cause of alert using recurrent neural network
CN115878421A (en) * 2022-12-09 2023-03-31 国网湖北省电力有限公司信息通信公司 Data center equipment-level fault prediction method, system and medium based on log time sequence correlation characteristic mining
CN117093407A (en) * 2023-10-19 2023-11-21 北京凡得科技有限公司 Improved S-learner-based flow anomaly cascade root cause analysis method and system
CN117527523A (en) * 2023-11-23 2024-02-06 广东堡塔安全技术有限公司 Cloud computing-based server security monitoring system
CN117656846A (en) * 2024-02-01 2024-03-08 临沂大学 Dynamic storage method for automobile electric drive fault data
CN113255780B (en) * 2021-05-28 2024-05-03 润联智能科技股份有限公司 Reduction gearbox fault prediction method and device, computer equipment and storage medium

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090049338A1 (en) * 2007-08-16 2009-02-19 Gm Global Technology Operations, Inc. Root cause diagnostics using temporal data mining
US20140129876A1 (en) * 2012-11-05 2014-05-08 Cisco Technology, Inc. Root cause analysis in a sensor-actuator fabric of a connected environment
US20150032746A1 (en) * 2013-07-26 2015-01-29 Genesys Telecommunications Laboratories, Inc. System and method for discovering and exploring concepts and root causes of events
CN105812177A (en) * 2016-03-08 2016-07-27 华为技术有限公司 Network fault processing method and processing apparatus
CN105893380A (en) * 2014-12-11 2016-08-24 成都网安科技发展有限公司 Improved text classification characteristic selection method
CN107301119A (en) * 2017-06-28 2017-10-27 北京优特捷信息技术有限公司 The method and device of IT failure root cause analysis is carried out using timing dependence
CN109358602A (en) * 2018-10-23 2019-02-19 山东中创软件商用中间件股份有限公司 A kind of failure analysis methods, device and relevant device
CN109687999A (en) * 2018-12-11 2019-04-26 山东中创软件商用中间件股份有限公司 A kind of association analysis method of alarm failure, device and equipment
CN110147387A (en) * 2019-05-08 2019-08-20 腾讯科技(上海)有限公司 A kind of root cause analysis method, apparatus, equipment and storage medium
US20190320329A1 (en) * 2017-01-26 2019-10-17 Telefonaktiebolaget Lm Ericsson (Publ) System and Method for Analyzing Network Performance Data
US20190324831A1 (en) * 2017-03-28 2019-10-24 Xiaohui Gu System and Method for Online Unsupervised Event Pattern Extraction and Holistic Root Cause Analysis for Distributed Systems
CN110609759A (en) * 2018-06-15 2019-12-24 华为技术有限公司 Fault root cause analysis method and device
CN111191230A (en) * 2019-12-27 2020-05-22 国网天津市电力公司 Fast network attack backtracking mining method based on convolutional neural network and application
CN111552609A (en) * 2020-04-12 2020-08-18 西安电子科技大学 Abnormal state detection method, system, storage medium, program and server
CN111722952A (en) * 2020-05-25 2020-09-29 中国建设银行股份有限公司 Fault analysis method, system, equipment and storage medium of business system
CN111726248A (en) * 2020-05-29 2020-09-29 北京宝兰德软件股份有限公司 Alarm root cause positioning method and device

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090049338A1 (en) * 2007-08-16 2009-02-19 Gm Global Technology Operations, Inc. Root cause diagnostics using temporal data mining
US20140129876A1 (en) * 2012-11-05 2014-05-08 Cisco Technology, Inc. Root cause analysis in a sensor-actuator fabric of a connected environment
US20150032746A1 (en) * 2013-07-26 2015-01-29 Genesys Telecommunications Laboratories, Inc. System and method for discovering and exploring concepts and root causes of events
CN105893380A (en) * 2014-12-11 2016-08-24 成都网安科技发展有限公司 Improved text classification characteristic selection method
CN105812177A (en) * 2016-03-08 2016-07-27 华为技术有限公司 Network fault processing method and processing apparatus
US20190320329A1 (en) * 2017-01-26 2019-10-17 Telefonaktiebolaget Lm Ericsson (Publ) System and Method for Analyzing Network Performance Data
US20190324831A1 (en) * 2017-03-28 2019-10-24 Xiaohui Gu System and Method for Online Unsupervised Event Pattern Extraction and Holistic Root Cause Analysis for Distributed Systems
CN107301119A (en) * 2017-06-28 2017-10-27 北京优特捷信息技术有限公司 Method and device for IT fault root cause analysis using temporal correlation
CN110609759A (en) * 2018-06-15 2019-12-24 华为技术有限公司 Fault root cause analysis method and device
CN109358602A (en) * 2018-10-23 2019-02-19 山东中创软件商用中间件股份有限公司 Fault analysis method, device and related equipment
CN109687999A (en) * 2018-12-11 2019-04-26 山东中创软件商用中间件股份有限公司 Association analysis method, device and equipment for alarm faults
CN110147387A (en) * 2019-05-08 2019-08-20 腾讯科技(上海)有限公司 Root cause analysis method, apparatus, equipment and storage medium
CN111191230A (en) * 2019-12-27 2020-05-22 国网天津市电力公司 Fast network attack backtracking mining method based on convolutional neural network and application
CN111552609A (en) * 2020-04-12 2020-08-18 西安电子科技大学 Abnormal state detection method, system, storage medium, program and server
CN111722952A (en) * 2020-05-25 2020-09-29 中国建设银行股份有限公司 Fault analysis method, system, equipment and storage medium of business system
CN111726248A (en) * 2020-05-29 2020-09-29 北京宝兰德软件股份有限公司 Alarm root cause positioning method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
IULIA GABRIELA CARJEU et al.: "Clustering IT Events around Common Root Causes", 2014 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING *
贾统 et al.: "Survey of fault diagnosis for distributed software systems based on log data", Journal of Software (软件学报) *

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112804079A (en) * 2020-12-10 2021-05-14 北京浪潮数据技术有限公司 Cloud computing platform alarm analysis method, device, equipment and storage medium
CN112804079B (en) * 2020-12-10 2023-04-07 北京浪潮数据技术有限公司 Alarm analysis method, device, equipment and storage medium for cloud computing platform
CN114629776A (en) * 2020-12-11 2022-06-14 中国联合网络通信集团有限公司 Fault analysis method and device based on graph model
CN112699005A (en) * 2020-12-30 2021-04-23 网宿科技股份有限公司 Server hardware fault monitoring method, electronic equipment and storage medium
CN112905371A (en) * 2021-01-28 2021-06-04 清华大学 Software change checking method and device based on heterogeneous multi-source data anomaly detection
CN112905371B (en) * 2021-01-28 2022-05-20 清华大学 Software change checking method and device based on heterogeneous multi-source data anomaly detection
CN112799868B (en) * 2021-02-08 2023-01-24 腾讯科技(深圳)有限公司 Root cause determination method and device, computer equipment and storage medium
CN112799868A (en) * 2021-02-08 2021-05-14 腾讯科技(深圳)有限公司 Root cause determination method and device, computer equipment and storage medium
CN112905479A (en) * 2021-03-17 2021-06-04 中通天鸿(北京)通信科技股份有限公司 Cloud platform based alarm accident root cause optimal path determination method and system
CN113177584A (en) * 2021-04-19 2021-07-27 合肥工业大学 Zero sample learning-based composite fault diagnosis method
CN113177584B (en) * 2021-04-19 2022-10-28 合肥工业大学 Compound fault diagnosis method based on zero sample learning
CN113255780A (en) * 2021-05-28 2021-08-13 润联软件系统(深圳)有限公司 Reduction gearbox fault prediction method and device, computer equipment and storage medium
CN113255780B (en) * 2021-05-28 2024-05-03 润联智能科技股份有限公司 Reduction gearbox fault prediction method and device, computer equipment and storage medium
CN113240139A (en) * 2021-06-03 2021-08-10 南京中兴新软件有限责任公司 Alarm cause and effect evaluation method, fault root cause positioning method and electronic equipment
CN113240139B (en) * 2021-06-03 2023-09-26 南京中兴新软件有限责任公司 Alarm cause and effect evaluation method, fault root cause positioning method and electronic equipment
CN113569083A (en) * 2021-06-17 2021-10-29 南京大学 Intelligent sound box local end digital evidence obtaining system and method based on data traceability model
CN113569083B (en) * 2021-06-17 2023-11-03 南京大学 Intelligent sound box local digital evidence obtaining system and method based on data tracing model
CN113821418A (en) * 2021-06-24 2021-12-21 腾讯科技(深圳)有限公司 Fault tracking analysis method and device, storage medium and electronic equipment
WO2023011618A1 (en) * 2021-08-06 2023-02-09 International Business Machines Corporation Predicting root cause of alert using recurrent neural network
US11928009B2 (en) 2021-08-06 2024-03-12 International Business Machines Corporation Predicting a root cause of an alert using a recurrent neural network
CN113552856A (en) * 2021-09-22 2021-10-26 成都数之联科技有限公司 Process parameter root factor positioning method and related device
CN113552856B (en) * 2021-09-22 2021-12-10 成都数之联科技有限公司 Process parameter root factor positioning method and related device
CN113821408A (en) * 2021-09-23 2021-12-21 中国建设银行股份有限公司 Server alarm processing method and related equipment
CN113590451A (en) * 2021-09-29 2021-11-02 阿里云计算有限公司 Root cause positioning method, operation and maintenance server and storage medium
CN113872814A (en) * 2021-09-29 2021-12-31 北京金山云网络技术有限公司 Information processing method, device and system for content distribution network
CN113640699A (en) * 2021-10-14 2021-11-12 南京国铁电气有限责任公司 Fault judgment method, system and equipment for microcomputer control type alternating current and direct current power supply system
CN113640699B (en) * 2021-10-14 2021-12-24 南京国铁电气有限责任公司 Fault judgment method, system and equipment for microcomputer control type alternating current and direct current power supply system
CN114490303B (en) * 2022-04-07 2022-07-12 阿里巴巴达摩院(杭州)科技有限公司 Fault root cause determination method and device and cloud equipment
CN114490303A (en) * 2022-04-07 2022-05-13 阿里巴巴达摩院(杭州)科技有限公司 Fault root cause determination method and device and cloud equipment
CN115878421A (en) * 2022-12-09 2023-03-31 国网湖北省电力有限公司信息通信公司 Data center equipment-level fault prediction method, system and medium based on log time sequence correlation characteristic mining
CN115878421B (en) * 2022-12-09 2023-11-14 国网湖北省电力有限公司信息通信公司 Data center equipment level fault prediction method, system and medium
CN117093407A (en) * 2023-10-19 2023-11-21 北京凡得科技有限公司 Improved S-learner-based flow anomaly cascade root cause analysis method and system
CN117093407B (en) * 2023-10-19 2024-03-19 北京凡得科技有限公司 Improved S-learner-based flow anomaly cascade root cause analysis method and system
CN117527523A (en) * 2023-11-23 2024-02-06 广东堡塔安全技术有限公司 Cloud computing-based server security monitoring system
CN117656846A (en) * 2024-02-01 2024-03-08 临沂大学 Dynamic storage method for automobile electric drive fault data
CN117656846B (en) * 2024-02-01 2024-04-19 临沂大学 Dynamic storage method for automobile electric drive fault data

Also Published As

Publication number Publication date
CN112052151B (en) 2022-02-18

Similar Documents

Publication Publication Date Title
CN112052151B (en) Fault root cause analysis method, device, equipment and storage medium
US20210342369A1 (en) Method and system for implementing efficient classification and exploration of data
CN107436875B (en) Text classification method and device
CN109871311B (en) Method and device for recommending test cases
US20140365827A1 (en) Architecture for end-to-end testing of long-running, multi-stage asynchronous data processing services
CN111914159B (en) Information recommendation method and terminal
KR101850993B1 (en) Method and apparatus for extracting keyword based on cluster
CN111104242A (en) Method and device for processing abnormal logs of operating system based on deep learning
CN114327964A (en) Method, device, equipment and storage medium for processing fault reasons of service system
CN113204621A (en) Document storage method, document retrieval method, device, equipment and storage medium
CN112800197A (en) Method and device for determining target fault information
CN114418226B (en) Fault analysis method and device for power communication system
CN113626241B (en) Abnormality processing method, device, equipment and storage medium for application program
CN115033876A (en) Log processing method, log processing device, computer device and storage medium
US20200110815A1 (en) Multi contextual clustering
CN112364185B (en) Method and device for determining characteristics of multimedia resources, electronic equipment and storage medium
CN111831536A (en) Automatic testing method and device
CN111950623B (en) Data stability monitoring method, device, computer equipment and medium
US11372904B2 (en) Automatic feature extraction from unstructured log data utilizing term frequency scores
CN110866007B (en) Information management method, system and computer equipment for big data application and table
CN112749543A (en) Matching method, device, equipment and storage medium for information analysis process
EP4198758A1 (en) Method and system for scalable acceleration of data processing pipeline
CN115563310A (en) Method, device, equipment and medium for determining key service node
CN116155541A (en) Automatic machine learning platform and method for network security application
CN114860872A (en) Data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant