CN114430363A - Fault reason positioning method, device, equipment and storage medium - Google Patents

Fault reason positioning method, device, equipment and storage medium

Info

Publication number
CN114430363A
Authority
CN
China
Prior art keywords
fault
work order
data
alarm
alarm data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011186822.5A
Other languages
Chinese (zh)
Other versions
CN114430363B (en)
Inventor
花小磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Communications Ltd Research Institute filed Critical China Mobile Communications Group Co Ltd
Priority to CN202011186822.5A priority Critical patent/CN114430363B/en
Publication of CN114430363A publication Critical patent/CN114430363A/en
Application granted granted Critical
Publication of CN114430363B publication Critical patent/CN114430363B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/064 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The invention discloses a fault cause positioning method, apparatus, device, and storage medium. The method comprises: acquiring associated alarm data of a target work order whose fault cause is to be located; determining feature data of the fault characteristics corresponding to the target work order based on the target work order and the associated alarm data; and inputting the feature data into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order. The associated alarm data comprises a plurality of alarm data within a set duration determined by the fault occurrence time of the work order; the feature data comprises alarm timing features determined based on the associated alarm data; and the fault cause positioning model is generated by training on the fault causes and feature data of historical work orders. The method can automatically fit the alarm data associated with each work order, which helps improve the accuracy and the degree of intelligence of fault cause positioning.

Description

Fault reason positioning method, device, equipment and storage medium
Technical Field
The present invention relates to the field of network technologies, and in particular, to a method, an apparatus, a device, and a storage medium for locating a failure cause.
Background
In the related art, after a network fault occurs, the network management system receives a series of alarm data reported by network devices and dispatches a work order based on the alarm data and the dispatching rules. After receiving the work order, operation and maintenance personnel investigate the fault by combining multi-dimensional information such as the alarm data at the time of the fault, locate the cause, and take corrective measures. Once the fault is resolved, they reply through the work order form with a description of the fault cause, the measures taken, and related details.
Because the fault cause must be determined manually, operation and maintenance personnel often rely on individual experience. This traditional troubleshooting mode, which depends on experience and manual operation, is inefficient and cannot meet the operation and maintenance challenges posed by large-scale, distributed, and heterogeneous networks. Fault handling is slowed, and fault cause positioning is both inefficient and poorly automated. Related techniques that locate the fault cause automatically using AI (Artificial Intelligence) often fail to meet the required positioning accuracy.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for locating a fault cause, which aim to implement automatic location of a fault cause of a network fault, and improve location accuracy and intelligence.
The technical solutions of the embodiments of the present invention are implemented as follows:
An embodiment of the present invention provides a fault cause positioning method, comprising:
acquiring associated alarm data of a target work order whose fault cause is to be located;
determining feature data of the fault characteristics corresponding to the target work order based on the target work order and the associated alarm data;
inputting the feature data into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order;
wherein the associated alarm data comprises a plurality of alarm data within a set duration determined based on the fault occurrence time of the work order; the feature data comprises alarm timing features determined based on the associated alarm data; and the fault cause positioning model is generated by training on the fault causes of historical work orders and the feature data of the historical work orders.
An embodiment of the present invention further provides a fault cause positioning apparatus, comprising:
a data association module, configured to acquire associated alarm data of a target work order whose fault cause is to be located;
a feature extraction module, configured to determine feature data of the fault characteristics corresponding to the target work order based on the target work order and the associated alarm data;
a classification module, configured to input the feature data into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order;
wherein the associated alarm data comprises a plurality of alarm data within a set duration determined based on the fault occurrence time of the work order; the feature data comprises alarm timing features determined based on the associated alarm data; and the fault cause positioning model is generated by training on the fault causes of historical work orders and the feature data of the historical work orders.
An embodiment of the present invention further provides a fault cause positioning apparatus, including: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor, when running the computer program, is configured to perform the steps of the method according to an embodiment of the invention.
The embodiment of the invention also provides a storage medium, wherein a computer program is stored on the storage medium, and when the computer program is executed by a processor, the steps of the method of the embodiment of the invention are realized.
According to the technical solution provided by the embodiments of the present invention, associated alarm data of a target work order whose fault cause is to be located is acquired; feature data of the fault characteristics corresponding to the target work order is determined based on the target work order and the associated alarm data; and the feature data is input into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order. Because the target work order is matched against the fault cause positioning model through its feature data, and the model is generated by training on the fault causes and feature data of historical work orders, the alarm data associated with each work order can be fitted automatically, which helps improve the accuracy and the degree of intelligence of fault cause positioning.
Drawings
FIG. 1 is a schematic flow chart of a fault cause positioning method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a fault cause positioning method according to an application example of the present invention;
FIG. 3 is a schematic structural diagram of a fault cause positioning apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a fault cause positioning device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
Because historical work orders accumulate a large amount of operation and maintenance experience, the related art can locate the cause of a network fault automatically using AI (Artificial Intelligence) technology. Three schemes are commonly used:
Scheme one calculates the similarity between the alarms related to the fault occurrence time and historical alarms, ranks the alarms by importance, and uses the historical fault root causes to locate the root cause of the current fault. However, the historical faults of the network come from work orders, and the root causes replied in work orders are few in type and coarse in granularity, so even an accurately located work-order-based root cause cannot support fine-grained fault operation and maintenance. In addition, a network fault triggers a series of alarms; because wireless network alarms are highly redundant, the similarity between alarms is hard to measure and an importance ranking is hard to produce, which impairs accurate positioning of the fault cause.
Scheme two constructs a fault propagation graph based on the network topology and similar information, and uses the graph to locate the fault root cause. However, the wireless network has a complex structure, its topology is hard to obtain and is not fixed, and a method that depends strictly on topology adapts poorly to network changes. Moreover, because the topology differs from province to province, the generalization ability of the method is often limited.
Scheme three locates the root cause from the correlation of indicators or events, using multi-dimensional information such as similarity measures and time series. However, the alarms related to a network fault are event data, for which similarity cannot be measured accurately. Event correlation based on frequent itemset mining introduces a large amount of noise and performs poorly, so it is difficult to meet the requirement of accurately locating the fault cause.
Based on this, in various embodiments of the present invention, associated alarm data of a target work order whose fault cause is to be located is acquired; feature data of the fault characteristics corresponding to the target work order is determined based on the target work order and the associated alarm data; and the feature data is input into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order. Because the target work order is matched against the fault cause positioning model through its feature data, and the model is generated by training on the fault causes and feature data of historical work orders, the alarm data associated with each work order can be fitted automatically, which helps improve the accuracy and the degree of intelligence of fault cause positioning.
An embodiment of the present invention provides a fault cause positioning method, which may be applied to a fault cause positioning device, as shown in fig. 1, where the method includes:
Step 101, acquiring associated alarm data of a target work order whose fault cause is to be located;
Here, the target work order may be a work order generated by the network management system based on a series of alarm data reported by network devices. The work order can be understood as indication information instructing operation and maintenance personnel to perform network maintenance, so that they can troubleshoot the fault after receiving the work order and restore the network in time.
Here, the associated alarm data includes a plurality of alarm data within a set duration determined based on the fault occurrence time of the work order.
Illustratively, acquiring the associated alarm data of the target work order whose fault cause is to be located includes:
acquiring, based on the fault occurrence time of the target work order, a plurality of alarm data corresponding to the network domain in which the fault of the target work order occurred, within a first set duration before the fault occurrence time and a second set duration after the fault occurrence time.
In an application example, the target work order includes a first field indicating the "fault occurrence time" and a second field indicating the "fault occurrence city", and the associated alarm data may include all alarm data of the "fault occurrence city" in the period from 50 minutes before to 10 minutes after the "fault occurrence time" of the target work order. In this manner, all alarm data surrounding the target work order can be collected.
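The time-window association in this application example can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Alarm` record, the function name `associate_alarms`, the sample city names and timestamps, and the choice of inclusive window bounds are all assumptions made for the example; only the 50-minute and 10-minute offsets and the city filter come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Alarm:
    # Hypothetical alarm record; real alarm data carries many more fields.
    title: str
    city: str
    time: datetime


def associate_alarms(alarms, fault_time, fault_city,
                     before=timedelta(minutes=50), after=timedelta(minutes=10)):
    """Return alarms from fault_city inside [fault_time - before, fault_time + after].

    Inclusive bounds are an assumption; the text does not specify them.
    """
    start, end = fault_time - before, fault_time + after
    return [a for a in alarms if a.city == fault_city and start <= a.time <= end]


fault_time = datetime(2020, 10, 29, 12, 0)
alarms = [
    Alarm("link down", "CityA", datetime(2020, 10, 29, 11, 15)),           # inside window
    Alarm("cell out of service", "CityA", datetime(2020, 10, 29, 12, 5)),  # inside window
    Alarm("link down", "CityA", datetime(2020, 10, 29, 10, 0)),            # too early
    Alarm("link down", "CityB", datetime(2020, 10, 29, 11, 30)),           # different city
]
associated = associate_alarms(alarms, fault_time, "CityA")
```

Keeping the two durations as parameters mirrors the "first set duration" and "second set duration" of the method, so other window sizes can be tried without changing the filter.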
Step 102, determining feature data of the fault characteristics corresponding to the target work order based on the target work order and the associated alarm data;
Here, the feature data includes alarm timing features determined based on the associated alarm data.
Illustratively, the determining feature data of the fault characteristic corresponding to the target work order based on the target work order and the associated alarm data includes:
extracting attribute information of the target work order, and coding the attribute information to obtain coding information;
counting each alarm data in the associated alarm data based on an alarm title, and determining the statistical characteristics of the alarm title;
and sequencing all alarm data in the associated alarm data based on an alarm time sequence, and determining the alarm time sequence characteristics.
For example, the attribute information of the target work order may include the work order title (also called the alarm title), the network element name, and the network equipment manufacturer. These fields are enumerated types: their values have no order or magnitude relationship and cannot be input into the model directly. In the embodiment of the present invention, each field may be encoded with one-hot encoding, and the resulting one-hot vectors may be spliced together to serve as the coding information of the target work order.
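As a rough sketch of this encoding step, the snippet below one-hot encodes three enumerated fields and splices the vectors. The field keys (`title`, `ne_name`, `vendor`), the sample values, and the decision to map unseen values to an all-zero vector are assumptions for illustration only.

```python
def build_vocab(values):
    """Map each distinct enumerated value to a fixed index (sorted for determinism)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def one_hot(value, vocab):
    """Return a 0/1 vector with a single 1 at the index of value; all zeros if unseen."""
    vec = [0] * len(vocab)
    if value in vocab:
        vec[vocab[value]] = 1
    return vec


FIELDS = ("title", "ne_name", "vendor")  # hypothetical field keys


def encode_work_order(order, vocabs):
    """Concatenate the one-hot vectors of the enumerated fields in a fixed order."""
    encoded = []
    for field in FIELDS:
        encoded.extend(one_hot(order[field], vocabs[field]))
    return encoded


history = [
    {"title": "cell out of service", "ne_name": "NE-1", "vendor": "VendorX"},
    {"title": "link down", "ne_name": "NE-2", "vendor": "VendorY"},
]
vocabs = {f: build_vocab(o[f] for o in history) for f in FIELDS}
encoded = encode_work_order(history[1], vocabs)
```

The vocabularies are built once from historical work orders and reused for every target work order, so the spliced vector always has the same length and layout.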
Illustratively, the performing statistics on each alarm data in the associated alarm data based on an alarm title and determining the statistical characteristics of the alarm title include:
counting the alarm titles of all alarm data in the associated alarm data based on term frequency-inverse document frequency (TF-IDF) to obtain a first characterization vector;
and performing vector dimensionality reduction on the first characterization vector based on Non-Negative Matrix Factorization (NMF) to obtain a second characterization vector serving as the statistical characteristic of the alarm title.
Here, TF-IDF is a weighting technique used in information retrieval and data mining to evaluate how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases with the frequency with which it appears across the corpus. Counting the alarm titles of the associated alarm data with TF-IDF therefore helps characterize each fault, and by effectively reducing the weight of high-frequency alarm titles that appear in all kinds of faults, it helps strengthen the learning ability of the model. In addition, NMF-based dimensionality reduction produces a reduced matrix of the data features from the original matrix, which saves storage space while effectively retaining the important information of the original alarm data. For example, if a work order is associated with hundreds of alarm data, the corresponding first characterization vector may have hundreds of dimensions and be highly sparse; the second characterization vector obtained after dimensionality reduction retains the important information of the original alarm data well at a much lower dimension.
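To make the weighting concrete, here is a small TF-IDF sketch in which each work order's associated alarm titles form one "document" and each alarm title is one "term". The smoothed IDF formula `log((1 + N) / (1 + df)) + 1` is one common variant chosen for this example; the patent does not state which formula it uses, and the sample titles are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of lists of alarm titles, one list per work order.

    Returns (vocab, vectors), where vectors[i][j] is the TF-IDF weight of
    vocab[j] among the alarms associated with work order i.
    """
    vocab = sorted({t for doc in docs for t in doc})
    n_docs = len(docs)
    # Document frequency: number of work orders whose alarms contain the title.
    df = Counter(t for doc in docs for t in set(doc))
    # Smoothed IDF (an assumed variant, see the lead-in).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against a work order with no alarms
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


docs = [
    ["link down", "link down", "power failure"],
    ["link down", "clock lost"],
    ["link down", "clock lost", "clock lost"],
]
vocab, vectors = tfidf_vectors(docs)
```

In the second work order, "link down" and "clock lost" occur equally often, yet "clock lost" receives the larger weight because "link down" appears in every work order. This is the down-weighting of ubiquitous high-frequency alarm titles that the paragraph describes.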
Illustratively, the sorting the alarm data in the associated alarm data based on an alarm time sequence and determining the alarm time sequence characteristic includes:
sorting the alarm data in the associated alarm data by alarm time, and performing vector conversion on the sorted data based on a word vector model to obtain a third characterization vector serving as the alarm timing characteristic.
Here, after the wireless network fails, the peripheral devices first report alarm data; the scope of the fault then expands step by step, and more alarm data is reported over time, so the alarm data associated with the work order has a temporal order. In the embodiment of the present invention, the alarm titles of the associated alarm data can be sorted in time order and converted into the third characterization vector based on a Word2vec model, thereby obtaining the alarm timing characteristics of the associated alarm data.
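The sketch below shows only how temporal order can enter a Word2vec-style model: alarm titles are sorted by time, and skip-gram (center, context) pairs are taken from neighbouring positions, which is the kind of input a skip-gram word-vector model trains on. Training the embedding itself and turning it into the final vector are left out, since the text does not describe them; the window size, sample alarms, and function names are assumptions.

```python
from datetime import datetime


def ordered_titles(alarms):
    """alarms: list of (time, title) tuples. Return the titles sorted by alarm time."""
    return [title for _, title in sorted(alarms, key=lambda a: a[0])]


def skipgram_pairs(sequence, window=1):
    """Return (center, context) pairs for items at most `window` positions apart."""
    pairs = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        pairs.extend((center, sequence[j]) for j in range(lo, hi) if j != i)
    return pairs


alarms = [
    (datetime(2020, 10, 29, 12, 3), "cell out of service"),
    (datetime(2020, 10, 29, 11, 58), "power failure"),
    (datetime(2020, 10, 29, 12, 0), "link down"),
]
sequence = ordered_titles(alarms)
pairs = skipgram_pairs(sequence, window=1)
```

Because the pairs are drawn from the time-ordered sequence, alarm titles that tend to follow one another during fault propagation become each other's context, which is how the ordering can influence the learned vectors.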
Step 103, inputting the feature data into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order;
Here, the fault cause positioning model is generated by training on the fault causes of historical work orders and the feature data of the historical work orders.
For example, the coding information, the second characterization vector, and the third characterization vector corresponding to the target work order may be spliced and used as the input of the fault cause positioning model, and the fault cause positioning model outputs the target fault cause of the target work order.
In the embodiment of the present invention, because the target work order is matched against the fault cause positioning model through its feature data, and the model is generated by training on the fault causes and feature data of historical work orders, the alarm data associated with each work order can be fitted automatically, which helps improve the accuracy and the degree of intelligence of fault cause positioning.
Because the fault cause positioning model needs to be trained in advance, in an embodiment, before acquiring the associated alarm data of the target work order whose fault cause is to be located, the method further includes:
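The splicing step amounts to concatenating the three feature groups in a fixed order. A minimal sketch follows; the function name and the sample numbers are made up for illustration, and the model that consumes the result is not shown because the text does not specify its type.

```python
def splice_features(encoded_info, title_stats, timing_feature):
    """Concatenate the three feature groups, always in the same order, as floats."""
    return [float(x) for x in (*encoded_info, *title_stats, *timing_feature)]


# Made-up example values: one-hot coding, reduced TF-IDF vector, timing vector.
model_input = splice_features([0, 1, 0], [0.42, 0.07], [0.1, -0.3, 0.8])
```

The same order has to be used for historical work orders during training and for the target work order at inference, otherwise the model would see the features in different positions.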
acquiring the characteristic data of a plurality of historical work orders;
extracting the fault reason of each historical work order;
and training an initial fault reason positioning model based on the characteristic data of each historical work order and the extracted fault reason to obtain the trained fault reason positioning model.
Here, the characteristic data of each historical work order may refer to a determination process of the characteristic data of the target work order, and details are not described here.
In the embodiment of the present invention, the fault cause of a historical work order can be extracted from the processing measures fed back in that work order. However, the description text of work order processing measures is highly specialized and individualized and lacks large-scale labeling, so text mining is very difficult. Based on this, in an embodiment, the extracting the fault cause of each historical work order includes:
inputting the processing measures of each historical work order into a fault theme extraction model to obtain the classification result of the fault theme of each historical work order;
based on a clustering algorithm, determining that the classification result belongs to an existing target fault subject, and taking the target fault subject as the fault reason of the corresponding historical work order; or determining that the classification result is a new fault subject, adding the new fault subject and using the new fault subject as the fault reason of the corresponding historical work order;
wherein the fault topic extraction model is generated based on attention mechanism training.
It can be understood that the fault topic extraction model mines the fault topic of a network fault rather than the specific fault cause, so that similar fault causes can be classified into the same fault topic. This improves the generality of the extracted information and addresses the strong individualization of the text.
To address the highly specialized text, the fault topic extraction model introduces an attention mechanism, which allows the training process to distinguish how much each word in the processing measures contributes to the generated fault topic.
To address the shortage of labeled samples, the fault topic extraction model in the embodiment of the present invention may use the pre-trained deep learning model BERT (Bidirectional Encoder Representations from Transformers) as the base model and fine-tune its parameters to implement fault topic extraction, thereby reducing the dependence on large-scale data labeling.
Illustratively, the fault topic extraction model uses the attention mechanism to extract a fault topic vector ThemeVec, where ThemeVec values belonging to the same fault topic are highly similar and ThemeVec values belonging to different fault topics are less similar. A center vector and a maximum interval are calculated for each fault topic based on a clustering algorithm, and new categories identified through text similarity calculation are uniformly placed in an "other" category, which completes the self-updating of new fault topics of network faults. If the fault cause corresponding to a historical work order is a new fault topic, operation and maintenance personnel can add the label value of the new fault topic and use it as the fault cause.
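One way to read the center vector and maximum interval is sketched below: each known topic is summarized by the mean of its vectors and by the largest distance from a member to that mean, and a new vector is assigned to the nearest topic only if it falls within that distance. Euclidean distance, the 2-dimensional toy vectors, and returning `None` for a candidate new topic are assumptions of this sketch; the text does not specify the distance measure or the exact decision rule.

```python
import math


def distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def fit_topics(labelled):
    """labelled: dict topic -> list of vectors.

    Returns topic -> (center, max_interval), where max_interval is the largest
    member-to-center distance within the topic.
    """
    topics = {}
    for name, vecs in labelled.items():
        dim = len(vecs[0])
        center = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        topics[name] = (center, max(distance(v, center) for v in vecs))
    return topics


def assign_topic(vec, topics):
    """Return the nearest topic if vec lies within its max interval, else None."""
    name, (center, radius) = min(topics.items(),
                                 key=lambda kv: distance(vec, kv[1][0]))
    return name if distance(vec, center) <= radius else None


# Toy 2-dimensional stand-ins for ThemeVec; real vectors would be higher-dimensional.
topics = fit_topics({
    "transmission": [[0.0, 0.0], [0.0, 2.0]],
    "power": [[10.0, 10.0], [12.0, 10.0]],
})
known = assign_topic([0.0, 1.5], topics)
unknown = assign_topic([5.0, 5.0], topics)
```

A `None` result marks a work order whose topic vector matches no existing topic, which is the case in which the text has operation and maintenance personnel add a new fault topic label.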
In an embodiment, the method for locating the fault cause further includes:
determining statistical data of the fault reasons corresponding to the work order titles based on the work order titles of the historical work orders and the extracted fault reasons;
training an initial fault cause positioning model based on the feature data of each historical work order and the extracted fault cause to obtain the trained fault cause positioning model, wherein the training comprises the following steps:
and training an initial fault reason positioning model based on the characteristic data of each historical work order, the extracted fault reason and the statistical data of the fault reason corresponding to the work order title to obtain the trained fault reason positioning model.
Here, the statistical data of the fault causes corresponding to each work order title can be understood as expert experience. Introducing this historical expert experience into the training of the fault cause positioning model further improves the accuracy of fault cause positioning. For example, if the statistics show that the fault causes corresponding to a specific work order title are unevenly distributed, the weight of the high-probability fault cause can be increased, further improving the accuracy of fault cause positioning.
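The per-title statistics can be illustrated as an empirical distribution of fault causes for each work order title. The sample history and the function name are invented for the example, and how the resulting probabilities are fed into training (for instance as class weights or as extra input features) is not specified in the text and is left open here.

```python
from collections import Counter, defaultdict


def cause_distribution(history):
    """history: iterable of (work_order_title, fault_cause) pairs.

    Returns title -> {cause: empirical probability of that cause for the title}.
    """
    counts = defaultdict(Counter)
    for title, cause in history:
        counts[title][cause] += 1
    return {
        title: {cause: n / sum(c.values()) for cause, n in c.items()}
        for title, c in counts.items()
    }


history = [
    ("cell out of service", "power"),
    ("cell out of service", "power"),
    ("cell out of service", "transmission"),
    ("link down", "transmission"),
]
priors = cause_distribution(history)
```

For the title "cell out of service" the "power" cause is twice as frequent as "transmission", so a scheme that raises the weight of the high-probability cause would favour "power" for work orders with that title.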
The following describes embodiments of the present invention in further detail with reference to application examples.
FIG. 2 is a schematic diagram illustrating the principle of the method for locating the cause of a wireless network fault. As shown in FIG. 2, the method includes:
Step 1, data association;
Here, a wireless network fault triggers a series of alarms. The key alarm triggers the dispatch of the work order, and the other alarms are discarded. If only the alarm that triggered the work order is used, the fault characteristics at the time of the fault cannot be fully reflected, and the model is prone to overfitting. In this application example, all alarms surrounding the work order are collected, which improves the learning ability of the model. Specifically, using the "fault occurrence time" and "fault occurrence city" fields of the fault work order, all alarms of the "fault occurrence city" in the period from 50 minutes before to 10 minutes after the "fault occurrence time" are associated with the work order.
Step 2, feature extraction;
Feature extraction processes the input data of the fault root cause positioning method so as to better characterize the wireless network fault at the time it occurs. The input data includes the work order data and the alarm data associated with the work order. The feature data obtained after feature extraction includes:
1) One-hot features. The three fields in the work order, namely the alarm title, the network element name, and the network element equipment manufacturer, are enumerated types; their values have no order or magnitude relationship and cannot be input into the model directly. The method one-hot encodes the three fields and splices the three resulting vectors to serve as the one-hot features of the work order.
2) Statistical features of the alarm titles. When a wireless network fails, different faults cause different high-frequency alarms. The application example uses the distribution of alarm titles around the fault occurrence time to better characterize different faults and strengthen the learning ability of the model. Some high-frequency alarms appear in every fault, and counting them contributes nothing to distinguishing faults; to suppress them, the method completes the alarm title statistics with TF-IDF. However, each work order may be associated with hundreds of alarms, so the corresponding TF-IDF characterization vector has hundreds of dimensions and is highly sparse. The vector is therefore further reduced in dimension based on NMF, which retains the important information of the original alarms well while reducing the dimension.
3) Alarm timing features. When a wireless network fails, the peripheral devices are triggered to report alarms as soon as the fault occurs; the scope of the fault then expands step by step, and more alarms are reported over time, so the alarm data associated with a fault work order has a temporal order.
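The NMF dimensionality reduction mentioned in item 2) can be sketched with the classic multiplicative update rules, which factor a non-negative matrix V (one row per work order) into W and H so that each row of W is a lower-dimensional representation. This pure-Python version is only for illustration: the toy matrix stands in for TF-IDF vectors, and the rank, iteration count, seed, and initialization are arbitrary assumptions. A production system would normally use a numerical library.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of v - w @ h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor non-negative v (n x m) into w (n x k) and h (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Toy non-negative matrix standing in for 4 work orders x 4 alarm-title weights.
v = [[1, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 3], [0, 0, 1, 1]]
w, h = nmf(v, k=2)
```

Here each 4-dimensional row of `v` is replaced by a 2-dimensional row of `w`, which plays the role of the reduced characterization vector.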
Step 3, extracting fault subjects;
here, the detailed fault cause of the historical fault work order may be extracted from the processing measures fed back by the work order, and the extracted detailed fault cause is classified into different fault subjects as the output of the root cause positioning method. However, the work order processing measure description text has strong speciality and individuation, is short of large-scale marking, and has great difficulty in text mining.
In the application example, a fault topic extraction model based on an attention mechanism is provided. Similar fault reasons can be classified into the same fault topic by mining the fault topic of the wireless network instead of specific fault reasons, so that the universality of extracted information is improved, and the problem of strong text personalization is solved.
In the application example, the fault topic extraction model may use the pre-trained deep learning model BERT as its base model, and fault topic extraction is realized by fine-tuning parameters on the pre-trained base model, which reduces the dependence on large-scale data labeling.
As shown in fig. 2, the fault topic extraction model includes: Embedding, Encoder, Decoder, and Linear. An attention mechanism is introduced in the Encoder and the Decoder to reflect that each word in the fault processing measures contributes differently to the generated fault topic. Illustratively, the input of the fault topic extraction model is the text of the processing measures fed back in a work order, such as "the transmission 2M line was damaged, resulting in an error alarm; after the transmission 2M line was replaced, the fault was recovered". Embedding converts the text data into a first vector. The Encoder converts the first vector into a high-dimensional second vector, which contains information about each character itself and about its surrounding characters. The Decoder maps the high-dimensional second vector toward the output fault topic. Assuming that the Decoder output is 128-dimensional and the fault topic is 64-dimensional, Linear completes the mapping from the Decoder vector to the fault topic, that is, from 128 dimensions to 64 dimensions.
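The two ideas in this paragraph, attention weights that differ per word and a final linear mapping to a lower dimension, can be illustrated with a toy sketch. This is not the BERT-based model itself: the token vectors, the query vector, the weight matrix `W`, and the 4-to-2 dimension mapping (standing in for 128 to 64) are all made-up values chosen only to show the mechanics.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vecs, query):
    """Weight each token by its scaled dot-product score against the query."""
    d = len(query)
    scores = [
        sum(q * x for q, x in zip(query, v)) / math.sqrt(d) for v in token_vecs
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * v[i] for w, v in zip(weights, token_vecs)) for i in range(d)
    ]
    return weights, pooled


def linear(vec, weight, bias):
    """Linear layer; weight has shape (out_dim, in_dim)."""
    return [
        sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)
    ]


# Toy 4-dimensional token vectors (hypothetical values).
tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
query = [0.0, 1.0, 0.0, 0.0]
weights, pooled = attention_pool(tokens, query)

# Map the pooled 4-dimensional vector to a 2-dimensional topic vector.
W = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 1.0]]
b = [0.0, 0.0]
theme_vec = linear(pooled, W, b)
```

Here the second token aligns best with the query and therefore receives the largest attention weight, mirroring the statement that words contribute unequally to the fault topic.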
The fault topic extraction model uses the attention mechanism to extract fault topic vectors ThemeVec, where ThemeVec vectors belonging to the same fault topic are highly similar and ThemeVec vectors belonging to different fault topics have low similarity. The central vector and the maximum interval of each fault topic are calculated based on a clustering algorithm, and texts that fall outside every existing fault topic, as determined by text similarity calculation, are uniformly placed in the "other" class, which completes the classification of fault topics for wireless network faults.
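One possible reading of this step is sketched below. It assumes cosine similarity as the similarity measure and uses the lowest member-to-center similarity of each topic as its boundary, which plays the role of the "maximum interval"; both choices, along with the names `topic_profiles` and `assign_topic` and the 2-dimensional sample vectors, are assumptions rather than details given in the text.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def topic_profiles(labeled_vecs):
    """Map each topic to (center vector, lowest member similarity to center)."""
    profiles = {}
    for topic, vecs in labeled_vecs.items():
        dim = len(vecs[0])
        center = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        boundary = min(cosine(v, center) for v in vecs)
        profiles[topic] = (center, boundary)
    return profiles


def assign_topic(vec, profiles):
    """Return the best matching topic, or "other" if vec is outside every boundary."""
    best_topic, best_sim = "other", -1.0
    for topic, (center, boundary) in profiles.items():
        sim = cosine(vec, center)
        if sim >= boundary and sim > best_sim:
            best_topic, best_sim = topic, sim
    return best_topic


# Hypothetical 2-dimensional ThemeVec samples for two known topics.
profiles = topic_profiles({
    "transmission fault": [[1.0, 0.1], [0.9, 0.2]],
    "power fault": [[0.1, 1.0], [0.2, 0.9]],
})
known = assign_topic([1.0, 0.15], profiles)
unknown = assign_topic([1.0, 1.0], profiles)
```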
Step 4: root cause positioning based on expert experience sharing.
Wireless networks lack a knowledge sharing mechanism, so a large amount of operation and maintenance experience cannot deliver its value. To address this problem, after the fault topics are extracted by the fault topic extraction model, this application example also computes, for the work order title of each fault work order, the distribution of historical fault topics corresponding to that title, and feeds this distribution into the positioning method of the application example as historical expert operation and maintenance experience. By drawing on this operation and maintenance experience, expert experience is shared and root cause positioning accuracy is improved.
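The per-title statistics described here can be sketched as a simple frequency count. The sketch assumes the history is available as (work order title, fault topic) pairs; the function name and the sample titles and topics are hypothetical.

```python
from collections import Counter, defaultdict


def topic_distribution_by_title(history):
    """Map each work order title to the relative frequency of its fault topics.

    history: iterable of (work_order_title, fault_topic) pairs.
    """
    counts = defaultdict(Counter)
    for title, topic in history:
        counts[title][topic] += 1
    return {
        title: {topic: n / sum(c.values()) for topic, n in c.items()}
        for title, c in counts.items()
    }


# Hypothetical historical work orders.
history = [
    ("Base station out of service", "transmission fault"),
    ("Base station out of service", "transmission fault"),
    ("Base station out of service", "power fault"),
    ("Cell degraded", "hardware fault"),
]
dist = topic_distribution_by_title(history)
```

The resulting distribution for a title can then be appended to the other features of a work order with the same title, so that the classifier sees what past handling of similar work orders concluded.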
In the application example, fault root cause positioning for the wireless network is realized with a fault cause positioning model, which replaces fault propagation graph schemes that depend on the topological structure. In particular, a classifier based on a tree model may be employed. For example, for large sample sizes and high-dimensional data, the more efficient LightGBM model may be used as the fault cause positioning model. The input of the LightGBM model is the feature extraction result of step 2, and the output of the model is the fault topic extracted in step 3. Before fault cause positioning, the fault cause positioning model needs to be trained on historical work orders to obtain a trained fault cause positioning model. It can be understood that the trained fault cause positioning model can also be updated based on the data of newly added historical work orders.
Step 5: discovering new causes.
After the operation and maintenance personnel of the wireless network finish handling a fault, if the recommended fault root cause is "other", they describe the newly found fault cause topic in the "processing measure" field when returning the work order; that is, they add the label of the new fault topic as the fault cause, which completes the self-updating of the fault topics.
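The self-updating behavior can be sketched as follows, assuming the known fault topics are kept in a plain list and the operator's new topic label arrives as a string; the function name `update_topics` and the sample labels are hypothetical.

```python
def update_topics(known_topics, recommended, reported_topic):
    """Return the topic to record for a returned work order.

    If the recommendation was "other" and the operator reported a new topic,
    the new label is appended to known_topics in place.
    """
    if recommended != "other":
        return recommended
    label = reported_topic.strip()
    if label and label not in known_topics:
        known_topics.append(label)
    return label or "other"


topics = ["transmission fault", "power fault"]
recorded_new = update_topics(topics, "other", "antenna feeder fault")
recorded_old = update_topics(topics, "power fault", "")
```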
The fault cause positioning method of the application example has the following effects. First, based on a deep learning method, fault topics are extracted from the processing measures fed back in work orders, so the fault causes corresponding to work orders can be expanded, which supports refined fault operation and maintenance of the wireless network. Second, new fault topics of the wireless network identified by the clustering algorithm allow new fault causes, for example in the 5G field, to be discovered automatically, and the fault causes can be updated through manual labeling combined with automatic updating of the cause labels. Third, machine learning that integrates expert experience is provided; after the expert experience is integrated, the positioning accuracy can be improved by about 18%, while the universality of the method is preserved. Beyond the field of wireless networks, the method can also be applied in the field of core networks, where high accuracy can still be maintained.
In order to implement the method according to the embodiment of the present invention, an embodiment of the present invention further provides a fault cause positioning apparatus, where the fault cause positioning apparatus corresponds to the fault cause positioning method, and each step in the fault cause positioning method is also completely applicable to the fault cause positioning apparatus according to the embodiment of the present invention.
As shown in fig. 3, the fault cause locating device includes: the system comprises a data association module 301, a feature extraction module 302 and a classification module 303, wherein the data association module 301 is used for acquiring association alarm data of a target work order based on the target work order to be positioned by a fault reason; the feature extraction module 302 is configured to determine feature data of a fault characteristic corresponding to the target work order based on the target work order and the associated alarm data; the classification module 303 is configured to input the feature data into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order; the associated alarm data includes: based on a plurality of alarm data in a set time length determined by the fault occurrence time of the work order, the characteristic data comprises: the fault cause positioning model is generated based on the fault cause of the historical work order and the characteristic data of the historical work order in a training mode.
In some embodiments, the data association module 301 is specifically configured to:
based on the fault occurrence time of a target work order to be subjected to fault location, acquiring a plurality of alarm data corresponding to a fault occurrence network domain of the target work order within a first time length before the fault occurrence time and a second set time length after the fault occurrence time.
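The time window association performed by the data association module can be sketched as a filter over alarm records. The sketch assumes each alarm is a dictionary with `time` (a `datetime`) and `domain` fields; these field names, the window lengths, and the sample data are illustrative assumptions.

```python
from datetime import datetime, timedelta


def associate_alarms(fault_time, domain, alarms, before, after):
    """Select alarms in the fault's network domain within [fault_time - before, fault_time + after]."""
    start, end = fault_time - before, fault_time + after
    return [
        a for a in alarms
        if a["domain"] == domain and start <= a["time"] <= end
    ]


fault_time = datetime(2020, 10, 29, 10, 0, 0)
alarms = [
    {"time": datetime(2020, 10, 29, 9, 50), "domain": "wireless", "title": "LINK_DOWN"},
    {"time": datetime(2020, 10, 29, 10, 20), "domain": "wireless", "title": "CELL_OUT_OF_SERVICE"},
    {"time": datetime(2020, 10, 29, 10, 5), "domain": "core", "title": "SESSION_DROP"},
    {"time": datetime(2020, 10, 29, 8, 0), "domain": "wireless", "title": "CLOCK_DRIFT"},
]
# Hypothetical window: 15 minutes before and 30 minutes after the fault time.
associated = associate_alarms(
    fault_time, "wireless", alarms,
    before=timedelta(minutes=15), after=timedelta(minutes=30),
)
```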
In some embodiments, the feature extraction module 302 is specifically configured to:
extracting attribute information of the target work order, and coding the attribute information to obtain coding information;
counting each alarm data in the associated alarm data based on an alarm title, and determining the statistical characteristics of the alarm title;
and sequencing all alarm data in the associated alarm data based on an alarm time sequence, and determining the alarm time sequence characteristics.
In some embodiments, the feature extraction module 302 is specifically configured to:
counting alarm titles of all alarm data in the associated alarm data based on a word frequency-inverse text frequency index to obtain a first characterization vector;
and carrying out vector dimensionality reduction on the first characterization vector based on non-negative matrix decomposition to obtain a second characterization vector serving as the statistical characteristic of the alarm title.
In some embodiments, the feature extraction module 302 is specifically configured to:
and based on a word vector model, carrying out vector conversion on the data of each alarm data in the associated alarm data after the alarm data are sequenced based on the alarm time sequence to obtain a third feature vector serving as the alarm time sequence feature.
In some embodiments, the fault cause locating device further comprises: a training module 304 to:
acquiring the characteristic data of a plurality of historical work orders;
extracting the fault reason of each historical work order;
and training an initial fault reason positioning model based on the characteristic data of each historical work order and the extracted fault reason to obtain the trained fault reason positioning model.
The training module 304 extracts the fault reason of each historical work order, including:
inputting the processing measures of each historical work order into a fault theme extraction model to obtain the classification result of the fault theme of each historical work order;
based on a clustering algorithm, determining that the classification result belongs to an existing target fault subject, and taking the target fault subject as the fault reason of the corresponding historical work order; or determining that the classification result is a new fault subject, adding the new fault subject and using the new fault subject as the fault reason of the corresponding historical work order;
wherein the fault topic extraction model is generated based on attention mechanism training.
In some embodiments, training module 304 is further configured to:
determining statistical data of the fault reasons corresponding to the work order titles based on the work order titles of the historical work orders and the extracted fault reasons;
correspondingly, the training an initial fault cause location model based on the feature data of each historical work order and the extracted fault cause to obtain the trained fault cause location model includes:
and training an initial fault reason positioning model based on the characteristic data of each historical work order, the extracted fault reason and the statistical data of the fault reason corresponding to the work order title to obtain the trained fault reason positioning model.
In actual application, the data association module 301, the feature extraction module 302, the classification module 303, and the training module 304 may be implemented by a processor in the fault cause positioning apparatus. Of course, the processor needs to run a computer program in memory to implement its functions.
It should be noted that: in the fault cause location device provided in the above embodiment, when locating the fault cause, only the division of each program module is taken as an example, and in practical applications, the above processing distribution may be completed by different program modules according to needs, that is, the internal structure of the device is divided into different program modules to complete all or part of the above-described processing. In addition, the fault cause positioning apparatus and the fault cause positioning method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.
Based on the hardware implementation of the program module, and in order to implement the method according to the embodiment of the present invention, an embodiment of the present invention further provides a fault cause positioning device. Fig. 4 shows only an exemplary structure of the fault cause locating apparatus, not the entire structure, and a part or the entire structure shown in fig. 4 may be implemented as necessary.
As shown in fig. 4, the fault cause locating apparatus 400 according to an embodiment of the present invention includes: at least one processor 401, memory 402, a user interface 403, and at least one network interface 404. The various components in the fault cause locating device 400 are coupled together by a bus system 405. It will be appreciated that the bus system 405 is used to enable communications among the components. The bus system 405 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 405 in fig. 4.
The user interface 403 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.
The memory 402 in embodiments of the present invention is used to store various types of data to support the operation of the fault cause locating device. Examples of such data include: any computer program for operating on a fault cause location device.
The method for locating the fault cause disclosed by the embodiment of the invention can be applied to the processor 401, or can be implemented by the processor 401. The processor 401 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the fault cause locating method may be implemented by hardware integrated logic circuits or instructions in the form of software in the processor 401. The Processor 401 described above may be a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 401 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software module may be located in a storage medium located in the memory 402, and the processor 401 reads information in the memory 402, and completes the steps of the fault cause locating method provided by the embodiment of the present invention in combination with hardware thereof.
In an exemplary embodiment, the fault cause locating Device may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), FPGAs, general purpose processors, controllers, Micro Controllers (MCUs), microprocessors (microprocessors), or other electronic components for performing the aforementioned methods.
It will be appreciated that the memory 402 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disk, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory can be Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory described in embodiments of the present invention is intended to comprise, without being limited to, these and any other suitable types of memory.
In an exemplary embodiment, the embodiment of the present invention further provides a storage medium, that is, a computer storage medium, which may be specifically a computer readable storage medium, for example, a memory 402 storing a computer program, where the computer program is executable by a processor 401 of a fault cause locating device to complete the steps described in the method of the embodiment of the present invention. The computer readable storage medium may be a ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM, among others.
It should be noted that: "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
In addition, the technical solutions described in the embodiments of the present invention may be arbitrarily combined without conflict.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (11)

1. A fault cause positioning method is characterized by comprising the following steps:
acquiring associated alarm data of a target work order based on the target work order to be positioned by a fault reason;
determining characteristic data of fault characteristics corresponding to the target work order based on the target work order and the associated alarm data;
inputting the characteristic data into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order;
wherein the associated alarm data comprises: based on a plurality of alarm data in a set time length determined by the fault occurrence time of the work order, the characteristic data comprises: the fault cause positioning model is generated based on the fault cause of the historical work order and the characteristic data of the historical work order in a training mode.
2. The method according to claim 1, wherein the obtaining of the associated alarm data of the target work order based on the target work order to be located by the fault cause comprises:
based on the fault occurrence time of a target work order to be subjected to fault location, acquiring a plurality of alarm data corresponding to a fault occurrence network domain of the target work order within a first time length before the fault occurrence time and a second set time length after the fault occurrence time.
3. The method of claim 1, wherein determining the characteristic data of the fault characteristic corresponding to the target work order based on the target work order and the associated alarm data comprises:
extracting attribute information of the target work order, and coding the attribute information to obtain coding information;
counting each alarm data in the associated alarm data based on an alarm title, and determining the statistical characteristics of the alarm title;
and sequencing all alarm data in the associated alarm data based on an alarm time sequence, and determining the alarm time sequence characteristics.
4. The method according to claim 3, wherein the counting each alarm data in the associated alarm data based on an alarm title, and determining the statistical characteristics of the alarm title comprises:
counting alarm titles of all alarm data in the associated alarm data based on a word frequency-inverse text frequency index to obtain a first characterization vector;
and carrying out vector dimensionality reduction on the first characterization vector based on non-negative matrix decomposition to obtain a second characterization vector serving as the statistical characteristic of the alarm title.
5. The method of claim 3, wherein the sorting the alarm data of the associated alarm data based on alarm timing to determine the alarm timing characteristic comprises:
and based on a word vector model, carrying out vector conversion on the data of each alarm data in the associated alarm data after the alarm data are sequenced based on the alarm time sequence to obtain a third feature vector serving as the alarm time sequence feature.
6. The method according to claim 1, wherein before the acquiring of the associated alarm data of the target work order based on the target work order to be positioned by a fault reason, the method further comprises:
acquiring the characteristic data of a plurality of historical work orders;
extracting the fault reason of each historical work order;
and training an initial fault reason positioning model based on the characteristic data of each historical work order and the extracted fault reason to obtain the trained fault reason positioning model.
7. The method of claim 6, wherein said extracting a cause of failure for each of said historical work orders comprises:
inputting the processing measures of each historical work order into a fault theme extraction model to obtain the classification result of the fault theme of each historical work order;
based on a clustering algorithm, determining that the classification result belongs to an existing target fault subject, and taking the target fault subject as the fault reason of the corresponding historical work order; or determining that the classification result is a new fault subject, adding the new fault subject and using the new fault subject as the fault reason of the corresponding historical work order;
wherein the fault topic extraction model is generated based on attention mechanism training.
8. The method of claim 6, further comprising:
determining statistical data of the fault reasons corresponding to the work order titles based on the work order titles of the historical work orders and the extracted fault reasons;
wherein the training an initial fault cause positioning model based on the feature data of each historical work order and the extracted fault cause to obtain the trained fault cause positioning model comprises:
and training an initial fault reason positioning model based on the characteristic data of each historical work order, the extracted fault reason and the statistical data of the fault reason corresponding to the work order title to obtain the trained fault reason positioning model.
9. A fault cause locating device, comprising:
the data association module is used for acquiring association alarm data of a target work order based on the target work order to be positioned by a fault reason;
the characteristic extraction module is used for determining the characteristic data of the fault characteristics corresponding to the target work order based on the target work order and the associated alarm data;
the classification module is used for inputting the characteristic data into a trained fault cause positioning model to obtain a target fault cause corresponding to the target work order;
wherein the associated alarm data comprises: based on a plurality of alarm data in a set time length determined by the fault occurrence time of the work order, the characteristic data comprises: the fault cause positioning model is generated based on the fault cause of the historical work order and the characteristic data of the historical work order in a training mode.
10. A fault cause locating apparatus, comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein,
the processor, when executing the computer program, is adapted to perform the steps of the method of any of claims 1 to 8.
11. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the method of any one of claims 1 to 8.
CN202011186822.5A 2020-10-29 2020-10-29 Fault cause positioning method, device, equipment and storage medium Active CN114430363B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011186822.5A CN114430363B (en) 2020-10-29 2020-10-29 Fault cause positioning method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114430363A true CN114430363A (en) 2022-05-03
CN114430363B CN114430363B (en) 2023-03-28

Family

ID=81309773

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011186822.5A Active CN114430363B (en) 2020-10-29 2020-10-29 Fault cause positioning method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114430363B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant