WO2021109578A1 - Method, device and electronic equipment for predicting alarms in business operation and maintenance - Google Patents

Method, device and electronic equipment for predicting alarms in business operation and maintenance

Info

Publication number
WO2021109578A1
WO2021109578A1 PCT/CN2020/101818 CN2020101818W WO2021109578A1 WO 2021109578 A1 WO2021109578 A1 WO 2021109578A1 CN 2020101818 W CN2020101818 W CN 2020101818W WO 2021109578 A1 WO2021109578 A1 WO 2021109578A1
Authority
WO
WIPO (PCT)
Prior art keywords
network element
alarm
prediction
hidden markov
alarms
Prior art date
Application number
PCT/CN2020/101818
Other languages
English (en)
French (fr)
Inventor
徐键
Original Assignee
北京天元创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京天元创新科技有限公司
Publication of WO2021109578A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447 Performance evaluation by modeling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • This application relates to the technical field of IT operation and maintenance, and more specifically, to a method, device, and electronic equipment for predicting alarms in business operation and maintenance.
  • In the field of IT operation and maintenance, a complete operation and maintenance system includes systems with fault management capabilities, which are usually called fault management systems.
  • Traditional fault management systems generally have functions such as equipment alarm monitoring, business indicator monitoring, fault response and fault location.
  • However, equipment alarm monitoring is recognized "after the fact". That is, only after the relevant collection tool obtains the data and finds that the data triggers the corresponding rules is an alarm generated and a work order reviewed and dispatched.
  • The shorter the process from the discovery of an alarm to the dispatch of a work order, the smaller the impact of the corresponding alarm, and the wider the time window for O&M personnel to solve the problem. Therefore, if equipment alarms can be accurately predicted in advance, corresponding evasive measures can be taken in advance to avoid the occurrence of related failures or reduce their impact.
  • The embodiments of the present application provide a method, device and electronic equipment for predicting alarms in service operation and maintenance, which are used to effectively improve the accuracy of alarm prediction in service operation and maintenance, thereby effectively avoiding failures or reducing the impact of failures.
  • In a first aspect, an embodiment of the present application provides a method for predicting an alarm during service operation and maintenance, including:
  • acquiring a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence;
  • based on the historical alarm sequence, using the trained hidden Markov prediction model to perform alarm prediction on the target network element object;
  • wherein the trained hidden Markov prediction model is initialized and constructed by analyzing in advance the relationship information among network element objects, network element faults and network element alarms in the fault management system, and is obtained by training with original data samples selected according to the relationship information.
  • The alarm prediction method in service operation and maintenance of the embodiment of the present application further includes:
  • selecting corresponding historical alarm data in the fault management system to form a training sample set.
  • The step of selecting corresponding historical alarm data in the fault management system to form a training sample set specifically includes:
  • selecting a second given number of historical alarm data, the historical alarm data including the one-to-one correspondence between the network element objects and the network element alarms;
  • forming the training sample set.
  • The method for predicting alarms in service operation and maintenance of the embodiment of the present application further includes:
  • dividing each generated sub-training sample set into a training set and a test set according to a fixed ratio.
  • The step of iteratively training the hidden Markov initial model specifically includes:
  • using the test set in each sub-training sample set to verify whether the corresponding candidate prediction model meets the set standard, and selecting the prediction model that meets the set standard as the trained hidden Markov prediction model;
  • wherein the set standard is that the accuracy of the prediction result verified by the test set is the highest.
  • The step of using the trained hidden Markov prediction model to perform alarm prediction on the target network element object specifically includes: selecting multiple alarms of different categories from the alarm set generated by all network element objects; based on the historical alarm sequence and the selected alarms, using the trained hidden Markov prediction model to perform forward calculation to obtain the probability corresponding to each selected alarm; and determining the alarm prediction result of the target network element object based on the probabilities.
  • The step of determining the alarm prediction result of the target network element object based on the probabilities specifically includes: sorting all the probabilities by magnitude and, according to the sorting result, taking the alarm corresponding to the largest probability as the alarm of the next prediction period of the target network element object.
  • In a second aspect, an embodiment of the present application provides an alarm prediction device for service operation and maintenance, including:
  • a data acquisition module, configured to acquire a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence;
  • a prediction output module, configured to use the trained hidden Markov prediction model to perform alarm prediction on the target network element object based on the historical alarm sequence;
  • wherein the trained hidden Markov prediction model is initialized and constructed by analyzing in advance the relationship information among network element objects, network element faults and network element alarms in the fault management system, and is obtained by training with original data samples selected according to the relationship information.
  • In a third aspect, an embodiment of the present application provides an electronic device including a memory, a processor, and a computer program stored on the memory and capable of running on the processor. When the processor executes the computer program, the steps of the method for predicting alarms in service operation and maintenance as described in the first aspect above are implemented.
  • In a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium on which computer instructions are stored. When the computer instructions are executed by a computer, the steps of the method for predicting alarms in service operation and maintenance as described in the first aspect are implemented.
  • The method, device, and electronic equipment for predicting alarms in service operation and maintenance provided by the embodiments of the present application analyze the relationship among network element objects, faults, and alarms in the fault management system and use a hidden Markov prediction model to process the target sequence constructed from the historical alarms of the network element object, thereby realizing alarm prediction for the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding failures or reducing the impact of failures.
  • FIG. 1 is a schematic flowchart of a method for predicting an alarm in service operation and maintenance provided by an embodiment of the application;
  • FIG. 2 is a schematic diagram of the execution principle of the method for predicting alarms in service operation and maintenance provided by an embodiment of the application;
  • FIG. 3 is a schematic structural diagram of an alarm prediction device in service operation and maintenance provided by an embodiment of the application;
  • FIG. 4 is a schematic diagram of the physical structure of an electronic device provided by an embodiment of the application.
  • The embodiments of the present application propose a hidden Markov alarm prediction method based on supervised learning. This method uses offline supervised learning to generate a prediction model, and uses the model to more accurately predict the most likely alarms in the next prediction cycle, thereby enhancing the automation and intelligence of operation and maintenance.
  • The embodiments of the present application address the problem of poor accuracy of alarm prediction in business operation and maintenance in the prior art. A hidden Markov prediction model is used to process the target sequence constructed from the historical alarms of the network element object and so realize alarm prediction for the network element object, which can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding failures or reducing their impact. This is explained and introduced in detail below through a plurality of embodiments.
  • Fig. 1 is a schematic flowchart of a method for predicting an alarm in service operation and maintenance provided by an embodiment of the application. As shown in Fig. 1, the method includes:
  • S101: Acquire a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence.
  • The embodiment of the present application predicts the alarm of the next alarm cycle of the network element object based on the historical alarm data of the network element object. Therefore, the embodiment of the present application first obtains, from the historical record data of the fault management system, the alarms that the target network element object has issued before the current alarm period, that is, the historical alarm data. It can be understood that, to avoid errors caused by chance and to take into account the characteristics of the hidden Markov prediction model, the number of selected historical alarm data must reach a certain amount, which can be set in advance. Afterwards, these historical alarm data can be processed and encoded in time order to form a data sequence, that is, a historical alarm sequence.
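  • A minimal sketch of step S101 follows, assuming each historical record is a dictionary with hypothetical keys `time`, `element` and `alarm`; the function name `build_alarm_sequence` and the record layout are illustrative and are not taken from the application.

```python
def build_alarm_sequence(records, element, period_start, n):
    """Return the n most recent alarm categories raised by `element` before `period_start`."""
    # Keep only this element's alarms that precede the current alarm period.
    own = [r for r in records if r["element"] == element and r["time"] < period_start]
    own.sort(key=lambda r: r["time"])  # chronological order
    if len(own) < n:
        # The text requires a minimum amount of history; fail loudly when it is missing.
        raise ValueError(f"need {n} historical alarms, found {len(own)}")
    return [r["alarm"] for r in own[-n:]]


# Hypothetical records; times are plain integers for brevity.
records = [
    {"time": 3, "element": "ne1", "alarm": "cpu_high"},
    {"time": 1, "element": "ne1", "alarm": "link_down"},
    {"time": 2, "element": "ne2", "alarm": "disk_full"},
    {"time": 5, "element": "ne1", "alarm": "link_down"},
]
sequence = build_alarm_sequence(records, "ne1", period_start=5, n=2)
```

  • Raising an error when fewer than `n` alarms exist is one possible way to enforce the "certain amount" of history the text calls for; padding or skipping the prediction would be equally plausible choices.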
  • Here, the trained hidden Markov prediction model is initialized and constructed by analyzing in advance the relationship information among network element objects, network element faults and network element alarms in the fault management system, and is obtained by training with original data samples selected according to the relationship information.
  • After the historical alarm sequence of the target network element object is obtained, it can be input into the pre-trained hidden Markov prediction model, and the predicted result of the target network element object for one or more alarm periods is obtained through the forward calculation of the prediction model.
  • It can be understood that a model building method needs to be used to build the model in advance. Specifically, the relationship among network element objects, faults and alarms in the fault management system can be analyzed first, and a hidden Markov initial model can be initialized and constructed on this basis. After that, according to the results of the above analysis, the corresponding original alarm data is selected and processed to train the constructed hidden Markov initial model, and finally the trained hidden Markov prediction model is obtained, which can be used for alarm prediction of network element objects.
  • The method for predicting alarms in service operation and maintenance provided by the embodiments of the present application analyzes the relationship among network element objects, faults and alarms in the fault management system and uses the hidden Markov prediction model to calculate on the target sequence constructed from the historical alarms of the network element object, thereby realizing alarm prediction for the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding failures or reducing the impact of failures.
  • The alarm prediction method in service operation and maintenance of the embodiment of this application further includes:
  • The embodiment of the present application also adopts a model establishment method to establish the model in advance. Specifically, first, according to the historical record information of the fault management system, the network element objects in the fault management system are obtained, together with the fault data generated by each network element object and the corresponding alarm data. After that, the relationship among these network element objects, faults and alarms is comprehensively analyzed, and on this basis a hidden Markov initial model based on supervised learning is initialized and constructed.
  • The maximum likelihood estimation method is used to iteratively train the initialized hidden Markov initial model, and the prediction results of the model are tested during each round of training. Finally, a prediction model that meets the set criteria is obtained and taken as the trained hidden Markov prediction model.
  • The step of selecting corresponding historical alarm data in the fault management system to form a training sample set specifically includes: in combination with operation and maintenance knowledge, analyzing the causality among network element objects, network element faults and network element alarms in the fault management system, and selecting a second given number of historical alarm data, which includes the one-to-one correspondence between network element objects and network element alarms; preprocessing the historical alarm data with respect to time order and missing values, and encoding the preprocessing results to obtain sample data; and forming a training sample set from all the sample data.
  • The embodiment of the present application thus constructs the training sample set of the model. Specifically, first, based on operation and maintenance knowledge, the causal relationship among the network element objects and the network element faults and network element alarms they generate is analyzed in the historical record information of the fault management system, and a given number of historical alarm data is selected according to the analysis result.
  • The historical alarm data is represented as a one-to-one correspondence between the network element object and the alarm information generated by it. For example, if at a certain historical moment a network element object s_i generates alarm information o_i, the historical alarm data selected accordingly can be expressed as (o_i, s_i).
  • The selected historical alarm data is then preprocessed, including ordering it in time and filling in missing values, and the preprocessing results are encoded to obtain the corresponding encoding results as sample data. Finally, a sample set is built from these sample data, which is the training sample set.
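  • The preprocessing described above might be sketched as follows. The text does not say how missing values are supplemented or how records are encoded, so this sketch assumes forward-filling a missing alarm category from the previous record and encoding categories as integers in sorted order; the function name `preprocess` and the record keys are hypothetical.

```python
def preprocess(records):
    """Sort by time, forward-fill missing alarm categories, and encode as (o, s) integer pairs."""
    ordered = sorted(records, key=lambda r: r["time"])
    filled = []
    last_alarm = None
    for r in ordered:
        # Assumption: a missing alarm category is replaced by the previous known one.
        alarm = r["alarm"] if r["alarm"] is not None else last_alarm
        if alarm is None:
            continue  # no earlier value to fill from, so the record is dropped
        filled.append((alarm, r["element"]))
        last_alarm = alarm
    # Integer codes assigned in sorted order so that the encoding is reproducible.
    alarm_codes = {a: i for i, a in enumerate(sorted({a for a, _ in filled}))}
    element_codes = {s: i for i, s in enumerate(sorted({s for _, s in filled}))}
    samples = [(alarm_codes[a], element_codes[s]) for a, s in filled]
    return samples, alarm_codes, element_codes


raw = [
    {"time": 2, "element": "ne2", "alarm": None},
    {"time": 1, "element": "ne1", "alarm": "link_down"},
    {"time": 3, "element": "ne1", "alarm": "cpu_high"},
]
samples, alarm_codes, element_codes = preprocess(raw)
```

  • Each resulting pair follows the (o_i, s_i) layout used in the text: the first entry is the encoded alarm (observation) and the second the encoded network element object (state).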
  • The method for predicting alarms in service operation and maintenance of the embodiment of the present application further includes: continuously adjusting the number of sample data in the training sample set and dividing the training sample set to generate multiple sub-training sample sets; and dividing each generated sub-training sample set into a training set and a test set according to a fixed ratio.
  • In this embodiment the division of the training sample set is improved, to overcome the shortcoming of the traditional approach, which divides the training sample set only once into a training set and a test set.
  • The number of sample data in the training sample set is continuously adjusted by selecting more historical alarm data.
  • The training sample set is divided into subsets according to application requirements, yielding multiple sub-training sample sets. After that, each sub-training sample set is divided into the corresponding training sets and test sets according to a fixed ratio.
  • Table 1 is an example of dividing the training sample set according to the embodiment of the present application.
  • The sample data in the training sample set is evenly divided into five sub-training sample sets, and each sub-training sample set is divided into the corresponding training set and test set according to the fixed ratios 7:3, 8:2 and 9:1.
  • Table 1 An example table of the division of the training sample set according to the embodiment of the application
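  • The division described for Table 1 could be sketched as below. The table body is not reproduced here, so the sketch follows only the prose: five equal sub-training sample sets, each split at 7:3, 8:2 and 9:1. It additionally assumes contiguous, time-ordered chunks; the name `split_samples` is hypothetical.

```python
def split_samples(samples, n_subsets=5, ratios=(0.7, 0.8, 0.9)):
    """Cut samples into equal contiguous subsets and split each one at every train ratio."""
    size = len(samples) // n_subsets  # any remainder at the end is left unused
    splits = []
    for g in range(n_subsets):
        subset = samples[g * size:(g + 1) * size]
        for ratio in ratios:
            cut = int(round(len(subset) * ratio))
            # The earlier part trains the model, the later part tests it.
            splits.append({
                "subset": g,
                "ratio": ratio,
                "train": subset[:cut],
                "test": subset[cut:],
            })
    return splits


splits = split_samples(list(range(100)))
```

  • Keeping each chunk in its original order means the test records always follow the training records in time, which seems the natural choice for sequence prediction, though the text does not state it.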
  • The step of iteratively training the hidden Markov initial model specifically includes: using the training set in each sub-training sample set and the maximum likelihood estimation method to iteratively train the hidden Markov initial model, correspondingly obtaining multiple candidate prediction models; and using the test set in each sub-training sample set to verify whether the corresponding candidate prediction model meets the set criteria, and selecting the prediction model that meets the set criteria as the trained hidden Markov prediction model;
  • wherein the set standard is that the accuracy of the prediction result verified by the test set is the highest.
  • The embodiment of this application uses each sub-training sample set divided according to the above-mentioned embodiment to separately train the constructed hidden Markov initial model. Specifically, the training set of each sub-training sample set is extracted, and the maximum likelihood estimation method is used to train the constructed hidden Markov initial model on each of them, yielding multiple trained prediction models as candidate prediction models.
  • The test set corresponding to the training set used to train each candidate prediction model is then used to test the accuracy of that candidate. That is, for any candidate prediction model, its corresponding test set is used to perform forward calculation to obtain the prediction result, which is compared with the reference alarm result in the test set to test the prediction accuracy. Each test set thus yields one accuracy test result, expressed as the ratio of correctly predicted test data to total test data when the test data in the test set is used for testing.
  • Finally, the candidate prediction model with the highest accuracy test result across all test sets is considered to meet the set criteria of the test and is used as the final trained hidden Markov prediction model.
  • By improving the division standard of the training sample set, the embodiment of the present application can effectively avoid the overfitting problem caused by an improper loss function selected in a single training session, thereby further improving the prediction accuracy of the prediction model.
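  • The selection rule above can be sketched independently of the model itself. In this sketch `train` and `predict` are caller-supplied functions, and each test set is assumed to be a list of (history, expected alarm) pairs; these shapes and all names are illustrative assumptions. The toy majority-vote "model" at the end only exercises the selection logic and stands in for a trained hidden Markov model.

```python
def accuracy(model, test_pairs, predict):
    """Share of test pairs whose predicted alarm equals the reference alarm."""
    if not test_pairs:
        return 0.0
    hits = sum(1 for history, expected in test_pairs if predict(model, history) == expected)
    return hits / len(test_pairs)


def select_best_model(splits, train, predict):
    """Train one candidate per split and keep the one with the highest test accuracy."""
    best_model, best_score = None, -1.0
    for split in splits:
        candidate = train(split["train"])
        score = accuracy(candidate, split["test"], predict)
        if score > best_score:
            best_model, best_score = candidate, score
    return best_model, best_score


# Toy stand-ins: the "model" is simply the most frequent alarm code in the training data.
def train_majority(alarms):
    return max(set(alarms), key=alarms.count)


def predict_majority(model, history):
    return model


splits = [
    {"train": [0, 0, 1], "test": [([0], 1), ([1], 1)]},
    {"train": [1, 1, 0], "test": [([0], 1), ([1], 0)]},
]
best_model, best_score = select_best_model(splits, train_majority, predict_majority)
```

  • Ties are resolved in favour of the earlier split because only a strictly higher score replaces the current best; the text does not specify a tie-breaking rule.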
  • The step of using the trained hidden Markov prediction model to perform alarm prediction on the target network element object specifically includes: selecting multiple alarms of different categories from the set of alarms generated by all network element objects; based on the historical alarm sequence and the selected alarms, using the trained hidden Markov prediction model to perform forward calculation to obtain the probability corresponding to each selected alarm; and determining the alarm prediction result of the target network element object based on the probabilities.
  • When alarm prediction is performed for the target network element object, the prediction covers not only whether it will issue an alarm but also the specific alarm type. Therefore, given the historical alarm sequence of length n obtained from the historical alarm data of the target network element object before the current alarm period, alarms of different categories are also selected from the set of alarm information generated by all network element objects in the fault management system, and each selected alarm is appended to the historical alarm sequence of length n to construct a target sequence of length n+1.
  • The above target sequences are input into the trained hidden Markov prediction model, and forward calculation is performed to obtain the probability corresponding to each category of alarm. Based on these probabilities, the final alarm prediction result of the target network element object is determined.
  • The step of determining the alarm prediction result of the target network element object specifically includes: sorting all the probabilities by magnitude and, according to the sorting result, taking the alarm corresponding to the largest value as the alarm of the next prediction period of the target network element object.
  • That is, the obtained probabilities corresponding to each category of alarm are sorted by value, and the maximum probability value is selected according to the sorting result. After that, the alarm corresponding to the maximum probability value, together with its type, is determined as the predicted alarm of the next prediction period of the target network element object.
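  • The prediction step can be illustrated with the standard forward algorithm for a discrete hidden Markov model. The sketch assumes integer-encoded alarms and parameters held in plain lists (`pi[i]`, `A[i][j]`, `B[j][k]`); the function names and the numeric parameter values are made up for illustration and do not come from the application.

```python
def forward_probability(obs, pi, A, B):
    """P(obs | model) computed with the forward algorithm."""
    n = len(pi)
    # Initialisation: start in state j and emit the first observation.
    alpha = [pi[j] * B[j][obs[0]] for j in range(n)]
    # Recursion: move from every state i to state j, then emit the next observation.
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
    return sum(alpha)


def predict_next_alarm(history, candidates, pi, A, B):
    """Append each candidate alarm to the history and return the most probable one."""
    scores = {k: forward_probability(list(history) + [k], pi, A, B) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Illustrative parameters: two network element states, two alarm categories.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
history = [0, 0]
best, scores = predict_next_alarm(history, candidates=[0, 1], pi=pi, A=A, B=B)
```

  • Because every candidate extends the same history, the scores over all alarm categories add up to the probability of the history itself, so comparing the joint probabilities ranks the candidates exactly as the conditional probabilities of the next alarm would.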
  • To further illustrate the technical solutions, the embodiments of the present application provide the following specific descriptions based on the foregoing embodiments, without limiting the protection scope of the embodiments of the present application.
  • The supervised-learning hidden Markov alarm prediction rests on the following: after a network element object develops a fault or a related index reaches a certain threshold, the fault management system generates a corresponding alarm, which passes through several stages before a work order is finally dispatched to operation and maintenance personnel; that is, the network element object generates an alarm.
  • FIG. 2 is a schematic diagram of the execution principle of the method for predicting alarms in business operation and maintenance provided by this embodiment of the application. The diagram is composed of two parts. The first part is the main nodes of the diagram, which describe the process of training the model and predicting with the model. The second part is the time axis at the top of the figure, which indicates the order in which the first part is executed: the model is trained first, and the model is then used to predict on real-time data. It can be understood that some details of data processing are omitted in the figure. Therefore, provided the intent is the same, these nodes may take other forms, or be merged or added to, and still fall within the scope of this application.
  • Model training stage: according to the provided historical data, and following the idea of maximum likelihood estimation in combination with the specific alarm categories and network element object categories, the initial state probability, state transition probability matrix and observation probability matrix of the hidden Markov model are obtained; together these constitute the hidden Markov model.
  • Model prediction stage: according to the time series data provided in real time, the alarms that may appear in the next cycle are predicted and output.
  • The training data set needs to be acquired and preprocessed, and the model selection strategy needs to be determined.
  • The acquisition of the training data set includes: in combination with operation and maintenance knowledge, clarifying the causal relationship among network element objects, faults and alarms, selecting raw data, and preprocessing these raw data to obtain a preliminary training data set.
  • The division of training data includes: continuously adjusting the amount selected from the initial training set to generate several sub-training sets, and dividing each generated sub-training set into a training set and a test set according to a fixed ratio.
  • The training process includes: using methods such as maximum likelihood estimation to estimate parameters for each sub-training set to form a model.
  • The criteria for model selection include: using the corresponding sub-test set to verify the model generated by each sub-training set. The verification is based on the proportion of correct alarm predictions in the future prediction period (or within several observations). The model with the highest accuracy among all models is the final model.
  • The method for predicting alarms in service operation and maintenance of the embodiment of the present application includes the following processing steps:
  • n is the total number of network element objects
  • Step 1: According to the relationship in which a network element object generates an alarm, and in combination with hidden Markov theory, the network element object is used as the state and the alarm category is used as the observation.
  • n and m have the same meaning as above
  • π_i represents the initial probability of the i-th network element object, a_ij represents the probability that the state is i at the previous moment and changes from i to j at the next moment, and b_jk represents the probability that the observation is k when state j appears.
  • Equations (4)-(8) adopt the idea of maximum likelihood estimation to solve the related unknown data in equations (1)-(3).
  • Each component π_i of π in formula (1) is obtained by dividing the frequency of the corresponding state in the data set by the total number of records in the data set.
  • A_ij is the frequency with which the state is i at the previous moment and changes from i to j at the next moment; a_ij in formula (2) is obtained by dividing A_ij by the sum of the elements of the corresponding row of A' in formula (4).
  • B_jk is the frequency of observation k when the state is j; b_jk in formula (3) is obtained by dividing B_jk by the sum of the elements in the corresponding row of B' in formula (5).
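  • The counting estimates described in words above can be sketched as follows. Formulas (1)-(8) themselves are not reproduced in this text, so the code follows only the verbal description; it assumes the samples are time-ordered (observation, state) integer pairs, and the function name `estimate_parameters` is hypothetical.

```python
def estimate_parameters(samples, n_states, n_obs):
    """Estimate pi, A and B by relative frequencies from (observation, state) pairs."""
    pi_counts = [0] * n_states
    A_counts = [[0] * n_states for _ in range(n_states)]
    B_counts = [[0] * n_obs for _ in range(n_states)]
    for o, s in samples:
        pi_counts[s] += 1      # how often each state occurs in the data set
        B_counts[s][o] += 1    # observation o seen while in state s
    for (_, s_prev), (_, s_next) in zip(samples, samples[1:]):
        A_counts[s_prev][s_next] += 1  # transition between consecutive records

    def normalise(row):
        total = sum(row)
        # Assumption: a row with no counts is left as all zeros.
        return [c / total for c in row] if total else [0.0] * len(row)

    pi = normalise(pi_counts)
    A = [normalise(row) for row in A_counts]
    B = [normalise(row) for row in B_counts]
    return pi, A, B


# Four records over two states and two alarm categories.
pi, A, B = estimate_parameters([(0, 0), (1, 0), (1, 1), (0, 0)], n_states=2, n_obs=2)
```

  • Because the states (network element objects) are recorded alongside the alarms, the estimates reduce to counting and no iterative re-estimation is needed, which is what makes the approach supervised.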
  • Step 4: Evaluate the training effect.
  • Data set D (generally speaking, as long as the network element object has not been withdrawn from the network and the related structure remains unchanged, the more data the better) is grouped according to Table 1, the prediction accuracy of the model for each group is evaluated, and the model with the highest accuracy is chosen from all models.
  • Step 5: Use the trained hidden Markov prediction model to make predictions. That is, for the observation (alarm) sequence o_i, o_{i+1}, ..., o_{i+j-1} at a certain moment, predict the probability of o_{i+j} in the next prediction period: select each o_k from O in turn, compose m sequences o_i, o_{i+1}, ..., o_{i+j-1}, o_k, and use the hidden Markov prediction model for forward calculation to obtain each P(o_i, o_{i+1}, ..., o_{i+j-1}, o_k |
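  • For reference, the forward calculation named in Step 5 is conventionally written as the recursion below. The forward variable α_t(j), the model symbol λ = (π, A, B) and the renumbering of the sequence as o_1, ..., o_T are notation assumed here for illustration; only π, a_ij and b_jk come from the description above.

```latex
\begin{aligned}
\alpha_1(j)     &= \pi_j \, b_{j,o_1}, \\
\alpha_{t+1}(j) &= \Big( \sum_{i=1}^{n} \alpha_t(i)\, a_{ij} \Big)\, b_{j,o_{t+1}}, \qquad t = 1, \dots, T-1, \\
P(o_1, \dots, o_T \mid \lambda) &= \sum_{j=1}^{n} \alpha_T(j).
\end{aligned}
```

  • Evaluating this once for each of the m candidate sequences gives the m probabilities that Step 5 compares.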
  • The embodiment of the application uses the supervised-learning hidden Markov alarm prediction method, which can more accurately predict the alarm sequence generated in a short period in the future and the corresponding network element object that generates the alarm, provide a decision basis for fault avoidance and the like, shorten fault handling time, and reduce the impact of failures.
  • The embodiments of the present application provide a device for predicting alarms during service operation and maintenance according to the foregoing embodiments, and the device is used to realize the prediction of alarms during service operation and maintenance in the foregoing embodiments. Therefore, the descriptions and definitions in the method for predicting alarms in service operation and maintenance in the foregoing embodiments can be used to understand the execution modules in the embodiments of the present application. For details, please refer to the foregoing embodiments, which will not be repeated here.
  • FIG. 3 is a schematic diagram of the structure of the device for predicting alarms in business operation and maintenance provided in this embodiment of the application. The device can be used to realize the prediction of alarms in service operation and maintenance in the foregoing method embodiments, and includes a data acquisition module 301 and a prediction output module 302, wherein:
  • the data acquisition module 301 obtains a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence; the prediction output module 302 uses the trained hidden Markov prediction model to perform alarm prediction on the target network element object based on the historical alarm sequence.
  • The trained hidden Markov prediction model is initialized and constructed by analyzing in advance the relationship information among network element objects, network element faults and network element alarms in the fault management system, and is obtained by training with original data samples selected according to the relationship information.
  • The data acquisition module 301 obtains, from the historical record data of the fault management system, the alarms that the target network element object has issued before the current alarm period, that is, the historical alarm data. It can be understood that, to avoid errors caused by chance and to take into account the characteristics of the hidden Markov prediction model, the number of selected historical alarm data must reach a certain amount, which can be set in advance. After that, the data acquisition module 301 can process and encode these historical alarm data in time order to form a data sequence, that is, a historical alarm sequence.
  • The prediction output module 302 inputs the historical alarm sequence of the target network element object into the pre-trained hidden Markov prediction model, and obtains the predicted alarm results of the target network element object for one or more alarm periods through the forward calculation of the prediction model.
  • It can be understood that a model building method needs to be used to build the model in advance. Specifically, the relationship among network element objects, faults and alarms in the fault management system can be analyzed first, and a hidden Markov initial model can be initialized and constructed on this basis. After that, according to the results of the above analysis, the corresponding original alarm data is selected and processed to train the constructed hidden Markov initial model, and finally the trained hidden Markov prediction model is obtained, which can be used for alarm prediction of network element objects.
  • By setting corresponding execution modules, the device for predicting alarms in business operation and maintenance analyzes the relationship among network element objects, faults and alarms in the fault management system and uses the hidden Markov prediction model to process the target sequence constructed from the historical alarms of the network element object, thereby realizing alarm prediction for the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding failures or reducing the impact of failures.
  • a hardware processor may be used to implement the relevant program modules in the apparatuses of the foregoing embodiments.
  • the apparatus for predicting alarms in business operation and maintenance in the embodiments of the present application uses the above program modules to carry out the alarm prediction process of the foregoing method embodiments. When it is used to predict alarms in business operation and maintenance as in the foregoing method embodiments, the beneficial effects produced by the apparatus are the same as those of the corresponding method embodiments, to which reference may be made; they are not repeated here.
  • this embodiment provides an electronic device according to the foregoing embodiments.
  • the electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, it implements the steps of the method for predicting alarms in business operation and maintenance as described in the foregoing embodiments.
  • the electronic device in the embodiment of the present application may also include a communication interface and a bus.
  • FIG. 4 is a schematic diagram of the physical structure of an electronic device provided by an embodiment of this application, including: at least one memory 401, at least one processor 402, a communication interface 403, and a bus 404.
  • the memory 401, the processor 402, and the communication interface 403 communicate with each other through the bus 404.
  • the communication interface 403 is used for information transmission between the electronic device and the fault management system device; the memory 401 stores a computer program that can run on the processor 402, and when the processor 402 executes the computer program, the steps of the method for predicting alarms in business operation and maintenance as described in the foregoing embodiments are implemented.
  • the electronic device includes at least a memory 401, a processor 402, a communication interface 403, and a bus 404, and the memory 401, the processor 402, and the communication interface 403 form a mutual communication connection through the bus 404, and can complete mutual communication.
  • the processor 402 reads from the memory 401 the program instructions of the method for predicting alarms in business operation and maintenance.
  • the communication interface 403 can also realize the communication connection between the electronic device and the fault management system device, and can complete mutual information transmission, such as obtaining the alarm data of the network element object through the communication interface 403.
  • when the electronic device is running, the processor 402 calls the program instructions in the memory 401 to execute the methods provided in the above method embodiments, for example including: obtaining a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence; and, based on the historical alarm sequence, using the trained hidden Markov prediction model to perform alarm prediction on the target network element object.
  • when the above-mentioned program instructions in the memory 401 are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • all or part of the steps in the foregoing method embodiments may be implemented by a program instructing relevant hardware.
  • the foregoing program may be stored in a computer-readable storage medium; when executed, the program performs steps including those of the foregoing method embodiments. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • the embodiments of the present application also provide a non-transitory computer-readable storage medium according to the foregoing embodiments, on which computer instructions are stored.
  • when the computer instructions are executed by a computer, the steps of the method for predicting alarms in business operation and maintenance as described in the foregoing embodiments are implemented, for example including: obtaining a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence; and, based on the historical alarm sequence, using the trained hidden Markov prediction model to perform alarm prediction on the target network element object.
  • the electronic device and the non-transitory computer-readable storage medium provided by the embodiments of the present application, by performing the steps of the method for predicting alarms in business operation and maintenance described in the above embodiments, analyze the relationships among network element objects, faults, and alarms in the fault management system and, in combination with the hidden Markov prediction model, process the target sequence constructed from the historical alarms of the network element object to finally predict the alarms of the network element object, which can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding faults or reducing their impact.
  • each implementation manner can be implemented by means of software plus a necessary general hardware platform, and of course, it can also be implemented by hardware.
  • the above technical solution, in essence or in the part that contributes to the prior art, can be embodied in the form of a computer software product, which can be stored in a computer-readable storage medium such as a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk or an optical disc, and which includes several instructions for causing a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the above method embodiments or in certain parts of them.

Abstract

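The embodiments of the present application provide a method, apparatus, and electronic device for predicting alarms in business operation and maintenance. The method includes: acquiring a given number of historical alarm data of a target network element object before the current alarm period to form a historical alarm sequence; and, based on the historical alarm sequence, performing alarm prediction on the target network element object using a trained hidden Markov prediction model. The trained hidden Markov prediction model is initialized and constructed in advance by analyzing relationship information among network element objects, network element faults, and network element alarms in a fault management system, and is obtained by training with original data samples selected according to the relationship information. The embodiments of the present application can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding faults or reducing their impact.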

Description

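Method, apparatus, and electronic device for predicting alarms in business operation and maintenance
Cross-reference to related applications
This application claims priority to Chinese patent application No. 201911215004.0, filed on December 2, 2019 and entitled "业务运维中告警的预测方法、装置与电子设备" (Method, apparatus, and electronic device for predicting alarms in business operation and maintenance), which is incorporated herein by reference in its entirety.
Technical field
This application relates to the technical field of IT operation and maintenance, and more specifically to a method, apparatus, and electronic device for predicting alarms in business operation and maintenance.
Background
In the field of IT operation and maintenance, a complete operation and maintenance system includes a system with fault management capability, usually called a fault management system. A traditional fault management system generally provides device alarm monitoring, service indicator monitoring, fault response, and fault location.
In traditional fault management systems, device alarm monitoring always happens after the fact: only after the collection tools have acquired the data and found that the data triggers a corresponding rule is an alarm generated, reviewed, and dispatched as a work order. In general, the shorter the interval between detecting an alarm and dispatching the work order, the smaller the impact of the alarm and the wider the time window available to operation and maintenance personnel for solving the problem. Therefore, if device alarms can be accurately predicted in advance, avoidance measures can be taken ahead of time, so that the related faults are avoided or their impact is reduced.
At present, for device alarm prediction, scholars and researchers have proposed analysis methods based on artificial intelligence. For example, machine learning and artificial intelligence algorithms are first used to cluster the data and extract the regularities of device or service alarms, and a similarity measure is then used to predict the occurrence of alarms. However, because this approach merely summarizes regularities in existing alarm data and does not truly reflect the device state, its alarm prediction accuracy for specific devices is low.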
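Summary
To overcome the above problems, or at least partially solve them, the embodiments of the present application provide a method, apparatus, and electronic device for predicting alarms in business operation and maintenance, so as to effectively improve the accuracy of alarm prediction in business operation and maintenance and thereby effectively avoid faults or reduce their impact.
In a first aspect, an embodiment of the present application provides a method for predicting alarms in business operation and maintenance, including:
acquiring a given number of historical alarm data of a target network element object before the current alarm period to form a historical alarm sequence;
based on the historical alarm sequence, performing alarm prediction on the target network element object using a trained hidden Markov prediction model;
wherein the trained hidden Markov prediction model is initialized and constructed in advance by analyzing relationship information among network element objects, network element faults, and network element alarms in a fault management system, and is obtained by training with original data samples selected according to the relationship information.
Further, before the step of performing alarm prediction on the target network element object using the trained hidden Markov prediction model, the method for predicting alarms in business operation and maintenance of the embodiments of the present application further includes:
initializing and constructing a supervised-learning-based initial hidden Markov model by analyzing the relationship information among network element objects, network element faults, and network element alarms in the fault management system, and selecting corresponding historical alarm data in the fault management system according to the relationship information to form a training sample set;
iteratively training the initial hidden Markov model with the sample data in the training sample set using maximum likelihood estimation, and obtaining a prediction model that meets a set criterion as the trained hidden Markov prediction model.
Optionally, the step of selecting corresponding historical alarm data in the fault management system to form a training sample set specifically includes:
in combination with operation and maintenance knowledge, selecting a second given number of historical alarm data by analyzing the causal relationships among network element objects, network element faults, and network element alarms in the fault management system, the historical alarm data including a one-to-one correspondence between the network element object and the network element alarm;
preprocessing the historical alarm data with respect to temporal order and missing values, and encoding the preprocessing result to obtain sample data;
forming the training sample set from all the sample data.
Further, before the step of iteratively training the initial hidden Markov model, the method for predicting alarms in business operation and maintenance of the embodiments of the present application further includes:
continuously adjusting the amount of sample data in the training sample set, and dividing the training sample set to produce multiple sub training sample sets;
dividing each of the resulting sub training sample sets into a training set and a test set at a fixed ratio.
Optionally, the step of iteratively training the initial hidden Markov model specifically includes:
using the training set of each sub training sample set, iteratively training the initial hidden Markov model by maximum likelihood estimation to obtain multiple corresponding candidate prediction models;
using the test set of each sub training sample set to verify whether the corresponding candidate prediction model meets the set criterion, and selecting the prediction model that meets the set criterion as the trained hidden Markov prediction model;
wherein the set criterion is that the accuracy of the prediction results verified with the test set is the highest.
Optionally, the step of performing alarm prediction on the target network element object using the trained hidden Markov prediction model specifically includes: selecting multiple alarms of different categories from the set of alarms generated by all network element objects; based on the historical alarm sequence and each selected alarm, performing forward calculation with the trained hidden Markov prediction model to obtain the probability corresponding to each selected alarm; and determining the alarm prediction result of the target network element object based on the probabilities.
Optionally, the step of determining the alarm prediction result of the target network element object based on the probabilities specifically includes: sorting all the probabilities by magnitude, and taking the alarm corresponding to the largest value in the sorted result as the alarm of the target network element object in the next prediction period.
In a second aspect, an embodiment of the present application provides an apparatus for predicting alarms in business operation and maintenance, including:
a data acquisition module, configured to acquire a given number of historical alarm data of a target network element object before the current alarm period to form a historical alarm sequence;
a prediction output module, configured to perform alarm prediction on the target network element object based on the historical alarm sequence using a trained hidden Markov prediction model;
wherein the trained hidden Markov prediction model is initialized and constructed in advance by analyzing relationship information among network element objects, network element faults, and network element alarms in a fault management system, and is obtained by training with original data samples selected according to the relationship information.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for predicting alarms in business operation and maintenance according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium storing computer instructions which, when executed by a computer, implement the steps of the method for predicting alarms in business operation and maintenance according to the first aspect.
The method, apparatus, and electronic device for predicting alarms in business operation and maintenance provided by the embodiments of the present application analyze the relationships among network element objects, faults, and alarms in the fault management system and, in combination with a hidden Markov prediction model, process a target sequence constructed from the historical alarms of a network element object to finally predict the alarms of the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding faults or reducing their impact.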
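Brief description of the drawings
To explain the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the method for predicting alarms in business operation and maintenance provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of the execution principle of the method for predicting alarms in business operation and maintenance provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of the apparatus for predicting alarms in business operation and maintenance provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the physical structure of the electronic device provided by an embodiment of the present application.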
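Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the embodiments of the present application.
The problem addressed is alarm prediction during operation and maintenance: if it can be known fairly accurately that certain important alarms will occur within a short time in the future, measures can be taken in advance to avoid or reduce their impact. To this end, the embodiments of the present application propose a supervised-learning-based hidden Markov alarm prediction method. The method produces a prediction model through offline supervised learning and uses the model to predict fairly accurately the alarm most likely to occur in the next prediction period, thereby making operation and maintenance more automated and intelligent.
In other words, to address the poor accuracy of alarm prediction in business operation and maintenance in the prior art, the embodiments of the present application analyze the relationships among network element objects, faults, and alarms in the fault management system and, in combination with a hidden Markov prediction model, process a target sequence constructed from the historical alarms of a network element object to finally predict the alarms of the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding faults or reducing their impact. The embodiments of the present application are described in detail below through multiple embodiments.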
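FIG. 1 is a schematic flowchart of the method for predicting alarms in business operation and maintenance provided by an embodiment of the present application. As shown in FIG. 1, the method includes:
S101: acquire a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence.
It can be understood that the embodiments of the present application predict the alarm of a network element object in the next alarm period from the historical alarm data of that object. Therefore, the embodiments first obtain, from the historical record data of the fault management system, the alarms that the target network element object issued before the current alarm period, that is, the historical alarm data. To avoid errors caused by chance, and taking into account the characteristics of the hidden Markov prediction model, the amount of selected historical alarm data must reach a certain quantity, which can be set in advance. The historical alarm data can then be processed in time order and encoded to form a data sequence, that is, the historical alarm sequence.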
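Step S101 can be illustrated with a minimal sketch. The record layout (`time`, `alarm` keys), the alarm names, and the function name `build_alarm_sequence` are assumptions made for illustration; the application does not specify a data format or an encoding scheme.

```python
from datetime import datetime


def build_alarm_sequence(records, alarm_codes, n):
    """Return the n most recent alarms as integer codes, oldest first.

    records: list of dicts with hypothetical keys "time" and "alarm".
    alarm_codes: mapping from alarm name to integer code (assumed encoding).
    n: the given number of historical alarms required for prediction.
    """
    ordered = sorted(records, key=lambda r: r["time"])  # enforce time order
    if len(ordered) < n:
        raise ValueError(f"need at least {n} historical alarms, got {len(ordered)}")
    return [alarm_codes[r["alarm"]] for r in ordered[-n:]]


# Hypothetical alarm history of one target network element object.
records = [
    {"time": datetime(2019, 12, 2, 10, 5), "alarm": "LINK_DOWN"},
    {"time": datetime(2019, 12, 2, 10, 0), "alarm": "HIGH_CPU"},
    {"time": datetime(2019, 12, 2, 10, 10), "alarm": "HIGH_CPU"},
]
alarm_codes = {"HIGH_CPU": 0, "LINK_DOWN": 1}
sequence = build_alarm_sequence(records, alarm_codes, n=3)
```

Raising an error when fewer than `n` alarms are available is one way to enforce the minimum-quantity requirement described above; other policies (for example, waiting for more data) would fit the text equally well.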
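S102: based on the historical alarm sequence, perform alarm prediction on the target network element object using the trained hidden Markov prediction model. The trained hidden Markov prediction model is initialized and constructed in advance by analyzing relationship information among network element objects, network element faults, and network element alarms in the fault management system, and is obtained by training with original data samples selected according to the relationship information.
It can be understood that, once the historical alarm sequence of the target network element object has been obtained, it can be input into the pre-trained hidden Markov prediction model, and the forward calculation of the prediction model yields the predicted alarms of the target network element object for the next one or more alarm periods.
It can be understood that, before the prediction model is applied, it must be built in advance with a model building method. Specifically, the relationships among network element objects, faults, and alarms in the fault management system can be analyzed first, and an initial hidden Markov model can be initialized and constructed on that basis. Then, according to the results of this analysis, the corresponding original alarm data are selected, processed, and used to train the initial hidden Markov model, finally yielding the trained hidden Markov prediction model, which can be used for alarm prediction of network element objects.
The method for predicting alarms in business operation and maintenance provided by the embodiments of the present application analyzes the relationships among network element objects, faults, and alarms in the fault management system and, in combination with a hidden Markov prediction model, processes a target sequence constructed from the historical alarms of a network element object to finally predict the alarms of the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding faults or reducing their impact.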
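Further, on the basis of the above embodiments, before the step of performing alarm prediction on the target network element object using the trained hidden Markov prediction model, the method for predicting alarms in business operation and maintenance of the embodiments of the present application further includes:
initializing and constructing a supervised-learning-based initial hidden Markov model by analyzing the relationship information among network element objects, network element faults, and network element alarms in the fault management system, and selecting corresponding historical alarm data in the fault management system according to the relationship information to form a training sample set; and iteratively training the initial hidden Markov model with the sample data in the training sample set using maximum likelihood estimation, and obtaining a prediction model that meets a set criterion as the trained hidden Markov prediction model.
It can be understood that, before the prediction model is applied, the embodiments of the present application also build the model in advance with a model building method. Specifically, the network element objects in the fault management system are first obtained from the historical record information of the fault management system, together with the fault data generated by each network element object and the alarm data formed as a result. The relationships among these network element objects, faults, and alarms are then analyzed comprehensively, and a supervised-learning-based initial hidden Markov model is initialized and constructed on that basis. At the same time, or alternatively before or after this, a certain amount of corresponding historical alarm data is selected from the fault management system according to the results of the analysis, that is, the relationship information, to form multiple training sample data, from which the training sample set of the initial prediction model is built.
Then, the sample data are taken one by one from the training sample set, and the initialized initial hidden Markov model is trained iteratively using maximum likelihood estimation. The prediction results of the model are tested in each round of training, and the prediction model that finally meets the set criterion is taken as the trained hidden Markov prediction model.
Optionally, according to the above embodiments, the step of selecting corresponding historical alarm data in the fault management system to form a training sample set specifically includes: in combination with operation and maintenance knowledge, selecting a second given number of historical alarm data by analyzing the causal relationships among network element objects, network element faults, and network element alarms in the fault management system, the historical alarm data including a one-to-one correspondence between the network element object and the network element alarm; preprocessing the historical alarm data with respect to temporal order and missing values, and encoding the preprocessing result to obtain sample data; and forming the training sample set from all the sample data.
It can be understood that the embodiments of the present application thereby construct the training sample set of the model. Specifically, in combination with operation and maintenance knowledge, the causal relationships among the network element objects in the historical record information of the fault management system and the network element faults and network element alarms they generate are analyzed first, and a given number of historical alarm data are selected according to the analysis results. Each historical alarm datum represents a one-to-one correspondence between a network element object and the alarm information it generated. For example, if at some past moment a network element object $s_i$ generated alarm information $o_i$, the historical alarm datum selected from it can be written as $(o_i, s_i)$.
The selected historical alarm data are then preprocessed, which includes ordering them in time and filling in missing values with equivalent values, and the preprocessing result is encoded to obtain the corresponding encoded result as sample data. Finally, a sample set is built from these sample data, that is, the training sample set.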
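The preprocessing described above (time ordering, missing-value filling, encoding) might look like the following sketch. The forward-fill policy for missing alarm types, the record keys, and the function name `preprocess` are assumptions chosen for illustration; the application only states that missing values are filled with equivalent values.

```python
def preprocess(raw_records):
    """Turn raw alarm records into encoded (observation, state) samples.

    raw_records: list of dicts with hypothetical keys "time", "alarm", "element".
    Returns (samples, alarm_codes, element_codes), where each sample is an
    (alarm_code, element_code) pair, mirroring the (o_i, s_i) pairs in the text.
    """
    ordered = sorted(raw_records, key=lambda r: r["time"])  # time ordering

    filled = []
    last_alarm = None
    for record in ordered:
        # Assumed policy: a missing alarm type inherits the previous alarm type.
        alarm = record.get("alarm") or last_alarm
        element = record.get("element")
        if alarm is None or element is None:
            continue  # nothing to fill from, so the record is dropped
        filled.append((alarm, element))
        last_alarm = alarm

    # Encode alarm categories and network element objects as integers.
    alarm_codes = {a: i for i, a in enumerate(sorted({a for a, _ in filled}))}
    element_codes = {e: i for i, e in enumerate(sorted({e for _, e in filled}))}
    samples = [(alarm_codes[a], element_codes[e]) for a, e in filled]
    return samples, alarm_codes, element_codes


raw = [
    {"time": 2, "alarm": None, "element": "ne-B"},
    {"time": 1, "alarm": "LINK_DOWN", "element": "ne-A"},
    {"time": 3, "alarm": "HIGH_CPU", "element": "ne-A"},
]
samples, alarm_codes, element_codes = preprocess(raw)
```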
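In addition, on the basis of the above embodiments, before the step of iteratively training the initial hidden Markov model, the method for predicting alarms in business operation and maintenance of the embodiments of the present application further includes: continuously adjusting the amount of sample data in the training sample set, and dividing the training sample set to produce multiple sub training sample sets; and dividing each of the resulting sub training sample sets into a training set and a test set at a fixed ratio.
It can be understood that, before training the prediction model, the embodiments of the present application improve the way the training sample set is divided, to remedy the shortcomings of the traditional approach of dividing the training sample set only once into a training set and a test set. Specifically, depending on the actual size of the training sample set, the amount of sample data in it is continuously adjusted by selecting more historical alarm data. At the same time, the training sample set is divided into subsets according to application requirements to obtain multiple sub training sample sets. Each sub training sample set is then divided at a fixed ratio into its own training set and test set.
For example, Table 1 shows an example division of the training sample set according to an embodiment of the present application. In the table, the sample data in the training sample set are divided evenly into five sub training sample sets, and each sub training sample set is divided into a training set and a test set at the fixed ratios 7:3, 8:2, and 9:1.
Table 1. Example division of the training sample set according to an embodiment of the present application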
Figure PCTCN2020101818-appb-000001
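The division described for Table 1 can be sketched as follows. Because the table itself is only available as an image, this sketch makes assumptions: each of the five equal sub sets is split once per ratio (giving fifteen train/test pairs), samples stay in time order, and any remainder after the even division is discarded. The function name `make_splits` is hypothetical.

```python
def make_splits(samples, n_subsets=5, ratios=((7, 3), (8, 2), (9, 1))):
    """Divide samples into equal sub sets, then split each at fixed ratios.

    Returns a list of (training_set, test_set) pairs. The earlier part of
    each sub set is used for training and the later part for testing, so
    the temporal order of the alarm records is preserved (an assumption).
    """
    size = len(samples) // n_subsets  # remainder samples are discarded
    splits = []
    for i in range(n_subsets):
        subset = samples[i * size:(i + 1) * size]
        for train_part, test_part in ratios:
            cut = len(subset) * train_part // (train_part + test_part)
            splits.append((subset[:cut], subset[cut:]))
    return splits


samples = list(range(100))  # stand-in for 100 encoded samples
splits = make_splits(samples)
```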
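Optionally, according to the above embodiments, the step of iteratively training the initial hidden Markov model specifically includes: using the training set of each sub training sample set, iteratively training the initial hidden Markov model by maximum likelihood estimation to obtain multiple corresponding candidate prediction models; using the test set of each sub training sample set to verify whether the corresponding candidate prediction model meets the set criterion, and selecting the prediction model that meets the set criterion as the trained hidden Markov prediction model; wherein the set criterion is that the accuracy of the prediction results verified with the test set is the highest.
It can be understood that, to avoid low model accuracy caused by an unsuitable loss function chosen in a single training run, the embodiments of the present application train the constructed initial hidden Markov model separately on each of the sub training sample sets divided according to the above embodiments. Specifically, the training set of each sub training sample set is extracted, and the constructed initial hidden Markov model is trained on each of them independently using maximum likelihood estimation, yielding multiple trained prediction models as candidate prediction models.
Then, for each candidate prediction model, its accuracy is tested with the test set that corresponds to the training set used to train it. That is, for any candidate prediction model, forward calculation is performed on its corresponding test set to obtain prediction results, which are compared with the reference alarm results in the test set to measure prediction accuracy. Each test set thus yields one accuracy result, defined as the proportion of correctly predicted data among all test data when testing with that test set.
Finally, the candidate prediction model with the highest accuracy among the accuracy results of all test sets is selected. It is regarded as meeting the set criterion of the test and is taken as the final trained hidden Markov prediction model.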
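The selection rule above (train one candidate per split, keep the one with the highest test accuracy) can be sketched as below. To keep the example self-contained, a trivial majority-alarm predictor stands in for the hidden Markov model; `train_majority`, `predict_majority`, and the `history_len` window are hypothetical stand-ins, not part of the application.

```python
from collections import Counter


def accuracy(model, test_set, predict_next, history_len=3):
    """Share of test alarms predicted correctly from the preceding window."""
    observations = [o for o, _ in test_set]
    hits = total = 0
    for t in range(history_len, len(observations)):
        predicted = predict_next(model, observations[t - history_len:t])
        hits += int(predicted == observations[t])
        total += 1
    return hits / total if total else 0.0


def select_best_model(splits, train, predict_next):
    """Train one candidate per (train, test) split and keep the most accurate."""
    best_model, best_accuracy = None, -1.0
    for train_set, test_set in splits:
        model = train(train_set)
        score = accuracy(model, test_set, predict_next)
        if score > best_accuracy:
            best_model, best_accuracy = model, score
    return best_model, best_accuracy


# Stand-in "model": always predict the most frequent alarm in the training set.
def train_majority(train_set):
    return Counter(o for o, _ in train_set).most_common(1)[0][0]


def predict_majority(model, history):
    return model


test_set = [(1, 0), (1, 1), (1, 0), (1, 1), (1, 0), (0, 0)]
splits = [
    ([(0, 0), (0, 1), (1, 0)], test_set),
    ([(1, 0), (1, 1), (0, 0)], test_set),
]
best_model, best_accuracy = select_best_model(splits, train_majority, predict_majority)
```

In a real setting `train` would be the maximum likelihood estimation of the hidden Markov parameters and `predict_next` the forward-calculation prediction; the selection loop itself would be unchanged.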
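By improving the way the training sample set is divided, the embodiments of the present application can effectively avoid the overfitting caused by an unsuitable loss function chosen in a single training run, thereby further improving the prediction accuracy of the prediction model.
Optionally, according to the above embodiments, the step of performing alarm prediction on the target network element object using the trained hidden Markov prediction model specifically includes: selecting multiple alarms of different categories from the set of alarms generated by all network element objects; based on the historical alarm sequence and each selected alarm, performing forward calculation with the trained hidden Markov prediction model to obtain the probability corresponding to each selected alarm; and determining the alarm prediction result of the target network element object based on the probabilities.
It can be understood that alarm prediction for the target network element object covers not only whether it will issue an alarm but also the specific alarm type. Therefore, given the historical alarm sequence of length n obtained from the historical alarm data of the target network element object before the current alarm period, alarms of different categories are additionally selected from the set of alarm information generated by all network element objects in the fault management system, and each selected alarm is appended to the historical alarm sequence of length n to construct a target sequence of length n+1.
Each of these target sequences is then input into the trained hidden Markov prediction model for forward calculation, giving the probability corresponding to each alarm category, and the final alarm prediction result of the target network element object is determined based on these probabilities.
Optionally, the step of determining the alarm prediction result of the target network element object based on the probabilities specifically includes: sorting all the probabilities by magnitude, and taking the alarm corresponding to the largest value in the sorted result as the alarm of the target network element object in the next prediction period.
Specifically, the probabilities obtained for the alarm categories are sorted by value, and the largest probability is selected from the sorted result. The alarm corresponding to this largest probability, together with its category, is then taken as the predicted alarm of the target network element object for the next prediction period.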
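To further explain the technical solutions of the embodiments of the present application, the following specific description is provided according to the above embodiments, without limiting the protection scope of the embodiments of the present application.
First, it can be understood that supervised-learning hidden Markov alarm prediction rests on the following fact: after a network element object develops a fault or a related indicator reaches a threshold, the fault management system generates a corresponding alarm, which after some further steps is finally dispatched to operation and maintenance personnel as a work order; that is, the network element object produces the alarm.
FIG. 2 is a schematic diagram of the execution principle of the method for predicting alarms in business operation and maintenance provided by an embodiment of the present application. The diagram consists of two parts. The first part is the main nodes of the diagram, which describe the process of training the model and predicting with it. The second part is the time axis at the top of the figure, which indicates the order in which the steps of the first part are executed: the model is trained first, and real-time data are then predicted with the model. It can be understood that the figure omits some details of data processing; therefore, as long as the intent is the same, these nodes may take other forms, or be merged or extended, while still belonging, on the whole, to the same scheme.
As can be seen from FIG. 2, it shows a processing flow in the following two stages:
First, the model training stage: from the provided historical data, following the idea of maximum likelihood estimation and taking into account the specific alarm categories and network element object categories, the initial state probabilities, the state transition probability matrix, and the observation probability matrix of the hidden Markov model are obtained, that is, the hidden Markov model.
That is: first analyze the relationships among network element objects, faults, and alarms in the fault management system, and build a supervised-learning hidden Markov model on the basis of these relationships; then select the corresponding original data according to these relationships and preprocess them (time ordering, missing values, encoding, and so on) to finally form the training data set; finally, improve the training data set division and the model selection criterion of the hidden Markov model, and train the constructed hidden Markov model.
Second, the model prediction stage: from the time series data provided in real time, predict and output the alarm that may occur in the immediately following period.
That is: for prediction, from the current sequence of length n (the input sequence), construct sequences of length n+1 (the target sequences), and use the forward algorithm to find the sequence with the highest probability, which completes the prediction.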
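For reference, the forward algorithm mentioned above is usually written as the following recursion in standard hidden Markov model notation. The symbol $\alpha_t(i)$ is introduced here for illustration and does not appear in the application; $\pi_i$, $a_{ij}$, and $b_{jk}$ follow the definitions given with equations (1)-(3) of this description, and $b_{j,o_t}$ denotes the probability of observing alarm $o_t$ in state $j$.

```latex
\begin{aligned}
\alpha_1(i) &= \pi_i \, b_{i,o_1}, \\
\alpha_{t+1}(j) &= \Big( \sum_{i=1}^{n} \alpha_t(i)\, a_{ij} \Big)\, b_{j,o_{t+1}}, \\
P(o_1, \ldots, o_T \mid M) &= \sum_{i=1}^{n} \alpha_T(i).
\end{aligned}
```

Under this reading, scoring a target sequence of length n+1 amounts to running the recursion one step beyond the input sequence for each candidate alarm.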
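It can be understood that, before the model is trained, the training data set must be acquired and preprocessed, and the model selection strategy must be determined. Specifically, acquiring the training data set includes: in combination with operation and maintenance knowledge, clarifying the causal relationships among network element objects, faults, and alarms, selecting original data accordingly, and preprocessing these original data to obtain a preliminary training data set. Dividing the training data includes: continuously adjusting the amount selected for the initial training set to produce several sub training sets, and dividing every resulting sub training set into a training set and a test set at a fixed ratio.
In addition, the training process includes: for every sub training set, estimating the parameters by methods such as maximum likelihood estimation to form a model. The model selection criterion includes: for the model produced from each sub training set, validating the model with the corresponding sub test set, where the validation is based on the proportion of alarms predicted correctly within the future prediction period (or within several observations), and taking the model with the highest proportion among all models as the final model.
To explain the above process more clearly, a specific example is given below, without limiting the protection scope of the present application. The method for predicting alarms in business operation and maintenance of the embodiments of the present application includes the following processing steps:
First, make the following assumptions: the set of network element objects in a region or a network element group is $S=\{s_1,s_2,\ldots,s_n\}$, $n=1,2,\ldots$, where $n$ is the total number of network element objects; the set of alarm categories generated by all network element objects is $O=\{o_1,o_2,\ldots,o_m\}$, $m=1,2,\ldots$, where $m$ is the total number of alarm categories.
Next, the specific processing is described.
Step 1: according to the relationship that network element objects generate alarms, and in combination with hidden Markov theory, take the network element objects as states and the alarm categories as observations.
Step 2: acquire a certain amount of historical alarm data to form the historical alarm sequence $D=\{(o_1,s_1),(o_2,s_2),\ldots,(o_d,s_d)\}$, $d=1,2,\ldots$, where $d$ is the number of records in the data set, and $o$ and $s$ are elements of the sets $O$ and $S$ respectively, in a one-to-one relationship within each record.
Step 3: using maximum likelihood estimation, train and obtain the hidden Markov model $M=(\pi,A,B)$, where $\pi$ is the initial probability distribution vector (the initial probability distribution of each network element object in $D$), as shown in equation (1); $A$ is the state transition matrix (the probability in $D$ of moving from network element object $s_i$ at one time step to network element object $s_j$ at the next), as shown in equation (2); and $B$ is the observation probability matrix (the probability that the corresponding alarm $o_i$ occurs in $D$), as shown in equation (3).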
$\pi = (\pi_1, \pi_2, \ldots, \pi_n)$;                   (1)
$A = (a_{ij})_{n \times n}$;                   (2)
$B = (b_{jk})_{n \times m}$;                   (3)
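Here $n$ and $m$ have the same meaning as before, $\pi_i$ denotes the initial probability of the $i$-th network element object, $a_{ij}$ denotes the probability that the state is $i$ at one time step and changes from $i$ to $j$ at the next, and $b_{jk}$ denotes the probability that observation $k$ occurs when the state is $j$.
Equations (4)-(8) give the quantities used, following the idea of maximum likelihood estimation, to solve for the unknowns in equations (1)-(3).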
$A' = (A_{ij})_{n \times n}$;                   (4)
$B' = (B_{jk})_{n \times m}$;                   (5)
$\pi_i = \dfrac{\operatorname{count}(s_i)}{d}$;                   (6)
$a_{ij} = \dfrac{A_{ij}}{\sum_{j'=1}^{n} A_{ij'}}$;                   (7)
$b_{jk} = \dfrac{B_{jk}}{\sum_{k'=1}^{m} B_{jk'}}$;                   (8)
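Here, each component $\pi_i$ of $\pi$ in equation (1) is obtained by dividing the frequency of the corresponding state in the data set by the total number of records in the data set; $A_{ij}$ denotes the number of times the state is $i$ at one time step and changes from $i$ to $j$ at the next; $a_{ij}$ in equation (2) is obtained by dividing $A_{ij}$ by the sum of the elements in the corresponding row of $A'$ in equation (4); $B_{jk}$ denotes the number of times observation $k$ occurs when the state is $j$; and $b_{jk}$ in equation (3) is obtained by dividing $B_{jk}$ by the sum of the elements in the corresponding row of $B'$ in equation (5).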
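The counting procedure just described can be sketched in a few lines. This is an illustrative reading of the text, not the applicant's code: the function names are hypothetical, samples are assumed to be time-ordered integer `(observation, state)` pairs, and the uniform fallback for a state with no recorded transitions or observations is an added assumption that the application does not discuss.

```python
def normalize(row):
    """Divide each count by the row sum; fall back to uniform for empty rows."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)  # assumed fallback, not in the source
    return [count / total for count in row]


def estimate_hmm(samples, n_states, n_obs):
    """Estimate (pi, A, B) by frequency counting from labelled samples.

    samples: time-ordered list of (observation, state) integer pairs, where
    states are network element objects and observations are alarm categories.
    """
    d = len(samples)
    state_counts = [0] * n_states
    transition_counts = [[0] * n_states for _ in range(n_states)]  # A'
    emission_counts = [[0] * n_obs for _ in range(n_states)]       # B'

    for observation, state in samples:
        state_counts[state] += 1
        emission_counts[state][observation] += 1
    for (_, previous), (_, following) in zip(samples, samples[1:]):
        transition_counts[previous][following] += 1

    pi = [count / d for count in state_counts]
    A = [normalize(row) for row in transition_counts]
    B = [normalize(row) for row in emission_counts]
    return pi, A, B


samples = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
pi, A, B = estimate_hmm(samples, n_states=2, n_obs=2)
```

Because both the state and the observation are recorded in every sample, the estimates reduce to normalized counts and no iterative procedure such as Baum-Welch is required in this sketch.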
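Step 4: evaluate the training effect. Group the data set $D$ according to Table 1 (in general, as long as the network element objects have not been decommissioned and the relevant structure is unchanged, the more data the better), evaluate the prediction accuracy of the model for each group, and select the model with the highest accuracy from all models.
Step 5: predict with the trained hidden Markov prediction model. That is, for the observation (alarm) sequence $o_i, o_{i+1}, \ldots, o_{i+j-1}$ at a given time, predict the probability that $o_{i+j}$ occurs in the next prediction period: select each $o_k$ from $O$ in turn to form $m$ sequences $o_i, o_{i+1}, \ldots, o_{i+j-1}, o_k$; perform forward calculation with the hidden Markov prediction model to obtain each $P(o_i, o_{i+1}, \ldots, o_{i+j-1}, o_k \mid M)$; and finally select
$\max_{1 \le k \le m} P(o_i, o_{i+1}, \ldots, o_{i+j-1}, o_k \mid M)$
The alarm corresponding to that $k$ is the predicted alarm for the next prediction period.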
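Step 5 can be sketched as follows, using the standard forward algorithm to score each candidate sequence. The function names and the numeric parameters are made up for illustration; only the procedure (append each candidate alarm, run the forward calculation, take the maximum) comes from the text.

```python
def forward_probability(pi, A, B, observations):
    """P(observations | M) computed with the standard forward algorithm."""
    n = len(pi)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs]
            for j in range(n)
        ]
    return sum(alpha)


def predict_next_alarm(pi, A, B, history):
    """Score history + [k] for every alarm category k and return the best k."""
    n_obs = len(B[0])
    scores = [forward_probability(pi, A, B, history + [k]) for k in range(n_obs)]
    best = max(range(n_obs), key=lambda k: scores[k])
    return best, scores


# Illustrative parameters: 2 network element objects (states), 2 alarm categories.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
history = [0, 0]
best, scores = predict_next_alarm(pi, A, B, history)
```

Since the candidate sequences share the same prefix, their probabilities sum to the probability of the history itself, which gives a convenient sanity check on an implementation.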
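With the supervised-learning hidden Markov alarm prediction method, the embodiments of the present application can fairly accurately predict the alarm sequence that will arise in a short period in the future and the network element objects that raise those alarms, supporting decisions such as fault avoidance, while shortening fault handling time and reducing the impact of faults.
Based on the same inventive concept, the embodiments of the present application provide, according to the above embodiments, an apparatus for predicting alarms in business operation and maintenance, which is used to implement the prediction of alarms in business operation and maintenance in the above embodiments. Therefore, the descriptions and definitions in the method for predicting alarms in business operation and maintenance of the above embodiments can be used to understand the execution modules in the embodiments of the present application; reference may be made to the above embodiments, and details are not repeated here.
According to one embodiment of the present application, the structure of the apparatus for predicting alarms in business operation and maintenance is shown in FIG. 3, which is a schematic structural diagram of the apparatus provided by an embodiment of the present application. The apparatus can be used to implement the prediction of alarms in business operation and maintenance in the above method embodiments, and includes a data acquisition module 301 and a prediction output module 302, wherein:
the data acquisition module 301 is configured to acquire a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence; and the prediction output module 302 is configured to perform alarm prediction on the target network element object based on the historical alarm sequence using the trained hidden Markov prediction model. The trained hidden Markov prediction model is initialized and constructed in advance by analyzing relationship information among network element objects, network element faults, and network element alarms in the fault management system, and is obtained by training with original data samples selected according to the relationship information.
Specifically, the data acquisition module 301 obtains, from the historical record data of the fault management system, the alarms that the target network element object issued before the current alarm period, that is, the historical alarm data. To avoid errors caused by chance, and taking into account the characteristics of the hidden Markov prediction model, the amount of selected historical alarm data must reach a certain quantity, which can be set in advance. The data acquisition module 301 can then process these historical alarm data in time order and encode them to form a data sequence, that is, the historical alarm sequence.
Then, the prediction output module 302 inputs the obtained historical alarm sequence of the target network element object into the pre-trained hidden Markov prediction model and, through the forward calculation of the prediction model, obtains the predicted alarms of the target network element object for the next one or more alarm periods.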
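The two-module structure of FIG. 3 could be organized in software roughly as below. This is only a structural sketch: the class and field names, the injected `fetch_history` and `predict_next` callables, and the stand-in predictor are all hypothetical, and the application does not prescribe any particular programming interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence


@dataclass
class DataAcquisitionModule:
    """Counterpart of module 301: builds the encoded historical alarm sequence."""
    fetch_history: Callable[[str], Sequence[str]]  # hypothetical: time-ordered alarm names
    alarm_codes: Dict[str, int]
    n: int  # given number of historical alarms

    def acquire(self, element_id: str) -> List[int]:
        history = list(self.fetch_history(element_id))[-self.n:]
        return [self.alarm_codes[name] for name in history]


@dataclass
class PredictionOutputModule:
    """Counterpart of module 302: applies the trained model to the sequence."""
    model: Any
    predict_next: Callable[[Any, List[int]], int]

    def predict(self, sequence: List[int]) -> int:
        return self.predict_next(self.model, sequence)


@dataclass
class AlarmPredictionApparatus:
    acquisition: DataAcquisitionModule
    output: PredictionOutputModule

    def run(self, element_id: str) -> int:
        return self.output.predict(self.acquisition.acquire(element_id))


# Stand-ins for the fault management system and the trained model.
apparatus = AlarmPredictionApparatus(
    acquisition=DataAcquisitionModule(
        fetch_history=lambda element_id: ["HIGH_CPU", "LINK_DOWN", "LINK_DOWN"],
        alarm_codes={"HIGH_CPU": 0, "LINK_DOWN": 1},
        n=2,
    ),
    output=PredictionOutputModule(
        model=None,
        predict_next=lambda model, sequence: sequence[-1],  # repeat last alarm
    ),
)
predicted = apparatus.run("ne-1")
```

Injecting the history source and the predictor as callables keeps the two modules independent, which matches the description of them as separate execution modules.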
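It can be understood that, before the prediction model is applied, it must be built in advance with a model building method. Specifically, the relationships among network element objects, faults, and alarms in the fault management system can be analyzed first, and an initial hidden Markov model can be initialized and constructed on that basis. Then, according to the results of this analysis, the corresponding original alarm data are selected, processed, and used to train the initial hidden Markov model, finally yielding the trained hidden Markov prediction model, which can be used for alarm prediction of network element objects.
The apparatus for predicting alarms in business operation and maintenance provided by the embodiments of the present application, by providing corresponding execution modules, analyzes the relationships among network element objects, faults, and alarms in the fault management system and, in combination with a hidden Markov prediction model, processes a target sequence constructed from the historical alarms of a network element object to finally predict the alarms of the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding faults or reducing their impact.
It can be understood that, in the embodiments of the present application, the relevant program modules in the apparatus of the above embodiments can be implemented by a hardware processor. Moreover, the apparatus for predicting alarms in business operation and maintenance of the embodiments of the present application can use these program modules to carry out the alarm prediction process of the above method embodiments. When it is used to predict alarms in business operation and maintenance as in the above method embodiments, the beneficial effects produced by the apparatus are the same as those of the corresponding method embodiments, to which reference may be made; they are not repeated here.
As a further aspect of the embodiments of the present application, this embodiment provides, according to the above embodiments, an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the method for predicting alarms in business operation and maintenance as described in the above embodiments.
Further, the electronic device of the embodiments of the present application may also include a communication interface and a bus. FIG. 4 is a schematic diagram of the physical structure of the electronic device provided by an embodiment of the present application, including: at least one memory 401, at least one processor 402, a communication interface 403, and a bus 404.
The memory 401, the processor 402, and the communication interface 403 communicate with one another through the bus 404, and the communication interface 403 is used for information transmission between the electronic device and the fault management system device. The memory 401 stores a computer program that can run on the processor 402; when the processor 402 executes the computer program, it implements the steps of the method for predicting alarms in business operation and maintenance as described in the above embodiments.
It can be understood that the electronic device includes at least the memory 401, the processor 402, the communication interface 403, and the bus 404, and that the memory 401, the processor 402, and the communication interface 403 are connected to one another through the bus 404 and can communicate with one another; for example, the processor 402 reads from the memory 401 the program instructions of the method for predicting alarms in business operation and maintenance. In addition, the communication interface 403 can establish a communication connection between the electronic device and the fault management system device and carry information between them, for example to acquire the alarm data of network element objects.
When the electronic device is running, the processor 402 calls the program instructions in the memory 401 to execute the methods provided by the above method embodiments, for example including: acquiring a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence; and, based on the historical alarm sequence, performing alarm prediction on the target network element object using the trained hidden Markov prediction model.
When the program instructions in the memory 401 are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium. Alternatively, all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions; the program can be stored in a computer-readable storage medium and, when executed, performs steps including those of the above method embodiments. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The embodiments of the present application also provide, according to the above embodiments, a non-transitory computer-readable storage medium storing computer instructions which, when executed by a computer, implement the steps of the method for predicting alarms in business operation and maintenance as described in the above embodiments, for example including: acquiring a given number of historical alarm data of the target network element object before the current alarm period to form a historical alarm sequence; and, based on the historical alarm sequence, performing alarm prediction on the target network element object using the trained hidden Markov prediction model.
The electronic device and the non-transitory computer-readable storage medium provided by the embodiments of the present application, by executing the steps of the method for predicting alarms in business operation and maintenance described in the above embodiments, analyze the relationships among network element objects, faults, and alarms in the fault management system and, in combination with a hidden Markov prediction model, process a target sequence constructed from the historical alarms of a network element object to finally predict the alarms of the network element object. This can effectively improve the accuracy of alarm prediction in business operation and maintenance, thereby effectively avoiding faults or reducing their impact.
It can be understood that the embodiments of the apparatus, electronic device, and storage medium described above are merely illustrative. The units described as separate components may or may not be physically separate; they may be located in one place or distributed over different network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
From the description of the above implementations, a person skilled in the art can clearly understand that each implementation can be realized by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solution, in essence or in the part that contributes to the prior art, can be embodied in the form of a computer software product, which can be stored in a computer-readable storage medium such as a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk, or an optical disc, and which includes several instructions for causing a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the above method embodiments or in certain parts of them.
In addition, a person skilled in the art should understand that, in the application documents of the embodiments of the present application, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Numerous specific details are set out in the specification of the embodiments of the present application. It should be understood, however, that the embodiments of the present application can be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this specification. Similarly, it should be understood that, to streamline the disclosure of the embodiments of the present application and aid the understanding of one or more of the inventive aspects, the features of the embodiments of the present application are sometimes grouped together into a single embodiment, figure, or description thereof in the above description of exemplary embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the embodiments of the present application, not to limit them. Although the embodiments of the present application have been described in detail with reference to the foregoing embodiments, a person skilled in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.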

Claims (10)

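  1. A method for predicting alarms in business operation and maintenance, characterized by including:
    acquiring a given number of historical alarm data of a target network element object before the current alarm period to form a historical alarm sequence;
    based on the historical alarm sequence, performing alarm prediction on the target network element object using a trained hidden Markov prediction model;
    wherein the trained hidden Markov prediction model is initialized and constructed in advance by analyzing relationship information among network element objects, network element faults, and network element alarms in a fault management system, and is obtained by training with original data samples selected according to the relationship information.
  2. The method for predicting alarms in business operation and maintenance according to claim 1, characterized in that, before the step of performing alarm prediction on the target network element object using the trained hidden Markov prediction model, the method further includes:
    initializing and constructing a supervised-learning-based initial hidden Markov model by analyzing the relationship information among network element objects, network element faults, and network element alarms in the fault management system, and selecting corresponding historical alarm data in the fault management system according to the relationship information to form a training sample set;
    iteratively training the initial hidden Markov model with the sample data in the training sample set using maximum likelihood estimation, and obtaining a prediction model that meets a set criterion as the trained hidden Markov prediction model.
  3. The method for predicting alarms in business operation and maintenance according to claim 2, characterized in that the step of selecting corresponding historical alarm data in the fault management system to form a training sample set specifically includes:
    in combination with operation and maintenance knowledge, selecting a second given number of historical alarm data by analyzing the causal relationships among network element objects, network element faults, and network element alarms in the fault management system, the historical alarm data including a one-to-one correspondence between the network element object and the network element alarm;
    preprocessing the historical alarm data with respect to temporal order and missing values, and encoding the preprocessing result to obtain sample data;
    forming the training sample set from all the sample data.
  4. The method for predicting alarms in business operation and maintenance according to claim 2, characterized in that, before the step of iteratively training the initial hidden Markov model, the method further includes:
    continuously adjusting the amount of sample data in the training sample set, and dividing the training sample set to produce multiple sub training sample sets;
    dividing each of the resulting sub training sample sets into a training set and a test set at a fixed ratio.
  5. The method for predicting alarms in business operation and maintenance according to claim 4, characterized in that the step of iteratively training the initial hidden Markov model specifically includes:
    using the training set of each sub training sample set, iteratively training the initial hidden Markov model by maximum likelihood estimation to obtain multiple corresponding candidate prediction models;
    using the test set of each sub training sample set to verify whether the corresponding candidate prediction model meets the set criterion, and selecting the prediction model that meets the set criterion as the trained hidden Markov prediction model;
    wherein the set criterion is that the accuracy of the prediction results verified with the test set is the highest.
  6. The method for predicting alarms in business operation and maintenance according to any one of claims 1-5, characterized in that the step of performing alarm prediction on the target network element object using the trained hidden Markov prediction model specifically includes:
    selecting multiple alarms of different categories from the set of alarms generated by all network element objects; based on the historical alarm sequence and each selected alarm, performing forward calculation with the trained hidden Markov prediction model to obtain the probability corresponding to each selected alarm; and determining the alarm prediction result of the target network element object based on the probabilities.
  7. The method for predicting alarms in business operation and maintenance according to claim 6, characterized in that the step of determining the alarm prediction result of the target network element object based on the probabilities specifically includes:
    sorting all the probabilities by magnitude, and taking the alarm corresponding to the largest value in the sorted result as the alarm of the target network element object in the next prediction period.
  8. An apparatus for predicting alarms in business operation and maintenance, characterized by including:
    a data acquisition module, configured to acquire a given number of historical alarm data of a target network element object before the current alarm period to form a historical alarm sequence;
    a prediction output module, configured to perform alarm prediction on the target network element object based on the historical alarm sequence using a trained hidden Markov prediction model;
    wherein the trained hidden Markov prediction model is initialized and constructed in advance by analyzing relationship information among network element objects, network element faults, and network element alarms in a fault management system, and is obtained by training with original data samples selected according to the relationship information.
  9. An electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method for predicting alarms in business operation and maintenance according to any one of claims 1 to 7.
  10. A non-transitory computer-readable storage medium storing computer instructions, characterized in that the computer instructions, when executed by a computer, implement the steps of the method for predicting alarms in business operation and maintenance according to any one of claims 1 to 7.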
PCT/CN2020/101818 2019-12-02 2020-07-14 业务运维中告警的预测方法、装置与电子设备 WO2021109578A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911215004.0 2019-12-02
CN201911215004.0A CN111124840B (zh) 2019-12-02 2019-12-02 业务运维中告警的预测方法、装置与电子设备

Publications (1)

Publication Number Publication Date
WO2021109578A1 true WO2021109578A1 (zh) 2021-06-10

Family

ID=70496872

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/101818 WO2021109578A1 (zh) 2019-12-02 2020-07-14 业务运维中告警的预测方法、装置与电子设备

Country Status (2)

Country Link
CN (1) CN111124840B (zh)
WO (1) WO2021109578A1 (zh)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111124840B (zh) * 2019-12-02 2022-02-08 北京天元创新科技有限公司 业务运维中告警的预测方法、装置与电子设备
CN111611517B (zh) * 2020-05-13 2023-07-21 咪咕文化科技有限公司 指标监控方法、装置、电子设备及存储介质
CN112231183B (zh) * 2020-07-13 2022-09-30 国网宁夏电力有限公司电力科学研究院 通信设备告警预测方法和装置、电子设备和可读存储介质
CN114095344B (zh) * 2020-08-04 2023-10-27 中国移动通信集团重庆有限公司 批量网络投诉的故障定位方法、设备及计算机存储介质
CN112085621B (zh) * 2020-09-11 2022-08-02 杭州华电下沙热电有限公司 一种基于K-Means-HMM模型的分布式光伏电站故障预警算法
CN112132195B (zh) * 2020-09-14 2024-03-29 江西山水光电科技股份有限公司 一种利用马尔科夫模型分析与预测机房故障的方法
CN112148561B (zh) * 2020-09-28 2022-12-09 建信金融科技有限责任公司 业务系统的运行状态预测方法、装置及服务器
CN112422351B (zh) * 2021-01-21 2022-12-09 南京群顶科技股份有限公司 一种基于深度学习的网络告警预测模型建立方法及装置
CN115208773B (zh) * 2021-04-09 2023-09-19 中国移动通信集团广东有限公司 网络隐性故障监测方法及装置
CN113446988B (zh) * 2021-06-08 2022-04-15 武汉理工大学 基于云边融合架构的机场跑道道面状态监测系统及方法
CN113420917B (zh) * 2021-06-18 2023-10-27 广东工业大学 对业务系统未来故障预测的方法、计算机设备及存储介质
CN113395182B (zh) * 2021-06-21 2022-03-18 山东八五信息技术有限公司 具有故障预测的智能网络设备管理系统及方法
CN113852515B (zh) * 2021-08-26 2023-05-09 西安电子科技大学广州研究院 一种数字孪生网络的节点状态管控方法及系统
CN113780597B (zh) * 2021-09-16 2023-04-07 睿云奇智(重庆)科技有限公司 影响传播关系模型构建和告警影响评估方法、计算机设备、存储介质
CN113835961B (zh) * 2021-09-23 2023-05-16 中国联合网络通信集团有限公司 告警信息监控方法、装置、服务器及存储介质
CN113988452A (zh) * 2021-11-08 2022-01-28 成都四方伟业软件股份有限公司 一种基于stacked LSTM的网元告警预测方法及装置
CN114374597A (zh) * 2021-12-27 2022-04-19 浪潮通信信息系统有限公司 一种网络事件的故障处理方法、装置、设备及产品
CN114422322B (zh) * 2021-12-29 2024-04-30 中国电信股份有限公司 一种告警压缩的方法、装置、设备及存储介质
CN114201246A (zh) * 2022-02-18 2022-03-18 浙江中控技术股份有限公司 数据预测方法及相关设备
CN114692487B (zh) * 2022-03-11 2023-05-26 中国电子科技集团公司第二十九研究所 电子装备维修备件预投方法、装置、设备及存储介质
CN114844767A (zh) * 2022-04-27 2022-08-02 中国电子科技集团公司第五十四研究所 一种基于对抗生成网络的告警数据生成方法
CN115134260A (zh) * 2022-07-12 2022-09-30 北京东土拓明科技有限公司 用户感知提升方法及装置、计算设备和存储介质
CN115361061A (zh) * 2022-08-24 2022-11-18 中铁电气化局集团有限公司 一种光纤故障监测方法
CN115311829B (zh) * 2022-10-12 2023-02-03 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) 一种基于海量数据的精准告警方法及系统
CN117057676B (zh) * 2023-10-11 2024-02-23 深圳润世华软件和信息技术服务有限公司 多数据融合的故障分析方法、设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107995008A (zh) * 2016-10-27 2018-05-04 中兴通讯股份有限公司 一种业务告警处理方法、装置及系统
CN108681923A (zh) * 2018-05-16 2018-10-19 浙江大学城市学院 一种基于改进型隐马尔可夫模型的消费者消费行为预测方法
US20190228105A1 (en) * 2018-01-24 2019-07-25 Rocket Fuel Inc. Dynamic website content optimization
CN110224850A (zh) * 2019-04-19 2019-09-10 北京亿阳信通科技有限公司 电信网络故障预警方法、装置及终端设备
CN111124840A (zh) * 2019-12-02 2020-05-08 北京天元创新科技有限公司 业务运维中告警的预测方法、装置与电子设备

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2530781A1 (en) * 2005-12-14 2007-06-14 Peter F. Werner Electrical component monitoring system
US20130226501A1 (en) * 2012-02-23 2013-08-29 Infosys Limited Systems and methods for predicting abnormal temperature of a server room using hidden markov model
US9652525B2 (en) * 2012-10-02 2017-05-16 Banjo, Inc. Dynamic event detection system and method
CN103856344B (zh) * 2012-12-05 2017-09-15 中国移动通信集团北京有限公司 一种告警事件信息处理方法及装置
CN107562606A (zh) * 2017-08-29 2018-01-09 郑州云海信息技术有限公司 一种告警监控数据显示方法和装置
CN107822622B (zh) * 2017-09-22 2022-09-09 成都比特律动科技有限责任公司 基于深度卷积神经网络的心电图诊断方法和系统
CN109117941A (zh) * 2018-07-16 2019-01-01 北京思特奇信息技术股份有限公司 告警预测方法、系统、存储介质及计算机设备
CN108880915B (zh) * 2018-08-20 2023-03-24 全球能源互联网研究院有限公司 一种电力信息网络安全告警信息误报判定方法和系统

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537349A (zh) * 2021-07-16 2021-10-22 中国工商银行股份有限公司 大型主机硬件故障识别方法、装置、设备及存储介质
CN113627496A (zh) * 2021-07-27 2021-11-09 交控科技股份有限公司 道岔转辙机故障预测方法、装置、电子设备和可读存储介质
CN113691311A (zh) * 2021-08-27 2021-11-23 中国科学院半导体研究所 光网络的故障定位方法、电子设备及计算机可读存储介质
CN113568991A (zh) * 2021-09-22 2021-10-29 北京必示科技有限公司 一种基于动态风险的告警处理方法及系统
CN113821408A (zh) * 2021-09-23 2021-12-21 中国建设银行股份有限公司 一种服务器告警处理方法及相关设备
CN113987481A (zh) * 2021-12-23 2022-01-28 浙江国利网安科技有限公司 工控入侵检测方法、装置、存储介质和设备
CN114629813A (zh) * 2021-12-30 2022-06-14 亚信科技(中国)有限公司 意图报告上报方法、装置、电子设备、存储介质及产品
CN114697203A (zh) * 2022-03-31 2022-07-01 浙江省通信产业服务有限公司 一种网络故障的预判方法、装置、电子设备及存储介质
CN114697203B (zh) * 2022-03-31 2023-07-25 浙江省通信产业服务有限公司 一种网络故障的预判方法、装置、电子设备及存储介质
CN115001753A (zh) * 2022-05-11 2022-09-02 绿盟科技集团股份有限公司 一种关联告警的分析方法、装置、电子设备及存储介质
CN115001753B (zh) * 2022-05-11 2023-06-09 绿盟科技集团股份有限公司 一种关联告警的分析方法、装置、电子设备及存储介质
CN114999182A (zh) * 2022-05-25 2022-09-02 中国人民解放军国防科技大学 基于lstm回馈机制的车流量预测方法、装置及设备
CN115174355A (zh) * 2022-07-26 2022-10-11 杭州东方通信软件技术有限公司 故障根因定位模型的生成方法,故障根因定位方法和装置
CN115174355B (zh) * 2022-07-26 2024-01-19 杭州东方通信软件技术有限公司 故障根因定位模型的生成方法,故障根因定位方法和装置
CN115550139B (zh) * 2022-09-19 2024-02-02 中国电信股份有限公司 故障根因定位方法、装置、系统、电子设备及存储介质
CN115550139A (zh) * 2022-09-19 2022-12-30 中国电信股份有限公司 故障根因定位方法、装置、系统、电子设备及存储介质
CN115238831B (zh) * 2022-09-21 2023-04-14 中国南方电网有限责任公司超高压输电公司广州局 故障预测方法、装置、计算机设备和存储介质
CN115238831A (zh) * 2022-09-21 2022-10-25 中国南方电网有限责任公司超高压输电公司广州局 故障预测方法、装置、计算机设备、存储介质和程序产品
CN115829172B (zh) * 2023-02-24 2023-05-12 清华大学 污染预测方法、装置、计算机设备和存储介质
CN115829172A (zh) * 2023-02-24 2023-03-21 清华大学 污染预测方法、装置、计算机设备和存储介质
CN116502156A (zh) * 2023-06-30 2023-07-28 中国电力科学研究院有限公司 一种换流站光ct异常状态智能辨识方法及系统
CN116502156B (zh) * 2023-06-30 2023-09-08 中国电力科学研究院有限公司 一种换流站光ct异常状态智能辨识方法及系统
CN116910006A (zh) * 2023-07-24 2023-10-20 深圳市盛弘新能源设备有限公司 基于新能源电池的数据压缩存储处理方法及系统
CN116910006B (zh) * 2023-07-24 2024-03-29 深圳市盛弘新能源设备有限公司 基于新能源电池的数据压缩存储处理方法及系统
CN117218300A (zh) * 2023-11-08 2023-12-12 腾讯科技(深圳)有限公司 三维模型的构建方法、三维构建模型的训练方法及装置
CN117218300B (zh) * 2023-11-08 2024-03-01 腾讯科技(深圳)有限公司 三维模型的构建方法、三维构建模型的训练方法及装置
CN117592865A (zh) * 2023-12-21 2024-02-23 中国人民解放军军事科学院系统工程研究院 一种装备零备件质量状态预测方法及装置
CN117592865B (zh) * 2023-12-21 2024-04-05 中国人民解放军军事科学院系统工程研究院 一种装备零备件质量状态预测方法及装置

Also Published As

Publication number Publication date
CN111124840B (zh) 2022-02-08
CN111124840A (zh) 2020-05-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20897095

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20897095

Country of ref document: EP

Kind code of ref document: A1