CN113420422A - Alarm log proportion prediction method, system, device and medium - Google Patents

Info

Publication number
CN113420422A
Authority
CN
China
Prior art keywords
module
sequence
proportion
predicted
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110599920.XA
Other languages
Chinese (zh)
Other versions
CN113420422B (en)
Inventor
王崇娇
杨虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Inspur Data Technology Co Ltd
Original Assignee
Jinan Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Inspur Data Technology Co Ltd
Priority to CN202110599920.XA
Publication of CN113420422A
Application granted
Publication of CN113420422B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Operations Research (AREA)
  • Databases & Information Systems (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Economics (AREA)
  • Computational Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Mathematical Physics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Geometry (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an alarm log proportion prediction method comprising the following steps: acquiring the alarm logs generated by a module to be predicted in each of a plurality of unit times and the logs generated by all modules in each of those unit times, so as to obtain the proportion of alarm logs generated by the module to be predicted in each unit time; forming the proportions into a proportion sequence and establishing an ARIMA model from the proportion sequence; acquiring a plurality of influence factor sequences corresponding to each other module and determining a state prediction model of each other module from those sequences; and predicting, from the state prediction models and the ARIMA model, the proportion of alarm logs generated by the module to be predicted in the next unit time. The invention also discloses a system, a computer device and a readable storage medium. The scheme considers both the module's own regularity and the mutual influence among all modules of the server, and thereby predicts the alarm log proportion in the T+1 time period more accurately.

Description

Alarm log proportion prediction method, system, device and medium
Technical Field
The invention relates to the field of prediction, and in particular to an alarm log proportion prediction method, system, device and storage medium.
Background
With the development of computer network technology, daily life and work depend more and more on computing, which underlines the importance of a stable and secure data center. Service communication and stability across the servers of a data center are essential, and keeping the data center stable requires keeping each server as safe and stable as possible, so the alarm state of each server module is particularly important. Existing alarm state analysis is based mainly on the current number of alarm logs. This reflects the state of a server module fairly accurately, but it is not timely, and it generally analyzes a single server module in isolation, ignoring the relationships among modules.
Disclosure of Invention
In view of this, in order to overcome at least one aspect of the above problem, an embodiment of the present invention provides an alarm log proportion prediction method, including the following steps:
acquiring alarm logs respectively generated by a module to be predicted in a plurality of unit times and acquiring logs respectively generated by all modules in a plurality of unit times to obtain the proportion of the alarm logs generated by the module to be predicted in each unit time;
forming a proportion sequence by the plurality of proportions and establishing an ARIMA model by utilizing the proportion sequence;
acquiring a plurality of influence factor sequences corresponding to each other module and determining a state prediction model of each other module according to the plurality of influence factor sequences;
and predicting the proportion of the alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
In some embodiments, forming the plurality of proportions into a proportion sequence and establishing an ARIMA model from the proportion sequence further comprises:
performing differencing of successive orders on the proportion sequence until a stationary differenced proportion sequence is obtained, and recording the current order d;
applying the ACF function to the differenced proportion sequence to determine the parameter p and the autocorrelation coefficients of the ARIMA model, and applying the PACF function to the differenced proportion sequence to determine the parameter q and the partial autocorrelation coefficients of the ARIMA model;
and constructing the ARIMA model from the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients and the proportion sequence.
In some embodiments, constructing the ARIMA model from the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients and the proportion sequence further comprises constructing the ARIMA model according to:
$\varphi(B)\nabla^{d}W_t=\theta(B)f_t$
$\theta(B)=1-\theta_1B-\theta_2B^2-\dots-\theta_qB^q$
$\varphi(B)=1-\varphi_1B-\varphi_2B^2-\dots-\varphi_pB^p$
$\nabla^{d}=(1-B)^{d}$
wherein $W_t$ is the proportion sequence; $B$ is the delay operator; $\theta_1,\theta_2,\theta_3,\dots,\theta_q$ are the partial autocorrelation coefficients; $\varphi_1,\varphi_2,\dots,\varphi_p$ are the autocorrelation coefficients; and $f_t$ is the error.
In some embodiments, obtaining a plurality of influence factor sequences corresponding to each of the other modules further includes:
acquiring, for each other module, the sequence of abnormality counts per unit time $\{AN_1,AN_2,\dots,AN_n\}$, the sequence of module historical life values $\{MT_1,MT_2,\dots,MT_n\}$ and the sequence of interval durations between adjacent abnormal conditions $\{IT_1,IT_2,\dots,IT_n\}$;
wherein n is the total number of other modules.
In some embodiments, determining the state prediction model of each other module from the plurality of influence factor sequences further comprises:
calculating, according to $Y=m(X_i)+\varepsilon$, the state value corresponding to each element in each influence factor sequence, where $m(\cdot)$ is the regression function, $\varepsilon$ is the error disturbance term and $X_i$ is an element of an influence factor sequence; calculating, according to
[formula image: kernel-weighted estimate of the state value at x, expressed in terms of $X_i$, $X_j$, $Y_i$ and $K$]
the predicted state value when each influence factor takes the value x; and taking
[formula image: state prediction model built from the three predicted state values]
as the state prediction model of each other module; wherein $X_i$ and $X_j$ are respectively the i-th and j-th elements of the abnormality count sequence, the module historical life value sequence or the sequence of interval durations between adjacent abnormal conditions, $Y_i$ is the state value corresponding to $X_i$, $K$ is a second-order Gaussian kernel function, and the three terms of the model are the predicted state values obtained when the number of abnormalities per unit time takes the value $x_{an}$, when the module historical life takes the value $x_{mt}$, and when the interval duration between adjacent abnormal conditions takes the value $x_{it}$.
In some embodiments, predicting a proportion of an alarm log generated by the module to be predicted in a next unit time according to the state prediction model and the ARIMA model, further comprises:
calculating the correlation between the state value of each other module predicted by the state prediction model and the differenced proportion sequence;
taking each state value whose correlation is greater than a threshold as a linear parameter, and each state value whose correlation is not greater than the threshold as a nonlinear parameter;
calculating a correction value according to $Y=U^{T}\beta+m(T)+\varepsilon$, where $U=(U_1,\dots,U_a)^{T}$, $U_1,\dots,U_a$ represent the a linear parameters, $T=(T_1,\dots,T_b)^{T}$, $T_1,\dots,T_b$ represent the b nonlinear parameters, $\beta$ is a coefficient, and $\varepsilon$ is an error disturbance term.
In some embodiments, the method further comprises predicting the proportion using:
[formula image: combined predicted value of the proportion after error correction]
wherein [symbol image] is the differenced proportion sequence, $\alpha_t,\alpha_{t-1},\dots,\alpha_{t-p}$ are [formula image], and $\beta_{t-1},\dots,\beta_{t-q}$ are $\theta_2B^2,\dots,\theta_qB^q$.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides an alarm log proportion prediction system, including:
the acquisition module is configured to acquire alarm logs respectively generated by the module to be predicted in a plurality of unit times and acquire logs respectively generated by all the modules in the plurality of unit times so as to obtain the proportion of the alarm logs generated by the module to be predicted in each unit time;
the first model building module is configured to form the plurality of proportions into a proportion sequence and establish an ARIMA model from the proportion sequence;
the second model building module is configured to obtain a plurality of influence factor sequences corresponding to each other module and determine a state prediction model of each other module according to the plurality of influence factor sequences;
and the prediction module is configured to predict the proportion of the alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer apparatus, including:
at least one processor; and
a memory storing a computer program operable on the processor, wherein the processor executes the program to perform any of the steps of the alarm log proportion prediction method described above.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of any of the alarm log proportion prediction methods described above.
The invention has at least the following beneficial technical effect: the scheme is based on the system logs and on three influence factor data sets, namely the historical service life of the modules, the number of abnormalities per unit time, and the interval duration between adjacent abnormal conditions. It considers both the module's own regularity and the mutual influence among all modules of the server: an NPM factor system is added to an ARIMA main system, and an SPM correction-combination prediction system is established to predict the alarm log proportion of the module to be predicted. Thus, from the system logs collected by ISREST, an ARIMA model and an SPM model are established on the basis of the module's own regularity and the influence of the other factor modules on it, the alarm log proportion of the module to be predicted in the T+1 time period is predicted more accurately, and the corresponding alarm level is derived and provided to the server's operation and maintenance personnel, giving them a timely and accurate early judgment so that greater losses can be avoided.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings without creative effort.
Fig. 1 is a schematic flow chart of an alarm log proportion prediction method according to an embodiment of the present invention;
Fig. 2 is an architecture diagram of an alarm log proportion prediction method according to an embodiment of the present invention;
Fig. 3 is a block diagram of a data processing system according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an alarm log proportion prediction system according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that the expressions "first" and "second" in the embodiments of the present invention are used to distinguish two entities or parameters that share the same name but are not identical. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.
According to an aspect of the present invention, an embodiment of the present invention provides an alarm log proportion prediction method, as shown in fig. 1, which may include the steps of:
S1, acquiring the alarm logs generated by the module to be predicted in each of a plurality of unit times and the logs generated by all modules in each of those unit times, so as to obtain the proportion of alarm logs generated by the module to be predicted in each unit time;
S2, forming the proportions into a proportion sequence and establishing an ARIMA model from the proportion sequence;
S3, acquiring a plurality of influence factor sequences corresponding to each other module and determining a state prediction model of each other module according to the plurality of influence factor sequences;
S4, predicting the proportion of alarm logs generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
In the scheme provided by the invention, the alarm state of the module to be predicted is predicted from the proportion of that module's system alarm logs and from the alarm-state influence factor data sets of the server's other factor modules. Taking the proportion of the module's system alarm logs among all logs, a relative variable, as the main analysis factor makes more objective use of the log data. The relationship between the alarm log proportion and its own historical values is also considered: an Autoregressive Integrated Moving Average (ARIMA) model is established by analyzing the independence and stationarity of the data, and future values are predicted from historical values. Because the modules of a server influence one another and cannot be separated, the module to be predicted is affected not only by its own historical values but also by the server's other factor modules. The residual sequence of the ARIMA predicted values is therefore corrected using three influence factors of the other factor modules, namely historical service life, number of abnormalities per unit time, and interval duration between adjacent abnormal conditions, so that the alarm state of the module to be predicted can be predicted more accurately. It should be noted that the user may change which module is the module to be predicted and which are the other factor modules as needed.
In the embodiment of the invention, starting from the time-series proportion of server system alarm logs, the influence factors of module historical service life, number of abnormalities per unit time and interval duration between adjacent abnormal conditions are added, and an ARIMA model and an SPM model are established to predict the alarm state of the module to be predicted. First, system logs are collected in-band through a server REST tool (ISREST) to obtain the number of warning-level logs and the total number of logs for a plurality of modules such as the CPU, DISK, DIRVER, GPU, HBA, MEMORY, NIC, RAID, SYSTEM and FAN. The proportion of warning-level logs of the server's module to be predicted in each unit time is calculated, the proportions are arranged in order along the time axis, and an ARIMA model is established on them as the main system to obtain a preliminary predicted value of the dependent variable for the T+1 time period. The modules of a server may be strongly associated with one another, or may be independent with little influence on other modules. In either case the alarm state of the module to be predicted depends not only on its own historical data but also on the other modules, and because it is uncertain whether their relationship to the module to be predicted is linear, an SPM model is established on top of the main model to correct the error and improve prediction accuracy. The prediction result is provided to server operation and maintenance personnel so that the module to be predicted can be checked in advance and more serious losses avoided.
In some embodiments, as shown in FIG. 2, the modular prediction approach of the ARIMA model and the SPM model may be implemented using a data collection system, a data processing system, an ARIMA host system, an NPM factor system, and an SPM correction-combination prediction system.
In some embodiments, the data collection system may use the in-band log collection function of the server management software ISREST to obtain the logs of a plurality of server modules such as the CPU, DISK, DIRVER, GPU, HBA, MEMORY, NIC, RAID, SYSTEM and FAN. It is generally considered that the more warning-level logs a module has, the more serious its alarm state is. Because the log collection times are uneven and the numbers of the various logs are not uniform, the method takes the proportion of warning-level logs of the module to be predicted as the main object of study. One module is taken as the module to be predicted $M_W$ and the other server modules serve as factor modules, giving n factor modules $\{M_1,M_2,\dots,M_n\}$. From the large number of logs, the number of warning-level logs of the module to be predicted, $N_W$, and the total number of logs of all modules, $N_{sum}$, are extracted, and the module historical life $\{MT_1,MT_2,\dots,MT_n,MT_{n+1}\}$, the number of abnormalities per unit time $\{AN_1,AN_2,\dots,AN_n,AN_{n+1}\}$ and the interval duration between adjacent abnormal conditions $\{IT_1,IT_2,\dots,IT_n,IT_{n+1}\}$ are selected as the variables that influence the state of the factor modules.
In some embodiments, as shown in fig. 3, the data processing system may calculate, along the time axis and from the data set of the data collection system, the proportion of warning-level logs of the module to be predicted as the input parameter of the ARIMA main system. The historical service life of the factor modules, the number of abnormalities per unit time and the interval duration between adjacent abnormal conditions serve as input parameters of the NPM factor system. The output parameters of the ARIMA main system and of the NPM factor system serve as input parameters of the SPM correction-combination prediction system.
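The proportion calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout `(time_slot, module, level)`, the level string `"warning"` and the function name `proportion_sequence` are all assumptions made for the example, since the source does not specify the log schema.

```python
from collections import Counter

def proportion_sequence(records, target_module, level="warning"):
    """Return [(time_slot, N_W / N_sum), ...] ordered along the time axis.

    records: iterable of (time_slot, module, level) tuples (hypothetical schema).
    N_W   = warning-level logs of the module to be predicted in the slot.
    N_sum = logs of all modules in the same slot.
    """
    target = Counter()  # N_W per time slot
    total = Counter()   # N_sum per time slot
    for slot, module, lvl in records:
        total[slot] += 1
        if module == target_module and lvl == level:
            target[slot] += 1
    return [(slot, target[slot] / total[slot]) for slot in sorted(total)]

# Toy data: two unit times, four logs in each.
records = [
    (0, "CPU", "warning"), (0, "DISK", "info"), (0, "CPU", "info"), (0, "FAN", "warning"),
    (1, "CPU", "warning"), (1, "CPU", "warning"), (1, "NIC", "info"), (1, "DISK", "warning"),
]
seq = proportion_sequence(records, "CPU")
print(seq)  # [(0, 0.25), (1, 0.5)]
```

Using a ratio rather than a raw count keeps unit times with very different log volumes comparable, which is the motivation the text gives for choosing the proportion as the main variable.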
In some embodiments, compared with out-of-band log collection, in-band collection gathers the required logs more directly and more comprehensively. The proportion of alarm logs is strongly correlated with the time axis; that is, the sequence follows a certain regularity, and future data are influenced by historical data and error disturbance terms. On this basis, an ARIMA model can be established from the historical data set to predict the system alarm log proportion for the upcoming time period.
ARIMA modeling requires the time series to be stationary, that is, the statistical characteristics of the sequence must not change over time. A non-stationary sequence can be turned into a stationary one by differencing.
In some embodiments, in step S2, forming the plurality of proportions into a proportion sequence and establishing an ARIMA model from the proportion sequence further comprises:
performing differencing of successive orders on the proportion sequence until a stationary differenced proportion sequence is obtained, and recording the current order d;
applying the ACF function to the differenced proportion sequence to determine the parameter p and the autocorrelation coefficients of the ARIMA model, and applying the PACF function to the differenced proportion sequence to determine the parameter q and the partial autocorrelation coefficients of the ARIMA model;
and constructing the ARIMA model from the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients and the proportion sequence.
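The differencing step can be sketched in a few lines. The source judges stationarity by eye from a line plot; the variance-based choice of d below is a common rule of thumb substituted here purely for illustration (an assumption, not the patent's criterion), and the function names and sample data are hypothetical.

```python
def difference(series, d=1):
    """Apply d passes of first differencing: x[t] - x[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def choose_order(series, max_d=2):
    """Heuristic (assumption): pick the d whose differenced series has the
    smallest variance; over-differencing tends to raise the variance again."""
    return min(range(max_d + 1), key=lambda d: variance(difference(series, d)))

# Hypothetical proportion sequence with an upward trend.
w = [0.10, 0.12, 0.15, 0.17, 0.21, 0.22, 0.26, 0.27, 0.31, 0.33]
d = choose_order(w)
stationary = difference(w, d)
```

A formal unit-root test could replace the heuristic where a statistics library is available; the visual check described in the text remains the method the source actually states.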
Specifically, after each order of differencing, the stationarity of the data can be judged by inspection of a line plot. If the data as a whole show no upward or downward trend, and no local subset that is visibly affected by time is observed in any region, the data after the current order of differencing are considered stationary, and the stationary sequence (that is, the differenced proportion sequence) is recorded as
$\nabla^{d}W_t$
The differencing procedure may be:
$\nabla W_t=W_t-W_{t-1},\qquad \nabla^{d}W_t=\nabla^{d-1}W_t-\nabla^{d-1}W_{t-1}$
Then, the parameters p and q, the autocorrelation coefficients and the partial autocorrelation coefficients of the ARIMA model can be calculated by applying the ACF function and the PACF function to the differenced proportion sequence.
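As an illustration of the two functions named above, the following sketch computes the sample ACF directly and the sample PACF through the Durbin-Levinson recursion. The function names and the toy series are assumptions. Note that the source reads p from the ACF and q from the PACF, whereas the textbook Box-Jenkins convention reads the autoregressive order from the PACF and the moving-average order from the ACF; the sketch only computes both functions and leaves the reading of the orders to the user.

```python
import math

def acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, max_lag)
    out = [1.0]
    phi_prev = []  # phi_prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

def significant_lags(values, n):
    """Lags whose coefficient lies outside the approximate 95% band +/-1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(n)
    return [k for k, v in enumerate(values) if k > 0 and abs(v) > bound]

# Toy alternating series, used only to exercise the functions.
x = [1, 2, 1, 2, 1, 2, 1, 2]
r = acf(x, 2)
pr = pacf(x, 2)
```

The lags flagged by `significant_lags` give candidate orders; in practice several candidate (p, q) pairs are usually fitted and compared.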
Based on the above processing, an ARIMA(p, d, q) model is established for the warning-level alarm log proportion sequence $\{W_1,W_2,\dots,W_t\}$ of the module to be predicted:
$\varphi(B)\nabla^{d}W_t=\theta(B)f_t$
$\theta(B)=1-\theta_1B-\theta_2B^2-\dots-\theta_qB^q$
$\varphi(B)=1-\varphi_1B-\varphi_2B^2-\dots-\varphi_pB^p$
$\nabla^{d}=(1-B)^{d}$
wherein $W_t$ is the proportion sequence; $B$ is the delay operator; $\theta_1,\theta_2,\theta_3,\dots,\theta_q$ are the partial autocorrelation coefficients; $\varphi_1,\varphi_2,\dots,\varphi_p$ are the autocorrelation coefficients; and $f_t$ is the error.
In this way, the ARIMA main system analyzes the sequence of the server's module to be predicted and yields the predicted value attributable to the module's own regularity.
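A one-step prediction by the main system can be sketched as below, assuming the standard ARIMA recursion with the sign convention θ(B) = 1 − θ_1B − … stated in the text and a zero-mean differenced series. The function name `forecast_next`, the coefficient values and the residual history are hypothetical; how the coefficients are estimated is outside this sketch.

```python
def forecast_next(w, phi, theta, residuals, d=1):
    """One-step-ahead ARIMA(p, d, q) forecast of W_{T+1}.

    w         : observed proportion sequence W_1 .. W_T
    phi       : [phi_1 .. phi_p], autoregressive coefficients
    theta     : [theta_1 .. theta_q], moving-average coefficients
    residuals : past errors f_t, most recent last (at least q values)
    """
    # Build every differencing level: levels[k] is the k-th difference of w.
    levels = [list(w)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    y = levels[-1]

    # y_{T+1} = sum_i phi_i * y_{T+1-i} - sum_j theta_j * f_{T+1-j}
    ar = sum(p * y[-i] for i, p in enumerate(phi, start=1))
    ma = sum(t * residuals[-j] for j, t in enumerate(theta, start=1))
    nxt = ar - ma

    # Undo the differencing level by level to return to the proportion scale.
    for lvl in reversed(levels[:-1]):
        nxt = lvl[-1] + nxt
    return nxt

# Hypothetical coefficients, for illustration only.
pred = forecast_next([0.2, 0.3, 0.5], phi=[0.5], theta=[0.4], residuals=[0.05], d=1)
```

The returned value is the preliminary prediction for the T+1 period, before any correction from the factor modules is applied.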
In some embodiments, the NPM factor system may be used to obtain the state value of each other factor module.
Obtaining a plurality of influence factor sequences corresponding to each other module further comprises:
acquiring, for each other module, the sequence of abnormality counts per unit time $\{AN_1,AN_2,\dots,AN_n\}$, the sequence of module historical life values $\{MT_1,MT_2,\dots,MT_n\}$ and the sequence of interval durations between adjacent abnormal conditions $\{IT_1,IT_2,\dots,IT_n\}$;
wherein n is the total number of other modules.
Then, using the Non-Parametric model $Y=m(X_i)+\varepsilon$, the state value corresponding to each element in each influence factor sequence is calculated, where $m(\cdot)$ is the regression function, $\varepsilon$ is the error disturbance term and $X_i$ is an element of an influence factor sequence. Because a predicted value necessarily differs from the true value, the equation is written with $\varepsilon$ to mark that gap for rigor; $\varepsilon$ need not be considered when substituting data.
This gives the state value $Y_i$ corresponding to each element $X_i$ in each influence factor sequence of each other factor module.
Then, according to
[formula image: kernel-weighted estimate of the state value at x, expressed in terms of $X_i$, $X_j$, $Y_i$ and $K$]
the predicted state value is calculated when each influence factor takes the value x (x may be $AN_{n+1}$, $MT_{n+1}$ or $IT_{n+1}$), and
[formula image: state prediction model built from the three predicted state values]
is taken as the state prediction model of each other module; wherein $X_i$ and $X_j$ are respectively the i-th and j-th elements of the abnormality count sequence, the module historical life value sequence or the sequence of interval durations between adjacent abnormal conditions, $Y_i$ is the state value corresponding to $X_i$, $K$ is a second-order Gaussian kernel function, and the three terms of the model are the predicted state values obtained when the number of abnormalities per unit time takes the value $x_{an}$, when the module historical life takes the value $x_{mt}$, and when the interval duration between adjacent abnormal conditions takes the value $x_{it}$.
For example, for the first other factor module, x takes the values $AN_{n+1}$, $MT_{n+1}$ and $IT_{n+1}$ in turn to obtain the three corresponding state values
[formula image: the three predicted state values of the first other factor module]
with the error disturbance terms denoted $\varepsilon_{11}$, $\varepsilon_{12}$ and $\varepsilon_{13}$ respectively. As before, because a predicted value necessarily differs from the true value, the formulas carry these terms to mark the gap for rigor, but $\varepsilon_{11}$, $\varepsilon_{12}$ and $\varepsilon_{13}$ need not be considered when substituting data.
The three state values are then averaged to obtain the state prediction value of the first other factor module.
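The estimation and averaging just described can be sketched with a Nadaraya-Watson kernel regression, which is the usual estimator built from the symbols $X_i$, $X_j$, $Y_i$ and a Gaussian kernel $K$. Because the source formula is not reproduced in the text, the estimator form, the `bandwidth` parameter, the dictionary keys and the function names are all assumptions of this sketch rather than the patent's stated formula.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian density, a second-order kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_estimate(x, xs, ys, bandwidth=1.0):
    """Nadaraya-Watson estimate: sum_i K((x-X_i)/h) Y_i / sum_j K((x-X_j)/h).

    bandwidth (h) is an assumed smoothing parameter. If x lies very far from
    every X_i the weights can underflow to zero; this sketch does not guard
    against that case.
    """
    weights = [gaussian_kernel((x - xi) / bandwidth) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def module_state(next_values, history, states, bandwidth=1.0):
    """Average the three per-factor estimates into one state prediction.

    next_values: {"AN": x_an, "MT": x_mt, "IT": x_it}
    history    : factor name -> past factor values X_1 .. X_n
    states     : factor name -> matching state values Y_1 .. Y_n
    """
    estimates = [
        kernel_estimate(next_values[k], history[k], states[k], bandwidth)
        for k in ("AN", "MT", "IT")
    ]
    return sum(estimates) / len(estimates)

# Hypothetical data for one factor module.
history = {"AN": [1.0, 2.0, 3.0], "MT": [10.0, 20.0, 30.0], "IT": [5.0, 6.0, 7.0]}
states = {"AN": [1.0, 1.0, 1.0], "MT": [2.0, 2.0, 2.0], "IT": [3.0, 3.0, 3.0]}
state = module_state({"AN": 2.5, "MT": 25.0, "IT": 6.5}, history, states)
```

The bandwidth controls how local the weighting is; a smaller value follows nearby observations more closely at the cost of noisier estimates.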
In some embodiments, the ARIMA main system mainly analyzes the influence of the self rule of the module to be predicted of the server on future values, and predicts the proportion of the quantity of the warming level logs of the module to be predicted in the T +1 time period according to historical data; the SNP factor system mainly considers the influence of other factor modules of the server on the module to be predicted, selects three influence factors of the historical service life of the module, the abnormal times appearing in unit time and the interval duration of adjacent abnormal conditions, and establishes a state value of the factor module estimated by a Non-Parametric model and an averaging mode.
And constructing an SPM correction-combined prediction system based on the two model systems, further analyzing the residual error of the SPM correction-combined prediction system on the basis of the model output data of the ARIMA system, and correcting the prediction of the ratio of the ARIMA model to the prediction module warming log. Because the linear correlation relationship of the influence of each factor module of the server on the to-be-predicted module is uncertain, firstly, the correlation coefficient between each factor module and the output residual error of the ARIMA system is calculated, and the correlation coefficient between the state value of each factor module and the output residual error sequence (differential ratio sequence) of the ARIMA system is determined by a correlation calculation method. Taking the state value with the correlation larger than the threshold as a linear parameter, taking the state value with the correlation not larger than the threshold as a nonlinear parameter, further obtaining a linear parameters and b nonlinear parameters, and then establishing a Semi-parameter model of the factor module and the residual sequence, wherein the Semi-parameter model meets the following requirements:
$Y = U^{T}\beta + m(T) + \varepsilon$
where $U = (U_1, \dots, U_a)^{T}$, with $U_1, \dots, U_a$ denoting the a linear parameters; $T = (T_1, \dots, T_b)^{T}$, with $T_1, \dots, T_b$ denoting the b nonlinear parameters; $\beta$ is the coefficient; and $\varepsilon$ is the error disturbance term.
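The partition of state values into linear and nonlinear parameters can be sketched as below. This is only an illustration under stated assumptions: the source does not name the correlation measure, so the Pearson coefficient is assumed; comparing its absolute value with the threshold is also an assumption; and the threshold value 0.5 and all function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def split_parameters(states, residuals, threshold=0.5):
    """Split factor modules into linear (U) and nonlinear (T) parameter groups.

    states maps a module name to its state-value sequence; residuals is the
    ARIMA output residual sequence of the same length.
    """
    linear, nonlinear = [], []
    for name, values in states.items():
        # Assumption: strength of correlation is judged by absolute value.
        if abs(pearson(values, residuals)) > threshold:
            linear.append(name)
        else:
            nonlinear.append(name)
    return linear, nonlinear
```

The modules returned in `linear` would enter the model through $U^{T}\beta$, and those in `nonlinear` through $m(T)$.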
In some embodiments, the Local-Polynomial-Regression method may be applied to obtain the estimates of the non-parametric model and of β. Combining these with the ARIMA system's prediction of the warning-log proportion of the module to be predicted, and using the differenced warning-log proportion sequence of the module to be predicted
Figure BDA0003092383870000111
the error-corrected combined predicted value is obtained as follows and mapped to the corresponding module alarm level:
Figure BDA0003092383870000112
where
Figure BDA0003092383870000113
is the differenced proportion sequence; $\alpha_t, \alpha_{t-1}, \dots, \alpha_{t-p}$ are
Figure BDA0003092383870000114
and $\beta_{t-1}, \dots, \beta_{t-q}$ are $\theta_2 B^2, \dots, \theta_q B^q$.
The scheme provided by the invention is based on system logs and on three influence-factor data sets (the historical service life of each module, the number of anomalies occurring per unit time, and the interval duration between adjacent anomalies). It considers not only the influence of the own pattern of the module to be predicted but also the mutual influence among all modules of the server: an NPM factor system is added to the ARIMA main system, an SPM correction-combination prediction system is established, and the alarm log proportion of the module to be predicted is predicted. Thus, using the system logs collected by ISREST, and based on the influence on the module to be predicted of both its own pattern and the other factor modules, an ARIMA model and an SPM model are established to accurately predict the alarm log proportion of the module to be predicted in time period T+1. The corresponding alarm level is then obtained and provided to the server's operation and maintenance personnel, giving them timely and accurate early warning so that greater losses can be avoided.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides an alarm log proportion prediction system 400, as shown in fig. 4, including:
an acquisition module 401 configured to acquire the alarm logs generated by a module to be predicted in each of a plurality of unit times and the logs generated by all modules in each of the plurality of unit times, so as to obtain the proportion of alarm logs generated by the module to be predicted in each unit time;
a first model building module 402 configured to form the plurality of proportions into a proportion sequence and build an ARIMA model using the proportion sequence;
a second model building module 403 configured to acquire a plurality of influence factor sequences corresponding to each other module and determine a state prediction model of each other module according to the plurality of influence factor sequences; and
a prediction module 404 configured to predict, according to the state prediction model and the ARIMA model, the proportion of alarm logs generated by the module to be predicted in the next unit time.
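The computation attributed to the acquisition module can be sketched as follows. The record layout `(time_unit, module, level)`, the level string `"warning"`, and the function name are hypothetical; the source only states that the proportion is the module's alarm logs relative to the logs of all modules in each unit time.

```python
from collections import Counter


def alarm_proportions(records, target_module, alarm_level="warning"):
    """Return the per-unit-time proportion of the target module's alarm logs.

    records is an iterable of (time_unit, module, level) tuples. The result
    is ordered by time_unit and covers only units that contain logs.
    """
    total = Counter()   # all logs from all modules, per unit time
    alarms = Counter()  # alarm logs from the target module, per unit time
    for time_unit, module, level in records:
        total[time_unit] += 1
        if module == target_module and level == alarm_level:
            alarms[time_unit] += 1
    return [alarms[t] / total[t] for t in sorted(total)]
```

The returned list is the proportion sequence that the first model building module would then use to build the ARIMA model.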
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 5, an embodiment of the present invention further provides a computer apparatus 501, comprising:
at least one processor 520; and
a memory 510 storing a computer program 511 executable on the processor, wherein the processor 520, when executing the program, performs the steps of any of the alarm log proportion prediction methods described above.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 6, an embodiment of the present invention further provides a computer-readable storage medium 601, where the computer-readable storage medium 601 stores computer program instructions 610, and the computer program instructions 610, when executed by a processor, perform the steps of any one of the above alarm log proportion prediction methods.
Finally, it should be noted that, as will be understood by those skilled in the art, all or part of the processes of the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above.
Further, it should be appreciated that the computer-readable storage media (e.g., memory) herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The serial numbers of the embodiments disclosed herein are for description only and do not indicate the relative merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps of implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of the embodiments of the invention, technical features in the above embodiment or in different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (10)

1. A method for predicting the proportion of alarm logs, characterized by comprising the following steps:
acquiring alarm logs respectively generated by a module to be predicted in a plurality of unit times and acquiring logs respectively generated by all modules in a plurality of unit times to obtain the proportion of the alarm logs generated by the module to be predicted in each unit time;
forming a proportion sequence by the plurality of proportions and establishing an ARIMA model by utilizing the proportion sequence;
acquiring a plurality of influence factor sequences corresponding to each other module and determining a state prediction model of each other module according to the plurality of influence factor sequences;
and predicting the proportion of the alarm log generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model.
2. The method of claim 1, wherein forming the plurality of proportions into a proportion sequence and establishing an ARIMA model using the proportion sequence further comprises:
performing multi-order differencing on the proportion sequence until a stationary differenced proportion sequence is obtained, and recording the current order d;
evaluating the differenced proportion sequence with an ACF function to determine the parameter p and the autocorrelation coefficients of the ARIMA model, and evaluating the differenced proportion sequence with a PACF function to determine the parameter q and the partial autocorrelation coefficients of the ARIMA model;
and constructing the ARIMA model using the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients, and the proportion sequence.
3. The method of claim 2, wherein constructing the ARIMA model using the order d, the parameter p, the parameter q, the autocorrelation coefficients, the partial autocorrelation coefficients, and the proportion sequence further comprises constructing the ARIMA model according to:
Figure FDA0003092383860000011
$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \dots - \theta_q B^q$
Figure FDA0003092383860000012
Figure FDA0003092383860000027
wherein $W_t$ is the proportion sequence; $B$ is the delay operator; $\theta_1, \theta_2, \theta_3, \dots, \theta_q$ are the partial autocorrelation coefficients;
Figure FDA0003092383860000021
are the autocorrelation coefficients; and $f_t$ is the error.
4. The method of claim 3, wherein obtaining a plurality of sequences of impact factors for each of the other modules further comprises:
acquiring, for each other module, the sequence of the number of anomalies occurring per unit time $\{AN_1, AN_2, \dots, AN_n\}$, the module historical life value sequence $\{MT_1, MT_2, \dots, MT_n\}$, and the sequence of interval durations between adjacent anomalies $\{IT_1, IT_2, \dots, IT_n\}$;
wherein n is the total number of other modules.
5. The method of claim 4, wherein determining the state prediction model for each of the other modules based on the plurality of sequences of impact factors further comprises:
according to $Y = m(X_i) + \varepsilon$, calculating the state value corresponding to each element in each influence factor sequence, where $m(\cdot)$ is the regression function, $\varepsilon$ is the error disturbance term, and $X_i$ is an element of the influence factor sequence;
according to
Figure FDA0003092383860000022
calculating the state prediction value when each influence factor takes the value x, and taking
Figure FDA0003092383860000023
as the state prediction model corresponding to each other module; wherein $X_i$ and $X_j$ are respectively the i-th and j-th elements of the anomaly count sequence, the module historical life value sequence, or the sequence of interval durations between adjacent anomalies, $Y_i$ is the state value corresponding to $X_i$, $K$ is a second-order Gaussian kernel function,
Figure FDA0003092383860000024
is the state prediction value when the number of anomalies occurring per unit time takes the value $x_{an}$,
Figure FDA0003092383860000025
is the state prediction value when the module historical life value takes the value $x_{mt}$, and
Figure FDA0003092383860000026
is the state prediction value when the interval duration between adjacent anomalies takes the value $x_{it}$.
6. The method of claim 5, wherein predicting the proportion of alarm logs generated by the module to be predicted in the next unit time according to the state prediction model and the ARIMA model further comprises:
calculating the correlation between the state value of each other module predicted by the state prediction model and the differenced proportion sequence;
taking state values whose correlation is greater than a threshold as linear parameters, and state values whose correlation is not greater than the threshold as nonlinear parameters;
calculating a correction value according to $U^{T}\beta + m(T) + \varepsilon$, where $U = (U_1, \dots, U_a)^{T}$, with $U_1, \dots, U_a$ denoting the a linear parameters; $T = (T_1, \dots, T_b)^{T}$, with $T_1, \dots, T_b$ denoting the b nonlinear parameters; $\beta$ is the coefficient; and $\varepsilon$ is the error disturbance term.
7. The method of claim 6, further comprising predicting the proportion using:
Figure FDA0003092383860000031
where
Figure FDA0003092383860000032
is the differenced proportion sequence; $\alpha_t, \alpha_{t-1}, \dots, \alpha_{t-p}$ are
Figure FDA0003092383860000033
and $\beta_{t-1}, \dots, \beta_{t-q}$ are $\theta_2 B^2, \dots, \theta_q B^q$.
8. An alarm log proportion prediction system, comprising:
an acquisition module configured to acquire the alarm logs generated by a module to be predicted in each of a plurality of unit times and the logs generated by all modules in each of the plurality of unit times, so as to obtain the proportion of alarm logs generated by the module to be predicted in each unit time;
a first model building module configured to form the plurality of proportions into a proportion sequence and build an ARIMA model using the proportion sequence;
a second model building module configured to acquire a plurality of influence factor sequences corresponding to each other module and determine a state prediction model of each other module according to the plurality of influence factor sequences; and
a prediction module configured to predict, according to the state prediction model and the ARIMA model, the proportion of alarm logs generated by the module to be predicted in the next unit time.
9. A computer device, comprising:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor, when executing the program, performs the steps of the method according to any one of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1 to 7.
CN202110599920.XA 2021-05-31 2021-05-31 Alarm log proportion prediction method, system, device and medium Active CN113420422B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110599920.XA CN113420422B (en) 2021-05-31 2021-05-31 Alarm log proportion prediction method, system, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110599920.XA CN113420422B (en) 2021-05-31 2021-05-31 Alarm log proportion prediction method, system, device and medium

Publications (2)

Publication Number Publication Date
CN113420422A true CN113420422A (en) 2021-09-21
CN113420422B CN113420422B (en) 2023-04-07

Family

ID=77713291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110599920.XA Active CN113420422B (en) 2021-05-31 2021-05-31 Alarm log proportion prediction method, system, device and medium

Country Status (1)

Country Link
CN (1) CN113420422B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103116531A (en) * 2013-01-25 2013-05-22 浪潮(北京)电子信息产业有限公司 Storage system failure predicting method and storage system failure predicting device
CN108256898A (en) * 2017-12-26 2018-07-06 深圳索信达数据技术股份有限公司 A kind of product Method for Sales Forecast method, system and storage medium
US20190228353A1 (en) * 2018-01-19 2019-07-25 EMC IP Holding Company LLC Competition-based tool for anomaly detection of business process time series in it environments
CN110224865A (en) * 2019-05-30 2019-09-10 宝付网络科技(上海)有限公司 A kind of log warning system based on Stream Processing
CN110458374A (en) * 2019-08-23 2019-11-15 山东浪潮通软信息科技有限公司 A kind of business electrical maximum demand prediction technique based on ARIMA and SVM
CN110688069A (en) * 2019-09-20 2020-01-14 苏州浪潮智能科技有限公司 Service life prediction method, device and equipment of solid state disk and readable storage medium
CN110888788A (en) * 2019-10-16 2020-03-17 平安科技(深圳)有限公司 Anomaly detection method and device, computer equipment and storage medium
CN110907984A (en) * 2019-11-21 2020-03-24 中国地震局地震预测研究所 Method for detecting earthquake front infrared long-wave radiation abnormal information based on autoregressive moving average model
CN111008114A (en) * 2019-11-30 2020-04-14 北京浪潮数据技术有限公司 Disk partition monitoring method, device, equipment and readable storage medium
CN111314173A (en) * 2020-01-20 2020-06-19 腾讯科技(深圳)有限公司 Monitoring information abnormity positioning method and device, computer equipment and storage medium
CN111314115A (en) * 2020-01-19 2020-06-19 苏州浪潮智能科技有限公司 Alarm method, device and equipment based on IDL log and readable medium
US20210026725A1 (en) * 2019-07-15 2021-01-28 Bull Sas Method and device for determining an estimated time before a technical incident in a computing infrastructure from values of performance indicators

Also Published As

Publication number Publication date
CN113420422B (en) 2023-04-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant