CN113014412B - Method and system for predicting delay time of downtime fault service - Google Patents


Info

Publication number
CN113014412B
CN113014412B CN201911328918.8A CN201911328918A CN113014412B CN 113014412 B CN113014412 B CN 113014412B CN 201911328918 A CN201911328918 A CN 201911328918A CN 113014412 B CN113014412 B CN 113014412B
Authority
CN
China
Prior art keywords
service
downtime
predicted
log data
delay time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911328918.8A
Other languages
Chinese (zh)
Other versions
CN113014412A (en)
Inventor
王晓春
费菲
王斌
胡治西
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Shanxi Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Shanxi Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Shanxi Co Ltd
Priority to CN201911328918.8A
Publication of CN113014412A
Application granted
Publication of CN113014412B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method and a system for predicting the delay time of a service affected by a downtime fault. The method comprises the following steps: acquiring service log data generated by a service to be predicted during downtime; performing feature extraction on the service log data to obtain the service features of the service to be predicted; inputting the service features into a downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted; and calculating the predicted delay time of the service to be predicted from the predicted downtime influence factor. The invention uses the service log data to quantify the degree to which different services are affected by a downtime fault, predicts the service delay time, and evaluates the downtime impact, so that different fault handling methods can be chosen according to the service delay time. This avoids the resource waste and the degraded user experience caused by applying a uniform fault handling method.

Description

Method and system for predicting delay time of downtime fault service
Technical Field
The invention relates to the technical field of communication, in particular to a method and a system for predicting delay time of downtime fault service.
Background
At present, communication services are of many types and place heavy demands on resources. Under growing data access volumes and high concurrent access pressure, the support system gradually slows down, the probability of failure rises, and downtime faults can occur. Because the effect of such a fault on service delay is unknown, estimating the service completion time is crucial for choosing among different fault handling methods. If the delay caused by a downtime fault is long, there is enough time to handle the fault, for example by expanding system capacity or adjusting the resource configuration; if the delay is short, for example if the service is delayed by only 5 minutes, there is little time for fault handling and no human intervention is needed. Handling faults after a downtime is therefore very important, and handling them blindly can lead to unreasonable resource allocation, slow service response, and even loss of service continuity. The prior art applies a uniform fault handling method, and a downtime fault generates a large number of alarms, which wastes a large amount of resources.
Disclosure of Invention
In view of the above problems, the present invention is proposed to provide a method and a system for predicting the delay time of a downtime fault service that overcome or at least partially solve those problems.
According to one aspect of the present invention, a method for predicting delay time of a downtime fault service is provided, which includes the following steps:
acquiring service log data generated by a service to be predicted in a downtime process;
performing feature extraction on the service log data of the service to be predicted to obtain service features of the service to be predicted;
inputting the service characteristics of the service to be predicted into a downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted;
and calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted.
According to another aspect of the present invention, there is provided a system for predicting delay time of a downtime fault service, including:
the data acquisition module is used for acquiring service log data generated by the service to be predicted in the downtime process;
the service characteristic extraction module is used for extracting the characteristics of the service log data of the service to be predicted to obtain the service characteristics of the service to be predicted;
the downtime influence factor acquisition module is used for inputting the service characteristics of the service to be predicted into the downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted;
and the prediction module is used for calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted.
According to yet another aspect of the present invention, there is provided a computing device comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the method for predicting the delay time of the downtime fault service.
According to still another aspect of the present invention, there is provided a computer storage medium, where at least one executable instruction is stored, and the executable instruction causes a processor to execute operations corresponding to the method for predicting the delay time of the downtime fault service.
According to the method and the system for predicting the delay time of the downtime fault service, the method comprises the steps of obtaining service log data generated by a service to be predicted in the downtime process; performing feature extraction on service log data of a service to be predicted to obtain service features of the service to be predicted; inputting the service characteristics of the service to be predicted into a downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted; and calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted. The method comprises the steps of summarizing the degrees of different services affected by downtime faults according to service log data, predicting service delay time and evaluating the downtime influence, so that different downtime fault processing methods can be adopted according to the service delay time, and the problems of large amount of resource waste and influence on user use perception caused by adoption of a uniform fault processing method are solved.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 is a flowchart illustrating a method for predicting delay time of a downtime fault service according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a training process for a downtime influence prediction model provided by an embodiment of the invention;
Fig. 3 is a schematic structural diagram illustrating a system for predicting delay time of a downtime fault service according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a training apparatus for a downtime influence prediction model according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computing device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Fig. 1 shows a flowchart of an embodiment of a method for predicting delay time of downtime fault service according to the present invention, and as shown in fig. 1, the method includes the following steps:
S101: Acquire service log data generated by the service to be predicted during downtime.
In this step, the services to be predicted include at least: loading services, scheduling services, statistical analysis services, foreground application services, and the like. Correspondingly, service accounts can be divided into at least: loading service accounts, scheduling service accounts, statistical analysis service accounts, foreground application service accounts, and the like. Service log data carrying different service labels is generated for the different service categories; the service log data is stored in a distributed database and is retrieved from the distributed database according to the type of the service to be predicted.
S102: Perform feature extraction on the service log data of the service to be predicted to obtain the service features of the service to be predicted.
In this step, the service features include: process class features, priority class features, resource restriction class features, computing capability features, interaction capability class features, and network class features.
Wherein:
the process class features include: the amount of memory that the service to be predicted can preempt from other processes in the system resource pool, the upper limit of preemptible memory, the concurrency of any single query thread, the ordering of resource requests in the default pool, and the like, where the resource requests include directly initiated acquisition requests and memory preemption requests;
the priority class features include: the priority of a query in the running state, the execution time at high priority, the timeout interval, the parallelism of the resource pool, the maximum concurrency on the resource pool, the maximum duration allowed for a specified query, and the like, where the maximum concurrency on the resource pool is the maximum number of concurrent connections controlled from the connection layer;
the resource restriction class features include: the number of central processing units (CPUs) locked by the resource pool, the number of CPUs set in exclusive mode or shared mode in the resource pool, service concurrency, minimum bandwidth, bandwidth limit, dial-in access availability, maximum downtime per day, and the like;
the computing capability features include: the CPU utilization (i.e., the percentage obtained by dividing the size of the CPU by the data communication flow rate), the number of CPUs, the CPU redundancy coefficient, the memory configuration, the total input/output (IO) volume per unit time, the total traffic per unit time, and the like;
the interaction capability class features include: single transaction processing time, rated concurrency, IO capability redundancy coefficient, allocated bandwidth, single request processing duration, CPU processing capability consumed by a single request, database read proportion, database write proportion, and the like;
the network class features include: actual used bandwidth, network bandwidth utilization, network bandwidth redundancy factor, and the like.
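As an illustration of how the six feature classes might be assembled into a single numeric vector for the prediction model, the following is a minimal Python sketch. The class name `ServiceFeatures`, the field names, the choice of one or two representative fields per class, and the flattening order are all assumptions made for illustration; the source does not specify a data layout.

```python
from dataclasses import dataclass, astuple


# Hypothetical container: a few representative fields per feature class.
# Field names and their selection are illustrative, not taken from the patent.
@dataclass(frozen=True)
class ServiceFeatures:
    preempt_memory_limit_mb: float   # process class
    query_priority: float            # priority class
    timeout_interval_s: float        # priority class
    locked_cpu_count: float          # resource restriction class
    cpu_utilization: float           # computing capability class
    single_request_time_ms: float    # interaction capability class
    bandwidth_utilization: float     # network class

    def to_vector(self) -> list[float]:
        """Flatten the features into a fixed-order numeric vector."""
        return [float(v) for v in astuple(self)]
```

A fixed field order matters here because the same ordering would have to be used for both training samples and services to be predicted.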
S103: Input the service features of the service to be predicted into the downtime influence prediction model to obtain the predicted downtime influence factor corresponding to the service to be predicted.
Specifically, fig. 2 is a flowchart of a training process of the downtime influence prediction model, and as shown in fig. 2, the training steps of the downtime influence prediction model are further explained through steps S1001 to S1005:
S1001: Collect service log data generated by each service during the downtime test.
Specifically, the service log data generated during the downtime test is stored in a distributed database.
S1002: Preprocess the service log data of each service, and store the preprocessed service log data of each service in the distributed database.
In this step, to keep abnormal data from affecting the predicted value, the service log data collected in step S1001 is preprocessed by: removing abnormal service log data, such as null data produced by maintenance or failure; and completing missing service log data.
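The two preprocessing operations can be sketched as follows. This is a minimal Python illustration under stated assumptions: a record is treated as abnormal when every value in it is null, and missing values are completed with the per-field mean. Both rules, and the function name `preprocess`, are hypothetical choices; the source does not say how abnormal data is detected or how gaps are filled.

```python
from statistics import mean
from typing import Optional

Record = dict[str, Optional[float]]


def preprocess(records: list[Record]) -> list[dict[str, float]]:
    """Drop all-null records, then fill remaining gaps with per-field means."""
    # Step 1: remove abnormal records (assumed here: every value is null).
    kept = [r for r in records if any(v is not None for v in r.values())]
    # Step 2: compute the mean of the observed values for each field.
    fields = {k for r in kept for k in r}
    means = {}
    for k in fields:
        observed = [r[k] for r in kept if r.get(k) is not None]
        means[k] = mean(observed) if observed else 0.0
    # Step 3: complete missing values with the field mean (assumed rule).
    return [{k: (r[k] if r.get(k) is not None else means[k]) for k in fields}
            for r in kept]
```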
S1003: Analyze the service log data of each service to obtain the target downtime influence factor corresponding to each service.
Step S1003 further includes: analyzing the service log data of each service to determine a first performance index value of the service in a normal operation state and a second performance index value of the service in a downtime state; and calculating a target downtime influence factor corresponding to the service by using the first performance index value and the second performance index value.
Specifically, the calculation formula of the target downtime influence factor is as in formula (1):
[Formula (1): equation image not reproduced in this text]
where a is the target downtime influence factor, xi denotes a performance index of a given service, t_xi is the first performance index value of performance index xi of the service in the normal operation state, and y_xi is the second performance index value of performance index xi of the service in the downtime state.
The value range of the target downtime influence factor is [-1, 1]: a positive value indicates improved performance, a negative value indicates reduced performance, and zero indicates unchanged performance. In general, a service whose performance improves or stays unchanged under a downtime fault needs no intervention, whereas for a service whose performance is reduced, the delay time is then predicted and handled accordingly.
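The image of formula (1) is not available in this text. The sketch below therefore assumes the form a = (t_xi - y_xi) / t_xi, which is inferred from the worked example given for step S104 (t_xi = 100, a = -0.2, y_xi = 120) and may differ from the original formula. The clipping to [-1, 1] and the function name `target_downtime_factor` are likewise assumptions.

```python
def target_downtime_factor(t_normal: float, y_down: float) -> float:
    """Assumed form of formula (1): a = (t - y) / t, clipped to [-1, 1].

    t_normal: first performance index value (normal operation), e.g. running time.
    y_down:   second performance index value (downtime state).
    """
    if t_normal <= 0:
        raise ValueError("t_normal must be positive")
    a = (t_normal - y_down) / t_normal
    # Clipping is an assumption that enforces the stated value range.
    return max(-1.0, min(1.0, a))
```

Under this assumed form, the sign convention (negative means reduced performance) holds for indices where a smaller value is better, such as running time.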
S1004: Perform feature extraction on the service log data of each service to obtain the service features of each service.
Generally, the collected service log data of each service is voluminous and complex, so feature extraction must be performed on it. The extracted service features include: process class features, priority class features, resource restriction class features, computing capability features, interaction capability class features, and network class features.
S1005: Train the downtime influence prediction model according to the service features of each service and the target downtime influence factor corresponding to each service.
Specifically, step S1005 further includes: the service characteristics of each service and the target downtime influence factors corresponding to each service are used as training samples; and inputting the training samples into a support vector regression model for training, and taking the trained support vector regression model as a downtime influence prediction model.
The function of the downtime influence prediction model is to establish a mapping between the service features of a service and the target downtime influence factor corresponding to that service, so that the downtime influence can be predicted from new service features. Specifically, a regression algorithm is used to establish the mapping between the service features and the target downtime influence factor, which can be written as formula (2):
g=f(x1,x2,x3,…,xn) (2)
wherein, x1, x2, x3, …, xn correspond to the service characteristics, and g is the target downtime influence factor.
The target downtime influence factor takes values in [-1, 1], so predicting it is a regression problem.
The support vector machine (SVM) is an algorithm for classification; support vectors can also be used for regression, in which case the method is called support vector regression (SVR). For a general regression problem, given training samples D = {(x1, y1), (x2, y2), …, (xn, yn)}, yi ∈ R, the aim is to learn a function f(x) that is as close as possible to y. In a conventional regression model, the loss is zero only if f(x) and y are exactly equal. Support vector regression instead tolerates a deviation of at most ε between f(x) and y, and a loss is counted only when the absolute difference between f(x) and y exceeds ε. This is equivalent to constructing a band of width 2ε centered on f(x); a training sample that falls inside this band is considered correctly predicted.
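The tolerance band described above corresponds to what is commonly called the ε-insensitive loss. A minimal Python sketch follows; the function name and argument names are chosen here for illustration.

```python
def epsilon_insensitive_loss(y_true: float, y_pred: float, eps: float) -> float:
    """Zero inside the band |y - f(x)| <= eps, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)
```

For instance, with ε = 0.1 a prediction of 0.55 for a true value of 0.5 incurs no loss, while a prediction of 0.8 incurs a loss of about 0.2.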
Given training data {(x_1, y_1), …, (x_l, y_l)} ⊂ X × R, where X denotes the input space (X = R^d), the training goal of the ε-support vector regression model is to find a function f(x) such that the predicted values for the training data deviate from the true values y by no more than ε. For linear regression, f(x) = <w, x> + b, with w ∈ X and b ∈ R, and this function is fitted to (x_i, y_i), i = 1, 2, …, l. The ε-support vector regression model is as follows:
[Formula (3): equation images not reproduced in this text]
where ξ_i and ξ_i* are slack variables, which are zero if there is no error, and the constant C represents the degree of penalty for samples whose error exceeds ε.
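For reference, the widely used textbook form of the ε-support vector regression primal problem, written with the symbols defined above (w, b, ξ_i, ξ_i*, C, ε), is given below. It is offered as the standard formulation only; the equation images of the source are not available, so the notation there may differ.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad
  & \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{l}\left(\xi_i + \xi_i^{*}\right) \\
\text{s.t.} \quad
  & y_i - \langle w, x_i\rangle - b \le \varepsilon + \xi_i, \\
  & \langle w, x_i\rangle + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, 2, \dots, l.
\end{aligned}
```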
Solving for f(x) in the ε-support vector regression model is a constrained optimization problem, and the primal problem is usually converted into its dual problem by means of Lagrangian duality, as shown in formula (4) and formula (5):
[Formulas (4) and (5): equation images not reproduced in this text]
where α^(*) = (α_1, α_1*, α_2, α_2*, …, α_l, α_l*). The linear regression function is expressed as formula (6):
[Formula (6): equation image not reproduced in this text]
It can be seen that the solution to the ε-support vector regression model can be represented by the inner product over the input space and the objective function of the training samples.
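For reference, the standard textbook dual of the ε-support vector regression problem and the resulting regression function, in terms of the multipliers α_i and α_i*, can be written as below. This is the common formulation, stated under the assumption that the source follows it; the original equation images are not available for comparison.

```latex
\begin{aligned}
\max_{\alpha,\,\alpha^{*}} \quad
  & -\frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i-\alpha_i^{*})(\alpha_j-\alpha_j^{*})\langle x_i, x_j\rangle
    - \varepsilon\sum_{i=1}^{l}(\alpha_i+\alpha_i^{*})
    + \sum_{i=1}^{l} y_i(\alpha_i-\alpha_i^{*}) \\
\text{s.t.} \quad
  & \sum_{i=1}^{l}(\alpha_i-\alpha_i^{*}) = 0, \qquad \alpha_i,\ \alpha_i^{*} \in [0, C], \\
\text{giving} \quad
  & f(x) = \sum_{i=1}^{l}(\alpha_i-\alpha_i^{*})\langle x_i, x\rangle + b .
\end{aligned}
```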
For the service log data, {x_i} and {y_i} of the training data (1 ≤ i ≤ l) represent the service features and the target downtime influence factors of the services, respectively. Through the training process, a mapping function f(x) is established between the service features and the target downtime influence factors, so that the predicted downtime influence factor of a service to be predicted can be obtained from its service features by applying this mapping function. The trained support vector regression model thus serves as the downtime influence prediction model.
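To make the training step concrete, the following is a minimal pure-Python sketch of a linear ε-support vector regression fit. It does not solve the dual problem; as a swapped-in simplification, it minimizes the equivalent unconstrained primal objective by subgradient descent. The function names `train_linear_svr` and `predict` and all hyperparameter values are hypothetical, and a production system would more likely use an established SVR library.

```python
import math


def train_linear_svr(X, y, C=1.0, eps=0.05, lr0=0.05, steps=2000):
    """Fit f(x) = <w, x> + b by subgradient descent on the eps-SVR primal:
    0.5 * ||w||^2 + C * sum(max(0, |y_i - f(x_i)| - eps))."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for t in range(steps):
        grad_w, grad_b = list(w), 0.0          # gradient of the 0.5*||w||^2 term
        for xi, yi in zip(X, y):
            r = yi - (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if abs(r) > eps:                   # sample lies outside the eps-band
                s = -1.0 if r > 0 else 1.0     # d(loss)/d(f) for this sample
                grad_w = [g + C * s * xj for g, xj in zip(grad_w, xi)]
                grad_b += C * s
        lr = lr0 / math.sqrt(1.0 + t)          # diminishing step size
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Evaluate the fitted linear function f(x) = <w, x> + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b
```

Here each row of `X` would be a service feature vector and each entry of `y` the corresponding target downtime influence factor; `predict` then yields the predicted factor for a new service.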
To sum up, after the downtime influence prediction model is obtained in steps S1001 to S1005, the service characteristics of the service to be predicted are input into the downtime influence prediction model, and the predicted downtime influence factor corresponding to the service to be predicted can be obtained according to the mapping relationship between the service characteristics and the target downtime influence factor.
S104: Calculate the predicted delay time corresponding to the service to be predicted according to the predicted downtime influence factor corresponding to the service to be predicted.
For the predicted downtime influence factor, a positive value indicates improved performance, a negative value indicates reduced performance, and zero indicates unchanged performance. In general, a service whose performance improves or stays unchanged under a downtime fault needs no intervention, whereas for a service whose performance is reduced, the delay time is then predicted and handled accordingly.
For example, suppose the service to be predicted is a loading service whose required running time in the normal operation state is 100 minutes, and the predicted downtime influence factor for this service is -0.2. Substituting t_xi = 100 and a = -0.2 into formula (1) gives y_xi = 120 for the loading service; that is, the running time of the service in the downtime state is 120 minutes. The predicted delay time of the loading service under the downtime fault is therefore 20 minutes (the running time in the downtime state minus the running time required in the normal operation state). If the loading service is for an important report, a corresponding fault handling method, such as human intervention, can be adopted according to the predicted delay time.
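The arithmetic of this example can be sketched in Python as follows. The sketch assumes that formula (1) has the form a = (t_xi - y_xi) / t_xi, which reproduces the numbers above (100 minutes, -0.2, 120 minutes) but is an inference rather than the confirmed formula; the function name `predicted_delay` is hypothetical.

```python
def predicted_delay(t_normal: float, factor: float) -> float:
    """Invert the assumed formula a = (t - y) / t to get y, then delay = y - t."""
    # Predicted running time in the downtime state under the assumed formula.
    y_down = t_normal * (1.0 - factor)
    return y_down - t_normal
```

With `t_normal=100` and `factor=-0.2`, the function returns approximately 20, matching the 20-minute delay in the example.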
By adopting the method of the embodiment, the service log data generated by the service to be predicted in the downtime process is acquired; performing feature extraction on service log data of a service to be predicted to obtain service features of the service to be predicted; inputting the service characteristics of the service to be predicted into the downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted; and calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted. The method comprises the steps of summarizing the degrees of different services affected by downtime faults according to service log data, predicting service delay time and evaluating the downtime influence, so that different downtime fault processing methods can be adopted according to the service delay time, and the problems of large amount of resource waste and influence on user use perception caused by adoption of a uniform fault processing method are solved.
Example two
Fig. 3 is a schematic structural diagram illustrating an embodiment of the system for predicting delay time of a downtime fault service according to the present invention. As shown in Fig. 3, the system includes: a data acquisition module 201, a service feature extraction module 202, a downtime influence factor acquisition module 203, and a prediction module 204.
The data obtaining module 201 is configured to obtain service log data generated by a service to be predicted in a downtime process.
The service feature extraction module 202 is configured to perform feature extraction on the service log data of the service to be predicted, so as to obtain a service feature of the service to be predicted.
Specifically, the service features include: process class features, priority class features, resource restriction class features, computing capability features, interaction capability class features, and network class features.
The downtime influence factor obtaining module 203 is configured to input the service characteristics of the service to be predicted into the downtime influence prediction model, so as to obtain a predicted downtime influence factor corresponding to the service to be predicted.
Fig. 4 is a schematic structural diagram of a training apparatus for the downtime influence prediction model. As shown in Fig. 4, the training apparatus includes:
The service log data acquisition submodule 2001 is configured to collect service log data generated by each service during the downtime test.
The service log data processing submodule 2002 is configured to preprocess the service log data of each service, and store the preprocessed service log data of each service in the distributed database.
The target downtime factor acquisition submodule 2003 is configured to analyze the service log data of each service to obtain a target downtime influence factor corresponding to each service.
The target downtime factor acquisition submodule 2003 is further configured to: analyze the service log data of each service to determine a first performance index value of the service in a normal operation state and a second performance index value of the service in a downtime state; and calculate the target downtime influence factor corresponding to the service by using the first performance index value and the second performance index value.
The service feature extraction submodule 2004 is configured to perform feature extraction on the service log data of each service to obtain the service features of each service.
The downtime influence prediction model training submodule 2005 is configured to train the downtime influence prediction model according to the service features of each service and the target downtime influence factor corresponding to each service.
Specifically, the downtime impact prediction model training submodule 2005 is further configured to: the service characteristics of each service and the target downtime influence factors corresponding to each service are used as training samples; and inputting the training samples into a support vector regression model for training, and taking the trained support vector regression model as a downtime influence prediction model.
The prediction module 204 is configured to calculate a prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted.
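The way the four modules hand data to one another can be sketched as a small Python class. This is an illustrative wiring only: the class name `DelayPredictionSystem`, the method `run`, and the callable signatures are assumptions, and each stage is injected as a plain function so that the actual implementations remain unspecified.

```python
from typing import Callable, Sequence


class DelayPredictionSystem:
    """Chains the four modules; each stage is injected as a plain callable."""

    def __init__(self,
                 acquire_logs: Callable[[str], Sequence[dict]],
                 extract_features: Callable[[Sequence[dict]], list],
                 predict_factor: Callable[[list], float],
                 compute_delay: Callable[[float, float], float]):
        self.acquire_logs = acquire_logs            # data acquisition module 201
        self.extract_features = extract_features    # service feature extraction module 202
        self.predict_factor = predict_factor        # downtime influence factor module 203
        self.compute_delay = compute_delay          # prediction module 204

    def run(self, service_id: str, normal_runtime: float) -> float:
        """Return the predicted delay time for one service."""
        logs = self.acquire_logs(service_id)
        features = self.extract_features(logs)
        factor = self.predict_factor(features)
        return self.compute_delay(normal_runtime, factor)
```

Injecting the stages keeps the sketch independent of any particular database, feature set, or trained model.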
With the system of this embodiment, the service log data generated by the service to be predicted during downtime is acquired; feature extraction is performed on the service log data of the service to be predicted to obtain the service features of the service to be predicted; the service features of the service to be predicted are input into the downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted; and the prediction delay time corresponding to the service to be predicted is calculated according to the predicted downtime influence factor. In this way, the degree to which different services are affected by downtime faults is summarized from the service log data, the service delay time is predicted, and the downtime impact is evaluated, so that different downtime fault handling strategies can be adopted according to the service delay time. This avoids the waste of resources and the degraded user experience that result from applying a uniform fault handling scheme to all services.
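As an illustration of adopting different fault handling strategies according to the service delay time, the following sketch maps a predicted delay time to a strategy. The thresholds, the strategy names, and the assumption that a longer predicted delay calls for a more urgent response are all hypothetical and are not taken from this description.

```python
def choose_fault_strategy(predicted_delay_s, high_s=300.0, low_s=30.0):
    """Pick a handling strategy from the predicted delay time in seconds.

    Thresholds and strategy names are illustrative assumptions.
    """
    if predicted_delay_s < 0:
        raise ValueError("predicted_delay_s must be non-negative")
    if predicted_delay_s >= high_s:
        return "immediate_failover"   # heavily affected service
    if predicted_delay_s >= low_s:
        return "prioritized_repair"   # moderately affected service
    return "routine_repair"           # lightly affected service
```

A service with a predicted delay of 600 seconds would be routed to the most urgent strategy under these assumed thresholds, while a service delayed by only a few seconds would wait for routine repair.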
Example three
The embodiment of the invention provides a non-volatile computer storage medium in which at least one executable instruction is stored, and the executable instruction can execute the method for predicting the delay time of the downtime fault service in any of the above method embodiments.
The executable instructions may be specifically configured to cause the processor to:
acquiring service log data generated by a service to be predicted in a downtime process;
performing feature extraction on service log data of a service to be predicted to obtain service features of the service to be predicted;
inputting the service characteristics of the service to be predicted into a downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted;
and calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted.
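The four operations above can be sketched end to end as follows. The log line format, the two extracted features, and the final mapping from the predicted downtime influence factor to a delay time (factor multiplied by the downtime duration) are assumptions made only for illustration; this section does not define them. The `model` argument stands for any trained downtime influence prediction model exposed as a callable.

```python
def extract_features(log_lines):
    """Return (mean latency in seconds, error ratio) from hypothetical
    log lines of the form 'latency_ms=<number> status=<ok|error>'."""
    latencies, errors = [], 0
    for line in log_lines:
        fields = dict(part.split("=", 1) for part in line.split())
        latencies.append(float(fields["latency_ms"]) / 1000.0)
        if fields["status"] == "error":
            errors += 1
    if not latencies:
        raise ValueError("no log lines to extract features from")
    return (sum(latencies) / len(latencies), errors / len(latencies))


def predict_delay_time(log_lines, model, downtime_duration_s):
    """Run the pipeline: features -> influence factor -> delay time."""
    features = extract_features(log_lines)
    # Clamp the model output so the factor stays a fraction in [0, 1].
    factor = min(max(model(features), 0.0), 1.0)
    # Assumed mapping: delay grows linearly with the influence factor.
    return factor * downtime_duration_s


# Usage with a stand-in model; a trained regression model would go here.
logs = ["latency_ms=200 status=ok", "latency_ms=400 status=error"]
stub_model = lambda f: 0.5 * f[1] + 0.25
delay_s = predict_delay_time(logs, stub_model, downtime_duration_s=120.0)
```

Keeping the model behind a plain callable means the feature extraction and delay calculation steps do not depend on how the prediction model was trained.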
Example four
Fig. 5 is a schematic structural diagram of an embodiment of a computing device according to the present invention. The specific embodiments of the present invention do not limit the specific implementation of the computing device.
As shown in fig. 5, the computing device may include: a processor, a communication interface, a memory, and a communication bus.
The processor, the communication interface, and the memory communicate with one another via the communication bus. The communication interface is used for communicating with network elements of other devices, such as clients or other servers. The processor is configured to execute a program, and may specifically execute the relevant steps in the above embodiments of the method for predicting the downtime fault service delay time.
In particular, the program may include program code comprising computer operating instructions.
The processor may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory is used for storing the program. The memory may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program may specifically be adapted to cause a processor to perform the following operations:
acquiring service log data generated by a service to be predicted in a downtime process;
performing feature extraction on service log data of a service to be predicted to obtain service features of the service to be predicted;
inputting the service characteristics of the service to be predicted into a downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted;
and calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on a computer readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.

Claims (8)

1. A method for predicting delay time of downtime fault service is characterized by comprising the following steps:
acquiring service log data generated by a service to be predicted in a downtime process;
performing feature extraction on the service log data of the service to be predicted to obtain service features of the service to be predicted;
inputting the service characteristics of the service to be predicted into a downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted;
and calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted.
2. The method of claim 1, wherein the step of training the downtime influence prediction model comprises:
collecting service log data generated by each service in the downtime test process;
analyzing the service log data of each service to obtain a target downtime influence factor corresponding to each service;
extracting the characteristics of the service log data of each service to obtain the service characteristics of each service;
and training to obtain a downtime influence prediction model according to the service characteristics of each service and the target downtime influence factor corresponding to each service.
3. The method according to claim 2, wherein the analyzing the service log data of each service to obtain the target downtime influence factor corresponding to each service further comprises:
analyzing the service log data of each service to determine a first performance index value of the service in a normal operation state and a second performance index value of the service in a downtime state;
and calculating a target downtime influence factor corresponding to the service by using the first performance index value and the second performance index value.
4. The method according to claim 2, wherein the training of the downtime influence prediction model according to the service characteristics of each service and the target downtime influence factor corresponding to each service further comprises:
the service characteristics of each service and the target downtime influence factors corresponding to each service are used as training samples;
and inputting the training samples into a support vector regression model for training, and taking the trained support vector regression model as a downtime influence prediction model.
5. The method according to claim 2, wherein after the collecting the service log data generated during the downtime test for each service, the method further comprises:
and preprocessing the service log data of each service, and storing the preprocessed service log data of each service into a distributed database.
6. A system for predicting delay time of downtime fault service, comprising:
the data acquisition module is used for acquiring service log data generated by the service to be predicted in the downtime process;
the service characteristic extraction module is used for extracting the characteristics of the service log data of the service to be predicted to obtain the service characteristics of the service to be predicted;
the downtime influence factor acquisition module is used for inputting the service characteristics of the service to be predicted into the downtime influence prediction model to obtain a predicted downtime influence factor corresponding to the service to be predicted;
and the prediction module is used for calculating the prediction delay time corresponding to the service to be predicted according to the prediction downtime influence factor corresponding to the service to be predicted.
7. A computing device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the method for predicting the downtime fault service delay time according to any one of claims 1 to 5.
8. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the method for predicting delay time of downtime fault service according to any one of claims 1-5.
CN201911328918.8A 2019-12-20 2019-12-20 Method and system for predicting delay time of downtime fault service Active CN113014412B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911328918.8A CN113014412B (en) 2019-12-20 2019-12-20 Method and system for predicting delay time of downtime fault service

Publications (2)

Publication Number Publication Date
CN113014412A CN113014412A (en) 2021-06-22
CN113014412B true CN113014412B (en) 2022-11-29

Family

ID=76381921

Country Status (1)

Country Link
CN (1) CN113014412B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant