CN110543462A - Microservice reliability prediction method, prediction device, electronic device, and storage medium - Google Patents

Info

Publication number
CN110543462A
CN110543462A CN201910833987.8A CN201910833987A CN110543462A CN 110543462 A CN110543462 A CN 110543462A CN 201910833987 A CN201910833987 A CN 201910833987A CN 110543462 A CN110543462 A CN 110543462A
Authority
CN
China
Prior art keywords
fault
obtaining
matrix
faults
log data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910833987.8A
Other languages
Chinese (zh)
Inventor
程末
蒲文斌
王博
梁忠
邓万宇
孙严
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Silk Road Yunqi Intelligent Technology Co Ltd
Original Assignee
Shaanxi Silk Road Yunqi Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaanxi Silk Road Yunqi Intelligent Technology Co Ltd
Priority to CN201910833987.8A
Publication of CN110543462A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Fuzzy Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a microservice reliability prediction method comprising the following steps: acquiring first distributed log data from each microservice; obtaining a fault category sequence and the N faults of a prediction stage according to the first distributed log data; processing the fault category sequence using the one-step transition probability of a Markov chain to obtain a fault category transition matrix; obtaining a fault category transition probability matrix from the fault category transition matrix; and obtaining the absolute probability vectors of the N faults in the prediction stage according to the fault category transition probability matrix of each microservice. Taking the distributed log data of a microservice system as input, the invention classifies the faults of each microservice and obtains the category transition probability matrix of each microservice, thereby obtaining the reliability of each microservice, and uses the reliability of each microservice to predict and analyze the reliability of the combined microservice.

Description

Microservice reliability prediction method, prediction device, electronic device, and storage medium
Technical Field
The invention belongs to the technical field of micro services, and particularly relates to a micro service reliability prediction method, a prediction device, electronic equipment and a storage medium.
Background
With the development of the internet, the information interaction amount among various service systems is increasing, and in order to ensure high availability and high performance, a micro-service cluster designed based on a micro-service architecture appears. In such a micro-service cluster, one request from a client may call multiple micro-services to finally implement a service.
Current technical schemes for microservice reliability prediction and analysis mainly include the following: 1. Group-testing basic services that provide the same function, calculating the reliability of the web service in real time, and selecting the basic services that participate in execution by majority voting. This approach requires a web service development environment with strong computing power to test and select services, so its integration and operating costs are high. 2. Service reliability analysis based on structural information and run logs, in which the model analyzes both the static structural information of the service and the dynamic information of service execution. 3. Clustering similar services or similar service users, and predicting the reliability of a new service or new user from the reliability of the similar services or users. 4. Predicting service reliability by data mining, in which the K-means method is applied to service invocation data to predict the reliability of atomic services.
Traditional reliability analysis methods target monolithic architectures or multi-instance SOA deployments. They cannot accurately analyze the reliability of decentralized microservices, and they cannot effectively analyze reliability when microservices are accessed by high-traffic, highly concurrent users.
Disclosure of Invention
To solve the above problems in the prior art, the present invention provides a microservice reliability prediction method, a prediction device, an electronic device and a storage medium. The technical problem to be solved by the invention is addressed by the following technical scheme:
A microservice reliability prediction method comprises the following steps:
Acquiring first distributed log data in each microservice;
Obtaining a fault type sequence and N faults in a prediction stage according to the first distributed log data;
Processing the fault category sequence by utilizing the one-step transition probability of the Markov chain to obtain a fault category transition matrix;
Obtaining the fault category transfer probability matrix according to the fault category transfer matrix;
And obtaining the absolute probability vectors of the N faults in the prediction stage according to the fault category transition probability matrix of each micro service.
In an embodiment of the present invention, obtaining the fault category sequence and the N faults in the prediction phase according to the first distributed log data includes:
Filling in missing data, smoothing noisy data and eliminating inconsistent data in the first distributed log data to obtain second distributed log data;
Obtaining the fault category sequence and the N faults in the prediction phase according to the second distributed log data.
In an embodiment of the present invention, obtaining the fault category sequence and the N faults in the prediction phase according to the second distributed log data includes:
Obtaining the fault type sequence and M faults corresponding to the fault type sequence according to the fault type of the second distributed log data;
And obtaining N faults in the prediction stage by utilizing the M faults based on a Markov chain.
In an embodiment of the present invention, processing the fault category sequence by using the one-step transition probability of the Markov chain to obtain a fault category transition matrix includes:
Counting the number of times each fault transitions to another fault according to the fault category sequence;
Obtaining the fault category transition matrix according to the number of times each fault transitions to another fault.
In an embodiment of the present invention, obtaining the fault category transition probability matrix according to the fault category transition matrix includes:
Processing the fault category transition matrix by using a transition probability matrix model to obtain the fault category transition probability matrix.
In an embodiment of the present invention, obtaining the absolute probability vectors of the N faults in the prediction phase according to the fault category transition probability matrix of each microservice includes:
Obtaining an initial probability vector of the prediction stage according to the absolute probability vector of the last fault corresponding to the first distributed log data;
And obtaining the absolute probability vectors of the N faults in the prediction stage according to the initial probability vector in the prediction stage and the fault type transition probability matrix.
In an embodiment of the present invention, after obtaining the absolute probability vectors of the N faults in the prediction phase, the method further includes:
And weighting the fault category transfer probability matrix of each micro service to obtain the fault category transfer probability of the combined micro service.
An embodiment of the present invention further provides a prediction apparatus, including:
The acquisition module is used for acquiring first distributed log data in each microservice;
The processing module is used for obtaining a fault type sequence and N faults in a prediction stage according to the first distributed log data;
The transfer matrix determining module is used for processing the fault category sequence by utilizing the one-step transfer probability of the Markov chain to obtain a fault category transfer matrix;
The transition probability matrix determining module is used for obtaining the fault category transition probability matrix according to the fault category transition matrix;
And the prediction module is used for obtaining the absolute probability vectors of the N faults in the prediction stage according to the fault category transfer probability matrix of each micro service.
An embodiment of the present invention further provides an electronic device, including a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;
A memory, configured to store a computer program;
A processor, configured to implement the steps of the microservice reliability prediction method according to any of the above embodiments when executing the computer program.
An embodiment of the present invention further provides a storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the micro-service reliability prediction method steps described in any of the above embodiments.
The invention has the following beneficial effects:
The microservice prediction method provided by the invention is based on a Markov chain. Taking the distributed log data of a microservice system as input, the method classifies the faults of each microservice and obtains the category transition probability matrix of each microservice from the classification result, thereby obtaining the reliability of each microservice. The reliability of each microservice can therefore be analyzed accurately, and the method is also suitable for reliability analysis of microservices accessed under high traffic and high concurrency.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
FIG. 1 is a schematic flowchart of a method for predicting reliability of a microservice according to an embodiment of the present invention;
FIG. 2 is a block diagram of a microservice reliability prediction framework provided by an embodiment of the present invention;
FIG. 3 is a flow chart illustrating another method for predicting microservice reliability according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a prediction apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Example One
Referring to fig. 1, fig. 1 is a schematic flow chart of a method for predicting micro-service reliability according to an embodiment of the present invention. The invention provides a micro-service reliability prediction method, which specifically comprises the following steps:
Step 1, acquiring first distributed log data in each microservice;
Step 2, obtaining a fault type sequence and N faults in a prediction stage according to the first distributed log data;
Step 3, processing the fault category sequence by utilizing the one-step transition probability of the Markov chain to obtain a fault category transition matrix;
Step 4, obtaining a fault category transfer probability matrix according to the fault category transfer matrix;
Step 5, obtaining absolute probability vectors of N faults in the prediction stage according to the fault category transition probability matrix of each microservice.
Nowadays, more and more software systems adopt a micro-service architecture, and different component modules on different nodes cooperate with each other to provide services to the outside. In these software systems, a log mechanism plays an important role, and each module records actions and operations in the system through a log, wherein the actions and operations contain abundant information and data. Therefore, the log data accessed by the users with large flow and high concurrency provides a large amount of precious data resources for the reliability analysis of the micro-service system.
In this embodiment, first distributed log data is first collected from the many users of each microservice system. The first distributed log data is the original distributed log data collected directly from each microservice system. If a microservice system fails, the failure is recorded in the corresponding distributed log data, so the reliability of the microservice system can be analyzed from the collected first distributed log data.
In this embodiment, the collected first distributed log data is classified according to a fault classification method, which must be determined before classification. This embodiment does not limit the fault classification method; a person skilled in the art may set it according to actual requirements. The collected first distributed log data is classified in order of occurrence according to the determined method, yielding a fault category sequence, that is, a sequence that records the fault category of each fault in the first distributed log data. For example, suppose there are 4 fault categories, namely 1, 2, 3 and 4, and the fault category sequence of the first distributed log data obtained through testing is 11213111241211231232. The 1st digit, 1, indicates that the first fault belongs to category 1; the 2nd digit, 1, indicates that the second fault belongs to category 1; the 3rd digit, 2, indicates that the third fault belongs to category 2; and so on. The total number N of faults in the prediction phase, that is, the N faults that may occur in the prediction phase, can also be obtained from the first distributed log data. The N faults in the prediction phase only need to satisfy the condition under which the Markov chain is applicable.
In this embodiment, after the fault category sequence corresponding to the first distributed log data is obtained, it is processed using the one-step transition probability of the Markov chain to obtain a fault category transition matrix, which records the number of times each fault category transitions to another fault category. For example, the fault category transition matrix is represented by the following formula:
$$K = \begin{bmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ k_{n1} & k_{n2} & \cdots & k_{nn} \end{bmatrix}$$
where $K$ is the fault category transition matrix and each element is a count taken from the fault category sequence: $k_{11}$ is the number of transitions from the first fault category to the first fault category, $k_{12}$ is the number of transitions from the first fault category to the second fault category, $k_{1n}$ is the number of transitions from the first fault category to the nth fault category, $k_{21}$ is the number of transitions from the second fault category to the first fault category, $k_{22}$ is the number of transitions from the second fault category to the second fault category, $k_{2n}$ is the number of transitions from the second fault category to the nth fault category, and so on.
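The counting that produces the fault category transition matrix can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `count_transitions` and the choice of a list-of-lists matrix are assumptions, and categories are assumed to be the integers 1 to n.

```python
from typing import List, Sequence


def count_transitions(sequence: Sequence[int], n_categories: int) -> List[List[int]]:
    """Return K, where K[i][j] counts transitions from category i+1 to category j+1."""
    counts = [[0] * n_categories for _ in range(n_categories)]
    # Walk over consecutive pairs (previous fault, next fault) in the sequence.
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev - 1][nxt - 1] += 1
    return counts


# Example fault category sequence quoted in the description (categories 1 to 4).
fault_sequence = [int(ch) for ch in "11213111241211231232"]
K = count_transitions(fault_sequence, 4)
for row in K:
    print(row)
```

A sequence of 20 faults yields 19 counted transitions, so the entries of the matrix sum to the sequence length minus one.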
In addition, the absolute probability vectors of the N faults in the prediction stage can be obtained from the fault category transition probability matrix. An absolute probability vector gives, for a given fault in the prediction stage, the probability that it belongs to each fault category, from which the most probable category of that fault can be determined.
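As a hedged illustration of how absolute probability vectors might be computed, the sketch below applies the standard Markov chain recursion, in which the vector for the next fault is the current row vector multiplied by the transition probability matrix. The matrix `P` is assumed to be the row-normalized transition counts of the example sequence 11213111241211231232, and the starting vector assumes the last observed fault is category 2 (the last digit of that sequence). The names `step` and `predict` are hypothetical.

```python
from fractions import Fraction
from typing import List

Vector = List[Fraction]
Matrix = List[List[Fraction]]


def step(vector: Vector, P: Matrix) -> Vector:
    """One Markov step: row vector times transition probability matrix."""
    n = len(P)
    return [sum(vector[i] * P[i][j] for i in range(n)) for j in range(n)]


def predict(initial: Vector, P: Matrix, n_faults: int) -> List[Vector]:
    """Absolute probability vectors for faults 1..n_faults of the prediction stage."""
    vectors, current = [], initial
    for _ in range(n_faults):
        current = step(current, P)
        vectors.append(current)
    return vectors


F = Fraction
# Assumed example matrix: each row of transition counts divided by its row sum.
P = [
    [F(2, 5), F(1, 2), F(1, 10), F(0)],
    [F(2, 5), F(0), F(2, 5), F(1, 5)],
    [F(2, 3), F(1, 3), F(0), F(0)],
    [F(1), F(0), F(0), F(0)],
]
initial = [F(0), F(1), F(0), F(0)]  # assumption: last observed fault is category 2

for q, vec in enumerate(predict(initial, P, 3), start=1):
    # Category with the highest probability for the q-th predicted fault.
    most_likely = max(range(len(vec)), key=lambda j: vec[j]) + 1
    print(q, [float(x) for x in vec], "most likely category:", most_likely)
```

Exact fractions are used here only so that each vector can be checked to sum to 1 without floating-point error; a numerical library would serve equally well.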
The prediction method of the micro-service provided by the invention is a prediction method based on a Markov chain, the distributed log data of the micro-service system is used as input, the fault of each micro-service is classified to obtain a fault type sequence, and a type transition probability matrix of each micro-service is obtained according to the obtained fault type sequence, so that the reliability of each micro-service is obtained.
Example Two
At present, traditional software reliability analysis methods are applied to monolithic architectures or multi-instance SOA deployments. They cannot accurately analyze the reliability of decentralized microservices, nor can they effectively analyze reliability when microservices are accessed by high-traffic, highly concurrent users. Moreover, the services of an Internet of Things microservice system are relatively complex, and traditional reliability analysis methods cannot cope with the new problems caused by the expansion of microservices. In addition, the reliability analysis methods in the prior art all require a large amount of service invocation data, that is, they consider service reliability from the perspective of the service user, so that the publishing, discovery and binding capabilities of the microservice all have some influence on the reliability of the web service.
In view of the above problems, and referring to FIG. 2 and FIG. 3, this embodiment describes the microservice reliability prediction method of Example One in detail on the basis of the above embodiment.
On the basis of the above embodiment, this embodiment gives an example of how the first distributed log data of each microservice is acquired. Specifically, Flume is a highly available, highly reliable, distributed system for collecting, aggregating and transmitting massive logs, provided by Cloudera. It supports customizing various data senders in the logging system for collecting data, and it also provides the ability to perform simple processing on the data and write it to various data receivers. Therefore, this embodiment may use Cloudera's open-source project Flume, which has complete built-in components and requires little development effort, to collect the first distributed log data from each microservice system node in real time. The first distributed log data may be the large volume of original log data generated by user operation requests, acquired from log-generating terminals such as a client, a server or a cloud service system that provides business services.
In addition, the first distributed log data of the microservice system may also be obtained in other ways, which is not limited in this embodiment.
On the basis of the foregoing embodiment, this embodiment gives an example of step 2 of Example One. Specifically, step 2 may include steps 2.1 to 2.2:
Step 2.1, filling in missing data, smoothing noisy data and eliminating inconsistent data in the first distributed log data to obtain second distributed log data.
Specifically, because the acquired first distributed log data may contain missing data, noise and inconsistent data, the first distributed log data is processed by filling in missing data, smoothing noise and eliminating inconsistent data, which yields the second distributed log data. Using the second distributed log data for testing improves the accuracy of the final reliability prediction.
Step 2.2, obtaining a fault category sequence and N faults in a prediction stage according to the second distributed log data.
The second distributed log data is classified in order of occurrence according to the determined fault classification method, yielding a fault category sequence. The total number N of faults in the prediction stage, that is, the N faults that may occur in the prediction stage, can be obtained from the second distributed log data. The N faults in the prediction stage only need to satisfy the condition under which the Markov chain is applicable.
Specifically, to further describe step 2.2, this embodiment gives an example of its implementation. Step 2.2 may include, for example:
Step 2.21, obtaining a fault category sequence and M faults corresponding to the fault category sequence according to the fault categories of the second distributed log data;
Step 2.22, obtaining N faults in the prediction stage by utilizing the M faults based on the Markov chain.
In the test stage, the second distributed log data is obtained dynamically and classified in order of occurrence according to the determined fault classification method. The category of each fault is recorded, yielding a fault category sequence denoted X1, X2, …, XM, where X1 is the fault category of the first fault, X2 is the fault category of the second fault, and XM is the fault category of the Mth fault. The total number of faults found in the second distributed log data during the test stage is also recorded and denoted M.
According to Markov chain theory, the category to which each fault obtained from the second distributed log data belongs can be regarded as a random process whose state space is the set of fault categories, and this state set is discrete. The faults reflected by the second distributed log data may be classified according to, for example, the economic loss to the user, functional importance, or user-defined fault types. For example, faults may be classified by the loss they cause to the user into 4 categories: the loss caused by category 1 is less than or equal to $2,000; the loss caused by category 2 is greater than $2,000 and less than or equal to $20,000; the loss caused by category 3 is greater than $20,000 and less than or equal to $200,000; and the loss caused by category 4 is greater than $200,000. Alternatively, faults may be divided according to whether they are recoverable, warnings, fatal errors, and so on. Faults may also be classified with the KNN or K-means algorithm, in which case the fault categories are generated dynamically.
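The loss-based classification can be sketched as a simple threshold lookup. This is only an illustration: it assumes category 2 covers losses greater than $2,000 and up to $20,000, and the names `LOSS_BOUNDS` and `classify_by_loss` are hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bounds, in dollars, of categories 1 to 3.
# Any loss above the last bound falls into category 4.
LOSS_BOUNDS = [2_000, 20_000, 200_000]


def classify_by_loss(loss: float) -> int:
    """Map an economic loss to a fault category from 1 to 4."""
    # bisect_left counts the bounds strictly below `loss`, so a loss that
    # equals a bound stays in the lower category.
    return bisect_left(LOSS_BOUNDS, loss) + 1


# Hypothetical losses of four consecutive faults, in order of occurrence.
losses = [500, 2_000, 15_000, 250_000]
fault_sequence = [classify_by_loss(x) for x in losses]
print(fault_sequence)
```

Applying the classifier to faults in order of occurrence yields the fault category sequence that the later steps consume.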
The total number N of faults in the prediction stage can be obtained from the total number M of faults in the second distributed log data. The N faults in the prediction stage only need to satisfy the condition under which the Markov chain is applicable. The test phase accumulates historical data and yields the initial probability vector and the transition probability matrix; the prediction phase uses the data obtained in the test phase to make predictions according to the model.
When faults in the prediction stage are predicted, the fault category transition probability matrix is obtained from the historical data in the second distributed log data. Let M be the total number of faults corresponding to the second distributed log data and N the total number of faults in the prediction phase. When N is relatively large, the fault category transition probability matrix can no longer be regarded as invariant, so N must not be too large if the prediction phase is to satisfy the Markov chain. For example, assume that the fault category transition probability matrix can be regarded as invariant when N ≤ M · 10%, and that the Markov chain requirement no longer holds when N > M · 10%. The total number N of predicted faults in this embodiment should therefore be chosen so that the obtained fault category transition probability matrix can be regarded as invariant. For example, if the total number M of faults found in the test phase is 30, then N = 3 follows from N ≤ M · 10%.
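The bound on the prediction horizon can be computed directly. The sketch below assumes the 10% ratio given as an example in the text and takes the largest integer N that satisfies the bound; the function name `max_prediction_faults` is hypothetical.

```python
def max_prediction_faults(m_observed: int, ratio_percent: int = 10) -> int:
    """Largest integer N satisfying N <= M * ratio_percent / 100."""
    if m_observed < 0:
        raise ValueError("the number of observed faults must be non-negative")
    # Integer arithmetic avoids floating-point rounding at exact multiples.
    return (m_observed * ratio_percent) // 100


n_faults = max_prediction_faults(30)
print(n_faults)
```

With M = 30 this reproduces the N = 3 of the example; a different ratio can be passed if another stationarity assumption is preferred.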
On the basis of the foregoing embodiment, this embodiment gives an example of step 3 of Example One. Specifically, step 3 may include steps 3.1 to 3.2:
Step 3.1, counting the number of times each fault transitions to another fault according to the fault category sequence;
Step 3.2, obtaining a fault category transition matrix according to the number of times each fault transitions to another fault.
The M faults obtained in the test phase may be processed using the one-step transition probability of the Markov chain. That is, when the test phase ends, the number of times each fault category is immediately followed by each fault category (including itself) in the fault category sequence is counted. For example, suppose there are 4 fault categories, namely 1, 2, 3 and 4, and the fault category sequence of the distributed log data obtained through testing is 11213111241211231232. Then the number of transitions from category 1 to category 1 is 4, from category 1 to category 2 is 5, from category 1 to category 3 is 1, from category 1 to category 4 is 0, and so on. The fault category transition matrix is then obtained from these counts. For example, the fault category transition matrix obtained from the fault category sequence 11213111241211231232 is:
$$K = \begin{bmatrix} 4 & 5 & 1 & 0 \\ 2 & 0 & 2 & 1 \\ 2 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$
On the basis of the foregoing embodiment, this embodiment gives an example of step 4 of Example One. Specifically, step 4 may include:
Processing the fault category transition matrix by using a transition probability matrix model to obtain a fault category transition probability matrix, wherein the transition probability matrix model is as follows:
$$p_{ij} = \frac{k_{ij}}{\sum_{l=1}^{n} k_{il}}$$
where $p_{ij}$ is the element in the ith row and jth column of the transition probability matrix, and $k_{ij}$ is the element in the ith row and jth column of the fault category transition matrix.
For example, taking the fault category transition matrix corresponding to the fault category sequence 11213111241211231232 from steps 3.1 to 3.2, the transition probability matrix model gives: p11 = 4/(4+5+1+0) = 0.4, p12 = 5/(4+5+1+0) = 0.5, p13 = 1/(4+5+1+0) = 0.1, p14 = 0/(4+5+1+0) = 0, p21 = 2/(2+0+2+1) = 0.4, p22 = 0/(2+0+2+1) = 0, p23 = 2/(2+0+2+1) = 0.4, p24 = 1/(2+0+2+1) = 0.2, and the remaining elements follow by analogy, so that the fault category transition probability matrix is obtained.
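The row normalization in this worked example can be sketched as follows. The count matrix `K` is the one implied by the example sequence; the function name `to_probability_matrix` is hypothetical, and leaving an all-zero row unchanged for a category that never appears as a source is an assumption the text does not address.

```python
from fractions import Fraction
from typing import List


def to_probability_matrix(K: List[List[int]]) -> List[List[Fraction]]:
    """Divide every count by its row sum so that each row sums to 1."""
    P = []
    for row in K:
        total = sum(row)
        if total == 0:
            # Assumption: a category with no outgoing transitions keeps a zero row.
            P.append([Fraction(0)] * len(row))
        else:
            P.append([Fraction(k, total) for k in row])
    return P


# Transition counts for the example sequence 11213111241211231232.
K = [
    [4, 5, 1, 0],
    [2, 0, 2, 1],
    [2, 1, 0, 0],
    [1, 0, 0, 0],
]
P = to_probability_matrix(K)
for row in P:
    print([str(p) for p in row])
```

The first two rows match the values p11 to p24 listed above; the third and fourth rows are obtained by the same rule.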
the obtained fault category transition probability matrix can reflect the probability that each type of fault in each micro service is transferred to another fault, so that the reliability of each micro service can be reflected according to the fault category transition probability matrix.
The present embodiment illustrates step 5 of the first embodiment on the basis of the foregoing embodiments. Specifically, step 5 of the first embodiment may include:
Step 5.1, obtaining an initial probability vector of the prediction stage according to the absolute probability vector of the last fault corresponding to the first distributed log data.
In this embodiment, assuming that $\{X_q, q \in T\}$ is a Markov chain, $p_j = P\{X_0 = j\}$ is called the initial probability of the Markov chain, $p_j(q) = P\{X_q = j\}$ is called the absolute probability of the Markov chain, $p^{T}(0) = \{p_1, p_2, \ldots\}$ is the initial probability vector, and $p^{T}(q) = \{p_1(q), p_2(q), \ldots\}$ is the absolute probability vector of the qth fault.
In this embodiment, the initial probability vector of the prediction stage is the absolute probability vector corresponding to the last fault of the test stage. That is, the absolute probability vector of the last fault can be determined from the category of the last fault corresponding to the first distributed log data, which gives the initial probability vector of the prediction stage. For example, if the faults are classified into 4 categories and the last fault corresponding to the first distributed log data is a category 2 fault, the absolute probability vector of the last fault is [0, 1, 0, 0]; if the last fault is a category 3 fault, the absolute probability vector of the last fault is [0, 0, 1, 0]. As another example, if the faults are classified into 5 categories and the last fault corresponding to the first distributed log data is a category 5 fault, the absolute probability vector of the last fault is [0, 0, 0, 0, 1].
Step 5.2, obtaining the absolute probability vectors of the N faults of the prediction stage according to the initial probability vector of the prediction stage and the fault category transition probability matrix.
Specifically, the result obtained by multiplying the initial probability vector of the prediction stage by the fault category transition probability matrix is the absolute probability vector of the N faults of the prediction stage, that is:

$$p^{T}(q) = p^{T}(0) \cdot P$$

wherein $p^{T}(q)$ is the absolute probability vector of the prediction stage, $p^{T}(0)$ is the initial probability vector of the prediction stage, and $P$ is the fault category transition probability matrix.
The absolute probability vector of the prediction stage obtained in this embodiment reflects the probability that each fault occurring in the prediction stage belongs to each fault category, and the most likely category of that fault is obtained from these probability values.
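Steps 5.1 and 5.2 can be sketched as below. The helper names (`one_hot`, `step`, `predict`) are hypothetical. The sketch also assumes a time-homogeneous chain, so that the vector for each successive predicted fault is obtained by multiplying the previous vector by the same matrix again; the text itself only writes out a single multiplication. The matrix `P` is the transition probability matrix of the example sequence 11213111241211231232, whose last fault is of category 2.

```python
def one_hot(category, n_categories):
    """Absolute probability vector of an observed fault of the given category."""
    return [1.0 if j == category - 1 else 0.0 for j in range(n_categories)]


def step(vector, matrix):
    """Row vector times matrix: one step of the Markov chain."""
    n = len(matrix)
    return [sum(vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def predict(initial, matrix, n_faults):
    """Absolute probability vectors of the next n_faults faults."""
    vectors = []
    current = initial
    for _ in range(n_faults):
        current = step(current, matrix)
        vectors.append(current)
    return vectors


P = [
    [0.4, 0.5, 0.1, 0.0],
    [0.4, 0.0, 0.4, 0.2],
    [2 / 3, 1 / 3, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
p0 = one_hot(2, 4)  # the last observed fault belongs to category 2
predictions = predict(p0, P, 3)

# Most likely category of each predicted fault (ties go to the lowest category).
most_likely = [max(range(4), key=lambda j: v[j]) + 1 for v in predictions]
print(predictions[0], most_likely)
```

Because the initial vector is one-hot, the first predicted vector is simply the row of `P` for the last observed category, here [0.4, 0.0, 0.4, 0.2].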
In this embodiment, after the absolute probability vectors of the N faults of the prediction stage are obtained, the reliability of a combined microservice may be predicted from the reliability of the individual microservices. Specifically, the fault category transition probability of the combined microservice may be obtained by weighting the fault category transition probability matrix of each microservice.
Furthermore, the combined microservice of the present embodiment refers to a group of microservices with a composite function that is composed of a plurality of basic microservices. The present embodiment measures the reliability of the combined microservice by the importance of each service within it. That is, the fault category transition probability matrix of each microservice in the combined microservice is weighted according to its importance to obtain the fault category transition probability of the combined microservice, from which the reliability of the combined microservice can be predicted.
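One possible reading of the importance weighting is sketched below. The text does not give the weighting formula, so the following is an assumption: importances are normalized to sum to 1 and the combined matrix is the weighted sum of the per-service matrices. The function name `combine_matrices` and all numeric values are illustrative only.

```python
def combine_matrices(matrices, importances):
    """Importance-weighted sum of per-service transition probability matrices.

    Assumption: weights are the importances normalized to sum to 1, so the
    combined matrix keeps rows that sum to 1 when every input matrix does.
    """
    total = sum(importances)
    weights = [w / total for w in importances]
    n = len(matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * matrix[i][j]
    return combined


# Two hypothetical microservices with two fault categories each.
service_a = [[0.9, 0.1], [0.5, 0.5]]
service_b = [[0.6, 0.4], [0.2, 0.8]]
combined = combine_matrices([service_a, service_b], importances=[3, 1])
print(combined)
```

With importances 3 and 1 the weights are 0.75 and 0.25, so the more important service dominates the combined matrix.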
Reliability, availability and safety are important aspects of system quality assurance. Reliability is especially important for high-traffic, high-concurrency microservice application systems, because improving product quality, system availability and user satisfaction all depends on a reliable system. In this embodiment, distributed log data is acquired and processed, the faults in the distributed log data are classified, reliability is then modeled and measured for the system built from the microservices, and the reliability of the combined microservice is measured and predicted from the reliability measurement results of the individual microservices, thereby providing a deployment scheme for building a reliable adaptive system. The technical solution of this embodiment can therefore improve the product quality of the microservice system as well as its availability and user satisfaction.
After the reliability of the combined microservice is obtained, an adaptive microservice script can be written according to that reliability. Specifically, a deployment script is generated from the obtained reliability measure of the combined microservice, and K8S is used to perform adaptive deployment and orchestration of the microservices according to the generated script.
Specifically, the reliability evaluation algorithm is described as follows:
Input: distributed log collection file, combined data of combined microservices
Output: combined microservice reliability
This embodiment evaluates and predicts the reliability of microservices based on a Markov chain. Current software reliability evaluation and prediction methods cannot evaluate and predict the reliability of microservices, especially of combined microservices. The prediction method of this embodiment addresses the evaluation and prediction of combined microservice reliability, thereby supporting the deployment of microservices and improving the reliability of predictions for the microservice system.
Example three
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a prediction apparatus according to an embodiment of the present invention.
The prediction device comprises:
an acquisition module, configured to acquire first distributed log data in each microservice;
a processing module, configured to obtain a fault category sequence and N faults of a prediction stage according to the first distributed log data;
a transition matrix determining module, configured to process the fault category sequence by using the one-step transition probability of a Markov chain to obtain a fault category transition matrix;
a transition probability matrix determining module, configured to obtain a fault category transition probability matrix according to the fault category transition matrix;
and a prediction module, configured to obtain the absolute probability vectors of the N faults of the prediction stage according to the fault category transition probability matrix of each microservice.
In an embodiment of the present invention, the processing module is specifically configured to fill in missing data, smooth noisy data, and eliminate inconsistent data in the first distributed log data to obtain second distributed log data; and to obtain the fault category sequence and the N faults of the prediction stage according to the second distributed log data.
In an embodiment of the present invention, the processing module is further specifically configured to obtain the fault category sequence and M faults corresponding to the fault category sequence according to the fault type of the second distributed log data; and to obtain the N faults of the prediction stage by using the M faults based on a Markov chain.
In an embodiment of the present invention, the transition matrix determining module is specifically configured to count the number of times each fault transitions to another fault according to the fault category sequence; and to obtain the fault category transition matrix according to the number of times each fault transitions to another fault.
In an embodiment of the present invention, the transition probability matrix determining module is specifically configured to process the fault category transition matrix by using a transition probability matrix model to obtain the fault category transition probability matrix.
In an embodiment of the present invention, the prediction module is specifically configured to obtain an initial probability vector of the prediction stage according to the absolute probability vector of the last fault corresponding to the first distributed log data; and to obtain the absolute probability vectors of the N faults of the prediction stage according to the initial probability vector of the prediction stage and the fault category transition probability matrix.
The prediction apparatus provided in the embodiment of the present invention may implement the above method embodiments; the implementation principle and the technical effect are similar and are not described herein again.
Example four
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device 1100 comprises a processor 1101, a communication interface 1102, a memory 1103 and a communication bus 1104, wherein the processor 1101, the communication interface 1102 and the memory 1103 communicate with one another through the communication bus 1104;
the memory 1103 is configured to store a computer program;
the processor 1101 is configured to implement the above method steps when executing the computer program.
The processor 1101, when executing the computer program, implements the following steps: acquiring first distributed log data in each microservice; obtaining a fault category sequence and N faults of a prediction stage according to the first distributed log data; processing the fault category sequence by using the one-step transition probability of a Markov chain to obtain a fault category transition matrix; obtaining a fault category transition probability matrix according to the fault category transition matrix; and obtaining the absolute probability vectors of the N faults of the prediction stage according to the fault category transition probability matrix of each microservice.
The electronic device provided by the embodiment of the present invention can execute the above method embodiments; the implementation principle and the technical effect are similar and are not described herein again.
Example five
Yet another embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
Acquiring first distributed log data in each microservice;
Obtaining a fault category sequence and N faults of a prediction stage according to the first distributed log data;
Processing the fault category sequence by using the one-step transition probability of a Markov chain to obtain a fault category transition matrix;
Obtaining a fault category transition probability matrix according to the fault category transition matrix;
And obtaining the absolute probability vectors of the N faults of the prediction stage according to the fault category transition probability matrix of each microservice.
The computer-readable storage medium provided by the embodiment of the present invention may implement the above method embodiments, and the implementation principle and technical effect are similar, which are not described herein again.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device), or computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects, all of which may generally be referred to herein as a "module" or "system". Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. A computer program may be stored or distributed on a suitable medium supplied together with or as part of other hardware, and may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
In the description of the present invention, it is to be understood that the terms "first", "second" and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples, and various embodiments or examples described in this specification can be combined by those skilled in the art.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. A microservice reliability prediction method, characterized by comprising the following steps:
acquiring first distributed log data in each microservice;
obtaining a fault category sequence and N faults of a prediction stage according to the first distributed log data;
processing the fault category sequence by using the one-step transition probability of a Markov chain to obtain a fault category transition matrix;
obtaining a fault category transition probability matrix according to the fault category transition matrix;
and obtaining absolute probability vectors of the N faults of the prediction stage according to the fault category transition probability matrix of each microservice.
2. The microservice reliability prediction method of claim 1, wherein obtaining the fault category sequence and the N faults of the prediction stage according to the first distributed log data comprises:
filling in missing data, smoothing noisy data and eliminating inconsistent data in the first distributed log data to obtain second distributed log data;
and obtaining the fault category sequence and the N faults of the prediction stage according to the second distributed log data.
3. The microservice reliability prediction method of claim 2, wherein obtaining the fault category sequence and the N faults of the prediction stage according to the second distributed log data comprises:
obtaining the fault category sequence and M faults corresponding to the fault category sequence according to the fault type of the second distributed log data;
and obtaining the N faults of the prediction stage by using the M faults based on a Markov chain.
4. The microservice reliability prediction method of claim 1, wherein processing the fault category sequence by using the one-step transition probability of a Markov chain to obtain the fault category transition matrix comprises:
counting the number of times each fault transitions to another fault according to the fault category sequence;
and obtaining the fault category transition matrix according to the number of times each fault transitions to another fault.
5. The microservice reliability prediction method of claim 1, wherein obtaining the fault category transition probability matrix according to the fault category transition matrix comprises:
processing the fault category transition matrix by using a transition probability matrix model to obtain the fault category transition probability matrix.
6. The microservice reliability prediction method of claim 1, wherein obtaining the absolute probability vectors of the N faults of the prediction stage according to the fault category transition probability matrix of each microservice comprises:
obtaining an initial probability vector of the prediction stage according to the absolute probability vector of the last fault corresponding to the first distributed log data;
and obtaining the absolute probability vectors of the N faults of the prediction stage according to the initial probability vector of the prediction stage and the fault category transition probability matrix.
7. The microservice reliability prediction method of claim 1, further comprising, after obtaining the absolute probability vectors of the N faults of the prediction stage:
weighting the fault category transition probability matrix of each microservice to obtain the fault category transition probability of the combined microservice.
8. A prediction apparatus, comprising:
an acquisition module, configured to acquire first distributed log data in each microservice;
a processing module, configured to obtain a fault category sequence and N faults of a prediction stage according to the first distributed log data;
a transition matrix determining module, configured to process the fault category sequence by using the one-step transition probability of a Markov chain to obtain a fault category transition matrix;
a transition probability matrix determining module, configured to obtain a fault category transition probability matrix according to the fault category transition matrix;
and a prediction module, configured to obtain absolute probability vectors of the N faults of the prediction stage according to the fault category transition probability matrix of each microservice.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 7 when executing the computer program.
10. A storage medium, characterized in that a computer program is stored in the storage medium, and the computer program, when executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN201910833987.8A 2019-09-04 2019-09-04 Microservice reliability prediction method, prediction device, electronic device, and storage medium Pending CN110543462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910833987.8A CN110543462A (en) 2019-09-04 2019-09-04 Microservice reliability prediction method, prediction device, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
CN110543462A true CN110543462A (en) 2019-12-06

Family

ID=68711284

Country Status (1)

Country Link
CN (1) CN110543462A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191206