CN109213616A

CN109213616A - A method for detecting anomalies in microservice software systems based on call graph analysis

Info

Publication number: CN109213616A
Application number: CN201811114173.0A
Authority: CN
Inventors: 周红卫; 李亚琼; 刘延新; 李守超; 周博
Original assignee: Jiangsu Run He Software Inc Co
Current assignee: Jiangsu Run He Software Inc Co
Priority date: 2018-09-25
Filing date: 2018-09-25
Publication date: 2019-01-15

Abstract

The invention relates to a method for detecting anomalies in microservice software systems based on call graph analysis. The features of each microservice are monitored, a sliding window is established for each feature, and feature values are predicted with the single moving average method. The difference between the predicted value and the monitored value is compared, and anomalous features are detected with a statistical method. A microservice with an anomalous feature is detected as anomalous, and a call graph of the microservices is built from knowledge of how the system is constructed. The degree of anomaly of each microservice is assessed with the PageRank method, and the microservices are ranked accordingly.

Description

A method for detecting anomalies in microservice software systems based on call graph analysis

Technical field

The present invention relates to a method for detecting anomalies in microservice software systems based on call graph analysis, and belongs to the field of software technology.

Background art

As Internet companies keep expanding their business and user requirements change frequently, traditional software architectures can no longer meet the design, development, and maintenance needs of Internet applications. Microservices are a system architecture with high cohesion and low coupling: an application is split into a group of single-function, independently deployable microservices that cooperate through lightweight communication mechanisms, enabling agile development and deployment. However, a microservice system has many components with complex dependencies, which increases both the likelihood of anomalies and the difficulty of diagnosing them. Existing work is usually based on monitoring metrics or log analysis, with system administrators using domain knowledge to investigate the cause of an anomaly; when facing complex microservices, this approach cannot locate the cause quickly. In particular, when one service component becomes abnormal, the impact of the anomaly keeps spreading through the calls between components and eventually leads to degraded service quality or service-level violations, causing heavy losses. Effective anomaly detection has therefore become one of the problems that urgently need to be solved in order to guarantee the performance and reliability of microservices.

Current work falls into the following categories. The first is anomaly detection based on monitoring metrics, such as Hystrix (Latency and Fault Tolerance for Distributed Systems. https://github.com/Netflix/Hystrix). This approach is simple to implement: with minimal configuration it can monitor operating metrics of the target system, such as CPU and memory, and set alarm rules. When the system becomes abnormal, an alarm is triggered and the administrator uses domain knowledge to analyze and locate the cause. The second is log-based monitoring and analysis, such as ELK (https://www.elastic.co/). Methods of this kind collect the log information of each component, and the administrator analyzes the cause of an anomaly through information retrieval or log mining. Monitoring based on metrics or logs generally focuses on the execution of a single service component. It is relatively simple to implement, but it cannot reflect the overall state of the microservices, nor can it trace business flows through complex microservice interactions; administrators spend a great deal of time searching for and locating the cause of an anomaly, and sometimes cannot diagnose the problem at all. The third is reliability analysis based on microservice dependencies, such as App-Bisect (Rajagopalan S, Jamjoom H. App-Bisect: autonomous healing for microservice-based apps, USENIX Conference on Hot Topics in Cloud Computing, 2015.) and Gremlin (Heorhiadi V, Rajagopalan S, Jamjoom H, Reiter M K, Sekar V. Gremlin: Systematic Resilience Testing of Microservices. IEEE 36th International Conference on Distributed Computing Systems. 2016. 57-66.). Methods of this kind focus on the running state at the level of microservice components and ensure system reliability through version control or server resets. Because they locate anomalies at the granularity of a microservice component, they likewise cannot characterize the state of the system as a whole, and when the system becomes abnormal the administrator still has to investigate further. The fourth is distributed tracing, such as Dapper (Sigelman B H, Barroso L A, Burrows M, Stephenson P, Plakal M, Beaver D, Jaspan S, Shanbhag C. Dapper: a large-scale Distributed systems tracing infrastructure, Google Technical Report, 2010.). This approach monitors the execution of method calls through logs, instrumentation, or annotations, and traces how an end-to-end request is processed in the system. An anomaly causes the processing trace to deviate, so analyzing the trace serves the purpose of diagnosis. However, this approach focuses on tracing the processing of a single end-to-end request and does not provide an effective analysis method. In addition, microservice software systems usually serve a very large number of users, which produces a volume of monitoring data so large that administrators are overwhelmed by it.

Summary of the invention

Purpose of the invention: to construct a statistical model from the historical monitoring data of microservice features, detect anomalous features and microservices automatically by predicting feature values, and rank the degree of anomaly of the microservices based on the anomaly propagation network so as to locate the root cause of an anomaly.

Principle of the invention: the features of each microservice are monitored, a sliding window is established for each feature, and feature values are predicted with the single moving average method; the difference between the predicted value and the monitored value is compared, and anomalous features are detected with a statistical method; a microservice with an anomalous feature is detected as anomalous, and a call graph of the microservices is built from knowledge of how the system is constructed; the degree of anomaly of each microservice is assessed with the PageRank method, and the microservices are ranked accordingly.

Technical solution of the invention: a method for detecting anomalies in microservice software systems based on call graph analysis, characterized by the following implementation steps:

The first step, feature monitoring:

A sliding window of length $N$ is established for each feature, and the time-series monitoring data $(y_t, y_{t-1}, \ldots, y_{t-N+1})$ are placed into the sliding window in order, where $y_i$ is the monitoring data at time $i$;
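As a minimal sketch of the sliding window described above, the following Python keeps the N most recent samples of one feature. The class name `FeatureWindow`, the service name, and the sample values are illustrative assumptions and do not come from the original description.

```python
from collections import deque


class FeatureWindow:
    """Fixed-length sliding window holding the N most recent samples of one feature."""

    def __init__(self, length):
        if length <= 0:
            raise ValueError("window length must be positive")
        # A deque with maxlen discards the oldest sample automatically once full.
        self.samples = deque(maxlen=length)

    def push(self, value):
        """Append the newest monitored value y_t."""
        self.samples.append(float(value))

    def is_full(self):
        """True once N samples have been collected."""
        return len(self.samples) == self.samples.maxlen

    def values(self):
        """Return the samples ordered from oldest to newest."""
        return list(self.samples)


# One window per (microservice, feature) pair; the key below is hypothetical.
windows = {("account-service", "cpu"): FeatureWindow(3)}
for sample in [0.20, 0.25, 0.22, 0.30]:
    windows[("account-service", "cpu")].push(sample)
print(windows[("account-service", "cpu")].values())  # [0.25, 0.22, 0.3]
```

Using `deque(maxlen=N)` keeps the eviction of the oldest sample implicit, so the caller only ever appends.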

Second step, feature prediction:

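For each feature, the value at time $(t+T)$ is predicted with the single moving average over the sliding window,

$y'_{t+T} = M_t$, where $M_t = \frac{1}{N}\left(y_t + y_{t-1} + \cdots + y_{t-N+1}\right)$;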
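The forecast can be sketched in a few lines, assuming (as the single moving average method is usually defined) that the predicted value is the arithmetic mean of the N samples currently in the window. The function name and the sample values are hypothetical.

```python
def predict_moving_average(window_values):
    """Single moving average forecast: the mean of the samples in the window."""
    if not window_values:
        raise ValueError("window is empty")
    return sum(window_values) / len(window_values)


# CPU utilization in percent for the last three sampling instants (made-up data).
cpu_window = [10.0, 20.0, 30.0]
print(predict_moving_average(cpu_window))  # 20.0
```

Under this assumption the same forecast is used for every horizon T, because a single moving average carries no trend term.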

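Third step, anomalous feature detection:

The variance of the feature is calculated, and the difference between the monitored value $y_{t+T}$ and the predicted value $y'_{t+T}$ is compared against it, where $y'_{t+T}$ is the predicted value of the feature at time $(t+T)$ and $y_{t+T}$ is the monitored value of the feature at time $(t+T)$. If the difference exceeds the threshold (three times the variance in the embodiment), the feature is detected as anomalous, and the microservice to which the anomalous feature belongs is identified as an anomalous microservice.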
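The detection rule can be sketched as follows. This is an assumption-laden illustration: the spread is taken as the population standard deviation of the window (the conventional three-sigma rule), whereas the text speaks of the variance; replacing `pstdev` with `statistics.pvariance` follows the wording literally. The function names, the multiplier `k`, and all readings are hypothetical.

```python
from statistics import pstdev


def is_anomalous(history, observed, k=3.0):
    """Flag `observed` when it deviates from the moving-average forecast by more than k spreads."""
    predicted = sum(history) / len(history)  # single moving average forecast
    spread = pstdev(history)                 # assumption: population standard deviation
    return abs(observed - predicted) > k * spread


def anomalous_services(readings, k=3.0):
    """A microservice is anomalous if at least one of its features is anomalous."""
    return {
        service
        for (service, _feature), (history, observed) in readings.items()
        if is_anomalous(history, observed, k)
    }


# (service, feature) -> (window contents, newly monitored value); made-up data.
readings = {
    ("account-service", "cpu"): ([10, 12, 11, 9, 10, 11, 10, 12, 9, 11], 30),
    ("account-service", "memory"): ([50, 51, 49, 50, 50, 51, 49, 50, 50, 50], 50),
    ("statistics-service", "cpu"): ([20, 21, 19, 20, 20, 21, 19, 20, 20, 20], 20),
}
print(anomalous_services(readings))  # {'account-service'}
```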

Fourth step, microservice call graph construction:

Based on knowledge of how the microservice software system is constructed, or by monitoring the calls between microservices, the call relationships among the detected anomalous microservices are described as a directed graph, in which each microservice is a node and a call from microservice A to microservice B is a directed edge from node A to node B.
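A minimal sketch of the graph construction, assuming the call relationships are available as (caller, callee) pairs. The function name and the example call pairs are illustrative and are not taken from the original text.

```python
def build_call_graph(calls, anomalous):
    """Return {caller: {callee, ...}} restricted to the anomalous microservices."""
    graph = {service: set() for service in anomalous}
    for caller, callee in calls:
        # Keep an edge only when both endpoints were detected as anomalous.
        if caller in graph and callee in graph:
            graph[caller].add(callee)
    return graph


# Hypothetical call relationships and detection result.
calls = [
    ("gateway", "account-service"),
    ("account-service", "statistics-service"),
    ("notification-service", "account-service"),
]
anomalous = {"gateway", "account-service", "statistics-service"}

graph = build_call_graph(calls, anomalous)
for caller in sorted(graph):
    print(caller, "->", sorted(graph[caller]))
```

Every anomalous service appears as a key even when it calls nothing, which keeps later graph algorithms from having to special-case missing nodes.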

Fifth step, anomaly root cause localization:

When node A has a link pointing to node B, node B is considered to gain a score whose size depends on the importance of node A: the more important node A is, the higher the score node B gains. Computing the scores is an iterative process; in the end the nodes are ranked by their scores and the ranked result is delivered to the user. This quantified score is the PR value.

(1) Initialization phase: every node $i$ is given an initial value $PR_i(0)$ $(i = 1, 2, 3, \ldots, N)$, where $N$ is the number of nodes, satisfying $\sum_{i=1}^{N} PR_i(0) = 1$;

(2) Update phase: $PR(p_i) = \frac{1-q}{N} + q \sum_{p_j} \frac{PR(p_j)}{L(p_j)}$, where the sum runs over the nodes $p_j$ that point to node $i$, $L(p_j)$ is the number of outgoing links of node $p_j$, and $q \in (0,1)$ is the random walk coefficient;

(3) Anomaly localization: the PR value of each node is computed recursively and the PR values of the nodes are sorted; the higher the PR value of a microservice, the greater its degree of anomaly.
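The three phases can be sketched with a plain iterative PageRank. This sketch assumes the standard normalized update rule, a random walk coefficient of `q=0.85` (a conventional default, not a value given in the text), and that the score of a node with no outgoing calls is spread evenly over all nodes. The function name and the example graph are hypothetical.

```python
def pagerank(graph, q=0.85, tol=1e-9, max_iter=100):
    """Iterative PageRank over {caller: set_of_callees}; every callee must also be a key."""
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    # (1) Initialization: equal scores that sum to 1.
    pr = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Assumption: nodes without outgoing calls share their score with everyone.
        dangling = sum(pr[u] for u in nodes if not graph[u])
        new = {v: (1.0 - q) / n + q * dangling / n for v in nodes}
        # (2) Update: each caller passes its score to its callees in equal parts.
        for u in nodes:
            for v in graph[u]:
                new[v] += q * pr[u] / len(graph[u])
        delta = sum(abs(new[v] - pr[v]) for v in nodes)
        pr = new
        if delta < tol:
            break
    return pr


# Hypothetical graph of anomalous services: gateway calls account, account calls statistics.
graph = {
    "gateway": {"account-service"},
    "account-service": {"statistics-service"},
    "statistics-service": set(),
}
scores = pagerank(graph)
# (3) Localization: the highest PR value marks the most likely root cause.
print(sorted(scores, key=scores.get, reverse=True))
# ['statistics-service', 'account-service', 'gateway']
```

Because edges run from caller to callee, score accumulates on the services that others depend on, which matches the idea that an anomaly in a callee propagates upward to its callers.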

The invention has the following advantages over the prior art:

1. A statistical model is constructed from the historical monitoring data of microservice features, and anomalous features are detected automatically by predicting feature values, without manual analysis of microservice features using domain knowledge;

2. A call graph of the microservices is built to model the interactions between them, which captures how anomalies propagate between microservices;

3. The anomaly score of each microservice is assessed with the PageRank algorithm, so that anomaly propagation is taken into account and the root cause of an anomaly in the microservice software system is located accurately.

Brief description of the drawings

Fig. 1 shows the environment in which the method of the embodiment of the present invention is used.

Specific embodiment

The present invention is described in detail below with reference to a specific embodiment and the drawing. As shown in Fig. 1, the method of the embodiment proceeds as follows:

Piggy Metrics (https://github.com/sqshq/PiggyMetrics) was chosen as the experimental subject. Piggy Metrics is a microservice application that provides personal finance functions. It follows microservice design principles, is implemented with the microservice development frameworks Spring Boot and Spring Cloud, and its code is currently hosted on GitHub. In the Piggy Metrics architecture, API Gateway interacts directly with the Client: it receives Client requests, performs load balancing and routing, calls multiple back-end services to process a request, and returns the aggregated result to the Client. Account Service provides account management, including account registration and queries. Statistics Service implements personal finance statistics, including income, expenses, and balance. Notification Service stores users' contact information and notification settings and periodically sends messages to subscribed users. Auth Service implements access control for the services based on OAuth2 tokens. Application configuration is managed centrally with Spring Cloud Config, and each service component obtains its configuration parameters from Config Service at startup. Each service component can run multiple instances; the application uses Netflix Eureka as the service discovery component to track the running state of service instances dynamically. Service components communicate through RESTful APIs, and Ribbon serves as the client-side load balancer, working with the Eureka service discovery framework to control the load-balancing strategy flexibly. In addition, the application stores its data in a decentralized way: each service stores its data in its own MongoDB.

The experimental environment consists of the microservice application Piggy Metrics, a load generator, an anomaly injector, and the anomaly diagnosis system. Piggy Metrics runs on Docker, with its containers orchestrated by Docker Compose. The load generator uses the testing tool Apache JMeter (http://jmeter.apache.org/) to simulate user requests and generate load. The anomaly injector injects anomalies into the system through preset scripts in order to test the diagnostic effectiveness of the anomaly diagnosis system. The anomaly diagnosis system is the microservice anomaly diagnosis system designed and implemented herein. To generate load with Apache JMeter, a test plan is created first; a thread group is then created for each scenario, with the HTTP request parameters and the result display mode configured; each group has 100 threads, a loop count of 100, and a scheduler duration of 90 s; finally the threads are started to simulate the load.

The effectiveness of the proposed method is verified by injecting anomalies. During the experiment, anomalies are injected into the system one at a time, and the proposed method is then used for diagnosis. Each load lasts 90 s; the anomaly is injected at 30 s and removed after lasting 30 s, while the anomaly diagnosis system collects the system's execution information and performs the diagnosis. The experimental steps are as follows:

1) Reset the system: each anomaly is monitored and diagnosed separately. To prevent experiments from influencing each other, after each experiment the environment is first cleaned up and junk data deleted, and the microservice application cluster is then uninstalled and reinstalled. This can be done conveniently with the container orchestration tool Docker Compose (https://github.com/docker/compose).

2) Start the anomaly diagnosis system: start the anomaly diagnosis system and debug it to check that it runs normally.

3) Start the load generator: start Apache JMeter, set the load configuration, and initiate the load.

4) Inject the anomaly: an anomaly that can recover automatically, such as pausing a container, is injected after the load has run for 30 s and removed after lasting 30 s; an unrecoverable anomaly is injected when the system starts and lasts until the load completes.
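The injection timing in step 4) can be written down as a small helper. This is only a sketch of the schedule stated above; the function name and parameter names are hypothetical, and the actual injection scripts are not described in the text.

```python
def injection_window(recoverable, load_duration=90, delay=30, hold=30):
    """Return (start, end) in seconds after load start during which the anomaly is active."""
    if recoverable:
        # Inject after `delay` seconds of load, remove after `hold` more seconds.
        return delay, min(delay + hold, load_duration)
    # Unrecoverable anomalies are present from the start until the load completes.
    return 0, load_duration


print(injection_window(recoverable=True))   # (30, 60)
print(injection_window(recoverable=False))  # (0, 90)
```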

Microservice anomaly diagnosis comprises the following steps:

1) The distributed monitoring software Zabbix (https://www.zabbix.com/) is used to collect the resource usage features of the containers, including CPU, memory, and network utilization. The sliding window size is set to 10, monitoring data are collected every 30 seconds, and one sliding window is established for each feature;

2) Using the formula of the second step in the summary of the invention, the next value of each feature, such as CPU, memory, and network utilization, is predicted from the monitoring data in the sliding window;

3) Using the formula of the third step in the summary of the invention, the variance of each feature, such as CPU, memory, and network utilization, is calculated, and the difference between the monitored value and the predicted value is computed for each feature; if the difference is greater than 3 times the variance, the microservice is detected as anomalous;

4) The call relationships among the detected anomalous microservices are described as a directed graph: each microservice is a node, and a call from microservice A to microservice B is a directed edge from node A to node B;

5) Following the fifth step in the summary of the invention, the PR value of each microservice is calculated and the PR values are sorted in descending order; the top-ranked microservices are the root cause of the anomaly.

Claims

1. A method for detecting anomalies in microservice software systems based on call graph analysis, characterized by the following implementation steps:

First step, feature monitoring: the features of each microservice are monitored, a sliding window of length $N$ is established for each feature, and the time-series monitoring data $(y_t, y_{t-1}, \ldots, y_{t-N+1})$ are placed into the sliding window in order, where $y_i$ is the monitoring data at time $i$;

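Second step, feature prediction: for each feature, the value at time $(t+T)$ is predicted with the single moving average over the sliding window, $y'_{t+T} = M_t$, where $M_t = \frac{1}{N}\left(y_t + y_{t-1} + \cdots + y_{t-N+1}\right)$;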

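Third step, anomalous feature detection: the variance of the feature is calculated, and the difference between the monitored value $y_{t+T}$ and the predicted value $y'_{t+T}$ is compared against it, where $y'_{t+T}$ is the predicted value of the feature at time $(t+T)$ and $y_{t+T}$ is the monitored value of the feature at time $(t+T)$; if the difference exceeds the threshold, the feature is detected as anomalous, and the microservice to which the anomalous feature belongs is identified as an anomalous microservice;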

Fourth step, microservice call graph construction: based on knowledge of how the microservice software system is constructed, or by monitoring the calls between microservices, the call relationships among the detected anomalous microservices are described as a directed graph, in which each microservice is a node and a call from microservice A to microservice B is a directed edge from node A to node B;

Fifth step, anomaly root cause localization: every node $i$ is given an initial value $PR_i$ $(i = 1, 2, 3, \ldots, N)$, where $N$ is the number of nodes, satisfying $\sum_{i=1}^{N} PR_i = 1$; the PR value of each node $p_i$ is computed recursively, $PR(p_i) = \frac{1-q}{N} + q \sum_{p_j} \frac{PR(p_j)}{L(p_j)}$, where the sum runs over the nodes $p_j$ that point to node $i$, $L(p_j)$ is the number of outgoing links of node $p_j$, and $q \in (0,1)$ is the random walk coefficient; the PR values of the nodes are sorted, and the higher the PR value of a microservice, the greater its degree of anomaly.