CN101394314B - Fault positioning method for Web application system - Google Patents

Info

Publication number
CN101394314B
CN101394314B CN2008101199727A CN200810119972A CN101394314B CN 101394314 B CN101394314 B CN 101394314B CN 2008101199727 A CN2008101199727 A CN 2008101199727A CN 200810119972 A CN200810119972 A CN 200810119972A CN 101394314 B CN101394314 B CN 101394314B
Authority
CN
China
Prior art keywords
fault
fault point
module
detection
web application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008101199727A
Other languages
Chinese (zh)
Other versions
CN101394314A (en)
Inventor
邱雪松
成璐
龙会湖
亓峰
孟洛明
王颖
刘会永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN2008101199727A
Publication of CN101394314A
Application granted
Publication of CN101394314B
Expired - Fee Related
Anticipated expiration

Abstract

A fault location method for a Web application system comprises the following steps: (1) building an extended dependency matrix model: abstracting all fault points and available probes in the system, determining the dependencies between the fault points and the probes, and describing them in an extended dependency matrix; (2) executing probes: sending probe requests to the target Web application system, analyzing the returned results, determining whether each probe succeeded or failed, and collecting the probe result set as symptom data for fault inference; and (3) performing fault inference: based on the extended dependency matrix and the probe result set, marking the fault points that failed probes depend on as possibly faulty, excluding the fault points that successful probes depend on, and then evaluating how likely each remaining fault point is to have failed and ranking the fault points from most to least likely.

Description

A fault location method for a Web application system
Technical field
The present invention relates to the field of fault location and detection, and in particular to a method for locating application-layer faults in a Web application system by active probing.
Background art
A Web application system is an application system built on Web technology: the user interface is provided through a browser, data and business logic are provided by database servers and application servers, and the user and the server interact over HTTP (HyperText Transfer Protocol), HTTPS (Secure Hypertext Transfer Protocol) or other standard protocols. With the development of Web technology, the maturing of Web application frameworks and the improvement of network conditions, Web application systems are now widely used and have become increasingly large and complex, and fault diagnosis for them is receiving growing attention.
A Web application system is generally a multi-tier distributed system, and the faults it faces may come from different layers, such as application server faults, network faults and business logic faults. Previous studies show that most service failures in Web application systems stem from business configuration errors made during maintenance [3]. Such errors generally do not bring down the whole system; instead they change the system's business logic so that some Web applications behave incorrectly at the semantic level, for example an authorized user cannot log in, or goods cannot be added to the shopping cart. A fault whose symptom is a semantic error is called an application-layer fault. Application-layer faults are the most common faults and the ones service providers care about most. They cannot be detected by network-layer means such as ping or protocol-layer means such as an HTTP request; detection at the semantic level is required.
Current fault diagnosis for Web application systems mainly infers the position of the faulty module by monitoring requests inside the system and how they execute. This is a passive approach: it often requires modifying the system, or adding extra components and links for transferring system state data, so that the system actively reports its internal state to the monitoring system. It is complex to implement and hard to maintain.
The active fault diagnosis approach of the present invention sends service requests to the Web application system in order to probe its internal state. By the definition of an application-layer fault, the results of Web service requests are direct symptom data of such faults. From the symptom data gathered by probing, combined with the association between service requests and the components that implement the business logic, the location of the fault in the Web application system can be inferred. Moreover, because active probing does not require changing the architecture or functional definitions of the Web application system itself, it is largely independent of that system: a diagnostic system can be built at low cost, is simple to implement, and adapts quickly to changes in the business logic configuration, which improves the efficiency and accuracy of fault diagnosis.
The paper "Pinpoint: problem determination in large, dynamic Internet services" [1], published in 2002 in the "Proceedings of the International Conference on Dependable Systems and Networks", discusses Pinpoint, a fault diagnosis method based on data mining. It monitors the messages passed between all Internet service modules and traces the modules that each user request passes through; for each request it records the user request ID, the modules traversed and the execution result. When a fault occurs, the recorded data is analyzed. Pinpoint has the following shortcomings: 1) the message middleware of the Web application system must be changed so that it reports internal messages, which means modifying the managed system; this is often infeasible in practice and raises the cost of deploying and maintaining fault diagnosis; 2) in actual operation the managed system produces a large amount of complex message data, most of which is useless, which places a data processing load on the diagnostic system; 3) a passive diagnostic system diagnoses after the fact and cannot diagnose a fault before it is presented to the user.
The paper "Distributed Diagnosis of Failures in a Three Tier E-Commerce System" [2], published in 2007 at the "26th IEEE International Symposium on Reliable Distributed Systems", discusses the Monitor method. This method monitors not only the execution results of all service requests but also traces how each service request is executed. It maintains a logical clock for each module to record the order in which modules call or are called during the execution of a service. From these logical times a causal graph of message passing between modules can be generated. When a failure occurs, probabilistic inference is performed on the current causal graph. Monitor needs to know how the internal methods of each component are called and maintains a state diagram for each component, which greatly increases the complexity of the algorithm. Its mechanism is too complex: it is hard to implement, and its deployment and maintenance costs are high.
The above technical content is incorporated into this application by reference.
Summary of the invention
The object of the present invention is to provide a method for locating application-layer faults in a Web application system by active probing, which can locate the faulty application-layer module simply, effectively and quickly without modifying the Web application system, and which adapts quickly to changes in the business logic configuration of the Web application system.
To achieve this object, the invention provides a fault location method for a Web application system, implemented by the following technical solution:
A fault location method for a Web application system, comprising:
Building the extended dependency matrix model: abstracting all fault points and available probes in the system, determining the correspondence between the fault points and the probes, and describing it in an extended dependency matrix.
Executing probes: sending probe requests to the target Web application system, analyzing the returned results, determining whether each probe succeeded or failed, and obtaining the probe result set as symptom data for fault inference.
Fault inference: based on the extended dependency matrix model and the probe result set, the fault points that failed probes depend on are filtered in as fault points that may have failed, and the fault points that successful probes depend on are excluded. For the resulting set of fault points, the likelihood of each fault is evaluated and the fault points are arranged in descending order of likelihood.
The fault location method of the present invention performs inference on symptom data obtained by active probing, using an extended dependency matrix. Besides the association between service request probes and the business logic of modules that the traditional dependency matrix [4] captures, the extended dependency matrix also captures the association between service request probes and the call relations between modules, so application-layer faults can be located with high precision. Obtaining symptom data by active probing is largely independent of the Web application system: there is no need to trace messages inside the Web application system or to modify it, so the diagnostic system is easy to maintain, adapts quickly to changes in the Web business logic configuration, and is simple to implement.
Description of drawings
The accompanying drawings described here provide a further understanding of the invention and form part of this application; they do not limit the invention. In the drawings:
Fig. 1 is a flow chart of the fault location method for a Web application system of the present invention;
Fig. 2 is a detailed flow chart of building the extended dependency matrix model;
Fig. 3 is a diagram of an example of dependency relations;
Fig. 4 is a detailed flow chart of filtering possible fault points;
Fig. 5 is a detailed flow chart of sorting possible fault points.
Embodiment
To make the objects, technical solutions and advantages of the invention clearer, specific embodiments are described in detail below with reference to the drawings. The illustrative embodiments and their description serve to explain the invention and do not limit it.
The invention provides an active method for quickly locating faulty modules in a Web application system. Fig. 1 is a flow chart of the method. As shown in Fig. 1, the method has three main steps: step 101, build the extended dependency matrix model; step 102, execute probes; step 103, fault inference. Each is described below.
(1) Step 101: build the extended dependency matrix model
Most fault location methods build a fault propagation model as the basis for fault location. A fault propagation model has two key concepts: (1) symptom: the information reflecting the current state of the system that appears when problems of function or business logic arise inside the system; (2) fault: the root cause of the symptoms. Although faults are generally not directly observable, the root of a fault can be determined by reasoning from external, observable symptoms.
A Web application system usually consists of a group of mutually independent functional modules that cooperate to implement the system's business logic. A module generally exposes only a limited set of interfaces and often interacts with other modules through standard message middleware. When a module of the Web system fails, the results of the Web service requests associated with that module are external business symptoms of the fault that can be observed directly.
On this premise, the invention models the concrete Web application system: from the dependencies between functional modules and service requests it builds a model of the correspondence between the possible fault points in the system and the service requests, that is, the extended dependency matrix model, which is then used to locate module faults. As shown in the detailed flow chart of Fig. 2, building the extended dependency matrix model mainly comprises the following steps:
Step 201: obtain the set of functional modules of the Web application system; the modules are mutually independent.
Mutual independence of modules means that each module encapsulates part of the system's business functions, has its own independent state, and exposes only a limited set of interfaces or interacts with other modules through standard message middleware.
Step 202: enumerate the set of all available probe requests. The probes are mutually independent, and the probe set should cover all modules and all call relations between modules as far as possible.
A probe is one complete end-to-end interaction in the Web application system: the client initiates a Web service request, and the server processes the request and returns a result to the client. If the result returned by the interaction conforms to the system's business logic design, the probe is considered successful; otherwise it is considered failed. For example, the "add goods to shopping cart" probe of an e-commerce site works as follows: the user selects goods on a Web page and clicks "add to shopping cart", and the server responds with a result: added successfully, failed to add, or some other exception. The probe is considered successful if the goods are on sale and in stock and the result is a successful addition, or if the goods are temporarily out of stock and the result is a failed addition; otherwise the probe is considered failed.
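The success rule for the "add goods to shopping cart" probe can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name judge_add_to_cart_probe and the response labels "added" and "add_failed" are hypothetical and do not come from the patent.

```python
def judge_add_to_cart_probe(in_stock: bool, response: str) -> bool:
    """Return True if the probe succeeded, i.e. the server's answer
    matches the designed business logic for the current stock state.

    The labels "added" and "add_failed" are assumed for illustration.
    """
    if response == "added":
        # Adding should only succeed when the goods are in stock.
        return in_stock
    if response == "add_failed":
        # A refusal is the correct behavior when the goods are out of stock.
        return not in_stock
    # Any other answer (exception page, timeout marker, ...) is a failed probe.
    return False
```

Note that a probe "succeeds" when the observed behavior is the designed behavior, so a refused addition for out-of-stock goods counts as a successful probe.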
Mutual independence of probes means that the success or failure of any probe A does not depend on any other probe B and relates only to the functional modules that provide the business logic of probe A. That is, no probe, whatever its result, changes the subsequent behavior of the system. Web application systems usually satisfy this condition.
The probe request set means, for a concrete Web application system, all Web requests that a real user could send to it. The more kinds of probe requests there are, the more completely the functional modules and inter-module call relations of the Web application system are covered, and the more accurately module faults can be located.
Step 203: abstract single-module fault points and module-call fault points from the module set and from the call relations between modules, respectively.
To locate faults, a group of fault points (the smallest units of fault location) must be abstracted from the target system. The fault points should be mutually independent, and each fault point is in one of two states: faulty or not faulty.
The fault points of the invention fall into two classes: single-module fault points and module-call fault points. A single-module fault means that a module has an internal logic fault, so that every call to it from any other module fails. A module-call fault means that a particular outgoing or incoming call of a module has failed, so that only that call to or from a specific module fails.
In the invention, each independent functional module of the Web application system is mapped to a single-module fault point, and each inter-module call relation is mapped to a module-call fault point. When two modules have several different call relations, that is, one module calls the other in different ways, each call relation is abstracted as its own module-call fault point. For example, if module A calls module B in two ways, the abstraction yields the module-call fault points Fab.1 and Fab.2.
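The mapping from modules and call relations to fault points can be sketched as follows. This is a minimal illustration, assuming each call relation is given as a (caller, callee) pair and repeated pairs stand for distinct ways of calling; the function name abstract_fault_points is hypothetical, and the numbering scheme simply follows the Fab.1, Fab.2 example above.

```python
from collections import Counter


def abstract_fault_points(modules, calls):
    """Derive fault point names from modules and call relations.

    modules: list of module names, e.g. ["A", "B"]
    calls:   list of (caller, callee) pairs, one entry per distinct
             call relation (assumed input format)
    """
    # One single-module fault point per module: A -> Fa.
    single = [f"F{m.lower()}" for m in modules]

    totals = Counter(calls)   # how many call relations each pair has
    seen = Counter()          # running index within each pair
    call_points = []
    for caller, callee in calls:
        seen[(caller, callee)] += 1
        base = f"F{caller.lower()}{callee.lower()}"
        if totals[(caller, callee)] > 1:
            # Several call relations between the same modules are numbered.
            call_points.append(f"{base}.{seen[(caller, callee)]}")
        else:
            call_points.append(base)
    return single, call_points
```

For modules A and B with two call relations from A to B, this yields the single-module fault points Fa and Fb and the module-call fault points Fab.1 and Fab.2.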
Step 204: determine the dependencies between the probe request set and the fault point set.
By the business logic it exercises, each probe always depends on the functional modules that provide that business logic and is unrelated to the other modules. The probe therefore has a dependency on the fault points corresponding to those functional modules. On the one hand, the state of a fault point it depends on affects the result of the probe (success or failure); on the other hand, the result of the probe reflects the running state (faulty or not faulty) of the fault points it depends on. Fault location in the invention analyzes which module is faulty from the probe results and the dependencies between probes and fault points.
Step 205: build the extended dependency matrix from the dependencies between the probe request set and the fault points.
These dependencies give a model of the correspondence between the positions in the system where a functional module may fail and the service requests.
A traditional dependency matrix is formed from the probe request set and the set of single-module fault points.
Table 1 shows an example of a dependency matrix.
      F1   F2   F3   F4   F5
P1     1    0    1    0    0
P2     1    0    1    0    0
P3     1    0    1    1    0
P4     1    1    1    0    1
As Table 1 shows, the dependency matrix contains five fault points F1 to F5 and four probes P1 to P4. A value of 1 at an intersection means that the corresponding probe depends on the corresponding fault point; a value of 0 means there is no dependency. P1 is realized by F1 calling F3, so P1 depends on F1 and F3. If F1 or F3 fails, P1 will very probably fail; conversely, if P1 fails, at least one of F1 and F3 has failed.
Under the dependency matrix model, if a probe fails, it can be inferred that at least one of the module fault points the probe depends on has failed. If a probe succeeds, however, it cannot be inferred that none of the module fault points it depends on has failed. A module-call fault, for example, shows up only in one specific call context and does not affect the other services of the module. In that case no fault location information can be obtained from a successful probe.
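The reading of Table 1 can be sketched in a few lines. The matrix values are taken from Table 1; the names FAULTS, MATRIX, depends_on and suspects are hypothetical and chosen only for this illustration.

```python
# Traditional dependency matrix of Table 1: rows are probes,
# columns are the single-module fault points F1..F5.
FAULTS = ["F1", "F2", "F3", "F4", "F5"]
MATRIX = {
    "P1": [1, 0, 1, 0, 0],
    "P2": [1, 0, 1, 0, 0],
    "P3": [1, 0, 1, 1, 0],
    "P4": [1, 1, 1, 0, 1],
}


def depends_on(probe):
    """Fault points whose matrix entry for this probe is 1."""
    return {fault for fault, bit in zip(FAULTS, MATRIX[probe]) if bit}


def suspects(failed_probes):
    """Union of the fault points that the failed probes depend on.

    In the traditional model successful probes exclude nothing,
    so only failed probes contribute to the result.
    """
    result = set()
    for probe in failed_probes:
        result |= depends_on(probe)
    return result
```

For example, a failure of P1 alone leaves F1 and F3 as suspects, and the traditional model offers no way to narrow this down using the probes that succeeded.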
The extended dependency matrix is an extension of the traditional dependency matrix. It is formed from the probe request set and the fault point set, where the fault point set contains module-call fault points as well as single-module fault points. If a probe depends on a functional module, the probe depends on the single-module fault point of that module; if a probe causes a call between functional modules to be executed, the probe depends on the single-module fault points of those modules and on the module-call fault point of that call.
Under the extended dependency matrix model, if a probe fails, it can be inferred that at least one of the fault points the probe depends on has failed. Conversely, if a probe succeeds, then because the call context of the probe is fixed, it can be inferred that none of the fault points the probe depends on has failed. Compared with the traditional dependency matrix, the accuracy of fault location improves markedly.
In the dependency example of Fig. 3, there are three modules A, B and C and three probes P1 to P3. P1 is realized by A calling B; P2 is realized by A calling interface 1 of C; P3 is realized by A calling interface 2 of C. P1 therefore depends on the single-module fault points Fa and Fb and the call fault point Fab; P2 depends on the single-module fault points Fa and Fc and the call fault point Fac1; P3 depends on the single-module fault points Fa and Fc and the call fault point Fac2.
Table 2 shows the traditional dependency matrix generated from the dependency example of Fig. 3.
      Fa   Fb   Fc
P1     1    1    0
P2     1    0    1
P3     1    0    1
Table 3 shows the extended dependency matrix generated from the dependency example of Fig. 3.
      Fa   Fb   Fc   Fab  Fac1 Fac2
P1     1    1    0    1    0    0
P2     1    0    1    0    1    0
P3     1    0    1    0    0    1
As Tables 2 and 3 show, suppose the probe results are: P1 succeeds, P2 fails, P3 succeeds. Under the traditional dependency matrix model, the failure of P2 implies that at least one of Fa and Fc has failed; the success of P1 shows that A called B successfully; the success of P3 shows that A called C successfully. If the successes of P1 and P3 were taken to mean that Fa, Fb and Fc have not failed, this would contradict the conclusion drawn from the failure of P2. Because the call context of a fault point is not represented in the traditional dependency matrix, successful probes cannot be used as evidence for fault inference there. Under the extended dependency matrix model, the failure of P2 implies that at least one of Fa, Fc and Fac1 has failed; the success of P1, whose call context is fixed, implies that the fault points Fa, Fb and Fab that P1 depends on have not failed; the success of P3, whose call context is fixed, implies that the fault points Fa, Fc and Fac2 that P3 depends on have not failed. Combining these inferences, the only fault point that may have failed is Fac1, that is, the call from module A to interface 1 of module C has failed, which in turn is consistent with the results of P1 to P3. The extended dependency matrix model therefore locates faults much more accurately than the traditional dependency matrix model.
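The inference on the Fig. 3 example can be reproduced with a short sketch. The dependency sets correspond to the rows of the extended dependency matrix described for Fig. 3 (with the call fault points Fab, Fac1 and Fac2); the names EXTENDED and locate are hypothetical.

```python
# Extended dependency matrix of the Fig. 3 example, stored as
# probe -> set of fault points the probe depends on.
EXTENDED = {
    "P1": {"Fa", "Fb", "Fab"},
    "P2": {"Fa", "Fc", "Fac1"},
    "P3": {"Fa", "Fc", "Fac2"},
}


def locate(results):
    """results: probe name -> True (probe succeeded) / False (probe failed).

    Failed probes make their fault points suspects; successful probes
    clear theirs. What remains is the set of possible fault points.
    """
    suspected, cleared = set(), set()
    for probe, ok in results.items():
        if ok:
            cleared |= EXTENDED[probe]
        else:
            suspected |= EXTENDED[probe]
    return suspected - cleared


print(locate({"P1": True, "P2": False, "P3": True}))  # {'Fac1'}
```

With P1 and P3 successful and P2 failed, Fa and Fc are cleared by P3, so only Fac1 remains, matching the conclusion above.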
(2) Step 102: execute probes
Once the extended dependency matrix model has been built, executing the probe request set of the matrix yields a set of probe results that reflects the current state of the system and serves as the basis for fault inference.
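A probe run can be sketched as a loop that sends each request and judges the response. This is a hedged illustration rather than the patent's implementation: the pairing of a send callable with an is_expected predicate, and the name run_probes, are assumptions, and the actual HTTP client is left to the caller.

```python
from typing import Callable, Dict, Tuple

# A probe is modeled as (send, is_expected):
#   send()                -> issues the Web request and returns the response
#   is_expected(response) -> True if the response matches the business logic
Probe = Tuple[Callable[[], object], Callable[[object], bool]]


def run_probes(probes: Dict[str, Probe]) -> Dict[str, bool]:
    """Execute every probe and return probe name -> success flag."""
    results = {}
    for name, (send, is_expected) in probes.items():
        try:
            response = send()
            results[name] = bool(is_expected(response))
        except Exception:
            # A request that raises (connection error, timeout, ...) is
            # treated here as a failed probe.
            results[name] = False
    return results
```

The returned dictionary plays the role of the probe result set that the inference step consumes.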
(3) Step 103: fault inference
After the probe request set of the matrix has been executed, the invention analyzes the returned results and, using the dependencies between probes and fault points in the extended dependency matrix, filters out the set of fault points that may have failed. It then evaluates the likelihood of each fault point in that set and outputs the fault points in descending order of likelihood. The user can take the first one or the first several results as the final result of fault inference.
The inference method of the invention comprises two main steps. (1) Filtering possible fault points: from the probe result set Ps and the extended dependency matrix model, filter out all fault points that may have failed, together with the number of failed probes for each fault point. Failed probes indicate fault points that may have failed; successful probes rule fault points out. (2) Sorting possible fault points: output the fault points in descending order of their failed-probe counts; the count reflects how likely each fault point is to have failed.
Fig. 4 is the detailed flow chart of filtering possible fault points:
Step 401: initialize a two-dimensional filter result Fs, with its first dimension (fault points) empty and its second dimension (counters Cf) cleared; initialize the set Ss of fault points that successful probes depend on as empty; go to step 402;
Step 402: check whether the probe result set Ps is empty; if so, go to step 408, otherwise go to step 403;
Step 403: take one probe result P from Ps; go to step 404;
Step 404: check whether P is a successful probe; if so, go to step 405, otherwise go to step 406;
Step 405: using the extended dependency matrix model, add the fault points that P depends on to Ss; go to step 407;
Step 406: using the extended dependency matrix model, add the fault points that P depends on to Fs and increment the counter of each of these fault points by 1; go to step 407;
Step 407: delete P from Ps and return to step 402;
Step 408: check whether Ss is empty; if so, go to step 412, otherwise go to step 409;
Step 409: take one fault point F from Ss; go to step 410;
Step 410: remove the fault point equal to F from the filter result Fs and clear its counter; go to step 411;
Step 411: delete F from Ss and return to step 408;
Step 412: end the filtering flow and return the filter result Fs.
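The filtering flow of steps 401 to 412 can be condensed into a short function. This is a sketch under assumptions: the probe result set is represented as a dictionary of success flags, the matrix as a dictionary of dependency sets, and the names filter_fault_points and FIG3_MATRIX are hypothetical. The loops over Ps and Ss replace the explicit take-and-delete steps of the flow chart.

```python
def filter_fault_points(probe_results, matrix):
    """probe_results: probe name -> True (success) / False (failure)
    matrix:        probe name -> set of fault points it depends on
    Returns Fs as a dict: possible fault point -> failed-probe count Cf.
    """
    fs = {}      # step 401: filter result with counters Cf
    ss = set()   # step 401: fault points relied on by successful probes

    for probe, ok in probe_results.items():      # steps 402-407
        for fault in matrix[probe]:
            if ok:
                ss.add(fault)                    # step 405
            else:
                fs[fault] = fs.get(fault, 0) + 1 # step 406

    for fault in ss:                             # steps 408-411
        fs.pop(fault, None)                      # step 410: exclude cleared points

    return fs                                    # step 412


# Example data: the extended dependency matrix of the Fig. 3 example.
FIG3_MATRIX = {
    "P1": {"Fa", "Fb", "Fab"},
    "P2": {"Fa", "Fc", "Fac1"},
    "P3": {"Fa", "Fc", "Fac2"},
}
```

With P1 and P3 successful and P2 failed, the function returns only Fac1 with a count of 1, in line with the Fig. 3 discussion.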
Fig. 5 is the detailed flow chart of sorting possible fault points:
Step 501: initialize a two-dimensional sorted result queue Fq, with its first dimension (fault points) empty and its second dimension (failed-probe counts Cq) cleared; go to step 502;
Step 502: check whether the filter result Fs is empty; if so, go to step 505, otherwise go to step 503;
Step 503: select the fault point F in Fs with the largest counter value Cf, append F to the tail of Fq and assign Cf to Cq; go to step 504;
Step 504: delete F from Fs, clear its counter Cf, and return to step 502;
Step 505: end the sorting flow and return the sorted queue Fq.
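The sorting flow of steps 501 to 505 is a repeated selection of the maximum. A minimal sketch follows, assuming Fs is a dictionary from fault point to failed-probe count; the name rank_fault_points is hypothetical.

```python
def rank_fault_points(fs):
    """fs: fault point -> failed-probe count Cf (the filter result).
    Returns Fq as a list of (fault point, count) in descending count order.
    """
    remaining = dict(fs)   # work on a copy so the caller's Fs is untouched
    fq = []                # step 501: empty result queue

    while remaining:                                # step 502
        fault = max(remaining, key=remaining.get)   # step 503: largest Cf
        fq.append((fault, remaining[fault]))        # append to tail, Cq = Cf
        del remaining[fault]                        # step 504
    return fq                                       # step 505
```

The same ordering could be obtained with a single call to sorted on the counts; the loop is kept here only because it mirrors the flow chart step by step.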
The following is an embodiment of fault location for an e-commerce Web application system. The system is a simple electronic retail system. The embodiment covers only one shopping-cart business flow, in which a customer logs in, shops, submits the shopping cart as an order and logs out, so it may contain fewer system components than a real electronic retail system. Following the fault location method of the invention, the system is first modeled and an extended dependency matrix is abstracted; probes are then executed and the faults are analyzed from the probe results.
(1) Building the extended dependency matrix model comprises the following five steps:
Step 1: obtain the set of mutually independent functional modules of the Web application system.
The mutually independent components or modules of this embodiment are:
A---customer module: stores and processes customer information;
B---warehouse module: stores and processes goods information;
C---controller module: receives messages from the external user interface and forwards them to the functional modules;
D---shopping cart module: stores and processes the shopping cart information of the customer's current session;
E---order module: stores and processes the order information of the current session;
The functions of these five modules are mutually independent.
Step 2: enumerate all available separate probe requests thereby set.
According to the service logic of ecommerce shopping, determine to carry out and separate detection set: the user can be carried out the detection that all operations can be carried out as us, and the set of present embodiment row probe requests thereby is as follows:
P0---customer logs in: the controller module (C) receives the user's login information and sends it to the customer module (A) for verification; if verification passes, a customer session instance is created;
P1---add the first item to the shopping cart: in response to the user's first add-item request, the controller module (C) calls the shopping cart module (D) to create a shopping cart session instance for the customer, then obtains the product information from the warehouse module (B) and adds it to the shopping cart module (D);
P2---add an item to the shopping cart: in response to the user's add-item request, the controller module (C) obtains the product information from the warehouse module (B) and adds it to the shopping cart module (D);
P3---remove an item from the shopping cart: in response to the user's remove-item request, the controller module (C) removes the item from the shopping cart module (D) and returns the removed quantity to the warehouse module (B);
P4---submit the shopping cart: in response to the user's submit request, the controller module (C) submits to the order module (E) the items held in the shopping cart session instance of the shopping cart module (D) and the customer information held in the customer session instance of the customer module (A), and creates the user's order; after the items are submitted, the shopping cart session instance is emptied;
P5---customer logs out: in response to the user's request, the controller module (C) calls the customer module (A) to delete the customer session instance and calls the shopping cart module (D) to delete the shopping cart session instance.
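Each probe listed above is one end-to-end request whose response is checked against the expected business behavior. The following is a minimal sketch of how a probe run might be recorded, not the patent's implementation. The `Probe` type, the `run_probes` function, the request strings, and the stub `fake_send` transport are all hypothetical names introduced for illustration; a real deployment would replace `fake_send` with an HTTP client.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Probe:
    """One end-to-end request plus a check on its response (hypothetical type)."""
    name: str                        # probe identifier, e.g. "P0"
    request: str                     # opaque request description, e.g. "login"
    expected: Callable[[str], bool]  # True if the response matches the business logic


def run_probes(probes: Iterable[Probe], send: Callable[[str], str]) -> Dict[str, bool]:
    """Send every probe and record success (True) or failure (False).

    A transport error is counted as a failed probe.
    """
    results = {}
    for probe in probes:
        try:
            results[probe.name] = probe.expected(send(probe.request))
        except Exception:
            results[probe.name] = False
    return results


# Stub transport for illustration only: login works, adding an item errors out.
def fake_send(request: str) -> str:
    if request == "add_item":
        raise ConnectionError("cart service unreachable")
    return "session-created" if request == "login" else "unknown"


probes = [
    Probe("P0", "login", lambda body: body == "session-created"),
    Probe("P2", "add_item", lambda body: body == "item-added"),
]
symptoms = run_probes(probes, fake_send)
print(symptoms)
```

The resulting mapping from probe name to success or failure is the symptom data that the later reasoning step consumes.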
Step 3: Abstract fault points from the module set and the call relations between modules.
The five modules A to E are abstracted into five single-module fault points, Fa to Fe.
The call logic by which controller module C drives customer module A (login verification with creation of the customer session instance, and logout with deletion of the customer session instance) is abstracted as module-call fault point Fca;
The call logic by which controller module C reserves products, or cancels their reservation, in warehouse module B is abstracted as module-call fault point Fcb;
The call logic by which controller module C performs shopping cart operations on shopping cart module D (adding or removing items, submitting the cart) is abstracted as module-call fault point Fcd1;
The call logic by which controller module C creates or deletes the shopping cart session instance in shopping cart module D is abstracted as module-call fault point Fcd2;
The call logic by which controller module C submits the shopping cart to order module E to generate an order is abstracted as module-call fault point Fce.
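The naming in Step 3 can be sketched in code: one single-module fault point per module, one module-call fault point per call relation, and a numeric suffix when the same pair of modules has more than one call relation. The function and variable names below are assumptions made for illustration; only the module letters and the resulting fault point names come from the embodiment.

```python
from collections import Counter

MODULES = ["A", "B", "C", "D", "E"]

# (caller, callee, what the call does), in the order listed in Step 3.
CALL_RELATIONS = [
    ("C", "A", "login verification; create/delete customer session"),
    ("C", "B", "reserve products / cancel reservation"),
    ("C", "D", "cart operations: add, remove, submit"),
    ("C", "D", "create/delete shopping cart session instance"),
    ("C", "E", "generate order from submitted cart"),
]


def fault_points(modules, relations):
    """Return single-module fault points followed by module-call fault points."""
    singles = [f"F{m.lower()}" for m in modules]
    per_pair = Counter((caller, callee) for caller, callee, _ in relations)
    seen = Counter()
    calls = []
    for caller, callee, _ in relations:
        seen[(caller, callee)] += 1
        name = f"F{caller.lower()}{callee.lower()}"
        # Number the fault points only when a pair has several call relations.
        if per_pair[(caller, callee)] > 1:
            name += str(seen[(caller, callee)])
        calls.append(name)
    return singles + calls


print(fault_points(MODULES, CALL_RELATIONS))
```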
Step 4: Determine the dependencies between the probe request set and the fault point set.
P0 depends on single-module fault points Fa, Fc and on module-call fault point Fca;
P1 depends on single-module fault points Fb, Fc, Fd and on module-call fault points Fcb, Fcd2;
P2 depends on single-module fault points Fb, Fc, Fd and on module-call fault points Fcb, Fcd1;
P3 depends on single-module fault points Fb, Fc, Fd and on module-call fault points Fcb, Fcd1;
P4 depends on single-module fault points Fa, Fc, Fd, Fe and on module-call fault points Fca, Fcd1, Fce;
P5 depends on single-module fault points Fa, Fc, Fd and on module-call fault points Fca, Fcd2.
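A dependency list of this kind maps directly onto one row of a 0/1 matrix. The sketch below shows that conversion for the login probe P0 and the submit probe P4; the helper name `to_row` is an assumption made for illustration, and the column order is the fault point order used in this embodiment.

```python
FAULT_POINTS = ["Fa", "Fb", "Fc", "Fd", "Fe", "Fca", "Fcb", "Fcd1", "Fcd2", "Fce"]


def to_row(depends_on):
    """Turn a set of fault point names into a 0/1 row in FAULT_POINTS order."""
    unknown = set(depends_on) - set(FAULT_POINTS)
    if unknown:
        raise ValueError(f"unknown fault points: {sorted(unknown)}")
    return [1 if f in depends_on else 0 for f in FAULT_POINTS]


p0_row = to_row({"Fa", "Fc", "Fca"})
p4_row = to_row({"Fa", "Fc", "Fd", "Fe", "Fca", "Fcd1", "Fce"})
print(p0_row)
print(p4_row)
```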
Step 5: Build the extended dependency matrix.
From the dependencies between the probe request set and the fault point set, the following extended dependency matrix is obtained (1 means the probe depends on the fault point; 0 means it does not).
Probe                                        Fa  Fb  Fc  Fd  Fe  Fca  Fcb  Fcd1  Fcd2  Fce
P0  log in to the system                     1   0   1   0   0   1    0    0     0     0
P1  add the first item to the shopping cart  0   1   1   1   0   0    0    0     1     0
P2  add an item to the shopping cart         0   1   1   1   0   0    1    1     0     0
P3  remove an item from the shopping cart    0   1   1   1   0   0    1    1     0     0
P4  submit the shopping cart                 1   0   1   1   1   1    0    1     0     1
P5  log out of the system                    1   0   1   1   0   1    0    0     1     0
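For reference, the matrix can be held as a small table and queried in both directions. This is a sketch only: the rows are transcribed from the matrix as printed in Step 5, and the names `MATRIX`, `dependencies`, and `probes_for` are assumptions made for illustration.

```python
FAULT_POINTS = ["Fa", "Fb", "Fc", "Fd", "Fe", "Fca", "Fcb", "Fcd1", "Fcd2", "Fce"]

# Rows copied from the matrix as printed in Step 5.
MATRIX = {
    "P0": [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    "P1": [0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "P2": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P3": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P4": [1, 0, 1, 1, 1, 1, 0, 1, 0, 1],
    "P5": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
}


def dependencies(probe):
    """Row lookup: the fault points that a probe depends on."""
    return {f for f, bit in zip(FAULT_POINTS, MATRIX[probe]) if bit}


def probes_for(fault_point):
    """Column lookup: the probes that exercise a given fault point."""
    idx = FAULT_POINTS.index(fault_point)
    return [probe for probe, row in MATRIX.items() if row[idx]]


print(sorted(dependencies("P5")))
print(probes_for("Fcd1"))
```

The row lookup is what the reasoning step uses for each probe result; the column lookup shows at a glance which probes can reveal a given fault point.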
(2) Execute probes and perform fault reasoning analysis
An embodiment of fault reasoning analysis in which single-module fault point Fd of shopping cart module D fails is as follows:
The probe result set obtained after executing the probe request set is: P0---success, P1---failure, P2---failure, P3---failure, P4---failure, P5---failure.
Step 1: Filter the possible fault points. First, the set Fs obtained from the failed probes is {(Fa, 2), (Fb, 3), (Fc, 5), (Fd, 5), (Fe, 1), (Fca, 2), (Fcb, 2), (Fcd1, 3), (Fcd2, 2), (Fce, 1)}, where each count is the number of failed probes that depend on the fault point according to the extended dependency matrix. Next, the set Ss of fault points relied on by the successful probes is {Fa, Fc, Fca}; deleting these fault points from Fs gives the final filtering result Fs = {(Fb, 3), (Fd, 5), (Fe, 1), (Fcb, 2), (Fcd1, 3), (Fcd2, 2), (Fce, 1)}.
Step 2: Rank the possible fault points. Sorting Fs = {(Fb, 3), (Fd, 5), (Fe, 1), (Fcb, 2), (Fcd1, 3), (Fcd2, 2), (Fce, 1)} by count gives the final diagnosis result Fq in the order (Fd, 5), (Fb, 3), (Fcd1, 3), (Fcb, 2), (Fcd2, 2), (Fe, 1), (Fce, 1). Fd is therefore the most likely to have failed, that is, a single-module fault in module D is the most likely, which is consistent with the premise that single-module fault point Fd of shopping cart module D failed.
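The filtering and ranking just described can be reproduced with a few lines of code. The sketch below is one possible reading of the procedure, not the patent's implementation: it takes the matrix rows as printed in Step 5, counts failed probes per fault point, drops every fault point touched by a successful probe, and sorts by count. The function name `diagnose` and the tie-breaking rule (matrix column order) are assumptions made for illustration.

```python
from collections import Counter

FAULT_POINTS = ["Fa", "Fb", "Fc", "Fd", "Fe", "Fca", "Fcb", "Fcd1", "Fcd2", "Fce"]

# Rows copied from the matrix as printed in Step 5.
MATRIX = {
    "P0": [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    "P1": [0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "P2": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P3": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P4": [1, 0, 1, 1, 1, 1, 0, 1, 0, 1],
    "P5": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
}


def diagnose(results):
    """Rank candidate fault points from probe results (True means success)."""
    failed_counts = Counter()  # fault point -> number of failed probes relying on it
    cleared = set()            # fault points relied on by at least one successful probe
    for probe, ok in results.items():
        deps = [f for f, bit in zip(FAULT_POINTS, MATRIX[probe]) if bit]
        if ok:
            cleared.update(deps)
        else:
            failed_counts.update(deps)
    candidates = [(f, n) for f, n in failed_counts.items() if f not in cleared]
    # Highest count first; ties broken by matrix column order (an assumption).
    return sorted(candidates, key=lambda fn: (-fn[1], FAULT_POINTS.index(fn[0])))


results = {"P0": True, "P1": False, "P2": False, "P3": False, "P4": False, "P5": False}
ranking = diagnose(results)
print(ranking[0])
```

With the probe results of this embodiment, Fd is ranked first, and Fa, Fc, and Fca do not appear among the candidates because the successful probe P0 relies on them.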
An embodiment of fault reasoning analysis in which module-call fault point Fcd1 of shopping cart module D fails is as follows:
The probe result set obtained after executing the probe request set is: P0---success, P1---success, P2---failure, P3---failure, P4---failure, P5---success.
Step 1: Filter the possible fault points. First, the set Fs obtained from the failed probes is {(Fa, 1), (Fb, 2), (Fc, 3), (Fd, 3), (Fe, 1), (Fca, 1), (Fcb, 2), (Fcd1, 3), (Fce, 1)}. Next, according to the extended dependency matrix, the set Ss of fault points relied on by the successful probes is {Fa, Fb, Fc, Fd, Fca, Fcd2}; deleting these fault points from Fs gives the final filtering result Fs = {(Fe, 1), (Fcb, 2), (Fcd1, 3), (Fce, 1)}.
Step 2: Rank the possible fault points. Sorting Fs = {(Fe, 1), (Fcb, 2), (Fcd1, 3), (Fce, 1)} by count gives the final diagnosis result Fq in the order (Fcd1, 3), (Fcb, 2), (Fe, 1), (Fce, 1). Fcd1 is the most likely to have failed, which is consistent with the premise that module-call fault point Fcd1 of shopping cart module D failed.
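The two probe result sets used in these embodiments can be cross-checked against the matrix. The sketch below assumes an idealized single-fault model that is not spelled out in the source: exactly one fault point is faulty, and a probe fails if and only if it depends on that fault point. The function name `expected_results` is hypothetical, and the matrix rows are those printed in Step 5.

```python
FAULT_POINTS = ["Fa", "Fb", "Fc", "Fd", "Fe", "Fca", "Fcb", "Fcd1", "Fcd2", "Fce"]

# Rows copied from the matrix as printed in Step 5.
MATRIX = {
    "P0": [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    "P1": [0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "P2": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P3": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P4": [1, 0, 1, 1, 1, 1, 0, 1, 0, 1],
    "P5": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
}


def expected_results(faulty):
    """Probe outcomes (True means success) if only `faulty` has failed."""
    idx = FAULT_POINTS.index(faulty)
    return {probe: row[idx] == 0 for probe, row in MATRIX.items()}


print(expected_results("Fd"))
print(expected_results("Fcd1"))
```

Under this assumption a fault at Fd makes every probe except P0 fail, and a fault at Fcd1 makes P2, P3, and P4 fail, which matches the probe result sets given for the two embodiments.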
Taken together, the steps above carry out, for one shopping-cart business flow of a concrete e-commerce website, a complete fault location process: building the extended dependency matrix model, executing probes, and performing fault analysis and reasoning. For application-layer fault location in a complete, real Web application system, the basic reasoning principle is the same as in the steps above, and the method proposed by the invention can locate faults quickly and accurately.
The beneficial effects of the invention are as follows. First, the extended dependency matrix proposed as the fault location model addresses a weakness of the traditional dependency matrix, in which probes and fault points cannot form a deterministic relation (a functional unit may be related to several kinds of fault or to several faults) and the precision of the diagnosis result is therefore low. The extended matrix broadens the basis for inference that the symptom data provide and thereby noticeably improves diagnostic accuracy. Second, the method uses active probing: the system need not be modified, and no functional units or system-state data need to be added to carry link-tracing messages inside the Web application system, so no extra maintenance is required. The method is therefore inexpensive to implement, simple to deploy, and easy to maintain. In addition, because active probing is independent of the Web application system under test, the method adapts quickly to changes in the configuration of the Web business logic while ensuring, to a certain extent, the accuracy of the symptom data, which benefits the accuracy of fault location and improves the efficiency and accuracy of fault diagnosis.
The embodiments described above further explain the purpose, technical solution, and beneficial effects of the invention. It should be understood that the above are merely specific embodiments of the invention and are not intended to limit its scope of protection; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
References:
1. Mike Y. Chen, Emre Kiciman, Eugene Fratkin, Armando Fox, Eric Brewer: Pinpoint: Problem Determination in Large, Dynamic Internet Services. Proceedings of the International Conference on Dependable Systems and Networks (DSN'02), IEEE, 2002
2. Gunjan Khanna, Ignacio Laguna, Fahad A. Arshad, Saurabh Bagchi: Distributed Diagnosis of Failures in a Three Tier E-Commerce System. 26th IEEE International Symposium on Reliable Distributed Systems
3. D. Oppenheimer, A. Ganapathi, D. A. Patterson: Why Do Internet Services Fail, and What Can Be Done About It? Proceedings of USITS'03: 4th USENIX Symposium on Internet Technologies and Systems, Seattle, WA, USA, March 26-28, 2003
4. Irina Rish, Mark Brodie, Sheng Ma, Natalia Odintsova, Alina Beygelzimer, Genady Grabarnik, Karina Hernandez: Adaptive Diagnosis in Distributed Systems. IEEE Transactions on Neural Networks, vol. 16, no. 5, September 2005

Claims (8)

1. A fault location method for a Web application system, characterized in that the method comprises:
A step of building an extended dependency matrix model: obtain the set of functional modules of the Web application system, the modules being mutually independent; enumerate the set of all available probe requests, the probes being mutually independent and the probe request set covering, as far as possible, all modules and the call relations between modules; abstract single-module fault points and module-call fault points from the module set and from the call relations between modules, respectively, the fault points being mutually independent; determine the dependencies between the probes and the fault points corresponding to the functional modules; and build the extended dependency matrix from the dependencies between the probe request set and the fault points;
A step of executing probes: send probe requests to the target Web application system, analyze the returned results, determine whether each probe succeeded or failed, and obtain the probe result set as the symptom data for fault reasoning analysis;
A fault reasoning analysis step: based on the extended dependency matrix model and the probe result set (the symptom data), the fault points related to failed probes are filtered as fault points that may have failed; the fault points related to successful probes are excluded as possible faults; and for the fault point set obtained by the filtering, the likelihood of each fault is analyzed and the fault points are arranged in descending order of likelihood.
2. The fault location method for a Web application system according to claim 1, characterized in that the mutual independence of the modules means: each module internally encapsulates and implements part of the system's business function logic and has independent state; externally it provides only a limited set of interface calls, or interacts with other modules through standard message middleware.
3. The fault location method for a Web application system according to claim 1, characterized in that a probe means: in the Web application system, one complete end-to-end interaction in which the client initiates a Web service request and the server processes the request and returns a result to the client; if the result returned by the interaction conforms to the business logic design of the system, the probe is deemed successful; otherwise the probe is deemed failed;
The mutual independence of the probes means: the success or failure of any probe A does not depend on any other probe B and is related only to the functional modules that provide the business logic of probe A; that is, no probe, whatever its execution result, changes the subsequent behavior of the system;
The probe request set means: for a concrete Web application system, all Web requests that an actual user could send to the Web application system.
4. The fault location method for a Web application system according to claim 1, characterized in that the single-module fault corresponding to a single-module fault point means that an internal logic fault occurs in a module, so that every call made to it by any other module fails; the mutually independent functional modules of the Web application system are mapped to single-module fault points;
The module-call fault corresponding to a module-call fault point means that a fault occurs in one particular outgoing or incoming call of a module, so that the call between it and one particular module fails; the call relations between modules are mapped to module-call fault points; when two modules have different call relations, that is, one module calls the other in different ways, each call relation is abstracted as a separate module-call fault point.
5. The fault location method for a Web application system according to claim 1, characterized in that the extended dependency matrix is formed from the probe request set and the fault point set; the fault point set comprises all single-module fault points and module-call fault points; and each matrix entry takes one of two values: a dependency exists or no dependency exists.
6. The fault location method for a Web application system according to claim 1, characterized in that the fault reasoning analysis step comprises:
A step of initializing the fault filtering result: initialize a two-dimensional fault filtering result Fs, setting its first dimension (the fault point) to empty and clearing its second dimension (the counter Cf) to zero; and initialize the set Ss of fault points relied on by successful probes to empty;
A step of checking the probe result set: determine whether the probe result set Ps is empty; if it is empty, return the failed-probe filtering intermediate result Fs'; otherwise take one probe result P from the probe result set Ps;
A step of checking the probe result: determine whether the probe result P is a successful probe; if so, add the fault points on which P depends, according to the extended dependency matrix model, to the set Ss; otherwise add the fault points on which P depends, according to the extended dependency matrix model, to the two-dimensional fault filtering result Fs and increment the counter of each such fault point by 1;
A step of deleting the probe result: delete P from the probe result set Ps and continue checking the probe result set.
7. The fault location method for a Web application system according to claim 6, characterized in that the fault reasoning analysis step further comprises:
A step of checking the set of fault points relied on by successful probes: determine whether the set Ss is empty; if so, return the failed-probe filtering intermediate result Fs' as the two-dimensional fault filtering result Fs; otherwise take one fault point F from the set Ss, remove from Fs' the fault point whose value is F, and clear its counter to zero;
A step of deleting the successful-probe fault point: delete F from the set Ss and continue checking the set Ss.
8. The fault location method for a Web application system according to claim 7, characterized in that the fault reasoning analysis step further comprises:
A step of initializing the fault ranking result: initialize a two-dimensional ranking queue Fq, setting its first dimension (the fault point) to empty and clearing its second dimension (the failed-probe count Cq) to zero;
A step of checking the fault filtering result: determine whether the two-dimensional fault filtering result Fs is empty; if so, return the fault ranking queue Fq; otherwise select from Fs the fault point F with the largest counter value Cf, append F to the tail of Fq, and assign Cf to Cq;
A step of deleting the fault point: delete F from the two-dimensional fault filtering result Fs, clear its counter Cf to zero, and continue checking the fault filtering result.
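The loops recited in claims 6 to 8 can be followed step by step in code. The sketch below keeps the claims' names (Ps, Ss, Fs, Fq) and their take-one-and-delete structure, and is offered as one possible reading rather than a definitive implementation. The three-probe dependency table at the end is a made-up toy input, not data from the embodiment, and the function names are assumptions.

```python
def filter_faults(Ps, depends_on):
    """Claims 6 and 7: count fault points of failed probes, then drop those in Ss."""
    Fs = {}      # fault point -> counter Cf
    Ss = set()   # fault points relied on by successful probes
    pending = list(Ps.items())
    while pending:                      # until the probe result set is empty
        probe, ok = pending.pop(0)      # take one result P and delete it from Ps
        if ok:
            Ss |= depends_on[probe]
        else:
            for f in sorted(depends_on[probe]):
                Fs[f] = Fs.get(f, 0) + 1
    while Ss:                           # until Ss is empty
        f = Ss.pop()                    # take one fault point F and delete it from Ss
        Fs.pop(f, None)                 # remove F and its counter from the result
    return Fs


def rank_faults(Fs):
    """Claim 8: repeatedly move the entry with the largest counter to the tail of Fq."""
    Fs = dict(Fs)                       # work on a copy
    Fq = []
    while Fs:
        f = max(Fs, key=Fs.get)         # fault point with the largest Cf
        Fq.append((f, Fs.pop(f)))       # append (F, Cq) and delete F from Fs
    return Fq


# Toy input for illustration only (hypothetical probes and fault points).
toy_depends_on = {"Pa": {"F1", "F2"}, "Pb": {"F2", "F3"}, "Pc": {"F3", "F4"}}
toy_results = {"Pa": True, "Pb": False, "Pc": False}
Fq = rank_faults(filter_faults(toy_results, toy_depends_on))
print(Fq)
```

In the toy input, F2 is excluded because the successful probe Pa relies on it, and F3 is ranked ahead of F4 because two failed probes rely on it.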

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101199727A CN101394314B (en) 2008-10-20 2008-10-20 Fault positioning method for Web application system

Publications (2)

Publication Number Publication Date
CN101394314A CN101394314A (en) 2009-03-25
CN101394314B true CN101394314B (en) 2011-03-23

Family

ID=40494403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101199727A Expired - Fee Related CN101394314B (en) 2008-10-20 2008-10-20 Fault positioning method for Web application system

Country Status (1)

Country Link
CN (1) CN101394314B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102684902B (en) * 2011-03-18 2015-10-14 北京邮电大学 Based on the network failure locating method of probe prediction
CN103605602A (en) * 2013-11-29 2014-02-26 中国航空工业集团公司第六三一研究所 Method for filtering out malfunctions of distributed computer system
CN105141505A (en) * 2015-08-25 2015-12-09 北京京东尚科信息技术有限公司 Message passing tracking method and device in instant messaging system
CN109936586B (en) * 2017-12-15 2021-04-20 腾讯科技(深圳)有限公司 Communication processing method and device
CN109343987A (en) * 2018-08-20 2019-02-15 科大国创软件股份有限公司 IT system fault diagnosis and restorative procedure, device, equipment, storage medium
CN110048901B (en) * 2019-06-04 2022-03-22 广东电网有限责任公司 Fault positioning method, device and equipment for power communication network
US11372707B2 (en) 2020-02-06 2022-06-28 International Business Machines Corporation Cognitive problem isolation in quick provision fault analysis

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1474570A (en) * 2002-08-10 2004-02-11 华为技术有限公司 Method for recording access of address changeover users in data transmission process
CN1529455A (en) * 2003-09-29 2004-09-15 港湾网络有限公司 Network failure real-time relativity analysing method and system
EP1592206A1 (en) * 2004-04-28 2005-11-02 Sap Ag Computer system and method for providing a failure resistant data processing service
CN101170447A (en) * 2007-11-22 2008-04-30 北京邮电大学 Service failure diagnosis system based on active probe and its method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
钟仕群,朱程荣,熊齐邦.一种基于贝叶斯网络的集成的故障定位模型.《计算机技术与发展》.2006,第16卷(第12期),13-15、18. *

Also Published As

Publication number Publication date
CN101394314A (en) 2009-03-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110323

Termination date: 20121020