Embodiment
To make the purpose, technical solutions, and advantages of the present invention clearer, specific embodiments of the invention are described below with reference to the accompanying drawings. The illustrative embodiments and their descriptions are used to explain the present invention and do not limit it.
The invention provides an active, fast fault locating method for the modules of a Web application system. Fig. 1 is a flow chart of the Web application system fault locating method of the present invention. As shown in Fig. 1, the method is divided into three main steps: step 101, establishing the extended dependency matrix model; step 102, executing the probes; step 103, fault inference analysis. Each step is described below.
(1) Step 101: establish the extended dependency matrix model
Most fault locating methods build a fault propagation model as the basis for fault location. A system fault propagation model has two key concepts, symptom and fault: (1) a symptom is the information reflecting the current system state that is exhibited when functional or business-logic problems occur inside the system; (2) a fault is the root cause of the symptom. Although a fault is generally unobservable, its root location can be determined by reasoning from external, observable symptoms.
A Web application system is usually composed of a group of mutually independent functional modules. The modules cooperate to implement the system's business logic; each module generally exposes only a limited set of interface calls, and inter-module interaction is often implemented through standard message-oriented middleware. When a module of the Web system fails, the results of the Web service requests associated with that module are an external business symptom of the fault that can be observed directly.
On this premise, the present invention models a concrete Web application system. From the functional modules, the service requests, and the dependencies between the two, it builds a model of the correspondence between the possible fault points in the system and the service requests, yielding the extended dependency matrix model used to locate module faults. Fig. 2 is a detailed flow chart of establishing the extended dependency matrix model, which mainly comprises the following steps:
Step 201: obtain the set of functional modules of the Web application system; the modules are mutually independent.
Mutual independence of the modules means that each module internally encapsulates part of the system's business functions, has its own independent state, and externally provides only a limited set of interface calls or interacts with other modules through standard message-oriented middleware.
Step 202: enumerate the set of all available probe requests. The probes are mutually independent, and the probe request set should cover as many modules and inter-module call relations as possible.
A probe is one complete end-to-end interaction in the Web application system: the client initiates a Web service request, and the server handles the request and returns a result to the client. If the result returned by the interaction conforms to the system's business-logic design, the probe is deemed successful; otherwise it is deemed failed. For example, an "add commodity to shopping cart" probe on an e-commerce website proceeds as follows: the user selects a commodity on the Web page and clicks "Add to shopping cart", and the server responds with a result: added successfully, add failed, or some other exception. If the commodity is on the shelf and in stock and the result is "added successfully", or the commodity is temporarily out of stock and the result is "add failed", the probe is deemed successful; otherwise it is deemed failed.
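The success criterion of the "add commodity to shopping cart" probe can be sketched as a small predicate. This is only an illustration: the function name `probe_succeeded` and the result strings `"added"` and `"add_failed"` are assumptions made for the example, not identifiers defined by the invention.

```python
def probe_succeeded(in_stock: bool, result: str) -> bool:
    """Return True when the returned result matches the business-logic design.

    in_stock: whether the commodity is on the shelf and in stock (assumed input).
    result:   the result reported by the server; "added" and "add_failed"
              are hypothetical labels for the two designed outcomes.
    """
    if in_stock:
        # In-stock commodity: the only result that conforms is a successful add.
        return result == "added"
    # Out-of-stock commodity: the only result that conforms is a failed add.
    return result == "add_failed"


# Any other combination, including an unexpected exception result, is a failed probe.
print(probe_succeeded(True, "added"))       # True
print(probe_succeeded(False, "add_failed")) # True
print(probe_succeeded(True, "add_failed"))  # False
```

Note that a probe can be successful even when the server reports "add failed", because success is judged against the designed behavior rather than against the literal outcome.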
Mutual independence of the probes means that the success or failure of any probe A does not depend on any other probe B and is related only to the functional modules that provide the business logic of probe A. That is, whatever its execution result, a probe does not change the subsequent behavior of the system. Web application systems usually satisfy this condition.
The probe request set is, for a concrete Web application system, all the possible Web requests that real users can send to it. The more kinds of probe requests there are, the more completely the functional modules and inter-module call relations are covered, and the more accurate the location of module faults becomes.
Step 203: abstract the single-module fault points and the module-call fault points from the module set and the inter-module call relations, respectively.
To locate faults, a group of fault points (the smallest unit of fault location) must be abstracted from the target system. The fault points should be mutually independent, and each is in one of two states: faulty or not faulty.
The fault points of the present invention fall into two classes: single-module fault points and module-call fault points. A single-module fault means that a module has an internal logic fault, so that every call to it from any other module fails. A module-call fault means that a particular outgoing or incoming call of a module is faulty, so that only that call to or from the specific module fails.
In the present invention, each independent functional module of the Web application system is mapped to a single-module fault point, and each inter-module call relation is mapped to a module-call fault point. When two modules have several different call relations, that is, one module calls the other in different ways, each call relation is abstracted into its own module-call fault point. For example, if module A calls module B in two ways, the abstraction yields the module-call fault points Fab.1 and Fab.2.
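The mapping from modules and call relations to fault points can be sketched as follows. The function name `abstract_fault_points` and the `(caller, callee, kinds)` input format are assumptions made for this illustration; the naming of the resulting fault points follows the Fab.1 / Fab.2 example above.

```python
def abstract_fault_points(modules, calls):
    """Derive single-module and module-call fault points.

    modules: list of module names, e.g. ["A", "B"].
    calls:   list of (caller, callee, kinds) tuples, where `kinds` is the
             number of distinct ways the caller invokes the callee
             (a hypothetical input format chosen for this sketch).
    """
    # One single-module fault point per functional module.
    single = [f"F{m.lower()}" for m in modules]

    # One module-call fault point per distinct call relation.
    call_points = []
    for caller, callee, kinds in calls:
        base = f"F{caller.lower()}{callee.lower()}"
        if kinds == 1:
            call_points.append(base)
        else:
            call_points.extend(f"{base}.{i}" for i in range(1, kinds + 1))
    return single, call_points


# Module A calls module B in two different ways.
single, call_points = abstract_fault_points(["A", "B"], [("A", "B", 2)])
print(single)       # ['Fa', 'Fb']
print(call_points)  # ['Fab.1', 'Fab.2']
```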
Step 204: determine the dependencies between the probe request set and the fault point set.
According to the business logic it exercises, each probe always depends on a fixed set of functional modules that provide that logic and is unrelated to the other modules. The probe therefore has a dependency on the fault points corresponding to those functional modules. On the one hand, the states of the fault points depended on affect the result of the probe (success or failure); on the other hand, the result of the probe reflects the running state (faulty or not faulty) of the fault points it depends on. Fault location in the present invention analyzes which module is faulty from the probe results and the dependencies between probes and fault points.
Step 205: establish the extended dependency matrix from the dependencies between the probe request set and the fault points.
From these dependencies, the model of the correspondence between the positions in the system where functional modules may fail and the service requests is established.
A traditional dependency matrix consists of the probe request set and the set of single-module fault points.
Table 1 shows an embodiment of a dependency matrix.

|  | F1 | F2 | F3 | F4 | F5 |
|---|---|---|---|---|---|
| P1 | 1 | 0 | 1 | 0 | 0 |
| P2 | 1 | 0 | 1 | 0 | 0 |
| P3 | 1 | 0 | 1 | 1 | 0 |

As Table 1 shows, the dependency matrix has the five fault points F1 to F5 as its columns and the probes as its rows. An entry of 1 indicates that the corresponding probe depends on the corresponding fault point; an entry of 0 indicates that there is no dependency. P1 is implemented by F1 calling F3, so P1 depends on F1 and F3. If F1 or F3 is faulty, P1 will very probably fail; conversely, if P1 fails, at least one of F1 and F3 is faulty.
Under the dependency matrix model, if a probe fails, it can be inferred that at least one of the module fault points it depends on is faulty. If the probe succeeds, however, it cannot be inferred that none of those fault points is faulty. A module-call fault, for example, manifests only in a specific call context and does not affect the other services the module provides. In that case no fault location information can be obtained from a successful probe.
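As a minimal sketch, the traditional dependency matrix of Table 1 can be held as rows of 0/1 entries, and the fault points implicated by a failed probe read off directly. The names `FAULT_POINTS`, `MATRIX`, and `depends_on` are illustrative assumptions.

```python
# Columns of Table 1 and its rows, transcribed as 0/1 entries.
FAULT_POINTS = ["F1", "F2", "F3", "F4", "F5"]
MATRIX = {
    "P1": [1, 0, 1, 0, 0],
    "P2": [1, 0, 1, 0, 0],
    "P3": [1, 0, 1, 1, 0],
}


def depends_on(probe):
    """Return the set of fault points whose entry is 1 in the probe's row."""
    return {f for f, bit in zip(FAULT_POINTS, MATRIX[probe]) if bit}


# If P1 fails, at least one of these fault points is faulty.
print(sorted(depends_on("P1")))  # ['F1', 'F3']
# If P1 succeeds, the traditional model allows no conclusion about F1 or F3.
```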
The extended dependency matrix extends the traditional dependency matrix. It consists of the probe request set and the fault point set, where the fault point set contains the module-call fault points in addition to the single-module fault points. If a probe depends on a functional module, it depends on the single-module fault point of that module; if a probe causes a call between functional modules to be executed, it depends on the single-module fault points of those modules and on the module-call fault point of that call.
Under the extended dependency matrix model, if a probe fails, it can be inferred that at least one of the fault points it depends on is faulty. Conversely, if a probe succeeds, the call context of the probe is determined, so it can be inferred that none of the fault points it depends on is faulty. Compared with the traditional dependency matrix, the accuracy of fault location is significantly improved.
Fig. 3 shows a dependency embodiment of the present invention with three modules A, B, and C and three probes P1 to P3. P1 is implemented by A calling B; P2 is implemented by A calling interface 1 of C; P3 is implemented by A calling interface 2 of C. P1 therefore depends on the single-module fault points Fa and Fb and on the call fault point Fab; P2 depends on the single-module fault points Fa and Fc and on the call fault point Fac1; P3 depends on the single-module fault points Fa and Fc and on the call fault point Fac2.
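The dependencies just listed are enough to generate both the traditional and the extended dependency matrix mechanically. The sketch below assumes the dependencies are given as sets per probe; `DEPENDS`, `SINGLE`, `CALL`, and `build_matrix` are names chosen for the illustration.

```python
# Fault points of the Fig. 3 embodiment.
SINGLE = ["Fa", "Fb", "Fc"]        # single-module fault points
CALL = ["Fab", "Fac1", "Fac2"]     # module-call fault points

# Dependencies of each probe, as stated for the Fig. 3 embodiment.
DEPENDS = {
    "P1": {"Fa", "Fb", "Fab"},
    "P2": {"Fa", "Fc", "Fac1"},
    "P3": {"Fa", "Fc", "Fac2"},
}


def build_matrix(columns):
    """Build one 0/1 row per probe over the given fault point columns."""
    return {
        probe: [1 if col in deps else 0 for col in columns]
        for probe, deps in DEPENDS.items()
    }


traditional = build_matrix(SINGLE)        # single-module columns only
extended = build_matrix(SINGLE + CALL)    # plus module-call columns

print(traditional["P2"])  # [1, 0, 1]
print(extended["P2"])     # [1, 0, 1, 0, 1, 0]
```

In the traditional matrix the rows of P2 and P3 are identical, so the two probes cannot be told apart; the module-call columns of the extended matrix separate them.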
Table 2 shows the dependency matrix generated from the dependency embodiment of Fig. 3.

|  | Fa | Fb | Fc |
|---|---|---|---|
| P1 | 1 | 1 | 0 |
| P2 | 1 | 0 | 1 |
| P3 | 1 | 0 | 1 |

Table 3 shows the extended dependency matrix generated from the dependency embodiment of Fig. 3.

|  | Fa | Fb | Fc | Fab | Fac1 | Fac2 |
|---|---|---|---|---|---|---|
| P1 | 1 | 1 | 0 | 1 | 0 | 0 |
| P2 | 1 | 0 | 1 | 0 | 1 | 0 |
| P3 | 1 | 0 | 1 | 0 | 0 | 1 |

As Tables 2 and 3 show, suppose the probe results are P1 success, P2 failure, P3 success. Under the traditional dependency matrix model, the failure of P2 implies that at least one of the fault points Fa and Fc is faulty; the success of P1 shows that A called B successfully, and the success of P3 shows that A called C successfully. If the successes of P1 and P3 were taken to imply that Fa, Fb, and Fc are all fault-free, that would contradict the conclusion drawn from the failure of P2. Because the call context of a fault point is not represented in the traditional dependency matrix, a successful probe cannot serve as a basis for fault inference there. Under the extended dependency matrix model, the failure of P2 implies that at least one of Fa, Fc, and Fac1 is faulty; because the call context is determined, the success of P1 implies that the fault points Fa, Fb, and Fab it depends on are fault-free, and the success of P3 implies that the fault points Fa, Fc, and Fac2 it depends on are fault-free. Combining these inferences, the only fault point that can be faulty is Fac1, that is, the call from module A to interface 1 of module C is faulty, which is consistent with the results of P1 to P3. The extended dependency matrix model therefore locates faults considerably more accurately than the traditional dependency matrix model.
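The comparison can be reproduced with a few lines of set arithmetic. This sketch deliberately applies the same rule (failed probes accuse, successful probes clear) to both matrices in order to show why it is only sound for the extended one; `TRADITIONAL`, `EXTENDED`, and `candidates` are illustrative names.

```python
# Dependencies per probe in the traditional matrix (Table 2) ...
TRADITIONAL = {
    "P1": {"Fa", "Fb"},
    "P2": {"Fa", "Fc"},
    "P3": {"Fa", "Fc"},
}
# ... and in the extended matrix (Table 3).
EXTENDED = {
    "P1": {"Fa", "Fb", "Fab"},
    "P2": {"Fa", "Fc", "Fac1"},
    "P3": {"Fa", "Fc", "Fac2"},
}


def candidates(matrix, results):
    """Fault points accused by failed probes and not cleared by successful ones."""
    suspects, cleared = set(), set()
    for probe, succeeded in results.items():
        (cleared if succeeded else suspects).update(matrix[probe])
    return suspects - cleared


results = {"P1": True, "P2": False, "P3": True}

# Extended matrix: exactly one candidate remains.
print(candidates(EXTENDED, results))     # {'Fac1'}
# Traditional matrix: clearing on success leaves no candidate at all, which
# contradicts the failure of P2, so successes cannot be used for clearing there.
print(candidates(TRADITIONAL, results))  # set()
```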
(2) Step 102: execute the probes
After the extended dependency matrix model has been established, executing the probe request set in the matrix yields a set of probe results that reflects the current system state and serves as the basis for fault inference analysis.
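A probe runner can be sketched as a loop over callables, one per probe, each returning whether the response conformed to the business-logic design. The function `run_probes`, the callable-per-probe interface, and the choice to record an exception as a failed probe are assumptions of this sketch; the way a real probe issues its Web request is left out.

```python
def run_probes(probes):
    """Execute every probe once and collect the probe result set Ps.

    probes: dict mapping a probe name to a zero-argument callable that
            returns True on success (hypothetical interface).
    """
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = bool(probe())
        except Exception:
            # Assumption: a probe that raises is recorded as a failed probe.
            results[name] = False
    return results


def broken_probe():
    """Stand-in for a probe whose request raises an error."""
    raise RuntimeError("request failed")


# Stand-in probes; real ones would send Web requests and check the responses.
ps = run_probes({"P1": lambda: True, "P2": lambda: False, "P3": broken_probe})
print(ps)  # {'P1': True, 'P2': False, 'P3': False}
```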
(3) Step 103: fault inference analysis
After the probe request set in the matrix has been executed, the present invention analyzes the returned results. Based on the dependencies between probes and fault points in the extended dependency matrix, it filters out the set of fault points that may be faulty, computes the likelihood of each candidate fault point, and outputs the fault points in descending order of likelihood. The user can take the first one or more results as the final result of the fault inference analysis.
The inference analysis method mainly comprises two steps. (1) Filtering the possible fault points: from the probe result set Ps and the extended dependency matrix model, filter out every fault point that may be faulty together with the number of failed probes that depend on it. Failed probes identify the fault points that may be faulty, and successful probes exclude fault points from suspicion. (2) Sorting the possible fault points: output the fault points in descending order of their failed-probe counts; the count expresses how likely each fault point is to be faulty.
Fig. 4 is a detailed flow chart of filtering the possible fault points in the present invention:
Step 401: initialize a two-dimensional filter output result Fs, with the first dimension (fault point) empty and the second dimension (counter Cf) cleared; initialize the set Ss of fault points that successful probes depend on as empty; go to step 402.
Step 402: determine whether the probe result set Ps is empty; if it is empty, go to step 408; otherwise go to step 403.
Step 403: take one probe result P from the probe result set Ps; go to step 404.
Step 404: determine whether P is a successful probe; if so, go to step 405; otherwise go to step 406.
Step 405: according to the extended dependency matrix model, add the fault points that P depends on to Ss; go to step 407.
Step 406: according to the extended dependency matrix model, add the fault points that P depends on to Fs and increment the counter of each of these fault points by 1; go to step 407.
Step 407: delete P from the probe result set Ps and return to step 402.
Step 408: determine whether the set Ss is empty; if so, go to step 412; otherwise go to step 409.
Step 409: take one fault point F from the set Ss; go to step 410.
Step 410: remove the entry in Fs whose fault point is F and clear its counter; go to step 411.
Step 411: delete F from the set Ss and return to step 408.
Step 412: end the fault filtering flow and return the fault filtering result Fs.
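Steps 401 to 412 can be sketched compactly as below. The sketch assumes the extended dependency matrix is given as a mapping from each probe to the set of fault points it depends on and the probe results as a mapping from probe to True (success) or False (failure); `filter_fault_points` is an illustrative name, and the demonstration data are those of the Fig. 3 embodiment.

```python
def filter_fault_points(probe_results, matrix):
    """Return Fs: fault point -> number of failed probes that depend on it."""
    fs = {}     # step 401: filter result Fs with counters Cf
    ss = set()  # step 401: fault points that successful probes depend on

    # Steps 402-407: consume the probe results one by one.
    for probe, succeeded in probe_results.items():
        deps = matrix[probe]
        if succeeded:
            ss.update(deps)                 # step 405
        else:
            for f in deps:                  # step 406
                fs[f] = fs.get(f, 0) + 1

    # Steps 408-411: remove every fault point cleared by a successful probe.
    for f in ss:
        fs.pop(f, None)

    return fs                               # step 412


matrix = {
    "P1": {"Fa", "Fb", "Fab"},
    "P2": {"Fa", "Fc", "Fac1"},
    "P3": {"Fa", "Fc", "Fac2"},
}
fs = filter_fault_points({"P1": True, "P2": False, "P3": True}, matrix)
print(fs)  # {'Fac1': 1}
```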
Fig. 5 is a detailed flow chart of sorting the possible fault points in the present invention:
Step 501: initialize a two-dimensional sorted result queue Fq, with the first dimension (fault point) empty and the second dimension (failed-probe count Cq) cleared; go to step 502.
Step 502: determine whether the fault filtering result Fs is empty; if it is empty, go to step 505; otherwise go to step 503.
Step 503: select from Fs the fault point F with the largest counter value Cf, append F to the tail of the queue Fq, and assign Cf to Cq; go to step 504.
Step 504: delete the fault point F from Fs, clear its counter Cf, and return to step 502.
Step 505: end the fault sorting flow and return the fault sorting queue Fq.
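Steps 501 to 505 amount to a repeated selection of the maximum, which can be sketched as follows. `sort_fault_points` is an illustrative name; the tie-breaking rule (equal counts keep their order in Fs) is a property of this sketch, since the steps above do not specify one.

```python
def sort_fault_points(fs):
    """Return Fq: (fault point, failed-probe count) pairs in descending order."""
    remaining = dict(fs)   # work on a copy so the caller's Fs is untouched
    fq = []                # step 501: empty sorted result queue

    while remaining:       # step 502: stop when Fs is empty
        # Step 503: pick the fault point with the largest counter. max()
        # returns the first maximal key, so ties keep their order in Fs.
        f = max(remaining, key=remaining.get)
        # Steps 503-504: append to the queue tail and delete from Fs.
        fq.append((f, remaining.pop(f)))

    return fq              # step 505


fq = sort_fault_points({"Fb": 2, "Fe": 1, "Fcb": 2, "Fcd1": 3, "Fce": 1})
print(fq)
# [('Fcd1', 3), ('Fb', 2), ('Fcb', 2), ('Fe', 1), ('Fce', 1)]
```

Selecting the maximum repeatedly costs quadratic time in the number of fault points; a single call to `sorted` with a key on the count would give the same ordering and is a reasonable substitute for large matrices.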
The following is an embodiment of fault location for an e-commerce Web application system. The application is a simple electronic retailing system. The embodiment covers only one shopping-cart business flow, in which a customer logs in, shops, submits the shopping cart as an order, and logs out, so the embodiment may have fewer system components than a real electronic retailing system. Following the fault locating method of the present invention, the system is first modeled and the extended dependency matrix is abstracted; the probes are then executed, and fault analysis is performed on the probe results.
(1) Establish the extended dependency matrix model, in the following five steps:
Step 1: obtain the set of mutually independent functional modules of the Web application system.
The mutually independent components or modules of this embodiment are as follows:
A: customer module, which stores and processes customer information;
B: warehouse module, which stores and processes commodity information;
C: controller module, which receives messages from the external user interface and forwards messages to each functional module;
D: shopping cart module, which stores and processes the shopping cart information of the customer's current session;
E: order module, which stores and processes the order information of the current session.
The functions of these five modules are mutually independent.
Step 2: enumerate the set of all available, mutually independent probe requests.
According to the business logic of e-commerce shopping, determine the set of executable, mutually independent probes: every operation that a user can perform is taken as a probe. The probe request set of this embodiment is as follows:
P0, customer logs in: controller module (C) receives the user's login information and sends it to customer module (A) for verification; if verification passes, a customer session instance is created;
P1, add the first commodity to the shopping cart: on the user's first add-commodity request, controller module (C) calls shopping cart module (D) to create a shopping cart session instance for the customer, obtains the commodity information from warehouse module (B), and adds it to shopping cart module (D);
P2, add a commodity to the shopping cart: on the user's add-commodity request, controller module (C) obtains the commodity information from warehouse module (B) and adds it to shopping cart module (D);
P3, remove a commodity from the shopping cart: on the user's remove-commodity request, controller module (C) removes the commodity from shopping cart module (D) and returns the removed quantity to warehouse module (B);
P4, submit the shopping cart: on the user's submit request, controller module (C) submits to order module (E) the commodities in the shopping cart session instance of shopping cart module (D) and the customer information in the customer session instance of customer module (A), and creates the user's order; after the commodities are submitted, the shopping cart session instance is emptied;
P5, customer logs out: on the user's request, controller module (C) calls customer module (A) to delete the customer session instance and calls shopping cart module (D) to delete the shopping cart session instance.
Step 3: abstract the fault points from the module set and the inter-module call relations.
The five modules A to E are abstracted as the five single-module fault points Fa to Fe, respectively.
The call logic by which controller module C invokes customer module A to authenticate a login and create the customer session instance, and to delete the customer session instance on logout, is abstracted as module-call fault point Fca;
The call logic by which controller module C invokes warehouse module B to reserve commodities or cancel a reservation is abstracted as module-call fault point Fcb;
The call logic by which controller module C invokes shopping cart module D for shopping cart operations (adding or removing commodities, submitting the shopping cart) is abstracted as module-call fault point Fcd1;
The call logic by which controller module C invokes shopping cart module D to create or delete the shopping cart session instance is abstracted as module-call fault point Fcd2;
The call logic by which controller module C invokes order module E to submit the shopping cart and generate an order is abstracted as module-call fault point Fce.
Step 4: determine the dependencies between the probe request set and the fault point set.
P0 depends on the single-module fault points Fa, Fc and on the module-call fault point Fca;
P1 depends on the single-module fault points Fb, Fc, Fd and on the module-call fault points Fcb, Fcd2;
P2 depends on the single-module fault points Fb, Fc, Fd and on the module-call fault points Fcb, Fcd1;
P3 depends on the single-module fault points Fb, Fc, Fd and on the module-call fault points Fcb, Fcd1;
P4 depends on the single-module fault points Fa, Fc, Fd, Fe and on the module-call fault points Fca, Fcd1, Fce;
P5 depends on the single-module fault points Fa, Fc, Fd and on the module-call fault points Fca, Fcd2.
Step 5: establish the extended dependency matrix.
From the dependencies between the probe request set and the fault point set, the following extended dependency matrix is obtained.

|  | Fa | Fb | Fc | Fd | Fe | Fca | Fcb | Fcd1 | Fcd2 | Fce |
|---|---|---|---|---|---|---|---|---|---|---|
| P0 log in to the system | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| P1 add first commodity to shopping cart | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| P2 add commodity to shopping cart | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| P3 remove commodity from shopping cart | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
| P4 submit shopping cart | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 |
| P5 log out of the system | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |

(2) Execute the probes and perform fault inference analysis
A fault inference analysis embodiment in which the single-module fault point Fd of shopping cart module D is faulty is as follows:
The probe result set obtained after executing the probe request set is: P0 success, P1 failure, P2 failure, P3 failure, P4 failure, P5 failure.
Step 1: filter the possible fault points. First, the Fs obtained from the failed probes is { (Fa, 2), (Fb, 2), (Fc, 5), (Fd, 5), (Fe, 1), (Fca, 2), (Fcb, 2), (Fcd1, 3), (Fcd2, 2), (Fce, 1) }. Then the fault points in the set Ss = { Fa, Fc, Fca } that successful probes depend on are deleted from Fs, so the final fault filtering result Fs is { (Fb, 2), (Fd, 5), (Fe, 1), (Fcb, 2), (Fcd1, 3), (Fcd2, 2), (Fce, 1) }.
Step 2: sort the possible fault points. With Fs = { (Fb, 2), (Fd, 5), (Fe, 1), (Fcb, 2), (Fcd1, 3), (Fcd2, 2), (Fce, 1) }, the final diagnosis result queue Fq is, in order, (Fd, 5), (Fcd1, 3), (Fb, 2), (Fcb, 2), (Fcd2, 2), (Fe, 1), (Fce, 1). Fd is therefore the fault point most likely to be faulty, that is, module D is most likely to have a single-module fault, which is consistent with the premise that the single-module fault point Fd of shopping cart module D is faulty.
A fault inference analysis embodiment in which the module-call fault point Fcd1 of shopping cart module D is faulty is as follows:
The probe result set obtained after executing the probe request set is: P0 success, P1 success, P2 failure, P3 failure, P4 failure, P5 success.
Step 1: filter the possible fault points. First, the Fs obtained from the failed probes is { (Fa, 1), (Fb, 2), (Fc, 3), (Fd, 3), (Fe, 1), (Fca, 2), (Fcb, 2), (Fcd1, 3), (Fce, 1) }. Then the fault points in the set Ss = { Fa, Fc, Fd, Fca, Fcd2 } that successful probes depend on are deleted from Fs, so the final fault filtering result Fs is { (Fb, 2), (Fe, 1), (Fcb, 2), (Fcd1, 3), (Fce, 1) }.
Step 2: sort the possible fault points. With Fs = { (Fb, 2), (Fe, 1), (Fcb, 2), (Fcd1, 3), (Fce, 1) }, the final diagnosis result queue Fq is, in order, (Fcd1, 3), (Fb, 2), (Fcb, 2), (Fe, 1), (Fce, 1). Fcd1 is the fault point most likely to be faulty, which is consistent with the premise that the module-call fault point Fcd1 of shopping cart module D is faulty.
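Both scenarios can be replayed end to end with a short script. The rows below transcribe the extended dependency matrix as tabulated in step 5, and `diagnose` is an illustrative name that combines filtering and sorting in one function. Only the top-ranked candidate of each scenario is checked here, because the lower-ranked entries depend on exactly which dependencies are assumed for P1.

```python
COLUMNS = ["Fa", "Fb", "Fc", "Fd", "Fe", "Fca", "Fcb", "Fcd1", "Fcd2", "Fce"]
# Rows transcribed from the extended dependency matrix tabulated in step 5.
ROWS = {
    "P0": [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    "P1": [0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
    "P2": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P3": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0],
    "P4": [1, 0, 1, 1, 1, 1, 0, 1, 0, 1],
    "P5": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
}


def diagnose(failed):
    """Rank the fault points that remain after filtering, most likely first."""
    counts, cleared = {}, set()
    for probe, row in ROWS.items():
        deps = [col for col, bit in zip(COLUMNS, row) if bit]
        if probe in failed:
            for f in deps:                      # failed probe: count a suspicion
                counts[f] = counts.get(f, 0) + 1
        else:
            cleared.update(deps)                # successful probe: clear its deps
    ranked = [(f, n) for f, n in counts.items() if f not in cleared]
    ranked.sort(key=lambda item: -item[1])      # descending failed-probe count
    return ranked


# Scenario 1: only P0 succeeds; the single-module fault point Fd ranks first.
top_single = diagnose({"P1", "P2", "P3", "P4", "P5"})[0]
# Scenario 2: P2, P3, P4 fail; the module-call fault point Fcd1 ranks first.
top_call = diagnose({"P2", "P3", "P4"})[0]
print(top_single, top_call)  # ('Fd', 5) ('Fcd1', 3)
```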
To summarize the above steps: for one shopping-cart business flow of a concrete e-commerce website, a complete fault location process has been carried out, consisting of establishing the extended dependency matrix model, executing the probes, and performing fault inference analysis. For application-layer fault location in a complete, concrete Web application system, the basic principle of inference is the same as in the above steps, and faults can be located quickly and accurately with the method proposed in the present invention.
The beneficial effects of the present invention are as follows. First, the extended dependency matrix fault location model addresses the problem that, in a traditional dependency matrix, probes and fault points cannot form a deterministic relation (a functional unit may be related to several kinds of faults or to multiple faults), which makes the diagnosis result imprecise; it expands the basis for inference that the symptom data provide and thereby noticeably improves diagnostic accuracy. Second, the fault locating method uses active probing: it requires no modification of the system and no added functional components or system state data for passing link-tracing messages inside the Web application system, and it needs no extra maintenance, so it is inexpensive to realize, simple to implement, and easy to maintain. In addition, because active probing is independent of the Web application system under test, the method adapts quickly to changes in the configuration of the Web business logic while ensuring, to a certain extent, the accuracy of the symptom data, which helps improve the efficiency and accuracy of fault diagnosis.
The embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.