CN104835015B - Workflow mining method based on predecessor task - Google Patents
Workflow mining method based on predecessor task
- Publication number: CN104835015B
- Application number: CN201510272608.4A
- Authority: CN (China)
- Prior art keywords: task, tasks, workflow, log, predecessor
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a workflow mining method based on predecessor tasks, which analyzes the predecessor tasks recorded in the event log of a workflow by analyzing the tasks in that log. The method takes an event log as input and outputs a workflow model described by a Petri net. The method uses an event log based on predecessor tasks: the predecessor tasks of a task are the set of completed tasks on which the current task relies before it executes; they are the input of the current task, and the event log records this information. The event log of predecessor tasks is formally defined as follows: T is the task set, T* denotes the task sequences of n tasks over T, and E = [θ]T is the event set built on the task set T; a predecessor task sequence is written σ ∈ E*, and the event log of predecessor tasks is written W ⊆ E*. The invention proposes a novel mining method in theory and implements working tools on both the Activiti platform and the ProM platform.
Description
Technical Field
The invention belongs to the technical field of workflows, in particular to a workflow mining technology in the technical field of workflows, and discloses a technology for mining a workflow process model from a workflow log.
Background
A workflow process is defined as a whole or partial business process that transfers files, information or activities from one participant to another according to a series of procedures or rules. A workflow system is an automated system for centrally managing workflows. Most information systems now use a defined workflow model to describe task relationships and maintain the entire business process. However, as more business processes and more complex single business processes are performed, the workflow model inevitably has the problems of low efficiency and even errors. There is a need for monitoring and improving business processes, and these needs all require obtaining the true behavior of the workflow model.
Workflow mining techniques aim to solve the above problems. Workflow mining analyzes the large amount of data accumulated during workflow execution to recover how people and workflow processes actually behave, and so supports later monitoring and analysis of the workflow model. The practical role of workflow mining is shown in the dashed box in Fig. 2. Workflow mining derives the corresponding workflow process model backwards by analyzing the event log. The invention considers only the case where the event log information is complete and free of noise; incomplete or erroneous event log information is not considered.
Workflow mining is a technique of reversely deducing a process model from an event log (execution sequence), and then expressing the relationship between tasks in a certain way (currently, a Petri net is generally used for describing the whole workflow model). Workflow mining is therefore a technical problem, the key of which is how to extrapolate process models from event logs backwards. An event log is a collection of event traces, each trace consisting of a plurality of events. The workflow mining technology analyzes event logs and calculates the relationships among tasks, mainly including causal relationships, selection relationships and concurrency relationships, and then reversely deduces a process model according to the relationships, wherein the Petri network is only one means for realizing the technology.
At present, the workflow net is a popular modeling method in the field of workflow process modeling. A workflow net is a special Petri net that can clearly describe the sequence, selection, loop, concurrency and synchronization structures of a process model. For describing process models it offers formal semantics, an intuitive graphical representation, ease of understanding, a solid mathematical foundation and mature analysis techniques, so the Petri net is a mature and popular process modeling tool. Structurally, a Petri net is a triple PN = (P, T, F), where P is a set of places (library sites), T is a set of transitions, P ∩ T = ∅, and F ⊆ (P × T) ∪ (T × P) is the set of arcs between library sites and transitions. For a node x ∈ P ∪ T, •x = {y ∈ P ∪ T | (y, x) ∈ F} denotes the preset of x and x• = {y ∈ P ∪ T | (x, y) ∈ F} denotes its postset.
Compared with an ordinary Petri net, a workflow net satisfies two special conditions. First, the workflow net has two special library sites (places), called the initial library site i and the end library site o; the initial library site has no input and the end library site has no output. Second, if an auxiliary transition t* is added between library site o and library site i, the resulting extended model PN' = (P, T ∪ {t*}, F ∪ {(o, t*), (t*, i)}) is strongly connected.
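The two conditions can be checked mechanically. The following is a minimal Python sketch, not part of the patent: the function names, the node labels "i", "o" and "t*", and the representation of arcs as pairs of strings are all assumptions made for illustration.

```python
def reachable(start, edges):
    """Return every node reachable from start by following edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_workflow_net(places, transitions, arcs, i="i", o="o"):
    """Check the two workflow-net conditions on a Petri net (P, T, F)."""
    arcs = set(arcs)
    # Condition 1: the initial place has no input, the end place has no output.
    if any(y == i for _, y in arcs) or any(x == o for x, _ in arcs):
        return False
    # Condition 2: add the auxiliary transition t* with arcs (o, t*) and (t*, i),
    # then require the extended net to be strongly connected.
    aux = "t*"
    extended = arcs | {(o, aux), (aux, i)}
    nodes = set(places) | set(transitions) | {aux}
    forward, backward = {}, {}
    for x, y in extended:
        forward.setdefault(x, set()).add(y)
        backward.setdefault(y, set()).add(x)
    # Strongly connected: every node is reachable from i and reaches i.
    return reachable(i, forward) >= nodes and reachable(i, backward) >= nodes


P = {"i", "p", "o"}
T = {"a", "b"}
F = {("i", "a"), ("a", "p"), ("p", "b"), ("b", "o")}
print(is_workflow_net(P, T, F))
```

A transition that consumes from a place but never leads to o breaks strong connectivity, so adding such a dangling transition makes the check fail.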
According to Petri net theory, a task (represented by a transition) in a workflow net is executable when every input library site of the corresponding transition holds a token; this is called the firing condition, and the transition is then said to be enabled. The firing rule for a task (transition) is: one token is removed from each input library site of the firing transition, and one token is added to each of its output library sites. Correspondingly, in a workflow system a task executes in three steps: check the preconditions, execute the task, and set the postconditions. The preconditions are the conditions under which a task can execute; a task can execute only when all of them hold. Postconditions refer to the wrap-up processing a task performs after its work is complete and before it ends; this may signal completion of the whole process and may also set preconditions for successor tasks.
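The firing rule described above can be illustrated with a short Python sketch. The marking is held in a `collections.Counter`; the names `enabled`, `fire`, `PRE` and `POST` and the tiny two-transition net are hypothetical and serve only as an example.

```python
from collections import Counter


def enabled(marking, transition, pre):
    """A transition is enabled when every input place holds a token."""
    return all(marking[p] >= 1 for p in pre[transition])


def fire(marking, transition, pre, post):
    """Remove one token from each input place, add one to each output place."""
    if not enabled(marking, transition, pre):
        raise ValueError(f"{transition} is not enabled")
    new = Counter(marking)
    for p in pre[transition]:
        new[p] -= 1
    for p in post[transition]:
        new[p] += 1
    return +new  # unary plus drops places whose count fell to zero


# Hypothetical net: i -> a -> p -> b -> o
PRE = {"a": {"i"}, "b": {"p"}}
POST = {"a": {"p"}, "b": {"o"}}
m0 = Counter({"i": 1})
m1 = fire(m0, "a", PRE, POST)
m2 = fire(m1, "b", PRE, POST)
print(m1, m2)
```

Returning a new marking instead of mutating the old one keeps each state of the firing sequence available for inspection.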
The workflow engine executes the workflow by parsing the "workflow definition". The workflow process definition describes activities and the relationships between activities in XML. As shown in Fig. 3, Workflow Process defines the flow information of a workflow and Activities defines the set of activities of the flow. Each Activity element in Activities defines a single activity, and each activity is uniquely identified by an Id. Transitions defines the transitions between activities: each Transition element inside it represents the transition between two activities, where the From attribute gives the start activity of the transition and the To attribute gives its end activity. Thus, through the From and To information in Transition, the workflow engine, while analyzing the process definition and deciding the execution of a task, can record all predecessor tasks that the current task depends on and that have already ended.
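As a rough illustration of how predecessor candidates follow from the From and To attributes, the Python sketch below parses a small definition with the standard library. The exact element layout (`WorkflowProcess`, `Activities`, `Transitions`) and the sample content are assumptions, since Fig. 3 is not reproduced here; a real engine would additionally record which of these predecessors actually completed at run time.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition; element and attribute spelling is assumed.
SAMPLE = """
<WorkflowProcess Id="demo">
  <Activities>
    <Activity Id="A"/>
    <Activity Id="B"/>
    <Activity Id="C"/>
  </Activities>
  <Transitions>
    <Transition From="A" To="C"/>
    <Transition From="B" To="C"/>
  </Transitions>
</WorkflowProcess>
"""


def predecessors(xml_text):
    """Map each activity Id to the set of activities with a transition into it."""
    root = ET.fromstring(xml_text.strip())
    preds = {act.get("Id"): set() for act in root.iter("Activity")}
    for tr in root.iter("Transition"):
        preds.setdefault(tr.get("To"), set()).add(tr.get("From"))
    return preds


PREDS = predecessors(SAMPLE)
print(PREDS)
```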
Workflow mining is a technology for deriving a process model backwards from an event log (execution sequences). If the derived process model is described by a workflow net, workflow mining is essentially the construction of a workflow net from the event log. In the triple structure PN = (P, T, F) of the workflow net, the transition set consists directly of the task set in the workflow log, so mining reduces to finding the library set and the arcs connecting the library set and the transition set. This reverse construction requires analyzing the relationships between tasks. The existing α method, α+ method, α++ method and β method are all designed on this idea.
At present, workflow mining methods based on event logs mainly comprise the α method, the α+ method, the α++ method, the β method and the χ method. In the logs of the α, α+ and α++ methods, events are simply task names, whereas events in the log of the β method contain the start and end information of a task. The α method can only handle process models constrained by the SWF net structure and cannot handle short-loop structures, implicit causal dependency structures or implicit library structures. The α+ method extends the mining capability of the α method and can mine short-loop structures. The α++ method further extends the mining capability of the α method and can mine most implicit causal dependency structures. The β method introduces a new event type and can mine process models that satisfy the structural constraints of the SWF net as well as short-loop structures, but it cannot handle implicit causal dependency structures or implicit library structures.
Although most known workflow mining methods consider event attributes such as timestamps and operators, existing workflow mining methods mine the causal and concurrency relationships between tasks by analyzing which tasks are adjacent in the event log, and from these mine the selection relationships between tasks. The α++ method has the strongest mining capability: it can mine SWF structures, short-loop structures and most implicit causal dependency structures, but it cannot handle implicit library structures. Moreover, when the α++ method mines implicit causal dependency structures, it needs a complex logical analysis of task relationships, which greatly increases the complexity of the method. The χ method has significant advantages in the range of mineable structures and in the performance of the mining method, but it is inferior to the method proposed in this patent in the range of mineable logs and in logging performance.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a workflow process mining method that mines event logs containing predecessor task information (the already completed tasks or resources on which the execution of the current task depends). The method extends the mining range of workflow mining methods, simplifies the mining of causal dependency relationships and potential concurrency relationships in the workflow model, and can mine logs whose information is incomplete (called precursor complete logs in this patent).
The technical scheme of the invention is as follows: a workflow mining method based on predecessor tasks analyzes the predecessor tasks recorded in the event log of a workflow by analyzing the tasks in that log. It takes an event log as input and outputs a workflow model described by a Petri net. The method uses an event log based on predecessor tasks: the predecessor tasks of a task are the set of completed tasks on which the current task relies before it executes; they are the input of the current task, and the event log records this information. The event log of predecessor tasks is formally defined as follows: T is the task set, T* denotes the task sequences of n tasks over T, and E = [θ]T is the event set built on the task set T; a predecessor task sequence is written σ ∈ E*, and the event log of predecessor tasks is written W ⊆ E*.
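One possible in-memory form of such a log is sketched below in Python. The `Event` type, the helper `ev`, the function `task_sets` and the two sample traces are all hypothetical; the only thing taken from the text is that an event [θ]t pairs a task t with its predecessor set θ and that the log is a collection of event sequences.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple


class Event(NamedTuple):
    """An event [θ]t: task t together with its predecessor set θ."""
    preds: FrozenSet[str]
    task: str


def ev(task: str, *preds: str) -> Event:
    return Event(frozenset(preds), task)


# Hypothetical log W: a splits into concurrent b and c, which join in d.
LOG: List[List[Event]] = [
    [ev("a"), ev("b", "a"), ev("c", "a"), ev("d", "b", "c")],
    [ev("a"), ev("c", "a"), ev("b", "a"), ev("d", "b", "c")],
]


def task_sets(log: List[List[Event]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return the task set, the start tasks and the end tasks of a log."""
    tasks = {e.task for trace in log for e in trace}
    starts = {trace[0].task for trace in log if trace}
    ends = {trace[-1].task for trace in log if trace}
    return tasks, starts, ends


T_W, T_I, T_O = task_sets(LOG)
print(T_W, T_I, T_O)
```

Using a frozen set for θ keeps events hashable, which is convenient when relations between tasks are later collected into sets.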
The overall flow of the excavation method is shown in fig. 1, and includes the following steps (shown in fig. 4):
(1) Initialize the return value N of the process, a workflow model described by a Petri net; by the structural definition of a Petri net, N consists of the library set P_W, the task set T_W and the arc set F_W;
(2) Set the initial value of the workflow process model to be mined to N = (P_W, T_W, F_W) with P_W = T_W = F_W = ∅; analyze the event log W and compute the task set T_W, the start tasks T_I and the end tasks T_O;
(3) Extract the single-step loops and store them in a hash table HT; preprocess the task set T_W;
(4) Compute the task relation set X_W from the relationships between tasks;
(5) Remove the redundant elements of the task relation set X_W to obtain the final task relation set Y_W;
(6) Compute the library set P_W from Y_W;
(7) Compute the arc set F_W from Y_W and P_W;
(8) Insert the single-step loops stored in HT into F_W;
(9) Return the workflow process model N described by the Petri net;
(10) Present a flow diagram from the process model N using a tool.
The workflow mining method based on predecessor tasks is a complete workflow mining method comprising the 10 steps listed above (Fig. 1). The claims focus on the steps unique to this method, namely step (3) and its sub-steps, step (4) and its sub-steps, and step (8):
(3) Extract the single-step loops, store them in HT, and preprocess the task set T_W;
(4) Compute the task relation set X_W from the relationships between tasks;
(8) Insert the single-step loops stored in the hash table HT into F_W.
Step (3) is refined into the following sub-steps:
(3-1) Construct the set St of task pairs formed from the tasks in the task set T_W;
(3-2) Define a hash table HT for storing the single-step loops;
(3-3) Traverse all task pairs in St and find the task pairs that are respectively started and ended by some task a; insert them into HT in the form HT{(task pair)} => {a}, and at the same time remove task a from T_W.
Step (4) is refined into the following sub-steps:
(4-1) From the task set T_W, construct the set X_A of all task relations;
(4-2) Filter X_A using the causal dependency relationship to obtain X_B;
(4-3) Filter X_B using the non-causal dependency, potential concurrency and relaxed potential selection relationships to obtain X_C;
(4-4) Filter X_C using the strict selection relationship to obtain the final relation set X_W.
The workflow mining method based on predecessor tasks defines a series of relationships between tasks that differ from those of other methods. These relationships are used in the steps of claim 1. They are causal dependency, non-causal dependency, potential concurrency, non-potential concurrency, relaxed potential selection and strict potential selection. Their definitions are as follows:
(A) Causal dependency (written a →_W b): if there is an event [θ]b ∈ σ with σ ∈ W and a ∈ θ, then task b depends on task a;
(B) Non-causal dependency (written ¬(a →_W b)): the negation of causal dependency, i.e. tasks a and b do not satisfy a →_W b;
(C) Potential concurrency (written a //_W b): tasks a and b satisfy either of the following two conditions:
(C-1) there is an event [θ]t ∈ σ with a ∈ θ and b ∈ θ; or
(C-2) in some log sequence σ = t1 t2 t3 … tn there are two events [θ1]a ∈ σ and [θ2]b ∈ σ such that θ1 ∩ θ2 ≠ ∅ and b immediately follows a.
(D) Non-potential concurrency (written ¬(a //_W b)): tasks a and b do not satisfy a //_W b;
(E) Relaxed potential selection (written a #_L b): tasks a and b never occur together in the same log sequence;
(F) Strict potential selection (written a #_S b): tasks a and b satisfy both of the following conditions:
(F-1) tasks a and b satisfy the relaxed selection condition, i.e. a #_L b; and
(F-2) tasks a and b occur in two log sequences σ_i and σ_j respectively, and the intersection of the predecessor sets of the two tasks is empty, i.e. θ_a ∩ θ_b = ∅; that is, the two tasks have no common predecessor task.
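Relations (A), (C) and (E) can be read off a log directly, as the Python sketch below suggests. It is an illustration under stated assumptions rather than the patented procedure: events are modelled as (predecessor set, task) pairs, the function names and the sample log are hypothetical, and the overlap test in (C-2) is read as θ1 ∩ θ2 ≠ ∅, following condition (4) of the precursor complete log definition. Relation (F) is left out because the text does not say which predecessor sets to compare when a task occurs more than once.

```python
def causal(log, a, b):
    """(A): some event [θ]b in the log has a in θ."""
    return any(t == b and a in preds for trace in log for preds, t in trace)


def potentially_concurrent(log, a, b):
    """(C): a and b are both predecessors of one event (C-1), or b directly
    follows a and their predecessor sets overlap (C-2, assumed reading)."""
    for trace in log:
        if any(a in preds and b in preds for preds, _ in trace):
            return True
        for (p1, t1), (p2, t2) in zip(trace, trace[1:]):
            if t1 == a and t2 == b and p1 & p2:
                return True
    return False


def relaxed_selection(log, a, b):
    """(E): a and b never occur together in the same log sequence."""
    return not any(
        a in names and b in names
        for names in ({t for _, t in trace} for trace in log)
    )


def fs(*xs):
    return frozenset(xs)


# Hypothetical log: a splits into concurrent b and c (joined by d), or a goes to e.
LOG = [
    [(fs(), "a"), (fs("a"), "b"), (fs("a"), "c"), (fs("b", "c"), "d")],
    [(fs(), "a"), (fs("a"), "e"), (fs("e"), "d")],
]
print(causal(LOG, "a", "b"), potentially_concurrent(LOG, "b", "c"),
      relaxed_selection(LOG, "b", "e"))
```

Relations (B) and (D) are simply the negations of `causal` and `potentially_concurrent`.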
The workflow mining method based on the predecessor tasks can mine incomplete logs, so that the capability is remarkably improved compared with other methods in the field. The log range which can be mined by the method is named as a precursor complete log, and a formal definition of the precursor complete log is given. The formalization is defined as follows:
(1) N = (P, T, F) is a reasonable (sound) SWF structure and W is a workflow log of N, in which each log sequence σ ∈ W is a firing sequence that starts from the initial state [i] and ends in the end state [o]. W is called a precursor complete log if and only if W simultaneously satisfies conditions (2), (3) and (4) below;
(2) For any task t ∈ N there is a log sequence σ ∈ W that contains an event [θ]t ∈ σ; that is, each task occurs at least once in the log;
(3) For any two tasks a and b, if in the actual workflow model N tasks a and b are alternatives following some library site p, then in the workflow log W tasks a and b must satisfy the selection relationship, i.e. a #_S b;
(4) For any two tasks a and b, if in the actual workflow model N tasks a and b are in a concurrency relationship immediately following some task t_i, then the workflow log W must contain a log sequence σ = t1 t2 t3 … tn ∈ W with events [θ1]t_k ∈ σ and [θ2]t_(k+1) ∈ σ such that θ1 ∩ θ2 ≠ ∅, t_k = a and t_(k+1) = b; that is, tasks a and b must occur adjacently at least once in some log sequence.
In the method, step (3) (including its sub-steps) uses the log pattern produced by single-step loops to extract the tasks of those loops, so that the log to be mined no longer contains single-step loops. After the rest of the workflow has been mined, the tasks of the single-step loops are inserted back into the Petri net, so that they are mined correctly.
In step (4) of the method, the task relationships in the log are preprocessed, and the relationships among all tasks in the log are computed. The preprocessing of the relationships between tasks in the log (shown in Fig. 5) covers causal dependency, non-causal dependency, potential concurrency, non-potential concurrency, relaxed potential selection and strict potential selection.
In accordance with the present disclosure, we have developed an "Activiti platform based predecessor task recording tool" and a "ProM platform based predecessor task mining tool". The former can be deployed in an enterprise workflow management system that uses Activiti as its workflow engine, where it records the execution log of the tasks in the system in predecessor-task form. The latter mines a process model from the input log and displays it visually as a flow chart drawn with Petri net graphic elements. (Note: Activiti is an open source workflow engine whose official website is http://www.activiti.org/; ProM is an open source software project in the field of workflow mining whose official website is http://www.processmining.org/)
The invention has the following beneficial effects: the method not only improves the mining capability of workflow mining (implicit library structures can be mined, and incomplete logs, namely precursor complete logs, can be mined), but also simplifies the mining of causal dependency relationships and potential concurrency relationships. No current process mining method focuses on the implicit library structure, because an implicit library does not affect the behavior of the workflow model. However, an implicit library reveals redundant relationships between tasks, which may raise performance and safety concerns to some extent. The method focuses on the implicit library structure and can mine part of such structures, so that it provides better support for the analysis, verification and monitoring of a workflow model. It also provides support not offered by other methods for the incomplete logs generated in production by complex workflow systems that cannot fully cover their concurrent branches.
Drawings
FIG. 1 is a flow diagram of a workflow mining method based on predecessor tasks.
FIG. 2 shows the role of the workflow mining technique in a workflow management system.
FIG. 3 is an example of a workflow process definition.
FIG. 4 is a main flow of a workflow mining method based on predecessor tasks.
FIG. 5 is a method for preprocessing relationships between tasks.
FIG. 6 is an example of a process of academic dissertation management.
FIG. 7 is a workflow model that includes an implicit library that can be mined according to an embodiment of the invention.
FIG. 8 is a flowchart of a development application tool according to example log information.
FIG. 9 is an architectural diagram of the Activiti platform based predecessor task recording tool.
Fig. 10 is a system design diagram of a precursor task mining tool based on a ProM platform.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. The description is intended to be exemplary only, and is not intended to limit the scope of the invention. Moreover, in the following description, descriptions of existing structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
The invention mainly uses a new event type, obtains all the relationships between the tasks in the log through inter-task relationship preprocessing, and adds a correction step for the task relation set on the basis of the α method. The whole process of the mining method is shown in Fig. 1, and the specific implementation is as follows:
1. the main flow of the method is shown in the upper half of fig. 4.
(1) Step 1: initialize the return value N of the process (a workflow model described by a Petri net); by the structural definition of a Petri net, N consists of the library set P_W, the task set T_W and the arc set F_W;
(2) Step 2: analyze the event log to compute the task set T_W (all the differently named tasks contained in the log), and the start task set T_I and end task set T_O over the execution traces σ;
(3) Step 3: extract the single-step loops and store them in a hash table HT; preprocess the task set T_W;
(4) Step 4: compute the task relation set X_W from the relationships between tasks;
(5) Step 5: delete the redundant elements of X'_W to obtain the final task relation set Y_W;
(6) Step 6: compute the library set P_W of the workflow model, whose elements are the library sites formed from the elements of Y_W together with a starting library site and an ending library site;
(7) Step 7: obtain the arc set F_W of the workflow model from P_W and Y_W;
(8) Step 8: insert the single-step loops stored in HT into F_W;
(9) Step 9: return the workflow model N;
(10) Last step: present a flow chart from the process model N using a tool.
Step 3 of the method extracts the single-step loops as follows:
(3-1) Step 1: construct the set St of task pairs formed pairwise from all the tasks in the task set T_W;
(3-2) Step 2: define a hash table HT for storing the single-step loops;
(3-3) Step 3: traverse all task pairs in St and find the task pairs that are respectively started and ended by some task a; insert them into HT in the form HT{(task pair)} => {a}, and at the same time remove task a from T_W.
Step 4 of the method computes the task relation set X_W. The process applies the inter-task relationship preprocessing method shown in Fig. 5, with the following specific steps:
(4-1) Step 1: from the task set T_W, construct the set X_A of all task relations.
(4-2) Step 2: filter X_A using the causal dependency relationship to obtain X_B.
(4-3) Step 3: filter X_B using the non-causal dependency, potential concurrency and relaxed potential selection relationships to obtain X_C.
(4-4) Step 4: filter X_C using the strict selection relationship to obtain the final relation set X_W.
The practice of the invention is illustrated by the following specific examples.
In this example, the invention mines the workflow model of Fig. 6, which consists of 11 library sites and 12 transitions, from the event log. The workflow model describes the management process of a graduate thesis and mainly covers the internal and external review of the thesis, approval of the thesis defense, the review of the thesis and similar steps. To facilitate the analysis, the Chinese task names of the example are mapped to mathematical symbols, as shown in Table 1. Table 2 shows the event log of the thesis management process example, which is used as the input data of this example.
TABLE 1 comparison table of task numbers and Chinese names
Table 2 event Log of academic thesis management Process example
For this example, we will implement the method using the following steps:
1. Initialize the return value N (a workflow model described by a Petri net; by the Petri net structure definition, N consists of the library set P_W, the task set T_W and the arc set F_W) such that P_W = T_W = F_W = ∅.
2. Obtain from the event log the task set T_W = {t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12}, the start task set T_I = {t1} and the end task set T_O = {t12}.
3. Extract the single-step loops, store them in the hash table HT and preprocess the task set T_W, with the following specific steps:
(1) Construct the set St of task pairs formed pairwise from all tasks in the task set: St = {(t1,t2),(t1,t3),(t1,t4),…,(t11,t12)}.
(2) Define an empty hash table HT = {}.
(3) Traverse all task pairs in St to find the task pairs that are respectively started and ended by some task. Here t4 and t7 are found, so after this step HT = {{t3,t5} => {t4}, {t3,t6} => {t7}} and T_W = {t1,t2,t3,t5,t6,t8,t9,t10,t11,t12}.
4. Compute the task relation set X_W from the relationships between tasks, with the following specific steps:
(1) From the task set T_W, construct the set of all task relations:
X_A = {({t1},{t2}),({t1},{t3}),…,({t1,t2},{t3}),…,({t11},{t12})}.
(2) Filter X_A using the causal dependency relationship to obtain:
X_B = {({t1},{t2}),({t3},{t5}),…,({t1,t2},{t3,t5}),…,({t11},{t12})}.
(3) Filter X_B using the non-causal dependency, potential concurrency and relaxed potential selection relationships to obtain:
X_C = {({t1},{t2}),({t11},{t2}),({t3,t8},{t5})…,({t1,t2},{t3}),…,({t11},{t12})}.
(4) Filter X_C using the strict selection relationship to obtain the final relation set X_W = {({t1},{t2}),({t11},{t2}),({t1,t11},{t2}),({t2},{t3}),({t3},{t5}),({t3},{t6}),({t5},{t8}),({t6},{t8}),({t8},{t9}),({t9},{t11}),({t9},{t10}),({t9},{t10,t11}),({t10},{t12})}.
5. According to step 5 of the main flow, delete the redundant elements of the corrected task relation set X'_W to obtain the final task relation set Y_W = {({t1,t11},{t2}),({t2},{t3}),({t3},{t5}),({t3},{t6}),({t5},{t8}),({t6},{t8}),({t8},{t9}),({t9},{t10,t11}),({t10},{t12})}.
6. According to step 6 of the main flow and the final task relation set Y_W, obtain the library set P_W = {i_w, o_w, p({t1,t11},{t2}), p({t2},{t3}), p({t3},{t5}), p({t3},{t6}), p({t5},{t8}), p({t6},{t8}), p({t8},{t9}), p({t9},{t10,t11}), p({t10},{t12})}, where i_w and o_w are the starting library site and the ending library site respectively.
7. According to step 7 of the main flow, apply the library set P_W and the task relation set Y_W to obtain the arc set F_W = {(i_w,t1),(t1,p({t1,t11},{t2})),(p({t1,t11},{t2}),t2),…,(t12,o_w)}.
8. Insert the single-step loops into F_W to obtain the complete arc set: {(i_w,t1),(t1,p({t1,t11},{t2})),(p({t1,t11},{t2}),t2),(t4,p({t3},{t5})),(p({t3},{t5}),t4),(t7,p({t3},{t6})),(p({t3},{t6}),t7),…,(t12,o_w)}.
9. At this point the method has obtained the complete workflow model described by the Petri net, N = (P_W, T_W, F_W).
10. From the process model N, a flow chart is presented using tools, as shown in FIG. 8.
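The redundancy removal in item 5 of the example can be reproduced with a short Python sketch. It assumes that a pair (A, B) is redundant when another pair (C, D) in the set has A ⊆ C and B ⊆ D, as in the α method on which the invention builds; the patent text does not spell this rule out. The helper names are hypothetical, and the input lists X_W of the example with ({t3},{t6}) included because that pair appears in Y_W.

```python
def fs(*nums):
    """Build a frozen task set such as {t1, t11} from task numbers."""
    return frozenset(f"t{n}" for n in nums)


X_W = {
    (fs(1), fs(2)), (fs(11), fs(2)), (fs(1, 11), fs(2)),
    (fs(2), fs(3)), (fs(3), fs(5)), (fs(3), fs(6)),
    (fs(5), fs(8)), (fs(6), fs(8)), (fs(8), fs(9)),
    (fs(9), fs(11)), (fs(9), fs(10)), (fs(9), fs(10, 11)),
    (fs(10), fs(12)),
}


def maximal_pairs(pairs):
    """Keep only pairs not contained component-wise in a different pair."""
    return {
        (a, b)
        for (a, b) in pairs
        if not any((a, b) != (c, d) and a <= c and b <= d for (c, d) in pairs)
    }


Y_W = maximal_pairs(X_W)
for a, b in sorted(Y_W, key=lambda ab: (sorted(ab[0]), sorted(ab[1]))):
    print(sorted(a), "->", sorted(b))
```

Under this assumption the four single-task pairs covered by ({t1,t11},{t2}) and ({t9},{t10,t11}) are dropped, which leaves the nine pairs listed for Y_W.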
The workflow model N is obtained through the above steps, and its graphical form can be produced with a Petri net drawing tool. The model contains an SWF structure, short-loop structures and even an implicit causal dependency structure, and the method mines all of them accurately. The method can also mine the workflow model shown in FIG. 7, which contains the implicit library structure P1.
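Items 6 to 8 of the example, which build the library set, the arc set and the single-step loop arcs, can be sketched in Python as follows. The construction follows the sets printed in the example; the function names, the tuple encoding of a library site p(A, B) and the keying of HT by a (pre-task, post-task) pair are assumptions made for illustration.

```python
def fs(*nums):
    """Build a frozen task set such as {t1, t11} from task numbers."""
    return frozenset(f"t{n}" for n in nums)


Y_W = [
    (fs(1, 11), fs(2)), (fs(2), fs(3)), (fs(3), fs(5)), (fs(3), fs(6)),
    (fs(5), fs(8)), (fs(6), fs(8)), (fs(8), fs(9)), (fs(9), fs(10, 11)),
    (fs(10), fs(12)),
]
T_I, T_O = {"t1"}, {"t12"}
# Single-step loops found in step 3: (pre-task, post-task) => loop task.
HT = {("t3", "t5"): "t4", ("t3", "t6"): "t7"}


def place(a, b):
    """Encode the library site p(A, B) as a hashable tuple."""
    return ("p", a, b)


def build_net(y_w, t_i, t_o, ht):
    """Return (P_W, F_W) including the arcs of the single-step loops."""
    places = {"i_w", "o_w"} | {place(a, b) for a, b in y_w}
    arcs = set()
    for a, b in y_w:
        p = place(a, b)
        arcs |= {(t, p) for t in a}  # tasks in A feed p(A, B)
        arcs |= {(p, t) for t in b}  # p(A, B) feeds tasks in B
    arcs |= {("i_w", t) for t in t_i}
    arcs |= {(t, "o_w") for t in t_o}
    for (pre, post), loop_task in ht.items():
        for a, b in y_w:
            if pre in a and post in b:
                p = place(a, b)
                arcs |= {(loop_task, p), (p, loop_task)}
    return places, arcs


P_W, F_W = build_net(Y_W, T_I, T_O, HT)
print(len(P_W), len(F_W))
```

With the example data this yields 11 library sites, matching the size of the model in Fig. 6, and the two loop tasks t4 and t7 each gain one arc into and one arc out of their library site.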
We have implemented the predecessor task-based workflow mining method as two tools: a predecessor task recording tool built on the Activiti platform and a predecessor task mining tool built on the ProM platform. Activiti is a workflow engine widely used in enterprise workflow management systems; implementing the log recording tool on it lets the method be applied to real enterprise workflow management and supplies the method with input of production value. ProM is a framework for workflow process mining and research that is widely used in both industry and academia and offers many tools for workflow process analysis. Implementing the mining tool on ProM gives the method a concrete form and allows its output to be presented visually.
We applied the Activiti-based predecessor task recording tool to an academic thesis management system (whose core workflow model is shown in FIG. 6) and obtained the predecessor task log of the workflow (as shown in Table 2); the log is stored on disk as a text document. The ProM-based predecessor task mining tool was then run on this log to mine the corresponding model. A screenshot of the run is shown in FIG. 8; the mined result is identical to the model in FIG. 6.
As part of the specific implementation of the predecessor task-based workflow mining method, the Activiti-based predecessor task recording tool and the ProM-based predecessor task mining tool enhance and supplement this patent. The architectural designs of the two tools are therefore given in FIGS. 9 and 10, respectively, to strengthen the description of the rights claimed by the present patent.
It should be noted that the above-mentioned embodiments of the present invention are only used for illustrating or explaining the principle of the present invention, and do not constitute a limitation to the present invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.
Claims (4)
1. A workflow mining method based on predecessor tasks, characterized in that the method analyzes the tasks recorded in the event log of a workflow together with their predecessor tasks; the method takes an event log as input and outputs a workflow model described by a Petri net; the method uses an event log based on predecessor tasks, wherein the predecessor tasks of a task are the set of tasks that must be completed before the current task is executed and on which the current task depends, and they serve as input to the current task, that is, the event log contains predecessor task information; the event log of predecessor tasks is formally defined as follows: $T$ is a task set, and a task sequence comprises $n$ tasks of $T$; $E = [\theta]T$ is the set of events over the task set $T$; a predecessor task sequence is denoted $\sigma \in E$, and the event log of predecessor tasks is denoted $W_E$;
the whole process comprises the following steps:
(1) initializing the return value $N$ of the process, a workflow model described by a Petri net; by the structural definition of a Petri net, $N$ consists of the place set $P_W$, the task set $T_W$ and the arc set $F_W$;
(2) analyzing the event log $W$ and computing the task set $T_W$, the start task $T_I$ and the end task $T_O$; setting the initial value $N = (P_W, T_W, F_W)$ of the workflow process model to be mined, where $P_W = T_W = F_W = \emptyset$;
(3) extracting the single-step loops to obtain a hash table HT; preprocessing the task set $T_W$;
(4) computing the task relation set $X_W$ from the relations between tasks;
(5) removing the redundant elements of the task relation set $X_W$ to compute the final task relation set $Y_W$;
(6) computing the place set $P_W$ from $Y_W$;
(7) computing the arc set $F_W$ from $Y_W$ and $P_W$;
(8) inserting the single-step loops stored in HT into $F_W$;
(9) returning the workflow process model $N$ described by the Petri net;
(10) presenting a flow diagram of the process model $N$ using a tool.
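Step (2) can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: an event $[\theta]t$ is modeled as a tuple (predecessor set, task), the start and end tasks are taken to be the first and last task of each log sequence (as in the α-algorithm), and the two-sequence log is hypothetical rather than taken from the patent's Table 2.

```python
def ev(preds, task):
    """Build an event [theta]t as (frozenset of predecessor tasks, task)."""
    return (frozenset(preds), task)

def analyze_log(log):
    """Return (T_W, T_I, T_O) for a log given as a list of event sequences.

    Assumption: start tasks open a sequence and end tasks close one.
    """
    tasks = {t for trace in log for (_, t) in trace}
    starts = {trace[0][1] for trace in log if trace}
    ends = {trace[-1][1] for trace in log if trace}
    return tasks, starts, ends

# Hypothetical log: task a, then b and c in either order, then d.
LOG = [
    [ev([], "a"), ev(["a"], "b"), ev(["a"], "c"), ev(["b", "c"], "d")],
    [ev([], "a"), ev(["a"], "c"), ev(["a"], "b"), ev(["b", "c"], "d")],
]

T_W, T_I, T_O = analyze_log(LOG)
print(sorted(T_W), sorted(T_I), sorted(T_O))
```

Storing the predecessor set with each event is what distinguishes this log format from a plain task sequence: the dependency of d on both b and c is recorded directly rather than inferred from ordering.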
2. The predecessor task-based workflow mining method of claim 1,
characterized in that step (3) specifically comprises the following steps:
(3-1) constructing the set $S_t$ of task pairs, each formed by two tasks of the task set $T_W$ to be preprocessed;
(3-2) defining a hash table HT for storing the single-step loops;
(3-3) traversing the task pairs in $S_t$ to find the single-step loops, inserting each into HT in the form HT{(task pair)} = {a}, and at the same time removing the task a from $T_W$;
step (4) is refined into the following steps:
(4-1) constructing the set $X_A$ of all task relations from the task set $T_W$;
(4-2) filtering $X_A$ using causal dependency to obtain $X_B$;
(4-3) filtering $X_B$ using non-causal dependency, potential concurrency and loose potential selection to obtain $X_C$;
(4-4) filtering $X_C$ using the strict selection relation to obtain the final relation set $X_W$.
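Steps (3-1) to (3-3) might look like the following Python sketch. The claim does not state how a single-step loop is recognized, so the detection rule here is purely an assumption: a task a is treated as a single-step loop if some event $[\theta]a$ lists a among its own predecessors, and it is stored under the pair (x, y) where x precedes a and a precedes y. The function name and the sample log are hypothetical.

```python
from itertools import permutations

def ev(preds, task):
    """Build an event [theta]t as (frozenset of predecessor tasks, task)."""
    return (frozenset(preds), task)

def extract_single_step_loops(log, tasks):
    """Return (HT, preprocessed T_W) under the assumed detection rule."""
    events = [e for trace in log for e in trace]
    # Assumption: a is a single-step loop if it appears in its own theta.
    loops = {t for (theta, t) in events if t in theta}

    def preds(task):
        """All predecessors recorded for a task anywhere in the log."""
        return set().union(*(theta for (theta, u) in events if u == task))

    ht = {}                                               # (3-2) hash table
    s_t = permutations(sorted(set(tasks) - loops), 2)     # (3-1) task pairs
    for x, y in s_t:                                      # (3-3) traversal
        found = {a for a in loops if x in preds(a) and a in preds(y)}
        if found:
            ht[(x, y)] = found
    return ht, set(tasks) - loops

# Hypothetical log: t3, then t4 zero or more times, then t5.
LOG = [
    [ev([], "t3"), ev(["t3"], "t4"), ev(["t4"], "t4"), ev(["t4"], "t5")],
    [ev([], "t3"), ev(["t3"], "t5")],
]

HT, T_W = extract_single_step_loops(LOG, {"t3", "t4", "t5"})
print(HT, sorted(T_W))
```

Removing the loop tasks from $T_W$ before the relation sets are built keeps step (4) free of self-dependencies; the loops are reattached to their places only when the arcs are completed in step (8).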
3. The predecessor task-based workflow mining method of claim 1, wherein a series of relations between tasks is defined; the relations include causal dependency, non-causal dependency, potential concurrency, non-potential concurrency, loose potential selection and strict potential selection; these relations are defined as follows:
(A) causal dependency, denoted $a \rightarrow_w b$: if there is an event $[\theta]b \in \sigma$ with $\sigma \in W$ and $a \in \theta$, then task $b$ depends on task $a$;
(B) non-causal dependency: the negation of causal dependency, i.e. tasks $a$ and $b$ do not satisfy $a \rightarrow_w b$;
(C) potential concurrency, denoted $a \parallel_w b$: tasks $a$ and $b$ satisfy either of the following two conditions:
(C-1) there is an event $[\theta]t \in \sigma$ with $a \in \theta$ and $b \in \theta$; or
(C-2) in some log sequence $\sigma = t_1 t_2 t_3 \ldots t_n$ there are two events $[\theta_1]a \in \sigma$ and $[\theta_2]b \in \sigma$ such that $\theta_1 \cap \theta_2 \neq \emptyset$ and $b$ immediately follows $a$;
(D) non-potential concurrency: tasks $a$ and $b$ do not satisfy $a \parallel_w b$;
(E) loose potential selection, denoted $a \#_L b$: tasks $a$ and $b$ do not both occur in a certain log sequence;
(F) strict potential selection, denoted $a \#_S b$: tasks $a$ and $b$ satisfy both of the following conditions:
(F-1) tasks $a$ and $b$ satisfy the loose selection condition, i.e. $a \#_L b$; and
(F-2) tasks $a$ and $b$ occur in two log sequences $\sigma_i$ and $\sigma_j$ respectively, and the intersection of the predecessor sets of the two tasks is empty, i.e. $\theta_a \cap \theta_b = \emptyset$; that is, the two tasks have no common predecessor task.
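Relations (A) to (D) can be expressed as predicates over a predecessor-based log, as in the Python sketch below. It is illustrative only: the function names and sample logs are hypothetical, an event $[\theta]t$ is modeled as a tuple (predecessor set, task), and the condition on $\theta_1 \cap \theta_2$ in (C-2) is read as "non-empty", which is an assumption made to agree with condition (4) of claim 4. The selection relations (E) and (F) are left out.

```python
def ev(preds, task):
    """Build an event [theta]t as (frozenset of predecessor tasks, task)."""
    return (frozenset(preds), task)

def causal(log, a, b):
    """(A): some event [theta]b in the log has a in theta."""
    return any(t == b and a in theta for trace in log for (theta, t) in trace)

def non_causal(log, a, b):
    """(B): negation of (A)."""
    return not causal(log, a, b)

def potentially_concurrent(log, a, b):
    """(C): condition (C-1) or condition (C-2) holds."""
    for trace in log:
        # (C-1): a and b are both predecessors of the same event.
        if any(a in theta and b in theta for (theta, _) in trace):
            return True
        # (C-2): b directly follows a and their predecessor sets overlap
        # (assumed reading of the condition on theta_1 and theta_2).
        for (th1, t1), (th2, t2) in zip(trace, trace[1:]):
            if t1 == a and t2 == b and th1 & th2:
                return True
    return False

def not_potentially_concurrent(log, a, b):
    """(D): negation of (C)."""
    return not potentially_concurrent(log, a, b)

# Hypothetical logs: a, then b and c, then d; LOG_OPEN stops before d.
LOG = [[ev([], "a"), ev(["a"], "b"), ev(["a"], "c"), ev(["b", "c"], "d")]]
LOG_OPEN = [[ev([], "a"), ev(["a"], "b"), ev(["a"], "c")]]

print(causal(LOG, "a", "b"), potentially_concurrent(LOG, "b", "c"))
```

In `LOG` the pair (b, c) is detected through (C-1) because d records both as predecessors; in `LOG_OPEN` only (C-2) applies, and it is order-sensitive, so it holds for (b, c) but not for (c, b).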
4. The predecessor task-based workflow mining method of claim 1, wherein the class of logs that can be mined is named predecessor-complete logs, and a formal definition of a predecessor-complete log is given as follows:
(1) $N = (P, T, F)$ is a sound SWF structure and $W$ is a workflow log of $N$, in which every log sequence $\sigma \in W$ is a firing sequence of $N$ that starts in the initial state $[i]$ and terminates in the end state $[o]$; $W$ is called a predecessor-complete log if and only if it satisfies conditions (2), (3) and (4) below simultaneously;
(2) for any task $t \in N$ there is a log sequence $\sigma \in W$ containing an event $[\theta]t \in \sigma$; that is, every task occurs at least once in the log;
(3) for any two tasks $a$ and $b$, if in the actual workflow model $N$ the tasks $a$ and $b$ are alternatives following a certain place $p$, then in the workflow log $W$ they must satisfy strict potential selection, i.e. $a \#_S b$;
(4) for any two tasks $a$ and $b$, if in the actual workflow model $N$ the tasks $a$ and $b$ are concurrent immediately after a certain task $t_i$, then the workflow log $W$ must contain a log sequence $\sigma = t_1 t_2 \ldots t_n \in W$ with events $[\theta_1]t_k \in \sigma$ and $[\theta_2]t_{k+1} \in \sigma$ such that $\theta_1 \cap \theta_2 \neq \emptyset$, $t_k = a$ and $t_{k+1} = b$; that is, tasks $a$ and $b$ must occur adjacently once in a log sequence.
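Conditions (2) and (4) of this definition can be checked mechanically once the task set of the model and its concurrent task pairs are known. The Python sketch below does only that; condition (3) is not covered. The function names, the tuple representation of an event $[\theta]t$, and the sample log are all hypothetical.

```python
def ev(preds, task):
    """Build an event [theta]t as (frozenset of predecessor tasks, task)."""
    return (frozenset(preds), task)

def covers_all_tasks(log, tasks):
    """Condition (2): every task of the model occurs at least once."""
    seen = {t for trace in log for (_, t) in trace}
    return set(tasks) <= seen

def adjacent_with_shared_predecessor(log, a, b):
    """Condition (4): some sequence has [th1]a directly before [th2]b
    with th1 and th2 sharing at least one task."""
    return any(
        t1 == a and t2 == b and bool(th1 & th2)
        for trace in log
        for (th1, t1), (th2, t2) in zip(trace, trace[1:])
    )

def satisfies_conditions_2_and_4(log, tasks, concurrent_pairs):
    """Check (2) for all tasks and (4) for each given ordered pair (a, b)."""
    return covers_all_tasks(log, tasks) and all(
        adjacent_with_shared_predecessor(log, a, b)
        for (a, b) in concurrent_pairs
    )

# Hypothetical log of a model where b and c run concurrently after a.
LOG = [
    [ev([], "a"), ev(["a"], "b"), ev(["a"], "c"), ev(["b", "c"], "d")],
    [ev([], "a"), ev(["a"], "c"), ev(["a"], "b"), ev(["b", "c"], "d")],
]

print(satisfies_conditions_2_and_4(LOG, {"a", "b", "c", "d"}, [("b", "c")]))
```

Passing both orderings of a concurrent pair, for example `[("b", "c"), ("c", "b")]`, is a stricter check than the claim requires; with only the first sequence of `LOG` it would fail for ("c", "b").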
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510272608.4A CN104835015B (en) | 2015-05-25 | 2015-05-25 | Workflow mining method based on predecessor task |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104835015A (en) | 2015-08-12 |
CN104835015B (en) | 2019-01-22 |
Family
ID=53812889
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510272608.4A Active CN104835015B (en) | 2015-05-25 | 2015-05-25 | Workflow mining method based on predecessor task |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104835015B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105260188A (en) * | 2015-10-24 | 2016-01-20 | 北京航空航天大学 | Time characteristic model and modeling method thereof |
CN106779594A (en) * | 2016-12-01 | 2017-05-31 | 江苏鸿信系统集成有限公司 | A kind of Workflow management method based on Activiti |
CN108647253B (en) * | 2018-04-23 | 2022-09-06 | 南京理工大学 | Mining algorithm containing time constraint workflow |
CN108710645B (en) * | 2018-04-23 | 2021-09-10 | 南京理工大学 | Process mining method based on mixed event log |
CN108717625B (en) * | 2018-05-28 | 2022-05-20 | 北京交通大学 | Generation method of railway electric service workflow |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102332125A (en) * | 2011-11-08 | 2012-01-25 | 南京大学 | Workflow mining method based on subsequent tasks |
Non-Patent Citations (3)
Title |
---|
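A process mining algorithm based on successor tasks; 王栋毅 et al.; Computer Applications and Software; 20121015; Vol. 29 (No. 10); 17-21 |
Research on workflow mining algorithms based on event logs; 梁艳; China Master's Theses Full-text Database, Information Science and Technology; 20111215 (No. S1); I138-479 |
Research on process mining algorithms based on workflow nets; 闻立杰; China Doctoral Dissertations Full-text Database, Information Science and Technology; 20080815 (No. 08); I138-1 |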
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |