CN115525693B - Incremental event log-oriented process model mining method and system - Google Patents
- Publication number
- CN115525693B (application number CN202211144557.3A)
- Authority
- CN
- China
- Prior art keywords
- event log
- directed graph
- graph model
- log
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1734—Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a process model mining method and system for incremental event logs. The method comprises the following steps: 1) obtaining the basic data, namely a periodically stored business process event log, and dividing it into an original event log and a plurality of incremental event logs; 2) generating a directed graph model of the process from the original event log; 3) updating the directed graph model obtained in step 2) with a single incremental event log; 4) iteratively updating the directed graph model obtained in step 3); 5) mining a process model from the directed graph model obtained in step 4). The invention combines process model updating with a directed graph used as the intermediate model: instead of merging the original log with the incremental logs, as the traditional mining method does, the incremental event logs update the intermediate directed graph. While preserving the quality of the mined model, this reduces storage space usage and improves model discovery efficiency, and thereby addresses the inefficiency of the traditional mining method.
Description
Technical Field
The invention relates to the technical field of business process mining, in particular to a process model mining method and system for incremental event logs.
Background
Process model mining discovers a process model from historical event logs, thereby providing a factual basis for understanding, improving and restructuring the business processes of an enterprise. Because of the external environment and the personalized demands of users, business processes change continuously, and the process model must be improved and updated continuously to follow them. However, the conventional process mining methods in current use can only handle static logs and static models. To update the model while preserving its accuracy, they therefore re-mine the process model after merging the logs, specifically: 1) mining a process model from the original log; 2) merging an incremental log with the original log and mining a process model from the merged log; 3) repeating step 2) until no incremental log remains. This achieves model updating to a certain extent, but it has obvious defects. On the one hand, merging event logs increases memory and hard disk consumption and raises storage space usage; on the other hand, merging logs and re-mining the process model lengthens the execution time and lowers model discovery efficiency. The conventional process model mining method therefore needs to be improved so that these problems are addressed specifically.
Disclosure of Invention
The first object of the present invention is to overcome the drawbacks and disadvantages of the prior art by providing a process model mining method for incremental event logs. The method centers on an incremental process model mining framework, IPDF (Incremental Process Discovery Framework), which addresses the high storage space usage and low model discovery efficiency of the conventional process mining method: while ensuring the same quality of the mined model, it consumes less memory and hard disk resources and greatly shortens the execution time of the complete mining process.
A second object of the present invention is to provide a process model mining system for incremental event logs.
The first object of the invention is achieved by the following technical scheme: the process model mining method for incremental event logs comprises the following steps:
1) Obtaining the basic data, namely a periodically stored business process event log, and dividing it into an original event log and a plurality of incremental event logs;
2) Generating a directed graph model of the process from the original event log;
3) Updating the directed graph model obtained in step 2) with a single incremental event log;
4) Iteratively updating the directed graph model obtained in step 3);
5) Mining a process model from the directed graph model obtained in step 4).
Further, in step 1), obtaining the periodically stored business process event log and dividing it into an original event log and a plurality of incremental event logs specifically includes the following steps:
1.1 Acquiring the periodically stored business process event log; this log is the event log generated by the operation of the business process over a period of time, and, because of the external environment and the personalized requirements of users, such periodically stored logs are often used to update the process model during actual operation of the business; the event log covers a continuous period of time and is a set of finite event sequences, each of which is called a trace (track);
1.2 Dividing the periodically stored business process event log obtained in step 1.1) into single traces, sorting all traces by month, and storing the traces of the same month in the same event log according to the sorting result;
1.3 Dividing the month-ordered event logs obtained in step 1.2) into an original event log and incremental event logs, wherein the event log of the first month is used as the original event log and each remaining event log is an incremental event log.
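Steps 1.1) to 1.3) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the row layout `(case_id, activity, start_time)`, the function name `split_by_month`, and the rule that a whole trace belongs to the month of its earliest event are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def split_by_month(rows):
    """Split (case_id, activity, start_time) rows into an original log and incremental logs."""
    # Step 1.2: group events into traces, one trace per case.
    cases = defaultdict(list)
    for case_id, activity, start_time in rows:
        cases[case_id].append((datetime.fromisoformat(start_time), activity))

    # Assumption: a whole trace is assigned to the month of its earliest event.
    monthly = defaultdict(list)
    for events in cases.values():
        events.sort(key=lambda e: e[0])
        first = events[0][0]
        monthly[(first.year, first.month)].append([activity for _, activity in events])

    # Step 1.3: the earliest month is the original log, the rest are increments.
    logs = [monthly[key] for key in sorted(monthly)]
    return (logs[0], logs[1:]) if logs else ([], [])


rows = [
    ("c1", "B", "2022-01-03T10:00:00"),
    ("c1", "A", "2022-01-03T09:00:00"),
    ("c2", "A", "2022-02-01T08:00:00"),
    ("c2", "C", "2022-02-02T08:00:00"),
    ("c3", "A", "2022-03-05T08:00:00"),
]
original_log, incremental_logs = split_by_month(rows)
```

Here `original_log` holds the January traces and `incremental_logs` holds one list of traces per later month, in chronological order.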
Further, in step 2), the original event log generates a directed graph model of the flow, specifically including the following steps:
2.1 Selecting the original event log in step 1) as input content;
2.2 Analyzing the direct following activity relationship in the log by formula (1);
wherein DFR_S(L) represents the set of direct following activity relationships in the event log L, L represents the event log, σ represents a trace in L, σ_i represents the i-th activity of trace σ, and σ_{i+1} represents the (i+1)-th activity of trace σ;
the direct following activity relationship refers to a logic relationship existing between two adjacent events in the log track;
2.3 Starting from the direct following activity relationships obtained in step 2.2), constructing the directed graph model of the process with a directed graph algorithm by formula (2); if the directed graph model is a weighted directed graph, the weights represent the direct following frequencies between events;
wherein DFR_DFG(L) represents the directed graph model mined from the event log L, and n represents the frequency with which the i-th activity is directly followed by the (i+1)-th activity;
the directed graph model represents the events in a trace by nodes and the direct following activity relationships between events by directed edges.
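The rendered forms of formulas (1) and (2) do not appear in this text. One form consistent with the symbol definitions above is given below as an assumption, not as the patent's exact notation; the trace length |σ| and the counting expression for n are introduced here for illustration, and L is read as a multiset of traces.

```latex
\begin{align}
DFR\_S(L) &= \{\, (\sigma_i, \sigma_{i+1}) \mid \sigma \in L,\ 1 \le i < |\sigma| \,\} \tag{1} \\
DFR\_DFG(L) &= \{\, (\sigma_i, \sigma_{i+1}, n) \mid (\sigma_i, \sigma_{i+1}) \in DFR\_S(L) \,\} \tag{2} \\
n(a, b) &= \sum_{\sigma \in L} \bigl|\{\, i \mid 1 \le i < |\sigma|,\ \sigma_i = a,\ \sigma_{i+1} = b \,\}\bigr|
\end{align}
```

For example, for L = {⟨A, B, C⟩, ⟨A, B, D⟩}, this reading gives DFR_S(L) = {(A, B), (B, C), (B, D)} and DFR_DFG(L) = {(A, B, 2), (B, C, 1), (B, D, 1)}.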
Further, in step 3), the directed graph model obtained in step 2) is updated using a single incremental event log, specifically including the steps of:
3.1 Acquiring a single incremental event log, in month order, from the incremental event logs obtained in step 1), analyzing its trace information, and then calculating the direct following activity relationships of the incremental event log and their frequencies;
3.2 Selecting, from the direct following activity relationships obtained in step 3.1), those contributed by the newly added event log traces, and using them to update the directed graph model obtained in step 2), so that the incremental event log updates the directed graph model generated from the original event log; updating means adding the direct following activity relationships of the newly added traces to those of the existing directed graph model.
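Steps 3.1) and 3.2) amount to counting the direct following pairs of the incremental log and adding them to the existing graph. A minimal sketch, assuming the weighted directed graph is held as a mapping from activity pairs to frequencies; the names `dfg_of` and `update_dfg` are hypothetical.

```python
from collections import Counter


def dfg_of(traces):
    """Count every direct following pair (a, b) over a list of traces."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def update_dfg(dfg, incremental_traces):
    """Add the pairs of one incremental log to an existing graph, in place."""
    # Counter.update adds counts: new pairs become new edges,
    # known pairs get a higher weight.
    dfg.update(dfg_of(incremental_traces))
    return dfg


dfg = dfg_of([["A", "B", "C"]])      # graph from the original log
update_dfg(dfg, [["A", "B", "D"]])   # one incremental log
```

After the update, the edge (A, B) has weight 2, the edge (B, C) keeps weight 1, and (B, D) is a new edge with weight 1. Only the graph and the current incremental log need to be held at this point; the original log is not read again.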
Further, in step 4), the directed graph model obtained in step 3) is iteratively updated, specifically including the following steps:
4.1 Sequentially selecting single increment event logs from the increment event logs obtained in the step 1);
4.2 Repeating step 3) for each incremental event log selected in step 4.1) until all incremental event logs have been selected;
4.3 Storing the directed graph model after the iteration is finished in the step 4.2) into a storage space, and taking the directed graph model as input content of the next step.
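The iteration in steps 4.1) to 4.3) can be sketched as a loop that folds each incremental log into the graph. The example also checks a property the method relies on: as long as every trace lies wholly inside one log, as step 1.2) arranges, the iteratively updated graph equals the graph of the merged log. The function names and the sample data are assumptions for illustration.

```python
from collections import Counter


def dfg_of(traces):
    """Weighted direct following graph of a list of traces."""
    return Counter(pair for t in traces for pair in zip(t, t[1:]))


def mine_incrementally(original_log, incremental_logs):
    """Build the graph from the original log, then fold in each increment."""
    dfg = dfg_of(original_log)
    for log in incremental_logs:  # step 4.1: take the logs in order
        dfg += dfg_of(log)        # step 4.2: repeat the single-log update
    return dfg                    # step 4.3: the result feeds step 5)


original_log = [["A", "B", "C"], ["A", "C"]]
incremental_logs = [[["A", "B", "D"]], [["A", "C"], ["B", "D"]]]

incremental_dfg = mine_incrementally(original_log, incremental_logs)

# Baseline for comparison: merge all logs first, then build the graph once.
merged_log = original_log + [t for log in incremental_logs for t in log]
merged_dfg = dfg_of(merged_log)
```

Both graphs contain the same edges with the same weights, which is why the merge-and-re-mine step of the conventional method can be skipped without changing the input to the discovery algorithm.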
Further, in step 5), the process model is mined by using the directed graph model obtained in step 4), and specifically includes the following steps:
5.1 Taking the directed graph model obtained in the step 4) after being updated by all the increment event logs as input content;
5.2 Analyzing the direct following activity relationships between the events in the directed graph model, and mining a Petri net from the directed graph model with a directed-graph-based process discovery algorithm; these algorithms fall into two classes according to whether the directed graph carries weights: the first class, based on an unweighted directed graph, comprises Alpha Miner and Inductive Miner, and the second class, based on a weighted directed graph, comprises Heuristic Miner and Inductive Miner-infrequent.
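To illustrate how an algorithm of the first class can start from an unweighted directed graph, the sketch below derives the ordering relations (the footprint) that the Alpha algorithm computes from direct following pairs. It covers only this first stage, not the construction of the Petri net, and the function name `footprint` and the sample edges are assumptions; the text does not state how the patented system implements discovery.

```python
def footprint(edges):
    """Derive Alpha-style ordering relations from unweighted direct following edges."""
    activities = sorted({x for edge in edges for x in edge})
    relations = {}
    for a in activities:
        for b in activities:
            forward, backward = (a, b) in edges, (b, a) in edges
            if forward and backward:
                relations[(a, b)] = "||"   # parallel: both orders observed
            elif forward:
                relations[(a, b)] = "->"   # causality: a is followed by b only
            elif backward:
                relations[(a, b)] = "<-"   # reverse causality
            else:
                relations[(a, b)] = "#"    # never directly adjacent
    return relations


edges = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "B"), ("B", "D"), ("C", "D")}
relations = footprint(edges)
```

In this sample, A causally precedes B and C, B and C are parallel, and A and D are unrelated, which is the pattern of a split after A and a join before D.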
The second object of the invention is achieved by the following technical scheme: the flow model mining system for the increment event log comprises an event log acquisition and log preprocessing module, a directed graph model module for generating a flow of an original event log, a single increment event log updating directed graph model module, an iteration updating directed graph model module and a flow model mining module for the directed graph model;
the event log acquisition and log preprocessing module is used for acquiring a business process event log which is stored regularly and dividing the business process event log into an original event log and a plurality of increment logs;
the directed graph model module of the original event log generation flow is used for selecting an original event log, analyzing the direct following activity relation and the corresponding frequency thereof, and generating a directed graph model by using a directed graph algorithm on the basis;
the single increment event log updating directed graph model module is used for selecting a single increment event log and updating a directed graph model of a flow generated by an original event log by using the single increment event log;
the iterative updating directed graph model module is used for sequentially selecting logs from the residual incremental event logs and iteratively performing the operation of updating the directed graph model by the incremental event logs;
the directed graph model mining flow model module is used for selecting the directed graph model updated by all increment event logs and mining the flow model from the directed graph model by using a directed graph-based flow discovery algorithm.
Further, the event log obtaining and log preprocessing module specifically performs the following operations:
acquiring a business process event log which is stored regularly; carrying out track division on the business process event logs stored regularly to obtain a single track, then sequencing all tracks according to months, and storing the log tracks of the same month into the same event log according to sequencing results; selecting an event log arranged in the first month as an original event log, and taking each remaining event log as an incremental event log; the business process event log is generated through business process operation in a period of time, and the business process event log is often updated by using the periodically stored event log in the actual operation process of the business due to the influence of external environment and the personalized requirements of users; the event log has a continuous process for a period of time, and is a set of finite event sequences, and each finite event sequence is called a track;
the directed graph model module of the original event log generation flow specifically executes the following operations:
selecting an original event log as input content; analyzing the direct following activity relation and the corresponding frequency in the original event log; and constructing a directed graph model by using a directed graph algorithm from the obtained direct following activity relation and frequency.
Further, the single increment event log updating directed graph model module specifically performs the following operations:
acquiring a single increment event log from the increment event log according to the month sequence, analyzing track information of the single increment event log, and then calculating a direct following activity relationship and corresponding frequency of the increment event log; selecting a new direct following active relation from the obtained direct following active relations to update the direct following active relation of the directed graph model constructed by the original event log, so as to update the directed graph model generated by the original event log by the incremental event log;
the iterative updating directed graph model module specifically performs the following operations:
sequentially selecting single increment event logs from the rest increment event logs; repeating the operation of updating the directed graph model by a single increment event log until all increment event logs are selected; and saving the directed graph model after the iteration is finished in a storage space so as to serve as input content of the next module.
Further, the directed graph model mining flow model module specifically performs the following operations:
selecting a directed graph model updated by all increment event logs, and analyzing the relation among nodes and the corresponding frequency; selecting any flow discovery algorithm based on the directed graph model, and mining a flow model from the following relation of each node of the directed graph model and the corresponding frequency; the nodes in the directed graph model are all events in the event log track.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention replaces the operation of merging the original event log with the incremental event logs, as used in the traditional process model mining method, with the operation of updating the intermediate directed graph by the incremental event logs, which reduces storage space usage and improves mining efficiency.
2. The invention provides an incremental process model mining framework that can be integrated into process model updating in the process mining field, and offers a more efficient way to update the process model periodically from periodically stored business process event logs.
3. While ensuring the same quality of the mined model, the invention greatly reduces the memory and hard disk storage consumed during mining and lowers storage space usage.
4. While ensuring the same quality of the mined model, the invention shortens the execution time of the complete mining process and improves model discovery efficiency.
5. The invention is widely applicable to incremental process model mining, is simple to operate and highly adaptable, and has broad prospects for incrementally updating models.
Drawings
FIG. 1 is a schematic diagram of the logic flow of the method of the present invention.
Fig. 2 is a schematic diagram of an original event log and an incremental event log in CSV format divided from a periodically stored business process event log in this embodiment.
Fig. 3 is a schematic diagram of an event log after converting a CSV format event log into an XES format in this embodiment.
Fig. 4 is a schematic diagram of a directed graph model constructed from an original event log in the present embodiment.
FIG. 5 is a schematic diagram of a directed graph model updated with a single incremental event log in this embodiment.
Fig. 6 is a schematic diagram of a directed graph model updated by all incremental event logs in the present embodiment.
Fig. 7 is a schematic diagram of a flow model mined from the directed graph model using the Heuristic Miner algorithm in this embodiment.
Fig. 8 is a diagram of a system architecture of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Example 1
As shown in fig. 1, this embodiment discloses a flow model mining method for incremental event logs, which includes the following steps:
1) Obtaining the basic data, namely a periodically stored business process event log, and dividing it into an original event log and a plurality of incremental event logs, which specifically comprises the following steps:
1.1 Acquiring the periodically stored business process event log; this log is the event log generated by the operation of the business process over a period of time, and, because of the external environment and the personalized requirements of users, such periodically stored logs are often used to update the process model during actual operation of the business; the event log covers a continuous period of time and is a set of finite event sequences, each of which is called a trace (track);
1.2 Dividing the periodically stored business process event log obtained in step 1.1) into single traces, sorting all traces by month (or another time unit), and storing the traces of the same month in the same event log according to the sorting result;
1.3 Dividing the event logs obtained in step 1.2), ordered by month (or another time unit), into an original event log and incremental event logs, wherein the event log of the first month is used as the original event log and each remaining event log is used as one incremental event log.
With the above steps, the original event log and a plurality of incremental event logs are obtained from the periodically stored business process event log. Each event log is stored in the storage space under a year-month name and contains the fields case (case number), event (event), startTime (start time) and completeTime (end time). The original and incremental event logs are initially in CSV format and must be converted into event logs in XES format, as shown in fig. 3, before the subsequent operations.
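Reading one such CSV log into traces can be sketched with the standard library alone. The column names follow the fields listed above, with the end-time column assumed to be spelled `completeTime`; the sample rows, the timestamp format and the function name `read_traces` are assumptions, and the conversion to XES is not shown.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of one monthly log in the CSV layout described above.
SAMPLE = """case,event,startTime,completeTime
1,Register,2022-01-03 09:00:00,2022-01-03 09:10:00
2,Register,2022-01-04 09:00:00,2022-01-04 09:05:00
1,Check,2022-01-03 10:00:00,2022-01-03 10:30:00
2,Reject,2022-01-04 12:00:00,2022-01-04 12:01:00
1,Approve,2022-01-03 11:00:00,2022-01-03 11:02:00
"""


def read_traces(csv_file):
    """Return {case number: [event, ...]} with events ordered by start time."""
    by_case = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_case[row["case"]].append((row["startTime"], row["event"]))
    # Assumption: timestamps are fixed-width, so string order is time order.
    return {case: [event for _, event in sorted(events)]
            for case, events in by_case.items()}


traces = read_traces(io.StringIO(SAMPLE))
```

Each value in `traces` is one trace, ready to be scanned for direct following pairs.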
2) Generating a directed graph model of the process from the original event log, which specifically comprises the following steps:
2.1 Selecting the original event log in step 1) as input content;
2.2 Analyzing the direct following activity relationship in the log by formula (1);
wherein DFR_S(L) represents the set of direct following activity relationships in the event log L, L represents the event log, σ represents a trace in L, σ_i represents the i-th activity of trace σ, and σ_{i+1} represents the (i+1)-th activity of trace σ;
the direct following activity relationship refers to a logic relationship existing between two adjacent events in the log track;
2.3 Starting from the direct following activity relationships obtained in step 2.2), constructing the directed graph model with a directed graph algorithm by formula (2); if the directed graph is a weighted directed graph, the weights represent the direct following frequencies between events;
wherein DFR_DFG(L) represents the directed graph model mined from the event log L, and n represents the frequency with which the i-th activity is directly followed by the (i+1)-th activity;
the directed graph represents the events in a trace by nodes and the direct following activity relationships between events by directed edges.
With the above steps, a directed graph model of the flow generated from the original event log is obtained, as shown in fig. 4.
3) The single increment event log updates the directed graph model, and specifically comprises the following steps:
3.1 Acquiring a single increment event log from the increment event log obtained in the step 1) according to the month (or other time units) sequence, analyzing track information of the single increment event log, and then calculating the direct following activity relation and the corresponding frequency of the increment event log;
3.2 Selecting newly added event log track information from the direct following activity relation obtained in the step 3.1), and then obtaining the direct following activity relation in the newly added event log track to update the directed graph model obtained in the step 2), so as to update the directed graph model generated by the original event log by the incremental event log; the updating is to add the direct following active relation in the newly added event log track to the direct following active relation of the existing directed graph model.
With the above steps, a directed graph model updated with a single incremental event log is obtained, as shown in FIG. 5.
4) Iteratively updating the directed graph model obtained in the step 3), which specifically comprises the following steps:
4.1 Sequentially selecting single increment event logs from the increment event logs obtained in the step 1);
4.2 Repeating step 3) for each incremental event log selected in step 4.1) until all incremental event logs have been selected;
4.3 Storing the directed graph model after the iteration is finished in the step 4.2) into a storage space, and taking the directed graph model as input content of the next step.
By adopting the steps, a directed graph model updated by all increment event logs is obtained, as shown in fig. 6.
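Step 4.3) stores the updated graph so that the next step, or a later update, can start from it instead of from the logs. A minimal sketch of one way to persist it, assuming a JSON file holding one `[source, target, frequency]` entry per edge; the file format and the function names are assumptions, not part of the patent.

```python
import json
import os
import tempfile
from collections import Counter


def save_dfg(dfg, path):
    """Write the graph as a JSON list of [source, target, frequency] entries."""
    edges = [[a, b, n] for (a, b), n in sorted(dfg.items())]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(edges, f)


def load_dfg(path):
    """Read the edge list back into a pair-to-frequency mapping."""
    with open(path, encoding="utf-8") as f:
        return Counter({(a, b): n for a, b, n in json.load(f)})


dfg = Counter({("A", "B"): 2, ("B", "C"): 1, ("B", "D"): 1})
path = os.path.join(tempfile.mkdtemp(), "dfg.json")
save_dfg(dfg, path)
restored = load_dfg(path)
```

The stored file grows with the number of distinct edges rather than with the number of events, which is the source of the storage saving relative to keeping merged logs.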
5) Excavating a flow model by the directed graph model, which specifically comprises the following steps:
5.1 Taking the directed graph model obtained in the step 4) after being updated by all the increment event logs as input content;
5.2 Analyzing the direct following activity relationships between the events in the directed graph model, and mining a Petri net from the directed graph model with a process discovery algorithm based on the directed graph model; these algorithms can be divided into two classes according to whether the directed graph model carries weights: the first class, based on an unweighted directed graph, comprises Alpha Miner and Inductive Miner, and the second class, based on a weighted directed graph, comprises Heuristic Miner and Inductive Miner-infrequent.
With the above steps, the Petri net mined from the directed graph model by the Heuristic Miner algorithm is obtained, as shown in fig. 7.
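To show why a weighted graph is sufficient input for an algorithm of the second class, the sketch below computes the dependency measure commonly used by Heuristic Miner directly from edge frequencies. It covers only this measure and a threshold filter, not the full algorithm or the Petri net conversion; the threshold value and the sample frequencies are assumptions.

```python
def dependency(dfg, a, b):
    """Heuristic Miner dependency measure, computed from edge frequencies only."""
    ab, ba = dfg.get((a, b), 0), dfg.get((b, a), 0)
    if a == b:
        return ab / (ab + 1)           # self-loop variant of the measure
    return (ab - ba) / (ab + ba + 1)   # close to 1: a reliably precedes b


def dependency_graph(dfg, threshold):
    """Keep only the edges whose dependency value reaches the threshold."""
    return {(a, b): dependency(dfg, a, b)
            for (a, b) in dfg
            if dependency(dfg, a, b) >= threshold}


dfg = {("A", "B"): 9, ("B", "C"): 4, ("C", "B"): 4, ("A", "C"): 1}
strong = dependency_graph(dfg, threshold=0.6)
```

With these frequencies the edge (A, B) scores 9/10, the pair B and C scores 0 in both directions because the two orders are equally frequent, and (A, C) scores 1/2, so only (A, B) passes the threshold of 0.6.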
Example 2
The embodiment discloses a flow model mining system facing to incremental event logs, as shown in fig. 8, the system comprises the following functional modules: the system comprises an event log acquisition and log preprocessing module, a directed graph model module of an original event log generation flow, a single increment event log updating directed graph model module, an iterative updating directed graph model module and a flow model module mined by the directed graph model;
the event log acquisition and log preprocessing module is used for acquiring a business process event log which is stored regularly and dividing the business process event log into an original event log and a plurality of increment logs;
the directed graph model module of the original event log generation flow is used for selecting an original event log, analyzing the direct following activity relation and the corresponding frequency thereof, and generating a directed graph model by using a directed graph algorithm on the basis;
the single increment event log updating directed graph model module is used for selecting a single increment event log and updating a directed graph model generated by an original event log by using the single increment event log;
the iterative updating directed graph model module is used for sequentially selecting logs from the residual incremental event logs and iteratively performing the operation of updating the directed graph model by the incremental event logs;
the directed graph model mining flow model module is used for selecting the directed graph model updated by all increment event logs and mining the flow model from the directed graph model by using a flow discovery algorithm based on the directed graph model.
Further, the event log obtaining and log preprocessing module specifically performs the following operations:
acquiring a business process event log which is stored regularly; dividing the business process event log track stored regularly into single tracks, and sequencing the tracks according to months (or other time units), wherein the track of each month forms an event log, the event log of the first month is used as an original event log, and the logs of the other months are used as an increment event log; analyzing the direct following activity relation and the corresponding frequency between the events from the original event log to construct the relation and the frequency between the nodes in the directed graph model;
the business process event log is generated by business process operation in a period of time, and the process model is required to be continuously changed due to the influence of external environment and the personalized requirement of a user, so that the process model is often updated regularly by adopting the event log stored regularly in the actual operation process; the event log is a collection of finite sequences of events, each of which is referred to as a trace.
Further, the original event log generation directed graph model module specifically performs the following operations:
selecting an original event log as input content; analyzing the direct following activity relation and the corresponding frequency in the original event log; and constructing a directed graph model by using a directed graph algorithm from the obtained direct following activity relation and frequency.
Further, the single increment event log updating directed graph model module specifically performs the following operations:
acquiring a single increment event log from the increment event log according to month (or other time units) sequence, analyzing track information of the single increment event log, and then calculating a direct following activity relation and corresponding frequency of the increment event log; and selecting a new direct following activity relation from the obtained direct following activity relation to update the direct following activity relation of the directed graph model constructed by the original event log, thereby realizing the update of the incremental event log on the directed graph model generated by the original event log.
Further, the iterative updating directed graph model module specifically performs the following operations:
sequentially selecting single increment event logs from the rest increment event logs; repeating the operation of updating the directed graph model by a single increment event log until all increment event logs are selected; and saving the directed graph model after the iteration is finished in a storage space so as to serve as input content of the next module.
Further, the directed graph model mining flow model module specifically performs the following operations:
selecting a directed graph model updated by all increment event logs, and analyzing the relation among nodes and the corresponding frequency; selecting any flow discovery algorithm based on the directed graph model, and mining a flow model from the following relation of each node of the directed graph model and the corresponding frequency; the nodes in the directed graph model are all events in the event log track.
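The division of work among the modules can be sketched as one small class whose only state is the directed graph. This is an illustrative arrangement under stated assumptions, not the system of fig. 8: the class name `IncrementalMiner`, its method names, and the placeholder discovery function are hypothetical.

```python
from collections import Counter


class IncrementalMiner:
    """Keeps only the directed graph between updates, never the logs."""

    def __init__(self):
        self.dfg = Counter()
        self.logs_seen = 0

    def ingest(self, traces):
        """Graph generation and update modules: add one log's pairs to the graph."""
        for trace in traces:
            self.dfg.update(zip(trace, trace[1:]))
        self.logs_seen += 1

    def mine(self, discover):
        """Mining module: hand the graph to any graph-based discovery function."""
        return discover(self.dfg)


miner = IncrementalMiner()
miner.ingest([["A", "B", "C"]])                  # original event log
for log in [[["A", "B", "D"]], [["A", "C"]]]:    # incremental event logs, in order
    miner.ingest(log)

# Placeholder discovery step: list the edges, most frequent first.
ranked_edges = miner.mine(lambda dfg: dfg.most_common())
```

Passing the discovery step as a function mirrors the statement above that any process discovery algorithm based on the directed graph model may be selected.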
In summary, with the above scheme the invention provides a new mining framework for process model mining in which the directed graph model serves as an intermediate model during model updating. This effectively addresses the excessive storage space and execution time consumed by traditional mining methods and improves the efficiency of model discovery, so the invention has practical value and is worth popularizing.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them; any other change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent replacement and falls within the protection scope of the present invention.
Claims (2)
1. A process model mining method for incremental event logs, characterized by comprising the following steps:
1) Obtaining basic data, namely a periodically stored business process event log, and dividing it into an original event log and a plurality of incremental event logs, specifically comprising the following steps:
1.1) Acquiring a periodically stored business process event log; the periodically stored business process event log refers to an event log generated by the operation of a business process over a period of time; in actual operation, because of the influence of the external environment and the personalized requirements of users, the periodically stored business process event log is often used to update the process model; the event log records a process that runs continuously over a period of time and is a set of finite event sequences, each of which is called a trace;
1.2) Dividing the periodically stored business process event log obtained in step 1.1) into individual traces, then sorting all traces by month, and storing the traces of the same month in the same event log according to the sorting result;
1.3) Dividing the monthly event logs obtained in step 1.2) into an original event log and incremental event logs, wherein the event log of the first month in the ordering is used as the original event log and each of the remaining event logs is used as an incremental event log;
2) Generating a directed graph model of the process from the original event log, specifically comprising the following steps:
2.1) Selecting the original event log obtained in step 1) as input;
2.2) Analyzing the directly-follows activity relations in the log by formula (1);
wherein DFR_S(L) represents the set of directly-follows activity relations in the event log L, L represents the event log, σ represents a trace in L, σ_i represents the i-th activity of trace σ, and σ_{i+1} represents the (i+1)-th activity of trace σ;
a directly-follows activity relation is the logical relation that exists between two adjacent events in a log trace;
2.3) Starting from the directly-follows activity relations obtained in step 2.2), constructing a directed graph model of the process by formula (2) using a directed graph algorithm; if the directed graph model is a weighted directed graph, the weights represent the directly-follows frequencies between events;
wherein DFR_DFG(L) represents the directed graph model mined from the event log L, and n represents the frequency with which the i-th activity is directly followed by the (i+1)-th activity;
in the directed graph model, nodes represent the events in the traces and directed edges represent the directly-follows activity relations between events;
3) Updating the directed graph model obtained in step 2) with a single incremental event log, specifically comprising the following steps:
3.1) Acquiring a single incremental event log, in month order, from the incremental event logs obtained in step 1), parsing its trace information, and then calculating its directly-follows activity relations and their corresponding frequencies;
3.2) Selecting the newly added event log trace information from the directly-follows activity relations obtained in step 3.1), and then using the directly-follows activity relations in the newly added event log traces to update the directed graph model obtained in step 2), so that the directed graph model generated from the original event log is updated by the incremental event log; the update consists of adding the directly-follows activity relations in the newly added event log traces to the directly-follows activity relations of the existing directed graph model;
4) Iteratively updating the directed graph model obtained in step 3), specifically comprising the following steps:
4.1) Sequentially selecting a single incremental event log from the incremental event logs obtained in step 1);
4.2) Repeating step 3) and step 4) until all incremental event logs have been selected;
4.3) Saving the directed graph model obtained when the iteration of step 4.2) ends to storage, and using it as the input of the next step;
5) Mining a process model from the directed graph model obtained in step 4), specifically comprising the following steps:
5.1) Taking the directed graph model obtained in step 4), which has been updated by all incremental event logs, as input;
5.2) Analyzing the directly-follows activity relations between the events in the directed graph model, and mining a Petri net from the directed graph model by using a directed-graph-based process discovery algorithm; the process discovery algorithms based on the directed graph model fall into two classes according to whether the directed graph carries weights, wherein the first class comprises the directed-graph-based Alpha Miner and Inductive Miner, and the second class comprises the directed-graph-based Heuristic Miner and Inductive Miner-infrequent.
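Formulas (1) and (2) are referenced in steps 2.2) and 2.3) of claim 1. One reading that is consistent with the symbol definitions given there is sketched below; it is an assumption, not the patent's exact notation, and the symbols a and b are introduced here only for illustration.

```latex
% Formula (1), assumed form: the set of directly-follows activity relations
\mathrm{DFR\_S}(L) = \{\, (\sigma_i, \sigma_{i+1}) \mid \sigma \in L,\ 1 \le i < |\sigma| \,\}

% Formula (2), assumed form: each relation paired with its frequency n
\mathrm{DFR\_DFG}(L) = \Bigl\{\, (a, b, n) \Bigm| (a, b) \in \mathrm{DFR\_S}(L),\
  n = \sum_{\sigma \in L} \bigl|\{\, i \mid 1 \le i < |\sigma|,\ \sigma_i = a,\ \sigma_{i+1} = b \,\}\bigr| \,\Bigr\}
```

Under this reading, formula (1) yields the edges of the directed graph and formula (2) attaches to each edge the number of times the first activity is directly followed by the second across all traces of L.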
2. A process model mining system for incremental event logs, characterized by comprising an event log acquisition and log preprocessing module, a module for generating a directed graph model of the process from the original event log, a module for updating the directed graph model with a single incremental event log, a module for iteratively updating the directed graph model, and a module for mining a process model from the directed graph model;
the event log acquisition and log preprocessing module is used for acquiring a periodically stored business process event log and dividing it into an original event log and a plurality of incremental event logs;
the module for generating a directed graph model of the process from the original event log is used for selecting the original event log, analyzing its directly-follows activity relations and their corresponding frequencies, and generating a directed graph model on that basis by using a directed graph algorithm;
the module for updating the directed graph model with a single incremental event log is used for selecting a single incremental event log and using it to update the directed graph model of the process generated from the original event log;
the module for iteratively updating the directed graph model is used for sequentially selecting logs from the remaining incremental event logs and iteratively performing the operation of updating the directed graph model with an incremental event log;
the module for mining a process model from the directed graph model is used for selecting the directed graph model updated by all incremental event logs and mining a process model from it by using a directed-graph-based process discovery algorithm;
the event log acquisition and log preprocessing module specifically performs the following operations:
acquiring a periodically stored business process event log; dividing the periodically stored business process event log into individual traces, then sorting all traces by month, and storing the traces of the same month in the same event log according to the sorting result; selecting the event log of the first month in the ordering as the original event log, and taking each remaining event log as an incremental event log; the business process event log is generated by the operation of a business process over a period of time; in actual operation, because of the influence of the external environment and the personalized requirements of users, the periodically stored event log is often used to update the process model; the event log records a process that runs continuously over a period of time and is a set of finite event sequences, each of which is called a trace;
the module for generating a directed graph model of the process from the original event log specifically performs the following operations:
selecting the original event log as input; analyzing the directly-follows activity relations in the original event log and their corresponding frequencies; and constructing a directed graph model from the obtained directly-follows activity relations and frequencies by using a directed graph algorithm;
the module for updating the directed graph model with a single incremental event log specifically performs the following operations:
acquiring a single incremental event log from the incremental event logs in month order, parsing the trace information of that incremental event log, and then calculating its directly-follows activity relations and their corresponding frequencies; and selecting the new directly-follows activity relations from those obtained and using them to update the directly-follows activity relations of the directed graph model constructed from the original event log, so that the directed graph model generated from the original event log is updated by the incremental event log;
the module for iteratively updating the directed graph model specifically performs the following operations:
sequentially selecting a single incremental event log from the remaining incremental event logs; repeating the operation of updating the directed graph model with a single incremental event log until all incremental event logs have been selected; and saving the directed graph model obtained at the end of the iteration to storage so that it can serve as input to the next module;
the module for mining a process model from the directed graph model specifically performs the following operations:
selecting the directed graph model that has been updated by all incremental event logs, and analyzing the relations among its nodes and their corresponding frequencies; and selecting any process discovery algorithm based on the directed graph model, and mining a process model from the follows relations of the nodes of the directed graph model and their corresponding frequencies; the nodes of the directed graph model are the events in the event log traces.
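The log preprocessing performed by the first module of claim 2 (grouping traces by month, then taking the first month as the original event log) can be sketched as follows. This is a minimal illustration under stated assumptions: each trace is represented as a pair of a start timestamp and a list of activities, a trace is assigned to the month of that start timestamp, and the name `split_by_month` is hypothetical. The claims do not specify how a trace spanning two months is assigned.

```python
from collections import defaultdict
from datetime import datetime

def split_by_month(traces):
    """Split traces into an original log and month-ordered incremental logs.

    Assumption: `traces` is an iterable of (start_timestamp, activities)
    pairs, and a trace belongs to the month of its start timestamp.
    """
    monthly = defaultdict(list)
    for started, activities in sorted(traces, key=lambda item: item[0]):
        monthly[started.strftime("%Y-%m")].append(activities)
    months = sorted(monthly)
    if not months:
        return [], []
    original = monthly[months[0]]
    increments = [(month, monthly[month]) for month in months[1:]]
    return original, increments

# Hypothetical traces spread over three months.
traces = [
    (datetime(2022, 2, 3), ["a", "b"]),
    (datetime(2022, 1, 15), ["a", "c"]),
    (datetime(2022, 1, 20), ["a", "b", "c"]),
    (datetime(2022, 3, 1), ["a"]),
]
original_log, incremental_logs = split_by_month(traces)
```

The returned incremental logs are already in month order, so they can be fed one at a time to the update step.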
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211144557.3A CN115525693B (en) | 2022-09-20 | 2022-09-20 | Incremental event log-oriented process model mining method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115525693A (en) | 2022-12-27 |
CN115525693B (en) | 2024-02-06 |
Family
ID=84697912
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211144557.3A Active CN115525693B (en) | 2022-09-20 | 2022-09-20 | Incremental event log-oriented process model mining method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115525693B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116258350B (en) * | 2023-05-15 | 2023-08-11 | 烟台岸基网络科技有限公司 | Sea container transportation monitoring method |
CN117495071B (en) * | 2023-12-29 | 2024-05-14 | 安徽思高智能科技有限公司 | Flow discovery method and system based on predictive log enhancement |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103778051A (en) * | 2014-01-09 | 2014-05-07 | 安徽理工大学 | Business process increment mining method based on L* algorithm |
CN105069044A (en) * | 2015-07-22 | 2015-11-18 | 安徽理工大学 | Simulated indirect dependency based novel process model mining method |
CN109086385A (en) * | 2018-07-26 | 2018-12-25 | 安徽理工大学 | A kind of operation flow low frequency Behavior mining method based on Petri network |
CN109460391A (en) * | 2018-09-04 | 2019-03-12 | 安徽理工大学 | A kind of process model excavation new method cut based on process |
CN111984706A (en) * | 2020-08-20 | 2020-11-24 | 山东理工大学 | Emergency linkage disposal flow model mining method for emergency |
CN112069136A (en) * | 2020-08-28 | 2020-12-11 | 山东理工大学 | Outsourcing model mining method for emergency handling process of emergency event |
CN114756602A (en) * | 2022-05-19 | 2022-07-15 | 上海熵评科技有限公司 | Real-time streaming process mining method and system and computer readable storage medium |
CN114971710A (en) * | 2022-05-25 | 2022-08-30 | 北京凡得科技有限公司 | Event log-based multi-dimensional process variant difference analysis method and system |
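Non-Patent Citations (3)
Title |
---|
"A method for optimizing process models by incremental mining"; 薛洋婷, 王丽丽; 《赤峰学院学报(自然科学版)》 (Journal of Chifeng University, Natural Science Edition); Vol. 35, No. 10; pp. 57-60 * |
"A temporal activity representation learning method based on event log enhancement"; 倪维健, 孙宇健, 曾庆田 et al.; 《计算机集成制造系统》 (Computer Integrated Manufacturing Systems); Vol. 25, No. 4; pp. 837-846 * |
"Research on process mining methods based on incremental logs"; 周衍志; 《中国优秀硕士学位论文全文数据库》 (China Master's Theses Full-text Database); No. 02; I138-350 * |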
Also Published As
Publication number | Publication date |
---|---|
CN115525693A (en) | 2022-12-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||