CN115525693A - Incremental event log-oriented process model mining method and system - Google Patents

Incremental event log-oriented process model mining method and system

Info

Publication number
CN115525693A
CN115525693A CN202211144557.3A CN202211144557A CN115525693A CN 115525693 A CN115525693 A CN 115525693A CN 202211144557 A CN202211144557 A CN 202211144557A CN 115525693 A CN115525693 A CN 115525693A
Authority
CN
China
Prior art keywords
event log
directed graph
incremental
graph model
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211144557.3A
Other languages
Chinese (zh)
Other versions
CN115525693B (en)
Inventor
刘聪
刘文娟
李会玲
李彩虹
张立晔
郭娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Technology
Original Assignee
Shandong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Technology
Priority to CN202211144557.3A
Publication of CN115525693A
Application granted
Publication of CN115525693B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a process model mining method and system oriented to incremental event logs. The method comprises the following steps: 1) acquiring basic data, namely regularly stored business process event logs, and dividing the data into an original event log and a plurality of incremental event logs; 2) generating a directed graph model of the process from the original event log; 3) updating the directed graph model obtained in step 2) with a single incremental event log; 4) iteratively updating the directed graph model obtained in step 3); 5) mining a process model from the directed graph model obtained in step 4). The method ties process model updating to an intermediate model, the directed graph: instead of merging the original log with each incremental log as traditional mining methods do, it updates the intermediate directed graph with each incremental event log. While preserving the quality of the mined model, this reduces storage space usage, improves model discovery efficiency, and addresses the low efficiency of traditional methods in process model mining.

Description

Incremental event log-oriented process model mining method and system
Technical Field
The invention relates to the technical field of business process mining, and in particular to a process model mining method and system oriented to incremental event logs.
Background
Process model mining discovers a process model from a historical event log, providing a factual basis for understanding, improving, and restructuring an enterprise's business processes. Because of changes in the external environment and users' individual requirements, business processes change continuously, and the process model must be continuously improved and updated to follow them. However, conventional process mining methods can only handle static logs and static models. To keep the model up to date and accurate, they therefore merge the logs and mine the process model again, as follows: 1) mine a process model from the original log; 2) merge the incremental log with the original log and mine a process model from the merged log; 3) repeat step 2) until no incremental log remains. This achieves model updating to a certain extent, but it has clear drawbacks. On the one hand, merging event logs increases memory and hard disk consumption and thus the storage space usage; on the other hand, merging logs and re-mining the process model lengthens the execution time and lowers the model discovery efficiency. The conventional process model mining method therefore needs to be improved to address these specific problems.
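The conventional merge-and-re-mine procedure described above can be sketched in Python as follows. This is only an illustrative sketch: the representation of a log as a list of traces (each a list of activity names), the function names, and the use of directly-follows counting as a stand-in for a full mining run are assumptions, not part of the original disclosure. The `scanned` counter shows how often events are re-read as the merged log grows.

```python
from collections import Counter

def mine(log):
    # Stand-in for a full mining run (assumption): count directly-follows pairs.
    counts = Counter()
    for trace in log:
        counts.update(zip(trace, trace[1:]))
    return counts

def remine_after_merging(original, increments):
    merged = list(original)
    model = mine(merged)                          # step 1: mine the original log
    scanned = sum(len(t) for t in merged)
    for inc in increments:
        merged.extend(inc)                        # step 2: merge the logs ...
        model = mine(merged)                      # ... and re-read everything
        scanned += sum(len(t) for t in merged)
    return model, scanned                         # step 3: until no log is left

original = [["a", "b", "c"]]
increments = [[["a", "c"]], [["a", "b"]]]
model, scanned = remine_after_merging(original, increments)
```

In this toy input the three logs hold 7 events in total, yet 15 events are read, because the earlier logs are scanned again in every round.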
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a process model mining method oriented to incremental event logs, specifically an incremental process model mining framework, IPDF (Incremental Process Discovery Framework). The framework addresses the high storage space usage and low model discovery efficiency of conventional process mining methods: while ensuring the same quality of the mined model, it consumes less memory and hard disk storage and greatly shortens the execution time of the complete mining procedure.
A second purpose of the invention is to provide a process model mining system oriented to incremental event logs.
The first purpose of the invention is achieved by the following technical scheme: the incremental event log-oriented process model mining method comprises the following steps:
1) Acquiring basic data, namely regularly stored business process event logs, and dividing the basic data into an original event log and a plurality of incremental event logs;
2) Generating a directed graph model of the process from the original event log;
3) Updating the directed graph model obtained in step 2) with a single incremental event log;
4) Iteratively updating the directed graph model obtained in step 3);
5) Mining a process model from the directed graph model obtained in step 4).
Further, in step 1), obtaining the regularly stored business process event logs and dividing them into an original event log and a plurality of incremental event logs specifically comprises the following steps:
1.1) Obtaining a regularly stored business process event log; a regularly stored business process event log is an event log generated by the running business process over a period of time; because of the influence of the external environment and users' individual requirements, regularly stored event logs are often used in actual business operation to update the process model; the event log persists over a period of time and is a set of finite event sequences, each of which is called a trace;
1.2) Dividing the regularly stored business process event log obtained in step 1.1) into individual traces, sorting all traces by month, and storing the traces of the same month in the same event log according to the sorting result;
1.3) Dividing the month-sorted event logs obtained in step 1.2) into an original event log and incremental event logs, wherein the event log of the first month serves as the original event log and each remaining event log serves as one incremental event log.
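The division in steps 1.1) to 1.3) can be sketched as below. The sketch makes assumptions that the text does not state: a case is represented as a list of (activity, timestamp) pairs, a trace is assigned to the month of its first event, and the function name `split_by_month` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def split_by_month(cases):
    """cases maps a case id to a list of (activity, timestamp) events."""
    monthly = defaultdict(list)
    for events in cases.values():
        ordered = sorted(events, key=lambda e: e[1])
        start = ordered[0][1]              # assumption: month of the first event
        monthly[(start.year, start.month)].append([a for a, _ in ordered])
    months = sorted(monthly)
    original = monthly[months[0]]                  # log of the first month
    increments = [monthly[m] for m in months[1:]]  # one log per later month
    return original, increments

cases = {
    "c1": [("a", datetime(2022, 1, 3)), ("b", datetime(2022, 1, 4))],
    "c2": [("a", datetime(2022, 2, 1)), ("c", datetime(2022, 2, 2))],
    "c3": [("a", datetime(2022, 1, 9)), ("c", datetime(2022, 1, 9, 12))],
}
original, increments = split_by_month(cases)
```

Here the two January cases form the original event log and the single February case forms the only incremental event log.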
Further, in step 2), generating the directed graph model of the process from the original event log specifically comprises the following steps:
2.1) Selecting the original event log from step 1) as input;
2.2) Analyzing the directly-follows relationships in the log by formula (1);
$$\mathrm{DFR\_S}(L) = \{(\sigma_i, \sigma_{i+1}) \mid \sigma \in L \wedge 1 \le i < |\sigma|\} \quad (1)$$
in the formula, DFR_S(L) represents the set of directly-follows relationships in the event log L, L represents the event log, σ represents a trace in L, σ_i represents the i-th activity of the trace σ, and σ_{i+1} represents the (i+1)-th activity of the trace σ;
the directly-follows relationship is the logical relationship between two adjacent events in a trace of the log;
2.3) Starting from the directly-follows relationships obtained in step 2.2), constructing a directed graph model of the process with a directed graph algorithm by formula (2), wherein, if the directed graph model is a weighted directed graph, the weight of an edge represents the frequency with which one event directly follows the other;
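$$\mathrm{DFR\_DFG}(L) = \{(\sigma_i, \sigma_{i+1}, n) \mid (\sigma_i, \sigma_{i+1}) \in \mathrm{DFR\_S}(L)\} \quad (2)$$
in the formula, DFR_DFG(L) represents the directed graph model mined from the event log L, and n represents the frequency with which the i-th activity is directly followed by the (i+1)-th activity;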
in the directed graph model, the events in a trace are represented by nodes and the directly-follows relationships between events are represented by directed edges.
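As a minimal sketch of step 2), the set DFR_S(L) and the weighted graph DFR_DFG(L) can be computed as follows. The log representation (a list of traces, each a list of activity names) and the function names `dfr_set` and `dfr_dfg` are assumptions made for illustration.

```python
from collections import Counter

def dfr_set(log):
    # DFR_S(L): every pair of adjacent activities in any trace of the log.
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}

def dfr_dfg(log):
    # DFR_DFG(L): the same pairs, each weighted by how often it occurs.
    dfg = Counter()
    for t in log:
        for a, b in zip(t, t[1:]):
            dfg[(a, b)] += 1
    return dfg

log = [["a", "b", "c"], ["a", "c"], ["a", "b", "c"]]
relations = dfr_set(log)
dfg = dfr_dfg(log)
```

The keys of `dfg` are the directed edges and the values are the edge weights n; dropping the values gives the unweighted graph, which equals `relations`.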
Further, in step 3), updating the directed graph model obtained in step 2) with a single incremental event log specifically comprises the following steps:
3.1) Taking a single incremental event log, in month order, from the incremental event logs obtained in step 1), parsing its trace information, and then computing the directly-follows relationships of the incremental event log and their frequencies;
3.2) Selecting the newly added trace information from the directly-follows relationships obtained in step 3.1), and then using the directly-follows relationships in the newly added traces to update the directed graph model obtained in step 2), so that the incremental event log updates the directed graph model generated from the original event log; the update adds the directly-follows relationships in the newly added traces to the directly-follows relationships of the existing directed graph model.
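A single update as in steps 3.1) and 3.2) might look like the sketch below. It assumes a weighted graph stored as a `Counter` keyed by (activity, activity) pairs, and it interprets "adding" a relationship that already exists as summing its frequencies; both choices, and the name `update_dfg`, are assumptions rather than details given in the text.

```python
from collections import Counter

def directly_follows(log):
    dfg = Counter()
    for trace in log:
        dfg.update(zip(trace, trace[1:]))
    return dfg

def update_dfg(dfg, incremental_log):
    increment = directly_follows(incremental_log)  # step 3.1
    new_edges = set(increment) - set(dfg)          # relationships not seen before
    updated = Counter(dfg)                         # leave the input graph intact
    updated.update(increment)                      # step 3.2: add edges and frequencies
    return updated, new_edges

dfg = directly_follows([["a", "b", "c"]])
updated, new_edges = update_dfg(dfg, [["a", "c"], ["a", "b", "d"]])
```

Only the incremental log is read during the update; the original log is not needed again once its graph exists.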
Further, in step 4), iteratively updating the directed graph model obtained in step 3) specifically comprises the following steps:
4.1) Selecting single incremental event logs, one after another, from the remaining incremental event logs obtained in step 1);
4.2) Repeating step 3) for each log selected in step 4.1) until all the incremental event logs have been selected;
4.3) Saving the directed graph model obtained when the iteration of step 4.2) ends to the storage space, to serve as the input of the next step.
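The iteration and the final save of steps 4.1) to 4.3) can be sketched as follows. The JSON layout (a list of `[source, target, frequency]` triples) and the helper names are hypothetical; the text only says that the graph is saved to the storage space.

```python
import json
from collections import Counter

def directly_follows(log):
    dfg = Counter()
    for trace in log:
        dfg.update(zip(trace, trace[1:]))
    return dfg

def iterate_updates(dfg, incremental_logs):
    for log in incremental_logs:           # 4.1: take the logs in order
        dfg = dfg + directly_follows(log)  # 4.2: repeat the single-log update
    return dfg

def dump_dfg(dfg):
    # 4.3: serialise the edges as [source, target, frequency] triples.
    return json.dumps(sorted([a, b, n] for (a, b), n in dfg.items()))

def load_dfg(text):
    return Counter({(a, b): n for a, b, n in json.loads(text)})

start = directly_follows([["a", "b", "c"]])
final = iterate_updates(start, [[["a", "c"]], [["a", "b", "d"]]])
stored = dump_dfg(final)
```

What is persisted is the graph, whose size depends on the number of distinct edges rather than on the number of events logged so far.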
Further, in step 5), mining the process model from the directed graph model obtained in step 4) specifically comprises the following steps:
5.1) Taking the directed graph model obtained in step 4), updated with all the incremental event logs, as input;
5.2) Analyzing the directly-follows relationships between events in the directed graph model and mining a Petri net from them with a directed-graph-based process discovery algorithm; such algorithms fall into two classes according to whether the directed graph is weighted: the first class, for unweighted directed graphs, comprises Alpha Miner and Inductive Miner, and the second class, for weighted directed graphs, comprises Heuristic Miner and Inductive Miner-infrequent.
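To illustrate why the second class of algorithms needs edge weights, the sketch below computes the dependency measure commonly associated with Heuristic Miner directly from a weighted directed graph. It covers only this one ingredient, not the full algorithm or the construction of a Petri net; the threshold value 0.5 and the function names are assumptions.

```python
def dependency(dfg, a, b):
    ab = dfg.get((a, b), 0)
    ba = dfg.get((b, a), 0)
    if a == b:
        return ab / (ab + 1)             # length-one loop
    return (ab - ba) / (ab + ba + 1)     # close to 1 when a is reliably followed by b

def dependency_graph(dfg, threshold=0.5):
    # Keep only the edges whose dependency reaches the (hypothetical) threshold.
    scores = {(a, b): dependency(dfg, a, b) for (a, b) in dfg}
    return {edge: s for edge, s in scores.items() if s >= threshold}

dfg = {("a", "b"): 9, ("b", "c"): 1, ("c", "b"): 1, ("d", "d"): 3}
strong = dependency_graph(dfg)
```

The edges between b and c occur once in each direction, so their dependency is 0 and they are filtered out, which an unweighted graph could not express.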
The second purpose of the invention is achieved by the following technical scheme: the incremental event log-oriented process model mining system comprises an event log acquisition and preprocessing module, an original-event-log directed graph generation module, a single-incremental-log directed graph update module, an iterative directed graph update module, and a directed-graph-based process model mining module;
the event log acquisition and preprocessing module acquires the regularly stored business process event logs and divides them into an original event log and a plurality of incremental event logs;
the original-event-log directed graph generation module selects the original event log, analyzes the directly-follows relationships in it and their frequencies, and on that basis generates a directed graph model with a directed graph algorithm;
the single-incremental-log directed graph update module selects a single incremental event log and uses it to update the directed graph model generated from the original event log;
the iterative directed graph update module selects logs one after another from the remaining incremental event logs and iteratively updates the directed graph model with them;
the directed-graph-based process model mining module selects the directed graph model updated with all the incremental event logs and mines the process model from it with a directed-graph-based process discovery algorithm.
Further, the event log acquisition and preprocessing module specifically performs the following operations:
acquiring a regularly stored business process event log; dividing the regularly stored business process event log into individual traces, sorting all traces by month, and storing the traces of the same month in the same event log according to the sorting result; selecting the event log of the first month as the original event log and taking each remaining event log as an incremental event log; a regularly stored business process event log is an event log generated by the running business process over a period of time; because of the influence of the external environment and users' individual requirements, regularly stored event logs are often used in actual business operation to update the process model; the event log persists over a period of time and is a set of finite event sequences, each of which is called a trace;
the original-event-log directed graph generation module specifically performs the following operations:
selecting the original event log as input; analyzing the directly-follows relationships in the original event log and their frequencies; and constructing a directed graph model with a directed graph algorithm from the obtained directly-follows relationships and frequencies.
Further, the single-incremental-log directed graph update module specifically performs the following operations:
taking a single incremental event log, in month order, from the incremental event logs, parsing its trace information, and then computing the directly-follows relationships of the incremental event log and their frequencies; selecting the newly added directly-follows relationships from those obtained and using them to update the directly-follows relationships of the directed graph model constructed from the original event log, so that the incremental event log updates the directed graph model generated from the original event log;
the iterative directed graph update module specifically performs the following operations:
selecting single incremental event logs one after another from the remaining incremental event logs; repeating the single-incremental-log update of the directed graph model until all the incremental event logs have been selected; and saving the directed graph model obtained when the iteration ends to the storage space, to serve as the input of the next module.
Further, the directed-graph-based process model mining module specifically performs the following operations:
selecting the directed graph model updated with all the incremental event logs and analyzing the relationships between its nodes and their frequencies; selecting any directed-graph-based process discovery algorithm and mining a process model from the follows relationships between the nodes of the directed graph model and their frequencies; the nodes of the directed graph model are the events in the traces of the event logs.
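One way to picture how the modules fit together is the small class below, in which the discovery algorithm is passed in as a function so that any directed-graph-based algorithm can be plugged in. The class name `IncrementalDFG`, its methods, and the toy `edges_above` discovery function are hypothetical and only stand in for the modules described above.

```python
from collections import Counter

class IncrementalDFG:
    def __init__(self, original_log):
        self.dfg = self._count(original_log)          # generation module

    @staticmethod
    def _count(log):
        dfg = Counter()
        for trace in log:
            dfg.update(zip(trace, trace[1:]))
        return dfg

    def update(self, incremental_log):
        self.dfg.update(self._count(incremental_log))  # single-log update module
        return self                                    # chaining = iterative update

    def mine(self, discover):
        return discover(dict(self.dfg))                # mining module

def edges_above(min_frequency):
    # Hypothetical stand-in for a directed-graph-based discovery algorithm.
    def discover(dfg):
        return {edge for edge, n in dfg.items() if n >= min_frequency}
    return discover

model = (IncrementalDFG([["a", "b", "c"]])
         .update([["a", "b"]])
         .update([["a", "c"]])
         .mine(edges_above(2)))
```

Because `mine` only receives the graph, swapping the discovery algorithm does not affect how the graph is built or updated.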
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The method replaces the merging of the original event log with the incremental event logs, as done in conventional process model mining, with updating an intermediate directed graph model from each incremental event log, which reduces storage space usage and improves mining efficiency.
2. The invention provides an incremental process model mining framework that can be readily applied to process model updating in the process mining field, offering a more efficient way to update the process model periodically from regularly stored business process event logs.
3. While ensuring the same quality of the mined model, the invention greatly reduces the memory and hard disk storage consumed during mining.
4. While ensuring the same quality of the mined model, the invention shortens the execution time of the complete mining procedure and improves model discovery efficiency.
5. The method is broadly applicable to incremental process model mining, simple to operate, and highly adaptable, and has broad prospects for incremental model updating.
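The "same quality" premise in items 3 and 4 can be checked on a small example: the graph obtained by incremental updates should equal the graph computed from the merged log, so any directed-graph-based algorithm receives the same input either way. The sketch below assumes that every trace lies entirely within one log (no trace is split across two months); the sample logs are invented for illustration.

```python
from collections import Counter

def directly_follows(log):
    dfg = Counter()
    for trace in log:
        dfg.update(zip(trace, trace[1:]))
    return dfg

original = [["a", "b", "c"], ["a", "c"]]
increments = [[["a", "b", "c"]], [["a", "d", "c"], ["a", "c"]]]

# Incremental route: update the graph log by log.
incremental = directly_follows(original)
for log in increments:
    incremental.update(directly_follows(log))

# Conventional route: merge all logs, then build the graph once.
merged = original + [trace for log in increments for trace in log]
same = incremental == directly_follows(merged)
```

The equality holds because each trace contributes its adjacent pairs independently of the log it is stored in.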
Drawings
FIG. 1 is a logic flow diagram of the method of the present invention.
Fig. 2 is a schematic diagram of the original event log and the incremental event logs, in CSV format, into which the regularly stored business process event log is divided in this embodiment.
Fig. 3 is a schematic diagram of an event log after conversion from CSV format into XES format in this embodiment.
Fig. 4 is a schematic diagram of a directed graph model constructed by original event logs in this embodiment.
Fig. 5 is a schematic diagram of a directed graph model updated by a single incremental event log in this embodiment.
Fig. 6 is a schematic diagram of a directed graph model updated by all incremental event logs in this embodiment.
Fig. 7 is a schematic diagram of a flow model mined from a directed graph model by using the Heuristic Miner algorithm in this embodiment.
Fig. 8 is a system architecture diagram of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the embodiments of the present invention are not limited thereto.
Example 1
As shown in fig. 1, this embodiment discloses a process model mining method oriented to incremental event logs, which comprises the following steps:
1) Acquiring basic data, namely regularly stored business process event logs, and dividing the basic data into an original event log and a plurality of incremental event logs, which specifically comprises the following steps:
1.1) Obtaining a regularly stored business process event log; a regularly stored business process event log is an event log generated by the running business process over a period of time; because of the influence of the external environment and users' individual requirements, regularly stored event logs are often used in actual business operation to update the process model; the event log persists over a period of time and is a set of finite event sequences, each of which is called a trace;
1.2) Dividing the regularly stored business process event log obtained in step 1.1) into individual traces, sorting all traces by month (or another time unit), and storing the traces of the same month in the same event log according to the sorting result;
1.3) Dividing the event logs obtained in step 1.2), sorted by month (or another time unit), into an original event log and incremental event logs, wherein the event log of the first month serves as the original event log and each remaining event log serves as one incremental event log.
Through the above steps, an original event log and a plurality of incremental event logs are obtained from the regularly stored business process event log. The event logs are stored in the storage space under names formed from the year and month, and each contains a case number, an event, a startTime, and a completeTime. The original and incremental event logs are initially in CSV format, as shown in fig. 2, and need to be converted into XES format, as shown in fig. 3, for the subsequent operations.
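The CSV-to-XES conversion mentioned above could be sketched with the Python standard library as follows. The column headers (`case`, `event`, `startTime`, `completeTime`), the ISO 8601 timestamp format, and the choice of `completeTime` as the event timestamp are assumptions; the output is a minimal XES-style document without extension or classifier declarations, not a complete implementation of the standard.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

def csv_to_xes(csv_text):
    cases = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        cases[row["case"]].append(row)            # group events by case number
    log = ET.Element("log", {"xes.version": "1.0"})
    for case_id, rows in cases.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", key="concept:name", value=case_id)
        # ISO 8601 strings sort chronologically (assumed timestamp format).
        for row in sorted(rows, key=lambda r: r["startTime"]):
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string", key="concept:name", value=row["event"])
            ET.SubElement(event, "date", key="time:timestamp", value=row["completeTime"])
    return ET.tostring(log, encoding="unicode")

sample = """case,event,startTime,completeTime
1,a,2022-01-01T09:00:00,2022-01-01T09:05:00
1,b,2022-01-01T09:10:00,2022-01-01T09:20:00
2,a,2022-01-02T08:00:00,2022-01-02T08:01:00
"""
xes = csv_to_xes(sample)
```

Each case number becomes one trace element and each CSV row becomes one event element inside it.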
2) Generating the directed graph model of the process from the original event log, which specifically comprises the following steps:
2.1) Selecting the original event log from step 1) as input;
2.2) Analyzing the directly-follows relationships in the log by formula (1);
$$\mathrm{DFR\_S}(L) = \{(\sigma_i, \sigma_{i+1}) \mid \sigma \in L \wedge 1 \le i < |\sigma|\} \quad (1)$$
in the formula, DFR_S(L) represents the set of directly-follows relationships in the event log L, L represents the event log, σ represents a trace in L, σ_i represents the i-th activity of the trace σ, and σ_{i+1} represents the (i+1)-th activity of the trace σ;
the directly-follows relationship is the logical relationship between two adjacent events in a trace of the log;
2.3) Starting from the directly-follows relationships obtained in step 2.2), constructing a directed graph model with a directed graph algorithm by formula (2), wherein, if the directed graph is a weighted directed graph, the weight of an edge represents the frequency with which one event directly follows the other;
$$\mathrm{DFR\_DFG}(L) = \{(\sigma_i, \sigma_{i+1}, n) \mid (\sigma_i, \sigma_{i+1}) \in \mathrm{DFR\_S}(L)\} \quad (2)$$
in the formula, DFR_DFG(L) represents the directed graph model mined from the event log L, and n represents the frequency with which the i-th activity is directly followed by the (i+1)-th activity;
in the directed graph, the events in a trace are represented by nodes and the directly-follows relationships between events are represented by directed edges.
Through the above steps, the directed graph model of the process generated from the original event log is obtained, as shown in fig. 4.
3) Updating the directed graph model with a single incremental event log, which specifically comprises the following steps:
3.1) Taking a single incremental event log, in order of month (or another time unit), from the incremental event logs obtained in step 1), parsing its trace information, and then computing the directly-follows relationships of the incremental event log and their frequencies;
3.2) Extracting the newly added trace information from the directly-follows relationships obtained in step 3.1), and then using the directly-follows relationships in the newly added traces to update the directed graph model obtained in step 2), so that the incremental event log updates the directed graph model generated from the original event log; the update adds the directly-follows relationships in the newly added traces to the directly-follows relationships of the existing directed graph model.
Through the above steps, the directed graph model updated with a single incremental event log is obtained, as shown in fig. 5.
4) Iteratively updating the directed graph model obtained in step 3), which specifically comprises the following steps:
4.1) Selecting single incremental event logs, one after another, from the remaining incremental event logs obtained in step 1);
4.2) Repeating step 3) for each log selected in step 4.1) until all the incremental event logs have been selected;
4.3) Saving the directed graph model obtained when the iteration of step 4.2) ends to the storage space, to serve as the input of the next step.
Through the above steps, the directed graph model updated with all the incremental event logs is obtained, as shown in fig. 6.
5) Mining the process model from the directed graph model, which specifically comprises the following steps:
5.1) Taking the directed graph model obtained in step 4), updated with all the incremental event logs, as input;
5.2) Analyzing the directly-follows relationships between events in the directed graph model and mining a Petri net from them with a directed-graph-based process discovery algorithm; such algorithms can be divided into two classes according to whether the directed graph model is weighted: the first class, based on unweighted directed graphs, comprises Alpha Miner and Inductive Miner, and the second class, based on weighted directed graphs, comprises Heuristic Miner and Inductive Miner-infrequent.
Through the above steps, a Petri net mined from the directed graph model with the Heuristic Miner algorithm is obtained, as shown in fig. 7.
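For the first class of algorithms, the unweighted edges alone are enough. As an illustration, the sketch below derives the footprint relations used by the Alpha algorithm (causality, parallelism, and no relation) from an edge set; it stops before the Petri net construction, and the function name and the edge set are invented for illustration.

```python
def footprint(edges):
    activities = sorted({x for edge in edges for x in edge})
    table = {}
    for a in activities:
        for b in activities:
            ab, ba = (a, b) in edges, (b, a) in edges
            if ab and not ba:
                table[(a, b)] = "->"   # a causes b
            elif ba and not ab:
                table[(a, b)] = "<-"   # b causes a
            elif ab and ba:
                table[(a, b)] = "||"   # a and b can occur in either order
            else:
                table[(a, b)] = "#"    # a and b never follow each other directly
    return table

edges = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "b"), ("b", "d"), ("c", "d")}
table = footprint(edges)
```

In this example b and c are classified as parallel because each directly follows the other, while a and d are unrelated because neither directly follows the other.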
Example 2
This embodiment discloses a process model mining system oriented to incremental event logs. As shown in fig. 8, the system includes the following functional modules: an event log acquisition and preprocessing module, an original-event-log directed graph generation module, a single-incremental-log directed graph update module, an iterative directed graph update module, and a directed-graph-based process model mining module;
the event log acquisition and preprocessing module acquires the regularly stored business process event logs and divides them into an original event log and a plurality of incremental event logs;
the original-event-log directed graph generation module selects the original event log, analyzes the directly-follows relationships in it and their frequencies, and on that basis generates a directed graph model with a directed graph algorithm;
the single-incremental-log directed graph update module selects a single incremental event log and uses it to update the directed graph model generated from the original event log;
the iterative directed graph update module selects logs one after another from the remaining incremental event logs and iteratively updates the directed graph model with them;
the directed-graph-based process model mining module selects the directed graph model updated with all the incremental event logs and mines the process model from it with a directed-graph-based process discovery algorithm.
Further, the event log acquisition and preprocessing module specifically performs the following operations:
acquiring a regularly stored business process event log; dividing the regularly stored business process event log into individual traces, sorting the traces by month (or another time unit), and forming one event log from the traces of each month, wherein the event log of the first month serves as the original event log and the logs of the remaining months serve as incremental event logs; analyzing the directly-follows relationships between events in the original event log and their frequencies, so as to construct the relationships and frequencies between nodes in the directed graph model;
a regularly stored business process event log is an event log generated by the running business process over a period of time; because the external environment and users' individual requirements force the process model to change continuously, regularly stored event logs are often used in actual operation to update the process model periodically; the event log is a set of finite event sequences, each of which is called a trace.
Further, the module for generating a directed graph model from the original event log specifically executes the following operations:
selecting an original event log as input content; analyzing the direct following activity relationship and the corresponding frequency in the original event log; and constructing a directed graph model by using a directed graph algorithm based on the obtained direct following activity relation and frequency.
Further, the single incremental event log updating directed graph model module specifically executes the following operations:
acquiring a single increment event log from the increment event logs according to the sequence of months (or other time units), analyzing track information of the single increment event log, and then calculating the direct following activity relationship and the corresponding frequency of the increment event log; and selecting a newly added direct following activity relation from the obtained direct following activity relations to update the direct following activity relation of the directed graph model constructed by the original event log, so as to realize the update of the incremental event log on the directed graph model generated by the original event log.
Further, the module for iteratively updating the directed graph model specifically executes the following operations:
sequentially selecting a single incremental event log from the remaining incremental event logs; repeating the operation of updating the directed graph model with a single incremental event log until all incremental event logs have been selected; and saving the directed graph model obtained when the iteration ends to a storage space so that it serves as the input of the next module.
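The iteration can be sketched in Python as a loop over the incremental logs. The JSON serialization used here for the storage step is an assumption chosen for illustration, as are all function names; any persistent format would serve.

```python
import json
from collections import Counter

def directly_follows(log):
    """Return a Counter mapping each directly-follows pair to its frequency."""
    counts = Counter()
    for trace in log:
        counts.update(zip(trace, trace[1:]))
    return counts

def iterate_updates(original_log, incremental_logs):
    """Build the graph from the original log, then fold in each incremental log."""
    dfg = directly_follows(original_log)
    for incremental_log in incremental_logs:  # assumed to be in month order
        dfg.update(directly_follows(incremental_log))
    return dfg

def dumps_dfg(dfg):
    """Serialize the graph as a JSON list of [source, target, frequency]."""
    return json.dumps([[a, b, n] for (a, b), n in sorted(dfg.items())])

def loads_dfg(text):
    """Restore a graph produced by dumps_dfg."""
    return Counter({(a, b): n for a, b, n in json.loads(text)})

# Hypothetical logs: one original log and two incremental logs
final_dfg = iterate_updates(
    [["a", "b", "c"]],
    [[["a", "c"]], [["a", "b", "d"], ["a", "b", "c"]]],
)
stored = dumps_dfg(final_dfg)
```

Between iterations only the graph is kept, so the storage needed grows with the number of distinct directly-follows pairs rather than with the number of traces.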
Further, the module for mining a process model from the directed graph model specifically executes the following operations:
selecting the directed graph model updated by all incremental event logs, and analyzing the relations between all nodes and their frequencies; selecting any process discovery algorithm based on the directed graph model, and mining a process model from the directly-follows relations between the nodes of the directed graph model and their frequencies; wherein the nodes of the directed graph model are the events in the event log traces.
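To illustrate how a discovery algorithm can start from the directed graph alone, the Python sketch below derives the ordering relations (the footprint) that the Alpha algorithm uses as its first step. It covers only that first step and does not construct a Petri net; the function name and example graph are hypothetical.

```python
def footprint(dfg_edges):
    """Derive Alpha-style ordering relations from directly-follows edges.

    For every ordered pair of activities the result holds one of:
    "->" (a causes b), "<-" (b causes a), "||" (parallel),
    "#" (never directly follow each other).
    """
    edges = set(dfg_edges)
    activities = sorted({x for edge in edges for x in edge})
    relations = {}
    for a in activities:
        for b in activities:
            ab, ba = (a, b) in edges, (b, a) in edges
            if ab and not ba:
                relations[(a, b)] = "->"
            elif ba and not ab:
                relations[(a, b)] = "<-"
            elif ab and ba:
                relations[(a, b)] = "||"
            else:
                relations[(a, b)] = "#"
    return relations

# Hypothetical graph after all incremental updates; weights are ignored here
dfg = {("a", "b"): 3, ("a", "c"): 2, ("b", "c"): 1,
       ("c", "b"): 1, ("b", "d"): 3, ("c", "d"): 2}
relations = footprint(dfg)
```

In this example `b` and `c` follow each other in both directions and are therefore classified as parallel, while `a` and `d` never follow each other directly.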
In conclusion, with the above scheme, the invention provides a new mining framework for process model mining. Using the directed graph model as the intermediate model during model updating effectively addresses the excessive storage consumption and execution time of traditional mining methods and improves the efficiency of model discovery, so the method has practical value and is worth popularizing.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (10)

1. An incremental event log-oriented process model mining method, characterized by comprising the following steps:
1) Acquiring basic data, namely a regularly stored business process event log, and dividing it into an original event log and a plurality of incremental event logs;
2) Generating a directed graph model of the process from the original event log;
3) Updating the directed graph model obtained in step 2) with a single incremental event log;
4) Iteratively updating the directed graph model obtained in step 3);
5) Mining a process model from the directed graph model obtained in step 4).
2. The incremental event log-oriented process model mining method of claim 1, wherein: in step 1), acquiring a regularly stored business process event log and dividing it into an original event log and a plurality of incremental event logs specifically comprises the following steps:
1.1) Obtaining a regularly stored business process event log; the regularly stored business process event log refers to an event log generated by the operation of a business process over a period of time, and because of the influence of the external environment and the personalized requirements of users, regularly stored event logs are often used to update the process model during actual business operation; the event log spans a period of time and is a set of finite event sequences, each of which is called a trace;
1.2) Splitting the regularly stored business process event log obtained in step 1.1) into individual traces, then sorting all traces by month, and storing the traces of the same month in the same event log according to the sorting result;
1.3) Dividing the month-sorted event logs obtained in step 1.2) into an original event log and incremental event logs, wherein the event log of the first month serves as the original event log and each remaining event log serves as an incremental event log.
3. The incremental event log-oriented process model mining method of claim 2, wherein: in step 2), generating the directed graph model of the process from the original event log specifically comprises the following steps:
2.1) Selecting the original event log from step 1) as input;
2.2) Analyzing the directly-follows activity relations in the log by formula (1);
[Formula (1) appears as an image in the source and is not reproduced here.]
in the formula, DFR_S(L) represents the set of directly-follows activity relations in the event log L, L represents the event log, σ represents a trace in L, σ_i represents the i-th activity of trace σ, and σ_{i+1} represents the (i+1)-th activity of trace σ;
the directly-follows activity relation refers to the logical relation between two adjacent events in a log trace;
2.3) Starting from the directly-follows activity relations obtained in step 2.2), constructing a directed graph model of the process with a directed graph algorithm by formula (2), wherein, if the directed graph model is a weighted directed graph, the weight of an edge represents the directly-follows frequency between the events;
[Formula (2) appears as an image in the source and is not reproduced here.]
in the formula, DFR_DFG(L) represents the directed graph model mined from the event log L, and n represents the frequency with which the i-th activity is directly followed by the (i+1)-th activity;
the directed graph model represents the events in a trace as nodes and the directly-follows activity relations between events as directed edges.
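The images of formulas (1) and (2) are not available in this text. Judging only from the symbol definitions given in claim 3, the two formulas plausibly have a form like the following. This is a reconstruction under that assumption, not the wording of the original formulas; the notation |σ| for the length of trace σ and the counting shorthand #_L are introduced here for illustration.

```latex
\[
\mathit{DFR\_S}(L) = \{\, (\sigma_i, \sigma_{i+1}) \mid \sigma \in L,\ 1 \le i < |\sigma| \,\}
\tag{1}
\]
\[
\mathit{DFR\_DFG}(L) = \{\, (\sigma_i, \sigma_{i+1}, n) \mid (\sigma_i, \sigma_{i+1}) \in \mathit{DFR\_S}(L),\ n = \#_L(\sigma_i, \sigma_{i+1}) \,\}
\tag{2}
\]
```

Here #_L(a, b) denotes the number of positions, over all traces of L, at which activity a is directly followed by activity b. For example, for L = {⟨a, b, c⟩, ⟨a, c⟩} this reading gives DFR_S(L) = {(a, b), (b, c), (a, c)}, and every triple in DFR_DFG(L) has n = 1.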
4. The incremental event log-oriented process model mining method of claim 3, wherein: in step 3), updating the directed graph model obtained in step 2) with a single incremental event log specifically comprises the following steps:
3.1) Obtaining a single incremental event log from the incremental event logs obtained in step 1) in month order, parsing its trace information, and then computing the directly-follows activity relations of the incremental event log and their frequencies;
3.2) Selecting the newly added event log trace information from the directly-follows activity relations obtained in step 3.1), and then obtaining the directly-follows activity relations in the newly added event log traces to update the directed graph model obtained in step 2), so that the incremental event log updates the directed graph model generated from the original event log; the updating adds the directly-follows activity relations of the newly added event log traces to the directly-follows activity relations of the existing directed graph model.
5. The incremental event log-oriented process model mining method of claim 4, wherein: in step 4), iteratively updating the directed graph model obtained in step 3) specifically comprises the following steps:
4.1) Sequentially selecting a single incremental event log from the remaining incremental event logs obtained in step 1);
4.2) Repeating step 3) and step 4) until all incremental event logs have been selected;
4.3) Storing the directed graph model obtained at the end of the iteration in step 4.2) in a storage space, to serve as the input of the next step.
6. The incremental event log-oriented process model mining method of claim 5, wherein: in step 5), mining the process model from the directed graph model obtained in step 4) specifically comprises the following steps:
5.1) Taking the directed graph model updated by all incremental event logs, obtained in step 4), as input;
5.2) Analyzing the directly-follows activity relations between events in the directed graph model, and mining a Petri net from them with a directed graph-based process discovery algorithm; the process discovery algorithms based on the directed graph model are divided into two types according to the weights of the directed graph: the first type comprises the directed graph-based Alpha Miner and Inductive Miner, and the second type comprises the directed graph-based Heuristic Miner and Inductive Miner-infrequent.
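The role of edge weights in the second type of algorithm can be illustrated with the dependency measure commonly associated with the Heuristic Miner, computed directly from the weighted graph. The Python sketch below is illustrative only and is not taken from the claims; the threshold value 0.5 and the function names are assumptions.

```python
def dependency(dfg, a, b):
    """Heuristic Miner dependency measure between activities a and b.

    Uses only the directly-follows frequencies stored in the graph.
    Values near 1 suggest that a is a genuine cause of b.
    """
    ab = dfg.get((a, b), 0)
    ba = dfg.get((b, a), 0)
    if a == b:
        return ab / (ab + 1)  # length-one loop
    return (ab - ba) / (ab + ba + 1)

def dependency_graph(dfg, threshold=0.5):
    """Keep the edges whose dependency measure reaches the threshold.

    The default threshold of 0.5 is an arbitrary illustrative value.
    """
    return {
        (a, b): dependency(dfg, a, b)
        for (a, b) in dfg
        if dependency(dfg, a, b) >= threshold
    }

# Hypothetical weighted graph after all incremental updates
dfg = {("a", "b"): 9, ("b", "a"): 1, ("b", "c"): 4,
       ("c", "b"): 4, ("c", "d"): 5}
kept = dependency_graph(dfg)
```

In this example the edges between `b` and `c` are dropped because the two directions are equally frequent, whereas an algorithm that ignores weights would treat all five edges alike.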
7. An incremental event log-oriented process model mining system, characterized by comprising: an event log acquisition and log preprocessing module; a module for generating a directed graph model of the process from the original event log; a module for updating the directed graph model with a single incremental event log; a module for iteratively updating the directed graph model; and a module for mining a process model from the directed graph model;
the event log acquisition and log preprocessing module is used for acquiring a regularly stored business process event log and dividing it into an original event log and a plurality of incremental event logs;
the module for generating a directed graph model of the process from the original event log is used for selecting the original event log, analyzing the directly-follows activity relations in it and their frequencies, and generating a directed graph model with a directed graph algorithm on that basis;
the module for updating the directed graph model with a single incremental event log is used for selecting a single incremental event log and using it to update the directed graph model of the process generated from the original event log;
the module for iteratively updating the directed graph model is used for sequentially selecting logs from the remaining incremental event logs and iteratively updating the directed graph model with them;
and the module for mining a process model from the directed graph model is used for selecting the directed graph model updated by all incremental event logs and mining the process model from it with a directed graph-based process discovery algorithm.
8. The incremental event log-oriented process model mining system of claim 7, wherein: the event log acquisition and log preprocessing module specifically executes the following operations:
acquiring a regularly stored business process event log; splitting the regularly stored business process event log into individual traces, then sorting all traces by month, and storing the traces of the same month in the same event log according to the sorting result; selecting the event log of the first month as the original event log, and taking each remaining event log as an incremental event log; the regularly stored business process event log refers to an event log generated by the operation of a business process over a period of time, and because of the influence of the external environment and the personalized requirements of users, regularly stored event logs are often used to update the process model during actual business operation; the event log spans a period of time and is a set of finite event sequences, each of which is called a trace;
the module for generating a directed graph model of the process from the original event log specifically executes the following operations:
selecting the original event log as input; analyzing the directly-follows activity relations in the original event log and their frequencies; and constructing a directed graph model with a directed graph algorithm from the obtained directly-follows activity relations and frequencies.
9. The incremental event log-oriented process model mining system of claim 7, wherein: the module for updating the directed graph model with a single incremental event log specifically executes the following operations:
acquiring a single incremental event log from the incremental event logs in month order, parsing its trace information, and then computing the directly-follows activity relations of the incremental event log and their frequencies; selecting the newly added directly-follows activity relations from those obtained and using them to update the directly-follows activity relations of the directed graph model constructed from the original event log, so that the incremental event log updates the directed graph model generated from the original event log;
the module for iteratively updating the directed graph model specifically executes the following operations:
sequentially selecting a single incremental event log from the remaining incremental event logs; repeating the operation of updating the directed graph model with a single incremental event log until all incremental event logs have been selected; and storing the directed graph model obtained when the iteration ends in a storage space so that it serves as the input of the next module.
10. The incremental event log-oriented process model mining system of claim 7, wherein: the module for mining a process model from the directed graph model specifically executes the following operations:
selecting the directed graph model updated by all incremental event logs, and analyzing the relations between all nodes and their frequencies; selecting any process discovery algorithm based on the directed graph model, and mining a process model from the directly-follows relations between the nodes of the directed graph model and their frequencies; wherein the nodes of the directed graph model are the events in the event log traces.
CN202211144557.3A 2022-09-20 2022-09-20 Incremental event log-oriented process model mining method and system Active CN115525693B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211144557.3A CN115525693B (en) 2022-09-20 2022-09-20 Incremental event log-oriented process model mining method and system

Publications (2)

Publication Number Publication Date
CN115525693A (en) 2022-12-27
CN115525693B (en) 2024-02-06

Family

ID=84697912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211144557.3A Active CN115525693B (en) 2022-09-20 2022-09-20 Incremental event log-oriented process model mining method and system

Country Status (1)

Country Link
CN (1) CN115525693B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778051A (en) * 2014-01-09 2014-05-07 安徽理工大学 Business process increment mining method based on L* algorithm
CN105069044A (en) * 2015-07-22 2015-11-18 安徽理工大学 Simulated indirect dependency based novel process model mining method
CN109086385A (en) * 2018-07-26 2018-12-25 安徽理工大学 A kind of operation flow low frequency Behavior mining method based on Petri network
CN109460391A (en) * 2018-09-04 2019-03-12 安徽理工大学 A kind of process model excavation new method cut based on process
CN111984706A (en) * 2020-08-20 2020-11-24 山东理工大学 Emergency linkage disposal flow model mining method for emergency
CN112069136A (en) * 2020-08-28 2020-12-11 山东理工大学 Outsourcing model mining method for emergency handling process of emergency event
CN114756602A (en) * 2022-05-19 2022-07-15 上海熵评科技有限公司 Real-time streaming process mining method and system and computer readable storage medium
CN114971710A (en) * 2022-05-25 2022-08-30 北京凡得科技有限公司 Event log-based multi-dimensional process variant difference analysis method and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
倪维健,孙宇健,曾庆田等: ""基于事件日志增强的时序活动表示学习方法"", 《计算机集成制造系统》, vol. 25, no. 4, pages 837 - 846 *
周衍志: ""基于增量日志的过程挖掘方法研究"", 《中国优秀硕士学位论文全文数据库》, no. 02, pages 138 - 350 *
薛洋婷,王丽丽: ""一种增量挖掘优化流程模型方法"", 《赤峰学院学报(自然科学版)》, vol. 35, no. 10, pages 57 - 60 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116258350A (en) * 2023-05-15 2023-06-13 烟台岸基网络科技有限公司 Sea container transportation monitoring method
CN116258350B (en) * 2023-05-15 2023-08-11 烟台岸基网络科技有限公司 Sea container transportation monitoring method
CN117495071A (en) * 2023-12-29 2024-02-02 安徽思高智能科技有限公司 Flow discovery method and system based on predictive log enhancement
CN117495071B (en) * 2023-12-29 2024-05-14 安徽思高智能科技有限公司 Flow discovery method and system based on predictive log enhancement

Also Published As

Publication number Publication date
CN115525693B (en) 2024-02-06

Similar Documents

Publication Publication Date Title
CN115525693A (en) Incremental event log-oriented process model mining method and system
CN104715073B (en) Based on the association rule mining system for improving Apriori algorithm
CN102945240B (en) Method and device for realizing association rule mining algorithm supporting distributed computation
US9129002B2 (en) Dividing device, dividing method, and recording medium
CN102214086A (en) General-purpose parallel acceleration algorithm based on multi-core processor
CN113342495B (en) Cross-tissue multi-instance sub-process model mining method and system
CN103150163A (en) Map/Reduce mode-based parallel relating method
CN107391621A (en) A kind of parallel association rule increment updating method based on Spark
CN103198099A (en) Cloud-based data mining application method facing telecommunication service
CN112651618A (en) Construction method of audit dimension model for online audit of metering data
CN115408546A (en) Time sequence data management method, device, equipment and storage medium
CN115309749A (en) Big data experiment system for scientific and technological service
CN109768878B (en) Network work order calculation method and device based on big data
CN103984723A (en) Method used for updating data mining for frequent item by incremental data
CN115858719B (en) Big data analysis-based SIM card activity prediction method and system
CN112699252B (en) Processing method of attribute data applied to knowledge graph and electronic equipment
CN107707487B (en) Real-time retrieval system and real-time retrieval method for network service flow
CN112000389B (en) Configuration recommendation method, system, device and computer storage medium
Meng et al. Accelerating monte-carlo tree search on cpu-fpga heterogeneous platform
CN106250549B (en) A kind of Frequent Pattern Mining method memory-based
CN115203290A (en) Fault diagnosis method based on multi-dimensional prefix span algorithm
CN111813833B (en) Real-time two-degree communication relation data mining method
CN112685456A (en) User access data processing method and device and computer system
CN105868293A (en) Method for mining data stream frequent closed item set based on topology model
CN106598659A (en) Data file construction method, method and device for updating application program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant