CN117349697A

CN117349697A - Business process abnormality detection method, computer device, and readable storage medium

Info

Publication number: CN117349697A
Application number: CN202311251392.4A
Authority: CN
Inventors: 田银花; 杨立飞; 张睿哲; 韩咚; 牛晓琳; 庞孝文
Original assignee: Shandong University of Science and Technology
Current assignee: Shandong University of Science and Technology
Priority date: 2023-09-26
Filing date: 2023-09-26
Publication date: 2024-01-05

Abstract

The invention belongs to the technical field of business process anomaly detection, and discloses a business process anomaly detection method, a computer device and a readable storage medium. The method performs business process anomaly detection based on a BERT model with added relative position information and a convolutional neural network (CNN), and comprises the following steps: first, in the data preprocessing stage, a data set with anomaly labels is constructed from the tracks in a historical event log; then, in the feature vector representation stage, BERT with added relative position information is used to produce the feature vectors so as to make better use of context information; finally, in the anomaly detection stage, a convolutional neural network is used to build a classification model, so that business process anomalies are detected more accurately. Experimental results on real data sets show that the accuracy of business process anomaly prediction is greatly improved, which demonstrates the effectiveness of the method.

Description

Business process abnormality detection method, computer device, and readable storage medium

Technical Field

The present invention relates to a method for detecting abnormal business processes, a computer device and a readable storage medium.

Background

The continuous development of big data technology has greatly promoted the development of business process management. Enterprises need to conduct business process management more intelligently, efficiently and in a more orderly way in order to improve production efficiency and meet changing market demands. The development of big data not only provides a large and diverse amount of data, but also provides the ability to store and analyze such data, so that businesses can record, analyze, and better understand and manage their own business processes.

By utilizing big data analysis, enterprises can find out the association relationship hidden behind the data, find out potential clients, monitor and predict the change of the business process in real time, quickly find out abnormal conditions, take corresponding measures and ensure the stability of the business process to the greatest extent. The occurrence of abnormal events in a business process may negatively affect the efficiency, quality and profit of an enterprise, and thus implementing abnormal detection of the business process of an enterprise becomes critical.

By establishing an effective business process abnormality detection method, abnormal conditions in the operation process of enterprises can be monitored in real time, the enterprises can be helped to find problems early and solve potential problems, the enterprises can be helped to reduce losses, customer satisfaction can be improved, and competitiveness can be enhanced.

The anomaly detection comprises two methods of offline anomaly detection and online anomaly detection.

Offline anomaly detection refers to anomaly detection and identification on historical data that has already been collected; this approach typically uses techniques such as process mining to identify anomalies. Process mining plays an important role in business process management and is significant for research on business process anomaly detection: by establishing a model of the business process, abnormal activities and behaviors can be monitored and identified, so that potential problems can be found and handled in time. Most current anomaly detection technologies are offline, i.e. anomaly detection is performed on tracks that have already been executed, so they lack real-time capability and erroneous behavior cannot be observed immediately.

The online anomaly detection refers to anomaly detection and identification in a real-time data stream, and is mainly characterized by being capable of monitoring the data stream in real time and detecting anomaly immediately and rapidly, so that the anomaly can be immediately found and processed, and potential risks and losses of enterprises are reduced to the greatest extent. There are few models for online anomaly detection.

Business process anomaly detection methods can be broadly divided into three categories: an anomaly detection method based on a process model, an anomaly detection method based on machine learning, and an anomaly detection method based on deep learning.

The anomaly detection method based on a process model can detect control-flow anomalies, but it is difficult to apply because a high-quality process model must first be mined, and it is severely limited because alignment-based comparison consumes a great deal of time.

With the development of machine learning, business process anomaly detection methods based on machine learning techniques have been widely studied. With the continuous development of deep learning, however, deep learning methods have been applied to business process anomaly detection with better results; compared with machine learning methods, they improve both anomaly detection capability and detection quality.

Although deep learning methods suffer from long training times and place high demands on hardware and other resources when a large number of logs exist, model training is completed in the offline stage, so online detection efficiency is not affected.

Disclosure of Invention

The invention aims to provide a business process anomaly detection method, which combines a BERT model added with relative position information and a CNN prediction model to fully utilize the strong semantic representation capability of BERT and the sensitivity of the CNN prediction model to local characteristics, thereby obtaining higher accuracy in the business process anomaly detection. In addition, the invention can realize the online track abnormality detection of the business process, thereby being capable of identifying and rapidly detecting the abnormality in the business process in real time.

In order to achieve the above purpose, the invention adopts the following technical scheme:

a business process abnormality detection method comprises the following steps:

step 1, generating an abnormal track sequence according to tracks in a historical event log, and respectively labeling sequence abnormal conditions so as to construct a data set; dividing the data set into a training set for training a business process abnormality detection model in the following step 2 and a testing set for model testing;

step 2, constructing a business process anomaly detection model;

the business process anomaly detection model comprises a BERT model added with relative position information and a CNN prediction model;

the processing procedure of the business process abnormality detection model is as follows:

firstly, converting an input track sequence into a track feature vector containing context semantic information by using a BERT model added with relative position information so as to better utilize the context information;

then inputting the obtained track feature vector into a CNN prediction model to detect abnormal business flow;

step 3, training the business process anomaly detection model by using the training set data constructed in the step 1, and testing the trained business process anomaly detection model by using the testing set data;

step 4, using the trained business process anomaly detection model to perform online and offline anomaly detection of business processes.

In addition, on the basis of the business process abnormality detection method, the invention also provides computer equipment which comprises a memory and one or more processors. The memory stores executable codes, and the processor is used for realizing the steps of the business process abnormality detection method when executing the executable codes.

In addition, on the basis of the business process abnormality detection method, the invention also provides a computer readable storage medium on which a program is stored. The program, when executed by the processor, is configured to implement the steps of the business process anomaly detection method described above.

The invention has the following advantages:

as described above, the present invention describes a method for detecting abnormal business processes, which uses a BERT model added with relative position information to complete the representation of feature vectors on a preprocessed data set, fully considers the context association, captures the association relation between the front and the back, further extracts the feature vectors with more characterization capability through representation learning, and simultaneously uses a CNN convolutional neural network to realize the abnormal detection of the business processes. Meanwhile, the invention can realize the online track abnormality detection of the business process, thereby being capable of identifying and rapidly detecting the abnormality in real time and finding the potential risk in time, enabling the system to rapidly respond, reducing the potential risk and loss, improving the performance of the system and ensuring the safety of the system.

Drawings

FIG. 1 is a flow chart of business process anomaly detection based on representation learning in an embodiment of the invention.

Fig. 2 is a block diagram of a BERT network model employed in an embodiment of the present invention.

FIG. 3 is a diagram illustrating a self-attention mechanism after adding relative position codes in an embodiment of the present invention.

Fig. 4 is a network structure diagram of a convolutional neural network in an embodiment of the present invention.

Fig. 5 is a network structure diagram of the model based on the BERT model with added relative position information and the convolutional neural network.

FIG. 6 is a graph showing the comparison of accuracy of the method of the present invention with that of the conventional method over 4 data sets.

Detailed Description

Example 1

Embodiment 1 describes a business process anomaly detection method based on representation learning, which learns a high-dimensional representation of tracks and then performs business process anomaly detection with a classification algorithm.

For a better understanding of the present invention, the following theoretical knowledge introduction is given first.

In business process management, an event log is a log file that records a sequence of events that occur in an organization or system, consisting of execution traces of business processes. For ease of understanding, a formal definition of the relevant concepts in the event log is given below.

Definition (event): an event is the basic unit of a business process. Each event has an event ID, a timestamp and other related attributes, and is represented as $e = (attr_1, attr_2, \ldots, attr_m)$, where $attr_i$ denotes an attribute of the event.

Definition (track): a sequence consisting of a series of events is called a track, represented as $\sigma = \{e_1, e_2, e_3, \ldots, e_{|n|}\}$, where $e_1, e_2, e_3, \ldots, e_{|n|}$ are the events in the process instance and $|n|$ is the length of the track.

Definition (event log): an event log is the set of all tracks, represented as $L = \{\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_{|L|}\}$.

Definition (track prefix): a track prefix is the sub-sequence consisting of the first $l$ events from the beginning of the track, represented as $\sigma^{l} = \{e_1, e_2, e_3, \ldots, e_l\}$, where $1 \le l \le |n|$.
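The four definitions above can be mirrored in a few lines of Python. This is only an illustrative sketch: the class name `Event`, its field names, and the helper names `prefix` and `all_prefixes` are assumptions introduced here, and the integer timestamps are placeholders rather than real log values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """An event e = (attr_1, ..., attr_m); the field names are illustrative."""
    event_id: str
    activity: str
    timestamp: int  # placeholder ordering value, not a real timestamp


Trace = List[Event]     # sigma = {e_1, ..., e_|n|}
EventLog = List[Trace]  # L = {sigma_1, ..., sigma_|L|}


def prefix(trace: Trace, l: int) -> Trace:
    """Return sigma^l, the first l events of the track, for 1 <= l <= |n|."""
    if not 1 <= l <= len(trace):
        raise ValueError("prefix length must satisfy 1 <= l <= |n|")
    return trace[:l]


def all_prefixes(trace: Trace) -> List[Trace]:
    """Return every prefix of the track, from length 1 up to |n|."""
    return [prefix(trace, l) for l in range(1, len(trace) + 1)]


# Activity names taken from the BPIC2017 fragment discussed in the text.
names = ["O_Create Offer", "O_Created", "O_Sent (online only)", "O_Cancel"]
trace = [Event(f"e{i}", name, i) for i, name in enumerate(names, start=1)]
log: EventLog = [trace]
```

A track of length 4 therefore yields 4 prefixes, the last of which is the complete track.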

The Transformer has excellent performance in the field of natural language processing. Because sentences in natural language processing and tracks in business processes are both sequence data, using a Transformer to solve different tasks in business processes has significant advantages.

The BERT model, which is built on the Transformer encoder, can capture long-distance dependencies. On the basis of this framework, the invention therefore provides an anomaly detection method based on a BERT model with added relative position information and a convolutional neural network.

Representation learning extracts feature vectors by automatically learning patterns in data such as images or text, so that downstream tasks such as classification, anomaly detection and activity prediction can be completed better.

In business process anomaly detection, representation learning improves anomaly detection performance by learning a feature representation of the data; by learning the latent structure in the data, it improves the performance and generalization ability of the downstream prediction model.

In the representation learning stage, the BERT model with added relative position information is used to produce the feature vector representation of the preprocessed data set; in the anomaly detection stage, the CNN is used to detect business process anomalies.

BERT was proposed by Google in 2018. With excellent performance on 11 different natural language processing (NLP) tasks, it became one of the most influential language representation models.

Unlike traditional unidirectional language models, BERT is a pre-trained model based on a bidirectional Transformer architecture; it considers the context on the left and on the right at the same time, so that words and sentences can be represented more accurately and comprehensively.

BERT serves as a powerful pre-trained language model that provides high-quality, deep vector representations by way of pre-training and fine-tuning. The pre-training process of the BERT model includes two tasks: masked language modeling (Masked Language Model, MLM) and next sentence prediction (Next Sentence Prediction, NSP).

The MLM task randomly masks some of the input words and then lets the model predict the masked words from their context. This gives the model stronger semantic understanding and context awareness, and thus significant performance improvements in various natural language processing tasks.
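Applied to business processes, the masking step of the MLM task operates on activities instead of words. The following is a minimal sketch of that step only; the function name `mask_trace`, the `[MASK]` placeholder string and the default masking probability of 0.15 are assumptions made for illustration (the source does not state a masking rate), and the refinement in which some selected tokens are kept or replaced by random tokens is omitted.

```python
import random

MASK = "[MASK]"  # placeholder token, assumed for illustration


def mask_trace(activities, mask_prob=0.15, rng=None):
    """Randomly mask activities of one track.

    Returns the masked sequence and a dict mapping each masked position
    to the original activity the model would have to predict.
    """
    rng = rng or random.Random()
    masked, targets = [], {}
    for pos, act in enumerate(activities):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[pos] = act
        else:
            masked.append(act)
    return masked, targets


trace = ["O_Create Offer", "O_Created", "O_Sent (online only)", "O_Cancel"]
masked, targets = mask_trace(trace, mask_prob=0.5, rng=random.Random(0))
# Every position is either unchanged or masked with its original recorded.
restored = [targets.get(i, a) for i, a in enumerate(masked)]
```

Restoring the recorded targets always gives back the original track, which is the property the prediction objective relies on.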

Given two sentences in the input sequence, the NSP task predicts whether the two sentences are consecutive, which task helps learn the correlation between sentences.

The input of the BERT model mainly consists of three parts: token embedding, position embedding and segment embedding.

Token embedding converts each word into a vector of fixed dimension and is a common way of converting words into vectors.

Position embedding represents the position of each token in the input sequence and is a fixed vector representation.

Segment embedding is also a set of fixed vector representations, one for each segment, used to distinguish the tokens of different segments in the input text. It is used in downstream tasks in which two sentences must be input into the model at the same time, for example text pair tasks and question-answering systems.

A block diagram of the BERT network model is shown in Fig. 2. The BERT network model processes its input as follows: the input word embedding vectors pass through the multi-layer Transformer encoder units in BERT; in each layer, self-attention is computed on the input; finally, the output at the first position of the resulting vectors is taken as the input of the classification model.

For anomaly detection, the main goal is to learn a deep feature vector representation of the track, and only one track sequence is input at a time.

The present invention focuses on training the BERT model with the MLM task to help the model learn context information between activities. In the MLM task, the model input does not need segment embedding, so the input vector of the model is composed of two parts.

(1) Token embedding

Each activity of the input track sequence is converted into an embedding vector that represents the semantic information of the activity.

(2) Position embedding

A vector representation is assigned to the position of each token to represent the order of the input sequence.

The final input vector is obtained simply by adding the results of the two parts, and is used as the input of the BERT model, as shown in formula (1).

$$\text{Input Embedding} = \text{Token Embedding} + \text{Positional Embedding} \qquad (1)$$
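The element-wise sum in formula (1) can be sketched as follows. The lookup tables here are randomly initialised stand-ins for learned embeddings, and the vocabulary, the dimension of 8 and the maximum length of 16 are assumed values chosen only to keep the example small.

```python
import random


def make_table(rows, dim, rng):
    """Create a rows x dim table of small random values (stand-in for learned weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def input_embedding(token_ids, token_table, position_table):
    """Formula (1): add the token embedding and the position embedding per position."""
    return [
        [t + p for t, p in zip(token_table[tok], position_table[pos])]
        for pos, tok in enumerate(token_ids)
    ]


rng = random.Random(0)
vocab = {"O_Create Offer": 0, "O_Created": 1, "O_Sent (online only)": 2, "O_Cancel": 3}
dim, max_len = 8, 16  # assumed sizes
token_table = make_table(len(vocab), dim, rng)
position_table = make_table(max_len, dim, rng)

ids = [vocab[a] for a in ["O_Create Offer", "O_Created", "O_Cancel"]]
x = input_embedding(ids, token_table, position_table)  # 3 vectors of length 8
```

Because segment embedding is not used for single-track input, no third term appears in the sum.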

The Convolutional Neural Network (CNN) is a deep feedforward neural network, mainly composed of an input layer, a convolutional layer, a pooling layer, a full-connection layer and other components, and is characterized by learning input data through multi-layer stacking. CNNs are suitable for processing tasks with grid structure data. The method can extract high-level characteristic representation from input data through convolution, pooling, full connection and other operations, and provides an effective solution for tasks in the fields of image processing, natural language processing and the like.

In the field of image processing, images are two-dimensional data, convolution kernels in the images are generally square, and feature extraction is performed by sliding windows from left to right and from top to bottom by adopting two-dimensional convolution. The method is applied to a text sequence, the width of the convolution kernel is the same as that of the feature vector matrix, and the convolution kernel only performs feature extraction in a sliding window from top to bottom.

Based on the above theoretical basis, the present invention will be described in further detail with reference to the accompanying drawings and detailed description.

As shown in fig. 1, the method for detecting abnormal business process in this embodiment 1 includes the following steps:

Step 1, data preprocessing.

Generating an abnormal track sequence according to the tracks in the historical event log, and respectively labeling sequence abnormal conditions so as to construct a data set; and dividing the data set into a training set for training the business process abnormality detection model in the following step 2 and a testing set for model testing.

The business process abnormality detection method needs to realize detection of both offline abnormality and online abnormality of the business process, so the constructed data set comprises an offline abnormality detection data set and an online abnormality detection data set.

The two anomaly detection datasets are configured differently, as described in detail below.

The acquisition process of the offline anomaly detection dataset is as follows:

Tracks in the historical event log are randomly modified by deleting and adding activities to generate abnormal tracks; the manually generated abnormal tracks and the tracks in the original event log are then each labeled with their anomaly status, thereby constructing the offline anomaly detection data set.

The acquisition process of the online anomaly detection dataset is as follows:

The online anomaly detection data set is constructed using track prefixes: track prefix sequences of different lengths are extracted from the tracks obtained from the event log, abnormal tracks are added on this basis, and the manually added abnormal tracks and the track prefixes are each labeled with their anomaly status, thereby constructing the online anomaly detection data set.

In more detail, the acquisition process of the online anomaly detection dataset specifically includes:

step 1.1, connecting events in an event log in series according to an instance ID to form a sequence, so as to obtain a track;

step 1.2, dividing the obtained track into track prefixes according to different lengths to obtain prefix sequences with different lengths;

step 1.3, randomly adding abnormal activities to, or deleting activities from, the generated track prefix sequences, comparing each generated sequence with the tracks extracted from the event log and with the track prefix sequences, and, if the generated sequence does not already exist among them, adding it to the data set;

step 1.4, labeling the anomaly status of each track and track prefix in the data set, wherein an abnormal track is labeled 1 and any other track is labeled 0.
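Steps 1.1 to 1.4 can be sketched in Python as below. This is a hedged illustration rather than the patented procedure itself: the function names, the `(case_id, activity)` input format, the 50/50 choice between insertion and deletion, the single edit per sequence, and the `n_anomalies` and `seed` parameters are all assumptions, since the source does not specify them.

```python
import random
from collections import defaultdict


def build_traces(events):
    """Step 1.1: group (case_id, activity) pairs, assumed time-ordered, into tracks."""
    cases = defaultdict(list)
    for case_id, activity in events:
        cases[case_id].append(activity)
    return [tuple(acts) for acts in cases.values()]


def prefixes(trace):
    """Step 1.2: all prefixes of lengths 1..|n| (the last one is the full track)."""
    return [trace[:l] for l in range(1, len(trace) + 1)]


def perturb(seq, activities, rng):
    """Step 1.3: insert one random activity or delete one activity."""
    seq = list(seq)
    if len(seq) > 1 and rng.random() < 0.5:
        del seq[rng.randrange(len(seq))]
    else:
        seq.insert(rng.randrange(len(seq) + 1), rng.choice(activities))
    return tuple(seq)


def build_online_dataset(events, n_anomalies, seed=0):
    """Return a list of (sequence, label) pairs; label 1 = abnormal, 0 = normal."""
    rng = random.Random(seed)
    traces = build_traces(events)
    normal = {p for t in traces for p in prefixes(t)}
    pool = sorted(normal)
    if not pool:
        return []
    activities = sorted({a for t in traces for a in t})
    anomalies, attempts = set(), 0
    while len(anomalies) < n_anomalies and attempts < 100 * n_anomalies:
        attempts += 1
        candidate = perturb(rng.choice(pool), activities, rng)
        if candidate not in normal:  # keep only sequences absent from the log
            anomalies.add(candidate)
    # Step 1.4: label normal sequences 0 and generated anomalies 1.
    return [(s, 0) for s in pool] + [(s, 1) for s in sorted(anomalies)]


events = [
    ("Offer_247135719", "O_Create Offer"),
    ("Offer_247135719", "O_Created"),
    ("Offer_247135719", "O_Sent (online only)"),
    ("Offer_247135719", "O_Cancel"),
]
dataset = build_online_dataset(events, n_anomalies=3)
```

The membership check against `normal` implements the comparison in step 1.3: a perturbed sequence that happens to coincide with an existing track or prefix is discarded instead of being labeled abnormal.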

Table 1 is a sample log of the BPIC2017 event log; the event log fragment includes 4 activities, namely O_Create Offer, O_Created, O_Sent (online only) and O_Cancel.

Taking the event log as an example, a data preprocessing method for the event log is described.

Table 1 sample log of BPIC2017 event log

Firstly, the events in the event log are connected in series according to the instance ID to form a sequence, and a track is obtained.

For example, concatenating the activities of the instance numbered Offer_247135719 yields the track <O_Create Offer, O_Created, O_Sent (online only), O_Cancel>.

Secondly, track prefix extraction is an important step for realizing online anomaly detection in a business process. Dividing the obtained track into track prefixes according to different lengths to obtain prefix sequences with different lengths.

Then, anomalies are manually added to the data set as follows:

abnormal activities are randomly added to, or activities are randomly deleted from, the generated track prefix sequences; each generated sequence is compared with the tracks extracted from the event log and with the track prefix sequences, and if it does not already exist among them, it is added to the data set.

And finally, marking abnormal conditions of each track and track prefix in the data set, wherein the abnormal track is marked as 1, and otherwise, marking as 0.

An example of a preprocessing trace of the BPIC2017 event log is shown in table 2.

Table 2 pre-processing trace example of BPIC2017 event log

After the two abnormal detection data sets are obtained, the offline abnormal detection data set and the online abnormal detection data set are further divided into a training set and a testing set respectively and independently. Specifically:

the training set and the testing set divided by the offline anomaly detection data set and the training set and the testing set divided by the online anomaly detection data set are respectively and independently used for training and testing the business process anomaly detection model.

Step 2, constructing a business process anomaly detection model, whose architecture is shown in Fig. 5. The business process anomaly detection model comprises a BERT model with added relative position information and a CNN prediction model.

First, the input track sequence is converted into a track feature vector containing context semantic information using the BERT model with added relative position information, so as to make better use of context information.

Then, the obtained track feature vector is input into the CNN prediction model to detect abnormal business processes.

To make better use of context information, the Transformer self-attention mechanism in the BERT model is improved: absolute position coding is combined with relative position coding when attention is calculated.

The conventional Transformer self-attention mechanism represents the positions of different activities using absolute position coding. As a result, the attention weight of each position depends only on its correlation with the other activity positions, and the relative distance between positions cannot be distinguished when the attention weight is calculated. The attention weight is obtained by computing the dot product of the query vector and the key vector, scaling the result by dividing it by the square root of the key-vector dimension, and applying the softmax operation; finally, the attention representation is obtained by multiplying the weight by the value vector. The calculation is shown in formula (2).

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \qquad (2)$$

where $Q$, $K$ and $V$ represent the query, key and value vectors, respectively, and $d_k$ represents the dimension of the key vector.
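The standard scaled dot-product attention described above can be written out in plain Python for small matrices. The helper names `matmul`, `transpose`, `softmax`, `attention_weights` and `attention` are introduced here for illustration; matrices are lists of rows, with one row per position.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(Q, K):
    """softmax(Q K^T / sqrt(d_k)), one row of weights per query position."""
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))
    return [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]


def attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    return matmul(attention_weights(Q, K), V)


# A zero query scores every key equally, so the output is the mean of the value rows.
Q = [[0.0, 0.0]]
K = [[1.0, 2.0], [3.0, 4.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = attention(Q, K, V)
```

Nothing in this computation depends on how far apart two positions are, which is the limitation the relative position term is meant to address.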

In this embodiment, the BERT model added with the relative position information is used to convert the input track sequence into the track feature vector containing the context semantic information, and in order to capture the relative position information, the attention calculation formula is modified when the attention weight is calculated, and the absolute position code is combined with the relative position code to calculate the attention weight.

To better encode the position information between activities, a relative position code is added. After the relative position code is added, the key vector K and the relative position matrix R are summed and the dot product with the query vector Q is computed; the result is scaled and passed through the softmax operation to obtain the attention weight, and finally the weight is multiplied by the value vector V to obtain the attention representation. The calculation is shown in formula (3).

$$\mathrm{Attention}(Q, K, V, R) = \mathrm{softmax}\left(\frac{QK^{T} + QR^{T}}{\sqrt{d_k}}\right)V \qquad (3)$$

where $\mathrm{Attention}(Q, K, V, R)$ is the resulting attention representation.

Comparing equation (2) with equation (3), it can be seen that a relative position matrix is also added when calculating the attention weight, so that the attention weight can be adjusted according to the relative position, where R represents the relative position matrix.

In the embodiment, the BERT model adds relative position information when calculating the self-attention mechanism weight, so that the relation between different positions in the sequence can be more accurately captured, and the representation capability of the feature vector is enhanced.

The structure of the self-attention mechanism after adding the relative position code is shown in Fig. 3. As shown in Fig. 3, its inputs are a query vector Q, a key vector K, a value vector V and a relative position matrix R. First, the dot product of Q and K and the dot product of Q and R are calculated; second, the two dot-product results are added; then the sum is scaled by $\sqrt{d_k}$; next, the softmax operation is applied to obtain the attention weight; finally, the weight is multiplied by the value vector V to obtain the attention representation.
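The five steps just listed translate directly into code. In this sketch R is assumed to be a matrix with the same shape as K, so that the product of Q with the transpose of R has the same shape as the product of Q with the transpose of K; the source does not say how R is built, so both this shape and the example values are assumptions. The helper names are likewise illustrative.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def relative_attention(Q, K, V, R):
    """softmax((Q K^T + Q R^T) / sqrt(d_k)) V, following the steps of Fig. 3."""
    d_k = len(K[0])
    qk = matmul(Q, transpose(K))  # step 1a: dot product of Q and K
    qr = matmul(Q, transpose(R))  # step 1b: dot product of Q and R
    summed = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(qk, qr)]  # step 2
    scaled = [[s / math.sqrt(d_k) for s in row] for row in summed]        # step 3
    weights = [softmax(row) for row in scaled]                            # step 4
    return matmul(weights, V)                                             # step 5


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 2.0], [0.5, -1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
R = [[0.1, 0.2], [0.3, 0.4]]  # made-up relative position values
out = relative_attention(Q, K, V, R)
```

Adding the two dot products is equivalent to taking the dot product of Q with the sum of K and R, which matches the textual description of summing K and R before the dot product.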

In this embodiment, the feature vector representation is trained using the BERT model added with the relative position information, and the BERT model added with the relative position information has various advantages over other feature vector representation methods.

1. Compared with the TF-IDF (term frequency-inverse document frequency) method, which does not distinguish well between the activities in a track, the BERT model in this embodiment captures context information better.

2. Compared with Word2Vec, the BERT model in this embodiment can output not only word-level vector representations but also a vector representation of the whole track sequence directly, whereas Word2Vec only provides word-level representations, so the vector representation of a whole track has to be obtained by weighted averaging or concatenation.

The convolutional layer is one of the cores of the CNN convolutional neural network, and functions to extract features of input data. Taking the feature vector output by the BERT model as the input of a convolutional neural network, and extracting the local features of an input layer sequence through a convolutional layer; the pooling layer performs aggregation operation on the sequence data to convert the sequence information into vector representation with fixed length, and reduces the size of output characteristics of the convolution layer; finally, the feature vectors are input to a full connection layer, and softmax is used for anomaly detection classification.

The network structure of the CNN convolutional neural network is shown in fig. 4, and the specific processing procedure is as follows.

At the convolutional layer, local features are extracted using convolution kernels of sizes 2, 3 and 4, whose width equals the dimension of the track vector, with 128 kernels of each size. Each convolution corresponds to one feature extraction, and kernels of different sizes are defined so that different local features are extracted. At the pooling layer, max pooling is used to reduce the size of the feature vector, reduce model complexity and retain the most salient information of each feature. Before the fully connected layer, a dropout operation randomly sets neuron outputs to zero with a certain probability (the dropout rate), discarding ineffective feature data and reducing overfitting. The softmax activation function is used at the fully connected layer to obtain the final track anomaly detection result.
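The forward pass of such a network can be sketched without any deep learning library. Kernel sizes 2, 3 and 4 and 128 kernels per size come from the text; everything else is an assumption made for illustration: the weights are random and untrained, ReLU is used as the convolution activation, inputs shorter than the largest kernel are zero-padded, there are two output classes, and dropout is left out because it is only active during training.

```python
import math
import random

KERNEL_SIZES = (2, 3, 4)  # from the text
NUM_FILTERS = 128         # kernels per size, from the text


def init_params(dim, n_classes=2, seed=0):
    """Random, untrained parameters; a real model would learn these."""
    rng = random.Random(seed)
    convs = {
        k: [[[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(k)]
            for _ in range(NUM_FILTERS)]
        for k in KERNEL_SIZES
    }
    fc = [[rng.uniform(-0.1, 0.1) for _ in range(NUM_FILTERS * len(KERNEL_SIZES))]
          for _ in range(n_classes)]
    return convs, fc


def conv_max_pool(x, kernel):
    """Slide a full-width kernel top to bottom, apply ReLU, keep the maximum."""
    k = len(kernel)
    best = 0.0  # ReLU outputs are never negative
    for start in range(len(x) - k + 1):
        s = sum(w * v
                for w_row, x_row in zip(kernel, x[start:start + k])
                for w, v in zip(w_row, x_row))
        best = max(best, s)
    return best


def forward(x, convs, fc):
    """x: list of per-position feature vectors. Returns class probabilities."""
    dim = len(x[0])
    pad = max(KERNEL_SIZES) - len(x)
    x = list(x) + [[0.0] * dim for _ in range(pad)]  # no-op when pad <= 0
    pooled = [conv_max_pool(x, kern) for k in KERNEL_SIZES for kern in convs[k]]
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in fc]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


convs, fc = init_params(dim=8)
features = [[0.1 * (i + j) for j in range(8)] for i in range(5)]  # stand-in for BERT output
probs = forward(features, convs, fc)
```

Because every kernel spans the full vector width, each one slides only along the sequence axis, and max pooling turns a variable-length track into a fixed vector of 384 values before classification.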

The definition of the softmax activation function is shown in formula (4).

$$p_i = \mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \qquad (4)$$

where $p_i = \mathrm{softmax}(z_i)$ denotes the output probability calculated for class $i$, and $z$ is the output vector, whose dimension is $K$. The class with the maximum softmax value is taken as the predicted anomaly class of the sequence.

The hyper-parametric configuration of the CNN prediction model is shown in table 3.

Table 3 CNN model hyper-parameter configuration

The convolution neural network model is used for abnormality detection, and compared with other methods, when the convolution neural network processes the track sequence, the convolution operation can extract local features containing abnormal conditions, and different local features are obtained through convolution kernels with different sizes.

In addition, compared with other deep learning methods, the CNN model uses a parameter sharing mechanism, so that the parameter quantity of the model is greatly reduced, and the parallel computation of convolution operation enables CNN to have higher computation efficiency when processing long sequences.

Therefore, the method for integrating the BERT model added with the relative position information and the CNN can fully utilize the strong semantic representation capability of BERT and the sensitivity of CNN to local characteristics, thereby obtaining better effect in abnormal detection of the business flow.

Step 3, training the business process anomaly detection model using the training set data constructed in step 1, and testing the trained business process anomaly detection model using the test set data.

The constructed business process anomaly detection model is trained separately on the training set split from the offline anomaly detection data set and on the training set split from the online anomaly detection data set, yielding two groups of optimized network parameters.

The two groups of optimized network parameters are defined as the offline anomaly detection network parameters and the online anomaly detection network parameters, respectively.

The model with the offline anomaly detection network parameters is tested on the test set split from the offline anomaly detection data set to check its training effect.

Likewise, the model with the online anomaly detection network parameters is tested on the test set split from the online anomaly detection data set to check its training effect.

The invention uses the business process anomaly detection model with the offline or the online anomaly detection network parameters to perform offline or online anomaly detection of the business process, respectively.

Specifically, the offline anomaly detection process of the business process is as follows:

First, a feature vector representation is generated for a track in the event log. This representation is then input into the business process anomaly detection model with the offline anomaly detection network parameters, and the anomaly status of the track is predicted to obtain the anomaly detection result.

Specifically, the online anomaly detection process of the business process is as follows:

First, a feature vector representation is generated for the online track that is currently executing. This representation is then input into the business process anomaly detection model with the online anomaly detection network parameters, and the anomaly status of the current track is predicted to obtain the anomaly detection result for the executing online track.

The method thus supports anomaly detection for both offline logs and online tracks.
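The two modes differ mainly in what is fed to the model: a complete track offline, a prefix of a running track online. The sketch below shows one way labeled samples of both kinds might be derived from an event log, loosely following the construction in step 1 (random deletion and insertion of activities, and track prefixes of different lengths). The function names, the single mutation per track, the 50/50 choice between deletion and insertion, and the 0/1 label encoding are all assumptions for illustration, not the exact procedure of the invention.

```python
import random
from typing import List, Tuple

Track = List[str]


def inject_anomaly(track: Track, activities: List[str], rng: random.Random) -> Track:
    """Return a copy of the track with one activity deleted or inserted at random."""
    mutated = list(track)
    if len(mutated) > 1 and rng.random() < 0.5:
        del mutated[rng.randrange(len(mutated))]
    else:
        mutated.insert(rng.randrange(len(mutated) + 1), rng.choice(activities))
    return mutated


def prefixes(track: Track) -> List[Track]:
    """All prefixes of the track, from length 1 up to the full track."""
    return [track[:n] for n in range(1, len(track) + 1)]


def build_samples(log: List[Track], online: bool, seed: int = 0) -> List[Tuple[Track, int]]:
    """Pair every track (or prefix, when online) with one mutated copy.

    Label 0 marks an original track, label 1 an artificially mutated one.
    """
    rng = random.Random(seed)
    activities = sorted({activity for track in log for activity in track})
    samples: List[Tuple[Track, int]] = []
    for track in log:
        units = prefixes(track) if online else [track]
        for unit in units:
            samples.append((unit, 0))
            samples.append((inject_anomaly(unit, activities, rng), 1))
    return samples


# Hypothetical three-activity log used only to exercise the functions.
log = [["A", "B", "C"]]
offline_samples = build_samples(log, online=False)
online_samples = build_samples(log, online=True)
```

A random insertion can occasionally produce a sequence that the real process would also allow, so a practical pipeline would likely filter mutated tracks against the observed normal variants.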

In addition, to verify the performance of the invention for online anomaly detection, experiments were performed on 4 public data sets. The data sets and experimental settings are described below, and the ablation experiment and the comparison experiment are analyzed using accuracy as the evaluation index.

The 4 public data sets are as follows: the Help Desk data set describes the ticketing business process of a company in Italy; BPIC_2017 describes the loan application process of a Dutch financial institution; BPIC_2020 describes events related to two years of travel claims; and Sepsis Cases describes sepsis case events in a hospital. Table 4 gives the relevant statistics of the event logs.

Table 4 Analysis of relevant data of the event logs

The experiments were run on an Intel(R) Core(TM) i5-12500H 3.10 GHz processor and a GeForce RTX 3060 GPU, with Python 3.8, PyTorch 1.12.1 and CUDA 11.6.0. The parameter settings of the model are shown in Table 5. When BERT is used for feature vector training, the number of epochs is set to 30, the learning rate to 0.001 and the batch size to 16, and the dimension of the model's output vector is the maximum track length. For classification with the convolutional neural network, the number of epochs is set to 20, the batch size to 32, the learning rate to 0.001, the dropout to 0.1 and the number of convolution kernels to 128.

Table 5 Model parameter configuration
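For reference, the settings stated in the paragraph above can be collected into a configuration fragment such as the following. Only the values come from the text; the dictionary names and key names are hypothetical, and any parameters that appear only in Table 5 are not reproduced.

```python
# Values taken from the experimental description; names are illustrative only.
BERT_FEATURE_CONFIG = {
    "epochs": 30,
    "learning_rate": 0.001,
    "batch_size": 16,
    # The output vector dimension equals the maximum track length of the log,
    # so it depends on the data set and is not fixed here.
}

CNN_CLASSIFIER_CONFIG = {
    "epochs": 20,
    "batch_size": 32,
    "learning_rate": 0.001,
    "dropout": 0.1,
    "num_kernels": 128,
}
```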

The experiments use accuracy as the evaluation index for comparing anomaly detection performance; accuracy is the proportion of samples that the classification model classifies correctly. In anomaly detection classification, if the data set contains m abnormal tracks and n normal tracks, and the model correctly predicts x of the m abnormal tracks and y of the n normal tracks, the accuracy is given by equation (5).

$$\mathrm{Accuracy} = \frac{x + y}{m + n} \tag{5}$$
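As a worked example of equation (5), the short function below computes the accuracy from the four counts. The numbers in the call are made up for illustration and are not experimental results from this document.

```python
def accuracy(x: int, y: int, m: int, n: int) -> float:
    """Equation (5): correctly classified tracks divided by all tracks.

    x: abnormal tracks predicted correctly (out of m abnormal tracks)
    y: normal tracks predicted correctly (out of n normal tracks)
    """
    if m + n == 0:
        raise ValueError("the data set must contain at least one track")
    if not (0 <= x <= m and 0 <= y <= n):
        raise ValueError("correct counts cannot exceed the class sizes")
    return (x + y) / (m + n)


# Illustrative counts: 45 of 50 abnormal and 135 of 150 normal tracks correct.
acc = accuracy(x=45, y=135, m=50, n=150)  # (45 + 135) / (50 + 150)
```

Because accuracy pools both classes, it can look high on a data set dominated by normal tracks even when few abnormal tracks are caught, which is worth keeping in mind when m and n differ greatly.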

For feature vector representation, the invention designs an ablation experiment that compares BERT with added relative position information against the traditional BERT model. The calculation of the self-attention mechanism is modified by adding the relative position matrix, and the effect of this modification on the model is evaluated. Relative position information is added to the traditional BERT model while the hyperparameters and all other model parameters are kept the same; the downstream prediction task uses a convolutional neural network with the same parameters, and the same training set and test set are used. The accuracy of the two algorithms is compared in Table 6.

Table 6 Comparison of accuracy of the two algorithms

In terms of average accuracy, the BERT model with added relative position information outperforms the traditional BERT model.

Because existing detection methods differ in how they add abnormal tracks, the invention selects general deep learning models as baselines for anomaly detection. Using the above 4 event log data sets, the invention is compared with the general classification models Word2Vec + CNN, Transformer, and BERT; the accuracy on each data set is shown in Fig. 6.

As can be seen from the data in Fig. 6, the method of the invention has a higher average accuracy than the other methods, and its advantage is more pronounced on the Sepsis Cases and BPIC_2020 data sets.

Inspection of the Sepsis Cases and BPIC_2020 data sets shows that most track sequences in them are long, and the model itself handles relatively long sequences well, so it achieves good results on these data sets.

Experiments on 4 real event logs show that, compared with the traditional Word2Vec + convolutional neural network, Transformer, and BERT classification algorithms, the method improves accuracy, reaching a maximum of 90.21%. The method can therefore detect business process anomalies more accurately.

Example 2

Embodiment 2 describes a computer device for implementing the method for detecting an abnormality in a business process described in embodiment 1.

In particular, the computer device includes a memory and one or more processors. Executable code is stored in the memory, which when executed by the processor is used to implement the steps of the business process anomaly detection method described above.

In this embodiment, the computer device may be any device or apparatus having data processing capability and is not described further here.

Example 3

Embodiment 3 describes a computer-readable storage medium for implementing the business process anomaly detection method described in embodiment 1 above.

Specifically, the computer-readable storage medium in this embodiment 3 has stored thereon a program for implementing the steps of the above-described business process abnormality detection method when executed by a processor.

The computer-readable storage medium may be an internal storage unit of any device or apparatus having data processing capability, such as a hard disk or a memory, or it may be an external storage device of such a device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash memory card (Flash Card) provided on the device.

The foregoing description is, of course, merely illustrative of preferred embodiments of the present invention, and it should be understood that the present invention is not limited to the above-described embodiments, but is intended to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.

Claims

1. The business process abnormality detection method is characterized by comprising the following steps:

step 2, constructing a business process anomaly detection model;

then, inputting the obtained track feature vector into a CNN prediction model to detect abnormal business flow;

2. The method for detecting a business process anomaly of claim 1, wherein,

in the step 2, an input track sequence is converted into a track feature vector containing context semantic information by using a BERT model with added relative position information; in order to capture the relative position information, the attention calculation formula is modified so that, when the BERT model calculates the attention weight, the absolute position code is combined with the relative position code;

in order to better encode the position information between activities, a relative position code is added: the key vector K and the relative position matrix R are added together, the dot product of the result with the query vector Q is calculated, scaling and a softmax operation are applied to the result to obtain the attention weight, and the weight is multiplied by the value vector V to obtain the attention representation;

the calculation process is as follows:

wherein Attention(Q, K, V, R) is the resulting attention representation;

R represents the relative position matrix, Q, K and V represent the query, key and value vectors, respectively, and d_k represents the dimension of the key vector.
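Read literally, the operations named in this claim (add R to K, take the dot product with Q, scale, apply softmax, multiply by V) correspond to the expression below. It is a reconstruction from the wording above rather than a copy of the original formula, and it assumes that R has the same shape as K and that scaling divides by the square root of d_k, as in standard scaled dot-product attention.

$$
\mathrm{Attention}(Q, K, V, R) = \mathrm{softmax}\!\left(\frac{Q\,(K + R)^{\top}}{\sqrt{d_k}}\right) V
$$

Setting R to zero recovers the standard attention of the traditional BERT model, which is the baseline of the ablation comparison.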

3. The method for detecting a business process anomaly of claim 1, wherein,

in the step 2, the CNN prediction model has the following processing flow:

taking the track feature vector extracted by the BERT model as the input of the CNN prediction model input layer, then sequentially carrying out multi-layer convolution and pooling, and finally realizing task classification by means of a full-connection layer; different local features are acquired by using convolution kernels with different sizes at a convolution layer, feature dimensions are reduced by using maximum pooling at a pooling layer, and anomaly detection classification is realized by using softmax at a full connection layer.

4. The method for detecting a business process anomaly of claim 1, wherein,

the data set constructed in the step 1 comprises an offline abnormality detection data set and an online abnormality detection data set;

the acquisition process of the offline anomaly detection data set is as follows:

activities are randomly deleted from and added to the tracks of the historical event log to generate abnormal tracks, and the artificially added abnormal tracks and the tracks in the original event log are each labeled with their anomaly status, thereby constructing the offline anomaly detection data set;

the acquisition process of the online anomaly detection dataset is as follows:

the online anomaly detection data set is constructed from track prefixes: track prefix sequences of different lengths are extracted from the tracks obtained from the event log, abnormal tracks are added on this basis, and the artificially added abnormal tracks and the track prefixes are each labeled with their anomaly status, thereby constructing the online anomaly detection data set.

5. The method for detecting a business process anomaly of claim 4, wherein,

in the step 1, the acquiring process of the online anomaly detection data set specifically includes:

6. The method for detecting a business process anomaly of claim 4, wherein,

in the step 3, training the built business process anomaly detection model by using the training set divided in the offline anomaly detection data set and the training set divided in the online anomaly detection data set respectively to obtain two groups of optimized network parameters;

defining two groups of optimized network parameters as off-line abnormality detection network parameters and on-line abnormality detection network parameters respectively;

testing the business process anomaly detection model with the offline anomaly detection network parameters by utilizing a test set divided in the offline anomaly detection data set so as to test the training effect of the model;

and simultaneously, testing the business process anomaly detection model with the online anomaly detection network parameters by utilizing the test set divided in the online anomaly detection data set so as to test the training effect of the model.

7. The method for detecting a business process anomaly of claim 6, wherein,

in the step 4, the offline anomaly detection and the online anomaly detection of the business process are respectively realized by using the business process anomaly detection model with the offline anomaly detection network parameters or the online anomaly detection network parameters.

8. The business process anomaly detection method of claim 7, wherein,

the offline anomaly detection process of the business process is as follows:

firstly, aiming at a track in an event log, generating a feature vector representation of the track, then inputting the feature vector representation into a business process anomaly detection model with offline anomaly detection network parameters, and predicting the anomaly condition of the track to obtain an anomaly detection result;

the online abnormality detection process of the business process is as follows:

9. A computer device comprising a memory and one or more processors, the memory having executable code stored therein, wherein the processor, when executing the executable code, implements the steps of the business process anomaly detection method according to any one of claims 1 to 8.

10. A computer-readable storage medium having a program stored thereon, which when executed by a processor, implements the steps of the business process anomaly detection method according to any one of claims 1 to 8.