CN117314140A - RPA process mining method and device based on event relation extraction - Google Patents
RPA process mining method and device based on event relation extraction
- Publication number
- CN117314140A (application CN202310997294.9A)
- Authority
- CN
- China
- Prior art keywords
- new sentence
- relation
- sentence
- event
- new
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0633—Workflow analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/103—Workflow collaboration or project management
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses an RPA process mining method based on event relation extraction. Sentences of a user behavior log are represented as real-valued vector sequences and input into a pre-trained BERT model to obtain the context semantic information of each sentence. The context semantic information is then passed in turn through a Bi-LSTM model, a pooling layer, a self-attention layer and a fully connected layer with a softmax function, yielding the hidden-state sequence of each sentence, the relation vector representation sequence, the sentence relation vector representations integrating the context information, and the probabilities that each sentence belongs to the different event relation labels. The event relation label with the highest probability is selected as the classification result of the sentence's event relation type, a corresponding footprint matrix is generated according to the classification results, a Petri net is constructed, and finally the event log is converted into a corresponding process mining model. The technical scheme of the invention can improve the accuracy of event relation extraction, quickly generate a process mining model and reduce labor cost.
Description
Technical Field
The invention relates to the technical field of process mining, in particular to an RPA process mining method and device based on event relation extraction.
Background
RPA (Robotic Process Automation) is a technology in which software automatically processes business according to specified rules and flows, assisting or replacing manual work. In short, RPA uses a software robot to perform the operations of a process automatically; its input is generally structured data, that is, data in a standard table structure. RPA software completes computer operations according to a preset flow, replacing or assisting people in regular, well-defined repetitive labor, acting as a digital workforce. Although conventional information systems are deployed in many enterprises, a great deal of repetitive work remains on these systems. This work requires manpower to process, which reduces the efficiency of the systems, and manual handling of such transactions easily introduces errors, further reducing the effectiveness of these information systems. With the advent of robotic process automation, RPA technology can be used to raise the automation level of enterprise information systems, reduce error rates, improve working efficiency and improve compliance. According to statistics, in data maintenance work such as SAP (Systems, Applications and Products in Data Processing), time cost can be reduced by 70%, saving a company's human resources. When various processes in an OA (Office Automation) system are handled with RPA and similar technologies, working efficiency can be improved by 30%. At the same time, RPA can run around the clock, so that process execution meets the specified requirements and is error-free. RPA technology can improve the digital capability of enterprises and further improve their working efficiency. The RPA industry is currently pursuing more intelligent and innovative RPA, using cognitive computing and embedded intelligence to automate decision making in processing. A higher level of intelligence in such systems means stronger technical and logical capability, enabling a high level of process automation and creating value for stakeholders.
RPA can be applied to repetitive, rule-based workflows in almost all industries; it improves data accuracy, optimizes the financial organization structure, improves enterprise competitiveness and brings great value to enterprises. First, RPA technology improves efficiency and frees up human resources. By replacing manual operation, RPA robots can relieve staff of a large amount of repetitive work, so that human resources are released and invested in more valuable areas. Second, RPA is easy to modify and can quickly adapt to environmental changes. RPA can not only complete a large amount of work accurately in a short time, but also adapt better to changes in the environment, for example technology upgrades or adjustments to business requirements. For human beings, training to learn and accept new things is a relatively complex and long-term process, whereas RPA can adapt more easily and quickly by modifying the process or introducing a new process in its programming. Finally, RPA technology can also be used for risk control and to strengthen compliance management. RPA leaves traceable records and has particular advantages in compliance: the robot can be guaranteed not to deviate from the predetermined trajectory to perform other tasks, so a complete audit trail can be provided.
Business intelligence software has been widely used in the operational optimization and management of modern enterprises. It processes an enterprise's historical and current data and provides functions such as data collection, mining and visualization to help the enterprise quickly find actionable insights that support strategic decision making. However, these tools generally assume that the enterprise's processes are already known, and they only perform data-oriented analysis operations (e.g., classification, clustering, association analysis), lacking support for explicit processes. Process mining emerged as a new research field to meet this need.
A process is a series of operations taken to achieve a particular goal. Process mining aims to build a bridge between traditional model-driven methods (such as business process modeling and model correctness verification) and newer data-driven methods (such as data mining and machine learning). Process mining is based on sequentially recorded events, where each event refers to an activity and is associated with a specific business case. After the additional information in the event log is summarized, intuitive information in the form of a "flow chart" of the actual process is produced, and process performance and process compliance are presented to business leaders to assist decision making. Process mining is generally divided into three steps: process discovery, conformance checking and process enhancement. Process discovery aims to create a model based on the event log without using any prior information. The model is in most cases a process model, but other models are also possible, such as a model of interactions between participants. Conformance checking aims to check the compliance between a process model and an event log; it mainly focuses on comparing event logs and process models, including conformance checking between an old log and a new model, an old log and an old model, a new log and a new model, and a new log and an old model. Conformance checking compares the real process execution with preset standard path rules and finds the differences between the real execution and the ideal preset standard. Process enhancement extends or improves the existing process by means of the knowledge and information obtained from the event log of the actual process records.
Process mining has strong compatibility and no specific industry attributes, so its application range is very wide. Taking finance and insurance, where process mining is currently most widely used, as an example, the business processes involved, e.g., loan processing, claim management and insurance application, are highly structured, and all events during process execution are systematically and securely recorded. Process mining can therefore be applied well in these scenarios to shorten process time and improve efficiency and compliance. In addition, process mining can be used to improve processes in different organizations. For example, in call centers in the telecommunications industry, an increase in the number of repeated calls from a customer usually indicates a problem with agent service quality, because the customer's problem was not resolved in the first call. If, to save cost in the early stage, the enterprise does not train its agents sufficiently, or instructs agents to shorten call time as much as possible, customers will call back again and again, prolonging the time needed to resolve the problem. In the long run this places great pressure on the enterprise's operating cost. By mining the call service log, the first-call resolution rate can be analyzed, customer experience and call center efficiency can be improved, and the operating cost of the call center can be reduced. In general, process mining can greatly improve work efficiency and has been widely used in fields such as finance, insurance and telecommunications.
Process mining is a technology for extracting useful information from workflow logs. At present, the traditional RPA process mining task uses the alpha algorithm to process event logs: the user behavior log is first turned into an event log by event extraction techniques or manual segmentation, and the alpha algorithm is then applied to the event log to obtain the relations between events. Manually segmenting the log is very time-consuming and labor-intensive when facing a large number of complicated repetitive operations.
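For reference, the footprint relations that the alpha algorithm derives from an event log can be computed directly from the log's directly-follows pairs. The sketch below is a minimal Python illustration using invented trace data; it is not the patent's method, which replaces this manual or event-extraction step with the relation classifier described later.

```python
# Minimal sketch of the alpha-algorithm footprint relations (illustrative traces only).
traces = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]

activities = sorted({act for trace in traces for act in trace})

# Directly-follows relation: x > y iff y appears immediately after x in some trace.
follows = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}

def footprint(x, y):
    """Footprint cell for the ordered pair (x, y): ->, <-, || or #."""
    if (x, y) in follows and (y, x) not in follows:
        return "->"   # causal
    if (x, y) not in follows and (y, x) in follows:
        return "<-"   # reverse causal
    if (x, y) in follows and (y, x) in follows:
        return "||"   # parallel
    return "#"        # independent / choice

matrix = {(x, y): footprint(x, y) for x in activities for y in activities}
print(matrix[("a", "b")], matrix[("b", "c")], matrix[("a", "d")])  # -> || #
```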
Disclosure of Invention
In view of this, the invention provides an RPA process mining method based on event relation extraction, comprising the following steps:
S1, acquiring a user behavior log, wherein the user behavior log consists of sentences;
S2, combining the sentences in the user behavior log in pairs to form M new sentences W = (W_1, W_2, …, W_M), and representing each new sentence as a real-valued vector sequence;
S3, inputting each real-valued vector sequence into a pre-trained BERT model for learning to obtain the context semantic information of each new sentence;
S4, inputting the context semantic information of each new sentence into a Bi-LSTM model to obtain the hidden-state sequence of each new sentence;
S5, inputting the hidden-state sequence of each new sentence into a pooling layer to obtain the relation vector representation of each new sentence, wherein the relation vector representations of all new sentences form the relation vector representation sequence R = (r_1, r_2, …, r_M);
S6, inputting the relation vector representations of all new sentences into a self-attention layer to obtain the new sentence relation vector representations integrating the context information;
S7, inputting the new sentence relation vector representations integrating the context information into a fully connected layer, obtaining the probability that each new sentence belongs to the different event relation labels using a softmax function, and selecting the event relation label with the highest probability as the classification result of the event relation type of the new sentence;
S8, generating a corresponding footprint matrix according to the classification result of the event relation type of each new sentence, constructing a Petri net, and finally converting the event log into a corresponding process mining model.
The invention also comprises an RPA process mining device based on event relation extraction, which comprises:
a processor;
a memory having stored thereon a computer program executable on the processor;
wherein the computer program, when executed by the processor, implements the above RPA process mining method based on event relation extraction.
The technical scheme provided by the invention has the beneficial effects that:
the method comprises the steps of obtaining a user behavior log, expressing sentences of the user behavior log as a real value vector sequence, and inputting the real value vector sequence into a pre-trained BERT model for learning to obtain context semantic information of each sentence; inputting the context semantic information of each sentence into a Bi-LSTM model to obtain a hidden state sequence of each sentence length; inputting the hidden state sequence into a pooling layer to obtain a relation vector representation sequence representation; inputting the relation vector representation sequence representation into the self-attention layer to obtain sentence relation vector representation of the integrated context information; inputting sentence relation vector representation of comprehensive context information into a full-connection layer, obtaining probability that each sentence belongs to different event relation labels by using a softmax function, selecting the event relation label with the highest probability as a classification result of the sentence event relation type, generating a corresponding footprint matrix according to the classification result of each sentence event relation type, constructing a petri network, and finally converting an event log into a corresponding flow mining model. The technical scheme of the invention uses an event relation extraction model to automatically mine event relations in a user behavior log, and provides a method for constructing the event relation extraction model according to the relation among events to reflect the structural relation among text events on the basis of event relation mining, which can:
(1) Express more clearly the semantic information of the text and the importance of the relations between events. The complex steps of the alpha algorithm can be omitted and a process mining model can be generated more quickly, improving the efficiency of the process mining task.
(2) Improve the accuracy of event relation extraction: existing methods generally use only the information of the current sentence to identify the relations of event pairs and do not consider context information; the information in a single sentence is limited and not sufficient to identify event relations accurately. The invention learns the relations between a sentence and the other sentences in its context through the context enhancement module, so the accuracy of event relation extraction can be improved.
(3) Reduce cost: the method improves the traditional process mining approach by replacing manual segmentation of the user behavior log with event relation extraction, which reduces labor and time cost to a certain extent and avoids errors caused by manually segmenting the log.
Drawings
FIG. 1 is a flow chart of an RPA flow mining method based on event relation extraction of the present invention;
FIG. 2 is a diagram of a Petri net constructed based on event relations in accordance with an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
Term interpretation in the embodiments of the present invention:
Event relation extraction
In general, an event does not occur in isolation; its occurrence and development are often correlated with other events, and these associations may be explicit or implicit. Event relations, as a linguistic form of describing objective facts in words, appear frequently in media such as news stories, comments and blogs. Extracting the events that are associated with each other is called event relation extraction. The field of event relation extraction mainly considers four kinds of relations between events: co-reference, temporal, causal and sub-event relations. A co-reference relation indicates that two event mentions describe the same target event; a temporal relation indicates the chronological order of two events; a causal relation refers to the action relation between events, i.e., one event is the result of another event; a sub-event relation indicates that one event is a sub-event of another event.
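As a quick illustration, the four relation types can be represented as a small label set; the example pair and its label below are invented, not taken from the patent.

```python
from enum import Enum

class EventRelation(Enum):
    COREFERENCE = "co-reference"  # two mentions describe the same target event
    TEMPORAL = "temporal"         # one event happens before or after the other
    CAUSAL = "causal"             # one event is the result of the other
    SUBEVENT = "sub-event"        # one event is a part of the other

# An invented labelled example: the second event is the result of the first.
pair = ("The user clicked the submit button.", "The form was sent to the server.")
label = EventRelation.CAUSAL
print(pair, label.value)
```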
Bi-LSTM model
Bi-LSTM consists of a forward LSTM and a backward LSTM. Modeling a sentence with a single LSTM has a problem: information from back to front cannot be encoded. Bi-LSTM can better capture bidirectional semantic dependencies.
The key to LSTM is the cell state, which is used to save the state information of the current cell and pass it on to the next cell. LSTM designs three control gates with different functions: an input gate, an output gate and a forget gate. The function of the three gates is to control the retention and transmission of signals in the model. The cell state and hidden state are updated as follows:

c_t = i_t ⊙ d_t + f_t ⊙ c_{t-1}

h_t = o_t ⊙ tanh(c_t)

where t is the time step, i_t is the input gate, f_t is the forget gate, o_t is the output gate, d_t is the temporary (candidate) state, c_t is the memory cell state (its initial value is obtained by random initialization), h_t is the hidden state, x_t is the input at the current time step, σ is the activation function, and W_L and b_L are network parameters.
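The gate equations themselves are not reproduced in the text above. In the standard LSTM formulation they take the form below; the exact parameterization (for example, whether W_L and b_L are shared across gates or gate-specific) is an assumption here, not something stated in the patent.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i\,[h_{t-1};\,x_t] + b_i\right), &\quad
f_t &= \sigma\left(W_f\,[h_{t-1};\,x_t] + b_f\right),\\
o_t &= \sigma\left(W_o\,[h_{t-1};\,x_t] + b_o\right), &\quad
d_t &= \tanh\left(W_d\,[h_{t-1};\,x_t] + b_d\right),\\
c_t &= i_t \odot d_t + f_t \odot c_{t-1}, &\quad
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
```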
The invention provides an RPA process mining method based on event relation extraction. Referring to FIG. 1, FIG. 1 is a flow chart of the RPA process mining method based on event relation extraction, which comprises the following steps:
S1, acquiring a user behavior log, wherein the user behavior log consists of sentences.
A user behavior recording tool is used to record the operations performed by the user in the business process, generating a raw user behavior log. The raw user behavior log is then processed, and each piece of operation behavior information is converted into a sentence, so that the processed user behavior log consists of a number of sentences.
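As an illustration of this preprocessing step, the sketch below turns raw operation records into sentences; the record fields and wording are invented assumptions, since the patent does not specify the log format.

```python
# Hypothetical raw operation records; the field names are assumptions, not from the patent.
raw_log = [
    {"time": "2023-08-07 09:01:12", "user": "u01", "action": "open", "target": "invoice form"},
    {"time": "2023-08-07 09:01:45", "user": "u01", "action": "fill", "target": "amount field"},
    {"time": "2023-08-07 09:02:03", "user": "u01", "action": "click", "target": "submit button"},
]

def record_to_sentence(rec: dict) -> str:
    """Convert one operation behavior record into a sentence."""
    return f'At {rec["time"]}, user {rec["user"]} performed "{rec["action"]}" on the {rec["target"]}.'

user_behavior_log = [record_to_sentence(r) for r in raw_log]
print(user_behavior_log[0])
```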
S2, combining the sentences in the user behavior log in pairs to form M new sentences W = (W_1, W_2, …, W_M). For example, if there are N sentences in the user behavior log, combining them in pairs generates N(N-1)/2 new sentences. Each new sentence is represented as a real-valued vector sequence.
First, a new sentence is input and each word in it is mapped: the word vector of each word is obtained by looking it up in a pre-trained word vector table. The sentence is then represented as a real-valued vector sequence W_m = (w_{m1}, w_{m2}, …, w_{mn}), where W_m denotes the mth new sentence, w_{mi} denotes the ith word vector of the mth new sentence, and n denotes the length of the mth new sentence.
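A minimal sketch of the pairwise combination and word-vector lookup, with invented sentences and a random placeholder table standing in for the pre-trained word vectors:

```python
from itertools import combinations
import numpy as np

# Invented processed user behavior log (N = 3 sentences).
sentences = [
    "user opens the invoice form",
    "user fills the amount field",
    "user clicks the submit button",
]

# Pairwise combination: N sentences yield M = N*(N-1)/2 new sentences W_1..W_M.
new_sentences = [" ".join(pair) for pair in combinations(sentences, 2)]
assert len(new_sentences) == len(sentences) * (len(sentences) - 1) // 2

# Placeholder word-vector table; a real system would load pre-trained vectors here.
dim = 8
vocab = {tok for s in new_sentences for tok in s.split()}
word_vectors = {tok: np.random.randn(dim) for tok in vocab}

def to_vector_sequence(sentence: str) -> np.ndarray:
    """Represent a new sentence W_m as its real-valued vector sequence (w_m1, ..., w_mn)."""
    return np.stack([word_vectors[tok] for tok in sentence.split()])

vector_sequences = [to_vector_sequence(s) for s in new_sentences]
print(vector_sequences[0].shape)  # (n, dim) for the first new sentence
```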
S3, inputting each real value vector sequence into the pre-trained BERT model for learning, and obtaining the context semantic information of each new sentence.
The BERT model adopts the Transformer encoder block structure as a feature extractor to obtain deeper and richer context semantic information V_m = (v_{m1}, v_{m2}, …, v_{mn}) for each new sentence:

v_{mi} = Transformer(w_{mi})   (1)

where V_m denotes the context semantic information of the mth new sentence, v_{mi} denotes the context semantic information of the ith word vector of the mth new sentence, and w_{mi} denotes the ith word vector of the mth new sentence.
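In practice the per-token context vectors can be obtained with a pre-trained BERT encoder, for example via the Hugging Face transformers library. The model name and the choice to tokenize the sentence text directly are assumptions; the patent only states that a pre-trained BERT model is used.

```python
import torch
from transformers import BertModel, BertTokenizer

# Model choice is an assumption; the patent only specifies "a pre-trained BERT model".
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()

new_sentence = "user opens the invoice form user fills the amount field"  # one combined sentence W_m
inputs = tokenizer(new_sentence, return_tensors="pt")

with torch.no_grad():
    outputs = bert(**inputs)

V_m = outputs.last_hidden_state  # shape (1, n, 768): context semantic information (v_m1, ..., v_mn)
print(V_m.shape)
```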
S4, inputting the context semantic information of each new sentence into the Bi-LSTM model to obtain a hidden state sequence of each new sentence length.
H_m denotes the hidden-state sequence of the mth new sentence, H_m = (h_{m1}, h_{m2}, …, h_{mn}), where h_{mi} denotes the hidden state of the ith word of the mth new sentence:

h_{mi} = h_{mi}^{→} | h_{mi}^{←}   (2)

where V_m = (v_{m1}, v_{m2}, …, v_{mn}) denotes the context semantic information of the mth new sentence, v_{mi} denotes the context semantic information of the ith word vector of the mth new sentence, | denotes the concatenation operation, h_{mi}^{→} denotes the encoding result of the forward LSTM, h_{mi}^{←} denotes the encoding result of the backward LSTM, and h_{mi} is obtained by concatenating h_{mi}^{→} and h_{mi}^{←}.
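A corresponding Bi-LSTM sketch in PyTorch; the hidden size and the input tensor are placeholders, since the patent does not fix the dimensions.

```python
import torch
import torch.nn as nn

hidden_size = 128  # assumption; the patent does not specify the hidden dimension
bilstm = nn.LSTM(input_size=768, hidden_size=hidden_size,
                 batch_first=True, bidirectional=True)

V_m = torch.randn(1, 12, 768)   # placeholder for the BERT output of one new sentence (1, n, 768)

H_m, _ = bilstm(V_m)            # shape (1, n, 2*hidden_size)
# Each position i holds h_mi, the concatenation of the forward and backward LSTM states for word i.
print(H_m.shape)
```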
S5, inputting the hidden-state sequence of each new sentence into a pooling layer to obtain the relation vector representation of each new sentence, wherein the relation vector representations of all new sentences form the relation vector representation sequence R = (r_1, r_2, …, r_M).
The hidden-state sequence of each new sentence is input into the pooling layer, and the relation vector representation of each new sentence is obtained as:

r_m = AvgPool(H_m)   (3)

where r_m denotes the relation vector representation of the mth new sentence, H_m denotes the hidden-state sequence of the mth new sentence, and AvgPool is average pooling over the sentence hidden states.
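Formula (3) amounts to a mean over the word dimension; stacking the pooled vectors of all M new sentences yields R. A minimal PyTorch sketch with placeholder tensors:

```python
import torch

M, n, d = 6, 12, 256   # placeholder sizes: M new sentences, n words each, hidden width d
hidden_sequences = [torch.randn(n, d) for _ in range(M)]   # stands in for H_1..H_M

# r_m = AvgPool(H_m): average pooling over the word dimension.
R = torch.stack([H.mean(dim=0) for H in hidden_sequences])  # relation vector sequence, shape (M, d)
print(R.shape)
```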
S6, inputting the relation vector representation of all new sentences into the self-attention layer to obtain new sentence relation vector representation of the integrated context information.
The calculation formulas are as follows:

score = exp(tanh(R W_1 + b_1) W_2 + b_2)   (4)

H′_m = norm(score) * R   (5)

where score denotes the attention score, tanh denotes the tanh activation function, W_1 and W_2 are different weight matrices, b_1 and b_2 are different bias terms, R denotes the relation vector representation sequence, norm is the L2 regularization operation, and H′_m denotes the new sentence relation vector representation integrating the context information.
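Formulas (4) and (5) can be written directly in PyTorch; the weight shapes below are assumptions chosen only to make the dimensions consistent.

```python
import torch
import torch.nn.functional as F

M, d, d_a = 6, 256, 64           # placeholder dimensions
R = torch.randn(M, d)            # relation vector representation sequence from the pooling layer

W1, b1 = torch.randn(d, d_a), torch.zeros(d_a)
W2, b2 = torch.randn(d_a, 1), torch.zeros(1)

score = torch.exp(torch.tanh(R @ W1 + b1) @ W2 + b2)   # formula (4), shape (M, 1)
H_prime = F.normalize(score, p=2, dim=0) * R           # formula (5): L2-normalised scores weight R
print(H_prime.shape)                                   # (M, d)
```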
S7, inputting new sentence relation vector representation of the comprehensive context information into the full-connection layer, obtaining the probability that each new sentence belongs to different event relation labels by using a softmax function, and selecting the event relation label with the highest probability as a classification result of the event relation type of the new sentence.
The new sentence relation vector representations integrating the context information are input into the fully connected layer, and the probability that each new sentence belongs to the different event relation labels is obtained using the softmax function:

Z_m = softmax([H_m, H′_m] W_z + b_z)

where H_m denotes the hidden-state sequence of the mth new sentence, H′_m denotes the new sentence relation vector representation integrating the context information, [H_m, H′_m] W_z denotes concatenating H_m and H′_m and multiplying the result by W_z, W_z is a weight matrix, and b_z is a bias term.
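A sketch of the classification layer follows. Treating H_m here as a fixed-size per-sentence summary vector (rather than the full hidden-state sequence) is an assumption made so that the concatenation has a well-defined shape, and the label set size is likewise illustrative.

```python
import torch
import torch.nn.functional as F

M, d = 6, 256
num_labels = 4                   # illustrative label set size (e.g. the four relation types above)

H = torch.randn(M, d)            # per-sentence hidden-state summaries (shape assumption)
H_prime = torch.randn(M, d)      # context-integrated relation vectors from the self-attention layer

W_z = torch.randn(2 * d, num_labels)
b_z = torch.zeros(num_labels)

Z = F.softmax(torch.cat([H, H_prime], dim=1) @ W_z + b_z, dim=1)  # probability per relation label
predicted = Z.argmax(dim=1)      # highest-probability label = event relation type of each new sentence
print(predicted)
```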
S8, generating a corresponding footprint matrix according to the classification result of the event relation type of each new sentence and constructing a Petri net. Referring to FIG. 2, FIG. 2 is a diagram of a Petri net constructed based on event relations, where x, y and z denote different events: (x→y) indicates that x and y are in a causal relation; (x→y, x→z, y||z) indicates that x and y and x and z are causal while y and z are parallel; (x→y, x→z, y#z) indicates that x and y and x and z are causal while y and z are independent; and (x→z, y→z, y#x) indicates that x and z and y and z are causal while y and x are independent. Process mining is carried out according to the directly-follows relations between events to construct the Petri net, and finally the event log is converted into the corresponding process mining model.
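To make the last step concrete, the sketch below builds a footprint matrix from a few classified relations and derives a very small Petri-net skeleton from the causal cells. The relation labels are invented, and real Petri-net construction merges places over maximal activity sets, so this is a simplified illustration rather than the full construction.

```python
# Invented classification results for three events, using the footprint notation of Fig. 2:
# "->" causal, "||" parallel, "#" independent.
relations = {("x", "y"): "->", ("x", "z"): "->", ("y", "z"): "||"}

events = sorted({e for pair in relations for e in pair})

# Footprint matrix: every ordered pair gets a cell; unknown pairs default to "#".
footprint = {(a, b): "#" for a in events for b in events}
for (a, b), rel in relations.items():
    footprint[(a, b)] = rel
    footprint[(b, a)] = {"->": "<-", "||": "||", "#": "#"}[rel]

# Simplified Petri-net skeleton: one transition per event, one place and two arcs per causal pair.
transitions = set(events)
places = {f"p_{a}{b}" for (a, b), rel in footprint.items() if rel == "->"}
arcs = {(a, f"p_{a}{b}") for (a, b), rel in footprint.items() if rel == "->"} | \
       {(f"p_{a}{b}", b) for (a, b), rel in footprint.items() if rel == "->"}

print(sorted(transitions), sorted(places))  # ['x', 'y', 'z'] ['p_xy', 'p_xz']
```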
The invention also provides an RPA process mining device based on event relation extraction, which comprises:
a processor;
a memory having stored thereon a computer program executable on the processor;
wherein the computer program, when executed by the processor, implements the above RPA process mining method based on event relation extraction.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (8)
1. The RPA process mining method based on event relation extraction is characterized by comprising the following steps:
S1, acquiring a user behavior log, wherein the user behavior log consists of sentences;
S2, combining the sentences in the user behavior log in pairs to form M new sentences W = (W_1, W_2, …, W_M), and representing each new sentence as a real-valued vector sequence;
S3, inputting each real-valued vector sequence into a pre-trained BERT model for learning to obtain the context semantic information of each new sentence;
S4, inputting the context semantic information of each new sentence into a Bi-LSTM model to obtain the hidden-state sequence of each new sentence;
S5, inputting the hidden-state sequence of each new sentence into a pooling layer to obtain the relation vector representation of each new sentence, wherein the relation vector representations of all new sentences form the relation vector representation sequence R = (r_1, r_2, …, r_M);
S6, inputting the relation vector representations of all new sentences into a self-attention layer to obtain the new sentence relation vector representations integrating the context information;
S7, inputting the new sentence relation vector representations integrating the context information into a fully connected layer, obtaining the probability that each new sentence belongs to the different event relation labels using a softmax function, and selecting the event relation label with the highest probability as the classification result of the event relation type of the new sentence;
S8, generating a corresponding footprint matrix according to the classification result of the event relation type of each new sentence, constructing a Petri net, and finally converting the event log into a corresponding process mining model.
2. The RPA process mining method based on event relation extraction according to claim 1, wherein in step S2, representing each new sentence as a real-valued vector sequence specifically comprises:
inputting each new sentence into the order relation coding module, mapping each word in the input new sentence, obtaining the word vector of each word by looking it up in a pre-trained word vector table, and representing each new sentence as a real-valued vector sequence W_m = (w_{m1}, w_{m2}, …, w_{mn}), where W_m denotes the mth new sentence, w_{mi} denotes the ith word vector of the mth new sentence, and n denotes the length of the mth new sentence.
3. The RPA process mining method based on event relation extraction according to claim 1, wherein in step S3, the BERT model uses the Transformer encoder block structure as a feature extractor to obtain the context semantic information V_m = (v_{m1}, v_{m2}, …, v_{mn}) of each new sentence:

v_{mi} = Transformer(w_{mi})   (1)

where V_m denotes the context semantic information of the mth new sentence, v_{mi} denotes the context semantic information of the ith word vector of the mth new sentence, and w_{mi} denotes the ith word vector of the mth new sentence.
4. The RPA process mining method based on event relation extraction according to claim 1, wherein step S4 specifically comprises:
H_m denotes the hidden-state sequence of the mth new sentence, H_m = (h_{m1}, h_{m2}, …, h_{mn}), where h_{mi} denotes the hidden state of the ith word of the mth new sentence:

h_{mi} = h_{mi}^{→} | h_{mi}^{←}   (2)

where V_m = (v_{m1}, v_{m2}, …, v_{mn}) denotes the context semantic information of the mth new sentence, v_{mi} denotes the context semantic information of the ith word vector of the mth new sentence, | denotes the concatenation operation, h_{mi}^{→} denotes the encoding result of the forward LSTM, h_{mi}^{←} denotes the encoding result of the backward LSTM, and h_{mi} is obtained by concatenating h_{mi}^{→} and h_{mi}^{←}.
5. The RPA process mining method based on event relation extraction according to claim 1, wherein in step S5, the hidden-state sequence of each new sentence is input into the pooling layer, and the relation vector representation of each new sentence is obtained as:

r_m = AvgPool(H_m)   (3)

where r_m denotes the relation vector representation of the mth new sentence, H_m denotes the hidden-state sequence of the mth new sentence, and AvgPool is average pooling over the sentence hidden states.
6. The RPA process mining method based on event relation extraction according to claim 1, wherein the calculation formulas in step S6 are as follows:

score = exp(tanh(R W_1 + b_1) W_2 + b_2)   (4)

H′_m = norm(score) * R   (5)

where score denotes the attention score, tanh denotes the tanh activation function, W_1 and W_2 are different weight matrices, b_1 and b_2 are different bias terms, R denotes the relation vector representation sequence, norm is the L2 regularization operation, and H′_m denotes the new sentence relation vector representation integrating the context information.
7. The RPA process mining method based on event relation extraction according to claim 1, wherein in step S7, the new sentence relation vector representation integrating the context information and the sentence hidden state are concatenated and input into the fully connected layer, and the probability that each new sentence belongs to the different event relation labels is obtained using the softmax function:

Z_m = softmax([H_m, H′_m] W_z + b_z)

where H_m denotes the hidden-state sequence of the mth new sentence, H′_m denotes the new sentence relation vector representation integrating the context information, [H_m, H′_m] denotes concatenating H_m and H′_m, W_z is a weight matrix, and b_z is a bias term.
8. An RPA process mining device based on event relation extraction, the device comprising:
a processor;
a memory having stored thereon a computer program executable on the processor;
wherein the computer program, when executed by the processor, implements the RPA process mining method based on event relation extraction as claimed in any one of claims 1 to 7.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310997294.9A (CN117314140A) | 2023-08-07 | 2023-08-07 | RPA process mining method and device based on event relation extraction |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310997294.9A (CN117314140A) | 2023-08-07 | 2023-08-07 | RPA process mining method and device based on event relation extraction |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN117314140A (en) | 2023-12-29 |
Family
ID=89287354

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310997294.9A | CN117314140A (en), Pending | 2023-08-07 | 2023-08-07 |

Country Status (1)

| Country | Link |
|---|---|
| CN (1) | CN117314140A (en) |
Cited By (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117493220A (en) | 2024-01-03 | 2024-02-02 | | RPA flow operation abnormity detection method, device and storage device |
| CN117493220B (en) | 2024-01-03 | 2024-03-26 | | RPA flow operation abnormity detection method, device and storage device |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |