WO2024036974A1 - Method for extracting repeated operations, electronic device, and storage medium - Google Patents

Method for extracting repeated operations, electronic device, and storage medium

Info

Publication number
WO2024036974A1
Authority
WO
WIPO (PCT)
Prior art keywords
operations
abstract
subsequence
occurrence
frequency
Prior art date
Application number
PCT/CN2023/084305
Other languages
English (en)
French (fr)
Inventor
黄博
张泉
周元剑
周健
Original Assignee
北京弘玑信息技术有限公司
上海弘玑信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京弘玑信息技术有限公司, 上海弘玑信息技术有限公司
Publication of WO2024036974A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment; monitoring of user actions

Definitions

  • the present application relates to the field of data mining technology, and in particular to a repetitive operation extraction method, electronic equipment, and computer-readable storage media.
  • RPA: Robotic Process Automation
  • Embodiments of the present application provide a method for extracting repeated operations, an electronic device, and a storage medium, to reduce the workload and cost of manually mining repeated operations and to improve efficiency.
  • the embodiment of the present application provides a method for extracting repeated operations, including:
  • special operations and common operations are screened out based on the concrete operations of each step, including:
  • the concrete operation is determined to be a special operation
  • converting the special operations and common operations into abstract operations includes:
  • the common operations are stored with application names and fixed window names to obtain abstract operations corresponding to the common operations.
  • before extracting repeated abstract operation combinations from the abstract operation sequence, the method further includes:
  • the abstract operation sequence is filtered to remove abstract operations whose frequency of occurrence satisfies the first preset condition in the abstract operation sequence.
  • filtering the abstract operation sequence to remove abstract operations whose frequency of occurrence meets the first preset condition in the abstract operation sequence includes:
  • extracting repeated abstract operation combinations from the abstract operation sequence includes:
  • according to the information lookup table, select a target subsequence that satisfies the second preset condition from the subsequence list each time, and extend the target subsequence forward and backward to obtain an extended subsequence;
  • selecting a target subsequence that satisfies the second preset condition from the subsequence list each time according to the information lookup table includes:
  • the target subsequence is extended forward and backward to obtain an extended subsequence, including:
  • selecting an extended subsequence that satisfies the third preset condition to be added to the subsequence list includes:
  • the extended subsequence with the highest frequency of occurrence is selected from the filtered extended subsequences and added to the subsequence list.
  • the embodiment of the present application provides a repetitive operation extraction device, which includes:
  • the record acquisition module is used to obtain work operation records.
  • the work operation records include the concrete operations of each step and operating time;
  • the operation screening module is used to filter out special operations and ordinary operations based on the concrete operations of each step;
  • An operation abstraction module used to convert the special operations and ordinary operations into abstract operations, and establish a mapping relationship between the concrete operations and the abstract operations;
  • the operation sequencing module is used to arrange all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence
  • a repetition extraction module is used to extract repeated abstract operation combinations from the abstract operation sequence, and obtain the concrete operation combination and operation time corresponding to the abstract operation combination.
  • An embodiment of the present application also provides an electronic device, where the electronic device includes:
  • Memory used to store instructions executable by the processor
  • the processor is configured to perform the extraction method of the above repeated operations.
  • Embodiments of the present application also provide a computer-readable storage medium, the storage medium stores a computer program, and the computer program can be executed by a processor to complete the extraction method of the above repeated operations.
  • Embodiments of the present application also provide a computer software product.
  • the computer software product is stored in a computer-readable storage medium.
  • the computer software product includes instructions. When the instructions are executed by a processor, the above method for extracting repeated operations can be realized.
  • the solution provided by the above embodiments of the present application obtains work operation records and screens out special operations and common operations according to the concrete operations of each step in the work operation records; converts the special operations and common operations into abstract operations and establishes the mapping relationship between concrete operations and abstract operations; arranges the abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence; and then extracts repeated abstract operation combinations from the abstract operation sequence to obtain the concrete operation combinations and operation times corresponding to the abstract operation combinations. Since mining repeated operations requires no manual participation, the workload and cost of manual mining are reduced and efficiency is improved.
  • Figure 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Figure 2 is a schematic flow chart of a method for extracting repeated operations according to an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of filtering the abstract operation sequence provided by an embodiment of the present application.
  • FIG. 4 is a detailed flow chart of step S250 in the embodiment corresponding to Figure 2;
  • FIG. 5 is a block diagram of an extraction device for repeated operations according to an embodiment of the present application.
  • Figure 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 100 can be used to perform the repeated operation extraction method provided by the embodiment of the present application.
  • the electronic device 100 includes: one or more processors 102 and one or more memories 104 that store instructions executable by the processor.
  • the processor 102 is configured to execute the repeated operation extraction method provided in the following embodiments of the present application.
  • the processor 102 may be a gateway, a smart terminal, or a device including a central processing unit (CPU), a graphics processing unit (GPU), or other forms of processing units with data processing capabilities and/or instruction execution capabilities. It can process data of other components in the electronic device 100 and can also control other components in the electronic device 100 to perform desired functions.
  • CPU: central processing unit
  • GPU: graphics processing unit
  • the memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache).
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may execute the program instructions to implement the extraction method of repeated operations described below.
  • Various application programs and various data such as various data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
  • the electronic device 100 shown in FIG. 1 may further include an input device 106, an output device 108, and a data acquisition device 110. These components are interconnected through a bus system 112 and/or other forms of connection mechanisms (not shown). It should be noted that the components and structures of the electronic device 100 shown in FIG. 1 are only exemplary and not restrictive. The electronic device 100 may also have other components and structures as needed.
  • the input device 106 may be a device used by the user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
  • the output device 108 may output various information (eg, images or sounds) to the outside (eg, a user), and may include one or more of a display, a speaker, and the like.
  • the data acquisition device 110 can acquire images of objects and store the acquired images in the memory 104 for use by other components.
  • the data acquisition device 110 may be a camera.
  • each component in the example electronic device 100 used to implement the repeated operation extraction method of the embodiment of the present application can be integrated or separate; for example, the processor 102, the memory 104, the input device 106, and the output device 108 are integrated into one body, while the data acquisition device 110 is installed separately.
  • the example electronic device 100 used to implement the repeated operation extraction method of the embodiment of the present application can be implemented as a smart terminal such as a smartphone, a tablet computer, a desktop computer, a server, a vehicle-mounted device, or the like.
  • FIG. 2 is a schematic flow chart of a method for extracting repeated operations according to an embodiment of the present application. As shown in Figure 2, the method includes the following steps S210 to S250.
  • Step S210 Obtain the work operation record, which includes the concrete operation and operation time of each step.
  • For example, a patent worker's routine of "checking emails, downloading the patent .docx, opening, modifying, closing, saving, and sending" is a repetitive operation task.
  • a patent worker's daily work includes many operations, among which are repetitive operation tasks such as the one above.
  • Work operation records refer to the data stream composed of the user's operations in a day or a period of time. For example, they can be obtained by recording the user's operations in a day with an RPA recorder.
  • the RPA recorder can parse each step of the user's operations, parse out mouse click operations and special keyboard key operations (such as the ctrl key and enter key), and also parse out the currently operated application, window, element, and instruction, and record them together with the operation time of each operation.
  • the items that the RPA recorder can parse include:
  • Window: a certain window of the application, such as "Task Mining Patent.docx".
  • Element: an element in a certain window of the application, such as the title of an article, or each option that appears after right-clicking the mouse in Word.
  • Element content: the specific content of the element. For example, if the element is the title of an article, the element content can be "a method of mining repeated transactions"; if the element is one of the options that appear after right-clicking the mouse in Word, the element content can be "cut", "copy", etc.
  • the work operation record can include the concrete operation and operation time of each step.
  • the concrete operation is the specific operation, which can include the application name, window name, element content, instructions and other information.
  • Operation time refers to the specific time when each operation occurs.
  • Step S220 Screen out special operations and common operations based on the concrete operations of each step.
  • Special operations are copy, paste, and save operations. If the element content or the instruction included in the concrete operation is any of copy, paste, or save, the concrete operation is determined to be a special operation. If the instruction is "ctrl+c", it is a copy operation; if the instruction is "ctrl+v", it is a paste operation. If the element content is "copy", it is a copy operation. If the element content is "paste", "match target format", "retain original format", or similar, it is a paste operation. If the element content is "save" or similar, it is a save operation.
  • If the concrete operation does not include an application name, or the included application name or window name is a specified name, the concrete operation is determined to be an ignorable operation.
  • If the application name is not included, the current operation can be considered not to act on any application (it may also be a parsing error of the RPA recorder), so this concrete operation is considered an ignorable operation.
  • If the application name included in the concrete operation is a specified name (such as Resource Manager), the user may be switching applications, and this concrete operation is also considered an ignorable operation.
  • If the window name included in the concrete operation is a specified name containing strings such as "new tab page" or "new tab", the concrete operation is also considered an ignorable operation.
  • Concrete operations other than special operations and ignorable operations are common operations. One can first determine whether the concrete operation is a special operation; if not, determine whether it is an ignorable operation; if not, it is a common operation. If necessary, the order can be reversed: first determine whether it is an ignorable operation; if not, determine whether it is a special operation; if not, it is a common operation.
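  • The screening in step S220 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `ConcreteOperation` fields, the keyword tables, and the function names are assumptions built only from the examples given in the text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ConcreteOperation:
    """One recorded step; the field names are illustrative assumptions."""
    app_name: str
    window_name: str
    element_content: str
    instruction: str
    time: datetime


# Keyword tables taken from the examples in the text; real tables would be longer.
SPECIAL_BY_INSTRUCTION = {"ctrl+c": "copy", "ctrl+v": "paste"}
SPECIAL_BY_CONTENT = {
    "copy": "copy",
    "paste": "paste",
    "match target format": "paste",
    "retain original format": "paste",
    "save": "save",
}
IGNORED_APP_NAMES = {"Resource Manager"}
IGNORED_WINDOW_MARKERS = ("new tab page", "new tab")


def special_name(op):
    """Return 'copy', 'paste' or 'save' for a special operation, else None."""
    return (SPECIAL_BY_INSTRUCTION.get(op.instruction.lower())
            or SPECIAL_BY_CONTENT.get(op.element_content.lower()))


def classify(op):
    """Classify a concrete operation as 'special', 'ignorable' or 'common'."""
    if special_name(op) is not None:
        return "special"
    if not op.app_name or op.app_name in IGNORED_APP_NAMES:
        return "ignorable"
    window = op.window_name.lower()
    if any(marker in window for marker in IGNORED_WINDOW_MARKERS):
        return "ignorable"
    return "common"
```

  • The sketch checks for special operations first and ignorable operations second; as the text notes, the two checks can also be applied in the opposite order.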
  • Step S230 Convert the special operations and common operations into abstract operations, and establish a mapping relationship between the concrete operations and the abstract operations.
  • operation abstraction may be: storing special operations as operation names to obtain abstract operations corresponding to the special operations.
  • the common operations are stored with application names and fixed window names to obtain abstract operations corresponding to the common operations.
  • the operation name refers to the operation name of a special operation, such as copy, paste or save. Therefore, special operations under different applications and windows can use the operation name of the special operation as an abstract operation.
  • the application name + fixed window name of the common operation can be used as the abstract operation corresponding to the common operation.
  • the fixed window name is relative to the variable window name, and the fixed window name can be extracted from the window names included in ordinary operations. For example, in the window name "Bing Search-Patent", "Bing Search" is a fixed window name, and "Patent” is a variable window name.
  • a keyword extraction algorithm trained in advance can be used to extract the fixed window name from the window name, or a rule matching algorithm can be used. For example, the first word in the window name is the fixed window name.
  • a mapping table can be used to store the mapping relationship between concrete operations and abstract operations, so that it is known which abstract operation each concrete operation (special or common) corresponds to after abstraction.
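  • A minimal sketch of the abstraction in step S230, under stated assumptions: the fixed window name is taken as the part of the window name before the first "-" (one possible matching rule, chosen to fit the "Bing Search-Patent" example), the abstract operation of a common operation is written as "application name+fixed window name", and the tuple layout and function names are hypothetical.

```python
def fixed_window_name(window_name):
    """Rule-based assumption: the part before the first '-' is the fixed name."""
    return window_name.split("-", 1)[0].strip()


def abstract_operation(kind, app_name, window_name, operation_name=None):
    """Return the abstract operation for a 'special' or 'common' concrete operation."""
    if kind == "special":
        # Special operations abstract to their operation name only,
        # regardless of the application and window they occur in.
        return operation_name
    return f"{app_name}+{fixed_window_name(window_name)}"


def build_mapping(operations):
    """Build the mapping table: concrete operation index -> abstract operation.

    `operations` holds (kind, app_name, window_name, operation_name) tuples;
    ignorable operations are skipped and get no abstract operation.
    """
    mapping = {}
    for index, (kind, app, window, name) in enumerate(operations):
        if kind != "ignorable":
            mapping[index] = abstract_operation(kind, app, window, name)
    return mapping
```

  • With this rule, searches for different keywords in the same search window map to the same abstract operation, and a copy in a browser and a copy in a document both map to "copy".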
  • Step S240 Arrange all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence.
  • the abstract operation sequence is obtained by arranging all abstract operations in the order of operation time.
  • the operation time of the abstract operation is the operation time of the corresponding concrete operation in the mapping table.
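  • Step S240 amounts to a sort keyed on the operation time stored with each concrete operation. The sketch below assumes the mapping table and the operation times are both dictionaries keyed by a concrete-operation id; these names and shapes are illustrative only.

```python
from datetime import datetime


def abstract_sequence(mapping, operation_times):
    """Order abstract operations by the time of their concrete operation.

    mapping: concrete-operation id -> abstract operation
    operation_times: concrete-operation id -> datetime of that operation
    Returns a list of (time, concrete id, abstract operation) triples.
    """
    ordered_ids = sorted(mapping, key=lambda op_id: operation_times[op_id])
    return [(operation_times[i], i, mapping[i]) for i in ordered_ids]
```

  • Keeping the concrete id and time next to each abstract operation makes it possible to look up the concrete operations again once repeated combinations have been found.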
  • Step S250 Extract repeated abstract operation combinations from the abstract operation sequence, and obtain the concrete operation combinations and operation times corresponding to the abstract operation combinations.
  • An abstract operation combination refers to a sequence composed of several abstract operations extracted from an abstract operation sequence. In order to distinguish it, it is called an abstract operation combination.
  • a repeated abstract operation combination means that the abstract operation combination appears more than once.
  • For example, if the abstract operation sequence is abcfgfgabcfgabc, the repeated abstract operation combinations are abc and fg.
  • the combination of concrete operations refers to the sequence of concrete operations corresponding to the abstract operations.
  • the operation time of the abstract operation combination is the operation time of the corresponding concrete operation. From this, it is possible to determine when the repeated operation occurs, when it ends and the number of occurrences.
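  • To illustrate how the number of occurrences and the start of each occurrence follow from the sequence, the helper below (a hypothetical name, not from the application) lists the 1-based time positions at which a combination starts.

```python
def occurrences(sequence, combination):
    """Return the 1-based start positions of `combination` in `sequence`."""
    width = len(combination)
    return [start + 1
            for start in range(len(sequence) - width + 1)
            if sequence[start:start + width] == combination]


# Example from the text: each single letter stands for one abstract operation.
abc_positions = occurrences("abcfgfgabcfgabc", "abc")
fg_positions = occurrences("abcfgfgabcfgabc", "fg")
```

  • Each start position, together with the length of the combination, identifies the concrete operations involved, and their operation times give the start and end time of each repetition.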
  • the method provided by the embodiment of the present application further includes: filtering the abstract operation sequence, and removing abstract operations whose frequency of occurrence satisfies the first preset condition in the abstract operation sequence.
  • the first preset condition may be that the frequency of occurrence is less than 3 times.
  • Abstract operations that occur less frequently can be considered as interference noise, so removing them in advance can reduce the number of iterations for subsequent repeated operation mining.
  • the above process of filtering the abstract operation sequence specifically includes the following steps S310 to S330.
  • Step S310 Delete the abstract operations whose frequency of occurrence is less than the first preset value in the abstract operation sequence to obtain an updated abstract operation sequence.
  • For example, the abstract operation sequence is abcabdceabcdabcabcd, and the first preset value is assumed to be 3. Since e appears only once, it is deleted; a, b, c, and d each appear at least 3 times and are kept.
  • the updated abstract operation sequence is abcabdcabcdabcabcd.
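  • Step S310 can be sketched with a frequency count. The function name and the default threshold are illustrative; the threshold corresponds to the first preset value in the text.

```python
from collections import Counter


def drop_rare(sequence, first_preset_value=3):
    """Remove abstract operations that occur fewer than `first_preset_value` times."""
    counts = Counter(sequence)
    return [op for op in sequence if counts[op] >= first_preset_value]


# Example from the text: only e occurs fewer than 3 times.
updated = "".join(drop_rare("abcabdceabcdabcabcd"))
```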
  • Step S320 From the updated abstract operation sequence, find the target abstract operation for which the occurrence frequencies of the previous connection and the next connection are both less than the second preset value, and the sum of the occurrence frequencies of the previous connection and the next connection is the smallest.
  • the previous connection refers to the sequence composed of a certain abstract operation and its previous abstract operation.
  • the latter connection refers to the sequence of an abstract operation and the abstract operation after it.
  • For example, for the abstract operation b in the sequence abc, its previous connection is ab and its next connection is bc.
  • the target abstract operation refers to the abstract operation that satisfies the following two conditions in the updated abstract operation sequence:
  • Condition 1 The frequency of occurrence of the previous connection and the subsequent connection is less than the second preset value (for example, 3);
  • Condition 2 The sum of the occurrence frequencies of the previous connection and the next connection is the smallest.
  • In the updated abstract operation sequence abcabdcabcdabcabcd, the occurrence frequency of ab is 5, that of bc is 4, that of ca is 3, that of bd is 1, that of dc is 1, that of cd is 2, and that of da is 1.
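  • The search for target abstract operations in step S320 can be sketched as below. The text does not say how the first and last operations are handled, so this sketch assumes that a missing previous or next connection counts as frequency 0; the function name is also an assumption.

```python
from collections import Counter


def find_targets(sequence, second_preset_value=3):
    """Return the 0-based indices of the target abstract operations (step S320)."""
    # Occurrence frequency of every connection (pair of adjacent operations).
    pairs = Counter(zip(sequence, sequence[1:]))
    last = len(sequence) - 1
    best_sum, targets = None, []
    for i in range(len(sequence)):
        # Assumption: a missing connection at either end counts as frequency 0.
        prev_freq = pairs[(sequence[i - 1], sequence[i])] if i > 0 else 0
        next_freq = pairs[(sequence[i], sequence[i + 1])] if i < last else 0
        if prev_freq >= second_preset_value or next_freq >= second_preset_value:
            continue  # condition 1 is not satisfied
        total = prev_freq + next_freq
        if best_sum is None or total < best_sum:
            best_sum, targets = total, [i]  # new minimum for condition 2
        elif total == best_sum:
            targets.append(i)
    return targets


candidates = find_targets("abcabdcabcdabcabcd")
```

  • When several operations tie for the smallest sum, all of them are returned, and step S330 can delete one of them at random, for example with `random.choice`.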
  • Step S330 Randomly delete a target abstract operation in the updated abstract operation sequence, and repeat the above steps multiple times until there is no deletable abstract operation.
  • the deletable abstract operations include abstract operations whose occurrence frequency is less than the first preset value, and also include target abstract operations, that is, those for which the occurrence frequencies of the previous connection and the next connection are both less than the second preset value and the sum of the two occurrence frequencies is the smallest.
  • the abstract operation sequence becomes abcabdcabcdabcabc.
  • the above-mentioned step S250 specifically includes the following steps S410 to S430 (or S430').
  • Step S410 Merge the same abstract operations in the abstract operation sequence into a subsequence, obtain a subsequence list, and record the frequency of occurrence of each subsequence and the time position in the abstract operation sequence through an information lookup table.
  • Merging the same abstract operations into a subsequence means merging the same abstract operations at different time positions into one abstract operation, which can be called a subsequence.
  • For example, if the abstract operation sequence is abcfgfgabcfgabc, the a at the 1st time position, the a at the 8th time position, and the a at the 13th time position are merged into a subsequence [a].
  • The other subsequences [b], [c], [f], and [g] are obtained in the same way. All subsequences constitute the subsequence list [[a],[b],[c],[f],[g]].
  • the information lookup table is used to record the relevant information of each subsequence for subsequent steps to find.
  • Relevant information includes the frequency and time position of the subsequence in the abstract operation sequence.
  • The time position characterizes where a subsequence occurs in the abstract operation sequence. For example, the occurrence frequency of subsequence [a] is 3, and its time positions are 1, 8, and 13.
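  • A minimal sketch of step S410, assuming the information lookup table is a dictionary from subsequence (stored as a tuple) to its list of 1-based time positions, so that the occurrence frequency is simply the length of that list. The function name is hypothetical.

```python
def build_lookup(sequence):
    """Merge identical abstract operations; return (subsequence list, lookup table)."""
    lookup = {}
    for position, op in enumerate(sequence, start=1):  # 1-based time positions
        lookup.setdefault((op,), []).append(position)
    subsequences = list(lookup)  # dict keys keep first-appearance order
    return subsequences, lookup


subsequences, lookup = build_lookup("abcfgfgabcfgabc")
```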
  • Step S420 According to the information lookup table, select a target subsequence that satisfies the second preset condition from the subsequence list each time, and extend the target subsequence forward and backward to obtain extended subsequences.
  • the target subsequence refers to a subsequence in the subsequence list that satisfies the second preset condition.
  • Extended subsequence refers to the result obtained by extending the target subsequence forward by one abstract operation or backward by one abstract operation.
  • the above step S420 specifically includes: according to the occurrence frequency of each subsequence recorded in the information lookup table, selecting the subsequence with the highest frequency of occurrence from the subsequence list each time; if more than one subsequence has the highest frequency of occurrence, the subsequence with the longest length is selected as the target subsequence.
  • the second preset condition includes: Condition 1: the subsequence has the highest frequency of occurrence in the information lookup table; Condition 2: when multiple subsequences share the highest frequency of occurrence, the longest of them is selected.
  • the special case is: when several subsequences have the highest frequency of occurrence and are the same length, randomly select one of these subsequences.
  • For example, the occurrence frequency of subsequence [a], subsequence [b], subsequence [c], subsequence [f], and subsequence [g] in the information lookup table is 3 in each case and their lengths are all 1, so one of them, say subsequence [b], is randomly selected as the target subsequence this time.
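  • The selection rule of the second preset condition can be sketched as follows, assuming the lookup table maps each subsequence (a tuple) to its list of time positions. The optional `rng` parameter is an addition for reproducibility and is not part of the described method.

```python
import random


def select_target(subsequences, lookup, rng=random):
    """Pick the most frequent subsequence; break ties by length, then at random."""
    top_freq = max(len(lookup[s]) for s in subsequences)
    most_frequent = [s for s in subsequences if len(lookup[s]) == top_freq]
    top_len = max(len(s) for s in most_frequent)
    longest = [s for s in most_frequent if len(s) == top_len]
    return rng.choice(longest)
```

  • Passing `random.Random(seed)` as `rng` makes the random tie-break repeatable between runs.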
  • For example, the abstract operation sequence is abcfgfgabcfgabc and the target subsequence is [b]. The subsequence [b] at time position 2 extends forward by one abstract operation to [a, b] and backward by one abstract operation to [b, c]; the subsequence [b] at time position 9 extends forward to [a, b] and backward to [b, c]; the subsequence [b] at time position 14 extends forward to [a, b] and backward to [b, c].
  • the extended subsequences are [a, b] and [b, c]; the occurrence frequency of [a, b] is 3 and the occurrence frequency of [b, c] is 3, and the occurrence frequencies of the extended subsequences are also recorded in the information lookup table.
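  • The forward and backward extension can be sketched as below. The function name and the returned shape (extended subsequence mapped to its 1-based start positions) are assumptions; the occurrence frequency of an extended subsequence is the length of its position list.

```python
def extend(sequence, target, positions):
    """Extend each occurrence of `target` by one operation in both directions.

    positions: 1-based start positions of `target` in `sequence`.
    Returns {extended subsequence: [1-based start positions]}.
    """
    extended = {}
    width = len(target)
    for pos in positions:
        start = pos - 1  # 0-based start of this occurrence
        if start > 0:
            # Forward extension: prepend the operation just before the occurrence.
            ext = tuple(sequence[start - 1:start + width])
            extended.setdefault(ext, []).append(pos - 1)
        if start + width < len(sequence):
            # Backward extension: append the operation just after the occurrence.
            ext = tuple(sequence[start:start + width + 1])
            extended.setdefault(ext, []).append(pos)
    return extended


extended_b = extend("abcfgfgabcfgabc", ("b",), [2, 9, 14])
```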
  • Step S430 Select an extended subsequence that meets the third preset condition to be added to the subsequence list, and delete the subsequences that constitute the extended subsequence in the subsequence list until the subsequence list is empty.
  • Step S430' If none of the extended subsequences meets the third preset condition, use the selected target subsequence as a repeated abstract operation combination, and delete the target subsequence from the subsequence list, until the subsequence list is empty.
  • the third preset condition includes: Condition 1: the occurrence frequency (count) of the extended subsequence in the information lookup table is greater than or equal to the preset frequency (for example, 3), its occurrence rate is greater than the preset rate (for example, 1%), and its extendable rate is greater than the preset extendable rate (for example, 20%).
  • Condition 2 For the extended subsequence that satisfies condition 1, select the extended subsequence with the highest frequency of occurrence in the information lookup table.
  • the above-mentioned step S430 includes: filtering out extended subsequences whose occurrence frequency (count) is greater than or equal to the preset frequency, whose occurrence rate is greater than the preset rate, and whose extendable rate is greater than the preset extendable rate.
  • the extended subsequence with the highest frequency of occurrence is selected from the filtered extended subsequences and added to the subsequence list.
  • the special case is that if several extended subsequences that meet condition 1 have the same frequency of occurrence in the information lookup table and are all the highest, then an extended subsequence will be randomly selected. That is, only one extended sequence is added to the subsequence list at a time.
  • the occurrence rate (the ratio form of the occurrence frequency used in condition 1) refers to the ratio of the occurrence frequency of the extended subsequence in the information lookup table to the sum of the occurrence frequencies of all subsequences in the information lookup table.
  • the extendable rate refers to the ratio of the occurrence frequency of the extended subsequence in the information lookup table to the occurrence frequency of the pre-extension subsequence in the information lookup table. It represents the proportion of the original subsequence's occurrences that can be extended into the new subsequence.
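  • Condition 1 of the third preset condition can be written as a small predicate. This is a sketch under assumptions: the parameter names are hypothetical, the defaults are the example values from the text (3, 1%, 20%), and the caller decides which subsequences contribute to `total_count`, since the text does not pin that down further.

```python
def meets_third_condition(ext_count, target_count, total_count,
                          min_count=3, min_rate=0.01, min_extendable_rate=0.20):
    """Check condition 1 of the third preset condition for one extended subsequence.

    ext_count: occurrences of the extended subsequence
    target_count: occurrences of the subsequence before extension
    total_count: sum of the occurrences of all subsequences in the lookup table
    """
    occurrence_rate = ext_count / total_count
    extendable_rate = ext_count / target_count
    return (ext_count >= min_count
            and occurrence_rate > min_rate
            and extendable_rate > min_extendable_rate)
```

  • For instance, an extended subsequence that occurs 3 times, built from a target that occurs 3 times, with a total of 15 occurrences in the table, has an occurrence rate of 20% and an extendable rate of 100%, so it passes with the example thresholds.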
  • For example, when the extended subsequence [a, b] is added and the subsequences [a] and [b] that constitute it are deleted, the updated subsequence list is [[c],[f],[g],[a,b]].
  • If no extended subsequence satisfies the third preset condition, the target subsequence used to form the extended subsequences is taken as a repeated abstract operation combination and deleted from the subsequence list, which also results in an updated subsequence list.
  • Then step S420 is executed again to select a target subsequence that satisfies the second preset condition from the updated subsequence list and extend it to obtain extended subsequences, and step S430 (or S430') is executed, until the subsequence list is empty.
  • the subsequence list at this time is: [[c],[f],[g],[a,b]];
  • the frequency of occurrence of all subsequences is 3, but the subsequence [a, b] is the longest, so [a, b] is taken out as the target subsequence for extension.
  • the extended subsequence [a, b, c] is added, the subsequences [a, b] and [c] that constitute it are deleted, and the subsequence list is updated to: [[f], [g], [a, b, c]].
  • the second round of iteration ends.
  • the subsequence list at this time is: [[f],[g],[a,b,c]]
  • the frequency of occurrence of all subsequences is 3, but the subsequence [a, b, c] is the longest, then [a, b, c] will be taken out as the target subsequence for extension.
  • the extension fails.
  • In this case another branch is entered, namely the above-mentioned step S430': the target subsequence [a, b, c] that cannot be extended (extension failure) is added to the result sequence.
  • the frequency and time positions of the target subsequence [a, b, c] are put into the information lookup table corresponding to the result sequence, and the target subsequence [a, b, c] that cannot be extended is deleted from the subsequence list; the new subsequence list is then: [[f],[g]].
  • Iteration continues with the fourth round and so on until the subsequence list is empty, at which point the result sequence is obtained.
  • the result sequence is the set of all target subsequences whose extension attempts failed. These target subsequences are the repeated abstract operation combinations.
  • For the initial subsequence list [[a],[b],[c],[f],[g]], the final repeated sequences (i.e., repeated abstract operation combinations) are [a, b, c] and [f, g].
  • the frequency of occurrence of [a, b, c] and [f, g] can be found, as well as the time position of each occurrence.
  • the repeated abstract operation combination [a, b, c] appears three times in total, and the start time is different each time.
  • the concrete operation corresponding to each abstract operation in the abstract operation combination and the operation time of the concrete operation can be determined, thereby obtaining the repeated concrete operation combination and the start time and end time of the operation.
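  • The iteration over steps S410 to S430' can be sketched end to end as below. This is an illustrative sketch, not the applicant's code. It makes three labeled assumptions: ties are broken by taking the earliest entry instead of a random one (to keep the run deterministic), the total used for the occurrence rate is summed over the current subsequence list, and the subsequences deleted after a successful extension are the target and the single operation that was attached to it.

```python
def extract_repeats(sequence, min_count=3, min_rate=0.01, min_extendable_rate=0.20):
    """Return {repeated combination: [1-based start positions]} for `sequence`."""
    n = len(sequence)
    lookup = {}  # subsequence (tuple) -> 0-based start positions
    for i, op in enumerate(sequence):
        lookup.setdefault((op,), []).append(i)          # step S410
    sublist = list(lookup)
    results = {}
    while sublist:
        # Step S420: highest frequency, then longest. Assumption: remaining
        # ties go to the earliest entry rather than a random one.
        target = max(sublist, key=lambda s: (len(lookup[s]), len(s)))
        width = len(target)
        extended = {}
        for start in lookup[target]:
            if start > 0:                               # forward extension
                ext = tuple(sequence[start - 1:start + width])
                extended.setdefault(ext, []).append(start - 1)
            if start + width < n:                       # backward extension
                ext = tuple(sequence[start:start + width + 1])
                extended.setdefault(ext, []).append(start)
        # Assumption: the total is taken over the current subsequence list.
        total = sum(len(lookup[s]) for s in sublist)
        target_count = len(lookup[target])
        qualified = [e for e, starts in extended.items()
                     if len(starts) >= min_count
                     and len(starts) / total > min_rate
                     and len(starts) / target_count > min_extendable_rate]
        if qualified:
            # Step S430: keep the most frequent qualifying extension and drop
            # the subsequences it was built from.
            best = max(qualified, key=lambda e: len(extended[e]))
            lookup[best] = extended[best]
            added = (best[0],) if best[1:] == target else (best[-1],)
            sublist = [s for s in sublist if s not in (target, added)]
            if best not in sublist:
                sublist.append(best)
        else:
            # Step S430': extension failed, so the target is a result.
            results[target] = [start + 1 for start in lookup[target]]
            sublist.remove(target)
    return results


repeats = extract_repeats("abcfgfgabcfgabc")
```

  • On the example sequence this sketch yields the combinations [a, b, c] and [f, g] with their start positions, from which the corresponding concrete operations and operation times can be looked up through the mapping table.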
  • users can analyze these repetitive operations, improve workflow or consider RPA-based operations to improve efficiency.
  • the solution provided by the above embodiments of the present application does not require manual mining and can automatically and accurately analyze repeated operations in user work operation records.
  • FIG. 5 is a block diagram of a repetitive operation extraction device according to an embodiment of the present application. As shown in Figure 5, the device includes:
  • the record acquisition module 510 is used to obtain work operation records, which include the concrete operation and operation time of each step;
  • the operation screening module 520 is used to screen out special operations and ordinary operations based on the concrete operation of each step;
  • the operation abstraction module 530 is used to convert the special operations and ordinary operations into abstract operations, and establish a mapping relationship between the concrete operations and the abstract operations;
  • the operation sequencing module 540 is used to arrange all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence;
  • the repetition extraction module 550 is used to extract repeated abstract operation combinations from the abstract operation sequence, and obtain the concrete operation combination and operation time corresponding to the abstract operation combination.
  • each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two consecutive blocks may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block of the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
  • each functional module in each embodiment of the present application can be integrated together to form an independent part, each module can exist alone, or two or more modules can be integrated to form an independent part.
  • the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the various embodiments of the application.
  • the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Input From Keyboards Or The Like (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

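The present application provides a repeated operation extraction method, an electronic device, and a storage medium. The method obtains a work operation record and, based on the concrete operation of each step in the record, screens out special operations and ordinary operations; converts the special operations and ordinary operations into abstract operations and establishes a mapping between the concrete operations and the abstract operations; arranges the abstract operations by the operation time of the corresponding concrete operations to obtain an abstract operation sequence; and then extracts repeated abstract operation combinations from the abstract operation sequence and obtains the concrete operation combinations and operation times corresponding to them. Because mining the repeated operations requires no manual involvement, the workload and cost of manual mining are reduced and efficiency is improved.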

Description

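A repeated operation extraction method, electronic device, and storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202210971903.9, filed with the China Patent Office on August 15, 2022 and entitled "A repeated operation extraction method, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of data mining technology, and in particular to a repeated operation extraction method, an electronic device, and a computer-readable storage medium.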
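Background
RPA (Robotic Process Automation) technology simulates manual keyboard and mouse operation to automatically handle computer work and tasks that have clear rules and are repeated in bulk. It can free office workers from daily repetitive work and improve productivity. Just as assembly-line machines replaced factory workers in the industrial era, RPA can operate computers and software in place of office workers, automatically completing work and business processing across software systems and automating business processes accurately and efficiently.
People's daily work often involves many repetitive operations, such as registration and invoicing. RPA can reduce such work to a single click and thereby improve efficiency. However, these repeated operations have to be mined by people, and the mining itself takes a great deal of work, which hinders the subsequent application of RPA.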
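Summary
Embodiments of the present application provide a repeated operation extraction method, an electronic device, and a storage medium, to reduce the workload and cost of manually mining repeated operations and to improve efficiency.
An embodiment of the present application provides a repeated operation extraction method, comprising:
obtaining a work operation record, the work operation record including the concrete operation and operation time of each step;
screening out special operations and ordinary operations according to the concrete operation of each step;
converting the special operations and ordinary operations into abstract operations, and establishing a mapping between the concrete operations and the abstract operations;
arranging all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence;
extracting repeated abstract operation combinations from the abstract operation sequence, and obtaining the concrete operation combinations and operation times corresponding to the abstract operation combinations.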
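In an embodiment, screening out special operations and ordinary operations according to the concrete operation of each step comprises:
if the element content or command included in the concrete operation is any one of copy, paste, and save, determining that the concrete operation is a special operation;
if the concrete operation does not include an application name, or the application name or window name it includes is a designated name, determining that the concrete operation is an ignorable operation;
concrete operations other than the special operations and the ignorable operations are ordinary operations.
In an embodiment, converting the special operations and ordinary operations into abstract operations comprises:
storing each special operation by its operation name to obtain the abstract operation corresponding to the special operation;
storing each ordinary operation by its application name and fixed window name to obtain the abstract operation corresponding to the ordinary operation.
In an embodiment, before extracting repeated abstract operation combinations from the abstract operation sequence, the method further comprises:
filtering the abstract operation sequence to remove abstract operations whose occurrence frequency in the abstract operation sequence satisfies a first preset condition.
In an embodiment, filtering the abstract operation sequence to remove abstract operations whose occurrence frequency satisfies the first preset condition comprises:
deleting abstract operations whose occurrence frequency in the abstract operation sequence is less than a first preset value, to obtain an updated abstract operation sequence;
finding, in the updated abstract operation sequence, the target abstract operation whose preceding link and following link both have an occurrence frequency less than a second preset value and whose preceding-link and following-link occurrence frequencies have the smallest sum;
randomly deleting one target abstract operation from the updated abstract operation sequence, and repeating the above steps until no deletable abstract operation remains.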
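In an embodiment, extracting repeated abstract operation combinations from the abstract operation sequence comprises:
merging identical abstract operations in the abstract operation sequence into one subsequence to obtain a subsequence list, and recording, in an information lookup table, the occurrence frequency of each subsequence and the time positions at which it occurs in the abstract operation sequence;
according to the information lookup table, selecting from the subsequence list, one at a time, a target subsequence that satisfies a second preset condition, and extending the target subsequence toward the front and toward the back to obtain extended subsequences;
selecting one extended subsequence that satisfies a third preset condition and adding it to the subsequence list, and deleting from the subsequence list the subsequences that make up the extended subsequence, until the subsequence list is empty;
if no extended subsequence satisfies the third preset condition, taking the selected target subsequence as a repeated abstract operation combination and deleting the target subsequence from the subsequence list, until the subsequence list is empty.
In an embodiment, selecting from the subsequence list, one at a time and according to the information lookup table, a target subsequence that satisfies the second preset condition comprises:
according to the occurrence frequency of each subsequence recorded in the information lookup table, selecting from the subsequence list, each time, the subsequence with the highest occurrence frequency;
if more than one subsequence has the highest occurrence frequency, selecting the longest of them as the target subsequence.
In an embodiment, extending the target subsequence toward the front and toward the back to obtain extended subsequences comprises:
according to the time positions, recorded in the information lookup table, at which each subsequence occurs in the abstract operation sequence, extending the target subsequence at its time position by one abstract operation toward the front to obtain one extended subsequence, and by one abstract operation toward the back to obtain another extended subsequence.
In an embodiment, selecting one extended subsequence that satisfies the third preset condition and adding it to the subsequence list comprises:
screening out the extended subsequences whose occurrence frequency is greater than or equal to a preset frequency, whose occurrence rate is greater than a preset rate, and whose extensibility rate is greater than a preset extensibility rate;
selecting, from the screened extended subsequences, the one with the highest occurrence frequency and adding it to the subsequence list.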
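An embodiment of the present application provides a repeated operation extraction apparatus, the apparatus comprising:
a record acquisition module, configured to obtain a work operation record, the work operation record including the concrete operation and operation time of each step;
an operation screening module, configured to screen out special operations and ordinary operations according to the concrete operation of each step;
an operation abstraction module, configured to convert the special operations and ordinary operations into abstract operations and establish a mapping between the concrete operations and the abstract operations;
an operation sequencing module, configured to arrange all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence;
a repetition extraction module, configured to extract repeated abstract operation combinations from the abstract operation sequence and obtain the concrete operation combinations and operation times corresponding to the abstract operation combinations.
An embodiment of the present application further provides an electronic device, the electronic device comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the above repeated operation extraction method.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program, the computer program being executable by a processor to carry out the above repeated operation extraction method.
An embodiment of the present application further provides a computer software product stored in a computer-readable storage medium, the computer software product including instructions that, when executed by a processor, implement the above repeated operation extraction method.
In the solution provided by the above embodiments of the present application, a work operation record is obtained and, based on the concrete operation of each step in the record, special operations and ordinary operations are screened out; the special operations and ordinary operations are converted into abstract operations, and a mapping between the concrete operations and the abstract operations is established; the abstract operations are arranged by the operation time of the corresponding concrete operations to obtain an abstract operation sequence; repeated abstract operation combinations are then extracted from the abstract operation sequence, and the corresponding concrete operation combinations and operation times are obtained. Because mining the repeated operations requires no manual involvement, the workload and cost of manual mining are reduced and efficiency is improved.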
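Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below.
FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a repeated operation extraction method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of filtering an abstract operation sequence provided by an embodiment of the present application;
FIG. 4 is a detailed flowchart of step S250 in the embodiment corresponding to FIG. 2;
FIG. 5 is a block diagram of a repeated operation extraction apparatus according to an embodiment of the present application.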
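Detailed Description
The technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments.
Similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In the description of the present application, the terms "first", "second", and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device 100 can be used to execute the repeated operation extraction method provided by the embodiments of the present application. As shown in FIG. 1, the electronic device 100 includes one or more processors 102 and one or more memories 104 storing processor-executable instructions. The processor 102 is configured to execute the repeated operation extraction method provided by the following embodiments of the present application.
The processor 102 may be a gateway, a smart terminal, or a device containing a central processing unit (CPU), a graphics processing unit (GPU), or another form of processing unit with data processing capability and/or instruction execution capability. It can process data from other components in the electronic device 100 and can control other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions can be stored on the computer-readable storage medium, and the processor 102 can run the program instructions to implement the repeated operation extraction method described below. Various application programs and various data, such as data used and/or generated by the application programs, can also be stored in the computer-readable storage medium.
In an embodiment, the electronic device 100 shown in FIG. 1 may further include an input device 106, an output device 108, and a data acquisition device 110, which are interconnected through a bus system 112 and/or another form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in FIG. 1 are exemplary rather than limiting, and the electronic device 100 may have other components and structures as needed.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like. The output device 108 may output various information (for example, images or sound) to the outside (for example, to a user) and may include one or more of a display, a speaker, and the like. The data acquisition device 110 may capture images of an object and store the captured images in the memory 104 for use by other components. By way of example, the data acquisition device 110 may be a camera.
In an embodiment, the components of the example electronic device 100 used to implement the repeated operation extraction method of the embodiments of the present application may be integrated or distributed; for example, the processor 102, the memory 104, the input device 106, and the output device 108 may be integrated into one unit while the data acquisition device 110 is provided separately.
In an embodiment, the example electronic device 100 used to implement the repeated operation extraction method of the embodiments of the present application may be implemented as a smart terminal such as a smartphone, a tablet computer, a desktop computer, a server, or a vehicle-mounted device.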
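FIG. 2 is a schematic flowchart of a repeated operation extraction method according to an embodiment of the present application. As shown in FIG. 2, the method includes the following steps S210 to S250.
Step S210: obtain a work operation record, the work operation record including the concrete operation and operation time of each step.
For example, for a patent practitioner, "check email, download patent.docx, open it, edit it, save and close it, send it" is a repeated operation task. A patent practitioner's daily work contains many operations, among them repetitive tasks of this kind.
A work operation record is the data stream formed by a user's operations over a day or some other period; it can be obtained, for example, by having an RPA recorder record the user's operations for a day. The RPA recorder can parse each step of the user's operation, extracting mouse click operations and special keyboard key operations (for example, the ctrl key and the enter key), as well as the application, window, element, and command currently being operated on, and can record the operation time of each step.
Application: for example, Word.
Window: a particular window of the application, for example "task mining patent.docx".
Element: a particular element in a particular window of the application, for example the title of an article, or each option that appears after right-clicking in Word.
Element content: the content the element actually contains. For example, if the element is the title of an article, the element content may be "A method for mining repeated transactions"; if the element is one of the options that appear after right-clicking in Word, the element content may be "Cut", "Copy", and so on.
Command: mouse and keyboard operation commands, for example a right mouse click or the keyboard inputs "ctrl+c" and "ctrl+v".
The work operation record may include the concrete operation and operation time of each step. A concrete operation is the specific operation performed and may include information such as the application name, window name, element content, and command. The operation time is the specific time at which each step occurred.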
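Step S220: screen out special operations and ordinary operations according to the concrete operation of each step.
Special operations are copy, paste, and save operations. If the element content or command included in a concrete operation is any one of copy, paste, and save, the concrete operation is determined to be a special operation. If the command is "ctrl+c", it is treated as a copy; if the command is "ctrl+v", it is treated as a paste. If the element content is "Copy", it is treated as a copy operation. If the element content is "Paste", "Match Destination Formatting", "Keep Source Formatting", or the like, it is treated as a paste operation. If the element content is "保存" ("Save"), "save", or the like, it is treated as a save operation.
If the concrete operation does not include an application name, or the application name or window name it includes is a designated name, the concrete operation is determined to be an ignorable operation.
A missing application name can be taken to mean that the current operation does not act on an application, although it may also be a parsing error by the RPA recorder; in either case the concrete operation is treated as ignorable. If the application name in the concrete operation is a designated name (for example, the file explorer), the user is probably switching applications, and the concrete operation is also treated as ignorable. If the window name in the concrete operation is a designated name containing a string such as "新标签页" or "new tab", the concrete operation is likewise treated as ignorable.
Concrete operations other than special operations and ignorable operations are ordinary operations. One can first check whether a concrete operation is a special operation; if not, check whether it is an ignorable operation; and if not, it is an ordinary operation. If desired, the order can be reversed: first check whether it is an ignorable operation; if not, check whether it is a special operation; and if it is neither, it is an ordinary operation.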
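The screening rules of step S220 can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the `ConcreteOp` record, the keyword tables, and the function names are all assumptions, and the keyword values simply reuse the examples given in the description (in Chinese and English).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConcreteOp:
    """Hypothetical record for one step captured by an RPA recorder."""
    time: float
    app: Optional[str] = None
    window: Optional[str] = None
    element: Optional[str] = None   # element content
    command: Optional[str] = None   # mouse/keyboard command


# Illustrative keyword tables built from the examples in the description.
COMMAND_TO_SPECIAL = {"ctrl+c": "copy", "ctrl+v": "paste"}
ELEMENT_TO_SPECIAL = {
    "复制": "copy", "copy": "copy",
    "粘贴": "paste", "匹配目标格式": "paste", "保留原格式": "paste", "paste": "paste",
    "保存": "save", "save": "save",
}
IGNORED_APPS = {"资源管理器", "explorer"}
IGNORED_WINDOW_MARKERS = ("新标签页", "new tab")


def special_name(op: ConcreteOp) -> Optional[str]:
    """Return 'copy', 'paste' or 'save' for a special operation, else None."""
    command = (op.command or "").strip().lower()
    if command in COMMAND_TO_SPECIAL:
        return COMMAND_TO_SPECIAL[command]
    element = (op.element or "").strip().lower()
    return ELEMENT_TO_SPECIAL.get(element)


def classify(op: ConcreteOp) -> str:
    """Check for a special operation first, then ignorable, else ordinary."""
    if special_name(op) is not None:
        return "special"
    if not op.app or op.app.lower() in IGNORED_APPS:
        return "ignorable"
    window = (op.window or "").lower()
    if any(marker in window for marker in IGNORED_WINDOW_MARKERS):
        return "ignorable"
    return "ordinary"
```

The sketch checks for special operations first; as the description notes, the ignorable check could equally be run first.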
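Step S230: convert the special operations and ordinary operations into abstract operations, and establish a mapping between the concrete operations and the abstract operations.
Converting special operations and ordinary operations into abstract operations simplifies the concrete operations. A concrete operation carries a lot of information, and some concrete operations can be regarded as the same kind of operation, so special operations and ordinary operations can be abstracted (that is, simplified); the simplified special operations and ordinary operations are collectively called abstract operations.
Specifically, the abstraction can be done as follows: store each special operation by its operation name to obtain the abstract operation corresponding to the special operation, and store each ordinary operation by its application name and fixed window name to obtain the abstract operation corresponding to the ordinary operation.
The operation name is the name of the special operation, such as copy, paste, or save. Special operations performed in different applications and windows can therefore all be represented by the operation name as the abstract operation. For an ordinary operation, the application name plus the fixed window name serves as the abstract operation. The fixed window name is defined in contrast to the variable window name and can be extracted from the window name included in the ordinary operation. For example, in the window name "必应搜索-专利" ("Bing Search-Patent"), "必应搜索" is the fixed window name and "专利" is the variable window name. In an embodiment, a pre-trained keyword extraction algorithm can be used to extract the fixed window name from the window name; alternatively, a rule-matching algorithm can be used, for example taking the first word of the window name as the fixed window name.
In an embodiment, a mapping table can be used to store the mapping between concrete operations and abstract operations, so that it can be determined which abstract operation each concrete operation classified as special or ordinary corresponds to after abstraction.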
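A minimal sketch of the abstraction step is shown below. It assumes each operation is a dictionary that already carries a `kind` field from the screening step, and it uses the rule-based variant mentioned above (the first word of the window name is the fixed window name). The field names, the `+` separator, and the splitting on whitespace or hyphens are illustrative assumptions.

```python
import re


def fixed_window_name(window: str) -> str:
    """Rule-based extraction: treat the first word of the window name as fixed."""
    return re.split(r"[\s\-]+", window.strip(), maxsplit=1)[0]


def abstract_operation(op: dict) -> str:
    """Special operations collapse to their operation name; ordinary ones to app + fixed window."""
    if op["kind"] == "special":
        return op["name"]
    return f'{op["app"]}+{fixed_window_name(op["window"])}'


def build_mapping(ops: list) -> list:
    """Return the mapping table as (abstract_op, concrete_op) pairs, skipping ignorable operations."""
    return [(abstract_operation(op), op) for op in ops if op["kind"] != "ignorable"]
```

With this rule, the window names "必应搜索-专利" and "必应搜索-天气" in the same application map to the same abstract operation, which is what lets later steps see them as repetitions.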
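Step S240: arrange all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence.
The abstract operation sequence is obtained by arranging all abstract operations in chronological order of operation time. The operation time of an abstract operation is the operation time of the corresponding concrete operation in the mapping table.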
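As a small illustration of step S240, the sketch below sorts mapping-table rows by operation time. The row layout (a dictionary with `abstract` and `time` keys) is an assumption made for the example.

```python
def build_abstract_sequence(mapping_table: list) -> list:
    """Sort mapping-table rows by operation time and return the abstract operations in order."""
    ordered = sorted(mapping_table, key=lambda row: row["time"])
    return [row["abstract"] for row in ordered]
```

Because `sorted` is stable, rows with identical timestamps keep their recorded order.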
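Step S250: extract repeated abstract operation combinations from the abstract operation sequence, and obtain the concrete operation combinations and operation times corresponding to the abstract operation combinations.
An abstract operation combination is a sequence made up of several abstract operations extracted from the abstract operation sequence; it is called a combination to distinguish it from the full sequence. A repeated abstract operation combination is one that occurs more than once.
For example, if the abstract operation sequence is abcfgfgabcfgabc, the repeated abstract operation combinations are abc and fg. A concrete operation combination is the sequence of concrete operations corresponding to the abstract operations. The operation time of an abstract operation combination is the operation time of the corresponding concrete operations, so it is possible to determine when a repeated operation starts, when it ends, and how many times it occurs.
In an embodiment, before step S250, the method provided by the embodiments of the present application further includes: filtering the abstract operation sequence to remove abstract operations whose occurrence frequency in the abstract operation sequence satisfies a first preset condition.
To make mining repeated operations more efficient, abstract operations whose occurrence frequency satisfies the first preset condition can be removed from the abstract operation sequence first. For example, the first preset condition may be an occurrence frequency of fewer than 3. Abstract operations that occur rarely can be regarded as noise, so removing them in advance reduces the number of iterations needed later when mining repeated operations.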
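In an embodiment, as shown in FIG. 3, the above process of filtering the abstract operation sequence includes the following steps S310 to S330.
Step S310: delete abstract operations whose occurrence frequency in the abstract operation sequence is less than a first preset value, to obtain an updated abstract operation sequence.
For example, suppose the abstract operation sequence is abcabdceabcdabcabcd and the first preset value is 3. Since e occurs only once, it is deleted; a, b, c, and d each occur at least 3 times and are kept. The updated abstract operation sequence is abcabdcabcdabcabcd.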
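Step S310 amounts to a frequency filter. A minimal sketch, with the function name and the default threshold chosen only for illustration:

```python
from collections import Counter


def drop_rare(sequence, min_count=3):
    """Remove every abstract operation that occurs fewer than min_count times."""
    counts = Counter(sequence)
    return [op for op in sequence if counts[op] >= min_count]
```

Applied to the example sequence abcabdceabcdabcabcd with a threshold of 3, only e is removed.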
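Step S320: in the updated abstract operation sequence, find the target abstract operation whose preceding link and following link both have an occurrence frequency less than a second preset value and whose preceding-link and following-link occurrence frequencies have the smallest sum.
The preceding link is the sequence formed by an abstract operation and the abstract operation before it. The following link is the sequence formed by an abstract operation and the abstract operation after it. For example, for an abstract operation b preceded by a and followed by c, the preceding link is ab and the following link is bc.
A target abstract operation is an abstract operation in the updated abstract operation sequence that satisfies the following two conditions:
Condition 1: the occurrence frequencies of the preceding link and the following link are both less than the second preset value (for example, 3);
Condition 2: the sum of the occurrence frequencies of the preceding link and the following link is the smallest.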
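For example, suppose the updated abstract operation sequence is abcabdcabcdabcabcd. Calling two adjacent abstract operations a 1-link, the 1-links of abcabdcabcdabcabcd are:
ab, bc, ca, ab, bd, dc, ca, ab, bc, cd, da, ab, bc, ca, ab, bc, cd. Here ab occurs 5 times, bc 4 times, ca 3 times, bd once, dc once, cd twice, and da once.
Abstract operation a at time position 1: the occurrence frequency of its following link is greater than 3, so it is not deleted;
Abstract operation b at time position 2: the occurrence frequencies of its preceding link and following link are both greater than 3, so it is not deleted;
Abstract operation c at time position 3: the occurrence frequency of its preceding link is greater than 3, so it is not deleted;
Abstract operation a at time position 4: the occurrence frequency of its following link is greater than 3, so it is not deleted;
Abstract operation b at time position 5: the occurrence frequency of its preceding link is greater than 3, so it is not deleted;
Abstract operation d at time position 6: the occurrence frequencies of its preceding link and following link are both less than 3, so their sum is computed: 1+1=2;
Abstract operation c at time position 7: the occurrence frequency of its following link (ca) is 3, which is not less than 3, so it is not deleted;
......
Abstract operation d at time position 11: the occurrence frequencies of its preceding link and following link are both less than 3, so their sum is computed: 2+1=3;
......
Abstract operation d at time position 18: this d has no following link; in practice the occurrence frequency of a missing following link can be set to 1, and likewise the occurrence frequency of a missing preceding link is set to 1. Its preceding link (cd) occurs 2 times, so both frequencies are less than 3 and their sum is 2+1=3.
It can now be seen that, for the abstract operation d at time position 6, the occurrence frequencies of the preceding link and following link are both less than 3 and their sum, 1+1=2, is the smallest. The abstract operation d at time position 6 is therefore the target abstract operation.
Step S330: randomly delete one target abstract operation from the updated abstract operation sequence, and repeat the above steps until no deletable abstract operation remains.
If there is more than one target abstract operation, one of them is deleted at random. If there is only one, that target abstract operation is deleted. Repeating the above steps means repeating steps S310 to S330. Deletable abstract operations include those whose occurrence frequency is less than the first preset value, and also the target abstract operations whose preceding link and following link both have an occurrence frequency less than the second preset value and whose preceding-link and following-link occurrence frequencies have the smallest sum.
For example, the abstract operation d at time position 6 is the target abstract operation and is deleted. After the first round of deletion, the abstract operation sequence becomes abcabcabcdabcabcd.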
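Second round of deletion:
The abstract operation d now occurs only 2 times and is deleted; the other abstract operations occur more than 3 times and are kept. The abstract operation sequence becomes abcabcabcabcabc.
The 1-links of abcabcabcabcabc are:
ab, bc, ca, ab, bc, ca, ab, bc, ca, ab, bc, ca, ab, bc, where ab occurs 5 times, bc 5 times, and ca 4 times.
Abstract operation a at time position 1: the occurrence frequency of its following link is greater than 3, so it is not deleted.
Abstract operation b at time position 2: the occurrence frequencies of its preceding link and following link are both greater than 3, so it is not deleted.
Abstract operation c at time position 3: the occurrence frequencies of its preceding link and following link are both greater than 3, so it is not deleted.
......
Abstract operation c at time position 15: the occurrence frequency of its preceding link is greater than 3, so it is not deleted.
At this point no deletable abstract operation remains. The iterative deletion ends, and the final updated abstract operation sequence is abcabcabcabcabc.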
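The whole filtering loop (steps S310 to S330) can be sketched as below. The function names and default thresholds are assumptions. Two simplifications are made and should be read as such: where the description deletes a random target, this sketch deletes the one with the smallest index so the result is reproducible, and a missing preceding or following link is given a frequency of 1, as the description suggests.

```python
from collections import Counter


def find_targets(seq, max_link=3):
    """Return 0-based indices of operations whose preceding and following links both
    occur fewer than max_link times and whose link-frequency sum is minimal."""
    links = Counter(zip(seq, seq[1:]))
    scored = []
    for i, op in enumerate(seq):
        prev_freq = links[(seq[i - 1], op)] if i > 0 else 1
        next_freq = links[(op, seq[i + 1])] if i + 1 < len(seq) else 1
        if prev_freq < max_link and next_freq < max_link:
            scored.append((prev_freq + next_freq, i))
    if not scored:
        return []
    best = min(score for score, _ in scored)
    return [i for score, i in scored if score == best]


def filter_sequence(sequence, min_count=3, max_link=3):
    """Repeat S310-S330 until nothing more can be deleted."""
    seq = list(sequence)
    while True:
        counts = Counter(seq)
        kept = [op for op in seq if counts[op] >= min_count]      # S310
        removed_rare = len(kept) != len(seq)
        seq = kept
        targets = find_targets(seq, max_link)                     # S320
        if targets:
            del seq[targets[0]]                                   # S330 (first instead of random)
        elif not removed_rare:
            return seq
```

For the example sequence, `find_targets` reports the d at time position 6 (index 5), and the loop ends with abcabcabcabcabc.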
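In an embodiment, as shown in FIG. 4, step S250 includes the following steps S410 to S430'.
Step S410: merge identical abstract operations in the abstract operation sequence into one subsequence to obtain a subsequence list, and record, in an information lookup table, the occurrence frequency of each subsequence and the time positions at which it occurs in the abstract operation sequence.
Merging identical abstract operations into one subsequence means merging the same abstract operation occurring at different time positions into a single abstract operation, which is then called a subsequence. Suppose the abstract operation sequence is abcfgfgabcfgabc. The a at time position 1, the a at time position 8, and the a at time position 13 are merged into one subsequence [a]; the other subsequences [b], [c], [f], and [g] are obtained in the same way. All the subsequences form the subsequence list [[a],[b],[c],[f],[g]].
The information lookup table records information about each subsequence for later steps to look up. This information includes the occurrence frequency and time positions of the subsequence in the abstract operation sequence. A time position indicates where the subsequence occurs in the order of the abstract operation sequence. For example, the subsequence [a] has an occurrence frequency of 3 and time positions 1, 8, and 13.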
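One way to picture the information lookup table of step S410 is a dictionary from each subsequence to its 1-based time positions, with the occurrence frequency being the length of that list. The representation (tuples as keys) is an assumption made for this sketch.

```python
def build_lookup(sequence):
    """Map each single-operation subsequence to the 1-based positions where it occurs."""
    lookup = {}
    for position, op in enumerate(sequence, start=1):
        lookup.setdefault((op,), []).append(position)
    return lookup


lookup = build_lookup("abcfgfgabcfgabc")
subsequence_list = list(lookup)   # [("a",), ("b",), ("c",), ("f",), ("g",)]
```

Here `len(lookup[("a",)])` is the occurrence frequency of [a] and `lookup[("a",)]` holds its time positions.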
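Step S420: according to the information lookup table, select from the subsequence list, one at a time, a target subsequence that satisfies a second preset condition, and extend the target subsequence toward the front and toward the back to obtain extended subsequences.
The target subsequence is the subsequence in the subsequence list that satisfies the second preset condition. An extended subsequence is the result of extending the target subsequence by one abstract operation toward the front (adding the operation that precedes it) or by one abstract operation toward the back (adding the operation that follows it).
In an embodiment, step S420 includes: according to the occurrence frequency of each subsequence recorded in the information lookup table, selecting from the subsequence list, each time, the subsequence with the highest occurrence frequency; if more than one subsequence has the highest occurrence frequency, selecting the longest of them as the target subsequence.
In other words, the second preset condition includes: Condition 1: the subsequence has the highest occurrence frequency in the information lookup table; Condition 2: when several subsequences share the highest occurrence frequency, the longest of them is selected. As a special case, when several subsequences share the highest occurrence frequency and are also equally long, one of them is selected at random.
For example, suppose the information lookup table shows that subsequences [a], [b], [c], [f], and [g] all have an occurrence frequency of 3 and a length of 1; then a subsequence is picked at random, say [b], as the target subsequence for this round.
According to the time positions, recorded in the information lookup table, at which each subsequence occurs in the abstract operation sequence, the target subsequence is extended at its time position by one abstract operation toward the front to obtain one extended subsequence, and by one abstract operation toward the back to obtain another extended subsequence.
Suppose the abstract operation sequence is abcfgfgabcfgabc and the target subsequence is [b]. For the subsequence [b] at time position 2, extending one abstract operation toward the front gives [a,b] and extending one toward the back gives [b,c]; for the subsequence [b] at time position 9, extending toward the front gives [a,b] and toward the back gives [b,c]; for the subsequence [b] at time position 14, extending toward the front gives [a,b] and toward the back gives [b,c].
The extended subsequences are therefore [a,b] and [b,c]; [a,b] has an occurrence frequency of 3 and [b,c] has an occurrence frequency of 3. The occurrence frequencies of the extended subsequences are also recorded in the information lookup table.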
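Target selection and one-step extension can be sketched as follows. The names `select_target` and `extend` are hypothetical, the lookup table is assumed to map tuples to 1-based start positions, and ties that the description resolves at random are resolved here by taking the first candidate.

```python
from collections import Counter


def select_target(lookup):
    """Second preset condition: highest occurrence frequency, then greatest length."""
    return max(lookup, key=lambda sub: (len(lookup[sub]), len(sub)))


def extend(sequence, target, starts):
    """Extend target by one operation toward the front and toward the back at every
    1-based start position, and count how often each extended subsequence arises."""
    seq = list(sequence)
    n = len(target)
    extended = Counter()
    for start in starts:
        i = start - 1
        if i > 0:                       # there is an operation in front
            extended[tuple(seq[i - 1:i + n])] += 1
        if i + n < len(seq):            # there is an operation behind
            extended[tuple(seq[i:i + n + 1])] += 1
    return extended
```

Calling `extend("abcfgfgabcfgabc", ("b",), [2, 9, 14])` yields [a,b] and [b,c], each with a count of 3, matching the example above.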
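Step S430: select one extended subsequence that satisfies a third preset condition and add it to the subsequence list, and delete from the subsequence list the subsequences that make up the extended subsequence, until the subsequence list is empty.
Step S430': if no extended subsequence satisfies the third preset condition, take the selected target subsequence as a repeated abstract operation combination and delete the target subsequence from the subsequence list, until the subsequence list is empty.
The third preset condition includes: Condition 1: the occurrence frequency of the extended subsequence in the information lookup table is greater than or equal to a preset frequency (for example, 3), its occurrence rate is greater than a preset rate (for example, 1%), and its extensibility rate is greater than a preset extensibility rate (for example, 20%). Condition 2: among the extended subsequences that satisfy Condition 1, the one with the highest occurrence frequency in the information lookup table is selected.
Specifically, step S430 includes: screening out the extended subsequences whose occurrence frequency is greater than or equal to the preset frequency, whose occurrence rate is greater than the preset rate, and whose extensibility rate is greater than the preset extensibility rate; and selecting, from the screened extended subsequences, the one with the highest occurrence frequency and adding it to the subsequence list. As a special case, if several extended subsequences satisfying Condition 1 share the highest occurrence frequency in the information lookup table, one of them is selected at random. In other words, only one extended subsequence is added to the subsequence list each time.
The occurrence rate is the ratio of the occurrence frequency of the extended subsequence in the information lookup table to the sum of the occurrence frequencies of all subsequences in the information lookup table. The extensibility rate is the ratio of the occurrence frequency of the extended subsequence in the information lookup table to the occurrence frequency of the subsequence before extension; it represents the proportion of occurrences of the original subsequence that can be extended into the new subsequence.
For example, the extended subsequence [a,b] has an occurrence frequency of 3, which is greater than or equal to the preset frequency (3); its occurrence rate is 3/15 = 20%, which is greater than 1%; and its extensibility rate is 100%, which is greater than 20%. The extended subsequence [b,c] has an occurrence frequency of 3, which is greater than or equal to the preset frequency (3); its occurrence rate is 3/15 = 20%, which is greater than 1%; and its extensibility rate is 100%, which is greater than 20%. Since [a,b] and [b,c] have the same occurrence frequency, one is chosen at random, say [a,b], and added to the subsequence list. The subsequence list is now [[a],[b],[c],[f],[g],[a,b]].
Since the subsequences that make up the extended subsequence [a,b] are [a] and [b], these two are deleted from the subsequence list, giving the updated subsequence list [[c],[f],[g],[a,b]].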
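The third preset condition can be expressed as a small filter followed by a maximum. In this sketch the function name, parameter names, and defaults (3, 1%, 20%) follow the example values in the text but are otherwise assumptions; a tie for the highest count goes to the first candidate rather than a random one.

```python
def choose_extension(candidates, parent_count, total_count,
                     min_count=3, min_rate=0.01, min_extensibility=0.20):
    """candidates maps each extended subsequence to its occurrence frequency.
    Returns the chosen extended subsequence, or None if none qualifies."""
    passing = {}
    for ext, count in candidates.items():
        rate = count / total_count              # occurrence rate
        extensibility = count / parent_count    # share of parent occurrences that extend
        if count >= min_count and rate > min_rate and extensibility > min_extensibility:
            passing[ext] = count
    if not passing:
        return None
    return max(passing, key=passing.get)
```

For the first round of the example, `parent_count` is 3 (the frequency of [b]) and `total_count` is 15 (the summed frequency of [a], [b], [c], [f], and [g]).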
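Conversely, if no extended subsequence satisfies the third preset condition, the target subsequence from which the extended subsequences were built is taken as a repeated abstract operation combination and deleted from the subsequence list, giving the updated subsequence list.
A second round of iteration follows: step S420 is executed again to select, from the updated subsequence list, a target subsequence that satisfies the second preset condition and extend it to obtain extended subsequences, and step S430 is then executed, until the subsequence list is empty.
For example, in the second round of iteration:
The subsequence list is now [[c],[f],[g],[a,b]];
Under the second preset condition, all subsequences have an occurrence frequency of 3, but the subsequence [a,b] is the longest, so [a,b] is taken as the target subsequence and extended.
The possible extensions are [a,b,c], with an occurrence frequency of 3, and [g,a,b], with an occurrence frequency of 2.
Under the third preset condition, only the extended subsequence [a,b,c] qualifies, so it is added to the subsequence list, which becomes [[c],[f],[g],[a,b],[a,b,c]].
The subsequences [a,b] and [c] that make up the extended subsequence are then deleted, and the subsequence list is updated to [[f],[g],[a,b,c]]. This ends the second round of iteration.
Third round of iteration:
The subsequence list is now [[f],[g],[a,b,c]].
Under the second preset condition, all subsequences have an occurrence frequency of 3, but the subsequence [a,b,c] is the longest, so [a,b,c] is taken as the target subsequence and extended.
The possible extensions are [a,b,c,f], with an occurrence frequency of 2, and [g,a,b,c], with an occurrence frequency of 2.
Under the third preset condition, no extended subsequence qualifies, so the extension fails. The other branch, step S430', is then taken: the target subsequence [a,b,c], which can no longer be extended, is added to the result sequence. At the same time, the occurrence frequency and time positions of the target subsequence [a,b,c] are put into the information lookup table corresponding to the result sequence, and [a,b,c] is deleted from the subsequence list, so the new subsequence list is [[f],[g]].
The fourth and later rounds of iteration proceed in the same way until the subsequence list is empty, at which point a result sequence is obtained. The result sequence is the set of all target subsequences whose extension attempts failed, and these target subsequences are the repeated abstract operation combinations.
Suppose the abstract operation sequence is abcfgfgabcfgabc and the initial subsequence list is [[a],[b],[c],[f],[g]]; the repeated sequences finally obtained (that is, the repeated abstract operation combinations) are [a,b,c] and [f,g]. From the information lookup table corresponding to the repeated sequences, the occurrence frequencies of [a,b,c] and [f,g] and the time position of each occurrence can be looked up. For example, the repeated abstract operation combination [a,b,c] occurs 3 times in total, each time with a different start time.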
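After that, according to the mapping between abstract operations and concrete operations established in step S230, the concrete operation corresponding to each abstract operation in an abstract operation combination, and the operation time of that concrete operation, can be determined, thereby obtaining the repeated concrete operation combination and its start time and end time. Users can then analyze these repeated operations and either improve the workflow or consider automating them with RPA to improve efficiency. The solution provided by the above embodiments of the present application requires no manual mining and automatically and accurately analyzes the repeated operations in a user's work operation record.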
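The mapping back to concrete operations can be sketched as a slice over the time-ordered mapping table. The sketch assumes that the table rows are `(abstract_op, concrete_op, time)` tuples aligned one-to-one with the abstract operation sequence used for extraction, so that a 1-based time position indexes directly into the table; both the layout and the function name are illustrative.

```python
def concrete_occurrences(ordered_rows, combination, starts):
    """For each 1-based start position, return the concrete operations of the
    combination together with the start and end time of that occurrence."""
    n = len(combination)
    occurrences = []
    for start in starts:
        window = ordered_rows[start - 1:start - 1 + n]
        occurrences.append({
            "concrete": [concrete for _, concrete, _ in window],
            "start_time": window[0][2],
            "end_time": window[-1][2],
        })
    return occurrences
```

If operations were removed by the filtering step, the rows passed in would need to be the filtered ones so that positions still line up.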
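The following are apparatus embodiments of the present application, which can be used to execute the above repeated operation extraction method embodiments. For details not disclosed in the apparatus embodiments, refer to the repeated operation extraction method embodiments of the present application.
FIG. 5 is a block diagram of a repeated operation extraction apparatus according to an embodiment of the present application. As shown in FIG. 5, the apparatus includes:
a record acquisition module 510, configured to obtain a work operation record, the work operation record including the concrete operation and operation time of each step;
an operation screening module 520, configured to screen out special operations and ordinary operations according to the concrete operation of each step;
an operation abstraction module 530, configured to convert the special operations and ordinary operations into abstract operations and establish a mapping between the concrete operations and the abstract operations;
an operation sequencing module 540, configured to arrange all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence;
a repetition extraction module 550, configured to extract repeated abstract operation combinations from the abstract operation sequence and obtain the concrete operation combinations and operation times corresponding to the abstract operation combinations.
For how the functions of each module in the above apparatus are implemented, see the implementation of the corresponding steps in the above repeated operation extraction method; the details are not repeated here.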
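In the several embodiments provided by the present application, the disclosed apparatus and method can also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions, and operations of apparatuses, methods, and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist alone, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.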

Claims (13)

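  1. A repeated operation extraction method, characterized by comprising:
    obtaining a work operation record, the work operation record including the concrete operation and operation time of each step;
    screening out special operations and ordinary operations according to the concrete operation of each step;
    converting the special operations and ordinary operations into abstract operations, and establishing a mapping between the concrete operations and the abstract operations;
    arranging all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence;
    extracting repeated abstract operation combinations from the abstract operation sequence, and obtaining the concrete operation combinations and operation times corresponding to the abstract operation combinations.
  2. The method according to claim 1, characterized in that screening out special operations and ordinary operations according to the concrete operation of each step comprises:
    if the element content or command included in the concrete operation is any one of copy, paste, and save, determining that the concrete operation is a special operation;
    if the concrete operation does not include an application name, or the application name or window name it includes is a designated name, determining that the concrete operation is an ignorable operation;
    concrete operations other than the special operations and the ignorable operations are ordinary operations.
  3. The method according to claim 1, characterized in that converting the special operations and ordinary operations into abstract operations comprises:
    storing each special operation by its operation name to obtain the abstract operation corresponding to the special operation;
    storing each ordinary operation by its application name and fixed window name to obtain the abstract operation corresponding to the ordinary operation.
  4. The method according to claim 1, characterized in that, before extracting repeated abstract operation combinations from the abstract operation sequence, the method further comprises:
    filtering the abstract operation sequence to remove abstract operations whose occurrence frequency in the abstract operation sequence satisfies a first preset condition.
  5. The method according to claim 4, characterized in that filtering the abstract operation sequence to remove abstract operations whose occurrence frequency in the abstract operation sequence satisfies the first preset condition comprises:
    deleting abstract operations whose occurrence frequency in the abstract operation sequence is less than a first preset value, to obtain an updated abstract operation sequence;
    finding, in the updated abstract operation sequence, the target abstract operation whose preceding link and following link both have an occurrence frequency less than a second preset value and whose preceding-link and following-link occurrence frequencies have the smallest sum;
    randomly deleting one target abstract operation from the updated abstract operation sequence, and repeating the above steps until no deletable abstract operation remains.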
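  6. The method according to claim 1, characterized in that extracting repeated abstract operation combinations from the abstract operation sequence comprises:
    merging identical abstract operations in the abstract operation sequence into one subsequence to obtain a subsequence list, and recording, in an information lookup table, the occurrence frequency of each subsequence and the time positions at which it occurs in the abstract operation sequence;
    according to the information lookup table, selecting from the subsequence list, one at a time, a target subsequence that satisfies a second preset condition, and extending the target subsequence toward the front and toward the back to obtain extended subsequences;
    selecting one extended subsequence that satisfies a third preset condition and adding it to the subsequence list, and deleting from the subsequence list the subsequences that make up the extended subsequence, until the subsequence list is empty;
    if no extended subsequence satisfies the third preset condition, taking the selected target subsequence as a repeated abstract operation combination and deleting the target subsequence from the subsequence list, until the subsequence list is empty.
  7. The method according to claim 6, characterized in that selecting from the subsequence list, one at a time and according to the information lookup table, a target subsequence that satisfies the second preset condition comprises:
    according to the occurrence frequency of each subsequence recorded in the information lookup table, selecting from the subsequence list, each time, the subsequence with the highest occurrence frequency;
    if more than one subsequence has the highest occurrence frequency, selecting the longest of them as the target subsequence.
  8. The method according to claim 6, characterized in that extending the target subsequence toward the front and toward the back to obtain extended subsequences comprises:
    according to the time positions, recorded in the information lookup table, at which each subsequence occurs in the abstract operation sequence, extending the target subsequence at its time position by one abstract operation toward the front to obtain one extended subsequence, and by one abstract operation toward the back to obtain another extended subsequence.
  9. The method according to claim 6, characterized in that selecting one extended subsequence that satisfies the third preset condition and adding it to the subsequence list comprises:
    screening out the extended subsequences whose occurrence frequency is greater than or equal to a preset frequency, whose occurrence rate is greater than a preset rate, and whose extensibility rate is greater than a preset extensibility rate;
    selecting, from the screened extended subsequences, the one with the highest occurrence frequency and adding it to the subsequence list.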
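  10. A repeated operation extraction apparatus, characterized by comprising:
    a record acquisition module, configured to obtain a work operation record, the work operation record including the concrete operation and operation time of each step;
    an operation screening module, configured to screen out special operations and ordinary operations according to the concrete operation of each step;
    an operation abstraction module, configured to convert the special operations and ordinary operations into abstract operations and establish a mapping between the concrete operations and the abstract operations;
    an operation sequencing module, configured to arrange all abstract operations according to the operation time of the corresponding concrete operations to obtain an abstract operation sequence;
    a repetition extraction module, configured to extract repeated abstract operation combinations from the abstract operation sequence and obtain the concrete operation combinations and operation times corresponding to the abstract operation combinations.
  11. An electronic device, characterized in that the electronic device comprises:
    a processor;
    a memory for storing processor-executable instructions;
    wherein the processor is configured to execute the repeated operation extraction method according to any one of claims 1-9.
  12. A computer-readable storage medium, characterized in that the storage medium stores a computer program, the computer program being executable by a processor to carry out the repeated operation extraction method according to any one of claims 1-9.
  13. A computer software product, characterized in that the computer software product is stored in a computer-readable storage medium and includes instructions that, when executed by a processor, implement the method according to any one of claims 1-9.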
PCT/CN2023/084305 2022-08-15 2023-03-28 A repeated operation extraction method, electronic device, and storage medium WO2024036974A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210971903.9A CN115048282B (zh) 2022-08-15 2022-08-15 Repeated operation extraction method, electronic device, and storage medium
CN202210971903.9 2022-08-15

Publications (1)

Publication Number Publication Date
WO2024036974A1 true WO2024036974A1 (zh) 2024-02-22

Family

ID=83167962

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/084305 WO2024036974A1 (zh) 2022-08-15 2023-03-28 A repeated operation extraction method, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN115048282B (zh)
WO (1) WO2024036974A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115048282B (zh) * 2022-08-15 2022-10-25 北京弘玑信息技术有限公司 Repeated operation extraction method, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200206920A1 (en) * 2018-12-31 2020-07-02 Kofax, Inc. Systems and methods for identifying processes for robotic automation and building models therefor
US20210086354A1 (en) * 2019-09-19 2021-03-25 UiPath, Inc. Process understanding for robotic process automation (rpa) using sequence extraction
US20220114512A1 (en) * 2020-10-14 2022-04-14 Samsung Sds Co., Ltd. Method and apparatus for creating workflow based on log
US20220147386A1 (en) * 2020-11-12 2022-05-12 Automation Anywhere, Inc. Automated software robot creation for robotic process automation
WO2022136703A1 (de) * 2020-12-24 2022-06-30 Stuth Andre Verfahren und system zum überführen einer start-objektsituation in eine ziel-objektsituation (intuitive tacit solution finding)
CN115048282A (zh) * 2022-08-15 2022-09-13 北京弘玑信息技术有限公司 Repeated operation extraction method, electronic device, and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8140351B2 (en) * 2007-02-08 2012-03-20 Fht, Inc. Centralized sterile drug products distribution and automated management of sterile compounding stations
CN103984769A (zh) * 2014-06-04 2014-08-13 成都美美臣科技有限公司 Use case data management and storage method
CN105975324B (zh) * 2016-07-15 2021-03-23 爱普(福建)科技有限公司 Method for memorizing human-machine interface operating habits
EP3753684B1 (en) * 2019-06-21 2022-08-10 Robert Bosch GmbH Method and system for robot manipulation planning
CN112008766A (zh) * 2020-09-03 2020-12-01 国网江苏省电力有限公司南通供电分公司 Automated data re-collection method based on an RPA robot
CN113240395A (zh) * 2021-05-19 2021-08-10 上海起策教育科技有限公司 RPA robot control system based on a mail system
CN114445040A (zh) * 2022-01-21 2022-05-06 来也科技(北京)有限公司 Business process automation evaluation method and apparatus combining RPA and AI, and electronic device

Also Published As

Publication number Publication date
CN115048282B (zh) 2022-10-25
CN115048282A (zh) 2022-09-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23853887

Country of ref document: EP

Kind code of ref document: A1