CN117390336A - Webpage process automation method, device, equipment and storage medium - Google Patents

Webpage process automation method, device, equipment and storage medium

Info

Publication number
CN117390336A
Authority
CN
China
Prior art keywords
natural language
instruction
webpage
key value
agent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311498917.4A
Other languages
Chinese (zh)
Inventor
李明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pacific Insurance Technology Co Ltd
Original Assignee
Pacific Insurance Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pacific Insurance Technology Co Ltd
Priority to CN202311498917.4A
Publication of CN117390336A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application provides a webpage process automation method, device, equipment and storage medium, and relates to the technical field of artificial intelligence. The method includes: processing a natural language instruction acquired through a webpage to obtain the leaf nodes that the agent needs to operate and the instruction key value pairs, so that the agent automates the webpage process according to the leaf nodes and instruction key value pairs. Because key value pairs and leaf nodes are simple elements of webpage operation, the trained agent can still parse a natural language instruction into instruction key value pairs and operable leaf nodes after the webpage interface or application environment is replaced, and can therefore still automate the webpage process. This reduces the need to redeploy RPA in different application scenarios and lowers development cost.

Description

Webpage process automation method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for automating a web page process.
Background
With the continuing improvement of enterprise informatization, a large number of Robotic Process Automation (RPA) products have entered the market. By simulating human operations across electronic systems, RPA products can complete large volumes of repetitive work efficiently, carry out large-scale data integration, migration and processing with high efficiency and quality, and greatly improve office efficiency.
At present, RPA generalizes poorly. Each RPA flow corresponds only to a fixed sequence of keyboard and mouse operations and to fixed UI interaction positions, so each webpage operation requires its own RPA flow, which cannot be applied to other scenarios. In addition, if the style of a button or input box on the same webpage is changed, the RPA flow may fail to locate the UI element it needs to operate, causing misoperation and similar problems.
Therefore, when the webpage interface is replaced or RPA is used in a different application environment, the RPA needs to be redeployed, which takes a lot of time and is costly.
Disclosure of Invention
In view of this, the present application provides a web page process automation method and apparatus, which aims to make RPA applicable to multiple application scenarios, reduce redeployment of RPA in different application scenarios, and reduce development cost.
In a first aspect, the present application provides a web page process automation method, including:
acquiring a natural language instruction received by a webpage plug-in, wherein the natural language instruction comprises multiple types of instructions;
processing the natural language instruction to obtain a plurality of instruction key value pairs;
analyzing a document object model of the webpage according to the natural language instruction to obtain an operable leaf node in the webpage;
and inputting the plurality of instruction key value pairs and the operable leaf nodes into a pre-trained agent to automatically execute the natural language instruction.
Optionally, training the agent includes:
training the agent by taking a recurrent neural network as the core network of the agent, taking instruction key value pairs as input, operable leaf nodes as the state space, and clicking text boxes and inputting natural language characters as the action space.
Optionally, the acquiring the natural language instruction received by the webpage plugin includes:
and acquiring the natural language instruction received in a preset time period, and storing the natural language instruction received in the preset time period according to the received time sequence.
Optionally, when the natural language instruction is a download class instruction, the download class instruction includes a storage address after the downloading is completed, and the method further includes:
and after the downloading instruction is completed, storing the downloading content included in the downloading instruction according to the storage address.
Optionally, the operable leaf node includes:
one or more of a text box, a button, a drop down box, and a check box in a web page.
In a second aspect, the present application provides a web page process automation device, including:
an acquisition unit, configured to acquire natural language instructions received by a webpage plug-in, wherein the natural language instructions comprise multiple types of instructions;
the first processing unit is used for processing the natural language instruction to obtain a plurality of instruction key value pairs;
the second processing unit is used for analyzing the document object model of the webpage according to the natural language instruction to obtain an operable leaf node in the webpage;
and the input unit is used for inputting the plurality of instruction key value pairs and the operable leaf nodes into the pre-trained agent so as to automatically execute the natural language instruction.
Optionally, the device further includes a training unit, configured to train the agent by taking a recurrent neural network as the core network of the agent, taking instruction key value pairs as input, operable leaf nodes as the state space, and clicking text boxes and inputting natural language characters as the action space.
Optionally, the acquiring unit is specifically configured to:
and acquiring the natural language instruction received in a preset time period, and storing the natural language instruction received in the preset time period according to the received time sequence.
Optionally, when the natural language instruction is a download class instruction, the download class instruction includes a storage address to be used after the download is completed, and the device further includes a storage unit configured to store, after the download class instruction is completed, the download content included in the download class instruction according to the storage address.
Optionally, the operable leaf node includes:
one or more of a text box, a button, a drop down box, and a check box in a web page.
In a third aspect, the present application provides an apparatus comprising a memory for storing instructions or code and a processor for executing the instructions or code to cause the apparatus to perform the web page process automation method of any implementation of the first aspect.
In a fourth aspect, the present application provides a computer readable storage medium having code stored therein which, when executed, causes an apparatus running the code to implement the web page process automation method of any implementation of the first aspect.
The application provides a webpage process automation method. When the method is executed, a natural language instruction received by a webpage plug-in is first acquired, where the natural language instruction includes multiple types of instructions. The natural language instruction is then processed to obtain a plurality of instruction key value pairs, and the document object model of the webpage is parsed according to the natural language instruction to obtain the operable leaf nodes in the webpage. The instruction key value pairs and the operable leaf nodes are input into a pre-trained agent, which automatically executes the natural language instruction. In this way, processing the natural language instruction acquired through the webpage yields the leaf nodes that the agent needs to operate and the instruction key value pairs, and the agent automates the webpage process according to them. Because key value pairs and leaf nodes are simple elements of webpage operation, the trained agent can still parse a natural language instruction into instruction key value pairs and operable leaf nodes after the webpage interface or application environment is replaced. This reduces the need to redeploy RPA in different application scenarios and lowers development cost.
Drawings
In order to more clearly illustrate the present embodiments or the technical solutions in the prior art, the drawings that are required for the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a web page process automation method according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a web page process automation device according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
As described in the Background of the application, in prior-art web page process automation the entire operation flow of the RPA needs to be trained, and when the web page or the application scenario changes, it needs to be retrained. This takes a lot of time, so the cost is high.
In order to solve the above technical problems, an embodiment of the present application provides a web page process automation method, which includes:
first acquiring a natural language instruction received by a webpage plug-in, where the natural language instruction includes multiple types of instructions; then processing the natural language instruction to obtain a plurality of instruction key value pairs; parsing the document object model of the webpage according to the natural language instruction to obtain the operable leaf nodes in the webpage; and inputting the instruction key value pairs and the operable leaf nodes into a pre-trained agent so as to automatically execute the natural language instruction. In this way, processing the natural language instruction acquired through the webpage yields the leaf nodes that the agent needs to operate and the instruction key value pairs, and the agent automates the webpage process according to them. Because key value pairs and leaf nodes are simple elements of webpage operation, the trained agent can still parse a natural language instruction into instruction key value pairs and operable leaf nodes after the webpage interface or application environment is replaced. This reduces the need to redeploy RPA in different application scenarios and lowers development cost.
It should be noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data are required to comply with the related laws and regulations and standards of the related countries and regions.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Fig. 1 is a flowchart of a web page process automation method provided in an embodiment of the present application. Referring to fig. 1, the web page process automation method provided in the embodiment of the present application may include:
s101, acquiring a natural language instruction received by a webpage plug-in, wherein the natural language instruction comprises multiple types of instructions.
The webpage plug-in, also called a browser plug-in, extends the functions of the browser, enriches the browsing experience and meets different functional requirements. After the webpage plug-in of the embodiment of the application is started, a text box is displayed on the front-end webpage, and a natural language instruction can be entered through the text box. For example, the user may enter: "help me download the running bill for the last three months".
Natural language refers to a language used in daily life, such as Chinese, English or French. It develops naturally with human society rather than being artificially designed, and is an important tool for human learning and living. In general, natural language refers to the conventions of human society, as opposed to artificial languages such as programming languages.
The natural language instructions are of multiple types, and may specifically include download class instructions, fill-in class instructions and search class instructions. A download class instruction downloads a certain file; a fill-in class instruction fills in a certain form; a search class instruction searches for a certain item of content, such as data or a file.
S102, processing the natural language instruction to obtain a plurality of instruction key value pairs.
After a natural language instruction is obtained through the webpage plug-in at the front end, it is transmitted to the back end for processing. Specifically, the processing includes recognizing the natural language instruction with a natural language processing model.
A natural language processing model is a model capable of performing natural language processing. Natural language processing (NLP) is the means by which a computer processes, understands and applies human language, and is a branch of artificial intelligence. A natural language processing model can process the form, sound and meaning of natural language, that is, it can input, output, recognize, analyze, understand and generate characters, words, sentences and passages, enabling information exchange between humans and machines.
An instruction key value pair is a key value pair related to the instruction, obtained by processing the natural language instruction. A key value pair (key: value) is a simple correspondence in which each key is followed by its corresponding value, for example {time: three months}, where "time" is the key and "three months" is the corresponding value. Processing one natural language instruction may yield a plurality of key value pairs. For example, if the user enters "help me download the running bill for the last three months", the natural language processing model can produce the key value pairs "{time: three months, document type: running bill, order: download}".
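To make the shape of the output of S102 concrete, the following is a minimal sketch of turning an instruction into key value pairs. It is a rule-based stand-in, not the natural language processing model of the application: the function name `extract_key_values`, the keyword tables and the regular expression are all assumptions introduced for illustration.

```python
import re

# Hypothetical keyword tables; the application uses an NLP model instead.
ORDER_KEYWORDS = ("download", "fill", "search")
DOCUMENT_TYPES = ("running bill", "form", "report")
NUMBER_WORDS = r"one|two|three|four|five|six|seven|eight|nine|ten|\d+"


def extract_key_values(instruction: str) -> dict:
    """Turn a natural language instruction into instruction key value pairs."""
    text = instruction.lower()
    pairs = {}
    # Time span such as "three months" or "3 months".
    match = re.search(rf"\b(?:{NUMBER_WORDS})\s+(?:day|week|month|year)s?\b", text)
    if match:
        pairs["time"] = match.group(0)
    # First known document type mentioned in the instruction.
    for document_type in DOCUMENT_TYPES:
        if document_type in text:
            pairs["document type"] = document_type
            break
    # First known order (instruction type) mentioned in the instruction.
    for order in ORDER_KEYWORDS:
        if order in text:
            pairs["order"] = order
            break
    return pairs
```

Calling `extract_key_values("help me download running bill of last three months")` yields the three pairs used in the example above. A trained model would replace the keyword lookup, but the returned dictionary would have the same form.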
S103, analyzing the document object model of the webpage according to the natural language instruction to obtain the operable leaf nodes in the webpage.
The document object model (DOM) is a programming interface for web documents. The DOM represents a document as nodes and objects so that a programming language can interact with the page. In general, all content in a page is represented as nodes in the DOM document: element tags are element nodes, comments are comment nodes, text content is text nodes, the document itself is the document node, and so on. In this application, the operable controls in the DOM are referred to as leaf nodes, which include one or more of the text boxes, buttons, drop-down boxes and check boxes in the web page.
The operable leaf nodes in different application scenes are different, for example, when the natural language instruction is a download class instruction, the operable leaf nodes comprise text boxes, buttons and drop-down boxes in the webpage; when the natural language instruction is a fill-in class instruction, the actionable leaf node includes a text box, a button, a drop-down box, and a check box in a web page.
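The parsing in S103 can be sketched with Python's standard `html.parser` module. This is only an illustration under stated assumptions: the class `LeafNodeCollector`, the function `operable_leaf_nodes`, the mapping from HTML tags to control kinds, and the table `OPERABLE_BY_ORDER` are hypothetical; only the two cases in the table (download and fill-in) come from the text.

```python
from html.parser import HTMLParser

# Control kinds treated as operable per instruction type (the two cases in the text).
OPERABLE_BY_ORDER = {
    "download": {"text box", "button", "drop-down box"},
    "fill in": {"text box", "button", "drop-down box", "check box"},
}


class LeafNodeCollector(HTMLParser):
    """Collect operable controls: text boxes, buttons, drop-down boxes, check boxes."""

    def __init__(self):
        super().__init__()
        self.leaf_nodes = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        kind = None
        if tag == "button":
            kind = "button"
        elif tag == "select":
            kind = "drop-down box"
        elif tag == "textarea":
            kind = "text box"
        elif tag == "input":
            input_type = (attributes.get("type") or "text").lower()
            if input_type == "checkbox":
                kind = "check box"
            elif input_type in ("button", "submit"):
                kind = "button"
            elif input_type in ("text", "search", "date"):
                kind = "text box"
        if kind:
            self.leaf_nodes.append({"kind": kind, "id": attributes.get("id")})


def operable_leaf_nodes(html: str, order: str) -> list:
    """Return the operable leaf nodes relevant to the given instruction type."""
    collector = LeafNodeCollector()
    collector.feed(html)
    allowed = OPERABLE_BY_ORDER.get(order, set())
    return [node for node in collector.leaf_nodes if node["kind"] in allowed]
```

The filter depends only on tag names and attributes, not on position or styling, which is the property the text relies on for generalization across page layouts.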
S104, inputting the command key value pairs and the operable leaf nodes into an agent trained in advance so as to automatically execute the natural language command.
After obtaining the command key value pair and the operable leaf node, the pre-trained agent automatically executes the corresponding action, thereby completing the webpage operation.
The agent is based on reinforcement learning. After training, the agent has learned various atomic tasks, such as clicking a button, clicking a drop-down box, selecting a check box and inputting text, and any web page workflow, such as downloading files or filling in forms, can be split into a combination of these atomic tasks. An agent trained by reinforcement learning to handle atomic tasks can therefore generalize to other, more complex workflows. In addition, because the reinforcement learning agent operates directly on the DOM elements of the web page, it is insensitive to attributes such as the position and shape of UI elements, and therefore has strong style generalization capability.
Training the agent includes: training the agent by taking a recurrent neural network (Recurrent Neural Network, RNN) as the core network of the agent, taking instruction key value pairs as input, operable leaf nodes as the state space, and clicking text boxes and inputting natural language characters as the action space. For the reinforcement learning agent to recognize the UI component categories (buttons, text boxes, drop-down boxes, etc.) corresponding to various DOM elements and to learn to align instruction key value pairs with operable leaf nodes, data sets of DOM descriptions of UI components such as buttons, drop-down boxes and check boxes from different web pages need to be collected and labeled during the training phase. For example, the DOM of a search button may be expressed as "<button type="search">search</button>" or "<a href="#" class="button">search</a>"; both are labeled as a button. The richer the data set, the better the trained agent generalizes. A simulation environment is then generated in reverse from the DOM data sets, and the agent is trained in the simulation environment until it can determine the UI component class from the DOM description alone.
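One way to picture the labeled data set is as pairs of a DOM description and a UI component class, tokenized into the integer sequences that a recurrent network would consume. The sketch below is an assumption-laden illustration: the record type `LabeledSample`, the tokenizer, and the last two samples are hypothetical, and the two button samples are the examples from the text written as well-formed HTML. It shows data preparation only, not the training of the agent.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledSample:
    dom: str    # DOM description of one UI component
    label: str  # UI component class assigned during labeling


# Two button examples from the text, plus two hypothetical samples.
DATASET = [
    LabeledSample('<button type="search">search</button>', "button"),
    LabeledSample('<a href="#" class="button">search</a>', "button"),
    LabeledSample('<input type="checkbox" name="agree">', "check box"),
    LabeledSample('<select name="period"></select>', "drop-down box"),
]


def tokenize_dom(dom: str) -> list:
    """Split a DOM description into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", dom.lower())


def build_vocabulary(samples) -> dict:
    """Assign an integer id to every token seen in the data set."""
    vocabulary = {}
    for sample in samples:
        for token in tokenize_dom(sample.dom):
            vocabulary.setdefault(token, len(vocabulary))
    return vocabulary


def encode(sample: LabeledSample, vocabulary: dict) -> list:
    """Encode a sample as a token-id sequence for a sequence model."""
    return [vocabulary[token] for token in tokenize_dom(sample.dom)]
```

Both button samples share the tokens "button" and "search" despite using different tags, which hints at why a richer data set helps the agent recognize the same component class across different markup.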
A reinforcement learning agent takes actions by judging the state of the current environment. In the web page workflow, the state is the DOM information of the web page, and the actions are operations such as clicking, inputting and dragging. Therefore, when the web page behaves abnormally, the agent determines its output behavior from the changes in the DOM elements of the web page. If the abnormality does not affect the key operable DOM leaf nodes, the workflow is not interrupted, so the agent can cope with abnormal network conditions.
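The state and action description above can be illustrated with a toy environment. This is a sketch under assumptions, not the simulation environment of the application: the names `LeafNode` and `WebPageEnv`, the reward values, and the rule that clicking a button ends the episode are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LeafNode:
    node_id: str
    kind: str            # "button", "text box", ...
    value: str = ""
    clicked: bool = False


@dataclass
class WebPageEnv:
    """Toy environment whose state is the set of operable leaf nodes."""
    nodes: dict = field(default_factory=dict)

    def state(self) -> list:
        return [(n.node_id, n.kind, n.value, n.clicked) for n in self.nodes.values()]

    def step(self, action: str, node_id: str, text: str = ""):
        """Apply a 'click' or 'input' action; return (state, reward, done)."""
        node = self.nodes.get(node_id)
        if node is None:
            # Target node is absent, e.g. the page failed to render it.
            return self.state(), -1.0, False
        if action == "input" and node.kind == "text box":
            node.value = text
            return self.state(), 0.0, False
        if action == "click":
            node.clicked = True
            done = node.kind == "button"  # hypothetical: a button click ends the task
            return self.state(), (1.0 if done else 0.0), done
        return self.state(), -1.0, False  # action not applicable to this node
```

Because the state contains only leaf nodes, a page change that leaves those nodes intact produces the same state, so the agent's behavior is unaffected.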
In addition, an agent trained by reinforcement learning can quickly adapt to a new web page and complete the corresponding instructions. Because the agent takes the DOM elements of the web page as input during training, it is not affected by factors such as the layout, structure, color and style of the web page and depends only on the operable leaf nodes. It can therefore adapt quickly to web page interfaces with different visual styles, which avoids the situation where the agent fails to recognize a button or input box after its style or position in the web page changes.
In an implementation manner of the embodiment of the present application, the obtaining the natural language instruction received by the web page plug-in includes: and acquiring the natural language instruction received in a preset time period, and storing the natural language instruction received in the preset time period according to the received time sequence.
When multiple tasks need to be carried out through process automation, multiple task instructions can be acquired at one time, for example: "download the running bill of user a for the past year" and "download the running bill of user b for the past 3 months". The acquired instructions are stored in the order in which they were received, so that they can be completed one by one in that order.
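Storing instructions received within a preset time period in order of receipt could look like the following sketch. The class `InstructionQueue`, its explicit `start` and `window` parameters, and the use of a heap are assumptions made for illustration; the application does not specify a data structure.

```python
import heapq
import itertools
from datetime import datetime, timedelta


class InstructionQueue:
    """Keep instructions received within a time period, ordered by receipt time."""

    def __init__(self, start: datetime, window: timedelta):
        self.start = start
        self.window = window
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal timestamps

    def receive(self, instruction: str, received_at: datetime) -> bool:
        """Store the instruction if it falls inside the preset time period."""
        if not (self.start <= received_at < self.start + self.window):
            return False
        heapq.heappush(self._heap, (received_at, next(self._counter), instruction))
        return True

    def drain(self):
        """Yield stored instructions in the order they were received."""
        while self._heap:
            yield heapq.heappop(self._heap)[2]
```

Ordering by timestamp rather than by insertion means instructions are still executed in receipt order even if they reach the back end out of order.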
In an implementation manner of the embodiment of the present application, when the natural language instruction is a download class instruction, the download class instruction includes a storage address after the download is completed, and the method further includes:
and after the downloading instruction is completed, storing the downloading content included in the downloading instruction according to the storage address.
An example of a download class instruction is: "download the running bill for three months and store it on the desktop". The instruction can be completed by the web page process automation method, and when the storage address changes, the content is stored to the new target address. During training, the agent in this application has learned the actions for operating the leaf nodes in a web page, that is, operating small components by clicking a button, clicking a drop-down box, selecting a check box, inputting text and so on. Therefore, whether the application scenario or the layout of the web page changes, the agent can carry out the instruction by operating these components. Compared with RPA, which has to be redeployed for each scenario, the method in this embodiment of the application can reduce development cost.
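The final storage step of a download class instruction can be sketched as follows. The function `store_download` and its parameters are hypothetical, and the sketch assumes the downloaded content is already available as bytes and that the storage address is a local directory path.

```python
from pathlib import Path


def store_download(content: bytes, file_name: str, storage_address: str) -> Path:
    """Write downloaded content under the storage address given in the instruction."""
    target_dir = Path(storage_address).expanduser()
    # Create the target directory if the address does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / file_name
    target.write_bytes(content)
    return target
```

Because the address is an ordinary argument, changing the storage address in the instruction changes only this call, not the agent's behavior on the page.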
The above embodiment of the present application provides a web page process automation method, which includes: first acquiring a natural language instruction received by a webpage plug-in, where the natural language instruction includes multiple types of instructions; then processing the natural language instruction to obtain a plurality of instruction key value pairs; parsing the document object model of the webpage according to the natural language instruction to obtain the operable leaf nodes in the webpage; and inputting the instruction key value pairs and the operable leaf nodes into a pre-trained agent so as to automatically execute the natural language instruction. In this way, processing the natural language instruction acquired through the webpage yields the leaf nodes that the agent needs to operate and the instruction key value pairs, and the agent automates the webpage process according to them. Because key value pairs and leaf nodes are simple elements of webpage operation, the trained agent can still parse a natural language instruction into instruction key value pairs and operable leaf nodes after the webpage interface or application environment is replaced. This reduces the need to redeploy RPA in different application scenarios and lowers development cost.
The above is some specific implementation manners of a web page process automation method provided in the embodiments of the present application, and based on this, the present application further provides a corresponding device. The apparatus provided in the embodiments of the present application will be described from the viewpoint of functional modularization.
Fig. 2 is a schematic structural diagram of a web page flow automation device according to an embodiment of the present application. Referring to fig. 2, a web page process automation device 200 provided in an embodiment of the present application includes:
an obtaining unit 210, configured to obtain a natural language instruction received by the web plug-in, where the natural language instruction includes multiple types of instructions;
a first processing unit 220, configured to process the natural language instruction to obtain a plurality of instruction key value pairs;
a second processing unit 230, configured to parse a document object model of the web page according to the natural language instruction, to obtain an operable leaf node in the web page;
an input unit 240 for inputting the plurality of command key value pairs and the operable leaf nodes into a pre-trained agent to automatically execute the natural language command.
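The division of the device into units can be pictured as a small pipeline in which each unit is an injected callable. This is a structural sketch only: the class `WebPageAutomationDevice` and its field names are hypothetical, and the agent is represented by a plain function rather than a trained model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WebPageAutomationDevice:
    """Wire units 210-240 together; each unit is supplied as a callable."""
    obtain: Callable[[], str]                 # obtaining unit 210
    to_key_values: Callable[[str], dict]      # first processing unit 220
    parse_leaf_nodes: Callable[[str], list]   # second processing unit 230
    run_agent: Callable[[dict, list], list]   # input unit 240 feeding the agent

    def execute(self) -> list:
        """Run one instruction through the four units in order."""
        instruction = self.obtain()
        key_values = self.to_key_values(instruction)
        leaf_nodes = self.parse_leaf_nodes(instruction)
        return self.run_agent(key_values, leaf_nodes)
```

Injecting the units keeps each one replaceable, for example swapping a rule-based key value extractor for a trained model without touching the other units.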
In an implementation manner of the embodiment of the present application, the apparatus further includes a training unit, configured to train the agent by taking a recurrent neural network as the core network of the agent, taking instruction key value pairs as input, operable leaf nodes as the state space, and clicking text boxes and inputting natural language characters as the action space.
In an implementation manner of the embodiment of the present application, the obtaining unit is specifically configured to:
and acquiring the natural language instruction received in a preset time period, and storing the natural language instruction received in the preset time period according to the received time sequence.
In an implementation manner of the embodiment of the present application, when the natural language instruction is a download class instruction, the download class instruction includes a storage address to be used after the download is completed, and the apparatus further includes a storage unit configured to store, after the download class instruction is completed, the download content included in the download class instruction according to the storage address.
In one implementation of the embodiment of the present application, the operable leaf node includes:
one or more of a text box, a button, a drop down box, and a check box in a web page.
The embodiment of the application also provides corresponding equipment and a computer storage medium, which are used for realizing the scheme provided by the embodiment of the application.
As shown in fig. 3, the computer device 01 is in the form of a general purpose computing device. The components of the computer device 01 may include, but are not limited to: one or more processors or processing units 03, a system memory 08, and a bus 04 that connects the various system components (including the system memory 08 and processing units 03).
Bus 04 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 01 typically includes a variety of computer system readable media. Such media can be any available media that can be accessed by the computer device 01 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 08 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 09 and/or cache memory 10. The computer device 01 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 11 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, commonly referred to as a "hard disk drive"). Although not shown in fig. 3, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be coupled to bus 04 through one or more data medium interfaces. The memory 08 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the present application.
A program/utility 12 having a set (at least one) of program modules 13 may be stored in, for example, memory 08, such program modules 13 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 13 generally perform the functions and/or methods in the embodiments described herein.
The computer device 01 may also communicate with one or more external devices 02 (e.g., keyboard, pointing device, display 07, etc.), one or more devices that enable a user to interact with the computer device 01, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 01 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 06. Moreover, the computer device 01 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through the network adapter 05. As shown in fig. 3, the network adapter 05 communicates with other modules of the computer device 01 via bus 04. It should be appreciated that although not shown in fig. 3, other hardware and/or software modules may be used in connection with the computer device 01, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 03 executes various functional applications and data processing by running programs stored in the system memory 08, for example, to implement the methods provided by the embodiments of the present application.
From the above description of embodiments, it will be apparent to those skilled in the art that all or part of the steps of the above described example methods may be implemented in software plus general hardware platforms. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which may be stored in a storage medium, such as a read-only memory (ROM)/RAM, a magnetic disk, an optical disk, or the like, including several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a router) to perform the methods described in the embodiments or some parts of the embodiments of the present application.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
It should be further noted that the embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiments may be consulted. The apparatus and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present application without creative effort.
The foregoing is merely one specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A web page process automation method, the method comprising:
acquiring a natural language instruction received by a webpage plug-in, wherein the natural language instruction comprises multiple types of instructions;
processing the natural language instruction to obtain a plurality of instruction key value pairs;
analyzing a document object model of the webpage according to the natural language instruction to obtain an operable leaf node in the webpage;
and inputting the plurality of instruction key value pairs and the operable leaf nodes into a pre-trained agent to automatically execute the natural language instruction.
2. The method of claim 1, wherein training the agent comprises:
training the agent by taking a recurrent neural network as the core network of the agent, instruction key value pairs as input, operable leaf nodes as the state space, and clicking a text box and inputting natural language text as the action space.
3. The method of claim 1, wherein the obtaining the natural language instructions received by the web plug-in comprises:
acquiring the natural language instructions received within a preset time period, and storing the natural language instructions received within the preset time period in the order in which they were received.
4. The method of claim 1, wherein when the natural language instruction is a download class instruction, the download class instruction includes a storage address after the download is completed, the method further comprising:
after the download class instruction is completed, storing the download content covered by the download class instruction according to the storage address.
5. The method according to claim 1 or 2, wherein the operable leaf node comprises:
one or more of a text box, a button, a drop down box, and a check box in a web page.
6. A web page process automation device, the device comprising:
an acquisition unit, configured to acquire a natural language instruction received by a webpage plug-in, wherein the natural language instruction comprises multiple types of instructions;
the first processing unit is used for processing the natural language instruction to obtain a plurality of instruction key value pairs;
the second processing unit is used for analyzing the document object model of the webpage according to the natural language instruction to obtain an operable leaf node in the webpage;
and the input unit is used for inputting the plurality of instruction key value pairs and the operable leaf nodes into the pre-trained agent so as to automatically execute the natural language instruction.
7. The apparatus of claim 6, further comprising a training unit for training the agent with a recurrent neural network as the core network of the agent, instruction key value pairs as input, operable leaf nodes as the state space, and clicking a text box and inputting natural language text as the action space.
8. The apparatus according to claim 6, wherein the acquisition unit is specifically configured to:
acquire the natural language instructions received within a preset time period, and store the natural language instructions received within the preset time period in the order in which they were received.
9. A computing device, the computing device comprising: a memory, a processor;
the memory is used for storing a computer program;
the processor, when executing the computer program, is configured to implement the web page process automation method according to any one of claims 1 to 5.
10. A computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when executed by a processor, the computer program implements the web page process automation method according to any one of claims 1 to 5.
CN202311498917.4A 2023-11-10 2023-11-10 Webpage process automation method, device, equipment and storage medium Pending CN117390336A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311498917.4A CN117390336A (en) 2023-11-10 2023-11-10 Webpage process automation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311498917.4A CN117390336A (en) 2023-11-10 2023-11-10 Webpage process automation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117390336A true CN117390336A (en) 2024-01-12

Family

ID=89468265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311498917.4A Pending CN117390336A (en) 2023-11-10 2023-11-10 Webpage process automation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117390336A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117634867A (en) * 2024-01-26 2024-03-01 杭州实在智能科技有限公司 RPA flow automatic construction method and system combining large language model and reinforcement learning
CN117634867B (en) * 2024-01-26 2024-05-24 杭州实在智能科技有限公司 RPA flow automatic construction method and system combining large language model and reinforcement learning

Similar Documents

Publication Publication Date Title
US11107036B2 (en) Systems and methods for business processing modelling
Rossant IPython Interactive Computing and Visualization Cookbook: Over 100 hands-on recipes to sharpen your skills in high-performance numerical computing and data science in the Jupyter Notebook
Pauwels et al. Building an interaction design pattern language: A case study
Burns et al. A dataset for interactive vision-language navigation with unknown command feasibility
CN111949307B (en) Optimization method and system of open source project knowledge graph
Li et al. Autonomous GIS: the next-generation AI-powered GIS
US20110154288A1 (en) Automation Support For Domain Modeling
CN117390336A (en) Webpage process automation method, device, equipment and storage medium
Korinek Generative AI for economic research: Use cases and implications for economists
Khemakhem et al. Enhancing usability for automatically structuring digitised dictionaries
de Graaf et al. How organisation of architecture documentation affects architectural knowledge retrieval
Kuschke et al. Pattern-based auto-completion of UML modeling activities
US20200210855A1 (en) Domain knowledge injection into semi-crowdsourced unstructured data summarization for diagnosis and repair
Schneider et al. Argunet: A software tool for collaborative argumentation analysis and research
Hrendus et al. Developing an Intelligent Online Learning System for Foreign Language Vocabulary Training Based on Gamification.
Meth et al. Exploring design principles of task elicitation systems for unrestricted natural language documents
Tosic Artificial Intelligence-driven web development and agile project management using OpenAI API and GPT technology: A detailed report on technical integration and implementation of GPT models in CMS with API and agile web development for quality user-centered AI chat service experience
Nabuco et al. Inferring ui patterns with inductive logic programming
CN115437621A (en) Process editing method and device based on robot process automation
Schubanz Custom-MADE-Leveraging Agile Rationale Management by Employing Domain-Specific Languages.
Fatwanto Software requirements translation from natural language to object-oriented model
Aris Object-oriented programming semantics representation utilizing agents
CN112149399B (en) Table information extraction method, device, equipment and medium based on RPA and AI
CN112651246B (en) Service demand conflict detection method integrating deep learning and workflow modes
Visochek Practical Data Wrangling: Expert techniques for transforming your raw data into a valuable source for analytics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination