CN112231033A - Software interface element matching method and device combining RPA and AI - Google Patents

Software interface element matching method and device combining RPA and AI

Info

Publication number
CN112231033A
Authority
CN
China
Prior art keywords
interface
target element
software interface
information
current software
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011126599.5A
Other languages
Chinese (zh)
Inventor
张小勇
罗亮
褚瑞
李玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Benying Network Technology Co Ltd
Beijing Laiye Network Technology Co Ltd
Original Assignee
Beijing Benying Network Technology Co Ltd
Beijing Laiye Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Benying Network Technology Co Ltd and Beijing Laiye Network Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Character Input (AREA)

Abstract

The disclosure provides a matching method and device for software interface elements combining RPA and AI. The matching method provided by the embodiment comprises: extracting the interface elements in the current software interface by OCR technology; matching the feature information of the target element against the interface elements in the current software interface to obtain the distribution information of the target element on the current software interface; and performing the access operation on the target element according to the distribution information. The matching accuracy of interface elements on the software interface during robotic process automation can thereby be improved; the method is simple to implement, and its effect is stable and reliable.

Description

Software interface element matching method and device combining RPA and AI
Technical Field
The present disclosure relates to the field of automation technologies, in particular to Robotic Process Automation (RPA) and Artificial Intelligence (AI), and more particularly to a method and an apparatus for matching software interface elements that combine RPA and AI.
Background
In the field of Robotic Process Automation (RPA), in order to automate a process, a software robot needs to frequently access control elements (interface elements for short) on a software interface and operate on them to execute the corresponding tasks.
Artificial Intelligence (AI) is a new technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. Research in the field of artificial intelligence includes robotics, language recognition, image recognition, natural language processing, and expert systems, among others.
In the prior art, in order to ensure the accuracy of an automation process, when a software robot runs the process, the position of a target element needs to be accurately matched and the automation operation needs to be performed on the target element. In application scenarios such as remote desktop or virtual machine, interface elements are generally detected by computer vision technology, and feature attributes of the interface elements are extracted as matching bases of the interface elements during process operation.
However, such a matching method is not stable, and it is easy to cause matching errors or matching failures of the target elements, so that the accuracy of the automated process is low.
Disclosure of Invention
The invention provides a matching method and a matching device for software interface elements by combining RPA and AI, which can improve the matching accuracy of the interface elements on a software interface in the robot process automation process, and have the advantages of simple implementation mode and stable and reliable effect.
In a first aspect, the present disclosure provides a matching method for software interface elements combining RPA and AI, including:
extracting interface elements in the current software interface by adopting an Optical Character Recognition (OCR) technology;
matching the characteristic information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface;
and executing the access operation on the target element according to the distribution information.
In one possible design, the extracting interface elements in the current software interface by using OCR technology includes:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, matching the feature information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface includes:
searching a second anchor point element matched with the first anchor point element from the current software interface according to the category information, the position information and the text information corresponding to the first anchor point element;
determining the distribution information of the target element on the current software interface according to the position relationship between the target element and the first anchor point element and the position of the second anchor point element in the current software interface; the distribution information includes: coordinate information of at least one shape point of the target element, size information of the target element; wherein the shape points are used to define an area that the target element includes.
In one possible design, before matching the feature information of the target element with the interface element in the current software interface, the method further includes:
intercepting an interface image of a template software interface;
extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model;
selecting a target element from the candidate elements and a first anchor element associated with the target element; wherein the first anchor element comprises: any one or more of an icon element, a text element and a key element with invariable forms;
generating feature information of the target element according to the target element and the first anchor point element; the characteristic information of the target element comprises: the position relation between the target element and the first anchor element, and the category information, the position information and the text information corresponding to the first anchor element.
In one possible design, before performing the access operation on the target element according to the distribution information, the method further includes:
detecting the degree of overlap between the area corresponding to the distribution information and the interface elements in the current software interface to obtain an overlap value;
and if the overlap value is greater than a preset value, performing the access to the target element.
In one possible design, further comprising:
and if the overlap value is not greater than the preset value, determining that the target element is invalid, and feeding back matching failure prompt information.
In a second aspect, the present disclosure also provides a matching device for software interface elements combining RPA and AI, including:
the extraction module is used for extracting interface elements in the current software interface by adopting an Optical Character Recognition (OCR) technology;
the matching module is used for matching the characteristic information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface;
and the execution module is used for executing the access operation on the target element according to the distribution information.
In one possible design, the extraction module is specifically configured to:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the matching module is specifically configured to:
searching a second anchor point element matched with the first anchor point element from the current software interface according to the category information, the position information and the text information corresponding to the first anchor point element;
determining the distribution information of the target element on the current software interface according to the position relationship between the target element and the first anchor point element and the position of the second anchor point element in the current software interface; the distribution information includes: coordinate information of at least one shape point of the target element, size information of the target element; wherein the shape points are used to define an area that the target element includes.
In one possible design, further comprising: the acquisition module is used for intercepting an interface image of the template software interface before matching the characteristic information of the target element with the interface element in the current software interface;
extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model;
selecting a target element from the candidate elements and a first anchor element associated with the target element; wherein the first anchor element comprises: any one or more of an icon element, a text element and a key element with invariable forms;
generating feature information of the target element according to the target element and the first anchor point element; the characteristic information of the target element comprises: the position relation between the target element and the first anchor element, and the category information, the position information and the text information corresponding to the first anchor element.
In one possible design, further comprising: an overlap degree judgment module, configured to:
detecting the degree of overlap between the area corresponding to the distribution information and the interface elements in the current software interface to obtain an overlap value;
and if the overlap value is greater than a preset value, performing the access to the target element.
In one possible design, further comprising:
and the feedback module is used for determining that the target element is invalid and feeding back matching failure prompt information when the overlap value is not greater than the preset value.
In a third aspect, the present disclosure also provides an electronic device, including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform, by executing the executable instructions, any one of the matching methods of the first aspect for software interface elements combining RPA and AI.
In a fourth aspect, the disclosed embodiments also provide a storage medium on which a computer program is stored, where the program, when executed by a processor, implements any one of the matching methods of the first aspect for software interface elements combining RPA and AI.
The invention provides a matching method and a device of software interface elements combining RPA and AI, which extracts the interface elements in the current software interface by adopting an OCR technology; matching the characteristic information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface; and executing the access operation on the target element according to the distribution information. Therefore, the matching accuracy of the interface elements on the software interface in the robot process automation process can be improved, the implementation mode is simple, and the effect is stable and reliable.
Drawings
In order to illustrate the technical solutions in the embodiments of the present disclosure more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a diagram illustrating an application scenario of a matching method for software interface elements that combines RPA and AI according to an example embodiment of the present disclosure;
FIG. 2 is a flow diagram illustrating a method for matching software interface elements that combines RPA and AI according to an example embodiment of the present disclosure;
FIG. 3 is a flow diagram illustrating a method for matching software interface elements that combines RPA and AI according to another example embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram illustrating a matching apparatus for software interface elements that combines RPA and AI according to an example embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram illustrating a matching apparatus of a software interface element that combines RPA and AI according to another example embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an electronic device shown in the present disclosure according to an example embodiment.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present disclosure and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the field of Robotic Process Automation (RPA), in order to automate a process, a software robot needs to frequently access control elements (interface elements for short) on a software interface and operate on them to execute the corresponding tasks. In the prior art, in order to ensure the accuracy of an automated process, the software robot needs to accurately match the position of a target element when running the process and then perform the automated operation on that element. In application scenarios such as a remote desktop or a virtual machine, interface elements are generally detected by computer vision technology, and the feature attributes of the interface elements are extracted as the basis for matching them while the process runs. However, such a matching method is not stable and easily causes matching errors or matching failures for target elements, so that the accuracy of the automated process is low.
In view of the above technical problems, the present disclosure provides a matching method and device for software interface elements combining RPA and AI, which can improve the accuracy of matching interface elements on a software interface during robotic process automation, is simple to implement, and is stable and reliable in effect. FIG. 1 is an application scenario diagram of a matching method for software interface elements combining RPA and AI according to an example embodiment of the present disclosure. As shown in FIG. 1, the interface elements in a software interface mainly include text, icons, and controls. In general, a control element has a text element (Label) that identifies it. For example, a button typically contains a short text identifying its function (e.g., "OK" or "Cancel"), and an input box typically has a short text to its left or above it identifying its function (e.g., "username" or "password"). Therefore, when a matching search is performed on an interface element, the Label information that identifies it can be fully used for assistance. Such Label information is referred to as an "anchor point" in this disclosure. More generally, an "anchor point" can be understood as a reference point, similar to a landmark, that is morphologically stable (although its position may vary), easily recognizable, and globally unique. Here, an "anchor point" may be an icon or a piece of text.
Accordingly, text elements are detected by Optical Character Recognition (OCR) technology, which yields the position and character content of each piece of text in the interface; for icons and control elements, their positions and types in the interface can be detected by a deep learning object detection algorithm (such as SSD or Faster R-CNN).
Then, the software robot can search the current software interface for a second anchor element that matches the first anchor element, according to the category information, the position information, and the text information corresponding to the first anchor element, and determine the distribution information of the target element on the current software interface according to the positional relationship between the target element and the first anchor element and the position of the second anchor element in the current software interface. The first anchor element is an anchor of the template software interface, the second anchor element is an anchor of the current software interface, and an anchor element comprises any one or more of a morphologically invariant icon element, text element, or key element. If the anchor element is an icon, the matching search is performed by template matching; if the anchor element is a text, the matching search is performed by character string matching. In this way, a second anchor element that matches the first anchor element can be found in the current software interface. The distribution information of the target element on the current software interface is then determined by combining the positional relationship between the target element and the first anchor element in the template software interface with the position of the second anchor element in the current software interface, so that the area range of the target element can be determined and used as a candidate area. The distribution information of an interface element may be described by coordinate information of at least one shape point together with size information of the target element; a shape point may be a vertex or the center point of the interface element.
The distribution information of the rectangular interface element may be described by four vertices, and the distribution information of the circular interface element may be described by a center point. For example, a circular interface element (circular button), knowing the center position and the radius of the circle, the area of the interface element can be determined. According to the coordinate conversion relation between the coordinate information corresponding to the anchor point area and the coordinate information corresponding to the interface element, the coordinate of the shape point of the interface element can be quickly determined, and further the information such as the position coordinate, the size and the like of the interface element is determined.
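The coordinate conversion described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes rectangular elements, a pure translation between the template interface and the current interface (no scaling), and hypothetical names (`Box`, `locate_target`) that do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle: top-left corner (x, y) plus width and height."""
    x: float
    y: float
    w: float
    h: float

    def vertices(self):
        """Return the four shape points (vertices) in clockwise order."""
        return [
            (self.x, self.y),
            (self.x + self.w, self.y),
            (self.x + self.w, self.y + self.h),
            (self.x, self.y + self.h),
        ]


def locate_target(template_anchor: Box, template_target: Box, current_anchor: Box) -> Box:
    """Project the target's template position onto the current interface.

    Assumption: the interface is only shifted, so the anchor-to-target
    offset recorded on the template still holds on the current interface.
    """
    dx = template_target.x - template_anchor.x
    dy = template_target.y - template_anchor.y
    return Box(current_anchor.x + dx, current_anchor.y + dy,
               template_target.w, template_target.h)


# Template: "username" label at (100, 200), input box at (200, 195).
# Current interface: the same label was found at (130, 260).
candidate = locate_target(Box(100, 200, 80, 20), Box(200, 195, 240, 30), Box(130, 260, 80, 20))
print(candidate)
```

The returned box plays the role of the candidate area; a circular element could be handled the same way by translating its center point and keeping its radius.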
Finally, after the distribution information of the target element is acquired, the target element may be accessed, for example by picking up the target element and simulating operations on it. In a possible implementation, before the access operation is performed on the target element according to the distribution information, the method further includes: detecting the degree of overlap between the target element in the area corresponding to the distribution information and the interface elements in the current software interface to obtain an overlap value; and if the overlap value is greater than the preset value, determining that the target element is valid. If the overlap value is not greater than the preset value, the target element is determined to be invalid, and matching failure prompt information is fed back.
The method can improve the matching accuracy of the interface elements on the software interface in the robot process automation process, and has the advantages of simple implementation mode and stable and reliable effect.
Fig. 2 is a flowchart illustrating a matching method for software interface elements combining RPA and AI according to an example embodiment of the present disclosure, and as shown in fig. 2, the method provided in this embodiment may include:
Step 101, extracting interface elements in a current software interface by adopting an Optical Character Recognition (OCR) technology.
In this embodiment, the software robot may intercept an interface image of the current software interface. Then, all interface elements are extracted from the interface image through an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
Specifically, interface elements in the software interface mainly include text, icons, and controls. In general, a control element has a text element (Label) that identifies it. For example, a button typically contains a short text identifying its function (e.g., "OK" or "Cancel"), and an input box typically has a short text to its left or above it identifying its function (e.g., "username" or "password"). Therefore, when a matching search is performed on an interface element, the Label information that identifies it can be fully used for assistance. Such Label information is referred to as an "anchor point" in this disclosure. More generally, an "anchor point" can be understood as a reference point, similar to a landmark, that is morphologically stable (although its position may vary), easily recognizable, and globally unique. Here, an "anchor point" may be an icon or a piece of text. Accordingly, text elements are detected by Optical Character Recognition (OCR) technology, which yields the position and character content of each piece of text in the interface; for icons and control elements, their positions and types in the interface can be detected by a deep learning object detection algorithm (such as SSD or Faster R-CNN).
Step 102, matching the characteristic information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface.
In this embodiment, the software robot may search, from the current software interface, a second anchor point element that matches the first anchor point element according to the category information, the position information, and the text information corresponding to the first anchor point element; determining the distribution information of the target element on the current software interface according to the position relationship between the target element and the first anchor point element and the position of the second anchor point element in the current software interface; the distribution information includes: coordinate information of at least one shape point of the target element, size information of the target element; wherein the shape points are used to define the area that the target element contains.
Specifically, the first anchor element is an anchor point of the template software interface, the second anchor element is an anchor point of the current software interface, and an anchor element comprises any one or more of a morphologically invariant icon element, text element, or key element. If the anchor element is an icon, the matching search is performed by template matching; if the anchor element is a text, the matching search is performed by character string matching. In this way, a second anchor element that matches the first anchor element can be found in the current software interface. The distribution information of the target element on the current software interface is then determined by combining the positional relationship between the target element and the first anchor element in the template software interface with the position of the second anchor element in the current software interface, so that the area range of the target element can be determined and used as a candidate area. The distribution information of an interface element may be described by coordinate information of at least one shape point together with size information of the target element; a shape point may be a vertex or the center point of the interface element. The distribution information of a rectangular interface element may be described by four vertices, and the distribution information of a circular interface element may be described by a center point. For example, for a circular interface element (a circular button), the area of the interface element can be determined once the center position and the radius of the circle are known.
According to the coordinate conversion relation between the coordinate information corresponding to the anchor point area and the coordinate information corresponding to the interface element, the coordinate of the shape point of the interface element can be quickly determined, and further the information such as the position coordinate, the size and the like of the interface element is determined.
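The search for a second anchor element in the text case can be sketched as below. This is only an illustration under stated assumptions: the element record (`Element`), the function name (`find_second_anchor`), the use of fuzzy string similarity instead of exact string matching (to tolerate OCR noise), and the 0.8 similarity cut-off are all hypothetical choices, not details taken from the disclosure. Icon anchors would need image template matching, which is not shown.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    """One extracted interface element (hypothetical record layout)."""
    category: str  # e.g. "text", "icon", "button"
    text: str      # recognized text, empty for icons
    x: float       # top-left corner
    y: float
    w: float
    h: float


def find_second_anchor(first_anchor: Element, elements: List[Element],
                       min_similarity: float = 0.8) -> Optional[Element]:
    """Return the element on the current interface that best matches the
    template anchor, using category, text, and position information."""
    best, best_key = None, None
    for el in elements:
        if el.category != first_anchor.category:
            continue  # category information must agree
        sim = SequenceMatcher(None, first_anchor.text, el.text).ratio()
        if sim < min_similarity:
            continue  # text information too different
        # Position information breaks ties: prefer the closest candidate.
        dist = abs(el.x - first_anchor.x) + abs(el.y - first_anchor.y)
        key = (-sim, dist)
        if best_key is None or key < best_key:
            best, best_key = el, key
    return best


template_anchor = Element("text", "Username", 100, 200, 80, 20)
current_elements = [
    Element("text", "Password", 130, 300, 80, 20),
    Element("text", "Username", 130, 260, 80, 20),
    Element("button", "Username", 0, 0, 10, 10),
]
match = find_second_anchor(template_anchor, current_elements)
print(match)
```

If no element passes the category and text checks, the function returns `None`, which a caller could treat as a matching failure.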
Step 103, executing the access operation on the target element according to the distribution information.
In this embodiment, after the distribution information of the target element is acquired, the target element may be accessed, for example by picking up the target element and simulating operations on it.
In a possible implementation, before the access operation is performed on the target element according to the distribution information, the method further includes: detecting the degree of overlap between the area corresponding to the distribution information and the interface elements in the current software interface to obtain an overlap value; and if the overlap value is greater than the preset value, performing the access to the target element.
Specifically, Intersection over Union (IoU) overlap detection is performed between the obtained candidate region and the interface elements extracted in step 101. If the IoU result is greater than the set threshold, the candidate region is considered valid.
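For reference, the overlap check can be sketched with the standard Intersection over Union formula for axis-aligned rectangles. The box layout `(x1, y1, x2, y2)`, the function names, and the 0.5 threshold are assumptions made for illustration; the disclosure does not specify them.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_valid_candidate(candidate, elements, threshold=0.5):
    """A candidate region is valid if it overlaps some extracted
    interface element with an IoU above the preset threshold."""
    return any(iou(candidate, el) > threshold for el in elements)


candidate = (100, 100, 200, 130)
extracted = [(0, 0, 50, 50), (102, 101, 202, 131)]
print(is_valid_candidate(candidate, extracted))
```

A candidate that fails this check would correspond to the matching-failure case, in which a prompt is fed back instead of the element being accessed.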
In another possible implementation, if the overlap value is not greater than the preset value, the candidate region is determined to be invalid, and a matching failure prompt message is fed back.
In a possible implementation, when there are multiple anchor elements and the candidate regions determined from the individual anchor elements differ, IoU overlap detection is performed between the obtained candidate regions and the interface elements extracted in step 101, the candidate regions whose IoU results are greater than the set threshold are determined, and the access operation is performed on the interface elements matched with those candidate regions.
In a possible implementation, when there are multiple anchor elements and the candidate regions determined from the individual anchor elements differ, IoU overlap detection is performed between the obtained candidate regions and the interface elements extracted in step 101, the interface element with the highest comprehensive matching degree across the candidate regions is determined, and the access operation is performed on that interface element. The comprehensive matching degree between an interface element and the candidate regions may be the sum of the matching degrees between the interface element and each candidate region, or may be determined in another preset manner, which is not limited in this application.
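The sum-based variant of the comprehensive matching degree can be sketched as follows. The sketch assumes that the per-region matching degree is the IoU and that boxes are `(x1, y1, x2, y2)` tuples; the function names are hypothetical, and other preset scoring rules are equally possible.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_element(candidates, elements):
    """Pick the extracted element whose summed IoU over all candidate
    regions is highest. Returns (index, score), or None if no elements."""
    if not elements:
        return None
    scores = [sum(iou(c, el) for c in candidates) for el in elements]
    idx = max(range(len(elements)), key=scores.__getitem__)
    return idx, scores[idx]


# Two anchors produced two slightly different candidate regions.
candidates = [(100, 100, 200, 130), (105, 102, 205, 132)]
extracted = [(0, 0, 50, 50), (102, 101, 202, 131)]
result = best_element(candidates, extracted)
print(result)
```

Summing rewards an element that every anchor agrees on; a caller may still want to compare the winning score against a preset value before accessing the element.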
Specifically, the software robot can also notify the user with a prompt message when matching fails. A matching failure means that the overlap value between the target element in the area corresponding to the distribution information and the interface elements in the current software interface is not greater than the preset value.
In this embodiment, interface elements in the current software interface are extracted; the feature information of the target element is matched against the interface elements in the current software interface to obtain the distribution information of the target element on the current software interface; and the access operation on the target element is executed according to the distribution information. This can improve the accuracy of matching interface elements on a software interface during robotic process automation; the implementation is simple, and the results are stable and reliable.
Fig. 3 is a flowchart illustrating a matching method of software interface elements combining an RPA and an AI according to another exemplary embodiment of the present disclosure, and as shown in fig. 3, the method provided in this embodiment may include:
Step 201, acquiring feature information of a first anchor element and a target element of a template software interface.
In this embodiment, an interface image of the template software interface may be captured; all interface elements are extracted from the interface image of the template software interface as candidate elements by Optical Character Recognition (OCR) technology or a pre-trained deep learning model; a target element and a first anchor element associated with the target element are selected from the candidate elements, wherein the first anchor element comprises any one or more of an icon element, a text element and a key element whose form does not change; and feature information of the target element is generated according to the target element and the first anchor element. The feature information of the target element includes: the positional relationship between the target element and the first anchor element, and the category information, position information and text information corresponding to the first anchor element.
Specifically, an interface image of the template software interface may be captured. Text elements are detected by OCR technology, which yields the position and character content of each piece of text in the interface; for icon and control elements, their positions and types in the interface can be detected by a deep learning object detection algorithm (such as SSD or Faster R-CNN). All extracted interface elements are taken as candidate elements, from which the target element to be operated on and the anchor elements that assist in locating the target element are designated. Taking a mailbox login interface as an example, the input box control is the target element to be operated on, and text such as "user name" or "password" can be selected as the anchor element. Feature information is generated from the target element, the anchor elements and related information, and is stored in the RPA process source code. The feature information mainly comprises the category and position of the target element, and the type, position and text content of the anchor elements. During matching, the anchor elements can be matched first, and the position of the target element on the current software interface is then determined from the matched anchor elements. The specific matching implementation is not described herein again.
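To make the structure of the feature information concrete, the following is a minimal Python sketch using the mailbox login example. It is an illustration under stated assumptions, not the format used by the present disclosure: the `Element` dataclass, the function `build_feature_info`, the field names, and all coordinates are hypothetical, and JSON is chosen here only as one way to store the result alongside an RPA process.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Element:
    category: str   # e.g. "text", "icon", "input_box" (assumed labels)
    x: int          # top-left corner, in pixels
    y: int
    w: int          # width and height, in pixels
    h: int
    text: str = ""  # OCR text content; empty for icons and controls


def build_feature_info(target: Element, anchor: Element) -> dict:
    """Bundle the target, its anchor, and their positional relationship."""
    return {
        "target": asdict(target),   # category and position of the target
        "anchor": asdict(anchor),   # category, position and text of the anchor
        # Positional relationship: offset of the target from the anchor.
        "offset": {"dx": target.x - anchor.x, "dy": target.y - anchor.y},
    }


# Made-up template coordinates for a login form.
anchor = Element("text", 100, 200, 80, 20, "Username")
target = Element("input_box", 200, 195, 240, 30)
feature_info = build_feature_info(target, anchor)
serialized = json.dumps(feature_info)  # could be stored with the RPA process
```

Storing the offset rather than only absolute coordinates is what allows the target to be found again when the anchor moves on the current interface.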
Step 202, extracting interface elements in the current software interface by adopting an OCR technology.
Step 203, matching the feature information of the target element with the interface elements in the current software interface to obtain the distribution information of the target element on the current software interface.
Step 204, executing the access operation on the target element according to the distribution information.
For the specific implementation process and technical principle of steps 202 to 204 in this embodiment, refer to the related description of steps 101 to 103 in the method shown in Fig. 2; details are not repeated here.
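The order of steps 202 to 204 can be sketched as a small driver function. This is a hypothetical outline only: the name `run_flow` and the three callables are assumptions introduced for illustration, and the actual extraction, matching and access logic is left to whatever components are plugged in.

```python
from typing import Callable, Optional


def run_flow(extract: Callable[[], list],
             match: Callable[[list], Optional[dict]],
             access: Callable[[dict], None]) -> bool:
    """Run extract -> match -> access; return False if matching fails."""
    elements = extract()              # step 202: extract interface elements
    distribution = match(elements)    # step 203: locate the target element
    if distribution is None:
        return False                  # caller may feed back a failure prompt
    access(distribution)              # step 204: e.g. click or type
    return True
```

Passing the steps in as callables keeps the control flow separate from any particular OCR engine or input-automation library.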
In this embodiment, interface elements in the current software interface are extracted; the feature information of the target element is matched against the interface elements in the current software interface to obtain the distribution information of the target element on the current software interface; and the access operation on the target element is executed according to the distribution information. This can improve the accuracy of matching interface elements on a software interface during robotic process automation; the implementation is simple, and the results are stable and reliable.
In addition, this implementation can also capture an interface image of the template software interface; extract all interface elements from the interface image of the template software interface as candidate elements by Optical Character Recognition (OCR) technology or a pre-trained deep learning model; select from the candidate elements a target element and a first anchor element associated with the target element, wherein the first anchor element comprises any one or more of an icon element, a text element and a key element whose form does not change; and generate feature information of the target element according to the target element and the first anchor element. The feature information of the target element includes: the positional relationship between the target element and the first anchor element, and the category information, position information and text information corresponding to the first anchor element.
Fig. 4 is a schematic structural diagram of a matching apparatus for software interface elements combining RPA and AI according to an example embodiment of the present disclosure. As shown in Fig. 4, the apparatus provided in this embodiment may include:
an extracting module 31, configured to extract an interface element in a current software interface by using an Optical Character Recognition (OCR) technology;
the matching module 32 is configured to match the feature information of the target element with an interface element in the current software interface to obtain distribution information of the target element on the current software interface;
and the execution module 33 is configured to execute an access operation on the target element according to the distribution information.
In one possible design, the extraction module 31 is specifically configured to:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
In one possible design, the matching module 32 is specifically configured to:
searching a second anchor point element matched with the first anchor point element from the current software interface according to the category information, the position information and the text information corresponding to the first anchor point element;
determining the distribution information of the target element on the current software interface according to the position relationship between the target element and the first anchor point element and the position of the second anchor point element in the current software interface; the distribution information includes: coordinate information of at least one shape point of the target element, size information of the target element; wherein the shape points are used to define the area that the target element contains.
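The two operations of the matching module can be sketched in Python as follows. This is a minimal illustration under explicit assumptions, not the matching logic of the present disclosure: the `Element` dataclass and the functions `find_second_anchor` and `locate_target` are hypothetical, anchors are matched by exact category and text with the template position used only as a tie-breaker, and the current interface is assumed to have the same scale as the template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    category: str
    x: int          # top-left corner, in pixels
    y: int
    w: int
    h: int
    text: str = ""


def find_second_anchor(first_anchor: Element,
                       current: list[Element]) -> Optional[Element]:
    """Find the element on the current interface that matches the template anchor."""
    same = [e for e in current
            if e.category == first_anchor.category and e.text == first_anchor.text]
    if not same:
        return None
    # Several matches: prefer the one closest to the anchor's template position.
    return min(same, key=lambda e: (e.x - first_anchor.x) ** 2
                                   + (e.y - first_anchor.y) ** 2)


def locate_target(template_target: Element, first_anchor: Element,
                  current: list[Element]) -> Optional[dict]:
    """Return distribution info (one shape point plus size) or None on failure."""
    second = find_second_anchor(first_anchor, current)
    if second is None:
        return None
    # Positional relationship recorded on the template: target minus anchor.
    dx = template_target.x - first_anchor.x
    dy = template_target.y - first_anchor.y
    return {"shape_point": (second.x + dx, second.y + dy),   # top-left corner
            "size": (template_target.w, template_target.h)}
```

Here the top-left corner serves as the single shape point; together with the size it defines the rectangular area of the target element.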
The apparatus provided in this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 2, and the implementation principle and the technical effect are similar, which are not described herein again.
In this embodiment, interface elements in the current software interface are extracted; the feature information of the target element is matched against the interface elements in the current software interface to obtain the distribution information of the target element on the current software interface; and the access operation on the target element is executed according to the distribution information. This can improve the accuracy of matching interface elements on a software interface during robotic process automation; the implementation is simple, and the results are stable and reliable.
On the basis of the embodiment shown in Fig. 4, Fig. 5 is a schematic structural diagram of a matching apparatus for software interface elements combining RPA and AI according to another exemplary embodiment of the present disclosure. As shown in Fig. 5, the apparatus provided in this embodiment further includes:
the obtaining module 34 is configured to intercept an interface image of the template software interface before matching feature information of the target element with an interface element in the current software interface;
extracting all interface elements from an interface image of a template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model;
selecting a target element from the candidate elements and a first anchor element associated with the target element; wherein the first anchor element comprises: any one or more of an icon element, a text element and a key element with invariable forms;
generating characteristic information of the target element according to the target element and the first anchor point element; the characteristic information of the target element includes: the position relation between the target element and the first anchor element, and the category information, the position information and the text information corresponding to the first anchor element.
In one possible design, further comprising: an overlap determination module 35, configured to:
detecting the degree of overlap between the region corresponding to the distribution information and the interface elements in the current software interface to obtain an overlap value;
and if the overlap value is greater than the preset value, executing the access operation on the target element.
In one possible design, further comprising:
and the feedback module 36 is configured to determine that the target element is invalid and feed back a matching failure prompt message when the overlap value is not greater than the preset value.
The apparatus provided in this embodiment may be used to implement the technical solutions of the method embodiments shown in fig. 2 and fig. 3, and the implementation principles and technical effects are similar, which are not described herein again.
In this embodiment, interface elements in the current software interface are extracted; the feature information of the target element is matched against the interface elements in the current software interface to obtain the distribution information of the target element on the current software interface; and the access operation on the target element is executed according to the distribution information. This can improve the accuracy of matching interface elements on a software interface during robotic process automation; the implementation is simple, and the results are stable and reliable.
Fig. 6 is a schematic structural diagram of an electronic device according to an example embodiment of the present disclosure. As shown in Fig. 6, the present embodiment provides an electronic device 40, including:
a processor 401; and
a memory 402 for storing executable instructions of the processor, where the memory may also be a flash memory;
wherein the processor 401 is configured to perform the respective steps of the above-described method via execution of the executable instructions. Reference may be made in particular to the description relating to the preceding method embodiments.
Optionally, the memory 402 may be separate from or integrated with the processor 401.
When the memory 402 is a device independent of the processor 401, the electronic device 40 may further include:
a bus 403 for connecting the processor 401 and the memory 402.
The present embodiment also provides a readable storage medium, in which a computer program is stored, and when at least one processor of the electronic device executes the computer program, the electronic device executes the methods provided by the above various embodiments.
The present embodiment also provides a program product comprising a computer program stored in a readable storage medium. The computer program can be read from a readable storage medium by at least one processor of the electronic device, and the execution of the computer program by the at least one processor causes the electronic device to implement the methods provided by the various embodiments described above.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present disclosure, and not for limiting the same; while the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.

Claims (14)

1. A matching method for software interface elements combining RPA and AI is characterized by comprising the following steps:
extracting interface elements in a current software interface by adopting an Optical Character Recognition (OCR) technology;
matching the characteristic information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface;
and executing the access operation on the target element according to the distribution information.
2. The method of claim 1, wherein the extracting interface elements in the current software interface by using OCR technology comprises:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
3. The method of claim 1, wherein matching the feature information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface comprises:
searching a second anchor point element matched with the first anchor point element from the current software interface according to the category information, the position information and the text information corresponding to the first anchor point element;
determining the distribution information of the target element on the current software interface according to the position relationship between the target element and the first anchor point element and the position of the second anchor point element in the current software interface; the distribution information includes: coordinate information of at least one shape point of the target element, size information of the target element; wherein the shape points are used to define an area that the target element includes.
4. The method of claim 3, further comprising, prior to matching the feature information of the target element with the interface element in the current software interface:
intercepting an interface image of a template software interface;
extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model;
selecting a target element from the candidate elements and a first anchor element associated with the target element; wherein the first anchor element comprises: any one or more of an icon element, a text element and a key element with invariable forms;
generating feature information of the target element according to the target element and the first anchor point element; the characteristic information of the target element comprises: the position relation between the target element and the first anchor element, and the category information, the position information and the text information corresponding to the first anchor element.
5. The method according to any one of claims 1-4, further comprising, prior to performing an access operation to the target element according to the distribution information:
detecting the overlapping degree of the area corresponding to the distribution information and the interface element in the current software interface to obtain an overlapping threshold value;
and if the overlapping threshold value is larger than a preset value, executing the access to the target element.
6. The method of claim 5, further comprising:
and if the overlapping threshold value is not larger than a preset value, determining that the target element is invalid, and feeding back matching failure prompt information.
7. A matching device for software interface elements that combine RPA and AI, comprising:
the extraction module is used for extracting interface elements in the current software interface by adopting an Optical Character Recognition (OCR) technology;
the matching module is used for matching the characteristic information of the target element with the interface element in the current software interface to obtain the distribution information of the target element on the current software interface;
and the execution module is used for executing the access operation on the target element according to the distribution information.
8. The apparatus according to claim 7, wherein the extraction module is specifically configured to:
intercepting an interface image of a current software interface;
and extracting all interface elements from the interface image by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model.
9. The apparatus of claim 7, wherein the matching module is specifically configured to:
searching a second anchor point element matched with the first anchor point element from the current software interface according to the category information, the position information and the text information corresponding to the first anchor point element;
determining the distribution information of the target element on the current software interface according to the position relationship between the target element and the first anchor point element and the position of the second anchor point element in the current software interface; the distribution information includes: coordinate information of at least one shape point of the target element, size information of the target element; wherein the shape points are used to define an area that the target element includes.
10. The apparatus of claim 9, further comprising: the acquisition module is used for intercepting an interface image of the template software interface before matching the characteristic information of the target element with the interface element in the current software interface;
extracting all interface elements from the interface image of the template software interface as candidate elements by an Optical Character Recognition (OCR) technology or a pre-trained deep learning model;
selecting a target element from the candidate elements and a first anchor element associated with the target element; wherein the first anchor element comprises: any one or more of an icon element, a text element and a key element with invariable forms;
generating feature information of the target element according to the target element and the first anchor point element; the characteristic information of the target element comprises: the position relation between the target element and the first anchor element, and the category information, the position information and the text information corresponding to the first anchor element.
11. The apparatus of any one of claims 7-10, further comprising: an overlap degree judgment module, configured to:
detecting the overlapping degree of the area corresponding to the distribution information and the interface element in the current software interface to obtain an overlapping threshold value;
and if the overlapping threshold value is larger than a preset value, executing the access to the target element.
12. The apparatus of claim 11, further comprising:
and the feedback module is used for determining that the target element is invalid and feeding back matching failure prompt information when the overlapping threshold value is not larger than a preset value.
13. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the matching method of software interface elements incorporating RPA and AI of any of claims 1 to 6 via execution of the executable instructions.
14. A storage medium on which a computer program is stored, which when executed by a processor implements the matching method of software interface elements combining RPA and AI according to any one of claims 1 to 6.
CN202011126599.5A 2019-12-23 2020-10-20 Software interface element matching method and device combining RPA and AI Pending CN112231033A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019113404970 2019-12-23
CN201911340497 2019-12-23

Publications (1)

Publication Number Publication Date
CN112231033A true CN112231033A (en) 2021-01-15

Family

ID=74117906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011126599.5A Pending CN112231033A (en) 2019-12-23 2020-10-20 Software interface element matching method and device combining RPA and AI

Country Status (1)

Country Link
CN (1) CN112231033A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268083A (en) * 2014-09-30 2015-01-07 上海联影医疗科技有限公司 Software automatic testing method and device
CN104899146A (en) * 2015-06-19 2015-09-09 安一恒通(北京)科技有限公司 Image matching technology based software stability test method and device
CN110413529A (en) * 2019-07-31 2019-11-05 中国工商银行股份有限公司 Applied to the test method of electronic equipment, device, calculate equipment and medium
US10474564B1 (en) * 2019-01-25 2019-11-12 Softesis Inc. Identifying user interface elements using element signatures

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722038A (en) * 2021-09-06 2021-11-30 北京字节跳动网络技术有限公司 Data matching method and device, computer equipment and storage medium
CN114035726A (en) * 2021-10-19 2022-02-11 四川新网银行股份有限公司 Method and system for robot process automation page element identification process
CN114035726B (en) * 2021-10-19 2023-12-22 四川新网银行股份有限公司 Method and system for robot flow automatic page element identification process
CN115061769A (en) * 2022-08-08 2022-09-16 杭州实在智能科技有限公司 Self-iteration RPA interface element matching method and system for supporting cross-resolution
CN115455227A (en) * 2022-09-20 2022-12-09 上海弘玑信息技术有限公司 Graphical interface element searching method, electronic device and storage medium
CN116185411A (en) * 2023-03-23 2023-05-30 苏州峰之鼎信息科技有限公司 RPA interface determination method, RPA interface determination device, computer equipment and storage medium
CN116185411B (en) * 2023-03-23 2024-04-30 苏州峰之鼎信息科技有限公司 RPA interface determination method, RPA interface determination device, computer equipment and storage medium
CN116168405A (en) * 2023-04-23 2023-05-26 杭州实在智能科技有限公司 Construction method and system of general RPA check box operation assembly

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 1902, 19th Floor, China Electronics Building, No. 3 Danling Road, Haidian District, Beijing

Applicant after: BEIJING LAIYE NETWORK TECHNOLOGY Co.,Ltd.

Applicant after: Laiye Technology (Beijing) Co.,Ltd.

Address before: 1902, 19 / F, China Electronics Building, 3 Danling Road, Haidian District, Beijing 100080

Applicant before: BEIJING LAIYE NETWORK TECHNOLOGY Co.,Ltd.

Country or region before: China

Applicant before: BEIJING BENYING NETWORK TECHNOLOGY Co.,Ltd.