CN113177168A - Positioning method based on Web element attribute characteristics - Google Patents

Publication number
CN113177168A
Authority
CN
China
Prior art keywords
attribute
target
elements
value
amap
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110474540.3A
Other languages
Chinese (zh)
Other versions
CN113177168B (en)
Inventor
刘春刚
许凯
赵东旭
田永军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yunda Information Technology Co ltd
Original Assignee
Shanghai Yunda Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yunda Information Technology Co ltd
Priority to CN202110474540.3A
Publication of CN113177168A
Application granted
Publication of CN113177168B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Transfer Between Computers (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention discloses a positioning method based on Web element attribute characteristics, comprising the following steps: step one: determining the attribute names to be verified according to the type of the target element; step two: acquiring the characteristic values of the target element; step three: constructing an XPath according to the element characteristic values. The method uses the characteristic attributes of the target element as positioning information. These attributes are independent of the DOM structure and the page display style, and they mostly indicate the business purpose of the element (for example, the characteristic attribute of a common submit button is type with the value submit), so they remain stable when the DOM structure or the page style changes. Different characteristic attributes are read for different element types, and a dynamic characteristic attribute extraction algorithm can obtain attribute values that are unique to the target element on the page, thereby achieving more robust element positioning.

Description

Positioning method based on Web element attribute characteristics
Technical Field
The invention relates to the technical field of element positioning, in particular to a positioning method based on Web element attribute characteristics.
Background
RPA stands for Robotic Process Automation. It is a way of working that replaces manual labor, relieving people of repetitive, tedious, and rule-based workflows. RPA is implemented as non-invasive automation: the target element is located with various recognition techniques rather than by injecting code.
Accurately locating elements is a basic core function of an RPA product. In the field of Web automation, traditional element positioning methods mostly use the traversal path from the root node to the target element in the page DOM structure as the characteristic value (for example, HTML -> BODY -> DIV -> DIV[2] -> SPAN). As modern Web pages display increasingly complex content with increasingly rich interaction, the current mainstream Web frameworks (React, Vue and Angular) improve performance with Virtual DOM or comparable dynamic rendering techniques, which achieve high-performance interactive display by dynamically adding, deleting, and changing DOM elements. Adding or deleting DOM elements dynamically changes the traversal path from the root node to the target element, so page automation can no longer locate the element correctly and the automated operation cannot be completed. In addition, as Web technology matures, Web development is becoming standardized and modular. To keep the appearance and interaction consistent across browsers (Chrome/Firefox/Edge), the same component may be rendered with different DOM elements and styles in different browser types, or even in different release versions of the same browser. As a result, page automation can also fail to locate elements correctly, and fail to complete the automated operation, when the user switches browsers or upgrades the browser version.
Therefore, a positioning method based on Web element attribute characteristics is proposed to solve the problems described above.
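The fragility of a positional path can be illustrated with a small sketch. It uses Python's standard `xml.etree.ElementTree` on two well-formed page snippets; the snippets, the `title='total'` attribute, and the helper name `locate` are assumptions made for illustration and do not come from the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before a framework re-render.
V1 = ET.fromstring(
    "<html><body><div><div/><div><span title='total'>42</span></div></div></body></html>"
)
# Same page after one extra <div/> was inserted before the target's parent.
V2 = ET.fromstring(
    "<html><body><div><div/><div/><div><span title='total'>42</span></div></div></body></html>"
)

# Structure-based locator: HTML -> BODY -> DIV -> DIV[2] -> SPAN, relative to <html>.
POSITIONAL = "./body/div/div[2]/span"
# Attribute-based locator: independent of where the element sits in the tree.
BY_ATTRIBUTE = ".//span[@title='total']"


def locate(root, path):
    """Return the text of the first element matching path, or None if nothing matches."""
    el = root.find(path)
    return None if el is None else el.text


print(locate(V1, POSITIONAL), locate(V2, POSITIONAL))      # the positional path breaks on V2
print(locate(V1, BY_ATTRIBUTE), locate(V2, BY_ATTRIBUTE))  # the attribute locator survives
```

The positional path depends on the number of sibling elements, so a single inserted node is enough to break it, while the attribute-based query is unaffected.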
Disclosure of Invention
The invention aims to provide a positioning method based on Web element attribute characteristics, so as to solve the problem described in the background: with most existing positioning methods, page automation fails to locate elements correctly and cannot complete the automated operation when the user switches browsers or upgrades the browser version.
To achieve this purpose, the invention provides the following technical solution: a Web element positioning method that preferentially selects characteristic attribute values for different elements and uses them to construct an XPath query string for the target element, the method comprising the following steps:
step one: determining the attribute names to be verified according to the type of the target element;
step two: acquiring the characteristic values of the target element;
step three: constructing an XPath according to the element characteristic values.
Preferably, in step one, when determining the attribute names to be verified, the element attributes are first divided into generic attributes and element-specific attributes, where:
the generic attributes are attributes that are common to HTML elements and frequently used, as follows: ID, the element ID attribute, which specifies a unique identifier for the element; Name, the element name attribute, which specifies the name of the element; Title, the element title attribute, which usually explains the purpose of the element and is the text shown in the tooltip when the mouse hovers over the element; Text, the text inside the element;
the element-specific attributes are attributes read according to the element type, as follows:
attribute specific to the input HTMLElement: type, which indicates the kind of content the input element accepts or whether the element performs a form submission;
attributes specific to the img HTMLElement: src, which specifies the file path of the embedded image; alt, which specifies the text description of the image, used as alternative text when the image cannot be displayed or read aloud to the user by a screen reader;
attribute specific to the iframe HTMLElement: src, the address of the embedded page;
attributes specific to the a HTMLElement: href, the URL or URL fragment that the hyperlink points to; target, which specifies where to display the linked resource;
finally, the attribute list AttrList is obtained from the generic attributes and the element-specific attributes of the target element.
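Step one can be sketched as a simple lookup. The attribute names below are the ones listed in the text; the lowercase spelling, the use of `"text"` as a pseudo-attribute for the inner text, and the function name `build_attr_list` are assumptions of this sketch rather than details given by the patent.

```python
from typing import Dict, List

# Generic attributes named in the text; "text" stands for the element's inner text.
GENERIC_ATTRS: List[str] = ["id", "name", "title", "text"]

# Element-specific attributes named in the text, keyed by lowercase tag name.
SPECIFIC_ATTRS: Dict[str, List[str]] = {
    "input": ["type"],
    "img": ["src", "alt"],
    "iframe": ["src"],
    "a": ["href", "target"],
}


def build_attr_list(tag: str) -> List[str]:
    """Return the candidate attribute names for an element with the given tag."""
    # Generic attributes come first, followed by any attributes specific to the tag.
    return GENERIC_ATTRS + SPECIFIC_ATTRS.get(tag.lower(), [])


print(build_attr_list("img"))
print(build_attr_list("div"))
```

Element types without an entry in the table fall back to the generic attributes only.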
Preferably, the specific process of acquiring the characteristic values of the target element in step two includes:
(1) constructing a characteristic attribute map AMap for the target element and an integer Count that records the number of elements on the whole page that match the current AMap, the initial value of Count being infinity;
(2) taking the attribute names K from AttrList one by one, acquiring the value V of the target element for the current attribute name K, and adding the pair (K, V) to AMap; searching the whole page for elements that match all attributes in AMap and recording the number N of matching elements; if N is not less than Count, adding the attribute K did not locate the target element more precisely, so K and its value V are deleted from AMap; otherwise, Count is updated to N; this step is repeated until all attributes in AttrList have been traversed or the attributes in AMap match only the target element;
(3) checking Count: if it is 1, the target element can be uniquely located through AMap, and the attribute values in AMap are the characteristic values of the target element; if it is greater than 1, other elements on the page also satisfy the attribute values in AMap and a further characteristic value is needed to determine the target element uniquely; in this case a depth-first traversal of the DOM tree is used to obtain the list ElementArray of all elements that satisfy the attribute values in AMap, and the position Index of the target element in ElementArray is recorded; Index together with all attribute values in AMap then serves as the characteristic value of the target element.
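The three sub-steps above can be sketched as follows. This is a minimal illustration on a well-formed snippet parsed with Python's standard `xml.etree.ElementTree`, not the patented implementation: the sample page, the helper names, the choice to skip attributes the target does not have, and the zero-based `Index` are all assumptions of the sketch.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical page with two forms; two inputs share name='q'.
PAGE = ET.fromstring(
    "<html><body>"
    "<form><input type='text' name='q'/><input type='submit' name='go'/></form>"
    "<form><input type='text' name='q'/></form>"
    "</body></html>"
)
ATTR_LIST = ["id", "name", "title", "text", "type"]  # AttrList for an input element


def attr_value(el, name):
    """Read attribute `name`; the pseudo-attribute "text" means the inner text."""
    if name == "text":
        return (el.text or "").strip() or None
    return el.get(name)


def matches(el, amap):
    """True if the element carries every (attribute, value) pair in AMap."""
    return all(attr_value(el, k) == v for k, v in amap.items())


def extract_features(root, target, attr_list):
    """Return (AMap, Count, Index); Index is None when AMap alone is unique."""
    amap, count = {}, math.inf                 # (1) empty AMap, Count = infinity
    for k in attr_list:                        # (2) try attributes one by one
        v = attr_value(target, k)
        if v is None:
            continue                           # assumption: skip absent attributes
        amap[k] = v
        n = sum(1 for el in root.iter() if matches(el, amap))
        if n >= count:
            del amap[k]                        # K did not narrow the match set
        else:
            count = n
        if count == 1:
            break                              # only the target matches
    index = None
    if count != 1:                             # (3) disambiguate by position
        # root.iter() walks the tree depth-first in document order.
        element_array = [el for el in root.iter() if matches(el, amap)]
        index = next(i for i, el in enumerate(element_array) if el is target)
    return amap, count, index


inputs = PAGE.findall(".//input")
print(extract_features(PAGE, inputs[1], ATTR_LIST))  # the submit input
print(extract_features(PAGE, inputs[2], ATTR_LIST))  # the second name='q' input
```

In this sample the submit input is identified by `name` alone, while the second `name='q'` input still matches two elements after all attributes are tried, so its position among the matches is recorded as the extra characteristic value.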
Preferably, in step three, XPath refers to the XML Path Language, which can be used to test whether a node in a document matches a given pattern; it provides a rich set of functions, flexibly supports matching on multiple attributes, and is supported by mainstream browsers.
Preferably, the element attribute characteristic values obtained in step two are used to construct the XPath string of the target element; because this XPath is a positioning characteristic value derived from the attributes of the element, the target element can still be located stably after the structure of the HTML page changes.
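One possible way to turn the characteristic values into an XPath string is sketched below. The patent does not specify the exact query shape, so the `//*` prefix, the `normalize-space(.)` test for the Text attribute, the zero-based `index` argument, and the function names are assumptions of this sketch.

```python
from typing import Dict, Optional


def xpath_literal(s: str) -> str:
    """Quote a string for XPath 1.0, which has no escape character."""
    if "'" not in s:
        return f"'{s}'"
    if '"' not in s:
        return f'"{s}"'
    # Both quote kinds present: join the pieces around each apostrophe with concat().
    parts = s.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def build_xpath(amap: Dict[str, str], index: Optional[int] = None) -> str:
    """Build an XPath from the attribute map and an optional zero-based index."""
    predicates = []
    for name, value in amap.items():
        if name == "text":
            predicates.append(f"normalize-space(.)={xpath_literal(value)}")
        else:
            predicates.append(f"@{name}={xpath_literal(value)}")
    path = "//*" + "".join(f"[{p}]" for p in predicates)
    if index is not None:
        # XPath positions are 1-based; the parentheses apply the position to
        # the whole match set in document order.
        path = f"({path})[{index + 1}]"
    return path


print(build_xpath({"name": "go"}))
print(build_xpath({"name": "q"}, index=1))
```

Because the query contains only attribute predicates and, when needed, a position among the matching elements, it does not encode the ancestors of the target element.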
Preferably, in steps one and two an element with distinctive characteristics is selected as an anchor element and its attribute characteristic set is generated, and in step three an XPath that uniquely locates the target element is generated according to the position of the target element relative to the anchor element.
Compared with the prior art, the invention has the following beneficial effects: the method uses the characteristic attributes of the target element as positioning information; these attributes are independent of the DOM structure and the page display style and mostly indicate the business purpose of the element (for example, the characteristic attribute of a common submit button is type with the value submit), so they remain stable when the DOM structure or the page style changes.
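The anchor-based variant can be sketched as appending one relative step to the anchor's XPath. The patent does not state how the relative position is encoded; using XPath axes such as `following-sibling`, and the names `ALLOWED_AXES` and `relative_xpath`, are assumptions of this sketch.

```python
# XPath 1.0 axes that this sketch accepts for the anchor-to-target step.
ALLOWED_AXES = {
    "child", "descendant", "parent", "ancestor",
    "following-sibling", "preceding-sibling", "following", "preceding",
}


def relative_xpath(anchor_xpath: str, axis: str, tag: str = "*", position: int = 1) -> str:
    """Append a step from the anchor element to the target element."""
    if axis not in ALLOWED_AXES:
        raise ValueError(f"unsupported axis: {axis}")
    if position < 1:
        raise ValueError("XPath positions are 1-based")
    return f"{anchor_xpath}/{axis}::{tag}[{position}]"


# Hypothetical page: a label with distinctive text sits right before an
# input that has no distinguishing attributes of its own.
anchor = "//label[normalize-space(.)='Username']"
print(relative_xpath(anchor, "following-sibling", "input"))
```

Here the label serves as the anchor because its text is distinctive, and the target is addressed as the first `input` that follows it among its siblings.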
Drawings
FIG. 1 is a schematic diagram illustrating a process of obtaining a feature attribute name of a target element according to the present invention;
FIG. 2 is a schematic diagram of a process of extracting a feature attribute of a target element by using a dynamic feature attribute extraction algorithm according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and all other embodiments obtained by a person of ordinary skill in the art without creative efforts based on the embodiments of the present invention belong to the protection scope of the present invention.
Referring to FIGS. 1-2, the present invention provides a technical solution: a Web element positioning method that preferentially selects characteristic attribute values for different elements and uses them to construct an XPath query string for the target element, the method comprising the following steps:
step one: determining the attribute names to be verified according to the type of the target element;
step two: acquiring the characteristic values of the target element;
step three: constructing an XPath according to the element characteristic values.
Further, in step one, when determining the attribute names to be verified, the element attributes are first divided into generic attributes and element-specific attributes, where:
the generic attributes are attributes that are common to HTML elements and frequently used, as follows: ID, the element ID attribute, which specifies a unique identifier for the element; Name, the element name attribute, which specifies the name of the element; Title, the element title attribute, which usually explains the purpose of the element and is the text shown in the tooltip when the mouse hovers over the element; Text, the text inside the element;
the element-specific attributes are attributes read according to the element type, as follows:
attribute specific to the input HTMLElement: type, which indicates the kind of content the input element accepts or whether the element performs a form submission;
attributes specific to the img HTMLElement: src, which specifies the file path of the embedded image; alt, which specifies the text description of the image, used as alternative text when the image cannot be displayed or read aloud to the user by a screen reader;
attribute specific to the iframe HTMLElement: src, the address of the embedded page;
attributes specific to the a HTMLElement: href, the URL or URL fragment that the hyperlink points to; target, which specifies where to display the linked resource;
finally, the attribute list AttrList is obtained from the generic attributes and the element-specific attributes of the target element.
Further, the specific process of acquiring the characteristic values of the target element in step two includes:
(1) constructing a characteristic attribute map AMap for the target element and an integer Count that records the number of elements on the whole page that match the current AMap, the initial value of Count being infinity;
(2) taking the attribute names K from AttrList one by one, acquiring the value V of the target element for the current attribute name K, and adding the pair (K, V) to AMap; searching the whole page for elements that match all attributes in AMap and recording the number N of matching elements; if N is not less than Count, adding the attribute K did not locate the target element more precisely, so K and its value V are deleted from AMap; otherwise, Count is updated to N; this step is repeated until all attributes in AttrList have been traversed or the attributes in AMap match only the target element;
(3) checking Count: if it is 1, the target element can be uniquely located through AMap, and the attribute values in AMap are the characteristic values of the target element; if it is greater than 1, other elements on the page also satisfy the attribute values in AMap and a further characteristic value is needed to determine the target element uniquely; in this case a depth-first traversal of the DOM tree is used to obtain the list ElementArray of all elements that satisfy the attribute values in AMap, and the position Index of the target element in ElementArray is recorded; Index together with all attribute values in AMap then serves as the characteristic value of the target element.
Furthermore, in step three, XPath refers to the XML Path Language, which can be used to test whether a node in a document matches a given pattern; it provides a rich set of functions, flexibly supports matching on multiple attributes, and is supported by mainstream browsers.
Furthermore, the element attribute characteristic values obtained in step two are used to construct the XPath string of the target element; because this XPath is a positioning characteristic value derived from the attributes of the element, the target element can still be located stably after the structure of the HTML page changes.
Furthermore, in steps one and two an element with distinctive characteristics is selected as an anchor element and its attribute characteristic set is generated, and in step three an XPath that uniquely locates the target element is generated according to the position of the target element relative to the anchor element.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It should be noted that, in this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiments may be consulted. The above-described system embodiments are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (6)

1. A positioning method based on Web element attribute characteristics, characterized in that: the Web element positioning method preferentially selects characteristic attribute values for different elements and uses them to construct an XPath query string for the target element, the method comprising the following steps:
step one: determining the attribute names to be verified according to the type of the target element;
step two: acquiring the characteristic values of the target element;
step three: constructing an XPath according to the element characteristic values.
2. The positioning method based on Web element attribute characteristics as claimed in claim 1, wherein: in step one, when determining the attribute names to be verified, the element attributes are first divided into generic attributes and element-specific attributes, wherein:
the generic attributes are attributes that are common to HTML elements and frequently used, as follows: ID, the element ID attribute, which specifies a unique identifier for the element; Name, the element name attribute, which specifies the name of the element; Title, the element title attribute, which usually explains the purpose of the element and is the text shown in the tooltip when the mouse hovers over the element; Text, the text inside the element;
the element-specific attributes are attributes read according to the element type, as follows:
attribute specific to the input HTMLElement: type, which indicates the kind of content the input element accepts or whether the element performs a form submission;
attributes specific to the img HTMLElement: src, which specifies the file path of the embedded image; alt, which specifies the text description of the image, used as alternative text when the image cannot be displayed or read aloud to the user by a screen reader;
attribute specific to the iframe HTMLElement: src, the address of the embedded page;
attributes specific to the a HTMLElement: href, the URL or URL fragment that the hyperlink points to; target, which specifies where to display the linked resource;
finally, the attribute list AttrList is obtained from the generic attributes and the element-specific attributes of the target element.
3. The positioning method based on Web element attribute characteristics as claimed in claim 1, wherein: the specific process of acquiring the characteristic values of the target element in step two includes:
(1) constructing a characteristic attribute map AMap for the target element and an integer Count that records the number of elements on the whole page that match the current AMap, the initial value of Count being infinity;
(2) taking the attribute names K from AttrList one by one, acquiring the value V of the target element for the current attribute name K, and adding the pair (K, V) to AMap; searching the whole page for elements that match all attributes in AMap and recording the number N of matching elements; if N is not less than Count, adding the attribute K did not locate the target element more precisely, so K and its value V are deleted from AMap; otherwise, Count is updated to N; this step is repeated until all attributes in AttrList have been traversed or the attributes in AMap match only the target element;
(3) checking Count: if it is 1, the target element can be uniquely located through AMap, and the attribute values in AMap are the characteristic values of the target element; if it is greater than 1, other elements on the page also satisfy the attribute values in AMap and a further characteristic value is needed to determine the target element uniquely; in this case a depth-first traversal of the DOM tree is used to obtain the list ElementArray of all elements that satisfy the attribute values in AMap, and the position Index of the target element in ElementArray is recorded; Index together with all attribute values in AMap then serves as the characteristic value of the target element.
4. The positioning method based on Web element attribute characteristics as claimed in claim 1, wherein: in step three, XPath refers to the XML Path Language, which can be used to test whether a node in a document matches a given pattern; it provides a rich set of functions, flexibly supports matching on multiple attributes, and is supported by mainstream browsers.
5. The positioning method based on Web element attribute characteristics as claimed in claim 1, wherein: the element attribute characteristic values obtained in step two are used to construct the XPath string of the target element; the XPath is a positioning characteristic value derived from the attributes of the element, so the target element can still be located stably after the structure of the HTML page changes.
6. The positioning method based on Web element attribute characteristics as claimed in claim 1, wherein: in steps one and two an element with distinctive characteristics is selected as an anchor element and its attribute characteristic set is generated, and in step three an XPath that uniquely locates the target element is generated according to the position of the target element relative to the anchor element.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110474540.3A CN113177168B (en) 2021-04-29 2021-04-29 Positioning method based on Web element attribute characteristics

Publications (2)

Publication Number Publication Date
CN113177168A (en) 2021-07-27
CN113177168B (en) 2023-12-01

Family

ID=76925351

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110474540.3A Active CN113177168B (en) 2021-04-29 2021-04-29 Positioning method based on Web element attribute characteristics

Country Status (1)

Country Link
CN (1) CN113177168B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070073758A1 (en) * 2005-09-23 2007-03-29 Redcarpet, Inc. Method and system for identifying targeted data on a web page
CN103514292A (en) * 2013-10-09 2014-01-15 南京大学 Webpage data extraction method based on semi-supervised learning of small sample
CN108804472A (en) * 2017-05-04 2018-11-13 腾讯科技(深圳)有限公司 A kind of webpage content extraction method, device and server
CN110297752A (en) * 2018-03-23 2019-10-01 华为软件技术有限公司 Acquisition methods and device, automatization test system, the storage medium of control element
CN111046317A (en) * 2019-12-27 2020-04-21 北京奇艺世纪科技有限公司 Page data acquisition method, device, equipment and computer readable storage medium
CN111079043A (en) * 2019-12-05 2020-04-28 北京数立得科技有限公司 Key content positioning method
CN111368241A (en) * 2020-03-05 2020-07-03 苏州数字力量教育科技有限公司 Webpage element identification method based on XPath
CN112182468A (en) * 2020-10-14 2021-01-05 北京新纽科技有限公司 Positioning and analyzing method compatible with client interface element and webpage element

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
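TOMAS GRIGALIS et al.: "Generating XPath expressions for structured web data record segmentation", ICIST 2012: INFORMATION AND SOFTWARE TECHNOLOGIES, pages 38 - 47 *
刘钊夏; 何明昕: "Batch extraction of Web data using JTidy and XML", Computer Engineering and Design, vol. 31, no. 06, pages 1243 - 1246 *
王丹; 顾明昌; 赵文兵: "Penetration testing techniques for cross-site scripting vulnerabilities", Journal of Harbin Engineering University, vol. 38, no. 11, pages 1769 - 1774 *
石龙; 强保华; 何倩; 吴春明; 谌超: "DOM-based attribute extraction method for Deep Web query interfaces", Journal of Guilin University of Electronic Technology, vol. 32, no. 06, pages 468 - 472 *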

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113495775A (en) * 2021-09-07 2021-10-12 长沙博为软件技术股份有限公司 Combined positioning system, method, equipment and medium for RPA positioning control element
CN115062206A (en) * 2022-05-30 2022-09-16 上海弘玑信息技术有限公司 Webpage element searching method and electronic equipment
CN115062206B (en) * 2022-05-30 2023-04-07 上海弘玑信息技术有限公司 Webpage element searching method and electronic equipment
CN115033822A (en) * 2022-06-14 2022-09-09 壹沓科技(上海)有限公司 Element positioning method, device and equipment and readable storage medium
CN115033822B (en) * 2022-06-14 2024-05-17 壹沓科技(上海)有限公司 Element positioning method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN113177168B (en) 2023-12-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant