CN115357820B - Digital object packaging and entity access method and system based on record playback - Google Patents

Publication number
CN115357820B
Authority
CN
China
Prior art keywords
digital object
interactive sequence
script
user interactive
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211264316.2A
Other languages
Chinese (zh)
Other versions
CN115357820A (en
Inventor
马郓
黄罡
杨静如
郭曜齐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN202211264316.2A priority Critical patent/CN115357820B/en
Publication of CN115357820A publication Critical patent/CN115357820A/en
Application granted granted Critical
Publication of CN115357820B publication Critical patent/CN115357820B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/955Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a digital object packaging and entity access method and system based on record playback, and relates to the technical field of data processing. A digital object is packaged using a record-playback method; a digital object identifier to be accessed is then acquired and resolved; the corresponding user interactive sequence script is then used as the address for path playback, and path playback is performed to obtain the Web application interface that holds the digital object entity. The method realizes a human-computer interaction type digital object warehouse system: the addressing path is recorded as an interactive sequence script, and addressing and access of the digital object entity in a single-page application are completed by path playback, which solves the problem that data in a single-page application can only be reached through manual page operations. The stored address accurately locates the page state that existed when the digital object was packaged, and path playback then simulates the page operations efficiently and accurately to restore that page state and thereby realize access.

Description

Digital object packaging and entity access method and system based on record playback
Technical Field
The invention relates to the technical field of data processing, in particular to a digital object packaging and entity access method and system based on record playback.
Background
The basic element of the Digital Object Architecture (DOA) is the Digital Object (DO), a uniform abstraction of any data resource in an information system. A digital object has four parts: an identifier, metadata, state data, and an entity. The identifier permanently and uniquely points to one digital object and is an immutable attribute of that object; the metadata is descriptive information used to retrieve and discover the digital object; the state data includes information such as the current location, access entry, and access mode of the digital object, and is used to locate and access it; the entity is the actual content of the digital object and may be any bit sequence or set of bit sequences. Separating the identifier from the state data means that the identifier is no longer tightly coupled to the location of the digital object, so the digital object can exist independently of any particular machine and does not become inaccessible when that machine disappears or moves. A DO is an abstraction of Internet resources and represents a meaningful or valuable resource on the Internet. A DO may take the form of a bit sequence, a set of bit sequences, or a program that exposes an external interface. Just as each host on the Internet has an IP address as its identifier, each DO in the DOA has an identifier, called the DOID, that uniquely identifies it. The DOA has two basic protocols: the Identifier/Resolution Protocol (IRP) and the Digital Object Interface Protocol (DOIP).
The DOA has three basic components: an Identifier/Resolution System, responsible for identifying and resolving DOs; a Repository System, responsible for storing and accessing DO entities; and a Registry System, responsible for registering DO metadata and searching for DOs.
Many end users want to view digital objects. This need is usually met by accessing a URL. However, in many scenarios, such as hospital medical record systems and government office automation (OA) systems, the business systems are single-page applications: their data can only be presented in a web page and accessed through the most primitive form of human-computer interaction, that is, reached through manual page operations, and traditional data addressing schemes such as URLs and IP addresses are difficult to apply to such data.
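The four-part structure of a digital object described above can be illustrated with a minimal Python sketch. The class name, field names, and identifier format are assumptions made for illustration only and are not taken from the DOA specifications; the point is simply that changing the state data (for example, the location) leaves the identifier untouched.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class DigitalObject:
    doid: str                    # identifier: permanent and unique
    metadata: Dict[str, str]     # descriptive information for retrieval and discovery
    state_data: Dict[str, str]   # current location, access entry, access mode
    entity: bytes                # actual content: any bit sequence


def relocate(obj: DigitalObject, new_location: str) -> DigitalObject:
    """Return a copy whose state data points at a new location; the DOID is unchanged."""
    new_state = {**obj.state_data, "location": new_location}
    return replace(obj, state_data=new_state)


record = DigitalObject(
    doid="example-prefix/0001",  # hypothetical identifier format
    metadata={"title": "Discharge summary"},
    state_data={"location": "repository-a", "access_mode": "DOIP"},
    entity=b"rendered page bytes",
)
moved = relocate(record, "repository-b")
```

After `relocate`, `moved.doid` equals `record.doid`, which mirrors the decoupling of identifier and location described in the background.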
Disclosure of Invention
The invention aims to provide a digital object packaging and entity access method and system based on record playback, to solve the following problem in the prior art: business systems are often single-page applications whose data can only be presented in a page and accessed through the most primitive form of human-computer interaction, so that the data can be reached only through manual page operations, and traditional data addressing schemes such as URLs and IP addresses are difficult to apply to such data.
In a first aspect, an embodiment of the present application provides a digital object packaging and entity access method based on record playback, including the following steps:
using a record-playback method to collect the interaction sequence between a user and a Web application, forming an interactive sequence script as the access path of the digital object to be packaged, and collecting the URL (uniform resource locator) of the Web application, the page opening time, and the page title information as metadata of the digital object; allocating a digital object identifier to the digital object to be packaged through an identifier resolution system, storing the metadata in a registry system, and storing the interactive sequence script serving as the access path in a digital object warehouse system to complete the packaging of the digital object, wherein the digital object comprises a digital object entity;
acquiring a digital object identifier to be accessed, and analyzing the digital object identifier to be accessed to obtain a corresponding user interactive sequence script; the user interactive sequence script is obtained by recording the operation of a user in a page for accessing a webpage to be packaged into a digital object;
and taking the corresponding user interactive sequence script as an address used in path playback, and performing path playback to obtain a Web application interface of the digital object entity.
Based on the first aspect, in some embodiments of the present invention, the following steps are further included:
when detecting that a user accesses a webpage to be packaged as a digital object, generating a digital object generation request;
generating a request according to the digital object and a webpage to be packaged into the digital object, packaging the user interactive sequence script, and distributing a digital object identifier to obtain the digital object; the method comprises the steps of using a user interactive sequence script as an address of a data resource corresponding to a digital object, using a webpage to be packaged into the digital object as a digital object entity, collecting a Web application URL, page opening time and page title information as metadata of the digital object, distributing a digital object identifier for the digital object to be packaged through an identifier analysis system, storing the metadata into a registry system, and storing the interactive sequence script serving as an access path into a digital object warehouse system to finish the packaging of the digital object.
Based on the first aspect, in some embodiments of the present invention, the method further comprises the following steps:
extracting and taking DOM information of each operation of a user in a page as a DOM sequence of the user interactive sequence script;
extracting a key operation sequence in the user interactive sequence script;
extracting page elements corresponding to the last key operation from the user interactive sequence script, and searching corresponding page elements in a DOM sequence of the user interactive sequence script to obtain matched DOM information;
extracting a corresponding key operation script according to the matched DOM information, and taking the corresponding key operation as the last key operation; recursively extracting page elements corresponding to the last key operation according to the sequence of the key operations from back to front;
and obtaining a compressed user interactive sequence script according to the corresponding key operation script extracted each time, and taking the compressed user interactive sequence script as the address of the data resource corresponding to the digital object.
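The back-to-front extraction described in the steps above can be sketched in Python as follows. This is only one possible reading of the steps, under a simplifying assumption that is not stated in the source: each recorded operation carries, as its DOM information, the set of page elements that appeared after it ran (the hypothetical `revealed` field). Starting from the last key operation, the sketch repeatedly finds the nearest earlier operation whose DOM information contains the element the current operation acts on, and drops everything off that chain.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Operation:
    action: str                # e.g. "click"
    target: str                # selector of the page element acted on
    revealed: FrozenSet[str]   # assumption: elements present in the DOM after this operation


def compress(script: List[Operation]) -> List[Operation]:
    """Keep only the operations on the dependency chain that leads to the last operation."""
    if not script:
        return []
    idx = len(script) - 1
    kept = [script[idx]]
    while True:
        needed = script[idx].target
        # Search earlier operations, nearest first, for the one that made `needed` appear.
        producer = next(
            (i for i in range(idx - 1, -1, -1) if needed in script[i].revealed),
            None,
        )
        if producer is None:
            break
        kept.append(script[producer])
        idx = producer
    kept.reverse()
    return kept


recorded = [
    Operation("click", "#menu", frozenset({"#patients", "#reports"})),
    Operation("click", "#reports", frozenset({"#report-list"})),   # detour
    Operation("click", "#patients", frozenset({"#patient-42"})),
    Operation("click", "#patient-42", frozenset({"#record"})),
]
compressed = compress(recorded)
```

In this example the detour through `#reports` is not needed to reach `#patient-42`, so it is removed and the compressed script contains three operations.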
Based on the first aspect, in some embodiments of the present invention, after obtaining the compressed user-interactive sequence script according to the corresponding key operation script extracted each time, the method further includes the following steps:
constructing a compressed user interactive sequence script DOM tree according to DOM information of the last operation of a user in a page;
respectively calculating hash values of the compressed user interactive sequence script DOM trees;
establishing association between the hash value of each compressed user interactive sequence script DOM tree and corresponding DOM information and digital object identification, and screening out compressed user interactive sequence scripts which have different operation paths but reach the same DOM according to the hash value of each compressed user interactive sequence script DOM tree;
and taking the compressed user interactive sequence script with the least operation in the compressed user interactive sequence scripts with different operation paths and reaching the same DOM as the normalized user interactive sequence script, and taking the normalized user interactive sequence script as the address of the data resource corresponding to the digital object.
Based on the first aspect, in some embodiments of the present invention, the following steps are further included:
constructing a user interactive sequence script DOM tree according to the DOM information of the last operation of the user in the page;
respectively calculating hash values of DOM trees of the user interactive sequence scripts;
establishing association between the hash value of each user interactive sequence script DOM tree and corresponding DOM information and digital object identification, and screening out user interactive sequence scripts with different operation paths but the same DOM according to the hash value of each user interactive sequence script DOM tree;
and taking the user interactive sequence script with the least operation in the user interactive sequence scripts with different operation paths and reaching the same DOM as the normalized user interactive sequence script, and taking the normalized user interactive sequence script as the address of the data resource corresponding to the digital object.
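The normalization steps above can be sketched as a grouping problem: scripts whose final DOM trees share a hash value are treated as reaching the same DOM, and the script with the fewest operations in each group becomes the normalized address. The record layout `(identifier, script, dom_hash)` and the function name are assumptions for illustration; the hash values here are placeholders rather than real digests.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, List[str], str]  # (digital object identifier, script, DOM tree hash)


def normalize(records: List[Record]) -> Dict[str, List[str]]:
    """Map each identifier to the shortest script that reaches the same DOM."""
    groups: Dict[str, List[Tuple[str, List[str]]]] = defaultdict(list)
    for doid, script, dom_hash in records:
        groups[dom_hash].append((doid, script))

    normalized: Dict[str, List[str]] = {}
    for members in groups.values():
        # The script with the fewest operations represents the whole group.
        shortest = min((script for _, script in members), key=len)
        for doid, _ in members:
            normalized[doid] = shortest
    return normalized


records = [
    ("example/1", ["open", "click #menu", "click #patients", "click #patient-42"], "hash-a"),
    ("example/2", ["open", "click #search", "click #patient-42"], "hash-a"),
    ("example/3", ["open", "click #reports"], "hash-b"),
]
normalized = normalize(records)
```

Here `example/1` and `example/2` reach the same DOM by different paths, so both are assigned the three-operation script.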
Based on the first aspect, in some embodiments of the present invention, the step of calculating the hash value of each user interactive sequence script DOM tree separately comprises the steps of:
calculating the hash value of the child node in the DOM tree of the user interactive sequence script;
calculating the hash value of a non-leaf node according to the hash value of the sub-node in the DOM tree of the user interactive sequence script;
and calculating to obtain the hash value of the root node according to the hash value of the non-leaf node, wherein the hash value is used as the hash value of the DOM tree of the user interactive sequence script.
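The bottom-up hashing described above resembles a Merkle-style tree hash, and can be sketched in Python as follows. The source does not name a hash function or say which node fields are hashed; SHA-256 and the `tag` and `text` fields used here are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def node_hash(node: Node) -> str:
    """Hash a node from its own content plus the hashes of its children, in order."""
    h = hashlib.sha256()
    h.update(node.tag.encode("utf-8"))
    h.update(b"\x00")  # separator so that tag and text cannot run together
    h.update(node.text.encode("utf-8"))
    h.update(b"\x00")
    for child in node.children:
        # A leaf contributes the hash of its own content; an inner node
        # contributes a hash that already summarizes its whole subtree.
        h.update(node_hash(child).encode("ascii"))
    return h.hexdigest()


def build(name: str) -> Node:
    return Node("div", children=[Node("h1", "Record"), Node("p", name)])


tree_hash = node_hash(build("Patient 42"))  # hash of the root = hash of the DOM tree
```

Two scripts that end in structurally identical DOM trees produce the same root hash, while a change in any leaf propagates up to the root.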
In a second aspect, an embodiment of the present application provides a digital object packaging and entity access system based on record playback, including:
the digital object packaging module is used for collecting an interactive sequence of a user and Web application by using a record playback method, forming an interactive sequence script as an access path of a digital object to be packaged, and collecting URL (uniform resource locator) of the Web application, page opening time and page title information as metadata of the digital object; distributing a digital object identifier for a digital object to be packaged through an identifier analysis system, storing metadata into a registry system, and storing an interactive sequence script serving as an access path into a digital object warehouse system to complete the packaging of the digital object, wherein the digital object comprises a digital object entity;
the digital object identifier analysis module is used for acquiring the digital object identifier to be accessed and analyzing the digital object identifier to be accessed to obtain a corresponding user interactive sequence script; the user interactive sequence script is obtained by recording the operation of a user in a page for accessing a webpage to be packaged into a digital object;
and the path playback module is used for performing path playback by taking the corresponding user interactive sequence script as an address used in path playback to obtain a Web application interface of the digital object entity.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory for storing one or more programs; a processor. The program or programs, when executed by a processor, implement the method of any of the first aspects as described above.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method of any one of the above first aspects.
The embodiment of the invention at least has the following advantages or beneficial effects:
The embodiments of the invention provide a digital object packaging and entity access method and system based on record playback. A record-playback method is used to collect the interaction sequence between a user and a Web application and form an interactive sequence script as the access path of the digital object to be packaged, and the URL (uniform resource locator) of the Web application, the page opening time, and the page title information are collected as metadata of the digital object. An identifier resolution system allocates a digital object identifier to the digital object to be packaged, the metadata is stored in a registry system, and the interactive sequence script serving as the access path is stored in a digital object warehouse system, which completes the packaging of the digital object. A digital object identifier to be accessed is then acquired and resolved to obtain the corresponding user interactive sequence script, which is used as the address for path playback; path playback then yields the Web application interface that holds the digital object entity. Digital object entity addressing based on record playback realizes a human-computer interaction type digital object warehouse system: the addressing path is recorded as an interactive sequence script, and addressing and access of the digital object entity in a single-page application are completed by path playback, which solves the problem that data in a single-page application can only be reached through manual page operations.
Recording and replaying browser operations captures the page state: recording the user's operation path yields an operation sequence script, which effectively handles state transitions in a single-page application and allows a state to be located without storing the data held in the single-page application. The stored address accurately locates the page state that existed when the digital object was packaged, and the page rendering information in that state is the digital object entity; path playback then simulates the page operations efficiently and accurately to restore the stored page state and thereby realize access.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a flowchart of a digital object packaging and entity access method based on record playback according to an embodiment of the present invention;
Fig. 2 is a flowchart of digital object generation according to an embodiment of the present invention;
Fig. 3 is a sequence diagram of a user starting recording according to an embodiment of the present invention;
Fig. 4 is a sequence diagram of a user performing an operation according to an embodiment of the present invention;
Fig. 5 is a sequence diagram of a user generating a digital object according to an embodiment of the present invention;
Fig. 6 is a sequence diagram of a user parsing a digital object according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of a digital object addressing method based on record playback according to an embodiment of the present invention;
Fig. 8 is a flowchart of a script compression algorithm according to an embodiment of the present invention;
Fig. 9 is a schematic diagram of a DOM tree structure according to an embodiment of the present invention;
Fig. 10 is a schematic diagram of a hash tree structure according to an embodiment of the present invention;
Fig. 11 is a block diagram of a digital object packaging and entity access system based on record playback according to an embodiment of the present invention;
Fig. 12 is a block diagram of an electronic device according to an embodiment of the present invention;
Fig. 13 is a flowchart of packaging and access in the digital object packaging and entity access method based on record playback according to an embodiment of the present invention.
Reference numerals: 100 - digital object packaging module; 110 - digital object identifier analysis module; 120 - path playback module; 101 - memory; 102 - processor; 103 - communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the individual features of the embodiments can be combined with one another without conflict.
Referring to fig. 1, fig. 7 and fig. 13, fig. 1 is a flowchart of a digital object packaging and entity access method based on recording and playback according to an embodiment of the present invention, fig. 7 is a schematic diagram of a digital object addressing method based on recording and playback according to an embodiment of the present invention, and fig. 13 is a flowchart of packaging and access of a digital object packaging and entity access method based on recording and playback according to an embodiment of the present invention. The digital object packaging and entity access method based on record playback comprises the following steps:
Step S100: collecting the interaction sequence between a user and a Web application by using a record-playback method, forming an interactive sequence script as the access path of the digital object to be packaged, and collecting the URL (Uniform Resource Locator) of the Web application, the page opening time, and the page title information as metadata of the digital object; allocating a digital object identifier to the digital object to be packaged through the identifier resolution system, storing the metadata in a registry system, and storing the interactive sequence script serving as the access path in a digital object warehouse system to complete the packaging of the digital object, wherein the digital object comprises a digital object entity.
Step S110: acquiring the digital object identifier to be accessed, and resolving it to obtain the corresponding user interactive sequence script. The user interactive sequence script is obtained by recording the user's operations in the page while the user accesses the webpage to be packaged as a digital object.
The method applies to the scenario in which an end user browses digital objects in a single-page application. In this scenario the data resource can be reached only through operations in the page. To access the data resource, the system page in which the relevant data has been rendered can be used as the digital object entity, which avoids the problems that could arise if the data entity left the system. To access the data, the user obtains the data resource by accessing the digital object; the access is performed by the user rather than by a program, the user interacts only with the front end of the Web application, and the user's act of viewing the data in the page is the access to the digital object.
In order to support the above access behavior, a matching addressing method needs to be provided, that is, the stored address needs to be able to accurately locate the corresponding page state when the digital object is packaged, and the page rendering information in this state is the digital object entity.
To meet the above-described need for a digital object entity addressing method, recording of the page state may optionally be accomplished using recording and playback of browser operations. By recording the user operation path, an operation sequence script, namely a user interactive sequence script, is obtained, so that the problem of state transition of the single-page application can be effectively solved, and the state positioning can be completed on the premise of not storing data in the single-page application. Meanwhile, the restoration of the page state can be realized by a method of replaying the script.
Therefore, before addressing access to the digital object entity by adopting the method, the page to be accessed needs to be packaged into the digital object. Referring to fig. 2, fig. 2 is a flowchart of digital object generation according to an embodiment of the present invention. Digital object generation may be the recording of the process of using a Web application and the selection of content in a Web page to generate a digital object for a data entity. Before analyzing the digital object identifier to be accessed to obtain the corresponding user interactive sequence script, the method also comprises the following steps:
Step S111: when it is detected that a user is accessing a webpage to be packaged as a digital object, a digital object generation request is generated. Digital object generation may comprise three processes: the user starts recording, the user interacts with the Web application, and the user selects an interface from which to generate the digital object. Referring to fig. 3 and fig. 4, fig. 3 is a sequence diagram of a user starting recording according to an embodiment of the present invention, and fig. 4 is a sequence diagram of a user performing an operation according to an embodiment of the present invention. To start recording, the user opens a plug-in and enters the URL (uniform resource locator) of the website to be accessed; the plug-in then activates a back-end event listener and, once the plug-in is ready, it opens the website and the user begins to browse. During interaction with the Web application, each time the user completes an operation the plug-in records the related operation information and sends the updated script to the back-end event listener; after the back-end event listener has processed the script, the plug-in updates its recorded information, which completes the recording of that operation, and the user can continue to operate normally. On reaching the webpage to be packaged as a digital object, the user opens the plug-in on the selected interface and chooses to generate a digital object, which produces the digital object generation request.
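The recording exchange between the plug-in and the back-end event listener can be sketched in Python as follows. This is a simplified in-process model, not the patent's implementation: the class names, method names, operation format, and the URL `https://spa.example/` are all hypothetical, and a real plug-in would capture browser events and talk to the back end over a network.

```python
from typing import Dict, List

Operation = Dict[str, str]


class BackendEventListener:
    """Hypothetical back end that accumulates the interactive sequence script."""

    def __init__(self) -> None:
        self.active = False
        self.script: List[Operation] = []

    def activate(self, url: str) -> None:
        self.active = True
        self.script = [{"action": "open", "url": url}]

    def on_operation(self, operation: Operation) -> int:
        if not self.active:
            raise RuntimeError("recording has not been started")
        self.script.append(operation)
        return len(self.script)  # acknowledged script length


class Plugin:
    """Hypothetical browser plug-in that forwards each user operation."""

    def __init__(self, listener: BackendEventListener) -> None:
        self.listener = listener
        self.recorded_steps = 0

    def start_recording(self, url: str) -> None:
        self.listener.activate(url)   # activate the listener before opening the site
        self.recorded_steps = 1

    def record(self, action: str, target: str) -> None:
        # Send the operation, then update local state from the acknowledgement.
        self.recorded_steps = self.listener.on_operation(
            {"action": action, "target": target}
        )


listener = BackendEventListener()
plugin = Plugin(listener)
plugin.start_recording("https://spa.example/")
plugin.record("click", "#patients")
plugin.record("click", "#patient-42")
```

The plug-in only advances its own state after the listener acknowledges an operation, which matches the order of events described for fig. 4.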
Step S112: according to the digital object generation request and the webpage to be packaged as the digital object, the user interactive sequence script is packaged and a digital object identifier is allocated to obtain the digital object. Specifically, the user interactive sequence script is used as the address of the data resource corresponding to the digital object, the webpage to be packaged as the digital object is used as the digital object entity, and the Web application URL, the page opening time, and the page title information are collected as metadata of the digital object; an identifier resolution system allocates a digital object identifier to the digital object to be packaged, the metadata is stored in a registry system, and the interactive sequence script serving as the access path is stored in a digital object warehouse system to complete the packaging of the digital object. Referring to fig. 5, fig. 5 is a sequence diagram of a user generating a digital object according to an embodiment of the present invention. The packaging process may be as follows: after receiving the digital object generation request, the back-end event listener organizes the script content and sends the script to the identifier resolution system, parses the returned value to obtain the digital object identifier, and returns the digital object identifier to the plug-in, which completes the generation of the digital object. The interactive sequence script obtained by recording user operations can be stored in the digital object warehouse as the address of the data resource corresponding to the digital object; when the user wants to access the data resource, it is located by path playback according to the address stored in the digital object warehouse.
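The packaging step can be sketched in Python with in-memory stand-ins for the three DOA components. Everything named here is an assumption for illustration: the `IdentifierSystem` class, the `example.prefix/N` identifier format, the metadata keys, and the use of plain dictionaries for the registry and the warehouse.

```python
import itertools
from typing import Dict, List

Operation = Dict[str, str]


class IdentifierSystem:
    """Hypothetical identifier system that hands out sequential identifiers."""

    def __init__(self, prefix: str = "example.prefix") -> None:
        self._prefix = prefix
        self._counter = itertools.count(1)

    def allocate(self) -> str:
        return f"{self._prefix}/{next(self._counter)}"


def package(
    script: List[Operation],
    url: str,
    opened_at: str,
    title: str,
    identifiers: IdentifierSystem,
    registry: Dict[str, Dict[str, str]],
    warehouse: Dict[str, List[Operation]],
) -> str:
    """Allocate an identifier, register the metadata, and store the script as the address."""
    doid = identifiers.allocate()
    registry[doid] = {"url": url, "opened_at": opened_at, "title": title}
    warehouse[doid] = list(script)  # copy so later recording does not alter the address
    return doid


identifiers = IdentifierSystem()
registry: Dict[str, Dict[str, str]] = {}
warehouse: Dict[str, List[Operation]] = {}
script = [
    {"action": "open", "url": "https://spa.example/"},
    {"action": "click", "target": "#patients"},
]
doid = package(
    script, "https://spa.example/", "2022-10-17T09:00:00", "Patient list",
    identifiers, registry, warehouse,
)
```

Note that the rendered page itself is not stored: only the metadata and the script that leads to it are kept, consistent with locating the state without storing the data in the single-page application.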
The identifier of a digital object permanently and uniquely points to that digital object and is a permanent attribute of it. The identifier resolution system resolves the digital object identifier to obtain the warehouse in which the digital object entity is located, and the warehouse then finds the specific location of the digital object entity using the corresponding addressing method to complete access to the entity. Therefore, when addressing a digital object entity, the digital object identifier to be accessed is acquired first and then resolved to obtain the corresponding user interactive sequence script. Specifically, the user enters the identifier of the digital object to be accessed in the plug-in, the plug-in passes the identifier to the back-end event listener, and the back end calls the identifier resolution system to obtain the script corresponding to that identifier and returns the script to the plug-in.
Step S120: the corresponding user interactive sequence script is used as the address for path playback, and path playback is performed to obtain the Web application interface of the digital object entity. In path playback, the recorded interactive sequence script is obtained through the digital object identifier, and the script is then used to simulate the sequence of interactions between the user and the single-page application, find the corresponding page state, and thereby support access. Referring to fig. 6, fig. 6 is a sequence diagram of a user parsing a digital object according to an embodiment of the present invention. Specifically, the user may enter a digital object identifier in the plug-in; the plug-in passes the identifier to the back-end event listener; the back end calls the identifier resolution system to obtain the script corresponding to the digital object identifier and returns the script to the plug-in; and the plug-in plays back the script to obtain the Web application interface that holds the digital object entity.
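Path playback can be sketched in Python against a toy single-page application modeled as a state machine. The `ToySinglePageApp` class, its transition table, the operation format, and the identifier are hypothetical stand-ins; a real implementation would drive a browser. The sketch shows the key property: the URL never changes, yet replaying the stored script restores the page state.

```python
from typing import Dict, List, Optional, Tuple

Operation = Dict[str, str]


class ToySinglePageApp:
    """Hypothetical SPA: one URL, with page state changed only by in-page clicks."""

    TRANSITIONS: Dict[Tuple[str, str], str] = {
        ("home", "#patients"): "patient-list",
        ("patient-list", "#patient-42"): "record-42",
    }

    def __init__(self) -> None:
        self.url: Optional[str] = None
        self.state: Optional[str] = None

    def open(self, url: str) -> None:
        self.url, self.state = url, "home"

    def click(self, selector: str) -> None:
        key = (str(self.state), selector)
        if key not in self.TRANSITIONS:
            raise LookupError(f"{selector} is not available in state {self.state}")
        self.state = self.TRANSITIONS[key]


def access(doid: str, warehouse: Dict[str, List[Operation]], app: ToySinglePageApp) -> str:
    """Resolve the identifier to its script and replay the script step by step."""
    for operation in warehouse[doid]:
        if operation["action"] == "open":
            app.open(operation["url"])
        elif operation["action"] == "click":
            app.click(operation["target"])
        else:
            raise ValueError(f"unsupported action: {operation['action']}")
    return str(app.state)


warehouse = {
    "example.prefix/1": [
        {"action": "open", "url": "https://spa.example/"},
        {"action": "click", "target": "#patients"},
        {"action": "click", "target": "#patient-42"},
    ]
}
app = ToySinglePageApp()
page_state = access("example.prefix/1", warehouse, app)
```

After playback, `app.url` is still the entry URL while `page_state` identifies the record page, which is why the script rather than the URL serves as the address.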
In the implementation process, a record playback method is used to collect the interaction sequence between a user and a Web application and form an interactive sequence script that serves as the access path of the digital object to be packaged, and the URL (uniform resource locator) of the Web application, the page opening time, and the page title information are collected as metadata of the digital object. The identifier analysis system allocates a digital object identifier to the digital object to be packaged, the metadata is stored in a registry system, and the interactive sequence script serving as the access path is stored in a digital object warehouse system, completing the packaging of the digital object. Then the identifier of the digital object to be accessed is acquired and resolved to obtain the corresponding user interactive sequence script, and that script is used as the address in path playback to obtain the Web application interface in which the digital object entity is stored. Digital object entity addressing based on record playback realizes a human-computer interactive digital object warehouse system: the addressing path is recorded as an interactive sequence script, and addressing of and access to the digital object entity in a single-page application are completed by path playback, which solves the problem of reaching data inside single-page applications. The page state is recorded by recording and replaying browser operations, and an operation sequence script is obtained by recording the user's operation path; this effectively handles state transitions in single-page applications and allows a state to be located without storing the data in the single-page application.
The stored address can accurately position the corresponding page state when the digital object is packaged, the page rendering information in the state is the digital object entity, and then the simulation of page operation can be efficiently and accurately completed through a path playback method to obtain the stored page state so as to realize access.
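The packaging half of the process summarized above can be sketched in the same spirit. This is an assumption-laden illustration: the `Metadata` class, the `package` function, the use of plain dictionaries for the registry system and the digital object warehouse system, and the `do:` plus UUID identifier format are all hypothetical and chosen only to make the data flow concrete.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Metadata:
    url: str        # URL of the Web application
    opened_at: str  # page opening time (ISO 8601 here, by assumption)
    title: str      # page title


def package(
    url: str,
    title: str,
    script: List[str],
    registry: Dict[str, Metadata],
    warehouse: Dict[str, List[str]],
    opened_at: Optional[datetime] = None,
) -> str:
    """Allocate an identifier, store metadata in the registry and the
    access-path script in the warehouse, and return the identifier."""
    identifier = f"do:{uuid.uuid4()}"  # hypothetical identifier format
    when = opened_at or datetime.now(timezone.utc)
    registry[identifier] = Metadata(url, when.isoformat(), title)
    warehouse[identifier] = list(script)
    return identifier
```

Keeping metadata and script in separate stores mirrors the separation between registry and warehouse in the text; only the identifier links the two.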
Path playback raises the following problem: its efficiency needs to be improved. Recording every interaction of the user and replaying all of them is very inefficient, so a way is needed to reduce the number of operations replayed and improve playback efficiency. To solve this problem, a script compression step is added to the process of generating the digital object. Referring to fig. 8, fig. 8 is a flowchart of the script compression algorithm provided in an embodiment of the present invention, which specifically includes the following steps:
Firstly, the DOM information of the page for each operation performed by the user is extracted and used as the DOM sequence of the user interactive sequence script. The Document Object Model (DOM) is the collective term for all the elements of a page; it is essentially an interface (API), the standard API for manipulating web page content. The DOM is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document. A page is a document and is represented in the DOM by document; all tags in the page are elements and are represented in the DOM by element; all content in a page (tags, attributes, text, comments, etc.) is a node and is represented in the DOM by node. The DOM treats all of the above as objects and is therefore called the document object model.
Then, a key operation sequence in the user interactive sequence script is extracted. The key operation may be a preset operation, such as clicking, double-clicking, keyboard input, message sending, and the like. By combining these key operations, a key operation sequence is obtained.
And then, extracting the page element corresponding to the last key operation from the user interactive sequence script, and searching the corresponding page element in the DOM sequence of the user interactive sequence script to obtain matched DOM information.
The operation in the user interactive sequence script corresponds to the transition of the page state, and when the compression of the script is to be realized, the triggering of the state transition needs to be judged at first. By designing the interactive script, user behaviors can be defined, the user behaviors which may trigger state transition are taken as key operations and brought into the consideration range of script compression, wherein the key operations comprise clicking, double clicking, keyboard input, message sending and the like, and the key operations can be specifically preset according to actual needs. Focusing on these operations and selecting valid ones from them, the compression of the script can be realized.
The information in the script only determines when the state has completed the transition, but does not help the algorithm determine whether the conditions for the state transition are met. Therefore, in order to determine whether the condition of the state transition is satisfied, after each operation performed by the user, the DOM information of the page is pulled at the same time. Whether the operation sequence can be compressed is judged by judging the change of the DOM brought by each operation, namely judging whether the key operation related elements appear in the DOM.
Then, extracting a corresponding key operation script according to the matched DOM information, and taking the corresponding key operation as the last key operation; and recursively extracting the page elements corresponding to the last key operation according to the sequence of the key operations from back to front.
And finally, obtaining a compressed user interactive sequence script according to the corresponding key operation script extracted each time, and taking the compressed user interactive sequence script as the address of the data resource corresponding to the digital object.
During the course of a user's operation, the warehouse system has collected DOM information after each operation. After receiving the user interactive sequence script, extracting the page element corresponding to the last operation, sequentially searching the page element in the DOM of the script sequence, and recursively searching the element corresponding to the operation by taking the page element as the last operation, thereby realizing the compression of the script.
Some page state transitions depend on more than a single operation. For example, a search consists of two operations, entering text and clicking a button, and only both together complete the correct state transition. Therefore, before compression it is necessary to check whether the current key operation depends on a previous operation and to bind dependent operations together. That is, in the compression process above, after the corresponding page element is found, the algorithm may check whether there is an operation dependency (for example, input followed by search) and then, taking the bound operation as the last operation, recursively find the element corresponding to it again, thereby compressing the script.
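The compression procedure described above can be sketched as follows, under simplifying assumptions that are not part of the embodiment: each DOM snapshot is reduced to a set of element names, each operation names the single element it acts on, operation dependencies are given as a precomputed `depends_on_prev` flag, and the set `KEY_COMMANDS` is an illustrative preset. Starting from the last key operation, the sketch scans the snapshots from front to back for the first one that contains the operated element, keeps the key operation that produced it, and repeats from there.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set

# Assumed preset of commands treated as key operations.
KEY_COMMANDS = {"click", "doubleClick", "type", "sendKeys"}


@dataclass(frozen=True)
class Op:
    command: str
    element: str                   # element the operation acts on
    depends_on_prev: bool = False  # e.g. a search click bound to the input before it


def compress(
    ops: List[Op],
    initial_dom: FrozenSet[str],
    dom_after: List[FrozenSet[str]],
) -> List[Op]:
    """Keep only the operations needed to make the last key operation possible.

    dom_after[i] is the set of elements present after ops[i] has run.
    """
    key_idx = [i for i, op in enumerate(ops) if op.command in KEY_COMMANDS]
    if not key_idx:
        return []
    keep: Set[int] = set()
    pending = [key_idx[-1]]
    while pending:
        i = pending.pop()
        if i in keep:
            continue
        keep.add(i)
        if ops[i].depends_on_prev and i > 0:
            pending.append(i - 1)  # bind the dependent operation pair
        if ops[i].element in initial_dom:
            continue  # element exists at page load, so no precondition
        for k in range(i):  # front-to-back scan of the DOM sequence
            if ops[i].element in dom_after[k]:
                # Attribute the element to the nearest key operation at or before k.
                causes = [j for j in key_idx if j <= k]
                if causes:
                    pending.append(causes[-1])
                break
    return [ops[i] for i in sorted(keep)]
```

For a recording in which the user opens one menu, clicks an unrelated menu, and then clicks an entry revealed by the first menu, the sketch keeps only the first and last clicks.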
For example, an official Chinese-language document is used as a test. It is a tree-structured single-page application in which different parts of the document are viewed by clicking the labels on the left side of the page. The following is part of the recorded script:
{
  "id": "62e38f50-30f1-4654-b873-78c797963dbd",
  "command": "click",
  "target": "linkText=2.4. Stepwise interpretation of test code",
  "targets": [
    ["linkText=2.4. Stepwise interpretation of test code", "linkText"],
    ["css=.toctree-l2:nth-child(4) > .reference", "css:finder"],
    ["xpath=//a[contains(text(),'2.4. Stepwise interpretation of test code')]", "xpath:link"],
    ["xpath=//a[contains(@href, '#id6')]", "xpath:href"],
    ["xpath=//li[4]/a", "xpath:position"],
    ["xpath=//a[contains(.,'2.4. Stepwise interpretation of test code')]", "xpath:innerText"]
  ],
  "value": ""
}
As the script above shows, the last operation in the operation sequence is a click on the "2.4. Stepwise interpretation of test code" label, made in order to view the contents of section 2.4. The algorithm searches the recorded DOM sequence from front to back and, when checking the DOM captured after the user clicked the "2. Quick entry" label, finds that the <a> tag for "2.4. Stepwise interpretation of test code" is loaded for the first time, corresponding exactly to the element recorded in the script. The algorithm therefore identifies that the precondition for the recorded click is the completion of the click on the "2. Quick entry" label, which completes one search; it then recursively looks for the precondition of the click on the "2. Quick entry" label. The label code loaded for the first time after clicking the "2. Quick entry" label is:
<li class="toctree-l2"><a class="reference internal" href="#id2">2.1. Simple use case</a></li>
<li class="toctree-l2"><a class="reference internal" href="#id3">2.2. Detailed explanation of the example</a></li>
<li class="toctree-l2"><a class="reference internal" href="#Selenium">2.3. Write test case with Selenium</a></li>
<li class="toctree-l2"><a class="reference internal" href="#id6">2.4. Stepwise interpretation of test code</a></li>
<li class="toctree-l2"><a class="reference internal" href="#Selenium-WebDriver">2.5. Using a remote Selenium WebDriver</a></li>
As the code above shows, the operated element is loaded only after a specific preceding operation, and the algorithm records and finds it. The <a> tag is a tag type that is widespread in HTML (hypertext markup language); by locating the tag's XPath information and the element's CSS information, the corresponding element can be found accurately, which is how the search algorithm is implemented.
In the implementation process, script compression is performed on the user interactive sequence script, which reduces the number of operations contained in the finally packaged script, so that the page to be accessed is obtained more quickly during playback and playback speed is improved. At the same time, the compressed user interactive sequence script requires less storage space, which improves storage utilization.
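The recorded "targets" entries shown earlier each pair a locator string of the form strategy=value with a label. As a small illustration of how such entries could be consumed when looking up an element by its XPath or CSS information, the sketch below splits a locator at its first "=" and picks one candidate by strategy. The function names `parse_locator` and `pick_locator` and the default preference order are assumptions made for this sketch, not part of the described method.

```python
from typing import Sequence, Tuple


def parse_locator(target: str) -> Tuple[str, str]:
    """Split 'strategy=value' at the first '=' only, since values may contain '='."""
    strategy, sep, value = target.partition("=")
    if not sep or not strategy.strip():
        raise ValueError(f"not a strategy=value locator: {target!r}")
    return strategy.strip(), value.strip()


def pick_locator(
    targets: Sequence[Sequence[str]],
    preferred: Sequence[str] = ("xpath", "css"),
) -> Tuple[str, str]:
    """Return the first candidate whose strategy is preferred, else the first one."""
    if not targets:
        raise ValueError("no candidate locators recorded")
    parsed = [parse_locator(entry[0]) for entry in targets]
    for strategy in preferred:
        for candidate in parsed:
            if candidate[0] == strategy:
                return candidate
    return parsed[0]
```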
In implementing the path playback, there are also the following problems: different operation sequences may reach the same page state, and if not processed, the different operation sequences may be encapsulated into different digital objects, and in fact, the data resources corresponding to the digital objects are the same. Therefore, it is desirable to package these different sequences of operations into the same digital object. For the above problem, after the script compression is performed, the script normalization processing may be performed, which specifically includes the following steps:
Firstly, a DOM tree is constructed for each compressed user interactive sequence script according to the DOM information of the user's last operation in the page. Referring to fig. 9, fig. 9 is a schematic diagram of a DOM tree structure according to an embodiment of the invention. Since the DOM contains information such as documents, nodes, and elements, a DOM tree structure can be built from this information. From bottom to top, the DOM tree structure comprises leaf nodes, non-leaf nodes, and the root node.
Since the information in the script alone cannot determine the page state, script normalization also needs to pull the DOM information of the web page after the operations and use it as the basis for identifying the state. The script normalization algorithm compares scripts with one another: scripts that follow different operation paths but reach the same DOM are identified and compared by their number of operations, and the script with the fewest operations is used as their common operation sequence, that is, as the digital object address.
Then, respectively calculating the hash value of each compressed user interactive sequence script DOM tree; in order to determine whether the DOMs are the same, the hash values of the DOM trees of the compressed user-interactive sequence scripts may be compared to determine, and when the hash values of the DOM trees of the two compressed user-interactive sequence scripts are the same, it is determined that the DOMs reached by the two compressed user-interactive sequence scripts are the same.
And then, establishing association between the hash value of each compressed DOM tree of the user interactive sequence script and corresponding DOM information and corresponding digital object identification, and screening out compressed user interactive sequence scripts which have different operation paths but reach the same DOM according to the hash value of each compressed DOM tree of the user interactive sequence script.
Referring to fig. 10, fig. 10 is a schematic diagram of a hash tree structure according to an embodiment of the present invention. The screening compares the hash values of the DOM trees of the user interactive sequence scripts. To keep the comparison simple and efficient, the hash value of a DOM tree is calculated node by node, from child nodes up to parent nodes. The hash value of a leaf node is the hash value of the node's name. For a non-leaf node, all child nodes are first sorted in lexicographic order of their hash values, the hash values are then concatenated as character strings, the name of the node is appended, and the hash value of the resulting string is taken as the hash value of the node. Calculating upward layer by layer in this way finally yields the hash value of the root node, which is taken as the hash value of the whole DOM structure.
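The bottom-up hash calculation described above can be sketched as follows. The text does not name a hash function, so SHA-256 is an assumption here, as are the `Node` class and the `dom_hash` function name; the DOM is reduced to node names and children for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def _digest(text: str) -> str:
    # SHA-256 is an assumed choice; any stable hash would fit the description.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def dom_hash(node: Node) -> str:
    """Hash a DOM tree bottom-up.

    Leaf: hash of the node name.
    Non-leaf: child hashes sorted lexicographically, concatenated,
    followed by the node name, then hashed.
    """
    if not node.children:
        return _digest(node.name)
    child_hashes = sorted(dom_hash(child) for child in node.children)
    return _digest("".join(child_hashes) + node.name)
```

Because child hashes are sorted before concatenation, two trees that differ only in the order of sibling nodes receive the same root hash, while a change to any node name changes it.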
After the hash values of the DOM trees of the compressed user interactive sequence scripts are obtained through calculation, the user interactive sequence scripts with the same hash value are divided into a group, and therefore the compressed user interactive sequence scripts which have different operation paths and reach the same DOM are screened out.
And then, taking the compressed user interactive sequence script with the least operation in the compressed user interactive sequence scripts with different operation paths and reaching the same DOM as a normalized user interactive sequence script, and taking the normalized user interactive sequence script as the address of the data resource corresponding to the digital object.
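The grouping and selection steps above can be sketched as follows, assuming each script has already been paired with the hash value of the DOM tree it reaches. The `normalize` function name and the representation of a script as a list of operation strings are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Tuple


def normalize(records: Iterable[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Group scripts by DOM-tree hash and keep the one with the fewest operations.

    Each record is (dom_hash, script). On a tie, the first script seen is kept.
    """
    best: Dict[str, List[str]] = {}
    for dom_hash_value, script in records:
        current = best.get(dom_hash_value)
        if current is None or len(script) < len(current):
            best[dom_hash_value] = list(script)
    return best
```

The returned mapping gives, for each distinct page state, the script that would serve as the common address of every digital object reaching that state.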
In the implementation process, the compressed user interactive sequence scripts are further normalized, so that the user interactive sequence scripts which pass through different operation paths and reach the same DOM all adopt a common operation sequence, the digital object is conveniently packaged in a later period, and the storage space is saved. Meanwhile, because the common operation sequence is the least operation in the user interactive sequence scripts, the corresponding page can be found more quickly during path playback, and the path playback efficiency is further improved.
The script compression and the script normalization performed in the process of generating the digital object may be independent, that is, only the script compression processing or only the script normalization processing is performed in the process of generating the digital object, or the script compression and the script normalization processing may be performed in sequence, which is not limited in this embodiment.
The process of only performing script normalization processing comprises the following steps:
firstly, constructing a DOM tree of a user interactive sequence script according to DOM information of the last operation of a user in a page; then respectively calculating the hash value of each user interactive sequence script DOM tree; then, establishing association between the hash value of each user interactive sequence script DOM tree and corresponding DOM information and digital object identification, and screening out user interactive sequence scripts which have different operation paths but reach the same DOM according to the hash value of each user interactive sequence script DOM tree; and finally, taking the user interactive sequence script with the least operation in the user interactive sequence scripts with different operation paths and reaching the same DOM as the normalized user interactive sequence script, and taking the normalized user interactive sequence script as the address of the data resource corresponding to the digital object.
The step of respectively calculating the hash value of each user interactive sequence script DOM tree comprises the following steps: calculating the hash values of the leaf nodes in the DOM tree of the user interactive sequence script; calculating the hash value of each non-leaf node according to the hash values of its child nodes in the DOM tree of the user interactive sequence script; and calculating the hash value of the root node according to the hash values of the non-leaf nodes, and taking it as the hash value of the user interactive sequence script DOM tree.
As can be seen from the above, performing only script normalization follows the same procedure as performing script normalization after script compression; the only difference is that the former operates on the user interactive sequence scripts and the latter on the compressed user interactive sequence scripts, so the details are not repeated here.
Based on the same inventive concept, the present invention further provides a digital object packaging and entity access system based on recording and playback, please refer to fig. 11, where fig. 11 is a block diagram of a digital object packaging and entity access system based on recording and playback according to an embodiment of the present invention. The digital object packaging and entity access system based on recording and playback comprises:
the digital object packaging module 100 is configured to collect an interaction sequence of a user and a Web application by using a record playback method, form an interactive sequence script as an access path of a digital object to be packaged, and acquire a URL of the Web application, page opening time, and page title information as metadata of the digital object; and distributing a digital object identifier for the digital object to be packaged by the identifier analysis system, storing the metadata into a registry system, and storing the interactive sequence script serving as the access path into a digital object warehouse system to complete the packaging of the digital object, wherein the digital object comprises a digital object entity.
The digital object identifier analyzing module 110 is configured to obtain a digital object identifier to be accessed, and analyze the digital object identifier to be accessed to obtain a corresponding user interactive sequence script; the user interactive sequence script is obtained by recording the operation of a user in a page for accessing a webpage to be packaged into a digital object;
and the path playback module 120 is configured to perform path playback by using the corresponding user interactive sequence script as an address used in path playback, so as to obtain a Web application interface of the digital object entity.
In the implementation process, the digital object packaging module 100 uses a record playback method to collect the interaction sequence between the user and the Web application and form an interactive sequence script that serves as the access path of the digital object to be packaged, and collects the URL of the Web application, the page opening time, and the page title information as metadata of the digital object; the identifier analysis system allocates a digital object identifier to the digital object to be packaged, the metadata is stored in a registry system, and the interactive sequence script serving as the access path is stored in a digital object warehouse system, completing the packaging of the digital object. The digital object identifier analyzing module 110 acquires the identifier of the digital object to be accessed and resolves it to obtain the corresponding user interactive sequence script; the path playback module 120 uses that script as the address in path playback and performs path playback to obtain the Web application interface in which the digital object entity is stored. Digital object entity addressing based on record playback realizes a human-computer interactive digital object warehouse system: the addressing path is recorded as an interactive sequence script, and addressing of and access to the digital object entity in a single-page application are completed by path playback, which solves the problem of reaching data inside single-page applications. The page state is recorded by recording and replaying browser operations, and an operation sequence script is obtained by recording the user's operation path; this effectively handles state transitions in single-page applications and allows a state to be located without storing the data in the single-page application.
The stored address can accurately position the corresponding page state when the digital object is packaged, the page rendering information in the state is the digital object entity, and then the simulation of page operation can be efficiently and accurately completed through a path playback method to obtain the stored page state so as to realize access.
Referring to fig. 12, fig. 12 is a schematic structural block diagram of an electronic device according to an embodiment of the present disclosure. The electronic device comprises a memory 101, a processor 102 and a communication interface 103, wherein the memory 101, the processor 102 and the communication interface 103 are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 101 may be used for storing software programs and modules, such as program instructions/modules corresponding to the digital object packaging and entity access system based on recording and playback provided by the embodiments of the present application, and the processor 102 executes various functional applications and data processing by executing the software programs and modules stored in the memory 101. The communication interface 103 may be used for communicating signaling or data with other node devices.
The Memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 102 may be an integrated circuit chip having signal processing capabilities. The Processor 102 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It will be appreciated that the configuration shown in fig. 12 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 12 or have a different configuration than shown in fig. 12. The components shown in fig. 12 may be implemented in hardware, software, or a combination thereof.
In the embodiments provided in the present application, it should be understood that the disclosed system and method may be implemented in other ways. The above-described system embodiments are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (9)

1. A digital object packaging and entity access method based on record playback, comprising the steps of:
collecting an interactive sequence of a user and Web application by using a record playback method, forming an interactive sequence script as an access path of a digital object to be packaged, and collecting URL (Uniform resource locator) of the Web application, page opening time and page title information as metadata of the digital object; distributing a digital object identifier for the digital object to be packaged through an identifier analysis system, storing metadata into a registry system, and storing an interactive sequence script serving as an access path into a digital object warehouse system to complete the packaging of the digital object, wherein the digital object comprises a digital object entity;
acquiring a digital object identifier to be accessed, and analyzing the digital object identifier to be accessed to obtain a corresponding user interactive sequence script; the user interactive sequence script is obtained by recording the operation of a user in a page for accessing a webpage to be packaged into a digital object;
and taking the corresponding user interactive sequence script as an address used in path playback, and performing path playback to obtain a Web application interface of the digital object entity.
2. The digital object packaging and entity access method based on record playback of claim 1, further comprising the steps of:
when detecting that a user accesses the webpage to be packaged as the digital object, generating a digital object generation request;
according to the digital object generation request and the webpage to be packaged into the digital object, packaging the user interactive sequence script, and distributing a digital object identifier to obtain the digital object; the method comprises the steps that a user interactive sequence script is used as an address of a data resource corresponding to a digital object, a webpage to be packaged into the digital object is used as a digital object entity, web application URL, page opening time and page title information are collected to be used as metadata of the digital object, a digital object identifier is distributed to the digital object to be packaged through an identifier analysis system, the metadata are stored in a registry system, the interactive sequence script used as an access path is stored in a digital object warehouse system, and the digital object is packaged.
3. The method for digital object packaging and entity access based on record playback as claimed in claim 2, further comprising the steps of:
extracting and taking DOM information of each operation of the user in the page as a DOM sequence of the user interactive sequence script;
extracting a key operation sequence in the user interactive sequence script;
extracting a page element corresponding to the last key operation from the user interactive sequence script, and searching the corresponding page element in a DOM sequence of the user interactive sequence script to obtain matched DOM information;
extracting a corresponding key operation script according to the matched DOM information, and taking the corresponding key operation as the last key operation; recursively extracting page elements corresponding to the last key operation according to the sequence of the key operations from back to front;
and obtaining a compressed user interactive sequence script according to the corresponding key operation script extracted each time, and taking the compressed user interactive sequence script as the address of the data resource corresponding to the digital object.
4. The method for digital object packaging and entity access based on record playback of claim 3, wherein after obtaining the compressed user-interactive sequence script according to the corresponding key operation script extracted each time, the method further comprises the following steps:
constructing a compressed user interactive sequence script DOM tree according to the DOM information of the last operation of the user in the page;
respectively calculating hash values of the compressed user interactive sequence script DOM trees;
establishing an association between the hash value of each compressed user interactive sequence script DOM tree and the corresponding DOM information and digital object identifier, and screening out, according to the hash value of each compressed user interactive sequence script DOM tree, compressed user interactive sequence scripts that follow different operation paths but reach the same DOM;
and taking, among the compressed user interactive sequence scripts that follow different operation paths but reach the same DOM, the one with the fewest operations as a normalized user interactive sequence script, and taking the normalized user interactive sequence script as the address of the data resource corresponding to the digital object.
5. The method for digital object packaging and entity access based on record playback as claimed in claim 1, further comprising the steps of:
constructing a user interactive sequence script DOM tree according to the DOM information of the last operation of the user in the page;
respectively calculating hash values of DOM trees of the user interactive sequence scripts;
establishing association between the hash value of each user interactive sequence script DOM tree and corresponding DOM information and digital object identification, and screening out user interactive sequence scripts of different operation paths but reaching the same DOM according to the hash value of each user interactive sequence script DOM tree;
and taking, among the user interactive sequence scripts that follow different operation paths but reach the same DOM, the one with the fewest operations as a normalized user interactive sequence script, and taking the normalized user interactive sequence script as the address of the data resource corresponding to the digital object.
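The normalization described above can be sketched as grouping scripts by the hash of the DOM they end on and keeping the shortest script per group. The sketch assumes the final DOM is available as a JSON-serializable structure and uses SHA-256 over a canonical JSON dump as the hash; `dom_digest`, `normalize` and the record layout are hypothetical names chosen for illustration.

```python
import hashlib
import json
from collections import defaultdict


def dom_digest(dom) -> str:
    """Hash a JSON-serializable DOM snapshot (assumed representation)."""
    canonical = json.dumps(dom, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def normalize(records):
    """records: iterable of (identifier, script, final_dom) tuples.

    Returns {dom hash: (identifier, script)}, keeping for each final DOM
    the script with the fewest operations.
    """
    groups = defaultdict(list)
    for doid, script, final_dom in records:
        groups[dom_digest(final_dom)].append((doid, script))
    return {
        digest: min(members, key=lambda member: len(member[1]))
        for digest, members in groups.items()
    }
```

Two scripts that reach the same page by different routes then share one hash key, and only the shorter one is kept as the access path.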
6. The method for digital object packaging and entity access based on record playback as claimed in claim 5, wherein the step of separately calculating the hash value of each user interactive sequence script DOM tree comprises the steps of:
calculating the hash values of the child nodes in the DOM tree of the user interactive sequence script;
calculating the hash value of each non-leaf node according to the hash values of its child nodes in the DOM tree;
and calculating the hash value of the root node according to the hash values of the non-leaf nodes, and taking the hash value of the root node as the hash value of the DOM tree of the user interactive sequence script.
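The bottom-up hashing in claim 6 resembles a Merkle-style tree hash. The sketch below is one way to realize it, assuming a simplified node with only a tag, text and ordered children, and SHA-256 as the hash function; the `Node` class and `node_hash` function are hypothetical, and the patent does not name a hash algorithm.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Simplified DOM node (assumed shape: tag, text, ordered children)."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


def node_hash(node: Node) -> bytes:
    """Hash a node from its own content plus the hashes of its children."""
    h = hashlib.sha256()
    h.update(node.tag.encode("utf-8"))
    h.update(b"\x00")  # separator so ("ab", "") and ("a", "b") differ
    h.update(node.text.encode("utf-8"))
    for child in node.children:
        # A leaf contributes the hash of its own content; an inner node
        # contributes a hash that already covers its whole subtree.
        h.update(node_hash(child))
    return h.digest()


def dom_tree_hash(root: Node) -> str:
    """The root node's hash serves as the hash of the whole DOM tree."""
    return node_hash(root).hex()
```

Because every parent hash covers its children's hashes, any change in a leaf propagates to the root, so two DOM trees can be compared by a single value.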
7. A digital object packaging and entity access system based on record playback, comprising:
a digital object packaging module, configured to collect the interaction sequence between a user and a Web application by a record playback method, form an interactive sequence script as the access path of the digital object to be packaged, and collect the URL of the Web application, the page opening time and the page title information as metadata of the digital object; and to allocate a digital object identifier to the digital object to be packaged through an identifier resolution system, store the metadata in a registry system, and store the interactive sequence script serving as the access path in a digital object repository system to complete the packaging of the digital object, wherein the digital object comprises a digital object entity;
a digital object identifier resolution module, configured to acquire a digital object identifier to be accessed and resolve it to obtain the corresponding user interactive sequence script, wherein the user interactive sequence script is obtained by recording the user's operations in a page while accessing the webpage to be packaged into a digital object;
and a path playback module, configured to perform path playback using the corresponding user interactive sequence script as the address, so as to obtain the Web application interface of the digital object entity.
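The interplay of identifier resolution and path playback can be sketched as follows. The sketch assumes the stored scripts live in a plain dictionary and that playback is delegated to a caller-supplied `driver` callable, which in practice would wrap a browser automation tool; `resolve`, `play_back`, `access_entity` and the step layout are hypothetical names, not part of the patent.

```python
def resolve(repository: dict, doid: str) -> list:
    """Look up the interaction script stored for a digital object identifier."""
    try:
        return repository[doid]
    except KeyError:
        raise LookupError(f"unknown digital object identifier: {doid}") from None


def play_back(script: list, driver) -> None:
    """Replay each recorded step through the supplied driver callable."""
    for step in script:
        # Each step is assumed to be a dict with "action" and "target" keys.
        driver(step["action"], step["target"])


def access_entity(repository: dict, doid: str, driver) -> int:
    """Resolve the identifier, replay its script, return the step count."""
    script = resolve(repository, doid)
    play_back(script, driver)
    return len(script)
```

Keeping the driver as a parameter lets the same resolution logic be exercised without a browser, for example by passing a function that merely logs each step.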
8. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method for digital object packaging and entity access based on record playback as claimed in any one of claims 1 to 6.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of the method for digital object packaging and entity access based on record playback as claimed in any one of claims 1 to 6.
CN202211264316.2A 2022-10-17 2022-10-17 Digital object packaging and entity access method and system based on record playback Active CN115357820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211264316.2A CN115357820B (en) 2022-10-17 2022-10-17 Digital object packaging and entity access method and system based on record playback

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211264316.2A CN115357820B (en) 2022-10-17 2022-10-17 Digital object packaging and entity access method and system based on record playback

Publications (2)

Publication Number Publication Date
CN115357820A (en) 2022-11-18
CN115357820B (en) 2023-01-13

Family

ID=84008930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211264316.2A Active CN115357820B (en) 2022-10-17 2022-10-17 Digital object packaging and entity access method and system based on record playback

Country Status (1)

Country Link
CN (1) CN115357820B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102142016A (en) * 2010-01-29 2011-08-03 Microsoft Corporation Cross-browser interactivity recording, playback and editing
CN112148571A (en) * 2020-07-08 2020-12-29 Qingdao Chuangwai Technology Co., Ltd. Method and device for recording and playing back webpage operation process
CN113553529A (en) * 2021-07-26 2021-10-26 Ping An Annuity Insurance Co., Ltd. Method and device for recording webpage behaviors, computer equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
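Carpet: Automating Collaborative Web-based Process across Multiple Devices by Capture-and-Replay; Yun Ma et al.; IEEE Computer Society; 2015-12-31; full text *
Scratch: A Chrome-browser-based tool for capturing and replaying user operations; Chen Xiaoyu et al.; Computer Science (计算机科学); 2014-11-30; full text *
Smart SEP: An online synchronous teaching platform based on recording and replaying Web graphical operations; Chen Dejian et al.; Computer Science (计算机科学); 2014-11-30; full text *
A first look at front-end record-and-replay systems; Zhihu user: Juejin developer community; Zhihu, URL: https://zhuanlan.zhihu.com/p/368737689; 2021-04-29; full text *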

Also Published As

Publication number Publication date
CN115357820A (en) 2022-11-18

Similar Documents

Publication Publication Date Title
Di Lucca et al. WARE: A tool for the reverse engineering of web applications
US11556697B2 (en) Intelligent text annotation
US7499965B1 (en) Software agent for locating and analyzing virtual communities on the world wide web
JP5256293B2 (en) System and method for including interactive elements on a search results page
CN102227725B (en) System and method for matching entities
CN111079043B (en) Key content positioning method
US20120191840A1 (en) Managing Application State Information By Means Of A Uniform Resource Identifier (URI)
US20030225829A1 (en) System and method for platform and language-independent development and delivery of page-based content
US20090089278A1 (en) Techniques for keyword extraction from urls using statistical analysis
CN109376291B (en) Website fingerprint information scanning method and device based on web crawler
US9323828B2 (en) Complex query handling
Ly et al. Automated information extraction from web APIs documentation
CN104778232B (en) Searching result optimizing method and device based on long query
US10133826B2 (en) UDDI based classification system
US20110022563A1 (en) Document display system, related document display method, and program
CN110489032B (en) Dictionary query method for electronic book and electronic equipment
CN115357820B (en) Digital object packaging and entity access method and system based on record playback
US20150248500A1 (en) Documentation parser
CN113127776A (en) Breadcrumb path generation method and device and terminal equipment
KR20050074058A (en) System for automatically sending to other web site news automatically classified on internet, and control method thereof
CA2752898A1 (en) Methods and systems of outputting content of interest
CN105677827B (en) A kind of acquisition methods and device of list
Panum et al. Kraaler: A user-perspective web crawler
JPH117452A (en) Method and device for collecting information through network and recording medium recording program for executing the method
CN114003714B (en) Intelligent knowledge pushing method for document context sensing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Ma Yun, Huang Gang, Yang Jingru, Guo Yaoqi
Inventor before: Ma Yun, Huang Gang, Yang Jingru, Guo Yaoqi