WO2023179327A1 - Page backtracking method and apparatus, medium, and electronic device - Google Patents

Info

Publication number
WO2023179327A1
WO2023179327A1 PCT/CN2023/079244 CN2023079244W WO2023179327A1 WO 2023179327 A1 WO2023179327 A1 WO 2023179327A1 CN 2023079244 W CN2023079244 W CN 2023079244W WO 2023179327 A1 WO2023179327 A1 WO 2023179327A1
Authority
WO
WIPO (PCT)
Prior art keywords
web page
elements
type
data
electronic device
Prior art date
Application number
PCT/CN2023/079244
Other languages
English (en)
French (fr)
Inventor
许海
王旭
莫元武
Original Assignee
易保网络技术(上海)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 易保网络技术(上海)有限公司
Publication of WO2023179327A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986: Document structures and storage, e.g. HTML extensions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08: Insurance

Definitions

  • The present application relates to the field of computer technology, and in particular to a page backtracking method and a corresponding apparatus, medium, and electronic device.
  • Embodiments of the present application provide a page backtracking method and a corresponding apparatus, medium, and electronic device.
  • The electronic device obtains the DOM tree of the web page and uses it to create web page data of the web page, where the web page data includes the DOM tree of the web page.
  • The electronic device saves the web page data of the web page to meet subsequent page backtracking needs. The web page data saved by this application consists of data such as the DOM tree of the web page. Compared with saving each frame as an image, the saved web page data is smaller and takes up less storage space.
  • Embodiments of the present application provide a page backtracking method for electronic devices, including: acquiring multiple elements of a first web page, where the multiple elements include at least one first-type element and at least one second-type element, and the first-type elements include at least one of canvas elements and pdf elements.
  • The second-type elements include HTML elements.
  • The drawing function called to generate the first-type elements is monitored; based on the monitored drawing function, the first-type elements are reconstructed and the reconstructed first-type elements are saved.
  • Based on the reconstructed first-type elements and the second-type elements of the first web page, a DOM tree corresponding to the first web page is generated, and first web page data of the first web page is created based on the DOM tree.
  • The first web page data is stored. The first web page data is then redrawn to backtrack to the first web page.
  • For example, the first-type element is a canvas element.
  • The electronic device can draw the canvas element on the web page according to the drawing attributes of Canvas.
  • The canvas element is monitored.
  • According to the monitored Canvas drawing attributes that were called, the electronic device 10 reconstructs and redraws the canvas element through the prototype framework and converts the canvas element into encoded data.
  • The encoded data is then used in place of an image URL as the src of an HTML img element, so that the encoded data converted from the canvas element can be added to the DOM tree of the web page, and the first web page data containing the DOM tree data is stored.
  • Recording the page is in effect recording the web page data of the DOM tree at successive points in time. The first web page data consists of the DOM tree and other data, so its volume is small. Storing the first web page data therefore takes little storage space, which greatly reduces the use of memory resources and saves storage costs.
  • The above method further includes: when the first-type elements include a canvas element, reconstructing the first-type elements based on the monitored drawing function called to generate them includes: determining, through monitoring, that the browser calls a first drawing function to generate the canvas element; and, based on the first drawing function, redrawing the canvas element through the prototype framework.
  • The above method further includes: when the first-type elements include pdf elements, reconstructing the first-type elements based on the monitored drawing function called to generate them includes: rendering the pdf elements into corresponding canvas elements through PDF.js; determining, through monitoring, that the browser calls a second drawing function to render the pdf element into the corresponding canvas element; and, based on the second drawing function, redrawing the canvas element corresponding to the pdf element through the prototype framework.
  • The above method further includes: redrawing the first web page data to backtrack to the first web page includes: redrawing the first web page data through a JavaScript native application program interface to backtrack to the image corresponding to the first web page.
  • The above method further includes: the JavaScript native application program interface is the requestAnimationFrame application program interface.
  • The above method further includes: the monitored drawing function called to generate the first-type elements includes at least one of the following provided by CanvasRenderingContext2D: strokeRect(), fillRect(), drawImage(), moveTo(), drawWidgetAsOnScreen(), lineWidth().
  • The above method further includes: generating a DOM tree corresponding to the first web page for the reconstructed first-type elements and the second-type elements of the first web page includes: calling the function toDataURL() to convert the reconstructed first-type elements into encoded data; and generating the DOM tree corresponding to the first web page based on the encoded data and the second-type elements of the first web page.
  • The above method further includes: storing the first web page data includes: serializing the first web page data to generate serialized first web page data, and storing the serialized first web page data.
  • The above method further includes: serializing the first web page data through a serialization function to generate the serialized first web page data, where the serialization function includes the string serialize(mixed $value) function.
  • Embodiments of the present application provide a page backtracking apparatus, including: an acquisition module, configured to acquire multiple elements of the first web page.
  • The multiple elements include at least one first-type element and at least one second-type element, where the first-type elements include at least one of canvas elements and pdf elements.
  • The second-type elements include HTML elements.
  • The reconstruction module is configured to monitor the drawing function called to generate the first-type elements, reconstruct the first-type elements, and save the reconstructed first-type elements.
  • The creation module is configured to generate a DOM tree corresponding to the first web page based on the reconstructed first-type elements and the second-type elements of the first web page, and create first web page data of the first web page based on the DOM tree.
  • The storage module is configured to store the first web page data.
  • The backtracking module is configured to redraw the first web page data and backtrack to the first web page.
  • Embodiments of the present application provide a readable medium on which instructions are stored; when the instructions are executed on an electronic device, they cause the electronic device to perform the page backtracking method of the first aspect and any of its possible implementations.
  • Embodiments of the present application provide an electronic device, including:
  • a memory for storing instructions to be executed by a processor of the electronic device; and
  • a processor, which is one of the processors of the electronic device, for executing the page backtracking method of the first aspect and its possible implementations.
  • Embodiments of the present application provide a computer program product, including a computer program/instructions which, when executed by a processor, implement the page backtracking method of the first aspect and any of its possible implementations.
  • Figure 1 shows a page backtracking scenario according to some embodiments of the present application.
  • Figure 2 shows a page backtracking flow chart according to some embodiments of the present application.
  • Figure 3 shows a schematic diagram of a web page of an electronic device according to some embodiments of the present application.
  • Figure 4 shows another page backtracking flow chart according to some embodiments of the present application.
  • Figures 5A-5B show schematic diagrams of a set of web pages of an electronic device according to some embodiments of the present application.
  • Figure 6 shows another page backtracking flow chart according to some embodiments of the present application.
  • Figure 7 shows a schematic diagram of a web page of an electronic device according to some embodiments of the present application.
  • Figure 8 shows a page backtracking apparatus according to some embodiments of the present application.
  • Figure 9 shows a block diagram of an electronic device according to some embodiments of the present application.
  • Web page: In essence, a web page is an HTML (HyperText Markup Language) document. Combined with other web technologies (such as scripting languages, the Common Gateway Interface, components, etc.), HTML can be used to create powerful web pages.
  • A web page can include one or more HTML elements.
  • The HTML elements of a web page can include menu bars, input boxes, link windows, buttons, icons, text boxes, dialog boxes, error messages, help messages, text, tables, pictures, and other elements.
  • Web pages can also include canvas elements and Portable Document Format (pdf) elements (i.e., pdf files).
  • Document Object Model (DOM) tree: After the browser loads the web page document and before it renders the page, it generates a tree structure describing the document structure based on the loaded document. For example, when a web page is loaded, the browser creates the DOM tree of the page based on the HTML elements on the web page. The DOM tree presents the HTML elements of the page as a tree structure with elements, attributes, and text. The state of the user's page can therefore be described in the form of a DOM tree, and recording the page is in effect recording the web page data of the DOM tree at successive points in time.
  • Canvas element: An element newly introduced in HTML5 (the fifth version of HyperText Markup Language).
  • The canvas element lets graphics, text, etc. be drawn in the area of the web page where the canvas element is located.
  • The browser can draw the canvas element on the web page by calling the drawing attributes of canvas.
  • For example, the browser can draw the canvas element on the web page through the various drawing properties and methods provided by CanvasRenderingContext2D.
  • CanvasRenderingContext2D provides a variety of drawing functions; calling the corresponding function draws the corresponding graphics, text, etc. on the canvas.
  • Embodiments of the present application provide a page backtracking method.
  • The method may be executed by a mobile phone, a tablet, a server, a computer, or another electronic device that can provide web pages to users. This application does not limit this.
  • A web page here refers to the software's user interface as a whole, including its human-computer interaction, operation logic, and visual design.
  • The web page can be a user operation interface in a browser program, such as Microsoft's IE browser, Mozilla's Firefox browser, or Google's Chrome browser.
  • Figure 1 shows a page backtracking scenario according to an embodiment of the present application.
  • The scenario includes an electronic device 10, where the electronic device 10 displays a web page 110.
  • The web page 110 may be a login page for a user to sign up for insurance.
  • The electronic device 10 can generate a backtracking video 20 by running the page backtracking method provided by this application. Specifically, the electronic device 10 obtains the DOM tree of the web page 110 and uses it to create web page data of the web page 110, where the web page data may include the DOM tree of the web page, the logo of the web page, and so on. The electronic device 10 serializes the web page data of the web page 110 to generate the corresponding serialized data. When a page backtracking request is received, the serialized data is deserialized to obtain the web page data, and a backtracking video 20 is generated based on multiple consecutive frames of web page data, where the backtracking video 20 includes a screenshot of the web page 110.
  • The generated web page data of the web page 110 consists of the DOM tree, page tags, and other data. Compared with saving each frame as an image, the saved web page data is smaller and takes up less storage space.
  • The page backtracking method of this application also avoids device manufacturers' restrictions on background video recording (for example, some mobile phone manufacturers do not provide an interface for background video recording), which improves page backtracking efficiency.
  • The canvas element of the web page is reconstructed by obtaining the drawing function called to generate it; the reconstructed canvas element is converted into encoded data, and the encoded data is stored in a DOM tree node, so that the canvas element can be saved in the DOM tree together with the HTML elements of the web page.
  • The encoded data corresponding to the canvas element in the DOM tree of the web page can be loaded along with the code of the DOM tree, so the electronic device 10 does not need to save the reconstructed canvas element separately and load it separately. This improves the efficiency of page backtracking and reduces the space occupied by data storage.
  • The electronic device 10 can also obtain the pdf file of the web page and convert it into a corresponding canvas element. Then, by obtaining the drawing function called when converting the pdf file into a canvas element, the canvas element corresponding to the pdf file is reconstructed, the reconstructed canvas element is converted into encoded data, and the encoded data is stored in a DOM tree node, so that the pdf file can be saved in the DOM tree along with the HTML elements of the web page.
  • Figure 2 is an exemplary page backtracking flow chart provided by this application.
  • The method is executed by the electronic device 10.
  • The method specifically includes:
  • S201: Obtain the DOM tree of the web page, and use the DOM tree of the web page to create web page data of the web page.
  • The web page data preserves the data as it was at the time; that is, the user's operations on the page and the page content after each operation need to be recorded and stored to support page backtracking.
  • FIG. 3 shows a schematic diagram of a web page 110 of an electronic device 10.
  • The web page 110 includes a variety of HTML elements.
  • For example, it may include a picture element 111, a text element 112, an input box element 113, a button element 114, etc.
  • The browser creates a tree structure (i.e., the DOM tree) describing the document structure according to the web page document.
  • The electronic device 10 can obtain the DOM tree corresponding to the web page 110.
  • The HTML elements of the web page 110 can be described in the form of a DOM tree.
  • As the electronic device 10 responds to the user's operations (such as click, long press, drag, input, etc.), recording the web page is in effect recording the web page data of the DOM tree at successive points in time.
  • Each node on the DOM tree corresponds to an HTML element in the web page.
  • The DOM tree is used to create the web page data of the web page. Therefore, when the backtracking video is later generated, each frame of the video can be produced by rendering an image directly from the web page data recorded at the corresponding time point.
  • The electronic device 10 uses the DOM tree of the web page 110 to create the web page data of the web page.
  • The specific process may include: traversing the node state of each node of the DOM tree, obtaining the instantaneous state of the web page, and generating the web page data according to that instantaneous state.
  • The electronic device 10 can also monitor changes in the DOM tree node state through the Mutation Observer application program interface (Application Programming Interface, API), record the changed DOM tree node state, and use the changed DOM tree of the web page to generate the web page data of the changed web page.
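  • The traversal in S201 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: `PageNode`, `PageSnapshot`, `cloneNode`, `takeSnapshot`, and `countNodes` are hypothetical names, and a plain object tree stands in for the browser DOM so that the sketch runs outside a browser. In a real page, a Mutation Observer callback could trigger a new snapshot whenever the tree changes.

```typescript
// Hypothetical stand-in for a DOM node (assumption: a real page would use browser DOM types).
interface PageNode {
  tag: string;                         // e.g. "div", "input", or "#text"
  attributes: Record<string, string>;  // attribute name -> value
  text?: string;                       // text content, if any
  children: PageNode[];
}

// One recorded frame: the state of the whole tree at a point in time.
interface PageSnapshot {
  timestamp: number;  // milliseconds, supplied by the caller
  root: PageNode;     // deep copy, so later page changes do not alter the record
}

// Depth-first traversal that copies the state of every node.
function cloneNode(node: PageNode): PageNode {
  return {
    tag: node.tag,
    attributes: { ...node.attributes },
    ...(node.text !== undefined ? { text: node.text } : {}),
    children: node.children.map((child) => cloneNode(child)),
  };
}

// Captures the instantaneous state of the page as web page data.
function takeSnapshot(root: PageNode, timestamp: number): PageSnapshot {
  return { timestamp, root: cloneNode(root) };
}

// Counts nodes; handy as a sanity check on a snapshot.
function countNodes(node: PageNode): number {
  return 1 + node.children.reduce((sum, child) => sum + countNodes(child), 0);
}
```

  • Copying the tree rather than keeping a reference is the design choice that matters here: each snapshot must stay unchanged while the live page keeps mutating.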
  • S202: Serialize the web page data of the web page and generate serialized data corresponding to the web page data.
  • The electronic device 10 serializes the web page data of the web page, generates the corresponding serialized data, and stores the serialized data in the database.
  • The serialized data corresponding to the web page data can then be quickly obtained from the database.
  • The electronic device 10 can serialize the web page data of the web page through a serialization function and generate the serialized data corresponding to the web page data.
  • The serialization function can be string serialize(mixed $value).
  • Serializing the web page data converts it into a format that can be stored (for example, saved to a file, memory, or a database) or transmitted (for example, over a network).
  • The electronic device 10 can also deserialize the serialized data in this format and restore it to web page data.
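  • The round trip described in S202 and S203 can be sketched as follows. The text names the PHP-style function `serialize`; this sketch swaps in JSON (`JSON.stringify` / `JSON.parse`) as the storage format, which is an assumption made for illustration. `WebPageData`, `PageNode`, and the helper names are hypothetical.

```typescript
// Hypothetical shapes for the stored data (assumptions, not from the source).
interface PageNode {
  tag: string;
  attributes: Record<string, string>;
  children: PageNode[];
}

interface WebPageData {
  timestamp: number;
  title: string;
  domTree: PageNode;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively checks that a parsed value has the shape of a PageNode.
function isPageNode(value: unknown): value is PageNode {
  if (!isRecord(value)) return false;
  const attrs = value["attributes"];
  const children = value["children"];
  return typeof value["tag"] === "string"
    && isRecord(attrs)
    && Object.keys(attrs).every((key) => typeof attrs[key] === "string")
    && Array.isArray(children)
    && children.every((child) => isPageNode(child));
}

// Converts web page data into a string that can be stored or transmitted.
function serializeWebPageData(data: WebPageData): string {
  return JSON.stringify(data);
}

// Restores web page data from its serialized form, rejecting malformed input.
function deserializeWebPageData(serialized: string): WebPageData {
  const parsed: unknown = JSON.parse(serialized);
  if (
    !isRecord(parsed)
    || typeof parsed["timestamp"] !== "number"
    || typeof parsed["title"] !== "string"
    || !isPageNode(parsed["domTree"])
  ) {
    throw new Error("serialized data is not valid web page data");
  }
  return {
    timestamp: parsed["timestamp"],
    title: parsed["title"],
    domTree: parsed["domTree"],
  };
}
```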
  • S203: When a page backtracking request is received, deserialize the serialized data and obtain the web page data.
  • The backtracking request may carry at least one of the following: the name of the person who sent the request, the backtracking start time point, and the backtracking end time point. For example, the electronic device determines from the person's name whether that person has backtracking permission; if so, it deserializes the serialized data to obtain the web page data. As another example, it confirms whether the requested backtracking start time point and end time point are both before the current time point; if so, it deserializes the serialized data to obtain the web page data.
  • The electronic device 10 can deserialize the serialized data through a deserialization function to obtain the web page data.
  • The deserialization function can be mixed unserialize(string $str).
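  • The request checks described in S203 can be sketched as follows. All names (`BacktrackRequest`, `validateBacktrackRequest`, the list of authorized names) are hypothetical, and the extra check that the start time does not come after the end time is an added assumption not stated in the text.

```typescript
// Hypothetical request shape (assumption): who asks, and for which time range.
interface BacktrackRequest {
  requester: string;
  startTime: number;  // milliseconds
  endTime: number;    // milliseconds
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

// Returns ok only if the requester is authorized and the range lies in the past.
function validateBacktrackRequest(
  request: BacktrackRequest,
  authorizedNames: readonly string[],
  now: number,
): ValidationResult {
  if (authorizedNames.indexOf(request.requester) === -1) {
    return { ok: false, reason: "requester has no backtracking permission" };
  }
  // Assumption, not in the source: an inverted range is treated as invalid.
  if (request.startTime > request.endTime) {
    return { ok: false, reason: "start time is after end time" };
  }
  // Both time points must be before the current time point.
  if (request.startTime >= now || request.endTime >= now) {
    return { ok: false, reason: "time range is not before the current time" };
  }
  return { ok: true };
}
```

  • Only when the result is `ok` would the stored data be deserialized; rejecting early avoids touching the database for requests that cannot be served.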
  • The electronic device 10 determines the consecutive frames of web page data between the backtracking start time point and the backtracking end time point, and generates a backtracking video based on them.
  • For example, using a JavaScript native API (for example, requestAnimationFrame), the electronic device 10 can redraw each frame of web page data into an image, and then composite the images as video frames in chronological order to obtain the backtracking video.
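  • Selecting and ordering the frames for the video can be sketched as follows. `StoredFrame` and `selectFrames` are hypothetical names; the sketch covers only the selection step. In a browser, each selected frame would then be deserialized and redrawn, for example inside a requestAnimationFrame callback.

```typescript
// Hypothetical stored frame (assumption): serialized web page data plus its recording time.
interface StoredFrame {
  timestamp: number;   // milliseconds
  serialized: string;  // serialized web page data for this frame
}

// Picks the frames inside [startTime, endTime] and returns them in chronological order.
function selectFrames(
  frames: readonly StoredFrame[],
  startTime: number,
  endTime: number,
): StoredFrame[] {
  return frames
    .filter((frame) => frame.timestamp >= startTime && frame.timestamp <= endTime)
    .sort((a, b) => a.timestamp - b.timestamp);  // filter() returned a copy, so the input is untouched
}
```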
  • The page backtracking method shown in Figure 2 not only avoids the screen recording permission controls imposed by major mobile phone manufacturers, but also saves traffic and storage costs.
  • For example, a 3-minute video in mp4 format recorded in the traditional way takes more than 10 megabytes (MB) of storage, whereas the storage file generated by the method shown in Figure 2 is only a few hundred kilobytes (KB).
  • The DOM tree of the web page is used to describe the HTML elements of the web page. When the web page contains non-HTML elements (for example, canvas elements or pdf files), those elements are not saved through the DOM tree alone.
  • This application therefore also provides a page backtracking method for this case.
  • Rendering a canvas element uses the image drawing attributes and methods of the canvas. By monitoring the calls to these attributes and methods, the canvas element of the rendered web page is reconstructed based on the monitoring results, and the reconstructed canvas element is converted into encoded data, so that the canvas element is saved.
  • Figure 4 is an exemplary page backtracking flow chart provided by this application.
  • The method is executed by the electronic device 10.
  • The method specifically includes:
  • FIG. 5A shows a schematic diagram of a web page 510 of an electronic device 10.
  • The web page 510 includes a variety of HTML elements and a canvas element 513.
  • The various HTML elements may be picture elements 511, text elements 512, button elements 514, etc.
  • The canvas element 513 can be used for users to draw signatures.
  • When the electronic device 10 detects the user's operation of drawing a signature, it displays the web page 520 shown in FIG. 5B in response to the operation.
  • The web page 520 includes the canvas element 513, which displays the signature drawn by the user, "Zhang San".
  • The HTML elements on the web page 510 of Figure 5A and the web page 520 of Figure 5B can be described in the form of a DOM tree.
  • The electronic device 10 obtains the first DOM tree of web page 520.
  • For the specific process by which the electronic device 10 obtains the first DOM tree of the web page 520, refer to S201 in FIG. 2; it is not described again here.
  • Because the canvas element of the web page is not saved as an HTML element through the first DOM tree of the web page, when the web page contains a canvas element, the electronic device 10 can monitor the canvas element on the web page according to the drawing attributes of Canvas in order to save it.
  • According to the monitored Canvas drawing attributes, the electronic device 10 redraws the canvas element through the prototype framework, thereby reconstructing the drawn canvas element, and converts the canvas element into encoded data.
  • JavaScript scripts can be used to draw arbitrary graphics (2D or 3D) on the canvas element.
  • The canvas element has two attributes, "width" and "height", which set the width and height of the canvas.
  • The browser can draw various graphics through the drawing properties and methods provided by CanvasRenderingContext2D.
  • CanvasRenderingContext2D provides a variety of drawing functions; calling the corresponding function draws the corresponding graphics on the canvas.
  • The CanvasRenderingContext2D object provides a set of graphics functions for drawing on the canvas, including: strokeRect(), fillRect(), drawImage(), moveTo(), drawWidgetAsOnScreen(), lineWidth(), etc.
  • strokeRect() can be used to draw the outline of a rectangle;
  • fillRect() can be used to draw a filled rectangle;
  • drawImage() can be used to draw an image;
  • moveTo() can be used to set the current position and start a new sub-path;
  • drawWidgetAsOnScreen() is used to draw the widget on the screen;
  • lineWidth, which in the standard API is a property rather than a method, sets the thickness of lines drawn on the canvas.
  • The browser can call the graphics functions provided by the CanvasRenderingContext2D object to generate the graphics drawn by the user on the canvas.
  • By monitoring the graphics functions of the CanvasRenderingContext2D object that the browser calls, the electronic device 10 can also reconstruct the changed canvas element through the prototype framework and convert the canvas element into encoded data. In this way, the canvas element of the web page is saved and is not lost when the web page is later backtracked.
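  • One way to read "monitoring through the prototype framework" is to wrap the drawing methods on the context's prototype so that every call is logged and can be replayed later. The sketch below illustrates that idea under stated assumptions: `FakeContext2D` is a hypothetical stand-in for CanvasRenderingContext2D so the code runs outside a browser, and `monitorDrawing` and `replay` are illustrative names, not functions named in the source.

```typescript
// Hypothetical stand-in for CanvasRenderingContext2D (assumption); it records what was painted.
class FakeContext2D {
  readonly painted: string[] = [];
  strokeRect(x: number, y: number, w: number, h: number): void {
    this.painted.push(`strokeRect(${x},${y},${w},${h})`);
  }
  fillRect(x: number, y: number, w: number, h: number): void {
    this.painted.push(`fillRect(${x},${y},${w},${h})`);
  }
  moveTo(x: number, y: number): void {
    this.painted.push(`moveTo(${x},${y})`);
  }
}

type MethodName = "strokeRect" | "fillRect" | "moveTo";
type DrawFn = (this: FakeContext2D, ...args: number[]) => void;
interface DrawCall { method: MethodName; args: number[]; }

// Wraps the named prototype methods so each call is logged before it runs.
// Returns a function that restores the original methods.
function monitorDrawing(methods: readonly MethodName[], log: DrawCall[]): () => void {
  const proto = FakeContext2D.prototype as unknown as Record<string, DrawFn>;
  const originals: Record<string, DrawFn> = {};
  for (const name of methods) {
    const original = proto[name];
    if (original === undefined) continue;
    originals[name] = original;
    proto[name] = function (this: FakeContext2D, ...args: number[]): void {
      log.push({ method: name, args: args.slice() });
      original.apply(this, args);  // the page still draws as usual
    };
  }
  return () => {
    for (const name of Object.keys(originals)) {
      const original = originals[name];
      if (original !== undefined) proto[name] = original;
    }
  };
}

// Reconstructs the drawing on another context by replaying the logged calls in order.
function replay(log: readonly DrawCall[], ctx: FakeContext2D): void {
  const target = ctx as unknown as Record<string, DrawFn>;
  for (const call of log) {
    const fn = target[call.method];
    if (fn !== undefined) fn.apply(ctx, call.args);
  }
}
```

  • Patching the prototype rather than a single instance means every context created on the page is observed, including ones created after monitoring starts.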
  • S402: Convert the canvas element into encoded data, and generate a second DOM tree of the web page based on the encoded data corresponding to the canvas element and the first DOM tree of the web page.
  • The electronic device 10 can convert the canvas element into encoded data and then save the encoded data on a node of the first DOM tree to generate a second DOM tree of the web page. The second DOM tree therefore includes both the HTML element data and the encoded data corresponding to the canvas element.
  • The electronic device 10 can convert the reconstructed canvas element into encoded data in base64 format (for example, a dataURL string) by calling the canvas function toDataURL().
  • Base64 is an encoding that represents binary data using 64 printable characters. It is commonly used to encode image data as string data.
  • The encoded data corresponding to the canvas element can be saved on a node of the first DOM tree to generate the second DOM tree of the web page.
  • The encoded data corresponding to the canvas element in the second DOM tree can be loaded along with the code of the second DOM tree, so the electronic device 10 does not need to save the reconstructed canvas element separately and load it separately.
  • This improves page backtracking efficiency.
  • Specifically, the electronic device 10 can save the encoded data corresponding to the canvas element on a node of the first DOM tree and record the tag and attribute of the node that holds the encoded data (for example, the img tag and the src attribute) to generate the second DOM tree. During page backtracking, the electronic device 10 can obtain the encoded data corresponding to the canvas element from the corresponding node of the second DOM tree according to the img tag and src attribute of that node.
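  • Building the second DOM tree can be sketched as a tree transformation that swaps each canvas node for an img node whose src holds the encoded data. Replacing the node outright is an assumption made for illustration; the text only says the encoded data is saved on a node with an img tag and src attribute. `PageNode` and `embedCanvasData` are hypothetical names, and the data URL would come from toDataURL() in a browser.

```typescript
// Hypothetical stand-in for a DOM node (assumption).
interface PageNode {
  tag: string;
  attributes: Record<string, string>;
  children: PageNode[];
}

// Returns a new tree in which every canvas node with known encoded data
// becomes an img node carrying that data in its src attribute.
// encodedById maps a canvas element's id to its data URL.
function embedCanvasData(node: PageNode, encodedById: Record<string, string>): PageNode {
  if (node.tag === "canvas") {
    const id = node.attributes["id"];
    if (id !== undefined) {
      const encoded = encodedById[id];
      if (encoded !== undefined) {
        return { tag: "img", attributes: { ...node.attributes, src: encoded }, children: [] };
      }
    }
  }
  // Any other node is copied, and its children are transformed recursively.
  return {
    tag: node.tag,
    attributes: { ...node.attributes },
    children: node.children.map((child) => embedCanvasData(child, encodedById)),
  };
}
```

  • Because the function returns a new tree, the first DOM tree stays available unchanged while the second DOM tree carries the canvas content inline.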
  • The encoded data corresponding to the canvas element is what is saved to a node in the DOM tree.
  • The method of converting the canvas element into encoded data in this application is not limited to calling the canvas function toDataURL().
  • Other encoding methods can also be used to convert canvas elements into the corresponding encoded data. Depending on the actual application, this application does not limit the specific method of converting canvas elements into encoded data.
  • S403: Create the second web page data of the web page according to the second DOM tree of the web page. Refer to step S201 in Figure 2 for details, which are not described again here.
  • The second DOM tree of the web page includes the encoded data corresponding to the canvas element and the data of the HTML elements.
  • The electronic device 10 can create the second web page data of the web page according to the second DOM tree of the web page.
  • The second web page data of the web page may include the second DOM tree data of the web page.
  • The electronic device 10 can traverse the node state of each node of the second DOM tree of the web page to obtain the instantaneous state of the web page, and generate the web page data according to that instantaneous state.
  • S404: Serialize the second web page data of the web page and generate serialized data corresponding to the second web page data. Refer to step S202 in Figure 2 for details, which are not described again here.
  • S405: When a page backtracking request is received, deserialize the serialized data of the web page and obtain the second web page data of the web page. Refer to step S203 in Figure 2 for details, which are not described again here.
  • S406: Generate a backtracking video based on the second web page data of multiple consecutive frames of the web page.
  • The second web page data of web page 520 includes the second DOM tree data, tag data of web page 520, etc.
  • The electronic device 10 can redraw the web page 520 based on the second DOM tree data in the second web page data of the web page 520 and generate a screenshot of the web page 520.
  • The electronic device 10 can obtain the encoded data corresponding to the canvas element 513 on the web page 520 from the second web page data of the web page 520 based on the tag and attribute information of the DOM tree node. In the same way, it can obtain the HTML element data on the web page 520 from the second web page data.
  • Through a JavaScript native API (for example, requestAnimationFrame), the electronic device 10 can redraw the web page 520 based on the HTML element data and the encoded data of the canvas element in the second web page data, and generate a screenshot of the web page 520.
  • Multiple consecutive web page screenshots are then composited as video frames to obtain the backtracking video.
  • The page backtracking method in Figure 4 describes how to save the content of the web page for subsequent page backtracking when the web page contains non-HTML elements (i.e., canvas elements).
  • This application also provides a page traceback method for the case where the web page contains non-HTML elements in the form of pdf files: the pdf file of the web page can be converted into canvas elements using document image conversion technology, and the converted canvas elements can be saved, thereby avoiding the loss of the pdf file of the web page when the page is traced back.
  • Figure 6 is an exemplary page traceback flowchart provided by this application.
  • the method is executed by the electronic device 10.
  • the method specifically includes:
  • S601 Obtain the pdf file of the web page and the first DOM tree of the web page, and convert the pdf file of the web page into a canvas element.
  • FIG. 7 shows a schematic diagram of a web page 710 of the electronic device 10 .
  • the web page 710 includes a variety of HTML elements and a pdf file 712.
  • the various HTML elements may be text elements 711, button elements 713, sliding window elements 714, sliding window elements 715, etc.
  • pdf file 712 can be used to display the content of the "Insurance Policy Delivery Confirmation" that the user needs to browse.
  • the HTML elements on the web page 710 can be described in the form of a DOM tree, and the electronic device 10 obtains the DOM tree of the web page 710.
  • the specific process of the electronic device 10 obtaining the DOM tree of the web page 710 refers to S201 in FIG. 2 and will not be described again here.
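  • since the pdf file of the web page cannot be saved as an HTML element in the form of the DOM tree of the web page, in order to save it the electronic device 10 can obtain the pdf file of the web page and convert it into a canvas element using document image conversion technology.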
  • the electronic device 10 can render the PDF file into a canvas element through Mozilla's open source PDF.js technology.
  • the PDF.js framework can realize file conversion through pure HTML5 technology without any local support. As long as the browser supports HTML5 technology, it can be used and has good compatibility.
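  • The pdf-to-canvas conversion described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the interfaces `PdfLib`, `PdfDocument`, `PdfPage` and `CanvasLike` are hypothetical stand-ins that mirror the shape of the PDF.js calls `getDocument`, `getPage`, `getViewport` and `render` as commonly documented, so the sketch can be type-checked without a browser; the exact parameters should be checked against the PDF.js version in use.

```typescript
// Hypothetical minimal shapes mirroring the PDF.js API surface used below.
interface PdfViewport { width: number; height: number; }
interface PdfPage {
  getViewport(params: { scale: number }): PdfViewport;
  render(params: { canvasContext: unknown; viewport: PdfViewport }): { promise: Promise<void> };
}
interface PdfDocument { numPages: number; getPage(pageNumber: number): Promise<PdfPage>; }
interface PdfLib { getDocument(src: string): { promise: Promise<PdfDocument> }; }
// Stand-in for HTMLCanvasElement (assumption: only these members are needed).
interface CanvasLike { width: number; height: number; getContext(kind: "2d"): unknown; }

// Render every page of a pdf file onto its own canvas, one canvas per page.
async function renderPdfToCanvases(
  lib: PdfLib,
  url: string,
  makeCanvas: () => CanvasLike,
  scale: number = 1,
): Promise<CanvasLike[]> {
  const doc = await lib.getDocument(url).promise;
  const canvases: CanvasLike[] = [];
  for (let pageNumber = 1; pageNumber <= doc.numPages; pageNumber += 1) {
    const page = await doc.getPage(pageNumber);
    const viewport = page.getViewport({ scale });
    const canvas = makeCanvas();
    canvas.width = viewport.width;   // size the canvas to the page viewport
    canvas.height = viewport.height;
    await page.render({ canvasContext: canvas.getContext("2d"), viewport }).promise;
    canvases.push(canvas);
  }
  return canvases;
}
```

  • In a browser, `lib` would be the loaded PDF.js library object and `makeCanvas` could be `() => document.createElement("canvas")`; each resulting canvas can then go through the same encoding step as any other canvas element.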
  • S602 Convert the canvas element into encoded data, and generate a third DOM tree of the web page based on the encoded data corresponding to the canvas element and the first DOM tree of the web page. Please refer to step S402 in Figure 4 for the specific process, which will not be described again here.
  • S603 Create the third web page data of the web page according to the third DOM tree of the web page. Please refer to step S201 in Figure 2 for specific content, which will not be described again here.
  • S604 Serialize the third web page data of the web page and generate serialized data corresponding to the third web page data of the web page. Please refer to step S202 in Figure 2 for specific content, which will not be described again here.
  • S605 When receiving the page traceback request, deserialize the serialized data of the web page and obtain the third web page data of the web page. Please refer to step S203 in Figure 2 for specific content, which will not be described again here.
  • S606 Generate a traceback video based on the third web page data of the continuous multi-frame web page.
  • the third web page data of web page 710 includes third DOM tree data, tag data of web page 710, etc.
  • the electronic device 10 can redraw the web page 710 based on the third DOM tree data in the third web page data of the web page 710, and generate a screenshot of the web page 710.
  • the electronic device 10 can obtain the encoding data corresponding to the canvas element converted by the pdf file 712 on the web page 710 from the third web page data of the web page 710 based on the tag and attribute information of the DOM tree node.
  • the electronic device 10 can also obtain the HTML element data on the web page 710 from the third web page data of the web page 710 based on the tags and attribute information of the DOM tree node.
  • the electronic device 10 can redraw the web page 710 through a JavaScript native API (for example, requestAnimationFrame), based on the HTML element data in the third web page data and the encoded data of the canvas element converted from the pdf file 712, to generate a screenshot of the web page 710. Finally, based on the time order in which each piece of third web page data was generated, or on web page tag information, multiple consecutive frames of web page screenshots are synthesized as video frames to obtain a traceback video.
  • Figure 8 shows a structural block diagram of a page traceback device 800 according to some embodiments of the present application. As shown in Figure 8, the device includes:
  • the acquisition module (802) is used to acquire multiple elements of the first web page. The multiple elements include at least one first-type element and at least one second-type element, where the first-type element includes at least one of a canvas element and a pdf element, and the second-type element includes an HTML element.
  • the reconstruction module (804) is used to reconstruct the first-type element based on the monitored drawing functions called to generate it, and to save the reconstructed first-type element.
  • the creation module (806) is used to generate a DOM tree corresponding to the first web page from the reconstructed first-type element and the second-type element of the first web page, and to create first web page data of the first web page based on the DOM tree.
  • the storage module (808) is used to store the first web page data.
  • the traceback module (810) is used to redraw the first web page data and trace back the first web page.
  • the traceback module redrawing the first web page data to trace back the first web page includes: redrawing the first web page data through a JavaScript native application program interface to trace back the first web page.
  • the JavaScript native application program interface is the requestAnimationFrame application program interface.
  • the creation module creating the first web page data of the first web page based on the multiple elements of the first web page includes: the multiple elements of the web page further include second-type elements, and the DOM tree of the first web page is used to describe the second-type elements and the first-type elements of the first web page, where the second-type elements include HTML elements.
  • the monitored drawing functions called to generate the first-type element include at least one of the following provided by CanvasRenderingContext2D: strokeRect(), fillRect(), drawImage(), moveTo(), drawWidgetAsOnScreen(), lineWidth().
  • the creation module reconstructing and saving the first-type element based on the monitored drawing functions called to generate it includes: based on those monitored drawing functions, redrawing the first-type element through the prototype framework, and saving the first-type element.
  • the creation module converts the first type of elements into encoded data and saves it by calling the function toDataURL().
  • the creation module reconstructing and saving the first-type element based on the monitored drawing functions further includes: when the first-type element is a pdf element, rendering the pdf element into a canvas element through Mozilla's open source PDF.js technology, and reconstructing and saving the pdf element based on the monitored drawing functions called to generate that canvas element.
  • the storage module storing the first web page data includes: serializing the first web page data and generating the serialized first web page data. Store the serialized first web page data.
  • the first web page data is serialized through a serialization function to generate serialized first web page data, where the serialization function includes a string serialize(mixed $value) function.
  • page traceback device 800 shown in Figure 8 corresponds to the page traceback method provided by this application, and the technical details in the above specific description of the page traceback method provided by this application are still applicable to the page traceback device shown in Figure 8 800, please refer to the above for detailed description, and will not be repeated here.
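  • How the five modules of device 800 hand data to one another can be sketched as below. This is a simplified illustration under stated assumptions: the names `PageElement`, `TracebackModules`, `recordPage` and `demoModules` are hypothetical, page data is reduced to a JSON string, and the traceback module here only returns the stored data instead of redrawing it.

```typescript
type ElementKind = "html" | "canvas" | "pdf";
interface PageElement { kind: ElementKind; id: string; content: string; }

// One member per module of device 800 (numbers refer to Figure 8).
interface TracebackModules {
  acquire(): PageElement[];                    // acquisition module 802
  rebuild(element: PageElement): PageElement;  // reconstruction module 804
  create(elements: PageElement[]): string;     // creation module 806
  store(pageData: string): void;               // storage module 808
  traceback(): string[];                       // traceback module 810 (simplified)
}

// Record one page: only first-type elements (canvas, pdf) are rebuilt,
// second-type (HTML) elements pass through unchanged.
function recordPage(modules: TracebackModules): void {
  const elements = modules.acquire();
  const prepared = elements.map((el) => (el.kind === "html" ? el : modules.rebuild(el)));
  modules.store(modules.create(prepared));
}

// In-memory demo wiring (all values are made up for illustration).
const savedPages: string[] = [];
const demoModules: TracebackModules = {
  acquire: () => [
    { kind: "html", id: "title", content: "Confirm policy" },
    { kind: "canvas", id: "signature", content: "stroke data" },
  ],
  rebuild: (el) => ({ ...el, content: `encoded(${el.content})` }),
  create: (els) => JSON.stringify(els),
  store: (data) => { savedPages.push(data); },
  traceback: () => [...savedPages],
};
recordPage(demoModules);
const replayedPages = demoModules.traceback();
```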
  • FIG. 9 shows a schematic structural diagram of an electronic device 10 according to some embodiments of the present application.
  • the electronic device 10 includes one or more processors 101, system memory 102, non-volatile memory (Non-Volatile Memory, NVM) 103, a communication interface 104, input/output (I/O) devices 105, and system control logic 106 for coupling the processor 101, system memory 102, non-volatile memory 103, communication interface 104, and input/output (I/O) devices 105.
  • the processor 101 may include one or more processing units, for example processing modules or processing circuits such as a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a microprocessor (Micro-programmed Control Unit, MCU), an artificial intelligence (AI) processor, a field programmable gate array (FPGA), or a neural-network processing unit (NPU), and may include one or more single-core or multi-core processors.
  • the processor 101 may be configured to execute instructions to implement the above functions of obtaining the DOM tree of the web page and creating the web page data of the web page.
  • the processor 101 can also be used to execute instructions to implement the above functions of obtaining the canvas element, converting the canvas element into encoded data, and generating the second DOM tree of the web page according to the encoded data corresponding to the canvas element and the first DOM tree of the web page.
  • the processor 101 may also be used to execute instructions to implement the above function of obtaining the pdf file of the web page and the first DOM tree of the web page, and converting the pdf file of the web page into a canvas element.
  • System memory 102 is a volatile memory, such as random access memory (Random-Access Memory, RAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), etc. System memory is used for temporary storage of data and/or instructions.
  • Non-volatile memory 103 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions.
  • the non-volatile memory 103 may include any suitable non-volatile memory such as flash memory and/or any suitable non-volatile storage device, such as a hard disk drive (Hard Disk Drive, HDD), Compact Disc (CD), Digital Versatile Disc (DVD), Solid-State Drive (SSD), etc.
  • the non-volatile memory 103 may also be a removable storage medium, such as a secure digital (Secure Digital, SD) memory card, etc.
  • non-volatile memory 103 may be used to cache web pages and/or serialized data of web pages.
  • system memory 102 and non-volatile storage 103 may include temporary and permanent copies of instructions 107, respectively.
  • the instructions 107 may include: when executed by at least one of the processors 101, causing the electronic device 10 to implement the page backtracking method provided by various embodiments of the present application.
  • Communication interface 104 may include a transceiver for providing a wired or wireless communication interface for electronic device 10 to communicate with any other suitable device over one or more networks.
  • the communication interface 104 can be integrated with other components of the electronic device 10 , for example, the communication interface 104 can be integrated with the processor 101 .
  • the input/output (I/O) device 105 may include an input device such as a keyboard, a mouse, etc., and an output device such as a monitor.
  • the user may interact with the electronic device 10 through the input/output (I/O) device 105.
  • the user may operate on web pages through the input/output (I/O) devices 105.
  • System control logic 106 may include any suitable interface controller to provide any suitable interface to other modules of electronic device 10 .
  • system control logic 106 may include one or more memory controllers to provide an interface to system memory 102 and non-volatile memory 103 .
  • At least one of the processors 101 may be packaged together with logic for one or more controllers of the system control logic 106 to form a system in package (SiP). In other embodiments, at least one of the processors 101 may also be integrated on the same chip with logic for one or more controllers of the system control logic 106 to form a system-on-chip (SoC).
  • the structure of the electronic device 10 shown in FIG. 9 is only an example. In other embodiments, the electronic device 10 may include more or fewer components than shown, may combine or split certain components, or may arrange the components differently. The components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the electronic device 200 may have the same or similar structure as the electronic device 10, may include more or fewer components than the electronic device 10, and may also have other structures. The embodiments of the present application place no restriction on this.
  • the embodiment of the present application also provides a program product for implementing the page backtracking method provided by the above embodiments.
  • Embodiments of the mechanisms disclosed in this application may be implemented in hardware, software, firmware, or a combination of these implementation methods.
  • Embodiments of the present application may be implemented as computer modules or module codes executed on a programmable system.
  • the system includes at least one processor, a storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
  • Module code can be applied to input instructions to perform the functions described herein and to generate output information.
  • Output information can be applied to one or more output devices in a known manner.
  • a processing system includes any system having a processor such as, for example, a Digital Signal Processor (DSP), a microcontroller, an Application Specific Integrated Circuit (ASIC), or a microprocessor.
  • Module code can be implemented in a high-level modular language or an object-oriented programming language to communicate with the processing system. When needed, assembly language or machine language can also be used to implement module code. In fact, the mechanisms described in this application are not limited to the scope of any particular programming language. In either case, the language may be a compiled or interpreted language.
  • the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof.
  • the disclosed embodiments may also be implemented as instructions carried on or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors.
  • instructions may be distributed over a network or through other computer-readable media.
  • machine-readable media may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), including, but not limited to, floppy disks, optical discs, compact disc read-only memories (CD-ROMs), magneto-optical discs, Read Only Memory (ROM), Random Access Memory (RAM), Erasable Programmable Read Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), magnetic or optical cards, flash memory, or tangible machine-readable storage used to transmit information over the Internet via electrical, optical, acoustic or other forms of propagated signals (for example, carrier waves, infrared signals, digital signals, etc.).
  • machine-readable media includes any type of machine-readable media suitable for storing or transmitting electronic instructions or information in a form readable by a machine (eg, computer).
  • each unit/module mentioned in each device embodiment of this application is a logical unit/module.
  • a logical unit/module can be a physical unit/module, can be part of a physical unit/module, or can be implemented as a combination of multiple physical units/modules.
  • the physical implementation of these logical units/modules is not the most important. It is the combination of functions implemented by these logical units/modules that solves the problem proposed by this application.
  • the above device embodiments of this application do not introduce units/modules that are not closely related to solving the technical problems raised by this application. This does not mean that no other units/modules exist in the above device embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

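A page traceback method and apparatus, medium, and electronic device, relating to the field of computer technology. The method includes: obtaining multiple elements of a first web page; creating first web page data of the first web page according to the multiple elements, where the first web page data includes a DOM tree of the first web page, and where creating the first web page data includes: when the multiple elements of the web page include a first-type element, reconstructing and saving the first-type element based on the monitored drawing functions called to generate it, the first-type element including at least one of a canvas element and a pdf element; storing the first web page data; and redrawing the first web page data to trace back the first web page. As a result, the saved web page data is smaller in volume, occupies less storage space, and reduces storage cost.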

Description

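Page traceback method and apparatus, medium, and electronic device
This application claims priority to Chinese patent application No. 202210291790.8, filed with the China Patent Office on March 23, 2022 and entitled "Page traceback method and apparatus, medium, and electronic device", the entire contents of which are incorporated herein by reference.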
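Technical Field
This application relates to the field of computer technology, and in particular to a page traceback method and apparatus, medium, and electronic device.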
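Background
With the development and spread of Internet technology, more and more users choose to consume online. Unlike traditional in-person consumption, users can shop, make reservations, transfer money, apply for documents, take out insurance, and so on over the Internet, which greatly improves convenience. However, as online services deepen, information security and information protection have become key concerns. While keeping users' information secure, it is also necessary to ensure that the server side does not suffer losses from users' non-compliant operations, so that online services can develop sustainably. In this process it may be necessary to trace back the pages a user browsed or the operations the user performed. For example, in an insurance scenario, the user's operations and payment process when purchasing insurance may need to be traced back, to keep the insurance process transparent and traceable.
In the prior art, user operations can be recorded as video in the background and the recorded video saved; when a page traceback is needed, the saved video is played back, which achieves the purpose of tracing back page operations. However, because demand for page traceback is high and the user base to be served is large, the server side has to store a large number of page recordings, consuming a large amount of storage space.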
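Summary
Embodiments of this application provide a page traceback method and apparatus, medium, and electronic device. The electronic device obtains the DOM tree of a web page and uses it to create web page data of the web page, where the web page data includes the DOM tree of the web page. The electronic device saves the web page data to meet later page traceback needs. The web page data saved by this application is data such as the DOM tree of the web page. Compared with the data volume of a saved image frame, the saved web page data is smaller and occupies less storage space. Therefore, when demand for web page traceback is high and the user base to be served is large, the page traceback method provided by this application reduces the amount of web page data the server side needs to store, which greatly reduces storage usage and saves storage cost.
In a first aspect, embodiments of this application provide a page traceback method for an electronic device, including: obtaining multiple elements of a first web page, the multiple elements including at least one first-type element and at least one second-type element, where the first-type element includes at least one of a canvas element and a pdf element, and the second-type element includes an HTML element; reconstructing the first-type element based on the monitored drawing functions called to generate it, and saving the reconstructed first-type element; generating a DOM tree corresponding to the first web page from the reconstructed first-type element and the second-type element of the first web page, and creating first web page data of the first web page based on the DOM tree; storing the first web page data; and redrawing the first web page data to trace back the first web page.
For example, the first-type element is a canvas element. To save the canvas element of the web page (that is, the first web page) when the page contains one, the electronic device can monitor the canvas element on the web page according to canvas's built-in drawing properties. When the system calls those built-in drawing properties to draw the canvas element, the electronic device 10 uses the monitored calls to redraw the canvas element through the prototype framework, and converts the canvas element into encoded data. The encoded data then takes the place of an image URL as the src of an HTML img element, so that the encoded data converted from the canvas element can be added to the DOM tree of the web page, and the first web page data containing the DOM tree data is stored. The state of a user's page can be described in the form of a DOM tree, so recording a page is in effect recording the web page data of the DOM tree at each point in time. Because the first web page data consists of data such as the DOM tree, its volume is small and storing it takes little capacity, which greatly reduces resource usage and saves storage cost.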
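In a possible implementation of the first aspect, the method further includes: when the first-type element includes a canvas element, reconstructing the first-type element based on the monitored drawing functions called to generate it includes: determining that the monitored browser calls a first drawing function to generate the canvas element; and redrawing the canvas element through the prototype framework based on the first drawing function.
In a possible implementation of the first aspect, the method further includes: when the first-type element includes a pdf element, reconstructing the first-type element based on the monitored drawing functions called to generate it includes: rendering the pdf element into a corresponding canvas element through PDF.js technology; determining that the monitored browser calls a second drawing function to render the pdf element into the corresponding canvas element; and redrawing the canvas element corresponding to the pdf element through the prototype framework based on the second drawing function.
In a possible implementation of the first aspect, the method further includes: redrawing the first web page data to trace back the first web page includes: redrawing the first web page data through a JavaScript native application program interface to trace back the image corresponding to the first web page.
In a possible implementation of the first aspect, the method further includes: the JavaScript native application program interface is the requestAnimationFrame application program interface.
In a possible implementation of the first aspect, the method further includes: the monitored drawing functions called to generate the first-type element include at least one of the following drawing functions provided by CanvasRenderingContext2D: strokeRect(), fillRect(), drawImage(), moveTo(), drawWidgetAsOnScreen(), lineWidth().
In a possible implementation of the first aspect, the method further includes: generating the DOM tree corresponding to the first web page from the reconstructed first-type element and the second-type element of the first web page includes: converting the reconstructed first-type element into encoded data by calling the function toDataURL(); and generating the DOM tree corresponding to the first web page from the encoded data and the second-type element of the first web page.
In a possible implementation of the first aspect, the method further includes: storing the first web page data includes: serializing the first web page data to generate serialized first web page data; and storing the serialized first web page data.
In a possible implementation of the first aspect, the method further includes: serializing the first web page data through a serialization function to generate the serialized first web page data, where the serialization function includes a string serialize(mixed $value) function.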
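In a second aspect, embodiments of this application provide a page traceback apparatus, including: an acquisition module, used to obtain multiple elements of a first web page, the multiple elements including at least one first-type element and at least one second-type element, where the first-type element includes at least one of a canvas element and a pdf element, and the second-type element includes an HTML element; a reconstruction module, used to reconstruct the first-type element based on the monitored drawing functions called to generate it, and to save the reconstructed first-type element; a creation module, used to generate a DOM tree corresponding to the first web page from the reconstructed first-type element and the second-type element of the first web page, and to create first web page data of the first web page based on the DOM tree; a storage module, used to store the first web page data; and a traceback module, used to redraw the first web page data to trace back the first web page.
In a third aspect, embodiments of this application provide a readable medium storing instructions that, when executed on an electronic device, cause the electronic device to perform the page traceback method of the first aspect or of any possible implementation of the first aspect.
In a fourth aspect, embodiments of this application provide an electronic device, including:
a memory, used to store instructions executed by a processor of the electronic device, and
a processor, which is one of the processors of the electronic device, used to perform the page traceback method of the first aspect or of any possible implementation of the first aspect.
In a fifth aspect, embodiments of this application provide a computer program product including a computer program/instructions that, when executed by a processor, implement the page traceback method of the first aspect or of any possible implementation of the first aspect.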
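Brief Description of the Drawings
Figure 1 shows a page traceback scenario according to some embodiments of this application;
Figure 2 shows a page traceback flowchart according to some embodiments of this application;
Figure 3 shows a schematic diagram of a web page of an electronic device according to some embodiments of this application;
Figure 4 shows another page traceback flowchart according to some embodiments of this application;
Figures 5A and 5B show schematic diagrams of a set of web pages of an electronic device according to some embodiments of this application;
Figure 6 shows another page traceback flowchart according to some embodiments of this application;
Figure 7 shows a schematic diagram of a web page of an electronic device according to some embodiments of this application;
Figure 8 shows a page traceback apparatus according to some embodiments of this application;
Figure 9 shows a block diagram of an electronic device according to some embodiments of this application.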
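Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of this application.
To make the technical solutions provided by the embodiments of this application easier to understand, the key terms used are explained first:
Web page: in essence HTML (HyperText Markup Language). Combined with other web technologies (such as scripting languages, the Common Gateway Interface, and components), it can produce powerful pages. A web page can include one or more HTML elements, such as menu bars, input boxes, link windows, buttons, icons, text boxes, dialog boxes, error messages, help messages, text, tables, and pictures. A web page can also include canvas elements and Portable Document Format (pdf) elements (that is, pdf files).
Document Object Model (DOM) tree: a tree structure describing the document structure, generated from the loaded web document before the browser renders the page. For example, when a web page is loaded, the browser creates the DOM tree of the page from the HTML elements on it. The DOM tree presents the HTML elements of the page as a tree structure with elements, attributes, and text. The state of a user's page can be described in the form of a DOM tree, so recording a page is in effect recording the web page data of the DOM tree at each point in time.
Canvas element: an element newly introduced in HTML5 (HyperText Markup Language, version 5). A canvas element lets the user draw graphics, text, and so on in the region of the web page where the canvas element is located. The browser can create a canvas element on a web page by calling canvas's built-in drawing properties. For example, the browser can use the drawing properties and methods provided by CanvasRenderingContext2D, which offers a variety of drawing functions; calling the corresponding function draws the corresponding graphics or text on the canvas.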
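Embodiments of this application provide a page traceback method. The method can be executed by any electronic device that can present web pages to users, such as a mobile phone, tablet, server, or computer; this application places no restriction on this.
In the embodiments of this application, a web page refers to the overall design of the software's human-computer interaction, operation logic, and interface appearance. A web page can be the user interface of a browser program, such as Microsoft's IE browser, Mozilla's Firefox browser, or Google's Chrome browser.
The embodiments of this application are described in further detail below with reference to the accompanying drawings. For ease of description, the page traceback method is described in detail using the example of tracing back a web page in the browser of an electronic device.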
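Figure 1 shows a page traceback scenario according to an embodiment of this application. As shown in Figure 1, the scenario includes an electronic device 10 that displays a web page 110. For example, the web page 110 can be a login page for a user taking out insurance.
As shown in Figure 1, the electronic device 10 can generate a traceback video 20 by running the page traceback method provided by this application. Specifically, the electronic device 10 obtains the DOM tree of the web page 110 and uses it to create web page data of the web page 110, which can include the DOM tree of the web page, an identifier of the web page, and so on. The web page data is serialized to generate corresponding serialized data. When a page traceback request is received, the serialized data is deserialized to obtain the web page data, and the traceback video 20 is generated from multiple consecutive frames of web page data. The traceback video 20 includes screenshots of the web page 110.
The web page data generated for the web page 110 is data such as the DOM tree and page tags, and it is stored to meet later page traceback needs. By contrast, the approach mentioned in the background records the web page as video, for example recording the web page 110 frame by frame, and saves the recording for later traceback. Compared with the data volume of a saved image frame, the web page data saved by this application is smaller and occupies less storage space. Therefore, when demand for web page traceback is high and the user base to be served is large, the method reduces the amount of web page data the server side needs to store, which greatly reduces storage usage. The method also avoids device vendors' restrictions on background video recording (for example, some phone vendors do not provide an interface for background recording), which improves page traceback efficiency.
In addition, when the web page includes a canvas element, the drawing functions called to generate the canvas element are obtained and used to reconstruct it; the reconstructed canvas element is converted into encoded data, and the encoded data is stored on a DOM tree node, so that the canvas element can be saved in the DOM tree together with the HTML elements of the web page. During page traceback, the encoded data of the canvas element can be requested and loaded along with the code of the DOM tree. This spares the electronic device 10 from saving and loading the reconstructed canvas element separately, improves page traceback efficiency, and reduces the space taken by stored data.
In addition, when the web page includes a pdf file, the electronic device 10 can obtain the pdf file and convert it into a corresponding canvas element using document image conversion technology. Then, by obtaining the drawing functions called when the pdf file is converted into the canvas element, it reconstructs the canvas element corresponding to the pdf file, converts the reconstructed canvas element into encoded data, and stores the encoded data on a DOM tree node, so that the pdf file can be saved in the DOM tree together with the HTML elements of the web page.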
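Figure 2 is an exemplary page traceback flowchart provided by this application. The method is executed by the electronic device 10 and specifically includes:
S201: Obtain the DOM tree of the web page, and use the DOM tree to create web page data of the web page.
To meet the need to trace back a web page, the web page data keeps a record of the data at that moment: the user's operations on the page and the page content after each operation are recorded and stored to support page traceback.
As an example, Figure 3 shows a schematic diagram of the web page 110 of the electronic device 10. As shown in Figure 3, the web page 110 includes a variety of HTML elements, for example a picture element 111, a text element 112, an input box element 113, and a button element 114. To support page traceback, before the browser renders the web page 110, it creates a tree structure describing the document structure (the DOM tree) from the loaded web document, and the electronic device 10 can obtain the DOM tree corresponding to the web page 110 at that point.
The HTML elements of the web page 110 can be described in the form of a DOM tree. As the electronic device 10 responds to user operations (such as tapping, long pressing, dragging, and typing), recording the web page is in effect recording the web page data of the DOM tree at each point in time. Each node of the DOM tree corresponds to an HTML element of the page, and the web page data is created from the DOM tree. When the traceback video is later generated, each of its frames can be rendered directly from the web page data recorded at the corresponding point in time.
As an example, the process by which the electronic device 10 creates web page data from the DOM tree of the web page 110 can include: traversing the node state of each node of the DOM tree to obtain the instantaneous state of the web page, and generating web page data from that instantaneous state. When the web page changes, the electronic device 10 can also use the MutationObserver application programming interface (API) to listen for changes in DOM tree node state, record the changed node state, and generate web page data for the changed web page from its DOM tree.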
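The traversal step in S201 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: `SourceNode`, `SnapNode`, `PageSnapshot`, `copyTree` and `takeSnapshot` are hypothetical names, and a plain object tree stands in for the browser's DOM so that the sketch runs outside a browser. In a browser, a MutationObserver callback would be a natural place to call `takeSnapshot` again after each change.

```typescript
// Hypothetical stand-in for a live DOM node; a browser would supply real nodes.
interface SourceNode {
  tag: string;                       // e.g. "input", or "#text" for text nodes
  attrs?: Record<string, string>;
  text?: string;
  children?: SourceNode[];
}

// Recorded copy of a node, detached from the live page.
interface SnapNode {
  tag: string;
  attrs: Record<string, string>;
  text: string;
  children: SnapNode[];
}

// One recorded state of the page: the tree plus the time it was captured.
interface PageSnapshot { pageId: string; capturedAt: number; tree: SnapNode; }

// Depth-first copy of every node, so later page changes cannot alter the record.
function copyTree(node: SourceNode): SnapNode {
  return {
    tag: node.tag,
    attrs: { ...(node.attrs ?? {}) },
    text: node.text ?? "",
    children: (node.children ?? []).map(copyTree),
  };
}

function takeSnapshot(pageId: string, root: SourceNode, now: number): PageSnapshot {
  return { pageId, capturedAt: now, tree: copyTree(root) };
}

// Simulated login page with one input box and one button.
const userInput = { tag: "input", attrs: { name: "user", value: "" } };
const loginPage: SourceNode = {
  tag: "form",
  children: [userInput, { tag: "button", children: [{ tag: "#text", text: "Log in" }] }],
};
const snapshotBefore = takeSnapshot("page-110", loginPage, 1000);
userInput.attrs.value = "zhangsan";  // simulated user input changes the page
const snapshotAfter = takeSnapshot("page-110", loginPage, 2000);
```

Each snapshot is an independent copy, so the two records keep the page state before and after the user's input even though the live tree was mutated in between.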
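S202: Serialize the web page data of the web page to generate serialized data corresponding to the web page data.
In some embodiments, to make the web page data easier to save, after recording the web page data the electronic device 10 serializes it, generates the corresponding serialized data, and saves that serialized data in a database. When a page traceback is needed, the serialized data can be retrieved quickly from the database.
As an example, the electronic device 10 can serialize the web page data through a serialization function to generate the corresponding serialized data. For example, the serialization function can be string serialize(mixed $value).
Serializing the web page data converts it into a format that can be stored (for example in a file, in memory, or in a database) or transmitted (for example over a network). When a page traceback is needed, the electronic device 10 can deserialize data in this format to restore the web page data.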
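The serialize/deserialize round trip can be sketched as below. The source names a `serialize(mixed $value)` style function; this sketch swaps in JSON (`JSON.stringify` and `JSON.parse`) as the storable format, which is an assumption made for illustration. The names `WebPageData`, `serializePageData` and `deserializePageData` are hypothetical.

```typescript
interface WebPageData {
  pageId: string;
  capturedAt: number;   // milliseconds since epoch
  domTree: unknown;     // any JSON-compatible tree structure
}

// Turn web page data into a string that can be stored or transmitted.
function serializePageData(data: WebPageData): string {
  return JSON.stringify(data);
}

// Restore web page data from its serialized form, rejecting malformed input.
function deserializePageData(serialized: string): WebPageData {
  const parsed = JSON.parse(serialized) as Partial<WebPageData> | null;
  if (parsed === null || typeof parsed.pageId !== "string" || typeof parsed.capturedAt !== "number") {
    throw new Error("serialized data is not valid web page data");
  }
  return { pageId: parsed.pageId, capturedAt: parsed.capturedAt, domTree: parsed.domTree };
}

const originalData: WebPageData = {
  pageId: "page-110",
  capturedAt: 1000,
  domTree: { tag: "form", children: [] },
};
const storedText = serializePageData(originalData);
const restoredData = deserializePageData(storedText);
```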
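S203: When a page traceback request is received, deserialize the serialized data to obtain the web page data.
In some embodiments, the traceback request may carry at least one of the following: the name of the person who sent the request, a traceback start time point, and a traceback end time point. For example, the person's name is used to determine whether that person has traceback permission; if so, the serialized data is deserialized to obtain the web page data. As another example, the requested start and end time points are checked to confirm that both are earlier than the current time point; if so, the serialized data is deserialized to obtain the web page data.
As an example, when a page traceback request is received, the electronic device 10 can deserialize the serialized data through a deserialization function to obtain the web page data. For example, the deserialization function can be mixed unserialize(string $str).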
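The request checks in S203 can be sketched as follows. This is a hedged illustration: `TracebackRequest`, `RequestCheck` and `checkTracebackRequest` are hypothetical names, the sketch assumes the request carries all three fields even though the text allows any subset, and the check that the start does not come after the end is an added sanity check not stated in the source.

```typescript
interface TracebackRequest {
  requester: string;   // name of the person sending the request
  startTime: number;   // traceback start time point (ms since epoch)
  endTime: number;     // traceback end time point (ms since epoch)
}

type RequestCheck = { ok: true } | { ok: false; reason: string };

// Decide whether the serialized data may be deserialized for this request.
function checkTracebackRequest(
  request: TracebackRequest,
  permitted: ReadonlySet<string>,
  now: number,
): RequestCheck {
  if (!permitted.has(request.requester)) {
    return { ok: false, reason: "requester has no traceback permission" };
  }
  if (request.startTime > request.endTime) {
    return { ok: false, reason: "start time is after end time" };  // added sanity check
  }
  if (request.startTime >= now || request.endTime >= now) {
    return { ok: false, reason: "time range is not before the current time" };
  }
  return { ok: true };
}

const permittedUsers = new Set<string>(["auditor-li"]);
const acceptedCheck = checkTracebackRequest(
  { requester: "auditor-li", startTime: 1000, endTime: 3000 }, permittedUsers, 5000);
const deniedCheck = checkTracebackRequest(
  { requester: "guest", startTime: 1000, endTime: 3000 }, permittedUsers, 5000);
const futureCheck = checkTracebackRequest(
  { requester: "auditor-li", startTime: 1000, endTime: 9000 }, permittedUsers, 5000);
```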
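S204: Generate a traceback video based on multiple consecutive frames of web page data.
In some embodiments, when generating the traceback video, the electronic device 10 identifies the consecutive frames of web page data that fall between the traceback start time point and the traceback end time point, and generates the traceback video from them. For example, the electronic device 10 can use a JavaScript native API (for example, requestAnimationFrame) to redraw each frame of web page data as an image, and then synthesize the redrawn frames as video frames in time order to obtain the traceback video.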
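Selecting the frames inside the requested window and redrawing them one per animation tick can be sketched as below. The names `RecordedFrame`, `FrameScheduler`, `selectFrames` and `replayFrames` are hypothetical, treating the window as inclusive at both ends is an assumption, and the scheduler is passed in as a parameter so that a stand-in can replace `requestAnimationFrame` outside a browser.

```typescript
interface RecordedFrame { capturedAt: number; pageData: string; }

// Same call shape as the browser's requestAnimationFrame.
type FrameScheduler = (callback: (timestamp: number) => void) => number;

// Keep frames inside [start, end] and order them by capture time.
function selectFrames(frames: RecordedFrame[], start: number, end: number): RecordedFrame[] {
  return frames
    .filter((frame) => frame.capturedAt >= start && frame.capturedAt <= end)
    .sort((a, b) => a.capturedAt - b.capturedAt);
}

// Redraw one frame per scheduler tick, in order.
function replayFrames(
  frames: RecordedFrame[],
  draw: (frame: RecordedFrame) => void,
  schedule: FrameScheduler,
): void {
  const queue = [...frames];
  const step = (): void => {
    const next = queue.shift();
    if (next === undefined) return;
    draw(next);
    if (queue.length > 0) schedule(step);
  };
  schedule(step);
}

const recordedFrames: RecordedFrame[] = [
  { capturedAt: 3000, pageData: "signed" },
  { capturedAt: 1000, pageData: "login" },
  { capturedAt: 9000, pageData: "logout" },
  { capturedAt: 2000, pageData: "form" },
];
const drawnOrder: string[] = [];
// Stand-in scheduler that runs the callback at once (for use outside a browser).
const runImmediately: FrameScheduler = (callback) => { callback(0); return 0; };
replayFrames(
  selectFrames(recordedFrames, 1000, 3000),
  (frame) => { drawnOrder.push(frame.pageData); },
  runImmediately,
);
```

In a browser, `(callback) => requestAnimationFrame(callback)` could be passed as the scheduler, and `draw` would rebuild the page from the frame's web page data and capture a screenshot for the video.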
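As the above description shows, the method obtains the DOM tree of the web page, uses it to create web page data, and serializes that data. Compared with the traditional approach of recording video of the web page and saving the recording, the page traceback method of Figure 2 is free of phone vendors' control over screen recording permissions and also saves bandwidth and storage cost. For example, a 3-minute video in mp4 format recorded with the traditional method produces a stored file of more than 10 megabytes (MByte, MB), whereas for a traceback video with the same content the method of Figure 2 produces a stored file of only a few hundred kilobits (Kbit, Kb).
However, the DOM tree of a web page describes the page's HTML elements. When the web page contains non-HTML elements (for example canvas elements or pdf files), saving only the HTML elements means the non-HTML elements may not be saved, so they would be missing when the page is traced back.
Therefore, this application also provides a page traceback method for web pages that contain canvas elements. When a web page containing a canvas element is rendered, the canvas's built-in image drawing properties and methods called by the system are monitored; the canvas element of the rendered page is reconstructed from the monitoring result, and the reconstructed canvas element is converted into encoded data, so that the canvas element is saved.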
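Figure 4 is another exemplary page traceback flowchart provided by this application. The method is executed by the electronic device 10 and specifically includes:
S401: Obtain the canvas element of the web page and the first DOM tree of the web page.
As an example, Figure 5A shows a schematic diagram of a web page 510 of the electronic device 10. As shown in Figure 5A, the web page 510 includes a variety of HTML elements and a canvas element 513. The HTML elements can be a picture element 511, a text element 512, a button element 514, and so on. The canvas element 513 can be used by the user to draw a signature. When the electronic device 10 detects the user drawing a signature, it responds by displaying the web page 520 shown in Figure 5B. As shown in Figure 5B, the web page 520 includes the canvas element 513, which displays the signature "Zhang San" drawn by the user.
In some embodiments, the HTML elements on the web page 510 of Figure 5A and the web page 520 of Figure 5B can be described in the form of a DOM tree. Taking the web page 520 of Figure 5B as the web page, the electronic device 10 obtains the first DOM tree of the web page 520. For the specific process, refer to S201 in Figure 2; it is not repeated here.
In some embodiments, the canvas element of a web page cannot be saved as an HTML element in the form of the first DOM tree. To save the canvas element when the web page contains one, the electronic device 10 can monitor the canvas element on the web page according to canvas's built-in drawing properties. When the system calls those drawing properties to draw the canvas element, the electronic device 10 uses the monitored calls to redraw the canvas element through the prototype framework, and converts the canvas element into encoded data.
In some embodiments, a JavaScript script can draw arbitrary graphics (2D or 3D) on the canvas element. A canvas element has two attributes, "width" and "height", which set the width and height of the canvas. For 2D drawing on a canvas element, the browser can draw various graphics through the drawing properties and methods provided by CanvasRenderingContext2D, which offers a variety of drawing functions; calling the corresponding function draws the corresponding graphic on the canvas.
For example, the set of functions that the CanvasRenderingContext2D object provides for drawing on the canvas includes strokeRect(), fillRect(), drawImage(), moveTo(), drawWidgetAsOnScreen(), lineWidth(), and so on. strokeRect() draws the outline of a rectangle, fillRect() draws a filled rectangle, drawImage() draws an image, moveTo() sets the current position and starts a new subpath, drawWidgetAsOnScreen() draws a widget on the screen, and lineWidth() represents the thickness of lines on the canvas.
When the electronic device 10 detects the user drawing on the canvas element, the browser calls the functions provided by the CanvasRenderingContext2D object to produce the user's drawing on the canvas. At the same time, the electronic device 10 can monitor which of those functions the browser calls, redraw the changed canvas element through the prototype framework, and convert the canvas element into encoded data. The canvas element of the web page is thereby saved, so that it is not lost when the web page is later traced back.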
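One way to monitor drawing calls and later rebuild the drawing is to wrap methods on the context's prototype, record each call, and replay the log onto a fresh context. The sketch below reads the text's "prototype framework" as JavaScript's prototype mechanism, which is an assumption. `FakeContext2D` is a hypothetical stand-in for CanvasRenderingContext2D so that the sketch runs outside a browser, and `monitorDrawCalls` and `replayDrawCalls` are illustrative names.

```typescript
type DrawCall = { method: string; args: unknown[] };

// Stand-in for CanvasRenderingContext2D: it just remembers what was painted.
class FakeContext2D {
  readonly painted: string[] = [];
  moveTo(x: number, y: number): void { this.painted.push(`moveTo(${x},${y})`); }
  fillRect(x: number, y: number, w: number, h: number): void {
    this.painted.push(`fillRect(${x},${y},${w},${h})`);
  }
}

// Wrap the named prototype methods so every call is logged before it runs.
// Returns a function that removes the wrappers again.
function monitorDrawCalls(proto: object, methods: string[], log: DrawCall[]): () => void {
  const target = proto as Record<string, unknown>;
  const undo: Array<() => void> = [];
  for (const method of methods) {
    const originalFn = target[method];
    if (typeof originalFn !== "function") continue;
    const callOriginal = originalFn as (...args: unknown[]) => unknown;
    target[method] = function (this: unknown, ...args: unknown[]): unknown {
      log.push({ method, args });
      return callOriginal.apply(this, args);
    };
    undo.push(() => { target[method] = originalFn; });
  }
  return () => undo.forEach((restore) => restore());
}

// Rebuild a drawing by replaying the recorded calls onto another context.
function replayDrawCalls(ctx: object, log: DrawCall[]): void {
  const target = ctx as Record<string, unknown>;
  for (const call of log) {
    const fn = target[call.method];
    if (typeof fn === "function") {
      (fn as (...args: unknown[]) => unknown).apply(ctx, call.args);
    }
  }
}

const drawLog: DrawCall[] = [];
const stopMonitoring = monitorDrawCalls(FakeContext2D.prototype, ["moveTo", "fillRect"], drawLog);
const liveContext = new FakeContext2D();
liveContext.moveTo(5, 5);             // simulated user drawing
liveContext.fillRect(0, 0, 10, 20);
stopMonitoring();                     // stop before replaying, so replay is not logged
const rebuiltContext = new FakeContext2D();
replayDrawCalls(rebuiltContext, drawLog);
```

In a browser, `CanvasRenderingContext2D.prototype` would take the place of `FakeContext2D.prototype`. Monitoring is stopped before the replay so that replayed calls are not appended to the log being iterated.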
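S402: Convert the canvas element into encoded data, and generate the second DOM tree of the web page from the encoded data corresponding to the canvas element and the first DOM tree of the web page.
In some embodiments, the electronic device 10 can convert the canvas element into encoded data and then save that encoded data on a node of the first DOM tree, generating the second DOM tree of the web page. The second DOM tree thus includes the HTML element data and the encoded data corresponding to the canvas element.
Specifically, for example, the electronic device 10 can call the canvas function toDataURL() to convert the reconstructed canvas element into base64-encoded data (for example, a data URL string). Base64 is a way of representing binary data using 64 printable characters, and is commonly used to encode image data as string data.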
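To make the shape of such encoded data concrete, the sketch below builds a base64 data URL from raw bytes. It is an illustration of the encoding only, not of how a browser implements toDataURL(); `toBase64` and `toDataUrl` are hypothetical helper names, and the four sample bytes are the first bytes of the PNG file signature.

```typescript
const BASE64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode bytes as base64: every 3 bytes become 4 characters, padded with "=".
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i] ?? 0;
    const b1 = bytes[i + 1] ?? 0;
    const b2 = bytes[i + 2] ?? 0;
    const triple = (b0 << 16) | (b1 << 8) | b2;
    out += BASE64_ALPHABET.charAt((triple >> 18) & 63);
    out += BASE64_ALPHABET.charAt((triple >> 12) & 63);
    out += i + 1 < bytes.length ? BASE64_ALPHABET.charAt((triple >> 6) & 63) : "=";
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(triple & 63) : "=";
  }
  return out;
}

// Wrap base64 text in the data URL form that an img element's src accepts.
function toDataUrl(mimeType: string, bytes: Uint8Array): string {
  return `data:${mimeType};base64,${toBase64(bytes)}`;
}

const pngSignatureStart = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
const sampleDataUrl = toDataUrl("image/png", pngSignatureStart);
```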
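In some embodiments, after the reconstructed canvas element is converted into encoded data, the encoded data can be saved on a node of the first DOM tree to generate the second DOM tree of the web page. During page traceback, the encoded data of the canvas element in the second DOM tree can be requested and loaded along with the code of the second DOM tree. This spares the electronic device 10 from saving and loading the reconstructed canvas element separately, and improves page traceback efficiency.
In some embodiments, the electronic device 10 can generate the second DOM tree of the web page from the encoded data corresponding to the canvas element and the first DOM tree of the web page. Specifically, the electronic device 10 can save the encoded data on a node of the first DOM tree and record the tag and attribute of the node holding it (for example, the img tag and the src attribute), generating the second DOM tree. During page traceback, the electronic device 10 can use the img tag and src attribute to retrieve the encoded data of the canvas element from the corresponding node of the second DOM tree.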
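Storing the encoded data on an img node and finding it again by tag and attribute can be sketched as follows. This is a minimal illustration on a plain object tree: `RecordedNode`, `embedCanvasData` and `collectImageSources` are hypothetical names, and looking canvases up by an `id` attribute is an assumption made for the example.

```typescript
interface RecordedNode { tag: string; attrs: Record<string, string>; children: RecordedNode[]; }

// Build the second tree: each canvas node with known encoded data becomes an
// img node whose src attribute holds that data; other nodes are copied.
function embedCanvasData(node: RecordedNode, encodedById: Map<string, string>): RecordedNode {
  const id = node.attrs["id"];
  const encoded = node.tag === "canvas" && id !== undefined ? encodedById.get(id) : undefined;
  if (encoded !== undefined) {
    return { tag: "img", attrs: { ...node.attrs, src: encoded }, children: [] };
  }
  return {
    tag: node.tag,
    attrs: { ...node.attrs },
    children: node.children.map((child) => embedCanvasData(child, encodedById)),
  };
}

// During traceback: collect encoded data by looking for img tags with a src attribute.
function collectImageSources(node: RecordedNode): string[] {
  const found: string[] = [];
  const src = node.attrs["src"];
  if (node.tag === "img" && src !== undefined) found.push(src);
  for (const child of node.children) found.push(...collectImageSources(child));
  return found;
}

const firstTree: RecordedNode = {
  tag: "div",
  attrs: {},
  children: [
    { tag: "p", attrs: {}, children: [] },
    { tag: "canvas", attrs: { id: "signature" }, children: [] },
  ],
};
const encodedCanvases = new Map<string, string>([["signature", "data:image/png;base64,iVBORw=="]]);
const secondTree = embedCanvasData(firstTree, encodedCanvases);
const recoveredSources = collectImageSources(secondTree);
```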
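In some embodiments, the encoded data corresponding to the canvas element is saved on a node of the DOM tree, and converting the canvas element into encoded data is not limited to the canvas function toDataURL(). In other embodiments, other encoding methods can be used to convert the canvas element into corresponding encoded data. This application does not limit the specific method, which can be chosen according to the actual application.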
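S403: Create the second web page data of the web page according to the second DOM tree of the web page. For details, refer to step S201 in Figure 2; they are not repeated here.
In some embodiments, the second DOM tree of the web page includes the encoded data corresponding to the canvas element and the data of the HTML elements. The electronic device 10 can create the second web page data of the web page from the second DOM tree, and the second web page data can include the second DOM tree data. Specifically, for example, the electronic device 10 can traverse the node state of each node of the second DOM tree to obtain the instantaneous state of the web page, and generate web page data from that instantaneous state.
S404: Serialize the second web page data of the web page to generate serialized data corresponding to the second web page data. For details, refer to step S202 in Figure 2; they are not repeated here.
S405: When a page traceback request is received, deserialize the serialized data of the web page to obtain the second web page data of the web page. For details, refer to step S203 in Figure 2; they are not repeated here.
S406: Generate a traceback video based on the second web page data of multiple consecutive frames of the web page.
Taking the web page 520 of Figure 5B as an example, the second web page data of the web page 520 includes the second DOM tree data, tag data of the web page 520, and so on. The electronic device 10 can redraw the web page 520 from the second DOM tree data in its second web page data and generate a screenshot of the web page 520.
Specifically, the electronic device 10 can use the tag and attribute information of the DOM tree nodes to obtain, from the second web page data of the web page 520, the encoded data corresponding to the canvas element 513 on the web page 520. In the same way, it can obtain the HTML element data on the web page 520 from the second web page data.
In some embodiments, the electronic device 10 can use a JavaScript native API (for example, requestAnimationFrame) to redraw the web page 520 from the HTML element data and the encoded data of the canvas element in the second web page data, generating a screenshot of the web page 520. Finally, based on the time order in which each piece of second web page data was generated, or on web page tag information, the consecutive web page screenshots are synthesized as video frames to obtain the traceback video.
The page traceback method of Figure 4 describes how to save the content of a web page for later traceback when the page contains non-HTML elements in the form of canvas elements. This application also provides a page traceback method for the case where the web page contains non-HTML elements in the form of pdf files: the pdf file of the web page can be converted into canvas elements using document image conversion technology, and the converted canvas elements are saved, which avoids losing the pdf file of the web page during page traceback.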
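Figure 6 is an exemplary page traceback flowchart provided by this application. The method is executed by the electronic device 10 and specifically includes:
S601: Obtain the pdf file of the web page and the first DOM tree of the web page, and convert the pdf file of the web page into a canvas element.
As an example, Figure 7 shows a schematic diagram of a web page 710 of the electronic device 10. As shown in Figure 7, the web page 710 includes a variety of HTML elements and a pdf file 712. The HTML elements can be a text element 711, a button element 713, a sliding window element 714, a sliding window element 715, and so on. The pdf file 712 can be used to display the content of the "Insurance Policy Delivery Confirmation" that the user needs to browse.
Taking the web page 710 of Figure 7 as the web page, the HTML elements on the web page 710 can be described in the form of a DOM tree, and the electronic device 10 obtains the DOM tree of the web page 710. For the specific process, refer to S201 in Figure 2; it is not repeated here.
In some embodiments, the pdf file of a web page cannot be saved as an HTML element in the form of the DOM tree. To save the pdf file when the web page contains one, the electronic device 10 can obtain the pdf file and convert it into a canvas element using document image conversion technology. For example, the electronic device 10 can render the pdf file into a canvas element through Mozilla's open source PDF.js technology. The PDF.js framework performs the conversion with pure HTML5 technology and needs no native support; it works in any browser that supports HTML5, so compatibility is good.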
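S602: Convert the canvas element into encoded data, and generate the third DOM tree of the web page from the encoded data corresponding to the canvas element and the first DOM tree of the web page. For the specific process, refer to step S402 in Figure 4; it is not repeated here.
S603: Create the third web page data of the web page according to the third DOM tree of the web page. For details, refer to step S201 in Figure 2; they are not repeated here.
S604: Serialize the third web page data of the web page to generate serialized data corresponding to the third web page data. For details, refer to step S202 in Figure 2; they are not repeated here.
S605: When a page traceback request is received, deserialize the serialized data of the web page to obtain the third web page data of the web page. For details, refer to step S203 in Figure 2; they are not repeated here.
S606: Generate a traceback video based on the third web page data of multiple consecutive frames of the web page.
Taking the web page 710 of Figure 7 as an example, the third web page data of the web page 710 includes the third DOM tree data, tag data of the web page 710, and so on. The electronic device 10 can redraw the web page 710 from the third DOM tree data in its third web page data and generate a screenshot of the web page 710.
Specifically, the electronic device 10 can use the tag and attribute information of the DOM tree nodes to obtain, from the third web page data of the web page 710, the encoded data corresponding to the canvas element converted from the pdf file 712 on the web page 710. In the same way, it can obtain the HTML element data on the web page 710 from the third web page data.
In some embodiments, the electronic device 10 can use a JavaScript native API (for example, requestAnimationFrame) to redraw the web page 710 from the HTML element data in the third web page data and the encoded data of the canvas element converted from the pdf file 712, generating a screenshot of the web page 710. Finally, based on the time order in which each piece of third web page data was generated, or on web page tag information, the consecutive web page screenshots are synthesized as video frames to obtain the traceback video.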
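FIG. 8 is a structural block diagram of a page backtracking apparatus 800 according to some embodiments of the present application. As shown in FIG. 8, the apparatus includes:
An obtaining module (802), configured to obtain multiple elements of a first web page, the multiple elements including at least one first-type element and at least one second-type element, where the first-type element includes at least one of a canvas element and a pdf element, and the second-type element includes an HTML element.
A reconstruction module (804), configured to reconstruct the first-type element based on the monitored drawing function called to generate the first-type element, and to save the reconstructed first-type element.
A creation module (806), configured to generate, for the reconstructed first-type element and the second-type element of the first web page, a DOM tree corresponding to the first web page, and to create first web page data of the first web page based on the DOM tree.
A storage module (808), configured to store the first web page data.
A backtracking module (810), configured to redraw the first web page data to backtrack the first web page.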
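In some embodiments, the backtracking module redrawing the first web page data to backtrack the first web page includes: redrawing the first web page data through a native JavaScript application programming interface to backtrack the first web page, where the native JavaScript application programming interface is the requestAnimationFrame application programming interface.
In some embodiments, the creation module creating the first web page data of the first web page according to the multiple elements of the first web page includes: the multiple elements of the web page further include a second-type element, and the DOM tree of the first web page is used to describe the second-type element and the first-type element of the first web page, where the second-type element includes an HTML element.
In some embodiments, the drawing function called to generate the first-type element, as handled by the creation module, includes at least one of the following drawing functions provided by CanvasRenderingContext2D: strokeRect(), fillRect(), drawImage(), moveTo(), drawWidgetAsOnScreen(), lineWidth(). The creation module reconstructing and saving the first-type element based on the monitored drawing function called to generate the first-type element includes: based on the monitored drawing function called to generate the first-type element, redrawing the first-type element through the prototype framework, and saving the first-type element. The creation module converts the first-type element into encoded data by calling the function toDataURL() and saves the encoded data.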
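One way to monitor drawing calls through a prototype is to wrap the methods of interest so that each call is logged and then forwarded to the original implementation. The sketch below illustrates that general technique under stated assumptions; it is not confirmed to be the applicant's mechanism, and `RecordedCall`, `monitorPrototype`, and `replayCalls` are hypothetical names. Members that are not functions are skipped, which matters because in the standard Canvas 2D API `lineWidth` is a property rather than a method.

```typescript
// One logged drawing call: method name plus the arguments it received.
interface RecordedCall {
  method: string;
  args: unknown[];
}

// Wrap the named methods on a prototype so that every call is appended to
// `log` and then forwarded unchanged. Returns a function that restores the
// original methods.
function monitorPrototype(
  proto: Record<string, any>,
  methods: string[],
  log: RecordedCall[],
): () => void {
  const originals: Record<string, Function> = {};
  for (const method of methods) {
    const original = proto[method];
    if (typeof original !== "function") {
      continue; // skip properties and names that do not exist
    }
    originals[method] = original;
    proto[method] = function (this: unknown, ...args: unknown[]) {
      log.push({ method: method, args: args });
      return original.apply(this, args);
    };
  }
  return () => {
    for (const method of Object.keys(originals)) {
      proto[method] = originals[method];
    }
  };
}

// Re-issue the recorded calls against another context to rebuild the drawing.
function replayCalls(target: Record<string, any>, log: RecordedCall[]): void {
  for (const call of log) {
    target[call.method](...call.args);
  }
}
```

In a browser, the prototype argument might be `CanvasRenderingContext2D.prototype` (cast to `Record<string, any>`), and the rebuilt canvas could then be encoded with its `toDataURL()` method, as the paragraph above describes.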
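In some embodiments, the creation module reconstructing and saving the first-type element based on the monitored drawing function called to generate the first-type element further includes: when the first-type element is a pdf element, rendering the pdf element into a canvas element using PDF.js, which is open-sourced by Mozilla, and reconstructing and saving the pdf element based on the monitored drawing function called to generate the canvas element.
In some embodiments, the storage module storing the first web page data includes: serializing the first web page data to generate serialized first web page data, and storing the serialized first web page data. The first web page data is serialized by a serialization function to generate the serialized first web page data, where the serialization function includes the string serialize(mixed$value) function.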
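The serialize-then-store and load-then-deserialize round trip can be sketched as below. The source names a `serialize` function; this example substitutes JSON (`JSON.stringify` and `JSON.parse`) purely for illustration, and the `PageSnapshot` fields and both function names are assumptions rather than the applicant's data format.

```typescript
// Hypothetical shape for one piece of web page data.
interface PageSnapshot {
  url: string;
  capturedAt: number;
  domTree: unknown;
}

// Turn a snapshot into a string that can be written to storage.
function serializeSnapshot(snapshot: PageSnapshot): string {
  return JSON.stringify(snapshot);
}

// Parse stored text back into a snapshot, rejecting malformed input.
function deserializeSnapshot(text: string): PageSnapshot {
  const value: unknown = JSON.parse(text);
  if (typeof value !== "object" || value === null) {
    throw new TypeError("snapshot must be an object");
  }
  const candidate = value as Record<string, unknown>;
  const url = candidate["url"];
  const capturedAt = candidate["capturedAt"];
  if (typeof url !== "string" || typeof capturedAt !== "number" || !("domTree" in candidate)) {
    throw new TypeError("snapshot is missing required fields");
  }
  return { url: url, capturedAt: capturedAt, domTree: candidate["domTree"] };
}
```

Validating on the way back in is a design choice: stored data may be truncated or come from an older format, and failing early is usually easier to diagnose than a redraw that breaks halfway through.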
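It can be understood that the page backtracking apparatus 800 shown in FIG. 8 corresponds to the page backtracking method provided by the present application. The technical details in the above description of the page backtracking method also apply to the page backtracking apparatus 800 shown in FIG. 8; for details, see the description above, which is not repeated here.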
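FIG. 9 is a schematic structural diagram of an electronic device 10 according to some embodiments of the present application. As shown in FIG. 9, the electronic device 10 includes one or more processors 101, a system memory 102, a non-volatile memory (NVM) 103, a communication interface 104, an input/output (I/O) device 105, and system control logic 106 for coupling the processor 101, the system memory 102, the non-volatile memory 103, the communication interface 104, and the input/output (I/O) device 105. Specifically:
The processor 101 may include one or more processing units, for example, data processing units or processing circuits such as a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a microprocessor MCU (Micro-programmed Control Unit), an artificial intelligence (AI) processor, a field programmable gate array (FPGA) programmable logic device, or a neural-network processing unit (NPU), and may include one or more single-core or multi-core processors. In some embodiments, the processor 101 may be configured to execute instructions to implement the functions described above of obtaining the DOM tree of the web page and creating the web page data of the web page. The processor 101 may also be configured to execute instructions to implement the functions described above of obtaining the canvas element, converting the canvas element into encoded data, and generating the second DOM tree of the web page according to the encoded data corresponding to the canvas element and the first DOM tree of the web page. The processor 101 may also be configured to execute instructions to implement the functions described above of obtaining the pdf file of the web page and the first DOM tree of the web page, and converting the pdf file of the web page into a canvas element.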
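The system memory 102 is volatile memory, such as random-access memory (RAM) or double data rate synchronous dynamic random access memory (DDR SDRAM). The system memory is used to temporarily store data and/or instructions.
The non-volatile memory 103 may include one or more tangible, non-transitory computer-readable media for storing data and/or instructions. In some embodiments, the non-volatile memory 103 may include any suitable non-volatile memory such as flash memory, and/or any suitable non-volatile storage device, such as a hard disk drive (HDD), a compact disc (CD), a digital versatile disc (DVD), or a solid-state drive (SSD). In some embodiments, the non-volatile memory 103 may also be a removable storage medium, such as a secure digital (SD) memory card. In other embodiments, the non-volatile memory 103 may be used to cache the web page and/or the serialized data of the web page.
In particular, the system memory 102 and the non-volatile memory 103 may respectively include a temporary copy and a permanent copy of instructions 107. The instructions 107 may include instructions that, when executed by at least one of the processors 101, cause the electronic device 10 to implement the page backtracking method provided by the embodiments of the present application.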
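The communication interface 104 may include a transceiver that provides a wired or wireless communication interface for the electronic device 10, through which it communicates with any other suitable device over one or more networks. In some embodiments, the communication interface 104 may be integrated with other components of the electronic device 10; for example, the communication interface 104 may be integrated in the processor 101.
The input/output (I/O) device 105 may include input devices such as a keyboard and a mouse, and output devices such as a display. A user can interact with the electronic device 10 through the input/output (I/O) device 105; for example, the user can operate on a web page through the input/output (I/O) device 105.
The system control logic 106 may include any suitable interface controller to provide any suitable interface to other modules of the electronic device 10. For example, in some embodiments, the system control logic 106 may include one or more memory controllers to provide interfaces to the system memory 102 and the non-volatile memory 103.
In some embodiments, at least one of the processors 101 may be packaged together with logic for one or more controllers of the system control logic 106 to form a system in package (SiP). In other embodiments, at least one of the processors 101 may be integrated on the same chip with logic for one or more controllers of the system control logic 106 to form a system-on-chip (SoC).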
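It can be understood that the structure of the electronic device 10 shown in FIG. 9 is only an example. In other embodiments, the electronic device 10 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
It can be understood that, in some embodiments, the electronic device 200 may have the same or a similar structure as the electronic device 10, may include more or fewer components than the electronic device 10, or may have another structure; this is not limited in the embodiments of the present application.
An embodiment of the present application further provides a program product for implementing the page backtracking method provided by the above embodiments.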
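The embodiments of the mechanisms disclosed in the present application may be implemented in hardware, software, firmware, or a combination of these implementation approaches. The embodiments of the present application may be implemented as computer modules or module code executed on a programmable system, the programmable system including at least one processor, a storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
The module code may be applied to input instructions to perform the functions described in the present application and generate output information. The output information may be applied to one or more output devices in a known manner. For the purposes of the present application, a processing system includes any system having a processor such as a digital signal processor (DSP), a microcontroller, an application specific integrated circuit (ASIC), or a microprocessor.
The module code may be implemented in a high-level modular language or an object-oriented programming language to communicate with the processing system. The module code may also be implemented in assembly language or machine language when needed. In fact, the mechanisms described in the present application are not limited in scope to any particular programming language. In any case, the language may be a compiled language or an interpreted language.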
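In some cases, the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (for example, computer-readable) storage media, which may be read and executed by one or more processors. For example, the instructions may be distributed over a network or through other computer-readable media. Thus, a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (for example, a computer), including but not limited to floppy disks, optical discs, compact disc read-only memories (CD-ROMs), magneto-optical disks, read only memory (ROM), random access memory (RAM), erasable programmable read only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic or optical cards, flash memory, or tangible machine-readable storage used to transmit information over the Internet by means of electrical, optical, acoustic, or other forms of propagated signals (for example, carrier waves, infrared signals, digital signals, and the like). Accordingly, a machine-readable medium includes any type of machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (for example, a computer).
In the drawings, some structural or method features may be shown in a specific arrangement and/or order. However, it should be understood that such a specific arrangement and/or ordering may not be required. Rather, in some embodiments, these features may be arranged in a manner and/or order different from that shown in the illustrative drawings. In addition, the inclusion of a structural or method feature in a particular figure does not imply that such a feature is required in all embodiments; in some embodiments, the feature may be omitted or may be combined with other features.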
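It should be noted that the units/modules mentioned in the device embodiments of the present application are all logical units/modules. Physically, a logical unit/module may be one physical unit/module, may be part of one physical unit/module, or may be implemented as a combination of multiple physical units/modules. The physical implementation of these logical units/modules is not what matters most; the combination of functions implemented by these logical units/modules is the key to solving the technical problem raised by the present application. In addition, to highlight the innovative part of the present application, the above device embodiments do not introduce units/modules that are not closely related to solving the technical problem raised by the present application, which does not mean that no other units/modules exist in the above device embodiments.
It should be noted that, in the examples and the specification of this patent, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a" does not exclude the existence of additional identical elements in the process, method, article, or device that includes the element.
Although the present application has been illustrated and described with reference to certain preferred embodiments thereof, those of ordinary skill in the art should understand that various changes in form and detail may be made without departing from the spirit and scope of the present application.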

Claims (12)

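  1. A page backtracking method, applied to an electronic device, characterized in that the method comprises:
    obtaining multiple elements of a first web page, the multiple elements including at least one first-type element and at least one second-type element, wherein the first-type element includes at least one of a canvas element and a pdf element, and the second-type element includes an HTML element;
    reconstructing the first-type element based on the monitored drawing function called to generate the first-type element, and saving the reconstructed first-type element;
    generating, for the reconstructed first-type element and the second-type element of the first web page, a DOM tree corresponding to the first web page, and creating first web page data of the first web page based on the DOM tree;
    storing the first web page data;
    redrawing the first web page data to backtrack the first web page.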
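  2. The method according to claim 1, characterized in that, when the first-type element includes a canvas element, reconstructing the first-type element based on the monitored drawing function called to generate the first-type element comprises:
    determining that the monitored browser calls a first drawing function to generate the canvas element;
    redrawing the canvas element through the prototype framework based on the first drawing function.
  3. The method according to claim 1, characterized in that, when the first-type element includes a pdf element, reconstructing the first-type element based on the monitored drawing function called to generate the first-type element comprises:
    rendering the pdf element into a corresponding canvas element using PDF.js;
    determining that the monitored browser calls a second drawing function to render the pdf element into the corresponding canvas element;
    redrawing, through the prototype framework and based on the second drawing function, the canvas element corresponding to the pdf element.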
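  4. The method according to claim 1, characterized in that redrawing the first web page data to backtrack the first web page comprises:
    redrawing the first web page data through a native JavaScript application programming interface to backtrack the image corresponding to the first web page.
  5. The method according to claim 4, characterized in that the native JavaScript application programming interface is the requestAnimationFrame application programming interface.
  6. The method according to claim 1, characterized in that the monitored drawing function called to generate the first-type element comprises:
    at least one of the following drawing functions provided by CanvasRenderingContext2D: strokeRect(), fillRect(), drawImage(), moveTo(), drawWidgetAsOnScreen(), lineWidth().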
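  7. The method according to claim 1, characterized in that generating, for the reconstructed first-type element and the second-type element of the first web page, the DOM tree corresponding to the first web page comprises:
    converting the reconstructed first-type element into encoded data by calling the function toDataURL();
    generating, for the encoded data and the second-type element of the first web page, the DOM tree corresponding to the first web page.
  8. The method according to claim 1, characterized in that storing the first web page data comprises:
    serializing the first web page data to generate serialized first web page data;
    storing the serialized first web page data.
  9. The method according to claim 8, characterized in that the first web page data is serialized by a serialization function to generate the serialized first web page data, wherein the serialization function includes the string serialize(mixed$value) function.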
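  10. A page backtracking apparatus, characterized in that the apparatus comprises:
    an obtaining module, configured to obtain multiple elements of a first web page, the multiple elements including at least one first-type element and at least one second-type element, wherein the first-type element includes at least one of a canvas element and a pdf element, and the second-type element includes an HTML element;
    a reconstruction module, configured to reconstruct the first-type element based on the monitored drawing function called to generate the first-type element, and to save the reconstructed first-type element;
    a creation module, configured to generate, for the reconstructed first-type element and the second-type element of the first web page, a DOM tree corresponding to the first web page, and to create first web page data of the first web page based on the DOM tree;
    a storage module, configured to store the first web page data;
    a backtracking module, configured to redraw the first web page data to backtrack the first web page.
  11. A readable medium, characterized in that instructions are stored on the readable medium, and the instructions, when executed on an electronic device, cause the electronic device to perform the page backtracking method according to any one of claims 1 to 9.
  12. An electronic device, comprising: a memory, configured to store instructions to be executed by a processor of the electronic device, and
    a processor, being one of the processors of the electronic device, configured to perform the page backtracking method according to any one of claims 1 to 9.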
PCT/CN2023/079244 2022-03-23 2023-03-02 Page backtracking method and apparatus, medium, and electronic device WO2023179327A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210291790.8 2022-03-23
CN202210291790.8A CN114861103B (zh) 2022-03-23 2022-03-23 Page backtracking method and apparatus, medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2023179327A1 true WO2023179327A1 (zh) 2023-09-28

Family

ID=82628472

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/079244 WO2023179327A1 (zh) 2022-03-23 2023-03-02 Page backtracking method and apparatus, medium, and electronic device

Country Status (2)

Country Link
CN (1) CN114861103B (zh)
WO (1) WO2023179327A1 (zh)

Also Published As

Publication number Publication date
CN114861103A (zh) 2022-08-05
CN114861103B (zh) 2023-11-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23773578

Country of ref document: EP

Kind code of ref document: A1