WO2021184896A1

WO2021184896A1 - Page screenshot method and device

Info

Publication number: WO2021184896A1
Application number: PCT/CN2020/140556
Authority: WO
Inventors: 韩喆; 王春霈
Original assignee: 支付宝(杭州)信息技术有限公司
Priority date: 2020-03-20
Filing date: 2020-12-29
Publication date: 2021-09-23
Also published as: CN111428162A

Abstract

A page screenshot method and device. The method comprises: analyzing a screenshot request initiated by a user, so as to obtain a uniform resource locator (URL) of a page to be subjected to screenshot; loading the page according to the URL; in a loading process, determining whether loading of page elements related to the screenshot is completed; and in response to completion of loading of the page elements related to the screenshot, stopping loading, and performing screenshot on the loaded part in the page.

Description

Method and device for screenshot of page

Technical field

This application relates to the field of computer application technology, and in particular to a method and device for screenshots of pages.

Background

Nowadays, some netizens use the Internet to carry out illegal activities such as plagiarism and misappropriation of other people's works, creating rumors, and selling prohibited items, which have brought bad social effects. In order to determine the responsibility of the offender, it is often necessary to conduct electronic evidence collection on the relevant pages. Page screenshots are a feasible method of electronic forensics. By using this method, the content displayed on pages related to illegal activities can be saved in the form of pictures, which can then be used as electronic evidence.

Summary of the invention

This application discloses a method and device for screenshots of a page.

According to a first aspect of the embodiments of the present application, a page screenshot method is disclosed, which includes: in response to a screenshot request initiated by a user, obtaining the uniform resource locator (URL) of the page to be screenshotted; loading the page to be screenshotted according to the URL and, during loading, determining whether the page elements related to the screenshot have finished loading, wherein the page elements related to the screenshot are page elements specified by the user; and, in response to the page elements related to the screenshot having finished loading, stopping the loading of the page to be screenshotted and taking a screenshot of the loaded part of that page.

According to a second aspect of the embodiments of the present application, a page screenshot method is disclosed, including: in response to a screenshot request initiated by a user, obtaining the uniform resource locator (URL) of the page to be screenshotted; loading the page to be screenshotted according to the URL; when the loading of the page to be screenshotted is completed, preprocessing the page, wherein the preprocessing includes deleting screenshot interference elements from the page; and taking a screenshot of the page after the preprocessing is completed.

According to a third aspect of the embodiments of the present application, a page screenshot device is disclosed, including: a URL acquisition module, which, in response to a screenshot request initiated by a user, obtains the uniform resource locator (URL) of the page to be screenshotted; a page loading module, which loads the page to be screenshotted according to the URL and, during loading, determines whether the page elements related to the screenshot have finished loading, wherein the page elements related to the screenshot are page elements specified by the user; and an execution module, which, in response to the page elements related to the screenshot having finished loading, stops the loading of the page to be screenshotted and takes a screenshot of the loaded part of that page.

According to a fourth aspect of the embodiments of the present application, a page screenshot device is disclosed, including: a URL acquisition module, which, in response to a screenshot request initiated by a user, obtains the uniform resource locator (URL) of the page to be screenshotted; a page loading module, which loads the page to be screenshotted according to the URL; a page preprocessing module, which preprocesses the page to be screenshotted when its loading is completed, wherein the preprocessing includes deleting screenshot interference elements from the page; and a screenshot execution module, which takes a screenshot of the preprocessed page.

In the above technical solution, whether the page elements related to the screenshot have finished loading is determined while the page to be screenshotted is loading, which reduces the loading of useless resources. On the one hand, this reduces the waste of computer resources; on the other hand, it reduces the useless information in the screenshot result and increases the proportion of required information in it.

Description of the drawings

The drawings here are incorporated into the specification and constitute a part of the specification, show embodiments conforming to the specification, and are used to explain the principle together with the text of the specification.

Fig. 1 is an example flow chart of a page screenshot method shown in this specification;

Fig. 2 is a schematic diagram of judging whether page elements related to screenshots are loaded as shown in this specification;

Fig. 3 is an example diagram, shown in this specification, comparing an original page with the image obtained by the screenshot;

Fig. 4 is a structural example diagram of a page screenshot device shown in this specification;

FIG. 5 is a structural example diagram of an electronic device for taking screenshots of a page shown in this specification;

Fig. 6 is an example flow chart of a page screenshot method shown in this specification;

FIG. 7 is an example diagram of page comparison before and after page preprocessing shown in this specification;

Fig. 8 is a schematic diagram of a process for judging whether page elements related to screenshots have been loaded as shown in this specification;

Fig. 9 is a structural example diagram of a page screenshot device shown in this specification;

Fig. 10 is a structural example diagram of an electronic device for taking a screenshot of a page shown in this specification.

Detailed description

To enable those skilled in the art to better understand the technical solutions in one or more embodiments of this specification, the technical solutions are described clearly and completely below with reference to the drawings in one or more embodiments of this specification. Obviously, the described embodiments are only some of the embodiments, not all of them. All other embodiments obtained by a person of ordinary skill in the art without creative work, based on one or more embodiments of this specification, shall fall within the protection scope of this application.

When the following description refers to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The implementation manners described in the following exemplary embodiments do not represent all implementation manners consistent with this specification. Rather, they are merely examples of systems and methods consistent with some aspects of this specification as detailed in the appended claims.

The terms used in this specification are only for the purpose of describing specific embodiments, and are not intended to limit the specification. The singular forms of "a", "said" and "the" used in this specification and appended claims are also intended to include plural forms, unless the context clearly indicates other meanings. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more associated listed items.

It should be understood that although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other. For example, without departing from the scope of this specification, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information. Depending on the context, the word "if" as used herein can be interpreted as "at the time of", "when", or "in response to a determination".

In practice, a screenshot of the page indicated by the user is usually taken by calling the browser's webpage screenshot function. However, when this approach is applied to long webpages, the resulting screenshot often includes a lot of information unrelated to the purpose of the screenshot; and when it is applied to "scroll-to-load" webpages (such as Weibo or Tieba feeds), the page may grow so large that the program crashes.

In view of this, this specification discloses a technical solution that dynamically determines whether the page loading needs to be stopped during the loading process of the page to be screenshotted, and after the page loading is stopped, only the part that has been loaded is captured.

In implementation, while the page to be screenshotted is loading, it is determined whether the page elements related to the screenshot have finished loading. In response to those elements having finished loading, the loading of the page can be stopped; after that, a screenshot is taken of only the loaded part of the page.

In the above technical solution, on the one hand, because page loading is stopped in time, program crashes when taking a screenshot of a very long page can be avoided; on the other hand, because page loading is stopped only after the page elements related to the screenshot have finished loading, the irrelevant content in the resulting image is reduced while the image is still guaranteed to contain all the content related to the screenshot, which improves the user experience and reduces the waste of computer resources.

The application will be described below through specific embodiments in combination with specific application scenarios.

Please refer to FIG. 1. FIG. 1 is a page screenshot method provided by an embodiment of this specification, and the method executes steps S101 to S103.

S101: Parse the screenshot request initiated by the user to obtain the uniform resource locator (URL) of the page to be screenshotted.

S102: Load the page to be screenshotted according to the URL; during loading, determine whether the page elements related to the screenshot have finished loading, wherein the page elements related to the screenshot are page elements specified by the user.

S103: In response to the page elements related to the screenshot having finished loading, stop the loading and take a screenshot of the loaded part of the page to be screenshotted.
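
The control flow of steps S102 and S103 can be illustrated with a minimal sketch. It simulates loading as an iterator of element names rather than driving a real browser; the function name `load_until_relevant_done`, the element names, and the predicate are all hypothetical and are not taken from this specification.

```python
from typing import Callable, Iterable, List


def load_until_relevant_done(
    element_stream: Iterable[str],
    is_relevant_done: Callable[[List[str]], bool],
) -> List[str]:
    """Consume page elements as they load and stop once the predicate holds."""
    loaded: List[str] = []
    for element in element_stream:        # S102: the page is loading
        loaded.append(element)
        if is_relevant_done(loaded):      # S102: check during loading
            break                         # S103: stop loading
    return loaded                         # S103: only this part is captured


# Hypothetical page: two comments followed by content the user did not ask for.
page = ["header", "comment-1", "comment-2", "related-articles", "footer-ad"]
captured = load_until_relevant_done(page, lambda done: "comment-2" in done)
print(captured)
```

In this simulation the elements after `comment-2` are never consumed, which corresponds to stopping the load before irrelevant content arrives.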

In this specification, the entity that executes the above method can be chosen according to specific conditions and needs, and is not limited here; for example, it can be a cloud server that receives users' screenshot requests over a network connection, or a user's personal computer that receives the user's screenshot request through a communication mechanism between software modules, and so on.

In one illustrated embodiment, the above method is applied to a distributed server cluster, and the screenshot request may include sub-requests corresponding to multiple pages to be screenshotted; the distributed server cluster may jointly complete the screenshot tasks for those pages according to a preset allocation algorithm.
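
The specification does not say which allocation algorithm is preset. As one possible illustration, the sketch below assumes simple round-robin allocation; the worker names and URLs are placeholders.

```python
from typing import Dict, List


def allocate_round_robin(urls: List[str], workers: List[str]) -> Dict[str, List[str]]:
    """Assign each sub-request URL to a worker in rotating order."""
    plan: Dict[str, List[str]] = {worker: [] for worker in workers}
    for index, url in enumerate(urls):
        plan[workers[index % len(workers)]].append(url)
    return plan


# Hypothetical sub-requests and cluster nodes.
urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
plan = allocate_round_robin(urls, ["node-1", "node-2"])
print(plan)
```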

In this specification, the URL of the page to be screenshotted can be obtained in response to a screenshot request initiated by the user. This can be implemented in many ways, which are not specifically limited here; for example, the screenshot request can directly carry a URL field for the page, in which case the URL is obtained by parsing the request; alternatively, the request can carry a keyword for the page, in which case the URL is obtained indirectly through the keyword.

In one illustrated embodiment, the screenshot request may carry a character string indicating the page to be screenshotted; after the string is obtained from the request, it can be further parsed to obtain the URL of the page. The parsing can be semantic analysis based on natural language, or analysis based on specific codes such as short URLs or sharing codes. Those skilled in the art can choose according to the specific circumstances, and this specification does not specifically limit it.

For example, the screenshot request may carry the string "Alipay Weibo", which indicates the official Weibo page of "Alipay". Semantic analysis of the string yields the semantic information "Alipay" and "Weibo"; querying a preset mapping table between semantic information and URLs with this information yields the URL of the official Weibo page of "Alipay", which is used as the URL of the page to be screenshotted.
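
A minimal sketch of such a lookup follows. Whitespace splitting stands in for semantic analysis, and the table contents, the placeholder URL, and the function name `resolve_url` are assumptions made only for illustration.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical mapping table from semantic keywords to URLs (placeholder URL).
URL_TABLE: Dict[FrozenSet[str], str] = {
    frozenset({"Alipay", "Weibo"}): "https://example.com/alipay-weibo",
}


def resolve_url(request_string: str) -> Optional[str]:
    """Return the URL carried by, or indicated by, the request string."""
    if request_string.startswith(("http://", "https://")):
        return request_string                     # the request carries the URL directly
    keywords = frozenset(request_string.split())  # stand-in for semantic analysis
    return URL_TABLE.get(keywords)                # None when nothing matches


print(resolve_url("Alipay Weibo"))
```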

In this specification, after the URL of the page to be screenshotted is obtained, the page can be loaded according to the URL. The page can be loaded with an ordinary browser or with a headless browser; this is not limited in this application and can be decided by those skilled in the art according to specific needs.

In this specification, while the page to be screenshotted is loading, it can be determined whether the screenshot-related elements have finished loading. The mechanism that triggers this determination is not specifically limited; for example, it can be triggered periodically at a preset time interval, triggered by the number of elements loaded on the page, triggered by the size of the page file, or triggered by any combination of these. For example, the determination can be triggered every 100 milliseconds, or each time 20 more page elements have been loaded, and so on.
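
The combined time-based and count-based trigger can be sketched as follows. The class name `CheckTrigger` and its parameters are hypothetical; the defaults of 100 milliseconds and 20 elements simply mirror the example values above, and timestamps are passed in explicitly so the sketch stays deterministic.

```python
class CheckTrigger:
    """Decides when to re-run the check on screenshot-related elements."""

    def __init__(self, interval_ms: float = 100.0, element_step: int = 20) -> None:
        self.interval_ms = interval_ms
        self.element_step = element_step
        self._last_time_ms = 0.0
        self._last_count = 0

    def should_check(self, now_ms: float, loaded_count: int) -> bool:
        """Return True when either the time or the element-count condition is met."""
        due_by_time = now_ms - self._last_time_ms >= self.interval_ms
        due_by_count = loaded_count - self._last_count >= self.element_step
        if due_by_time or due_by_count:
            # Remember when the last check happened so both counters restart.
            self._last_time_ms = now_ms
            self._last_count = loaded_count
            return True
        return False


trigger = CheckTrigger()
results = [
    trigger.should_check(now_ms=50, loaded_count=5),    # neither condition met
    trigger.should_check(now_ms=60, loaded_count=25),   # 20+ new elements
    trigger.should_check(now_ms=120, loaded_count=30),  # too soon, too few
    trigger.should_check(now_ms=160, loaded_count=30),  # 100 ms since last check
]
print(results)
```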

In this specification, screenshot-related elements are elements related to the purpose of the screenshot, and can be specified by the user, either in the screenshot request or through a preset in the system. For example, a user-initiated screenshot request can include a string such as "photographic stealing forensics", which specifies that the screenshot-related elements are the photographic pictures in the webpage; as another example, the user can preset that, for all screenshots of forum pages, the screenshot-related elements are the content posted by forum users, excluding related promotional information and the like.

In this specification, whether the screenshot-related page elements have finished loading can be determined according to a variety of criteria, which are not specifically limited here; for example, it can be determined directly, from the screenshot-related page elements themselves, or indirectly, from page elements that are unrelated to the screenshot.

Fig. 2 is a schematic diagram of one implementation for determining whether the screenshot-related page elements have finished loading. In this example, this is determined by checking whether the loaded elements include the last of the screenshot-related page elements; if they do, the screenshot-related page elements can be determined to have finished loading. The screenshot-related page elements can be determined from the user's screenshot request, as described above, or from the user's preset. Because a page being loaded first receives the page structure issued by the page server (for example, the tree structure of the HTML document), the last of the screenshot-related page elements can be determined from the overall structure of the page to be screenshotted.

For example, if the user specifies that the user comments in a page are to be captured, the last element of the user comments can be determined from the tree structure of the HTML file; during loading, once that last element is detected to have loaded, the screenshot-related page elements can be considered to have finished loading.
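
The "last element" check can be sketched with Python's standard `html.parser` module. The sketch assumes that screenshot-related elements can be recognized by a CSS class and identified by an `id` attribute; the class name `comment`, the ids, and the helper names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Set


class RelevantIdCollector(HTMLParser):
    """Collects the ids of elements with a given class, in document order."""

    def __init__(self, relevant_class: str) -> None:
        super().__init__()
        self.relevant_class = relevant_class
        self.ids: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if self.relevant_class in classes and attributes.get("id"):
            self.ids.append(attributes["id"])


def last_relevant_id(structure_html: str, relevant_class: str) -> Optional[str]:
    """Find the last screenshot-related element in the page structure."""
    collector = RelevantIdCollector(relevant_class)
    collector.feed(structure_html)
    return collector.ids[-1] if collector.ids else None


def relevant_elements_loaded(loaded_ids: Set[str], last_id: Optional[str]) -> bool:
    """The related elements count as loaded once the last one has loaded."""
    return last_id is not None and last_id in loaded_ids


# Hypothetical page structure received at the start of loading.
STRUCTURE = """
<html><body>
  <div id="c1" class="comment"></div>
  <div id="c2" class="comment"></div>
  <div id="ad" class="promo"></div>
</body></html>
"""
last_id = last_relevant_id(STRUCTURE, "comment")
print(last_id, relevant_elements_loaded({"c1", "c2"}, last_id))
```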

In one illustrated embodiment, it can be determined whether the loaded elements contain a preset target element, namely a screenshot-irrelevant element whose appearance indicates that the screenshot-related elements have finished loading; if so, the screenshot-related page elements can be considered to have finished loading.

For example, when a screenshot is taken to collect evidence from a page containing stolen photographic pictures, the APP promotion information at the bottom of the page clearly does not belong to the page elements that require forensics; therefore, if the APP promotion information is detected among the loaded content during page loading, the screenshot-related page elements (in this example, the stolen photographic pictures in the body of the page) can be considered to have finished loading.

In one illustrated embodiment, the above screenshot-irrelevant elements may be advertisement-related page elements, for example image advertisements at the bottom of the page, links inducing the user to share, or related-article recommendations; those skilled in the art can specify the specific types of screenshot-irrelevant elements according to their needs.
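
The indirect check can be sketched as a membership test against a preset set of screenshot-irrelevant element types. The type names below are hypothetical labels chosen to match the examples in the text.

```python
from typing import Iterable, Set

# Hypothetical element types treated as screenshot-irrelevant indicators.
SENTINEL_TYPES: Set[str] = {"app-promotion", "bottom-ad", "related-articles"}


def sentinel_seen(loaded_types: Iterable[str], sentinels: Set[str] = SENTINEL_TYPES) -> bool:
    """Return True once any screenshot-irrelevant indicator element has loaded."""
    return any(element_type in sentinels for element_type in loaded_types)


print(sentinel_seen(["title", "photo", "photo"]))
print(sentinel_seen(["title", "photo", "app-promotion"]))
```

This check relies on the assumption stated in the text: the indicator elements appear after the content of interest, so their appearance implies the content has already loaded.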

In this specification, in response to a determination that the screenshot-related page elements have finished loading, the loading of the page to be screenshotted can be stopped, and a screenshot can be taken of the loaded part of the page. For the specific screenshot method, reference may be made to the related art; it is not specifically limited in this specification.

In one illustrated embodiment, the final screenshot can be obtained through segmented screenshots. Specifically, when the size of the loaded part of the page to be screenshotted is determined to be greater than a preset size threshold, the loaded part can be divided into several slices, and the positional relationship between the slices recorded; after a screenshot is taken of each slice, the slice screenshots can be stitched, according to the recorded positional relationship, into a screenshot of the loaded part of the page.
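
Slicing and stitching can be sketched as below. The sketch assumes vertical slicing only, records each slice's top offset as its "positional relationship", and models an image as a list of pixel rows; none of these choices is prescribed by this specification.

```python
from typing import List, Tuple

Row = List[int]  # one row of pixels, simplified to integers


def plan_slices(total_height: int, max_height: int) -> List[Tuple[int, int]]:
    """Return (top, height) for each slice, from top to bottom."""
    if total_height <= max_height:
        return [(0, total_height)]  # below the threshold: a single screenshot
    return [
        (top, min(max_height, total_height - top))
        for top in range(0, total_height, max_height)
    ]


def stitch(slices: List[Tuple[int, List[Row]]]) -> List[Row]:
    """Concatenate slice images in the order given by their recorded top offsets."""
    image: List[Row] = []
    for _, rows in sorted(slices, key=lambda item: item[0]):
        image.extend(rows)
    return image


# Hypothetical 10-row page captured in slices of at most 4 rows.
page_rows = [[y] for y in range(10)]
plan = plan_slices(total_height=len(page_rows), max_height=4)
shots = [(top, page_rows[top:top + height]) for top, height in plan]
restored = stitch(list(reversed(shots)))  # arrival order does not matter
print(plan)
```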

In this specification, before a screenshot is taken of the loaded part of the page to be screenshotted, the loaded part can also be preprocessed to obtain a better screenshot; specifically, the preprocessing may include deleting screenshot interference elements from the page, for example floating advertisements, recommendation information, shortcut buttons, and other elements that may obscure screenshot-related elements.

In one illustrated embodiment, the preprocessing can also include other operations; for example, collapsed elements in the page can be expanded so that the collapsed content is fully displayed and captured; the display style of specified elements can be changed to enhance how they are displayed; or screenshot markers can be added to elements on the page to highlight content that needs attention, and so on.

Please refer to Fig. 3, which is an example comparing the original page of a page to be screenshotted with the image obtained from the screenshot. In the example of Fig. 3, body text 1 and body text 2 are the page elements related to the screenshot. It can be seen that, after preprocessing, the floating advertisement that obscured the body text is removed, body text 2, which was originally collapsed and hidden, is expanded and displayed, and body text 1, which needs to be highlighted, is given a screenshot marker; the related recommendations at the end of the page and the "Back to Home" and "Add to Favorites" buttons do not appear in the final screenshot because loading was stopped.
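
The three preprocessing operations in this example can be sketched on a simplified page model in which each element is a dictionary. The keys `interference`, `collapsed`, `highlight`, and `marker` are hypothetical and stand in for whatever a real implementation would read from or write to the page.

```python
from typing import Dict, List

Element = Dict[str, object]


def preprocess(elements: List[Element]) -> List[Element]:
    """Delete interference elements, expand collapsed ones, mark highlighted ones."""
    result: List[Element] = []
    for element in elements:
        if element.get("interference"):
            continue                           # e.g. a floating advertisement
        updated = dict(element)                # leave the input untouched
        if updated.get("collapsed"):
            updated["collapsed"] = False       # expand hidden content
        if updated.get("highlight"):
            updated["marker"] = "screenshot-mark"
        result.append(updated)
    return result


# Hypothetical model of the page in Fig. 3.
page = [
    {"name": "floating-ad", "interference": True},
    {"name": "body-text-1", "highlight": True},
    {"name": "body-text-2", "collapsed": True},
]
processed = preprocess(page)
print([element["name"] for element in processed])
```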

In this specification, the user's screenshot request can also carry other customization information, enabling further customization of the screenshot function.

In one illustrated embodiment, the user's screenshot request can carry demand information indicating the preprocessing method. Correspondingly, when the above preprocessing is performed, the preprocessing method can be determined from the demand information carried in the user's screenshot request, and the page to be screenshotted can then be preprocessed according to the determined method.

For example, the user's screenshot request can carry demand information indicating that floating advertisements need to be removed and hidden body text needs to be expanded. When the above preprocessing is performed, the preprocessing operations to be carried out can be determined from the demand information, here removing interference content (floating advertisements) and expanding hidden content (body text), and the preprocessing is performed accordingly.
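
One way to map demand information to preprocessing operations is a lookup table of named steps, sketched below. The demand names, the request layout, and the element keys are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Element = Dict[str, object]
Step = Callable[[List[Element]], List[Element]]


def remove_floating_ads(elements: List[Element]) -> List[Element]:
    return [e for e in elements if e.get("kind") != "floating-ad"]


def expand_hidden_text(elements: List[Element]) -> List[Element]:
    return [{**e, "hidden": False} if e.get("hidden") else e for e in elements]


# Hypothetical demand names mapped to the step each one selects.
STEPS: Dict[str, Step] = {
    "remove_floating_ads": remove_floating_ads,
    "expand_hidden_text": expand_hidden_text,
}


def preprocess_by_demand(elements: List[Element], demands: List[str]) -> List[Element]:
    """Apply, in order, the preprocessing steps named in the request."""
    for name in demands:
        if name not in STEPS:
            raise ValueError(f"unknown preprocessing demand: {name}")
        elements = STEPS[name](elements)
    return elements


request = {"demands": ["remove_floating_ads", "expand_hidden_text"]}
page = [{"kind": "floating-ad"}, {"kind": "text", "hidden": True}]
result = preprocess_by_demand(page, request["demands"])
print(result)
```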

In one illustrated embodiment, the screenshot request initiated by the user may carry a specification identifier indicating the screenshot specification; a screenshot of the loaded part of the page to be screenshotted can then be taken according to the screenshot specification indicated by that identifier.

For example, the screenshot request initiated by the user can carry a specification identifier that indicates the image format, resolution, color specification, and so on that the user requires for the webpage screenshot. In the screenshot phase, a screenshot of the loaded part of the page to be screenshotted is taken according to the screenshot specification indicated by the identifier.
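
A specification identifier can be resolved through a preset table, as in the sketch below. The identifier strings, the fields of `ScreenshotSpec`, and the fallback to a default specification are assumptions, not details given in this specification.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ScreenshotSpec:
    image_format: str
    width_px: int
    color_mode: str


# Hypothetical specification identifiers and the settings each one stands for.
SPECS: Dict[str, ScreenshotSpec] = {
    "default": ScreenshotSpec("png", 1280, "rgb"),
    "evidence-hd": ScreenshotSpec("png", 1920, "rgb"),
    "thumbnail-gray": ScreenshotSpec("jpeg", 320, "grayscale"),
}


def spec_for(request: Dict[str, str]) -> ScreenshotSpec:
    """Look up the specification named in the request, else use the default."""
    return SPECS.get(request.get("spec_id", "default"), SPECS["default"])


print(spec_for({"url": "https://example.com", "spec_id": "evidence-hd"}))
```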

In this specification, the image obtained by the screenshot can be processed further to obtain a better screenshot.

In one illustrated embodiment, a machine learning model for locating interference information in images can be used to determine the position of interference information in the image obtained by the screenshot, and the interference information at that position can then be removed from the image through image processing. Specifically, the machine learning model may be trained using, as training samples, screenshots of a number of pages in which the interference elements have been labeled.

For example, suppose the image obtained from a screenshot of a certain page still contains a certain type of advertising information that interferes with the screenshot. A machine learning model for determining the position of this type of advertising information in an image can then be called to locate the advertising information in the image, which is then removed from the image by an image processing algorithm. In this way, interference information can be removed from the screenshot at the image level, yielding a screenshot with less interference information.
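
The post-processing step can be sketched as follows. The trained model is replaced by a stub function that returns fixed bounding boxes, and "removal" is assumed to mean filling the located region with a background value; a real system might crop or inpaint instead. The image is modeled as a list of rows of integer pixel values.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # top, left, bottom, right (bottom/right exclusive)


def remove_interference(
    image: Image,
    locate: Callable[[Image], List[Box]],
    fill: int = 255,
) -> Image:
    """Fill every region reported by the locator with a background value."""
    cleaned = [row[:] for row in image]  # copy so the input stays unchanged
    height = len(cleaned)
    for top, left, bottom, right in locate(image):
        for y in range(max(top, 0), min(bottom, height)):
            row = cleaned[y]
            for x in range(max(left, 0), min(right, len(row))):
                row[x] = fill
    return cleaned


def stub_model(image: Image) -> List[Box]:
    """Stand-in for the trained model: always reports one fixed region."""
    return [(1, 1, 3, 3)]


image = [[0] * 4 for _ in range(4)]
cleaned = remove_interference(image, stub_model)
for row in cleaned:
    print(row)
```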

This specification also correspondingly provides a page screenshot device, please refer to FIG. 4, which is a structural example diagram of the device; the device includes: a URL acquisition module 401, a page loading module 402, and an execution module 403.

The URL obtaining module 401 obtains the uniform resource locator URL of the page to be captured in response to a screenshot request initiated by the user.

The page loading module 402 loads the page to be screenshotted according to the URL, and determines whether the page elements related to the screenshot have been loaded during the loading process; wherein the page elements related to the screenshot are page elements designated by the user.

The execution module 403, in response to the completion of loading of page elements related to the screenshot, stops the loading of the page to be screenshot, and takes a screenshot of the loaded part of the page to be screenshot.

In this specification, the URL obtaining module 401 can obtain the URL of the page to be screenshotted based on the screenshot request initiated by the user. This can be implemented in multiple ways, which are not specifically limited here; for example, the screenshot request can directly carry a URL field for the page, in which case the URL is extracted directly from the request; alternatively, the request can carry a string corresponding to the URL of the page, in which case the URL is obtained indirectly, for example by a query.

In one illustrated embodiment, the screenshot request may carry a character string indicating the URL of the page to be screenshotted, and the URL obtaining module 401 may obtain the URL indicated by the string by parsing it. For example, if the string is "abc Post Bar Homepage", it can be determined from this string that the indicated URL is that of the "abc Post Bar Homepage", which is therefore the URL of the page to be screenshotted. The URL corresponding to the string can be determined through keyword query or semantic analysis; the method can be chosen flexibly according to actual development needs, and this specification does not specifically limit it.

In this specification, screenshot-related elements are elements related to the purpose of the screenshot, and can be specified by the user, either in the screenshot request or through a preset in the system. For example, if a user-initiated screenshot request includes a string such as "photographic embezzlement forensics", the screenshot-related elements are the photographic pictures in the webpage; as another example, the user can preset that, for screenshots of forum pages, the screenshot-related elements are the content posted by forum users, excluding related promotional information and the like.

In this specification, the page loading module 402 can determine whether the screenshot-related page elements have finished loading according to a variety of criteria, which are not specifically limited here; for example, it can determine this directly, from the screenshot-related page elements themselves, or indirectly, from page elements that are unrelated to the screenshot.

In one illustrated embodiment, the page loading module 402 can determine whether the screenshot-related page elements have finished loading by checking whether the loaded elements include the last of the screenshot-related page elements; if they do, the screenshot-related page elements can be determined to have finished loading. The screenshot-related page elements can be determined from the user's screenshot request, as described above, or from a preset type of screenshot page element. Because a page being loaded first receives the overall structure of the page in the form of an HTML tree structure, the last of the screenshot-related page elements can be determined from the overall structure of the page to be screenshotted.

In one illustrated embodiment, the page loading module 402 can determine whether the screenshot-related page elements have finished loading by checking whether the loaded page elements include a screenshot-irrelevant element that indicates that the screenshot-related page elements have finished loading; if so, the screenshot-related page elements are considered to have finished loading. For example, suppose the screenshot-irrelevant element is an advertisement at the bottom of the page. Generally speaking, the appearance of the bottom-of-page advertisement means that all screenshot-related page elements on the page have loaded; therefore, once the bottom-of-page advertisement has loaded, it can be concluded that all screenshot-related page elements on the page have finished loading.

In one illustrated embodiment, the above screenshot-irrelevant element may be an advertisement-related page element, for example an image advertisement at the bottom of the page, a link inducing the user to share, a related-article recommendation, and so on.

In this specification, in response to a determination that the screenshot-related page elements have finished loading, the execution module 403 can stop the loading of the page to be screenshotted and take a screenshot of the loaded part of the page. For the specific screenshot method, reference may be made to the related art; this specification does not specifically limit it.

In one illustrated embodiment, the execution module 403 may obtain the final screenshot through segmented screenshots. Specifically, when the size of the loaded part of the page to be screenshotted is determined to be greater than a preset size threshold, the loaded part can be divided into several slices, and the positional relationship between the slices recorded; after a screenshot is taken of each slice, the slice screenshots are stitched, according to the recorded positional relationship, into a screenshot of the loaded part of the page.

In this specification, the device may also include a preprocessing module, which preprocesses the loaded part of the page to be screenshotted to obtain a better screenshot; specifically, the preprocessing may include deleting screenshot interference elements from the page, for example floating advertisements, recommendation information, shortcut buttons, and other elements that may obscure screenshot-related elements.

In one illustrated embodiment, the preprocessing may also include other operations; for example, hidden elements in the page may be expanded so that the hidden content (for example, collapsed multi-level quoted comments) is fully displayed and captured; the display style of specified page elements can be changed to enhance how they are displayed; or screenshot markers can be added to elements on the page to highlight content that needs attention, and so on.

In the illustrated embodiment, the user's screenshot request may carry demand information indicating the preprocessing method. Correspondingly, the preprocessing module may determine the preprocessing method according to the demand information carried in the user's screenshot request, and then preprocess the page to be screenshotted according to the determined method.

In the illustrated embodiment, the screenshot request initiated by the user may carry a specification identifier indicating the screenshot specification; the execution module 403 may therefore take a screenshot of the loaded part of the page to be screenshotted according to the screenshot specification indicated by that identifier.

In this specification, the device may also include an image processing module, which can further process the image obtained by the screenshot to obtain a better screenshot effect.

In the illustrated embodiment, the above-mentioned image processing module may use a machine learning model, trained with several page screenshots in which the locations of interference elements are marked as training samples, to further process the image obtained by the screenshot. Specifically, the image can be input into the trained machine learning model to determine the position of an interference element in the image, and image processing is then performed on the image based on that position to delete the interference element.
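The post-processing step above can be sketched as follows. The trained machine learning model is not specified in this document, so it is stood in for by a hypothetical callable `locate` that returns bounding boxes; overwriting each box with a background value is likewise only one assumed way of "deleting" an interference element from the image.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom); right/bottom exclusive
Image = List[List[int]]           # grayscale pixels in row-major order


def remove_interference(image: Image,
                        locate: Callable[[Image], List[Box]],
                        fill: int = 255) -> Image:
    """Return a copy of the image with every located region painted over."""
    cleaned = [row[:] for row in image]  # leave the original screenshot untouched
    height = len(cleaned)
    width = len(cleaned[0]) if cleaned else 0
    for left, top, right, bottom in locate(image):
        # Clamp each box to the image bounds before overwriting pixels.
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                cleaned[y][x] = fill
    return cleaned
```

In practice `locate` would wrap whatever detection model has been trained on the marked screenshots; the function above only consumes the positions it reports.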

For the implementation process of the functions and roles of each module in the above-mentioned device, please refer to the implementation process of the corresponding steps in the above-mentioned method for details, which will not be repeated here.

The embodiments of this specification also provide a computer device, which at least includes a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor implements the aforementioned page screenshot method when the program is executed.

FIG. 5 shows a more specific hardware structure diagram of a computing device provided by an embodiment of this specification. The device may include a processor 510, a memory 520, an input/output interface 530, a communication interface 540, and a bus 550. The processor 510, the memory 520, the input/output interface 530, and the communication interface 540 realize the communication connection between each other in the device through the bus 550.

The processor 510 may be implemented by a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits, and executes related programs to realize the technical solutions provided in the embodiments of this specification.

The memory 520 may be implemented in the form of read-only memory (ROM), random access memory (RAM), a static storage device, a dynamic storage device, etc. The memory 520 may store an operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented through software or firmware, the related program code is stored in the memory 520 and is called and executed by the processor 510.

The input/output interface 530 is used to connect an input/output module to realize information input and output. The input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions. The input device may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output device may include a display, a speaker, a vibrator, an indicator light, and the like.

The communication interface 540 is used to connect a communication module (not shown in the figure) to realize the communication interaction between the device and other devices. The communication module can realize communication through wired means (such as USB, network cable, etc.), or through wireless means (such as mobile network, WIFI, Bluetooth, etc.).

The bus 550 includes a path to transmit information between various components of the device (for example, the processor 510, the memory 520, the input/output interface 530, and the communication interface 540).

It should be noted that although the above device only shows the processor 510, the memory 520, the input/output interface 530, the communication interface 540, and the bus 550, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art can understand that the above-mentioned device may also include only the components necessary to implement the solutions of the embodiments of this specification, and need not include all the components shown in the figures.

The embodiment of the present specification also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the aforementioned page screenshot method is implemented.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

In practical applications, a screenshot is usually taken by calling the browser's webpage screenshot function on the page indicated by the user. However, when this approach is applied to complex webpages, the resulting screenshots often include many elements that interfere with the screenshot-related information, so that key information that needs to be captured may be accidentally obscured.

In view of this, this specification discloses a technical solution for preprocessing the page to be screenshot before taking a screenshot of the page to be screenshot.

In implementation, after the loading of the page to be screenshot is completed, preprocessing operations including removing interference elements are performed on the page to be screenshot; after that, screenshots can be taken on the page to be screenshot that has been preprocessed.

In the above technical solution, since the interfering elements in the original page are removed by preprocessing, interfering elements do not appear in the screenshot result and do not accidentally occlude screenshot-related elements, and the integrity of the screenshot-related elements in the page is ensured.

Please refer to FIG. 6, which shows a page screenshot method provided by an embodiment of this specification; the method performs steps S601 to S604.

S601: In response to a screenshot request initiated by the user, obtain the uniform resource locator URL of the page to be screenshotted.

S602: Load the page to be captured according to the URL.

S603: After the loading of the page to be screenshot is completed, perform preprocessing on the page to be screenshot; the preprocessing includes deleting screenshot interference elements in the page to be screenshot.

S604: Take a screenshot of the pre-processed page to be screenshot.
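Steps S601 to S604 can be read as a simple pipeline. The sketch below is illustrative only: `screenshot_page` and its three callables are hypothetical names, the request is assumed to carry the URL directly under a `"url"` key, and the loading, preprocessing, and capture steps are injected so that any browser or rendering backend could supply them.

```python
from typing import Any, Callable, Dict


def screenshot_page(request: Dict[str, str],
                    load_page: Callable[[str], Any],
                    preprocess: Callable[[Any], Any],
                    capture: Callable[[Any], bytes]) -> bytes:
    """Run the four steps in order and return the screenshot bytes."""
    url = request["url"]      # S601: obtain the URL from the screenshot request
    page = load_page(url)     # S602: load the page to be screenshotted
    page = preprocess(page)   # S603: delete interference elements after loading
    return capture(page)      # S604: take a screenshot of the preprocessed page
```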

In one illustrated embodiment, the screenshot request may carry a character string indicating the page to be screenshotted. After the character string is obtained from the request, it can be parsed to obtain the URL of the page to be screenshotted. The parsing may be semantic analysis based on natural language, or parsing based on specific codes such as short URLs or sharing codes. Those skilled in the art can choose according to the specific circumstances; this specification does not specifically limit it.
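A minimal sketch of such string parsing is given below. It covers only two simple cases: a URL embedded in free text, and a sharing code looked up in a table. The function name `resolve_url`, the `share_codes` mapping, and the regular expression are assumptions made for illustration; natural-language semantic analysis is outside the scope of this sketch.

```python
import re
from typing import Mapping, Optional
from urllib.parse import urlparse

# Matches an http(s) URL up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def resolve_url(text: str, share_codes: Mapping[str, str]) -> Optional[str]:
    """Return the URL indicated by the string, or None if none can be found."""
    match = URL_PATTERN.search(text)
    if match:
        # Drop punctuation that commonly trails a URL inside a sentence.
        candidate = match.group(0).rstrip(".,;)")
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return candidate
    # Otherwise treat each whitespace-separated token as a possible sharing code.
    for token in text.split():
        if token in share_codes:
            return share_codes[token]
    return None
```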

In this specification, after the URL of the page to be screenshotted is obtained, the page can be loaded according to the URL. The page may be loaded using an ordinary browser or a headless browser; this specification does not limit this, and the choice can be made according to specific needs.

In this specification, the page to be screenshotted can be preprocessed to obtain a better screenshot effect. Specifically, the preprocessing may include deleting screenshot interference elements in the page to be screenshotted; for example, deleting floating advertisements, recommendation information, shortcut buttons, and other elements that may block screenshot-related elements.

In the illustrated embodiment, the above preprocessing may also include other preprocessing methods. For example, hidden elements in the page can be expanded, so that hidden and collapsed content can be fully displayed and captured in the screenshot; for another example, the display style of a specified element can be changed to enhance its display effect; for yet another example, screenshot markers can be added to elements on the page to highlight content that needs attention, and so on.

In the illustrated embodiment, the user's screenshot request can carry demand information indicating the preprocessing method. Correspondingly, when the above preprocessing is performed, the preprocessing method can be determined according to the demand information carried in the user's screenshot request, and the page to be screenshotted is then preprocessed according to the determined method.

For example, the user's screenshot request may carry demand information indicating that floating advertisements need to be removed and collapsed text needs to be expanded. When the preprocessing is performed, the preprocessing methods to be applied can be determined from the demand information, namely removing interfering content (floating advertisements) and expanding collapsed content (text), and the preprocessing is then performed accordingly.
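The mapping from demand information to preprocessing steps can be sketched as a small dispatch table. Everything here is hypothetical: the demand names, the element dictionaries with `kind` and `collapsed` keys, and the `preprocess` function are illustrative stand-ins for whatever page representation an implementation actually uses.

```python
from typing import Callable, Dict, Iterable, List

Element = Dict[str, object]


def remove_floating_ads(elements: List[Element]) -> List[Element]:
    """Delete interference content: drop every floating advertisement."""
    return [e for e in elements if e["kind"] != "floating_ad"]


def expand_collapsed(elements: List[Element]) -> List[Element]:
    """Expand collapsed content so that it is fully displayed."""
    return [{**e, "collapsed": False} for e in elements]


# Demand name carried in the request -> preprocessing step to run.
PREPROCESSORS: Dict[str, Callable[[List[Element]], List[Element]]] = {
    "remove_floating_ads": remove_floating_ads,
    "expand_collapsed": expand_collapsed,
}


def preprocess(elements: List[Element], demands: Iterable[str]) -> List[Element]:
    """Apply, in order, the preprocessing steps named by the demand information."""
    for name in demands:
        step = PREPROCESSORS.get(name)
        if step is None:
            raise ValueError(f"unknown preprocessing demand: {name}")
        elements = step(elements)
    return elements
```

Keeping the steps in a table means a new preprocessing method (for example, adding screenshot markers) can be supported by registering one more entry rather than changing the dispatch logic.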

Please refer to FIG. 7, which is a comparison example of a page to be screenshotted before and after preprocessing. In the example of FIG. 7, text 1 and text 2 of the main text are the page elements related to the screenshot. The floating advertisement over the main text can be removed, the collapsed text 2 can be expanded so that its original content is displayed, and text 1, which needs to be highlighted, can be given a screenshot marker. The related recommendations at the end of the page and the "Back to home page" and "Add to Favorites" buttons can also be removed as interference elements and will not appear in the final screenshot.

In this specification, during the loading of the page to be screenshotted, it can also be judged whether the screenshot-related elements have been loaded, and in response to the page elements related to the screenshot having been loaded, the loading of the page can be stopped. The trigger mechanism of this judgment is not specifically limited in this specification; for example, the judgment can be triggered periodically at a preset time interval, triggered according to the number of elements loaded in the page, triggered according to the size of the page file, or triggered by any combination of these methods. Stopping the page loading in time reduces the loading of elements irrelevant to the screenshot and increases the proportion of screenshot-related elements in the final screenshot, while ensuring that no screenshot-related page element is missing.
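One way to combine the trigger methods listed above is sketched below. The class name `CheckTrigger`, its default thresholds, and the interpretation of "size of the page file" as the number of bytes loaded so far are all assumptions for illustration; the caller supplies the current time and loading counters.

```python
from dataclasses import dataclass


@dataclass
class CheckTrigger:
    interval_s: float = 0.5   # preset time interval between checks
    element_step: int = 20    # check again after this many newly loaded elements
    byte_step: int = 65536    # check again after this many newly loaded bytes
    _last_time: float = 0.0
    _last_elements: int = 0
    _last_bytes: int = 0

    def should_check(self, now: float, elements_loaded: int, bytes_loaded: int) -> bool:
        """Return True when any one of the combined trigger conditions is met."""
        due = (
            now - self._last_time >= self.interval_s
            or elements_loaded - self._last_elements >= self.element_step
            or bytes_loaded - self._last_bytes >= self.byte_step
        )
        if due:
            # Remember the state at this check so the next one is measured from here.
            self._last_time = now
            self._last_elements = elements_loaded
            self._last_bytes = bytes_loaded
        return due
```

Whenever `should_check` returns True, the loader would run the "have the screenshot-related elements been loaded" judgment and stop loading if the answer is yes.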

In this specification, screenshot-related elements are elements related to the purpose of the screenshot, and they can be specified by the user. Specifically, the user may specify them in the screenshot request or preset them in the system. For example, a screenshot request initiated by the user can include the string "photographic stealing forensics", which specifies that the screenshot-related elements are the photographic pictures in the webpage; for another example, the user can preset that, for all screenshots of forum pages, the screenshot-related elements are the content posted by forum users, excluding related promotional information and the like.

FIG. 8 is a schematic diagram of one implementation for judging whether the page elements related to the screenshot have been loaded. In this example, whether the page elements related to the screenshot have been loaded is determined by judging whether the loaded elements include the last element among the page elements related to the screenshot; if they do, it can be determined that the page elements related to the screenshot have been loaded. The page elements related to the screenshot can be determined from the user's screenshot request as described above, or according to the user's preset. Because, during loading, the page to be screenshotted first receives the page structure issued by the page server (for example, a page structure in the form of an HTML tree), the last element among the page elements related to the screenshot can be determined from the page structure of the page to be screenshotted.
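The judgment described for FIG. 8 can be sketched as follows, assuming the page structure has already been flattened into a list of element identifiers in document order and that elements finish loading in roughly that order. The function names and the identifier-based representation are hypothetical.

```python
from typing import AbstractSet, Optional, Sequence


def last_relevant_element(structure: Sequence[str],
                          relevant: AbstractSet[str]) -> Optional[str]:
    """Find the screenshot-related element that comes last in the page structure."""
    for element_id in reversed(structure):
        if element_id in relevant:
            return element_id
    return None


def relevant_elements_loaded(structure: Sequence[str],
                             relevant: AbstractSet[str],
                             loaded: AbstractSet[str]) -> bool:
    """Judge loading complete once the last screenshot-related element is loaded."""
    last = last_relevant_element(structure, relevant)
    # Assumption: with no screenshot-related element in the structure, keep loading.
    return last is not None and last in loaded
```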

In the illustrated embodiment, the final screenshot can be obtained by taking segmented screenshots. Specifically, when it is determined that the size of the preprocessed page to be screenshotted is greater than a preset size threshold, the page can be divided into several fragments and the positional relationship between the fragments recorded. After a screenshot has been taken of each fragment, the fragment screenshots can be spliced, according to the recorded positional relationship, into a screenshot of the page to be screenshotted.

In the illustrated embodiment, the screenshot request initiated by the user may carry a specification identifier indicating the screenshot specification; a screenshot of the page to be screenshotted may therefore be taken according to the screenshot specification indicated by that identifier.

For example, a specification identifier indicating the image format, resolution, color specification, and so on that the user requires for the webpage screenshot can be carried in the screenshot request initiated by the user. In the screenshot phase, a screenshot of the page to be screenshotted is taken according to the screenshot specification indicated by that identifier.
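The document does not define how a specification identifier is encoded, so the sketch below assumes a hypothetical `key=value;key=value` syntax purely for illustration. `ScreenshotSpec`, `parse_spec`, the default values, and the sets of allowed formats and color modes are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenshotSpec:
    image_format: str = "png"
    width: int = 1280
    height: int = 720
    color_mode: str = "rgb"


ALLOWED_FORMATS = {"png", "jpeg"}
ALLOWED_COLOR_MODES = {"rgb", "grayscale"}


def parse_spec(identifier: str) -> ScreenshotSpec:
    """Parse e.g. 'format=jpeg;resolution=1920x1080;color=grayscale'."""
    fields = {}
    for part in identifier.split(";"):
        if not part.strip():
            continue  # tolerate empty segments and an empty identifier
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed field: {part!r}")
        fields[key.strip().lower()] = value.strip().lower()

    defaults = ScreenshotSpec()
    image_format = fields.get("format", defaults.image_format)
    color_mode = fields.get("color", defaults.color_mode)
    width, height = defaults.width, defaults.height
    if "resolution" in fields:
        w, _, h = fields["resolution"].partition("x")
        width, height = int(w), int(h)  # raises ValueError on bad input

    if image_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported image format: {image_format}")
    if color_mode not in ALLOWED_COLOR_MODES:
        raise ValueError(f"unsupported color mode: {color_mode}")
    if width <= 0 or height <= 0:
        raise ValueError("resolution must be positive")
    return ScreenshotSpec(image_format, width, height, color_mode)
```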

This specification also provides a page screenshot device. Please refer to FIG. 9, which is a structural example of the device. The device includes a URL obtaining module 901, a page loading module 902, a page preprocessing module 903, and a screenshot execution module 904.

The URL obtaining module 901 obtains the uniform resource locator URL of the page to be captured in response to the screenshot request initiated by the user.

The page loading module 902 loads the page to be captured according to the URL.

The page preprocessing module 903 performs preprocessing on the page to be screenshot after the page to be screenshot is loaded; the preprocessing includes deleting the screenshot interference elements in the page to be screenshot.

The screenshot execution module 904 performs screenshots on the pre-processed page to be screenshot.

In this specification, the URL obtaining module 901 may obtain the uniform resource locator (URL) of the page to be screenshotted in response to a screenshot request initiated by the user. There are many ways to implement this process, and this specification does not specifically limit them. For example, the screenshot request may directly carry a URL field for the page to be screenshotted, in which case the URL is obtained by parsing the screenshot request; alternatively, the screenshot request may carry a keyword for the page to be screenshotted, in which case the URL is obtained indirectly through the keyword.

In this specification, the page preprocessing module 903 in the device preprocesses the page to be screenshotted to obtain a better screenshot effect. Specifically, the preprocessing may include deleting screenshot interference elements in the page to be screenshotted; for example, deleting floating advertisements, recommendation information, shortcut buttons, and other elements that may obscure screenshot-related elements.

In this specification, the device may also include a dynamic loading module, which judges during the loading process whether the page elements related to the screenshot have been loaded and, in response to those page elements having been loaded, stops the loading of the page to be screenshotted.

In this specification, the dynamic loading module can judge whether the page elements related to the screenshot have been loaded according to a variety of criteria, which this specification does not specifically limit. For example, the judgment can be made directly, from the perspective of the screenshot-related page elements themselves, or indirectly, from the perspective of page elements that are not related to the screenshot.

In the illustrated embodiment, the above-mentioned dynamic loading module may determine whether the page elements related to the screenshot have been loaded by judging whether the loaded elements include the last element among the page elements related to the screenshot; if they do, it can be determined that the page elements related to the screenshot have been loaded. The page elements related to the screenshot can be determined from the user's screenshot request as described above, or according to a preset element type for the page to be screenshotted. Because the page to be screenshotted first receives the page structure in the form of an HTML tree when it is loaded, the last element among the page elements related to the screenshot can be determined from the page structure of the page to be screenshotted.

In the illustrated embodiment, the dynamic loading module may also determine whether the page elements related to the screenshot have been loaded by judging whether the loaded page elements include a screenshot-irrelevant element whose presence indicates that the screenshot-related page elements have been loaded; if so, the page elements related to the screenshot are considered to have been loaded. For example, the screenshot-irrelevant element may be a bottom-of-page advertisement. Generally speaking, the appearance of the bottom-of-page advertisement means that all the screenshot-related page elements on the page have been loaded; therefore, once the bottom-of-page advertisement has been loaded, it can be judged that all page elements related to the screenshot have been loaded.
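This indirect judgment reduces to checking for a marker element. In the sketch below, the element kind `"footer_ad"` and the function name are hypothetical; which screenshot-irrelevant elements reliably signal completion would depend on the pages being captured.

```python
from typing import FrozenSet, Iterable

# Assumed kinds of screenshot-irrelevant elements that appear only after the content.
DEFAULT_SENTINELS: FrozenSet[str] = frozenset({"footer_ad"})


def relevant_loaded_by_sentinel(loaded_kinds: Iterable[str],
                                sentinel_kinds: FrozenSet[str] = DEFAULT_SENTINELS) -> bool:
    """Treat the screenshot-related elements as loaded once any sentinel has loaded."""
    return any(kind in sentinel_kinds for kind in loaded_kinds)
```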

In this specification, the screenshot execution module 904 can take a screenshot of the preprocessed page to be screenshot, and the specific way of taking the screenshot can refer to related technologies, and this specification does not make specific restrictions.

In the illustrated embodiment, the screenshot execution module 904 can obtain the final screenshot by taking segmented screenshots. Specifically, when the size of the preprocessed page to be screenshotted is greater than a preset size threshold, the preprocessed page can be divided into several fragments and the positional relationship between the fragments recorded. After a screenshot has been taken of each fragment, the fragment screenshots can be spliced, according to the recorded positional relationship, into a screenshot of the preprocessed page to be screenshotted.

In the illustrated embodiment, the screenshot request initiated by the user may carry a specification identifier indicating the screenshot specification; the screenshot execution module 904 may therefore take a screenshot of the preprocessed page to be screenshotted according to the screenshot specification indicated by that identifier.

In the illustrated embodiment, the above-mentioned image processing module may use a machine learning model, trained with several page screenshots in which the locations of interference elements are marked as training samples, to further process the image obtained by the screenshot. Specifically, the image can be input into the trained machine learning model to determine the position of an interference element in the image, and image processing is then performed on the image based on that position to delete the interference element.

The embodiments of this specification also provide a computer device, which includes at least a memory, a processor, and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the aforementioned page screenshot method when the program is executed.

FIG. 10 shows a more specific hardware structure diagram of a computing device provided by an embodiment of this specification. The device may include a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 realize the communication connection between each other in the device through the bus 1050.

The processor 1010 may be implemented by a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits, etc., and executes related programs to realize the technical solutions provided in the embodiments of this specification.

The memory 1020 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, etc. The memory 1020 may store an operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented by software or firmware, related program codes are stored in the memory 1020 and called and executed by the processor 1010.

The input/output interface 1030 is used to connect an input/output module to realize information input and output. The input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions. The input device may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output device may include a display, a speaker, a vibrator, an indicator light, and the like.

The communication interface 1040 is used to connect a communication module (not shown in the figure) to realize the communication interaction between the device and other devices. The communication module can realize communication through wired means (such as USB, network cable, etc.), or through wireless means (such as mobile network, WIFI, Bluetooth, etc.).

The bus 1050 includes a path to transmit information between various components of the device (for example, the processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040).

It should be noted that although the above device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040, and the bus 1050, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art can understand that the above-mentioned device may also include only the components necessary to implement the solutions of the embodiments of this specification, and need not include all the components shown in the figures.

From the description of the foregoing implementations, those skilled in the art can clearly understand that the embodiments of this specification can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of this specification can be embodied in the form of a software product, which can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions to make a computer device (which may be a personal computer, a server, a network device, etc.) execute the methods described in the various embodiments, or in some parts of the embodiments, of this specification.

The systems, devices, modules, or units illustrated in the above embodiments may be specifically implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. The computer may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email sending and receiving device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

The various embodiments in this specification are described in a progressive manner; the same or similar parts of the various embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the device embodiment is basically similar to the method embodiment, its description is relatively brief, and for related parts reference may be made to the description of the method embodiment. The device embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separated. When implementing the solutions of the embodiments of this specification, the functions of the modules may be implemented in one or more pieces of software and/or hardware. It is also possible to select some or all of the modules according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement this without creative work.

The above are only specific implementations of the embodiments of this specification. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the embodiments of this specification, and these improvements and modifications should also be regarded as falling within the protection scope of the embodiments of this specification.

Claims

A page screenshot method, including:

In response to the screenshot request initiated by the user, obtain the uniform resource locator URL of the page to be screenshotted;

Load the page to be screenshotted according to the URL, and determine whether the page element related to the screenshot has been loaded during the loading process; wherein the page element related to the screenshot is a page element designated by the user;

In response to the completion of loading of page elements related to the screenshot, the loading of the page to be screenshot is stopped, and a screenshot of the loaded part of the page to be screenshot is performed.
The method according to claim 1, wherein the screenshot request initiated by the user carries the URL or a string used to indicate the URL of the page to be screenshotted;

The obtaining the uniform resource locator URL of the page to be screenshot includes:

Obtain the URL carried in the screenshot request initiated by the user; or,

Obtain the character string carried in the screenshot request initiated by the user, and parse the character string to obtain the URL indicated by the character string.
The method according to claim 1, wherein said determining whether the page elements related to the screenshot are loaded completely comprises:

Judging whether the loaded page element contains a screenshot irrelevant element indicating that the screenshot-related page element has been loaded;

If it is, it is determined that the page elements related to the screenshot have been loaded.
The method according to claim 3, wherein the screenshot irrelevant elements include page elements related to advertisements.
The method according to claim 1, wherein said determining whether the page elements related to the screenshot are loaded completely comprises:

Determine the last element of the page elements related to the screenshot based on the page structure of the page to be screenshotted;

Determine whether the last element is included in the loaded element;

If the last element is included in the loaded elements, it is determined that the page elements related to the screenshot have been loaded.
The method according to claim 1, wherein before taking a screenshot of the loaded part of the page to be screenshot, the method further comprises:

Preprocessing is performed on the loaded part of the page to be screenshot; the preprocessing includes deleting the screenshot interference elements in the page to be screenshot.
The method according to claim 6, wherein the preprocessing further comprises any one or a combination of the following preprocessing methods:

Expand hidden elements in the page;

Change the display style of the specified page elements;

Add screenshot markers to elements on the page.
The method according to claim 6, wherein the user's screenshot request carries demand information indicating a preprocessing mode;

The preprocessing of the page to be screenshot includes:

Determine a preprocessing mode according to the demand information carried in the screenshot request of the user, and perform preprocessing on the page to be captured according to the determined preprocessing mode.
The method according to claim 1, wherein said taking a screenshot of the loaded part of the page to be screenshot comprises:

Determining whether the size of the loaded part of the page to be screenshot is greater than a preset threshold;

If so, divide the loaded part of the page to be screenshot into several fragments, and record the positional relationship between the several fragments;

Take screenshots of the several segments respectively;

The screenshots of the several fragments are spliced into screenshots of the loaded part of the page to be screenshot according to the recorded position relationship.
The method according to claim 1, further comprising:

The image of the page to be screenshot obtained by the screenshot is input into a recognition model to identify the position of each interference element in that image; wherein the recognition model is a machine learning model trained using, as training samples, screenshots of a number of pages marked with the positions of interference elements;

The page element located at the identified position in the image obtained by the screenshot is deleted as an interference element.
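The step of matching an identified position to a page element might look like the sketch below. The recognition model itself is not shown: its output is assumed to be a list of bounding boxes in image coordinates, and the page layout is assumed to be available as a mapping from element id to bounding box. The overlap threshold of 0.5 and all names are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def overlap_ratio(element, detected):
    """Fraction of the element's area that the detected box covers."""
    width = min(element.right, detected.right) - max(element.left, detected.left)
    height = min(element.bottom, detected.bottom) - max(element.top, detected.top)
    if width <= 0 or height <= 0:
        return 0.0
    area = (element.right - element.left) * (element.bottom - element.top)
    return (width * height) / area if area else 0.0


def elements_to_delete(layout, detected_boxes, min_overlap=0.5):
    """Ids of page elements located at positions the model flagged as interference."""
    return [
        element_id
        for element_id, box in layout.items()
        if any(overlap_ratio(box, d) >= min_overlap for d in detected_boxes)
    ]
```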
The method according to claim 1, wherein the screenshot request initiated by the user carries a specification identifier for indicating the screenshot specification;

The screenshot of the loaded part of the page to be screenshot includes:

According to the screenshot specification indicated by the specification identifier carried in the screenshot request initiated by the user, a screenshot is taken of the loaded part of the page to be screenshot.
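By way of example, a specification identifier could be resolved through a lookup table such as the one sketched here. The identifiers, the fields of `ScreenshotSpec` and the request key `spec_id` are all hypothetical; the application does not define what a screenshot specification contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenshotSpec:
    width: int          # viewport width in pixels
    image_format: str   # e.g. "png" or "jpeg"
    full_page: bool     # capture the whole loaded part or only the viewport


SPECS = {  # hypothetical identifiers
    "desktop-full": ScreenshotSpec(1280, "png", True),
    "mobile-viewport": ScreenshotSpec(375, "jpeg", False),
}


def resolve_spec(request, default="desktop-full"):
    """Look up the specification named by the identifier carried in the request."""
    identifier = request.get("spec_id", default)
    try:
        return SPECS[identifier]
    except KeyError:
        raise ValueError(f"unknown screenshot specification: {identifier!r}") from None
```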
A page screenshot method, including:

In response to the screenshot request initiated by the user, obtain the uniform resource locator URL of the page to be screenshotted;

Load the page to be captured according to the URL;

After the loading of the page to be screenshot is completed, preprocessing the page to be screenshot; the preprocessing includes deleting the screenshot interference elements in the page to be screenshot;

Take a screenshot of the pre-processed page to be screenshot.
The method according to claim 12, wherein the screenshot request initiated by the user carries the URL, or a string used to indicate the URL of the page to be screenshotted;

The obtaining the uniform resource locator URL of the page to be screenshot includes:

Obtain the URL carried in the screenshot request initiated by the user; or,

Obtain the character string carried in the screenshot request initiated by the user, and parse the character string to obtain the URL address indicated by the character string.
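The two ways of obtaining the URL can be sketched as follows. The sketch assumes that the request is a dictionary with a `url` key or a `text` key, and that the character string is free text containing an http or https URL; both assumptions, and the function names, are illustrative and are not taken from the application.

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def is_url(text):
    """True if the text parses as an absolute http(s) URL."""
    parts = urlparse(text.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def url_from_request(request):
    """Return the URL carried directly, or the one parsed from the carried string."""
    direct = request.get("url")
    if direct and is_url(direct):
        return direct.strip()
    match = URL_PATTERN.search(request.get("text", ""))
    if match:
        # Strip punctuation that commonly trails a URL inside a sentence.
        return match.group(0).rstrip(".,;)")
    return None
```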
The method according to claim 12, the preprocessing further comprises any one or a combination of the following preprocessing methods:

Expand hidden elements in the page;

Change the display style of the specified page elements;

Add screenshot markers to elements on the page.
The method according to claim 12, wherein the user's screenshot request carries demand information indicating a preprocessing mode;

The preprocessing of the page to be screenshot includes:

Determine a preprocessing mode according to the demand information carried in the screenshot request of the user, and perform preprocessing on the page to be captured according to the determined preprocessing mode.
The method according to claim 12, the method further comprising:

While the page to be screenshot is being loaded, it is determined whether the page elements related to the screenshot have been loaded; wherein the page elements related to the screenshot are page elements designated by the user;

In response to the completion of loading of the page elements related to the screenshot, the loading is stopped.
The method according to claim 16, said determining whether the page elements related to the screenshot have been loaded, comprising:

Determine whether the loaded page elements include a screenshot irrelevant element, that is, an element whose presence indicates that the page elements related to the screenshot have been loaded;

If it is, it is determined that the page elements related to the screenshot have been loaded.
The method according to claim 17, wherein the screenshot irrelevant elements include page elements related to advertisements.
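A minimal sketch of stopping the load once an advertisement-related element arrives is given below. It assumes that the page arrives as a sequence of HTML chunks and that advertisement elements can be recognised by a class name; `AD_CLASSES`, `SentinelWatcher` and `load_until_sentinel` are hypothetical names.

```python
from html.parser import HTMLParser

AD_CLASSES = {"ad", "advert"}  # hypothetical class names


class SentinelWatcher(HTMLParser):
    """Sets sentinel_seen once a start tag carrying an advertisement class is parsed."""

    def __init__(self):
        super().__init__()
        self.sentinel_seen = False

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        if AD_CLASSES & set(classes.split()):
            self.sentinel_seen = True


def load_until_sentinel(chunks):
    """Consume chunks until the sentinel appears; return the loaded part and a flag."""
    watcher = SentinelWatcher()
    loaded = []
    for chunk in chunks:
        loaded.append(chunk)
        watcher.feed(chunk)
        if watcher.sentinel_seen:
            break  # stop loading; later chunks are never requested
    return "".join(loaded), watcher.sentinel_seen
```

Passing a generator as `chunks` means the remaining chunks are never fetched once the sentinel is seen.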
The method according to claim 16, said determining whether the page elements related to the screenshot have been loaded, comprising:

Determine the last element of the page elements related to the screenshot based on the page structure of the page to be screenshotted;

Determine whether the last element is included in the loaded element;

If the last element is included in the loaded elements, it is determined that the page elements related to the screenshot have been loaded.
The method according to claim 12, wherein the screenshot of the page to be screenshot after the preprocessing is completed includes:

Determining whether the size of the page to be screenshot after the preprocessing is completed is greater than a preset size threshold;

If yes, divide the pre-processed page to be screenshot into several fragments, and record the positional relationship between the several fragments;

Take screenshots of the several segments respectively;

The screenshots of the several fragments are spliced into a screenshot of the page to be screenshot that is completed by the preprocessing according to the recorded position relationship.
The method according to claim 12, the method further comprising:

The image of the page to be screenshot obtained by the screenshot is input into a recognition model to identify the position of each interference element in that image; wherein the recognition model is a machine learning model trained using, as training samples, screenshots of a number of pages marked with the positions of interference elements;

The page element located at the identified position in the image obtained by the screenshot is deleted as an interference element.
The method according to claim 12, wherein the screenshot request initiated by the user carries a specification identifier for indicating the screenshot specification;

The screenshot of the pre-processed page to be screenshotted includes:

According to the screenshot specification indicated by the specification identifier carried in the screenshot request initiated by the user, a screenshot is performed on the page to be screenshot after the preprocessing is completed.
A page screenshot device, including:

The URL acquisition module, in response to the screenshot request initiated by the user, acquires the uniform resource locator URL of the page to be screenshotted;

The page loading module loads the page to be screenshotted according to the URL, and determines whether the page elements related to the screenshot have been loaded during the loading process; wherein the page elements related to the screenshot are page elements specified by the user;

The execution module, in response to the completion of loading of page elements related to the screenshot, stops the loading of the page to be screenshot, and takes a screenshot of the loaded part of the page to be screenshot.
The device according to claim 23, wherein the screenshot request initiated by the user carries the URL or a string used to indicate the URL of the page to be screenshotted;

The URL acquisition module further:

Obtain the URL carried in the screenshot request initiated by the user; or,

Obtain the character string carried in the screenshot request initiated by the user, and parse the character string to obtain the URL indicated by the character string.
The device according to claim 23, the page loading module further:

Determine whether the loaded page elements include a screenshot irrelevant element, that is, an element whose presence indicates that the page elements related to the screenshot have been loaded;

If it is, it is determined that the page elements related to the screenshot have been loaded.
The device according to claim 25, wherein the screenshot irrelevant elements include page elements related to advertisements.
The device according to claim 23, the page loading module further:

Determine the last element of the page elements related to the screenshot based on the page structure of the page to be screenshotted;

Determine whether the last element is included in the loaded element;

If the last element is included in the loaded elements, it is determined that the page elements related to the screenshot have been loaded.
The device according to claim 23, the device further comprising:

The preprocessing module performs preprocessing on the loaded part of the page to be screenshot; the preprocessing includes deleting the screenshot interference elements in the page to be screenshot.
The device according to claim 28, wherein the preprocessing further comprises any one or a combination of the following preprocessing methods:

Expand hidden elements in the page;

Change the display style of the specified page elements;

Add screenshot markers to elements on the page.
The device according to claim 28, wherein the user's screenshot request carries demand information indicating a preprocessing mode;

The preprocessing module further:

Determine a preprocessing mode according to the demand information carried in the screenshot request of the user, and perform preprocessing on the page to be captured according to the determined preprocessing mode.
The device according to claim 23, the execution module further:

Determining whether the size of the loaded part of the page to be screenshot is greater than a preset threshold;

If so, divide the loaded part of the page to be screenshot into several fragments, and record the positional relationship between the several fragments;

Take screenshots of the several segments respectively;

The screenshots of the several fragments are spliced, according to the recorded positional relationship, into a single screenshot of the loaded part of the page to be screenshot.
The device according to claim 23, the device further comprising an image processing module, wherein:

The image of the page to be screenshot obtained by the screenshot is input into a recognition model to identify the position of each interference element in that image; wherein the recognition model is a machine learning model trained using, as training samples, screenshots of a number of pages marked with the positions of interference elements;

The page element located at the identified position in the image obtained by the screenshot is deleted as an interference element.
The device according to claim 23, wherein the screenshot request initiated by the user carries a specification identifier for indicating a screenshot specification;

The execution module further:

In response to the completion of loading of the page elements related to the screenshot, stop the loading, and take a screenshot of the loaded part of the page to be screenshot according to the screenshot specification indicated by the specification identifier carried in the screenshot request initiated by the user.
A page screenshot device, including:

The URL acquisition module, in response to the screenshot request initiated by the user, acquires the uniform resource locator URL of the page to be screenshotted;

The page loading module loads the page to be captured according to the URL;

The page preprocessing module, when the page to be screenshot is loaded, preprocesses the page to be screenshot; the preprocessing includes deleting the screenshot interference elements in the page to be screenshot;

The screenshot execution module takes screenshots of the pre-processed page to be screenshot.
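The four modules recited above can be pictured as a simple composition of callables. This is a sketch only: each module is represented by an injected function, the request is assumed to be a dictionary, and the class name `ScreenshotDevice` and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScreenshotDevice:
    acquire_url: Callable[[dict], str]       # URL acquisition module
    load_page: Callable[[str], str]          # page loading module
    preprocess: Callable[[str], str]         # page preprocessing module
    take_screenshot: Callable[[str], bytes]  # screenshot execution module

    def handle(self, request: dict) -> bytes:
        """Run the modules in the recited order for one screenshot request."""
        url = self.acquire_url(request)
        page = self.load_page(url)
        cleaned = self.preprocess(page)  # runs only after loading has returned
        return self.take_screenshot(cleaned)
```

Injecting the modules keeps each one replaceable, so that, for instance, a different preprocessing mode can be supplied without touching the loading or screenshot logic.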
The device according to claim 34, wherein the screenshot request initiated by the user carries the URL, or a string used to indicate the URL of the page to be screenshotted;

The URL acquisition module further:

Obtain the URL carried in the screenshot request initiated by the user; or,

Obtain the character string carried in the screenshot request initiated by the user, and parse the character string to obtain the URL address indicated by the character string.
The device according to claim 34, wherein the preprocessing further comprises any one or a combination of the following preprocessing methods:

Expand hidden elements in the page;

Change the display style of the specified page elements;

Add screenshot markers to elements on the page.
The device according to claim 34, wherein the user's screenshot request carries demand information indicating a preprocessing mode;

The page preprocessing module further:

Determine a preprocessing mode according to the demand information carried in the screenshot request of the user, and perform preprocessing on the page to be captured according to the determined preprocessing mode.
The device according to claim 34, further comprising a dynamic loading module, wherein:

While the page to be screenshot is being loaded, it is determined whether the page elements related to the screenshot have been loaded; wherein the page elements related to the screenshot are page elements designated by the user;

In response to the completion of loading of the page elements related to the screenshot, the loading is stopped.
The device according to claim 38, the dynamic loading module further:

Determine whether the loaded page elements include a screenshot irrelevant element, that is, an element whose presence indicates that the page elements related to the screenshot have been loaded;

If it is, it is determined that the page elements related to the screenshot have been loaded.
The device according to claim 39, wherein the screenshot irrelevant elements include page elements related to advertisements.
The device according to claim 38, the dynamic loading module further:

Determine the last element of the page elements related to the screenshot based on the page structure of the page to be screenshotted;

Determine whether the last element is included in the loaded element;

If the last element is included in the loaded elements, it is determined that the page elements related to the screenshot have been loaded.
The device according to claim 34, the screenshot execution module further:

Determining whether the size of the page to be screenshot after the preprocessing is completed is greater than a preset size threshold;

If yes, divide the pre-processed page to be screenshot into several fragments, and record the positional relationship between the several fragments;

Take screenshots of the several segments respectively;

The screenshots of the several fragments are spliced into a screenshot of the page to be screenshot that is completed by the preprocessing according to the recorded position relationship.
The device according to claim 34, further comprising an image processing module, wherein:

The image of the page to be screenshot obtained by the screenshot is input into a recognition model to identify the position of each interference element in that image; wherein the recognition model is a machine learning model trained using, as training samples, screenshots of a number of pages marked with the positions of interference elements;

The page element located at the identified position in the image obtained by the screenshot is deleted as an interference element.
The device according to claim 34, wherein the screenshot request initiated by the user carries a specification identifier for indicating the screenshot specification;

The screenshot execution module further:

According to the screenshot specification indicated by the specification identifier carried in the screenshot request initiated by the user, a screenshot is performed on the page to be screenshot after the pre-processing is completed.
A computer device comprising at least a memory, a processor, and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the method described in any one of claims 1-11 when the processor executes the program.
A computer device comprising at least a memory, a processor, and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the method according to any one of claims 12-22 when the processor executes the program.