CN117951400A

CN117951400A - Page processing method, device, electronic equipment and storage medium

Info

Publication number: CN117951400A
Application number: CN202211275393.8A
Authority: CN
Inventors: 王洪琪
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2022-10-18
Filing date: 2022-10-18
Publication date: 2024-04-30

Abstract

Embodiments of the application provide a page processing method, a page processing apparatus, an electronic device, and a storage medium, and relate to the field of computer technology. The page processing method comprises: receiving a page screenshot request for a target uniform resource locator (URL); acquiring, based on the page screenshot request, the current pages of at least one currently running browser instance; if no cached page of the target URL exists among the current pages of the at least one browser instance, determining a target page from the at least one current page based on the page state of the at least one current page, where the page state is one of expired page, unused page, or unexpired page; and rendering the target URL with the target page and generating a page screenshot for the target URL. This increases the number of page screenshot requests that can be processed simultaneously while avoiding browser errors.

Description

Page processing method, device, electronic equipment and storage medium

Technical Field

The present application relates to the field of computers, and in particular, to a page processing method, apparatus, electronic device, computer readable storage medium, and computer program product.

Background

When browsing pages, a user may capture a screenshot of a page or of a region of a page. In the related art, the browser page screenshot function may be implemented with a program debugging tool; for example, a browser page screenshot may be taken with Puppeteer, which can be used to simulate the operation of a browser.

Currently, when a screenshot request for a target URL (uniform resource locator) is received, a browser instance of the browser is initialized and a new page is created to render the target URL, from which the corresponding screenshot is generated. When screenshot requests arrive at a high frequency, the browser is prone to errors because browser instances must be restarted repeatedly to generate the screenshots.

Disclosure of Invention

Embodiments of the application aim to provide a page processing method, apparatus, electronic device, and storage medium that avoid browser errors when screenshot requests arrive at a high frequency. To achieve this purpose, the technical solutions provided by the embodiments of the application are as follows:

In one aspect, an embodiment of the present application provides a method for processing a page, where the method includes:

receiving a page screenshot request for a target uniform resource locator (URL);

acquiring, based on the page screenshot request, a current page of at least one currently running browser instance;

if no cached page of the target URL exists among the current pages of the at least one browser instance, determining a target page from the at least one current page based on the page state of the at least one current page, where the page state is one of expired page, unused page, or unexpired page;

rendering the target URL with the target page, and generating a page screenshot for the target URL.

In some possible embodiments, the method further comprises:

if the cached page of the target URL exists in the current page of the at least one browser instance, generating a page screenshot aiming at the target URL based on the cached page of the target URL.

In some possible embodiments, determining the target page from the at least one current page based on the page status of the at least one current page includes:

if it is determined, based on the page state, that at least one unused page or expired page exists among the at least one current page, taking the unused page or the expired page as the target page.

In some possible embodiments, if it is determined based on the page state that every one of the at least one current page is an unexpired page, determining the target page from the at least one current page based on the page state of the at least one current page includes:

taking, among the at least one unexpired page, the unexpired page with the earliest use time as the target page.

In some possible embodiments, the method further comprises:

if at least one of the following conditions is detected, performing a restart operation on each of the at least one browser instance in turn:

the current time reaches a set time;

the time interval since the last restart operation reaches a preset time interval;

the memory usage of at least one browser instance in the browser reaches a preset threshold;

wherein, for each browser instance, the restart operation includes:

determining the number of pages in the browser instance, initializing the browser instance, and creating, in the initialized browser instance, new pages equal in number to the determined number of pages.
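The restart trigger and the restart operation described above can be sketched as follows. This is a minimal illustration only: the names `RestartPolicy`, `should_restart`, and `restart_instance`, the dictionary used to stand in for a browser instance, and the `"about:blank"` placeholder page are assumptions, not part of the application.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RestartPolicy:
    # Hypothetical configuration holder; the field names are illustrative.
    scheduled_time: datetime      # the "set time" at which a restart is due
    min_interval: timedelta       # preset interval since the last restart
    memory_threshold_bytes: int   # preset memory threshold per instance


def should_restart(now, last_restart, instance_memory_bytes, policy):
    """Return True if at least one of the three restart conditions holds."""
    reached_set_time = now >= policy.scheduled_time
    interval_elapsed = (now - last_restart) >= policy.min_interval
    memory_exceeded = any(
        used >= policy.memory_threshold_bytes for used in instance_memory_bytes
    )
    return reached_set_time or interval_elapsed or memory_exceeded


def restart_instance(instance):
    """Re-create an instance with the same number of (blank) pages as before."""
    page_count = len(instance["pages"])
    # A real implementation would close and relaunch the browser here.
    return {"pages": ["about:blank"] * page_count}
```

Counting the pages before initialization keeps the page pool the same size across restarts, so the capacity available to screenshot requests does not change.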

In some possible embodiments, the method further comprises:

marking a browser instance while the restart operation is being performed on it, the mark indicating that the pages of the browser instance cannot be used as pages for generating screenshots;

if it is detected that the restart operation for the browser instance is completed, canceling the mark.

In some possible implementations, the at least one browser instance is restarted via the at least one container; each container is used for running at least one browser instance;

performing the restart operation on each of the at least one browser instance in turn includes:

determining a first restart time corresponding to each container, the first restart time being the time at which the container starts restarting;

determining, for each container, a second restart time for each browser instance in the container;

restarting each browser instance in each container in turn based on the first restart time corresponding to each container and the second restart time of each browser instance in each container.

In some possible embodiments, determining the first restart time corresponding to each container includes:

determining the restarting starting moment of the browser instance;

sequentially determining a first restarting time corresponding to each container based on the restarting starting time, the first number of at least one container and a first preset time interval;

determining, for each container, a second restart time for each browser instance in the container, comprising:

For each container, a second restart time for each browser instance in each container is determined in turn based on the first restart time for each container, the second number of browser instances in each container, and the second preset time interval.
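One way to read the two-level schedule above is as simple offset arithmetic. The sketch below assumes that the c-th container starts at the restart start time plus c times the first preset interval, and that the b-th browser instance in a container restarts at that container's first restart time plus b times the second preset interval. The application does not state the exact formula, so this arithmetic and the function name `restart_schedule` are assumptions.

```python
from datetime import datetime, timedelta


def restart_schedule(start, instances_per_container, container_gap, instance_gap):
    """Compute staggered restart times.

    start: restart start time for the whole deployment
    instances_per_container: number of browser instances in each container
    container_gap: first preset time interval (between containers)
    instance_gap: second preset time interval (between instances in a container)

    Returns a list of (first_restart_time, [second_restart_time, ...]) tuples,
    one tuple per container.
    """
    schedule = []
    for c, instance_count in enumerate(instances_per_container):
        first_time = start + c * container_gap
        second_times = [first_time + b * instance_gap for b in range(instance_count)]
        schedule.append((first_time, second_times))
    return schedule
```

For example, with a start time of 03:00, two containers holding two and one instances, a 10-minute container gap, and a 2-minute instance gap, the instances restart at 03:00, 03:02, and 03:10.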

In some possible embodiments, the method further comprises:

if no currently running page is obtained, storing the page screenshot request for the target URL in a request task queue;

if a screenshot generation notification is received, processing the page screenshot request that was stored in the request task queue earliest.
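The queueing behaviour above amounts to a first-in, first-out buffer that is drained one request at a time as screenshots complete. A minimal sketch follows; the class name `ScreenshotQueue` and its method names are hypothetical, and whether a page is available is passed in as a flag rather than looked up.

```python
from collections import deque


class ScreenshotQueue:
    """FIFO buffer for screenshot requests that cannot be served immediately."""

    def __init__(self):
        self._pending = deque()

    def submit(self, request, page_available):
        """Return the request if it can run now; otherwise enqueue it."""
        if page_available:
            return request
        self._pending.append(request)
        return None

    def on_screenshot_generated(self):
        """Called on a screenshot generation notification.

        Returns the earliest stored request, or None if the queue is empty.
        """
        if self._pending:
            return self._pending.popleft()
        return None
```

Using `deque.popleft()` guarantees that the request stored earliest is the one processed next.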

In some possible implementations, rendering a target URL with a target page, generating a page screenshot for the target URL includes:

rendering the target URL with the target page, and recording a timestamp at which rendering starts;

if a delay identifier of the target URL is detected, determining the delay duration corresponding to the delay identifier;

generating the page screenshot for the target URL once the delay duration has elapsed since the timestamp at which rendering started.
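The delayed screenshot can be sketched as waiting out whatever remains of the delay duration, measured from the rendering-start timestamp. In the sketch below, the request field `delay_ms` is a hypothetical stand-in for the delay identifier, and the clock and sleep functions are injectable only so that the logic can be exercised without real waiting.

```python
import time


def delay_seconds_for(request):
    """Map the (hypothetical) delay identifier of a request to a duration."""
    delay_ms = request.get("delay_ms")
    return delay_ms / 1000.0 if delay_ms else 0.0


def wait_until_ready(render_started_at, delay_seconds,
                     now=time.monotonic, sleep=time.sleep):
    """Sleep until delay_seconds have elapsed since render_started_at.

    Returns the number of seconds actually waited (0.0 if none).
    """
    remaining = render_started_at + delay_seconds - now()
    if remaining > 0:
        sleep(remaining)
        return remaining
    return 0.0
```

Measuring from the recorded timestamp rather than from the moment the delay is detected means time already spent rendering counts toward the delay.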

In another aspect, an embodiment of the present application provides a page processing apparatus, including:

a receiving module, configured to receive a page screenshot request for a target uniform resource locator (URL);

an acquisition module, configured to acquire, based on the page screenshot request, a current page of at least one currently running browser instance;

a determining module, configured to determine a target page from the at least one current page based on the page state of the at least one current page if no cached page of the target URL exists among the current pages of the at least one browser instance, where the page state is one of expired page, unused page, or unexpired page;

a rendering module, configured to render the target URL with the target page and generate a page screenshot for the target URL.

In some possible implementations, the apparatus further includes a generating module configured to generate a page screenshot for the target URL based on the cached page of the target URL if the cached page of the target URL exists among the current pages of the at least one browser instance.

In some possible embodiments, the determining module is specifically configured to, when determining the target page from the at least one current page based on the page status of the at least one current page:

And if the at least one current page is determined to be the unexpired page based on the use state, using the unexpired page with the earliest use time as the target page in the at least one unexpired page.

In some possible embodiments, the apparatus further includes a restart module configured to:

The current time reaches the set time;

the restarting module is specifically configured to:

In some possible embodiments, the apparatus further comprises a marking module for:

The restarting module is specifically configured to, when sequentially performing restarting operations on at least one browser instance, respectively:

In some possible embodiments, the restarting module is specifically configured to, when determining the first restarting time corresponding to each container separately:

determining the restarting starting moment of the browser instance;

In some possible embodiments, the device further comprises a storage module for:

In some possible embodiments, the rendering module is specifically configured to, when rendering the target URL with the target page and generating the screenshot for the target URL:

In another aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a memory and a processor, and the memory stores a computer program, and the processor executes the computer program to implement the method provided in any of the alternative embodiments of the present application.

In another aspect, embodiments of the present application also provide a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the method provided in any of the alternative embodiments of the present application.

In another aspect, embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the method provided in any of the alternative embodiments of the present application.

The technical scheme provided by the embodiment of the application has the following beneficial effects:

When a page screenshot request for a target URL is received, the current pages of at least one currently running browser instance are first obtained. If no cached page of the target URL exists among the current pages, a target page is determined from the at least one current page based on the page state of the at least one current page, the target URL is rendered with the target page, and the page screenshot is generated. Because the final page screenshot is generated directly from a current page of the at least one browser instance, no browser instance needs to be restarted for each page screenshot request. The page screenshot task therefore proceeds normally even when page screenshot requests arrive at a high frequency, which increases the number of page screenshot requests processed simultaneously while avoiding browser errors.

Furthermore, if at least one page still has a cache page aiming at the target URL, the cache page is directly called without rendering and loading again aiming at the target URL, and the corresponding page screenshot can be obtained by screenshot according to the screenshot type in the page screenshot request, so that the efficiency of page screenshot processing can be effectively improved.

Furthermore, when the page screenshot request is processed each time, the browser instance does not need to be restarted, namely, the browser instance does not need to be initialized to newly build the page, so that the processing efficiency of the page screenshot request can be effectively improved.

Furthermore, when the current time reaches the set time, the time interval since the last restart operation reaches the preset time interval, or the memory usage of at least one browser instance in the browser reaches the preset threshold, the restart operation is performed on each browser instance. The browser instances are thus re-initialized periodically and invalid caches are cleared in time, which prevents excessive browser memory usage from affecting normal operation, reduces the error rate of the screenshot service, and keeps the screenshot service stable over the long term.

Furthermore, the first restart time of each container and the second restart time of each browser instance in each container are determined, and the browser instances are then restarted in turn according to the first restart times and the second restart times. When the preset condition is met, the plurality of browser instances are thus restarted one after another with staggered restart times, so that the other browser instances keep running normally while one browser instance is restarting. This avoids the request failures that would occur if screenshot requests were received while all browser instances were restarting.

Furthermore, when the number of page screenshot requests is large, the page screenshot requests are stored in the request task queue; when a screenshot generation notification is received, the page screenshot request that was stored in the request task queue earliest is processed, which effectively improves the concurrency with which page screenshot requests can be handled.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that are required to be used in the description of the embodiments of the present application will be briefly described below.

FIG. 1 is a schematic diagram of an application environment of a page processing method according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of a page processing method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a solution for processing a page of a browser instance in one example provided by the present application;

FIG. 4 is a schematic diagram of a scheme of page processing in one example provided by the present application;

FIG. 5 is a schematic diagram of a scheme for sequentially starting different Docker containers in one example provided by the present application;

FIG. 6 is a schematic diagram of a scheme for storing a page screenshot task request in a task queue in one example provided by the present application;

FIG. 7 is a schematic diagram of a scheme of latency settings in rendering a target page in one example provided by the present application;

FIG. 8 is a schematic diagram of a page processing method in one example provided by the present application;

FIG. 9 is a schematic structural diagram of a page processing apparatus according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the drawings in the present application. It should be understood that the embodiments described below with reference to the drawings are exemplary descriptions for explaining the technical solutions of the embodiments of the present application, and the technical solutions of the embodiments of the present application are not limited.

As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and "comprising," when used in this specification, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term; for example, "A and/or B" may be implemented as "A", as "B", or as "A and B". When a plurality of (two or more) items are described and the relationship between them is not explicitly defined, the description may refer to one, several, or all of the items; for example, the description "the parameter A includes A1, A2, A3" may be implemented such that the parameter A includes A1, or A2, or A3, and may also be implemented such that the parameter A includes at least two of the three items A1, A2, and A3.

For a better description and understanding of the solution provided by the embodiments of the present application, first, some related technical terms involved in the embodiments of the present application will be described:

Puppeteer: a Node.js package released by the Chrome (a web browser) development team in 2017 for simulating the operation of a browser. Node.js is a JavaScript runtime environment based on the Chrome V8 engine; JavaScript is a lightweight, interpreted or just-in-time compiled programming language with first-class functions.

MySQL: an open-source relational database management system.

Docker: open-source software that provides an open platform for developing, shipping, and running applications. Docker allows users to separate applications from the infrastructure into smaller units (containers), thereby increasing the speed of software delivery.

Puppeteer offers developers a more convenient installation environment, better performance, and more convenient syntax, and supports functions such as web page screenshots, website crawling, and automated UI (User Interface) testing. However, when Puppeteer is used as a long-running service in an application system, it becomes a high-availability bottleneck, and problems such as request timeouts and service errors occur frequently.

Existing solutions deploy the screenshot service on Puppeteer by initializing multiple browser instances and having the browsers restart repeatedly in a loop to process screenshot requests and share the load. Such solutions have the following problems: high-concurrency access is not supported, and the browser reports errors when the access rate exceeds 10 requests per second; requests take too long, with the average time per request over a day close to 5 s; and the browser often reports timeout errors, or various subtle errors appear after the service has been running for some time.

The present solution addresses these problems with the following improvements. First, management of browser instances in the original solution is replaced by management of browser tab pages, which reduces the time spent destroying browser instances. In addition, caching tab pages further shortens page rendering time. Finally, fine-grained restarts based on tab pages ensure that the browser does not suffer memory leaks or page bugs.

Optionally, the page processing according to the embodiments of the present application may be implemented based on cloud technology; for example, the page rendering steps may be implemented using cloud technology. Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software, and networks within a wide area network or a local area network to realize the computation, storage, processing, and sharing of data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology, and the like applied under the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Cloud computing in the narrow sense refers to a delivery and usage mode of IT infrastructure, namely obtaining the required resources over a network in an on-demand and easily scalable manner; cloud computing in the broad sense refers to a delivery and usage mode of services, namely obtaining the required services over a network in an on-demand and easily scalable manner. Such services may be IT, software, or internet related, or may be other services. With the development of the internet, real-time data streams, and the diversification of connected devices, and driven by the demands of search services, social networks, mobile commerce, open collaboration, and the like, cloud computing has developed rapidly. Unlike earlier parallel distributed computing, the emergence of cloud computing will conceptually drive a revolutionary transformation of the whole internet model and the enterprise management model.

The technical solution provided by the present application and the technical effects produced by the technical solution of the present application are described below by describing several alternative embodiments. It should be noted that the following embodiments may be referred to, or combined with each other, and the description will not be repeated for the same terms, similar features, similar implementation steps, and the like in different embodiments.

Fig. 1 is a schematic diagram of an application environment of a page processing method according to an embodiment of the present application. The application environment may include a terminal 101 and a server 102. Specifically, the terminal 101 sends a page screenshot request for a target URL (uniform resource locator) to the server 102, and the server 102 receives the page screenshot request for the target URL; the server 102 obtains, based on the page screenshot request, the current pages of at least one currently running browser instance; if a cached page of the target URL exists among the current pages of the at least one browser instance, the server 102 generates a page screenshot for the target URL based on the cached page of the target URL; if no cached page of the target URL exists among the current pages of the at least one browser instance, the server 102 determines a target page from the at least one current page based on the page state of the at least one current page; the server 102 renders the target URL with the target page, generates a page screenshot for the target URL, and transmits the page screenshot to the terminal 101.

In the above scenario, the page is processed by the server, and in other scenarios, the terminal may also directly process the page. Those skilled in the art will recognize that the above scenario is only one example scenario, and the application scenario of the page processing method of the present application is not limited.

It can be appreciated by those skilled in the art that the server may be an independent physical server, or may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server or a server cluster that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Network, content distribution networks), and basic cloud computing services such as big data and artificial intelligence platforms.

A terminal (which may also be referred to as a user terminal or user device) may be, but is not limited to, a smart phone, tablet, notebook, desktop computer, smart voice interaction device (e.g., a smart speaker), wearable electronic device (e.g., a smart watch), vehicle-mounted terminal, smart home appliance (e.g., a smart television), AR/VR device, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.

Fig. 2 is a schematic flow chart of a page processing method according to an embodiment of the present application, where the method may be executed by a server, as shown in fig. 2, and the method may include the following steps:

Step S201, a page screenshot request for a target uniform resource locator (URL) is received.

The page screenshot request may carry the target URL and may further include a screenshot type, for example, a screenshot of the entire page corresponding to the URL or a screenshot of part of the page corresponding to the URL. The request may also be configured to simulate a screenshot on a mobile phone, to export the web page directly as a PDF file by configuring pdf, or to use a delay so that a page with a long rendering time, such as a page that draws a report, is displayed more completely before the screenshot is taken.

Step S202, based on the page screenshot request, a current page of at least one browser instance running currently is obtained.

Wherein, a browser can correspond to at least one browser instance, and each browser instance corresponds to at least one page.

Specifically, the at least one browser instance of the browser may include browser instances that are restarting as well as browser instances that are running. If a browser instance is running, its current pages can be obtained; if a browser instance is being restarted or is not running, its current pages cannot be obtained.

Step S203, if no cached page of the target URL exists among the current pages of the at least one browser instance, a target page is determined from the at least one current page based on the page state of the at least one current page.

The page state is one of expired page, unused page, or unexpired page.

An unused page is a blank page and can be regarded as a default page. An expired page is a page for which the time since it was accessed or loaded exceeds a preset time threshold. An unexpired page is a page that has been used but for which the time since it was accessed or loaded does not exceed the preset time threshold.

Specifically, different time thresholds may be set for different pages, or the same time threshold may be set.

The target page may be a free page, such as an unused page, or a page that does not need to be reused, such as an expired page, and the specific determination process for the target page will be described in further detail below.

Step S204, rendering the target URL by using the target page, and generating a screenshot aiming at the target URL.

Specifically, rendering the target URL with the target page, generating a page screenshot for the target URL may include:

Rendering the target URL by using the target page to obtain a rendered target page;

And carrying out screenshot on the rendered target page based on the page screenshot request to obtain a page screenshot.

Specifically, the screen capturing of the rendered target page can be performed according to the screen capturing type aiming at the target URL in the page screen capturing request, for example, if the screen capturing type is the screen capturing aiming at the full page, the full page of the rendered target page can be captured.

In a specific implementation, when the browser is created, the server simultaneously generates a first management object and a second management object. The first management object may be called pMgr (page manager, i.e., page management object). It is mainly used to manage the pages of the browser instances, including page initialization and configuration, and to store information such as the URL, the screenshot generation time, the page state, whether the current page is valid, and the page configuration and loading. The second management object may be called bMgr (browser manager, i.e., browser instance management object). bMgr mainly manages browser instances, and all browser instances under the current Docker container can be obtained through bMgr.

As shown in fig. 3, there may be three browser instances, namely B1, B2, and B3 in the figure, and each browser instance corresponds to a plurality of pages; for example, B1 corresponds to three pages: page1, page2, and page3. When the browser instances are initialized, the first management object pMgr directly manages the pages across browser instances, which makes the fullest possible use of page caching and page concurrency management.
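The division of labour between the two management objects can be sketched with plain data records. The class and field names below (`PageRecord`, `BrowserRecord`, `BrowserManager`, `PageManager`, `restarting`, and so on) are illustrative assumptions; the application names only pMgr and bMgr and lists the kinds of information they hold.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PageRecord:
    # Information a page manager might keep per page (assumed field names).
    page_id: str
    url: Optional[str] = None             # URL last rendered in this page
    last_used_at: Optional[float] = None  # None means the page is unused


@dataclass
class BrowserRecord:
    browser_id: str
    pages: List[PageRecord] = field(default_factory=list)
    restarting: bool = False              # set while a restart is in progress


class BrowserManager:
    """Plays the role of bMgr: knows every browser instance in the container."""

    def __init__(self, browsers):
        self.browsers = list(browsers)

    def running(self):
        return [b for b in self.browsers if not b.restarting]


class PageManager:
    """Plays the role of pMgr: manages pages across browser instances."""

    def __init__(self, browser_manager):
        self.browser_manager = browser_manager

    def current_pages(self):
        # Pages of restarting instances are skipped.
        return [p for b in self.browser_manager.running() for p in b.pages]
```

Flattening the pages of all running instances into one list is what lets a request reuse any page, regardless of which browser instance owns it.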

In the above embodiment, when a page screenshot request for a target URL is received, the current pages of at least one currently running browser instance are first obtained. If no cached page of the target URL exists among the current pages, a target page is determined from the at least one current page based on the page state of the at least one current page, the target URL is rendered with the target page, and the page screenshot is generated. Because the final page screenshot is generated directly from a current page of the at least one browser instance, no browser instance needs to be restarted for each page screenshot request. The page screenshot task therefore proceeds normally even when page screenshot requests arrive at a high frequency, which increases the number of page screenshot requests processed simultaneously while avoiding browser errors.

In addition, when the page screenshot request is processed each time, the browser instance does not need to be restarted, namely, the browser instance does not need to be initialized to reestablish the page, and the processing efficiency of the page screenshot request can be effectively improved.

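In some possible embodiments, the method further comprises:

if a cached page of the target URL exists among the current pages of the at least one browser instance, generating a page screenshot for the target URL based on the cached page of the target URL.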

Wherein, the cache page of the target URL refers to a page which has been accessed or loaded with the target URL.

Specifically, if at least one page still has a cache page aiming at the target URL, the cache page is directly called without rendering and loading again aiming at the target URL, and the corresponding page screenshot can be obtained by screenshot according to the screenshot type in the page screenshot request.
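The cache check can be sketched as a lookup over the current pages by URL. The dictionary shape used for a page (`"id"` and `"url"` keys) and the function name `find_cached_page` are assumptions made for illustration.

```python
def find_cached_page(pages, target_url):
    """Return the first page that already holds target_url, or None.

    pages: iterable of dicts with an "id" key and a "url" key
           ("url" is None for a page that has not loaded anything yet)
    """
    for page in pages:
        if page.get("url") == target_url:
            return page
    return None
```

When the lookup returns a page, the screenshot can be taken from that page directly; when it returns `None`, a target page has to be chosen and the URL rendered.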

The specific process of determining a target page will be further described in connection with embodiments below.

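In some possible embodiments, determining, in step S204, a target page from the at least one current page based on the page state of the at least one current page includes: if it is determined based on the page state that at least one unused page or expired page exists in the at least one current page, taking the unused page or the expired page as the target page.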

Specifically, the state of a page can be judged from its used time. If the used time is 0, that is, the page has not been used yet, the page is an unused page; if the used time is greater than 0 and less than or equal to a preset time threshold, the page is a used, unexpired page; if the used time is greater than the preset time threshold, the page is an expired page.
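The three-way classification above can be illustrated with a minimal Python sketch. The names `PageState` and `classify_page`, and the use of seconds as the time unit, are assumptions made for illustration and do not appear in the application.

```python
from enum import Enum


class PageState(Enum):
    UNUSED = "unused"
    UNEXPIRED = "unexpired"
    EXPIRED = "expired"


def classify_page(used_seconds: float, threshold_seconds: float) -> PageState:
    """Classify a page by its used time, following the rule in the text."""
    if used_seconds < 0:
        raise ValueError("used time cannot be negative")
    if used_seconds == 0:
        # The page has not been used yet.
        return PageState.UNUSED
    if used_seconds <= threshold_seconds:
        # Used, but still within the preset time threshold.
        return PageState.UNEXPIRED
    # Used for longer than the preset time threshold.
    return PageState.EXPIRED
```

With a hypothetical threshold of 60 seconds, a used time of 0 gives an unused page, 60 gives an unexpired page (the boundary is inclusive), and 61 gives an expired page.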

In the implementation process, if an unused page exists among the at least one current page, the unused page is unoccupied and can be called directly to render the target URL; that is, the unused page can be used as the target page.

If an expired page exists among the at least one current page, the page has existed for a long time without being updated and is unlikely to be needed again, so the page can be called to render the target URL; that is, the expired page can be used as the target page.

In some possible embodiments, determining, in step S204, a target page from the at least one current page based on the page state of the at least one current page includes: if it is determined based on the page state that each of the at least one current page is an unexpired page, taking the unexpired page with the earliest use time among the at least one unexpired page as the target page.

Specifically, determining based on the page state that each of the at least one current page is an unexpired page means that there is neither an unused page nor an expired page among the at least one current page.

In the implementation process, the unexpired page with the earliest use time among the at least one unexpired page, namely the page that is closest to expiration, can be used as the target page.

As shown in fig. 4, in one example, the page processing method may include the steps of:

if a page screenshot request for a target URL is received, judging whether a cache page for the URL exists among the current pages of the at least one currently running browser instance, namely judging whether a page object caching the URL exists in the page queue shown in the figure;

if yes, calling the cache page directly, without rendering the page, and generating the page screenshot according to the cache page;

if not, further judging whether an unused page or an expired page exists among the at least one current page, namely whether an unused or expired page object exists as shown in the figure;

if yes, calling the unused page or the expired page, and using it to render the target URL and generate the page screenshot;

if not, calling the page with the earliest use time from the at least one current page, namely calling the page object that was generated earliest according to the generation time shown in the figure, and using this unexpired page with the earliest use time to render the target URL and generate the page screenshot.
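The decision flow of fig. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the `Page` record, its fields `url` and `used_since`, the function `select_page` and the parameter `ttl` (standing in for the preset time threshold) are all hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Page:
    name: str
    url: Optional[str] = None           # URL currently loaded; None if never used
    used_since: Optional[float] = None  # time the page was put to use; None if unused


def select_page(pages: List[Page], target_url: str,
                now: float, ttl: float) -> Tuple[Page, bool]:
    """Return (page, cache_hit) following the order described in fig. 4."""
    if not pages:
        raise ValueError("no current page available")
    # 1. A page that already loaded the target URL is a cache page.
    for page in pages:
        if page.url == target_url:
            return page, True
    # 2. Otherwise prefer an unused page or an expired page.
    for page in pages:
        if page.used_since is None or now - page.used_since > ttl:
            return page, False
    # 3. Otherwise every page is unexpired: take the one used earliest,
    #    which is the one closest to expiration.
    return min(pages, key=lambda p: p.used_since), False
```

When `cache_hit` is true the caller can capture the page directly; otherwise the returned page is the target page and must first render the target URL.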

The above embodiment describes a specific procedure of how to determine the target page, and a specific procedure of restarting for the browser instance will be described below in connection with the embodiment.

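In some possible embodiments, the page processing method further includes: if at least one of the following conditions is detected, sequentially performing a restart operation on each of the at least one browser instance:

the current time reaches a set time;

the time interval between the current time and the last restart operation reaches a preset time interval;

the memory occupation of the at least one browser instance in the browser reaches a preset threshold.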

In the implementation process, the restart operation can be performed on the at least one browser instance in sequence at a specified time, that is, when the current time reaches the set time. The restart operation can also be performed in sequence at preset time intervals, that is, when the time elapsed since the last restart operation reaches the preset time interval. The restart operation can further be performed in sequence when the memory occupation of the at least one browser instance in the browser reaches the preset threshold.
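The three trigger conditions can be combined as in the following sketch. The function name `should_restart` and its parameters are hypothetical. It also assumes that the memory condition is met as soon as any single browser instance reaches the threshold, which is one possible reading of the text, and that the caller moves `scheduled_at` forward after each restart.

```python
from datetime import datetime, timedelta
from typing import Iterable


def should_restart(now: datetime, scheduled_at: datetime, last_restart: datetime,
                   interval: timedelta, memory_bytes: Iterable[int],
                   memory_limit: int) -> bool:
    """Return True if at least one of the restart conditions holds."""
    reached_set_time = now >= scheduled_at             # the set time has arrived
    interval_elapsed = now - last_restart >= interval  # preset interval has passed
    # Assumption: one instance at or above the limit is enough to trigger.
    memory_exceeded = any(m >= memory_limit for m in memory_bytes)
    return reached_set_time or interval_elapsed or memory_exceeded
```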

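Specifically, for each browser instance, the restart operation includes: determining the number of pages in the browser instance, initializing the browser instance, and creating, in the initialized browser instance, new pages corresponding to the number of pages.

Specifically, when the restart operation is performed on a browser instance, each page in the browser instance is initialized to a blank page; that is, the same number of new pages is created.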

In the above embodiment, the restart operation is performed on the browser instances when the current time reaches the set time, when the time interval since the last restart operation reaches the preset time interval, or when the memory occupation of at least one browser instance in the browser reaches the preset threshold. The browser instances can thus be initialized regularly and invalid caches cleared in time, which prevents excessive browser memory occupation from affecting normal operation, reduces the error rate of the screenshot service, and keeps the screenshot service stable over the long term.

In some possible embodiments, the method further comprises: marking a browser instance that is in the process of executing the restart operation, the mark indicating that the pages of that browser instance cannot be used as pages for generating a screenshot.

Specifically, if a page screenshot request is received while a browser instance is executing the restart operation, and a page of that browser instance is used to generate the screenshot, the browser instance may be initialized while the page is rendering. The page being rendered is then initialized as well, and no screenshot is obtained. Marking the browser instance avoids this situation.

Specifically, once the browser instance has completed the restart operation, its pages can again be used for screenshots, and the mark may be cancelled at this point.
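A minimal sketch of the restart operation together with the mark is shown below. It models a browser instance as a plain Python object rather than a real browser; `BrowserInstance`, `restarting` and `usable_pages` are hypothetical names, and re-creating pages is represented by generating new page labels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BrowserInstance:
    name: str
    pages: List[str] = field(default_factory=list)
    restarting: bool = False  # the mark: pages must not be used while True

    def restart(self) -> None:
        """Re-create the same number of blank pages, marked for the duration."""
        self.restarting = True
        try:
            page_count = len(self.pages)
            # Stand-in for initializing the instance and opening new blank pages.
            self.pages = [f"{self.name}-blank-{i}" for i in range(page_count)]
        finally:
            # Cancel the mark once the restart has completed.
            self.restarting = False


def usable_pages(instances: List[BrowserInstance]) -> List[str]:
    """Collect pages only from instances that are not marked as restarting."""
    return [page for inst in instances if not inst.restarting for page in inst.pages]
```

A scheduler that only draws pages from `usable_pages` never hands out a page of an instance that is in the middle of a restart.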

In some possible implementations, the at least one browser instance is restarted via at least one container; each container is used for running at least one browser instance.

The container may be a Docker container. Docker is open source software for developing, delivering and running applications.

Specifically, one browser corresponds to one Docker container, one Docker container may correspond to at least one browser instance, and one browser instance may correspond to at least one page.

Specifically, the restarting operation is sequentially and respectively performed on at least one browser instance, including:

(1) Determining the first restart time corresponding to each container.

Wherein the first restart time is the time at which the container starts to restart.

Specifically, starting the restart of a container restarts the at least one browser instance corresponding to that container, so the first restart time of each container may be determined first.

Determining the first restart time corresponding to each container includes:

a. determining the restart starting time of the browser instances;

b. determining the first restart time corresponding to each container in turn, based on the restart starting time, the first number of the at least one container and a first preset time interval.

The restart starting time of the browser instances is the time at which the browser instances as a whole start to restart.

Specifically, a restart sequence of the plurality of containers may be preset, and starting from the restart starting time, the first restart time corresponding to each container is determined in turn according to the restart sequence and the first preset time interval.

For example, if the first preset time interval is 1 hour, there are 3 containers, each container has a plurality of browser instances, and the restart starting time is 12:00, then the first container may be set to start restarting at 12:00, the second at 13:00 and the third at 14:00.

In the specific implementation process, stepped restarting is carried out among different Docker containers by synchronizing a flag bit in a database:

The restart flag bit in MySQL (a database) is acquired in turn according to the sequence among the different Docker containers, which communicate with one another through MySQL. The flag bit has an initial value of 0. Each Docker container reads the flag bit, starts its restart after a delay equal to the value read multiplied by a fixed time difference, and at the same time writes the flag bit back incremented by one. If the value of the flag bit equals the number of Docker containers, all Docker containers have started to restart, and the flag bit is reset to 0.
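The flag-bit protocol can be illustrated with an in-memory stand-in for the database row. This sketch does not use MySQL: the class `RestartFlag` and the function `restart_delay` are hypothetical, and a `threading.Lock` is assumed here to make the read-and-increment step atomic, in place of whatever locking the database provides.

```python
import threading


class RestartFlag:
    """In-memory stand-in for the restart flag bit shared between containers."""

    def __init__(self, container_count: int) -> None:
        self._value = 0                 # initial value of the flag bit
        self._count = container_count
        self._lock = threading.Lock()   # assumed to make read + write-back atomic

    def claim_slot(self) -> int:
        """Read the flag bit, write it back plus one, reset when all have read."""
        with self._lock:
            slot = self._value
            self._value += 1
            if self._value == self._count:
                # Every container has claimed a slot: reset for the next round.
                self._value = 0
            return slot


def restart_delay(slot: int, step_seconds: float) -> float:
    """Delay before a container starts restarting: value read times the step."""
    return slot * step_seconds
```

With three containers, successive calls to `claim_slot` return 0, 1 and 2, and the fourth call returns 0 again because the flag bit was reset. With a hypothetical fixed time difference of 3600 seconds, the container that read 2 waits 7200 seconds.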

(2) For each container, a second restart time for each browser instance in the container is determined.

Specifically, once the first restart time of each container has been determined in turn, the browser instances corresponding to each container are treated as one batch, and the restart time of each browser instance within each batch is then determined batch by batch.

The first restart time of each container is the time when all browser instances in each container start to restart.

Specifically, the restarting sequence of the multiple browser instances in each container may be preset, and from the first restarting time, the second restarting time corresponding to each browser instance is sequentially determined according to the restarting sequence of the multiple browser instances and the second preset time interval.

Continuing the example above, in which the first preset time interval is 1 hour, there are 3 containers, each container has a plurality of browser instances, and the restart starting time is 12:00, the three containers start restarting at 12:00, 13:00 and 14:00 respectively. The restart time of each browser instance in each container is then determined separately. For example, if the first restart time of the first container is 12:00, the second preset time interval is 10 minutes, and there are 5 browser instances in the first container, the five browser instances are restarted in turn at 12:00, 12:10, 12:20, 12:30 and 12:40.
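The two-level schedule in this example is simple arithmetic and can be reproduced with the sketch below. The function names are hypothetical, and the date used in the usage note is only a placeholder for a day on which the restart begins at 12:00.

```python
from datetime import datetime, timedelta
from typing import List


def container_restart_times(start: datetime, container_count: int,
                            first_interval: timedelta) -> List[datetime]:
    """First restart time of each container, spaced by the first preset interval."""
    return [start + i * first_interval for i in range(container_count)]


def instance_restart_times(first_restart: datetime, instance_count: int,
                           second_interval: timedelta) -> List[datetime]:
    """Second restart time of each instance, spaced by the second preset interval."""
    return [first_restart + j * second_interval for j in range(instance_count)]
```

Calling `container_restart_times(datetime(2022, 10, 18, 12, 0), 3, timedelta(hours=1))` gives 12:00, 13:00 and 14:00, and calling `instance_restart_times` on the first of these with 5 instances and a 10 minute interval gives 12:00, 12:10, 12:20, 12:30 and 12:40. In this example the last instance of a container restarts before the next container begins, so the batches do not overlap.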

In the implementation process, within the same Docker container the browser instances can be restarted step by step, with the restart of each browser instance delayed by a fixed time relative to the previous one.

(3) Restarting each browser instance in each container in turn, based on the first restart time corresponding to each container and the second restart time of each browser instance in each container.

As shown in fig. 5, in one example, taking one container as an example, at least one browser instance is restarted sequentially at scheduled times. The application program in the Docker container opens a MySQL transaction lock and obtains the current flag bit, namely the current restart sequence number, from MySQL. After reading the flag bit, each Docker container starts its restart after a delay equal to the value read multiplied by a fixed time difference, and writes the flag bit back incremented by one. If the value of the flag bit equals the number of Docker containers, all Docker containers have started to restart, and the flag bit is reset to 0. In this way the first restart time of each Docker container can be determined in turn, and the second restart time of each browser instance of each Docker container can then be determined in turn.

In the above embodiment, the first restart time of each container and the second restart time of each browser instance in each container are determined, and each browser instance is then restarted in turn according to the first restart time and the second restart time. When the preset conditions are met, the plurality of browser instances are therefore restarted one after another, with their restart times staggered. While one browser instance is restarting, the other browser instances operate normally, which avoids the request failures that would occur if a screenshot request arrived while all browser instances were restarting.

In some possible embodiments, the page processing method may further include:

(1) if no currently running page is obtained, storing the page screenshot request for the target URL into a request task queue;

(2) if a screenshot generation notification is received, processing the page screenshot request that was stored in the request task queue earliest.

In the implementation process, when there are many page screenshot requests and concurrency is high, all browser pages may be occupied and no further page screenshot request can be processed immediately. The excess requests can be cached in a task queue; that is, if no currently running page is obtained, the page screenshot request for the target URL is stored into the request task queue.

If a screenshot generation notification is received, that is, any page has completed a task, the page screenshot request stored in the request task queue earliest can be processed, that is, the task queue can be called back, and the cached task can be processed.

As shown in fig. 6, taking Puppeteer as the executor of the screenshot task as an example, when there are too many page screenshot requests, all browser pages are occupied, and the page screenshot requests can be stored in a task queue, namely the task queue including tasks 1-4 in the figure. When a screenshot generation notification is received, the page screenshot request that was stored in the task queue earliest is processed; that is, task 1 is taken out of the task queue for processing.

In the above embodiment, when there are many page screenshot requests, the requests are stored in the request task queue, and when a screenshot generation notification is received, the page screenshot request that was stored in the request task queue earliest is processed, so that the concurrency of page screenshot requests can be effectively improved.
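The queueing behaviour can be sketched with a first-in, first-out queue. The class `ScreenshotScheduler` and its members are hypothetical names; the sketch only tracks which URLs have started processing and does not render anything.

```python
from collections import deque
from typing import Deque, List


class ScreenshotScheduler:
    """Hands requests to free pages and queues the rest in arrival order."""

    def __init__(self, page_count: int) -> None:
        self.free_pages = page_count
        self.pending: Deque[str] = deque()  # the request task queue
        self.started: List[str] = []        # URLs whose processing has begun

    def request(self, url: str) -> bool:
        """Start the request if a page is free; otherwise store it in the queue."""
        if self.free_pages > 0:
            self.free_pages -= 1
            self.started.append(url)
            return True
        self.pending.append(url)
        return False

    def on_screenshot_done(self) -> None:
        """Screenshot generation notification: a page has finished its task."""
        if self.pending:
            # Reuse the page for the request that was stored earliest.
            self.started.append(self.pending.popleft())
        else:
            self.free_pages += 1
```

With a single page, a first request starts immediately and two further requests are queued; each completion notification then starts the oldest queued request.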

In the implementation process, some URL pages take a long time to render, for example pages with canvas drawings. In this case the screenshot may be taken before rendering has finished. A delay identifier can therefore be configured for such a URL, so that the screenshot operation is performed only after a delay duration has elapsed from the start of rendering.

As shown in fig. 7, the target page is used to render the target URL, and a time stamp for the start of rendering is recorded, that is, the URL and the time stamp are saved. Whether a delay setting exists is then judged. If a delay setting exists, that is, a delay identifier of the target URL is detected, the delay duration corresponding to the delay identifier is determined, and the page screenshot for the target URL is generated once the delay duration has elapsed after the time stamp for the start of rendering; that is, the delayed callback is performed using setTimeout in the figure, where setTimeout is a timed execution function. If no delay setting exists, the screenshot operation can be performed directly.
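The delay logic of fig. 7 can be sketched as follows. This is a simplified illustration: it waits with a blocking sleep, whereas the figure uses a non-blocking setTimeout callback. The function `screenshot_with_delay`, the `delay_config` mapping from URL to delay duration, and the injectable `sleep` and `clock` parameters are all assumptions made so that the example is self-contained and testable.

```python
import time
from typing import Callable, Dict


def screenshot_with_delay(url: str,
                          delay_config: Dict[str, float],
                          capture: Callable[[str], str],
                          sleep: Callable[[float], None] = time.sleep,
                          clock: Callable[[], float] = time.monotonic) -> str:
    """Capture the URL, waiting first if a delay is configured for it."""
    started_at = clock()                # time stamp for the start of rendering
    delay = delay_config.get(url, 0.0)  # delay duration; 0 means no delay setting
    if delay > 0:
        # Wait until the delay duration has elapsed since rendering started.
        remaining = started_at + delay - clock()
        if remaining > 0:
            sleep(remaining)
    return capture(url)
```

A URL without an entry in `delay_config` is captured immediately, while a URL configured with, say, 2 seconds is captured only after that time has passed since the recorded time stamp.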

In order to more clearly illustrate the page processing method of the present application, the page processing method of the present application will be further described below with reference to examples.

In one example, as shown in fig. 8, the page processing method of the present application may include the following steps:

if a page screenshot request for a target URL is received, judging whether an available page exists among the currently running pages;

if an available page exists, judging whether a cache page for the URL exists among the current pages of the at least one currently running browser instance, namely judging whether a page object caching the URL exists in the page queue shown in the figure;

if not, calling the page with the earliest use time from the at least one current page, namely calling the page object that was generated earliest according to the generation time shown in the figure, and using this unexpired page with the earliest use time to render the target URL and generate the page screenshot;

if no available page exists among the currently running pages, judging whether the task queue has exceeded its limit;

if yes, returning a response indicating that the browser is busy;

if not, storing the page screenshot request in the task queue; when a screenshot generation notification is received, the page screenshot request that was stored in the task queue earliest can be processed, that is, the task queue can be called back and the cached task processed.
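The admission branch of fig. 8, which decides between rendering, queueing and reporting that the browser is busy, can be sketched as below. The function `admit_request`, its string return values and the parameter `queue_limit` are hypothetical; the application does not state how the queue limit is expressed.

```python
from collections import deque
from typing import Deque


def admit_request(url: str, free_pages: int,
                  queue: Deque[str], queue_limit: int) -> str:
    """Decide what happens to a new page screenshot request."""
    if free_pages > 0:
        # An available page exists: continue with cache lookup and rendering.
        return "render"
    if len(queue) >= queue_limit:
        # The task queue is full: report that the browser is busy.
        return "busy"
    # No page available but room in the queue: cache the request.
    queue.append(url)
    return "queued"
```

With a hypothetical limit of two queued tasks, the first two requests that find no free page are queued and the third receives the busy response.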

According to the page processing method, when a page screenshot request for the target URL is received, the current page of at least one currently running browser instance is obtained first. If no cache page of the target URL exists among the current pages, the target page is determined from the at least one current page based on the page state of the at least one current page, and the target page is then used to render the target URL and generate the page screenshot. Because the final page screenshot is generated directly from a current page of the at least one browser instance, a browser instance does not need to be restarted for each page screenshot request, and page screenshot tasks can proceed normally even when requests arrive at a high frequency. This increases the number of page screenshot requests that can be processed simultaneously while avoiding browser errors.

As shown in fig. 9, in some possible embodiments, there is provided a page processing apparatus, including:

a receiving module 901, configured to receive a page screenshot request for a target Uniform Resource Locator (URL);

an obtaining module 902, configured to obtain, based on the page screenshot request, a current page of at least one currently running browser instance;

a determining module 903, configured to determine, if no cache page of the target URL exists in the current page of the at least one browser instance, a target page from the at least one current page based on a page state of the at least one current page; the page state comprises an expired page, an unexpired page or an unused page;

a rendering module 904, configured to render the target URL using the target page and generate a page screenshot for the target URL.

In some possible embodiments, the apparatus further comprises a generating module configured to:

In some possible embodiments, the determining module 903 is specifically configured to, when determining the target page from the at least one current page based on the page state of the at least one current page:

The current time reaches the set time;

the restarting module is specifically configured to:

determining the restarting starting moment of the browser instance;

In some possible implementations, the rendering module 904 is specifically configured to, when rendering the target URL with the target page and generating a page screenshot for the target URL:

According to the page processing device, when a page screenshot request for the target URL is received, the current page of at least one currently running browser instance is obtained first. If no cache page of the target URL exists among the current pages, the target page is determined from the at least one current page based on the page state of the at least one current page, and the target page is then used to render the target URL and generate the page screenshot. Because the final page screenshot is generated directly from a current page of the at least one browser instance, a browser instance does not need to be restarted for each page screenshot request, and page screenshot tasks can proceed normally even when requests arrive at a high frequency. This increases the number of page screenshot requests that can be processed simultaneously while avoiding browser errors.

The device of the embodiment of the present application may perform the method provided by the embodiment of the present application, and its implementation principle is similar, and actions performed by each module in the device of the embodiment of the present application correspond to steps in the method of the embodiment of the present application, and detailed functional descriptions of each module of the device may be referred to the descriptions in the corresponding methods shown in the foregoing, which are not repeated herein.

An embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory, where the processor, when executing the computer program stored in the memory, may implement a method according to any of the alternative embodiments of the present application.

Fig. 10 shows a schematic structural diagram of an electronic device to which an embodiment of the present invention is applicable. The electronic device may be a server or a user terminal, and may be used to implement the method provided in any embodiment of the present invention.

As shown in fig. 10, the electronic device 1000 may mainly include at least one processor 1001 (one is shown in fig. 10), a memory 1002, a communication module 1003, an input/output interface 1004, and other components, and optionally, the components may be connected to each other by a bus 1005. It should be noted that, the structure of the electronic device 1000 shown in fig. 10 is only schematic, and does not limit the electronic device to which the method provided in the embodiment of the present application is applicable.

The memory 1002 may be used to store an operating system, application programs and the like. The application programs may include computer programs that implement the methods of the embodiments of the present invention when called by the processor 1001, and may also include programs for implementing other functions or services. The memory 1002 may be, but is not limited to, a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and computer programs, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

The processor 1001 is connected to the memory 1002 via the bus 1005, and executes a corresponding function by calling an application program stored in the memory 1002. The processor 1001 may be a CPU (Central Processing Unit), a general purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, transistor logic device, hardware component, or any combination thereof, which may implement or execute the various exemplary logic blocks, modules and circuits described in connection with the present disclosure. The processor 1001 may also be a combination that implements computing functionality, such as a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, or the like.

The electronic device 1000 may be connected to a network through a communication module 1003 (which may include, but is not limited to, a component such as a network interface) to enable interaction of data, such as sending data to or receiving data from other devices (e.g., user terminals or servers, etc.) through communication of the network with the other devices. The communication module 1003 may include a wired network interface and/or a wireless network interface, etc., that is, the communication module may include at least one of a wired communication module or a wireless communication module.

The electronic device 1000 may be connected to a required input/output device, such as a keyboard or a display device, through the input/output interface 1004. The electronic device 1000 may itself have a display device, or may be externally connected to another display device through the interface 1004. Optionally, a storage device, such as a hard disk, may be connected to the interface 1004, so that data in the electronic device 1000 may be stored in the storage device, or data in the storage device may be read, and data in the storage device may be stored in the memory 1002. It is understood that the input/output interface 1004 may be a wired interface or a wireless interface. The device connected to the input/output interface 1004 may be a component of the electronic device 1000, or may be an external device connected to the electronic device 1000 when needed, according to the actual application scenario.

The bus 1005 used to connect the components may include a path to transfer information between the components. The bus 1005 may be a PCI (Peripheral Component Interconnect) bus or an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 1005 may be classified into an address bus, a data bus, a control bus, and the like according to functions.

Alternatively, for the solution provided in the embodiment of the present invention, the memory 1002 may be configured to store a computer program for executing the solution of the present invention, and the processor 1001 is configured to execute the computer program, where the processor 1001 executes the computer program to implement the actions of the method or the apparatus provided in the embodiment of the present invention.

Based on the same principle as the method provided by the embodiment of the present application, the embodiment of the present application provides a computer readable storage medium, where a computer program is stored, where the computer program can implement the corresponding content of the foregoing method embodiment when executed by a processor.

Embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the respective aspects of the method embodiments described above.

It should be noted that the terms "first," "second," "third," "fourth," "1," "2," and the like in the description and claims of the present application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments of the application described herein may be implemented in other sequences than those illustrated or otherwise described.

It should be understood that, although various operation steps are indicated by arrows in the flowcharts of the embodiments of the present application, the order in which these steps are implemented is not limited to the order indicated by the arrows. In some implementations of embodiments of the application, the implementation steps in the flowcharts may be performed in other orders as desired, unless explicitly stated herein. Furthermore, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages based on the actual implementation scenario. Some or all of these sub-steps or phases may be performed at the same time, or each of these sub-steps or phases may be performed at different times, respectively. In the case of different execution time, the execution sequence of the sub-steps or stages can be flexibly configured according to the requirement, which is not limited by the embodiment of the present application.

The foregoing describes only optional implementations of some implementation scenarios of the present application. It should be noted that other similar implementations adopted by those skilled in the art on the basis of the technical ideas of the present application, without departing from those technical ideas, also fall within the protection scope of the embodiments of the present application.

Claims

1. A method of processing a page, comprising:

If the cache page of the target URL does not exist in the current page of the at least one browser instance, determining a target page from the at least one current page based on the page state of the at least one current page; the page state comprises an expired page, an unexpired page or an unused page;

And rendering the target URL by adopting the target page, and generating a page screenshot aiming at the target URL.

2. The page processing method according to claim 1, characterized in that the method further comprises:

and if the cache page of the target URL exists in the current page of the at least one browser instance, generating a screenshot aiming at the target URL based on the cache page of the target URL.

3. The page processing method according to claim 1, wherein the determining a target page from the at least one current page based on the page status of the at least one current page includes:

if at least one unused page or expired page exists in the at least one current page based on the page state, taking the unused page or expired page as the target page;

And if the at least one current page is determined to be the unexpired page based on the use state, using the unexpired page with the earliest time in the at least one unexpired page as the target page.

4. The page processing method according to claim 1, characterized by further comprising:

And if at least one of the following conditions is detected, sequentially and respectively executing restarting operations on the at least one browser instance:

The current time reaches the set time;

the memory occupation of the at least one browser instance in the browser reaches a preset threshold;

wherein, for each browser instance, the restart operation includes:

And determining the number of pages in the browser instance, initializing the browser instance, and creating new pages corresponding to the number of pages in the initialized browser instance.

5. The page processing method as recited in claim 4, further comprising:

Marking a browser instance in the process of executing the restarting operation; the page marked for representing the browser instance cannot be used as the page for generating the screenshot;

6. The page processing method of claim 4, wherein the at least one browser instance is restarted via at least one container; each container is used for running at least one browser instance;

And sequentially and respectively restarting the at least one browser instance, wherein the method comprises the following steps:

respectively determining a first restarting time corresponding to each container; the first restart time is a start restart time of the container;

7. The method of claim 6, wherein determining the first restart time for each container respectively comprises:

determining the restarting starting moment of the browser instance;

Determining a first restarting time corresponding to each container in sequence based on the restarting starting time, the first quantity of at least one container and a first preset time interval;

The determining, for each container, a second restart time for each browser instance in the container, comprising:

for each container, determining a second restart time of each browser instance in each container in turn based on the first restart time of each container, a second number of browser instances in each container, and a second preset time interval.

8. The page processing method according to claim 1, characterized by further comprising:

If the currently running page is not obtained, storing the page screenshot request aiming at the target URL into a request task queue;

and if the screenshot generating notification is received, processing a page screenshot request which is stored in the request task queue earliest.

9. The method of claim 1, wherein the rendering the target URL with the target page generates a screenshot for the target URL, comprising:

rendering the target URL by adopting a target page, and recording a time stamp for starting rendering;

then after the time stamp to begin rendering, delaying the delay duration to generate a page screenshot for the target URL.

10. A page processing apparatus, comprising:

the determining module is used for determining a target page from the at least one current page based on the page state of the at least one current page if the cache page of the target URL does not exist in the current page of the at least one browser instance; the page state comprises an expired page, an unexpired page or an unused page;

And the rendering module is used for rendering the target URL by adopting the target page and generating a screenshot aiming at the target URL.

11. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to carry out the steps of the page processing method of any one of claims 1-9.

12. A computer readable storage medium having stored thereon a computer program, which when executed by a processor performs the steps of the page processing method of any of claims 1-9.