CN109145182B - Data acquisition method and device, computer equipment and system - Google Patents

Publication number: CN109145182B
Other versions: CN109145182A
Application number: CN201710461347.XA
Authority: CN (China)
Other languages: Chinese (zh)
Legal status (as listed; an assumption, not a legal conclusion): Active
Inventors: 齐恒, 徐铭, 金才
Current assignee (as listed; may be inaccurate): Tencent Technology Shenzhen Co Ltd
Original assignee: Tencent Technology Shenzhen Co Ltd
Prior art keywords: script, browser, webpage, server, file

Abstract

The invention discloses a data acquisition method, a data acquisition device, computer equipment and a data acquisition system, belonging to the field of computers. The method comprises the following steps: a browser obtains a first script according to an entry tag, the entry tag being added to a webpage by a management client while the browser displays the webpage; when the browser runs the first script, it obtains a second script corresponding to the webpage from a server, the first script being a script that provides a running environment for the second script, and the second script being a script that collects data from the webpage; when the browser runs the second script, data is collected from the webpage. This solves the problem that, when a terminal collects data through a web crawler, many IP addresses must be configured for the terminal to avoid a single IP address being banned, which makes data collection complex; the complexity of data collection by the terminal is thereby reduced.

Description

Data acquisition method and device, computer equipment and system
Technical Field
The embodiment of the invention relates to the field of computers, in particular to a data acquisition method, a data acquisition device, computer equipment and a data acquisition system.
Background
With the development of network information technology, web page data included in websites, forums, blogs, etc. are increasing, and in order to screen out effective data from web pages, a terminal generally needs to collect data of web pages according to a certain rule, send the data to a server, and analyze and process the data by the server, thereby obtaining classified data.
In a typical data collection method, a terminal collects web page information through a web crawler. The web crawler method works as follows: according to a preset crawling task, the terminal uses the web crawler to send a webpage acquisition request (HTTP request) to the web server at preset time intervals; the web server sends the webpage data to the terminal according to the webpage acquisition request, and the web crawler collects the webpage data. A web crawler is a program that automatically acquires web page content and can simulate the operation of a browser.
Some websites adopt anti-crawler strategies. For example, the web server obtains the Internet Protocol (IP) address of the terminal carried in the webpage acquisition request and detects whether the access track of that IP address meets at least one of the following rules: (1) the interval between web page accesses is fixed; (2) the total number of visits in one day is greater than a preset value. An IP address that meets at least one of these rules is banned from visiting the website again. If the terminal needs to collect webpage information from a website over a long period through the web crawler, many IP addresses must be configured in advance to avoid a single IP address being banned by the website, which increases the complexity of collecting webpage information at the terminal.
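The two anti-crawler rules above can be illustrated with a minimal sketch. The function name `violates_rules`, the `tolerance` parameter and the sample timestamps are assumptions made for illustration; the source does not specify how a website evaluates an access track.

```python
from typing import List


def violates_rules(timestamps: List[float], daily_limit: int,
                   tolerance: float = 1e-6) -> bool:
    """Return True if one day's access times for an IP match rule (1) or (2)."""
    # Rule (2): the total number of visits in one day exceeds the preset value.
    if len(timestamps) > daily_limit:
        return True
    # Rule (1): the interval between consecutive accesses is fixed.
    # At least three accesses are needed to compare two intervals.
    if len(timestamps) >= 3:
        ordered = sorted(timestamps)
        gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
        if max(gaps) - min(gaps) <= tolerance:
            return True
    return False


# Hypothetical access tracks, in seconds since the start of the day.
crawler_track = [0.0, 60.0, 120.0, 180.0]   # fixed 60 s interval
user_track = [0.0, 45.0, 400.0, 410.0]      # irregular intervals
```

Under these assumptions the crawler track is flagged by rule (1), while the irregular user track is flagged only if its length exceeds `daily_limit`.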
Disclosure of Invention
To solve the problem that setting a plurality of IP addresses in a terminal in order to collect data from webpages makes data collection by the terminal more complex, the embodiments of the invention provide a data acquisition method, a data acquisition device, computer equipment and a data acquisition system. The technical solutions are as follows:
in a first aspect, a data acquisition method is provided, which is applied to a terminal installed with a browser and a management client, and the method includes:
the browser acquires a first script according to an entry tag, the entry tag is added to a webpage by the management client in the process of displaying the webpage by the browser, and the entry tag is used for triggering the browser to acquire the first script from a server;
when the browser runs the first script, a second script corresponding to the webpage is obtained from the server, wherein the first script is a script for providing a running environment of the second script, and the second script is a script for acquiring data of the webpage;
and when the browser runs the second script, acquiring data of the webpage.
In a second aspect, a data acquisition method is provided, the method comprising:
sending a first script to at least one terminal, wherein the first script is requested by a browser in the terminal when the browser runs an entry tag that a management client added to a webpage in the process of displaying the webpage;
sending a second script to the terminal, wherein the second script is requested when the browser runs the first script, the first script is a script for providing a running environment of the second script, and the second script is a script for acquiring data of the webpage;
and receiving the data of the webpage collected by the second script.
In a third aspect, a data acquisition device is provided, which is applied in a terminal installed with a management client, and the device includes:
a first obtaining module, configured to obtain the first script according to the entry tag, where the entry tag is a tag that is added to the web page by the management client in a process of displaying the web page, and the entry tag is used to trigger obtaining of the first script from the server;
a second obtaining module, configured to obtain a second script corresponding to the webpage from the server when the first script obtained by the first obtaining module is run, where the first script is a script for providing a running environment of the second script, and the second script is a script for acquiring data of the webpage;
and a data acquisition module, configured to acquire data of the webpage when the second script obtained by the second obtaining module is run.
In a fourth aspect, there is provided a data acquisition apparatus, the apparatus comprising:
a first sending module, configured to send a first script to at least one terminal, where the first script is requested by a browser in the terminal when the browser runs an entry tag that a management client added to a webpage in the process of displaying the webpage;
a second sending module, configured to send a second script to the terminal, where the second script is requested when the browser runs the first script, the first script is a script for providing a running environment of the second script, and the second script is a script for acquiring data of the webpage;
and an acquisition module, configured to acquire the data of the webpage collected by the second script.
In a fifth aspect, a computer device is provided, where the computer device includes a processor and a memory, where the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the data acquisition method provided in the first aspect or the data acquisition method provided in the second aspect.
In a sixth aspect, a data acquisition system is provided, which comprises a terminal and a server,
the terminal comprises the data acquisition device provided by the third aspect;
the server comprises the data acquisition device provided by the fourth aspect.
In a seventh aspect, a computer-readable storage medium is provided, in which at least one instruction, at least one program, a set of codes, or a set of instructions is stored, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by a processor to implement the data acquisition method provided in the first aspect or the data acquisition method provided in the second aspect.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
While the browser displays a webpage, the management client adds an entry tag to the webpage, which triggers the browser to obtain the first script and the second script, so that the management client can trigger the browser to collect data from the webpage through the second script according to the behavior of a user accessing the webpage. This solves the problem that, when a terminal collects data through a web crawler, many IP addresses must be configured for the terminal to avoid a single IP address being banned, which makes data collection complex. Because users' webpage access behavior is usually irregular, and each user does not visit the same website very many times in one day, a website usually does not ban the IP address of the terminal used by a user. In addition, even if the website bans the IP address of the terminal used by one user, the data in the webpage can still be collected through other users' access to the webpage, so data can be collected without setting a plurality of IP addresses in the same terminal, which reduces the complexity of data collection by the terminal.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a data acquisition system according to an embodiment of the present invention;
FIG. 2A is a flow chart of a data collection method provided by an embodiment of the invention;
FIG. 2B is a schematic diagram of terminal display data according to an embodiment of the present invention;
FIG. 2C is a schematic diagram of terminal display data according to an embodiment of the present invention;
FIG. 2D is a schematic diagram of terminal display data according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a terminal determining whether to issue a first script according to another embodiment of the present invention;
FIG. 4 is a block diagram of a data acquisition device provided in accordance with an embodiment of the present invention;
FIG. 5 is a block diagram of a data acquisition device provided in accordance with an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
First, a number of terms related to embodiments of the present invention will be described.
The browser: for requesting and displaying web pages from the server. Alternatively, the browser may include only a function of requesting and displaying a web page; alternatively, the browser includes both the functionality to request and display web pages, and other functionality, such as: instant messaging function, online payment function. All clients with the functions of requesting and displaying web pages belong to the browser in the embodiment of the invention.
The management client: used for monitoring web pages requested by the browser. When the management client monitors a web page, it can obtain the web page identifier of that web page.
Optionally, the management client can also detect whether a malicious program exists in the corresponding webpage according to the webpage identifier.
Optionally, the web page identifier is a web page address, also called a Uniform Resource Locator (URL); alternatively, the web page identifier is a domain name of the web page, such as: com; cn, etc.
Optionally, the management client obtains the address identifier. The Address identifier may be an Internet Protocol Address (IP Address), or may also be an identifier of a geographic location where the terminal is located, which is not limited in this embodiment.
The second script: used for collecting data from the web page while the browser displays the web page. Optionally, the second scripts are preconfigured by the developer for different web pages; that is, each web page corresponds to at least one second script, and each second script collects data of one type.
For example, a shopping website includes an on-the-hour flash-sale page and a daily special-price page. The flash-sale page corresponds to a second script A and a second script B, and the daily special-price page corresponds to a second script C. The second script A collects data for items on the flash-sale page whose stock quantity is greater than a preset quantity; the second script B collects data for items on the flash-sale page whose price is lower than a preset price. The second script C collects data for items on the daily special-price page whose price is lower than the preset price.
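The correspondence between web pages and second scripts in this example can be sketched as follows. This is an illustration only: in the source the second scripts are js files running in the browser, whereas here they are modelled as plain Python functions, and the page keys, thresholds and item fields are hypothetical.

```python
from typing import Callable, Dict, List

Item = Dict[str, float]
Collector = Callable[[List[Item]], List[Item]]

PRESET_STOCK = 100     # hypothetical preset quantity
PRESET_PRICE = 50.0    # hypothetical preset price


def script_a(items: List[Item]) -> List[Item]:
    """Second script A: items whose stock exceeds the preset quantity."""
    return [item for item in items if item["stock"] > PRESET_STOCK]


def script_b(items: List[Item]) -> List[Item]:
    """Second script B: items priced below the preset price."""
    return [item for item in items if item["price"] < PRESET_PRICE]


def script_c(items: List[Item]) -> List[Item]:
    """Second script C: same price filter, applied to the daily special page."""
    return [item for item in items if item["price"] < PRESET_PRICE]


# Each web page corresponds to at least one second script.
SCRIPTS_BY_PAGE: Dict[str, Dict[str, Collector]] = {
    "flash_sale": {"A": script_a, "B": script_b},
    "daily_special": {"C": script_c},
}


def collect(page: str, items: List[Item]) -> Dict[str, List[Item]]:
    """Run every second script registered for the page; one data type per script."""
    return {name: script(items) for name, script in SCRIPTS_BY_PAGE[page].items()}


sample_items = [{"stock": 150, "price": 80.0}, {"stock": 10, "price": 20.0}]
```

Calling `collect("flash_sale", sample_items)` yields one result list per script, which mirrors the statement that each second script collects data of one type.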
Optionally, in this embodiment of the present invention, the second script is a js file. A js file is a file written in the JavaScript scripting language and is usually used together with at least one of HTML, PHP, and Java. js files are used in the web field; they generally cannot be opened directly and can be used only when run through a web page.
The first script: used to provide a running environment for the second script while the browser displays the web page. By running the first script, the terminal can control the loading and running of the second script. To run the second script, the browser must have the first script embedded in the web page in advance, and it runs the second script through the first script. In other words, the first script provides the browser with an entry point for running the second script.
Optionally, in this embodiment of the present invention, the first script is a main.js file.
Optionally, different web pages correspond to the same main.js file; in other words, the main.js file is common to all web pages.
An application scenario of the embodiment of the present invention is described below.
Referring to fig. 1, a schematic structural diagram of a data acquisition system according to an embodiment of the present invention is shown. The system includes at least one terminal 110 and a server 120.
The terminal 110 may be a mobile phone, a tablet computer, a wearable device, a laptop portable computer, a desktop computer, and the like, which is not limited in this embodiment.
The terminal 110 has installed therein a browser 111 and a management client 112. The terminal 110 establishes a communication connection with the server 120 through a wireless network manner or a wired network manner.
The management client 112 sends a first file address request to the server 120 through a communication connection between the terminal 110 and the server 120, and accordingly, the server 120 sends a file address of a first script to the terminal 110 according to the first file address request, and the management client 112 obtains the file address of the first script and adds the file address of the first script to a page code of a webpage currently displayed by the browser 111.
In the process of executing the page code of the current web page, the browser 111 detects a file address of the first script, sends a first file obtaining request to the server 120, correspondingly, the server 120 sends the first script to the terminal 110 according to the first file obtaining request, and the browser 111 obtains the first script.
Because the server 120 first sends the file address of the first script to the management client 112 and only then sends the first script to the browser 111, the server 120 can change the storage path (file address) of the first script after the first script has been issued. The next time the browser 111 requests the first script, it can still obtain it from the modified storage path sent by the server 120, which improves the flexibility with which the server 120 stores the first script.
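The address-then-file exchange described above can be sketched as a two-step lookup. The class `ScriptServer`, its method names and the example paths are hypothetical; the sketch only shows why the server can move the first script without affecting later requests.

```python
from typing import Dict, Optional


class ScriptServer:
    """Toy stand-in for server 120: stores the first script at a changeable path."""

    def __init__(self) -> None:
        self._files: Dict[str, str] = {}
        self._current_path: Optional[str] = None

    def publish(self, path: str, content: str) -> None:
        """Store the first script at a new path and drop the old location."""
        self._files = {path: content}
        self._current_path = path

    def handle_address_request(self) -> Optional[str]:
        """First file address request: return the current file address."""
        return self._current_path

    def handle_file_request(self, path: str) -> str:
        """First file obtaining request: return the script stored at the address."""
        return self._files[path]


def fetch_first_script(server: ScriptServer) -> str:
    # The management client asks for the address and adds it to the page code.
    address = server.handle_address_request()
    # The browser then requests the file at that address.
    return server.handle_file_request(address)


server = ScriptServer()
server.publish("/v1/main.js", "running environment")   # hypothetical path
first_fetch = fetch_first_script(server)
server.publish("/v2/main.js", "running environment")   # storage path changed
second_fetch = fetch_first_script(server)
```

Because the address is looked up on every request, the second fetch succeeds even though the original path no longer exists.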
Optionally, the browser 111 sends a second file address request to the server 120 through the communication connection between the terminal 110 and the server 120, and accordingly, the server 120 sends the file address of the second script to the terminal 110 according to the second file address request. The first script in the browser 111 acquires the file address of the second script.
The browser 111 detects a file address of the second script through the first script, sends a second file obtaining request to the server 120, correspondingly, the server 120 sends the second script to the terminal 110 according to the second file obtaining request, and the first script in the browser 111 obtains the second script.
Because the server 120 first sends the file address of the second script to the first script in the browser 111 and only then sends the second script to the first script, the server 120 can change the storage path (file address) of the second script after the second script has been issued. The next time the browser 111 requests the second script, it can still obtain it from the modified storage path sent by the server 120, which improves the flexibility with which the server 120 stores the second script.
The browser 111 runs the second script through the first script to collect data from the currently displayed web page, and transmits the data to the server 120.
Optionally, the second script is a js file corresponding to a webpage currently displayed by the browser 111, and the second script is used for collecting data of a predetermined type.
Optionally, the browser 111 and the management client 112 may be independent applications, or may be different functions in the same application, which is not limited in this embodiment.
Optionally, a precision system runs in the server 120 and provides background precision services, which include but are not limited to: determining whether to issue the first script according to the address identifier in the first file address request; issuing the first script according to the first file acquisition request; issuing the second script according to the interval between the last issuance time and the time at which the second file address request is received; and/or determining whether to issue the second script according to the environment parameter in the second file address request.
The environment parameter is used by the server 120 to determine whether the browser 111 supports running the second script.
Optionally, the environment parameters include: a version number of the browser 111, and/or a type of the browser 111.
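A minimal sketch of how a server might use the environment parameters to decide whether the browser supports running the second script is given below. The browser type names, the minimum versions and the dotted version format are assumptions; the source states only that the parameters include the browser's version number and/or type.

```python
from typing import Dict, Tuple

# Hypothetical minimum supported version for each browser type.
MIN_VERSION: Dict[str, Tuple[int, ...]] = {
    "browser_x": (10, 0),
    "browser_y": (4, 2),
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version number such as '10.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def supports_second_script(browser_type: str, version: str) -> bool:
    """Decide from the environment parameters whether to issue the second script."""
    minimum = MIN_VERSION.get(browser_type)
    if minimum is None:
        # Unknown browser type: assume the second script is not supported.
        return False
    return parse_version(version) >= minimum
```

For instance, `supports_second_script("browser_x", "10.1")` is true under these assumed thresholds, while an unlisted browser type is rejected.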
Optionally, the server 120 may be a server cluster or a single server, which is not limited in this embodiment.
Optionally, this embodiment is described using the example of the server 120 sending the first script and the second script to one terminal 110; in actual implementation there is at least one terminal 110, which is not limited in this embodiment.
Optionally, the wireless or wired networks described above use standard communication techniques and/or protocols. The Network is typically the Internet, but can be any Network including, but not limited to, a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired or wireless Network, a private Network, or any combination of virtual private networks. In some embodiments, data exchanged over a network is represented using techniques and/or formats including HyperText Mark-up Language (HTML), Extensible Mark-up Language (XML), and so forth. All or some of the links may also be encrypted using conventional encryption techniques such as Secure Socket Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), Internet Protocol Security (IPsec). In other embodiments, custom and/or dedicated data communication techniques may also be used in place of, or in addition to, the data communication techniques described above.
Referring to fig. 2A, a flow chart of a data collection method according to an embodiment of the invention is shown. The data acquisition method is applied to the data acquisition system shown in fig. 1, and the method can comprise the following steps:
Step 201, the server sends a first script to the terminal.
The first script is requested by the browser in the terminal when the browser runs the entry tag that the management client added to the webpage in the process of displaying the webpage.
In step 202, the browser obtains a first script according to the entry tag.
The entry tag is added to the webpage by the management client in the process of the browser displaying the webpage, and is used for triggering the browser to obtain the first script from the server.
In this embodiment, the browser displays a web page according to the received browsing operation.
In one approach, the browser displaying the web page refers to: the terminal receives browsing operation input by a user through a man-machine interaction interface; and starting a browser according to the browsing operation, and displaying the webpage indicated by the browsing operation through the browser.
In another way, the browser displaying the web page means: the terminal receives a starting operation input by a user through a man-machine interaction interface; starting a browser according to the starting operation; and receiving browsing operation input by a user through the man-machine interaction interface, and displaying a webpage indicated by the browsing operation through the browser.
Optionally, the browsing operation and the starting operation are at least one of a clicking operation, a long-press operation, a sliding operation, a character input operation, and a voice input operation.
Optionally, the browse operation and the launch operation are the same or different.
The entry tag is used to add an external script (the first script) to the page code of the web page. Illustratively, the entry tag is a script tag.
Optionally, the management client adds the entry tag at the bottom of the page code of the web page.
In this embodiment, the entry tag includes a tag attribute, which indicates the address of the external script (the file address of the first script). Illustratively, the tag attribute is src.
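The addition of the entry tag can be sketched as a string operation on the page code. The function name `add_entry_tag`, the example address and the choice to treat "the bottom of the page code" as the position just before the closing body tag are assumptions; the source does not say how the management client edits the page.

```python
from html import escape


def add_entry_tag(page_code: str, script_address: str) -> str:
    """Insert a script tag whose src attribute is the first script's file address."""
    tag = '<script src="{}"></script>'.format(escape(script_address, quote=True))
    marker = "</body>"
    index = page_code.lower().rfind(marker)
    if index == -1:
        # No closing body tag found: append the entry tag at the very end.
        return page_code + tag
    return page_code[:index] + tag + page_code[index:]


page = "<html><body><p>hi</p></body></html>"
# Hypothetical file address returned by the server for the first script.
modified = add_entry_tag(page, "https://server.example/main.js")
```

When the browser executes the modified page code, the script tag causes it to request the file named in `src`, which corresponds to the browser obtaining the first script according to the entry tag.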
Step 203, the server sends the second script to the terminal.
The second script is requested by the browser when running the first script.
Step 204, when the browser runs the first script, acquiring a second script corresponding to the webpage from the server.
Step 205, when the browser runs the second script, acquiring data of the webpage.
The first script is used for providing a running environment of the second script, the terminal runs the first script through the browser, runs the second script through running the first script, and collects data of corresponding types in the webpage through the second script.
Optionally, after the browser collects the data in the webpage, the browser sends the data to the server.
In step 206, the server obtains the data of the web page collected by the second script.
Optionally, the server acquires data of the webpage from the browser every preset time length; or the server receives the data reported after the browser acquires the data in the webpage.
The specific value of the preset duration is not limited in this embodiment; for example, the preset duration is 10 min.
Optionally, after receiving the data of the web page collected by the second script, the server stores the data in a classified manner, and when the terminal requests the data, the terminal displays the data in a classified manner.
In one example, the server stores together the data of the items with coupons in the shopping page, and when the terminal requests the data of the items with coupons, the server sends these data to the terminal, which displays the data of the items with coupons in a coupon back page 21, see FIG. 2B. The terminal may be a terminal that reports data, or may be another terminal, which is not limited in this embodiment.
In yet another example, the server stores together the data of items whose price is lower than a preset price, and when the terminal requests the data of items whose price is lower than the preset price, the server transmits the data to the terminal. Referring to FIG. 2C, the terminal displays the data of these items in the "Chinese cabbage price" page 22 (an idiom for bargain prices). The terminal may be a terminal that reports data, or may be another terminal, which is not limited in this embodiment.
Optionally, the server records a change of a group of data, and sends the change of the group of data to the terminal when the terminal requests the group of data, and the terminal displays the change.
Optionally, the set of data includes price data of the item, time of acquiring the data, remaining quantity of the item, and the like.
In one example, the server stores the change of the price of the item, and when the terminal requests the data of the item, the server transmits the change of the price of the item to the terminal, and referring to fig. 2D, the terminal displays the change 23 of the price of the item. The terminal may be a terminal that reports data, or may be another terminal, which is not limited in this embodiment.
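One way a server could record the change in an item's price is sketched below. The class `PriceHistory`, the item identifier and the rule of storing a new entry only when the price differs from the last one are assumptions; the source says only that the server records the change of a group of data and sends it on request.

```python
from typing import Dict, List, Tuple


class PriceHistory:
    """Toy store of (collection time, price) pairs for each item."""

    def __init__(self) -> None:
        self._history: Dict[str, List[Tuple[float, float]]] = {}

    def record(self, item_id: str, collected_at: float, price: float) -> None:
        """Store a new entry only when the price differs from the last one."""
        entries = self._history.setdefault(item_id, [])
        if not entries or entries[-1][1] != price:
            entries.append((collected_at, price))

    def changes(self, item_id: str) -> List[Tuple[float, float]]:
        """Return the recorded changes, as sent to a terminal that requests them."""
        return list(self._history.get(item_id, []))


history = PriceHistory()
history.record("item-1", 1.0, 99.0)   # hypothetical item and values
history.record("item-1", 2.0, 99.0)   # unchanged price, not stored again
history.record("item-1", 3.0, 89.0)
```

A terminal requesting `history.changes("item-1")` would receive the two stored points and could plot them as the price change shown in the figure.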
In summary, in the data acquisition method provided in this embodiment, while the browser displays a webpage, the management client adds an entry tag to the webpage, which triggers the browser to obtain the first script and the second script, so that the management client can trigger the browser to collect data from the webpage through the second script according to the behavior of a user accessing the webpage. This solves the problem that, when a terminal collects data through a web crawler, many IP addresses must be configured for the terminal to avoid a single IP address being banned, which makes data collection complex. Because users' webpage access behavior is usually irregular, and each user does not visit the same website very many times in one day, a website usually does not ban the IP address of the terminal used by a user. In addition, even if the website bans the IP address of the terminal used by one user, the data in the webpage can still be collected through other users' access to the webpage, so data can be collected without setting a plurality of IP addresses in the same terminal, which reduces the complexity of data collection by the terminal.
Optionally, steps 202, 204, and 205 may be implemented separately as a terminal-side method embodiment, and steps 201, 203, and 206 may be implemented separately as a server-side method embodiment, which is not limited in this embodiment.
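The division of steps 201 to 206 between the terminal side and the server side can be sketched with a small simulation. Everything here is illustrative: the scripts are modelled as Python callables rather than js files, and the class, function and page identifier names are hypothetical.

```python
from typing import Callable, Dict, List

PageData = List[Dict[str, float]]
SecondScript = Callable[[PageData], PageData]


class Server:
    """Server side: steps 201, 203 and 206."""

    def __init__(self, first_script: Callable, second_scripts: Dict[str, SecondScript]) -> None:
        self.first_script = first_script
        self.second_scripts = second_scripts   # web page identifier -> second script
        self.received: List[PageData] = []

    def send_first_script(self) -> Callable:                    # step 201
        return self.first_script

    def send_second_script(self, page_id: str) -> SecondScript:  # step 203
        return self.second_scripts[page_id]

    def receive_data(self, data: PageData) -> None:             # step 206
        self.received.append(data)


def first_script(server: Server, page_id: str, page_data: PageData) -> PageData:
    """Running environment: obtains and runs the second script, then reports."""
    second_script = server.send_second_script(page_id)   # step 204
    collected = second_script(page_data)                  # step 205
    server.receive_data(collected)
    return collected


def browser_displays_page(server: Server, page_id: str, page_data: PageData) -> PageData:
    """Terminal side: the entry tag makes the browser obtain the first script."""
    runner = server.send_first_script()                   # step 202
    return runner(server, page_id, page_data)


def low_price(items: PageData) -> PageData:
    """Hypothetical second script: items priced below 50."""
    return [item for item in items if item["price"] < 50]


demo_server = Server(first_script, {"shop.example/sale": low_price})
result = browser_displays_page(demo_server, "shop.example/sale",
                               [{"price": 20}, {"price": 80}])
```

The collection runs only because a page was displayed, which reflects the idea that data is gathered as a side effect of ordinary user browsing rather than by a scheduled crawler.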
The following describes, in turn, the manner in which the management client adds the entry tag in step 202, the manner in which the browser obtains the first script, and the manner in which the browser obtains the second script in step 204.
First, the manner in which the management client adds the entry tag.
In this embodiment, in the process of displaying a web page by a browser, adding an entry tag in the web page by a management client includes the following steps:
1) The management client monitors the browser.
The terminal monitors the browser through the management client in either of the following ways:
In one way, the management client monitors in real time whether the browser needs to display a web page by monitoring whether the browser sends a webpage acquisition request. When the management client detects that the browser has sent a webpage acquisition request, it determines that the browser needs to display the web page. At this time, the management client obtains the web page identifier from the webpage acquisition request.
In another way, the management client monitors in real time whether the browser needs to display a web page by monitoring whether the browser receives a webpage data packet. When the management client detects that the browser has received a webpage data packet, it determines that the browser needs to display the web page. At this time, the management client obtains the identifier of the web page displayed in the browser through a preset interface.
2) When the management client monitors that the browser displays the webpage, the management client sends a first file address request to the server.
The first file address request is used for requesting the server to issue a file address of the first script.
In this embodiment, the management client in the terminal monitors the browser, and when it detects that the browser displays a webpage, the management client sends a first file address request to the server.
Optionally, when the browser loads the web page to be displayed, the browser may also automatically send the first file address request to the server.
In this embodiment, the management client determines when the browser collects data according to the browsing operations of the user accessing web pages, so the terminal collects the data of a web page only when the user actually accesses that page, that is, when the user needs the page's data. Compared with having the terminal directly collect data from web pages designated by the server, this saves the resources the terminal consumes when collecting data and the storage space the server occupies when storing the data.
Optionally, the first file address request includes at least one of an address identifier and a web page identifier.
The address identifier is used by the server to determine whether to send the first script to the management client. The address identifier may reflect the bandwidth provided by the network currently used by the terminal.
The web page identifier is used by the server to determine the second script corresponding to the web page displayed by the browser.
Optionally, when the server does not need to determine the second script corresponding to the web page, for example when all web pages correspond to the same second script, the first file address request may omit the web page identifier.
3) The server receives a first file address request sent by a management client when the browser displays a webpage.
4) The server determines whether to send the first script to the terminal according to the address identifier.
Optionally, the server pre-stores a blacklist that includes the address identifier of at least one area with a small bandwidth, that is, an area whose provided bandwidth is lower than a preset bandwidth. If the address identifier in the first file address request received by the server belongs to the blacklist, the server determines not to send the first script, and the process ends. If the address identifier does not belong to the blacklist, the server determines to send the first script and performs step 5).
For a terminal accessing a web page in a remote area, the bandwidth provided by the network currently used by the terminal may be small, and the browser already occupies a large share of that bandwidth to obtain the data of the web page while displaying it. If the browser also obtained the first script and the second script while displaying the web page, the display of the web page might stutter. Therefore, in this embodiment, the server determines whether to send the first script according to the address identifier, and does not send the first script to the terminal when the bandwidth provided by the area corresponding to the address identifier is lower than the preset bandwidth, which keeps the browser's display of the web page smooth.
Referring to fig. 3, after receiving a first file address request 301, a server obtains an address identifier 302 in the first file address request 301, and detects whether the address identifier 302 exists in a pre-stored blacklist 303; the server determines to send the first script when detecting that the address identifier 302 does not exist.
5) When the server determines, according to the address identifier, to send the first script to the terminal, the server sends the file address of the first script to the management client.
Optionally, the file address of the first script is used to indicate at least one of a server address, a port number, and a storage location of the first script in the server.
6) The management client receives the file address of the first script, which the server sends when it determines to send the first script.
7) The management client adds an entry tag to the web page according to the file address of the first script.
The management client adds the entry tag to the page code of the web page, where the entry tag includes a tag attribute, and sets the value of the tag attribute to the file address of the first script.
Schematically, this step is represented by the following code: src-123.456.567.8008.
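Step 7) can be illustrated with a minimal Python sketch. It assumes that the entry tag is a `<script>` element whose `src` attribute holds the file address and that the tag is placed just before the closing `</body>` tag of the page code. The function name `add_entry_tag` and the address in the usage note are hypothetical and do not come from the embodiment.

```python
import html


def add_entry_tag(page_code: str, first_script_address: str) -> str:
    """Insert an entry tag whose src attribute is the first script's file address.

    Assumption: the entry tag is a <script> element placed at the bottom of the
    page code, just before </body> when that closing tag is present.
    """
    # Escape the address so it is safe inside a double-quoted attribute value.
    src = html.escape(first_script_address, quote=True)
    entry_tag = '<script src="{}"></script>'.format(src)

    # Look for the last closing body tag, ignoring letter case.
    index = page_code.lower().rfind("</body>")
    if index == -1:
        # No closing body tag: append the entry tag at the end of the page code.
        return page_code + entry_tag
    return page_code[:index] + entry_tag + page_code[index:]
```

For example, `add_entry_tag("<html><body><p>hi</p></body></html>", "http://collector.example:8008/first.js")` places the tag after the paragraph and before `</body>`, so a browser running the page code from top to bottom reaches the entry tag last.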
In summary, in the data acquisition method provided in this embodiment, the server determines whether to send the first script according to the address identifier in the first file address request, where the address identifier is used to indicate the bandwidth of the corresponding region, so that the server does not send the first script to the terminal when determining that the terminal runs in the region with the smaller bandwidth, thereby ensuring the smoothness of the webpage displayed by the terminal.
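The server-side decision in steps 3) to 5) can be sketched as follows. This is an illustration under stated assumptions rather than the embodiment's implementation: the blacklist entries, the file address, and the names `FirstFileAddressRequest` and `handle_first_file_address_request` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical address identifiers of areas whose bandwidth is below the preset bandwidth.
LOW_BANDWIDTH_BLACKLIST = frozenset({"region-remote-1", "region-remote-2"})

# Hypothetical file address of the first script.
FIRST_SCRIPT_ADDRESS = "http://collector.example:8008/scripts/first.js"


@dataclass(frozen=True)
class FirstFileAddressRequest:
    """Assumed shape of the first file address request."""
    address_id: str
    # May be omitted when all web pages correspond to the same second script.
    web_page_id: Optional[str] = None


def handle_first_file_address_request(request, blacklist=LOW_BANDWIDTH_BLACKLIST):
    """Return the first script's file address, or None when the process ends."""
    if request.address_id in blacklist:
        # The terminal is in a low-bandwidth area: do not send the first script.
        return None
    # Otherwise send the file address of the first script to the management client.
    return FIRST_SCRIPT_ADDRESS
```

Returning `None` here stands for "the process ends"; a real server would more likely send an explicit response, which the embodiment does not specify.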
Optionally, after receiving the first file address request, the server may further determine a second script corresponding to the web page according to the web page identifier of the web page in the first file address request.
Optionally, the server pre-stores a correspondence between web page identifiers and second scripts. The server determines the corresponding second script according to the pre-stored correspondence and the web page identifier in the first file address request.
Optionally, each web page identifier corresponds to at least one second script, and each second script collects data of a corresponding type.
Referring to fig. 3, after receiving the first file address request 301, the server obtains the web page identifier 304 in the first file address request 301, and determines the second script corresponding to the web page identifier 304 from the corresponding relationship 305.
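The correspondence lookup described above can be sketched as a dictionary from web page identifier to a list of second scripts. The web page identifiers, the script names, and the function `second_scripts_for` are hypothetical; the embodiment does not state how the correspondence is stored.

```python
from typing import Dict, List

# Hypothetical correspondence: each web page identifier maps to at least one
# second script, and each second script collects one type of data.
CORRESPONDENCE: Dict[str, List[str]] = {
    "https://shop.example/item": ["collect_price.js", "collect_reviews.js"],
    "https://news.example/article": ["collect_headline.js"],
}


def second_scripts_for(web_page_id: str, correspondence=CORRESPONDENCE) -> List[str]:
    """Return the second scripts for a web page identifier (empty if none is stored)."""
    # Copy the list so callers cannot modify the stored correspondence.
    return list(correspondence.get(web_page_id, []))
```

Returning an empty list for an unknown identifier is a design choice made for this sketch; the embodiment does not describe that case.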
Second, the manner in which the browser obtains the first script.
In this embodiment, the browser obtains the first script according to the entry tag through the following steps:
1) When the browser runs the file address of the first script in the entry tag, the browser sends a first file acquisition request to the server.
The first file fetch request includes a file address of the first script.
The browser runs the page code of the web page line by line. When it reaches the tag attribute at the bottom of the page code, it sends the first file acquisition request to the server according to the value of the tag attribute, that is, the file address of the first script.
Optionally, sending the first file acquisition request according to the file address of the first script means that the browser carries the file address of the first script in the first file acquisition request and sends the request to the server.
2) The server receives a first file acquisition request sent by the browser when the browser runs the entrance tag.
The server looks up the first script according to the file address of the first script in the first file acquisition request.
3) The server sends the first script to the terminal according to the first file acquisition request.
4) The browser receives a first script sent by the server according to the first file acquisition request.
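The lookup in step 2) can be illustrated by splitting a file address into the parts the embodiment says it may indicate: server address, port number, and storage location. The sketch assumes the file address is written as a URL, which the embodiment does not state; `FileAddress`, `parse_file_address`, `find_script`, and the sample address are hypothetical names.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import urlsplit


class FileAddress(NamedTuple):
    server_address: Optional[str]
    port: Optional[int]
    storage_location: str


def parse_file_address(file_address: str) -> FileAddress:
    """Split a URL-style file address into server address, port, and storage location."""
    parts = urlsplit(file_address)
    return FileAddress(parts.hostname, parts.port, parts.path)


def find_script(file_address: str, stored_scripts: Dict[str, str]) -> Optional[str]:
    """Look up a stored script by the storage location carried in the request."""
    location = parse_file_address(file_address).storage_location
    return stored_scripts.get(location)
```

With `stored_scripts = {"/scripts/first.js": "..."}`, calling `find_script("http://collector.example:8008/scripts/first.js", stored_scripts)` returns the stored script body, and an unknown storage location yields `None`.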
Third, the manner in which the browser obtains the second script.
In this embodiment, when the browser runs the first script, it obtains the second script corresponding to the web page from the server through the following steps.
1) The browser runs the first script and sends a second file address request to the server.
The second file address request is used for requesting the server to issue a file address of the second script.
2) The server receives the second file address request sent by the browser when the browser runs the first script.
3) The server sends the file address of the second script to the terminal according to the second file address request.
Optionally, the file address of the second script is used to indicate at least one of a server address, a port number, and a storage location of the second script in the server.
4) The browser receives the file address of the second script through the first script.
The terminal receives the file address of the second script, and the first script assigns the received file address to a tag attribute in the first script.
Optionally, the tag attribute in the first script is the tag attribute of a second script tag that the browser adds to the first script; alternatively, it is the default tag attribute of a second script tag already present in the first script.
The representation manner of the tag attribute in the second script tag may be the same as the representation manner of the tag attribute in the entry tag, or may be different from the representation manner of the tag attribute in the entry tag, which is not limited in this embodiment. Illustratively, the tag attribute in the second script tag is represented by src.
5) When the browser runs the file address of the second script, the browser sends a second file acquisition request to the server.
The second file fetch request includes a file address of the second script.
The browser runs the code in the first script line by line. When it reaches the tag attribute in the second script tag, it sends the second file acquisition request to the server according to the value of the tag attribute, that is, the file address of the second script.
6) The server receives the second file acquisition request sent by the browser when the browser runs the file address of the second script.
7) The server sends the second script to the terminal according to the second file acquisition request.
8) The browser receives the second script sent by the server according to the second file acquisition request.
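The eight steps above form two request and response round trips: one for the file address of the second script and one for the second script itself. The following Python sketch simulates that exchange with an in-memory stand-in for the server. The class `InMemoryServer`, the function `run_first_script`, and the sample address and script body are hypothetical; a dictionary with a `src` key stands in for the second script tag.

```python
from typing import Dict


class InMemoryServer:
    """Stand-in for the server; the address and script body are made up."""

    def __init__(self, second_script_address: str, scripts: Dict[str, str]):
        self._second_script_address = second_script_address
        self._scripts = dict(scripts)

    def handle_second_file_address_request(self) -> str:
        # Steps 2) and 3): receive the address request, return the file address.
        return self._second_script_address

    def handle_second_file_acquisition_request(self, file_address: str) -> str:
        # Steps 6) and 7): receive the acquisition request, return the second script.
        return self._scripts[file_address]


def run_first_script(server: InMemoryServer) -> str:
    """Simulate what the browser does while running the first script."""
    # Step 1): send the second file address request.
    file_address = server.handle_second_file_address_request()
    # Step 4): assign the received file address to the tag attribute.
    second_script_tag = {"src": file_address}
    # Steps 5) and 8): request the second script by the tag attribute's value.
    return server.handle_second_file_acquisition_request(second_script_tag["src"])
```

Splitting the address lookup from the file fetch gives the server a point, before any script body is transferred, at which it can decide not to issue the second script.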
Optionally, before step 3), when the browser runs the first script, it may also obtain an environment parameter of the browser, which the server uses to determine whether the browser supports running the second script. The browser adds the environment parameter to the second file address request; in this case, the second file acquisition request includes the file address of the second script and the environment parameter.
The first script comprises a code for acquiring the environment parameters, and the browser acquires the environment parameters of the browser by running the code.
In this case, after receiving the second file address request, the server also needs to determine whether the browser supports running the second script according to the environment parameter; and when the browser supports the operation of the second script, sending the file address of the second script to the terminal, namely executing the step 3).
Optionally, a blacklist of the browser is preset in the server. The blacklist of the browsers comprises at least one version number of the browser and/or at least one type of the browser.
Optionally, the server detects whether the version number of the browser indicated by the environment parameter in the second file address request belongs to the blacklist of browsers. If it does not belong to the blacklist, the server determines that the browser supports running the second script and performs step 3); if it belongs to the blacklist, the server determines that the browser does not support running the second script, and the process ends.
If the version of the browser is too low, or the browser is of a type that does not allow the second script to run, the browser in the terminal cannot run the second script even if the server sends it, and the transmission resources the server spends sending the second script are wasted.
In this embodiment, the terminal carries the environment parameter in the second file address request, so that the server can determine from the environment parameter whether the browser supports running the second script. The server sends the second script to the terminal only when it determines that the browser supports running it, which ensures that the browser can run the second script normally after receiving it and saves the bandwidth the server would otherwise spend sending a second script that cannot be run.
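The browser blacklist check can be sketched as follows. The embodiment only says that the blacklist contains at least one browser version number and/or at least one browser type; pairing each blocked version with a browser type, and all names and values below, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentParams:
    """Assumed shape of the environment parameter collected by the first script."""
    browser_type: str
    version: str


# Hypothetical blacklist entries.
BLOCKED_TYPES = frozenset({"legacy-browser"})
BLOCKED_VERSIONS = frozenset({("example-browser", "8.0")})


def supports_second_script(env: EnvironmentParams,
                           blocked_types=BLOCKED_TYPES,
                           blocked_versions=BLOCKED_VERSIONS) -> bool:
    """Return True when the browser is not on the blacklist of browsers."""
    if env.browser_type in blocked_types:
        # This browser type does not allow the second script to run.
        return False
    if (env.browser_type, env.version) in blocked_versions:
        # This browser version is too old to run the second script.
        return False
    return True
```

The server would perform step 3) only when `supports_second_script` returns `True` and end the process otherwise.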
Optionally, before step 3), the server may further detect whether a second script is issued within a preset time length; and if the second script is not issued within the preset time length, sending the file address of the second script to the terminal.
The server indirectly detects the freshness of the data acquired by the second script by detecting whether the second script is issued within a preset time length. The preset time period is not limited in the present embodiment, and is illustratively 6 hours.
When the server detects that the second script is issued within the preset time, it indicates that the data collected by the second script and reported by the terminal exist within the preset time, at this time, the freshness of the data is high, and the server does not need to acquire the data again, so that transmission resources are saved, that is, the server does not need to issue the second script to the terminal, and the process is ended.
When the server detects that the second script is not issued within the preset time, it indicates that the data collected by the second script and reported by the terminal does not exist within the preset time, at this time, the freshness of the data is low, and the server needs to acquire the data, so as to ensure the validity of the stored data, that is, the server needs to issue the second script to the terminal, and step 3) is executed.
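The freshness check can be sketched with a timestamp comparison. The 6-hour value follows the illustrative preset time length mentioned above. The function name, the use of `None` for "never issued", and the treatment of an issue exactly at the boundary as no longer within the preset time length are assumptions of this sketch.

```python
from datetime import datetime, timedelta
from typing import Optional

# Illustrative preset time length of 6 hours.
PRESET_DURATION = timedelta(hours=6)


def should_issue_second_script(last_issued_at: Optional[datetime],
                               now: datetime,
                               preset: timedelta = PRESET_DURATION) -> bool:
    """Return True when the second script was not issued within the preset time length."""
    if last_issued_at is None:
        # Never issued: no collected data exists, so the data must be acquired.
        return True
    # Issued within the preset time length: the stored data is still fresh.
    return now - last_issued_at >= preset
```

Passing `now` as an argument instead of reading the clock inside the function keeps the check deterministic and easy to test.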
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
Referring to fig. 4, a block diagram of a data acquisition apparatus according to an embodiment of the present invention is shown. The apparatus has a function of executing the above method example at the terminal side, and the function may be implemented by hardware, or may be implemented by hardware executing corresponding software. The apparatus may include: a first acquisition module 410, a second acquisition module 420, and a data acquisition module 430.
A first obtaining module 410, configured to obtain a first script according to the entry tag, where the entry tag is a tag added to the web page in a process of displaying the web page, and the entry tag is used to trigger obtaining of the first script from the server;
a second obtaining module 420, configured to obtain, from the server, the second script corresponding to the web page when the first script is executed, where the first script is a script for providing an execution environment of the second script, and the second script is a script for acquiring data of the web page;
and a data acquisition module 430, configured to acquire data of the web page when the second script is executed.
Optionally, the apparatus further comprises an adding module.
The adding module is used for adding the entrance tag to the webpage in the process of displaying the webpage.
Optionally, the adding module includes: a monitoring unit, a first sending unit, a first receiving unit, and an adding unit.
The monitoring unit is used for monitoring the browser;
the first sending unit is used for sending a first file address request to the server when the browser is monitored to display the webpage, wherein the first file address request comprises an address identifier; the first file address request is used for requesting the server to issue a file address of the first script, and the address identifier is used for the server to determine whether to send the first script to the management client;
the first receiving unit is used for receiving a file address of the first script sent by the server when the server determines to send the first script;
and the adding unit is used for adding the entry tag in the webpage according to the file address of the first script.
Optionally, the adding unit is further configured to:
adding the entry tag in the page code of the webpage, wherein the entry tag comprises a tag attribute; setting the value of the label attribute as the file address of the first script;
the first obtaining module is further configured to:
when the file address of the first script in the entry tag is run, sending a first file acquisition request to the server, wherein the first file acquisition request comprises the file address of the first script; and receiving the first script sent by the server according to the first file acquisition request.
Optionally, the first sending unit is further configured to:
when the webpage is monitored to be displayed, acquiring a webpage identifier of the webpage;
carrying the webpage identifier in the first file address request, and sending the first file address request to the server, wherein the first file address request comprises the address identifier and the webpage identifier; the webpage identifier is used by the server to determine a second script corresponding to the webpage.
Optionally, the second obtaining module includes:
the second sending unit is used for running the first script and sending a second file address request to the server;
a second receiving unit, configured to receive, through the first script, a file address of the second script;
a third sending unit, configured to send the second file obtaining request to the server when a file address of the second script is run, where the second file obtaining request includes the file address of the second script;
And the third receiving unit is used for receiving the second script sent by the server according to the second file acquisition request.
Optionally, the second sending unit is further configured to:
when the first script is operated, obtaining an environment parameter, wherein the environment parameter is used for the server to determine whether the device supports the operation of the second script;
adding the environment parameter to the second file address request, and sending the second file address request to the server; the second file acquisition request comprises the file address of the second script and the environment parameter.
Reference may be made to the above-described method embodiments for relevant details.
Referring to fig. 5, a block diagram of a data acquisition apparatus according to an embodiment of the invention is shown. The device has the function of executing the method example of the server side, and the function can be realized by hardware or by hardware executing corresponding software. The apparatus may include: a first sending module 510, a second sending module 520, and an obtaining module 530.
A first sending module 510, configured to send a first script to a terminal, where the first script is requested by a browser in the terminal when the browser runs an entry tag, and the entry tag is added to a web page by a management client while the browser displays the web page;
A second sending module 520, configured to send a second script to the terminal, where the second script is requested when the browser runs the first script, the first script is a script for providing a running environment of the second script, and the second script is a script for acquiring data of the web page;
an obtaining module 530, configured to obtain the data of the web page collected by the second script.
Optionally, the first sending module 510 includes:
a first receiving unit, configured to receive a first file address request sent by the management client when the browser displays the web page, where the first file address request includes an address identifier;
a first determining unit, configured to determine whether to send the first script to the terminal according to the address identifier;
the first sending unit is used for sending the file address of the first script to the management client when the first script is determined to be sent to the terminal according to the address identifier, and the management client adds an entry tag in the webpage according to the file address of the first script;
a second receiving unit, configured to receive the first file obtaining request sent when the browser runs the entry tag, where the first file obtaining request includes a file address of the first script;
And the second sending unit is used for sending the first script to the terminal according to the first file acquisition request.
Optionally, the apparatus further comprises a determining module.
The determining module is used for determining a second script corresponding to the webpage according to the webpage identifier of the webpage in the first file address request.
Optionally, the second sending module 520 includes:
a third receiving unit, configured to receive a second file address request sent by the browser when the first script is executed, where the second file address request includes an environment parameter of the browser;
a second determining unit, configured to determine whether the browser supports running the second script according to the environment parameter;
a third sending unit, configured to send a file address of the second script to the terminal when it is determined that the browser supports running the second script;
a fourth receiving unit, configured to receive the second file acquisition request sent when the browser runs the file address of the second script;
and the fourth sending unit is used for sending the second script to the terminal according to the second file acquisition request.
Optionally, the apparatus further comprises: a detection module and a third sending module.
The detection module is used for detecting whether the second script is issued within a preset time length;
and the third sending module is used for sending the file address of the second script to the terminal when the second script is not sent within the preset time length.
It should be noted that the data acquisition device provided in the above embodiments is illustrated only by the division into the functional modules described above. In practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the data acquisition device and the data acquisition method provided by the above embodiments belong to the same concept; the specific implementation processes are described in detail in the method embodiments and are not repeated here.
Embodiments of the present invention also provide a computer-readable storage medium, which may be a computer-readable storage medium contained in a memory; or it may be a separate computer-readable storage medium not incorporated into the terminal. The computer readable storage medium stores at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method for acquiring data at a terminal side provided by the above-mentioned various method embodiments.
Embodiments of the present invention also provide a computer-readable storage medium, which may be a computer-readable storage medium contained in a memory; or it may be a computer readable storage medium that exists separately and is not assembled into a server. The computer readable storage medium stores at least one instruction, at least one program, a set of codes, or a set of instructions that is loaded and executed by a processor to implement the server-side data collection method provided by the various method embodiments described above.
Fig. 6 is a schematic diagram illustrating a structure of a terminal 600 according to an embodiment of the present invention, where the terminal may include Radio Frequency (RF) circuits 601, a memory 602 including one or more computer-readable storage media, an input unit 603, a display unit 604, a sensor 605, an audio circuit 606, a Wireless Fidelity (WiFi) module 607, a processor 608 including one or more processing cores, and a power supply 609. Those skilled in the art will appreciate that the terminal structure shown in fig. 6 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:
The RF circuit 601 may be used for receiving and transmitting signals during a message transmission or communication process, and in particular, for receiving downlink messages from a base station and then processing the received downlink messages by one or more processors 608; in addition, data relating to uplink is transmitted to the base station. In general, the RF circuit 601 includes, but is not limited to, an antenna, at least one Amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 601 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), etc.
The memory 602 may be used to store software programs and modules, and the processor 608 executes various functional applications and data processing by operating the software programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the terminal, etc. Further, the memory 602 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 608 and the input unit 603 access to the memory 602.
The input unit 603 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, in one particular embodiment, input unit 603 may include a touch-sensitive surface as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or a touch pad, may collect touch operations by a user (e.g., operations by a user on or near the touch-sensitive surface using a finger, a stylus, or any other suitable object or attachment) thereon or nearby, and drive the corresponding connection device according to a predetermined program. Alternatively, the touch sensitive surface may comprise two parts, a touch detection means and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, and sends the touch point coordinates to the processor 608, and can receive and execute commands sent by the processor 608. In addition, touch sensitive surfaces may be implemented using various types of resistive, capacitive, infrared, and surface acoustic waves. The input unit 603 may include other input devices in addition to the touch-sensitive surface. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 604 may be used to display information input by or provided to the user and various graphical user interfaces of the terminal, which may be made up of graphics, text, icons, video, and any combination thereof. The Display unit 604 may include a Display panel, and optionally, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch-sensitive surface may overlay the display panel, and when a touch operation is detected on or near the touch-sensitive surface, the touch operation may be transmitted to the processor 608 to determine the type of touch event, and the processor 608 may then provide a corresponding visual output on the display panel based on the type of touch event. Although in FIG. 6 the touch-sensitive surface and the display panel are two separate components to implement input and output functions, in some embodiments the touch-sensitive surface may be integrated with the display panel to implement input and output functions.
The terminal may also include at least one sensor 605, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel according to the brightness of ambient light, and a proximity sensor that may turn off the display panel and/or the backlight when the terminal is moved to the ear. As one of the motion sensors, the gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when the mobile phone is stationary, and can be used for applications of recognizing the posture of the mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured in the terminal, detailed description is omitted here.
The audio circuit 606, a speaker, and a microphone may provide an audio interface between the user and the terminal. The audio circuit 606 may transmit the electrical signal converted from received audio data to the speaker, which converts it into a sound signal for output; conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 606 receives and converts into audio data. The audio data is output to the processor 608 for processing and is then transmitted to, for example, another terminal via the RF circuit 601, or is output to the memory 602 for further processing. The audio circuit 606 may also include an earphone jack to allow a peripheral headset to communicate with the terminal.
WiFi belongs to short-distance wireless transmission technology, and the terminal can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 607, and provides wireless broadband internet access for the user. Although fig. 6 shows the WiFi module 607, it is understood that it does not belong to the essential constitution of the terminal, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The processor 608 is the control center of the terminal. It connects the various parts of the entire terminal using various interfaces and lines, and performs the various functions of the terminal and processes data by running or executing software programs and/or modules stored in the memory 602 and calling data stored in the memory 602, thereby monitoring the terminal as a whole. Optionally, the processor 608 may include one or more processing cores; preferably, the processor 608 may integrate an application processor, which primarily handles the operating system, user interfaces, applications, and the like, and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor may also not be integrated into the processor 608.
The terminal also includes a power supply 609 (e.g., a battery) for powering the various components, which may preferably be logically connected to the processor 608 via a power management system that may be used to manage charging, discharging, and power consumption. The power supply 609 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
Although not shown, the terminal may further include a camera, a bluetooth module, and the like, which will not be described herein.
In particular, in the embodiment of the present invention, the terminal 600 further includes a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors. The one or more programs include instructions for performing the terminal-side data collection method.
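As a minimal sketch of the terminal-side step in which the management client adds the entry tag to the page code, the following Python example assumes that the entry tag is an HTML script element and that its tag attribute is `src`; the text above only speaks of an "entry tag" and a "tag attribute". The function names, the placement before the closing body tag, and the example address are hypothetical and are not taken from the embodiment.

```python
import re
from html import escape


def build_entry_tag(first_script_address: str) -> str:
    """Return an entry tag whose tag attribute holds the first script's file address.

    Assumption: the entry tag is an HTML <script> element and the tag
    attribute is `src`.
    """
    return f'<script src="{escape(first_script_address, quote=True)}"></script>'


def add_entry_tag(page_code: str, first_script_address: str) -> str:
    """Add the entry tag to the page code, just before </body> when present."""
    tag = build_entry_tag(first_script_address)
    match = re.search(r"</body\s*>", page_code, flags=re.IGNORECASE)
    if match is None:
        # No closing body tag: append the entry tag at the end of the page code.
        return page_code + tag
    return page_code[:match.start()] + tag + page_code[match.start():]


# Hypothetical page code and file address, for illustration only.
page = "<html><body><p>hello</p></body></html>"
updated = add_entry_tag(page, "https://example.invalid/first.js")
```

When the browser later processes the added tag, it would request the first script from the address held in the attribute, which corresponds to the first file acquisition request.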
Fig. 7 is a schematic structural diagram of a server according to an embodiment of the present invention. The server 700 includes a Central Processing Unit (CPU) 701, a system memory 704 including a Random Access Memory (RAM) 702 and a read-only memory (ROM) 703, and a system bus 705 connecting the system memory 704 and the CPU 701. The server 700 also includes a basic input/output system (I/O system) 706, which facilitates transfer of information between devices within the computer, and a mass storage device 707 for storing an operating system 713, application programs 714, and other program modules 715.
The basic input/output system 706 includes a display 708 for displaying information and an input device 709, such as a mouse or keyboard, for a user to input information. The display 708 and the input device 709 are both connected to the central processing unit 701 through an input/output controller 710 coupled to the system bus 705. The basic input/output system 706 may also include the input/output controller 710 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 710 may provide output to a display screen, a printer, or another type of output device.
The mass storage device 707 is connected to the central processing unit 701 through a mass storage controller (not shown) connected to the system bus 705. The mass storage device 707 and its associated computer-readable media provide non-volatile storage for the server 700. That is, the mass storage device 707 may include a computer-readable medium (not shown) such as a hard disk or a Compact Disc Read-Only Memory (CD-ROM) drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media include RAM, ROM, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other solid state memory technology, CD-ROM, Digital Versatile Disks (DVD), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage media is not limited to the foregoing. The system memory 704 and mass storage device 707 described above may be collectively referred to as memory.
According to various embodiments of the invention, the server 700 may also operate by connecting, through a network such as the Internet, to a remote computer on that network. That is, the server 700 may be connected to the network 712 through a network interface unit 711 connected to the system bus 705, or the network interface unit 711 may be used to connect to other types of networks or remote computer systems (not shown).
Specifically, in the embodiment of the present invention, the server 700 further includes a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors. The one or more programs include instructions for performing the server-side data collection method.
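The server-side decision of whether to issue the second script can be sketched as follows, under stated assumptions. The document says only that the server checks the environment parameter and detects whether the second script was issued within a preset time length; keying that check by a (terminal, webpage identifier) pair, representing environment parameters as named boolean features, and every class, method, and field name below are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass
class SecondScriptIssuer:
    preset_seconds: float                      # the preset time length
    second_script_addresses: Dict[str, str]    # webpage identifier -> file address
    required_features: FrozenSet[str] = frozenset()  # hypothetical support rule
    _last_issued: Dict[Tuple[str, str], float] = field(default_factory=dict, init=False)

    def supports_second_script(self, environment: Dict[str, bool]) -> bool:
        """Assumed rule: every required feature must be reported as available."""
        return all(environment.get(name, False) for name in self.required_features)

    def handle_second_file_address_request(
        self, terminal_id: str, webpage_id: str,
        environment: Dict[str, bool], now: float,
    ) -> Optional[str]:
        """Return the second script's file address, or None if it is withheld."""
        address = self.second_script_addresses.get(webpage_id)
        if address is None or not self.supports_second_script(environment):
            return None
        key = (terminal_id, webpage_id)
        last = self._last_issued.get(key)
        if last is not None and now - last < self.preset_seconds:
            # Already issued within the preset time length: do not issue again.
            return None
        self._last_issued[key] = now
        return address


# Hypothetical configuration and request, for illustration only.
issuer = SecondScriptIssuer(
    preset_seconds=60.0,
    second_script_addresses={"page-1": "https://example.invalid/second.js"},
    required_features=frozenset({"javascript"}),
)
first_reply = issuer.handle_second_file_address_request(
    "terminal-1", "page-1", {"javascript": True}, now=0.0
)
```

Passing the current time in explicitly keeps the sketch deterministic; a real server would presumably read a clock instead.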
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (13)

1. A data acquisition method, applied to a terminal on which a browser and a management client are installed, the method comprising:
in the process of displaying a webpage by the browser, adding, by the management client, an entry tag to a page code of the webpage, wherein the entry tag comprises a tag attribute;
setting, by the management client, the value of the tag attribute to a file address of a first script, wherein the file address of the first script is issued to the management client by a server;
when the browser runs the file address of the first script in the entry tag, sending a first file acquisition request to the server, wherein the first file acquisition request comprises the file address of the first script;
receiving, by the browser, the first script sent by the server according to the first file acquisition request;
when the browser runs the first script, obtaining a second script corresponding to the webpage from the server, wherein the second script is sent when the server determines that the second script has not been issued within a preset time length, the first script is a script for providing a running environment for the second script, and the second script is a script for acquiring data of the webpage;
and when the browser runs the second script, acquiring data of the webpage, wherein the acquired data comprises webpage content of the webpage.
2. The method of claim 1, wherein before the management client adds the entry tag to the page code of the webpage, the method further comprises:
the management client monitors the browser;
when the management client detects that the browser displays the webpage, the management client sends a first file address request to the server, wherein the first file address request comprises an address identifier; the first file address request is used for requesting the server to issue the file address of the first script, and the address identifier is used by the server to determine whether to send the first script to the management client;
and the management client receives the file address of the first script, which the server sends when it determines to issue the first script.
3. The method of claim 2, wherein the sending, by the management client, of the first file address request to the server when the management client detects that the browser displays the webpage comprises:
when the management client detects that the browser displays the webpage, the management client acquires a webpage identifier of the webpage;
the management client carries the webpage identifier in the first file address request and sends the first file address request to the server, wherein the first file address request comprises the address identifier and the webpage identifier; the webpage identifier is used by the server to determine the second script corresponding to the webpage.
4. The method according to any one of claims 1 to 3, wherein obtaining the second script corresponding to the webpage from the server when the browser runs the first script comprises:
the browser runs the first script and sends a second file address request to the server;
the browser receives the file address of the second script through the first script;
when the browser runs the file address of the second script, sending a second file acquisition request to the server, wherein the second file acquisition request comprises the file address of the second script;
and the browser receives the second script sent by the server according to the second file acquisition request.
5. The method of claim 4, wherein the browser running the first script and sending the second file address request to the server comprises:
when the browser runs the first script, obtaining an environment parameter of the browser, wherein the environment parameter is used by the server to determine whether the browser supports running the second script;
the browser adds the environment parameter to the second file address request and sends the second file address request to the server; the second file acquisition request comprises the file address of the second script and the environment parameter.
6. A method of data acquisition, the method comprising:
receiving a first file acquisition request sent by at least one terminal when a browser runs an entry tag, wherein the first file acquisition request comprises a file address of a first script, the entry tag is added to a page code of a webpage by a management client in the process of displaying the webpage by the browser, and the value of a tag attribute of the entry tag is set to the file address of the first script by the management client;
sending the first script to the terminal according to the first file acquisition request;
detecting whether a second script is issued within a preset time length;
if the second script is not issued within the preset time length, sending the second script to the terminal, wherein the second script is requested when the browser runs the first script, the first script is a script for providing a running environment of the second script, and the second script is a script for acquiring data of the webpage;
and acquiring the data of the webpage collected by the second script, wherein the collected data comprises the webpage content of the webpage.
7. The method according to claim 6, wherein before receiving the first file acquisition request sent by the at least one terminal when the browser runs the entry tag, the method further comprises:
receiving a first file address request sent by the management client when the browser displays the webpage, wherein the first file address request comprises an address identifier;
determining whether to send the first script to the terminal according to the address identifier;
and when the first script is determined to be sent to the terminal according to the address identifier, sending a file address of the first script to the management client.
8. The method according to claim 6 or 7, wherein the sending the second script to the terminal comprises:
receiving a second file address request sent by the browser when the browser runs the first script, wherein the second file address request comprises an environment parameter of the browser;
determining whether the browser supports running the second script according to the environment parameter;
when the browser supports running the second script, sending a file address of the second script to the terminal;
receiving a second file acquisition request sent by the browser when the browser runs the file address of the second script;
and sending the second script to the terminal according to the second file acquisition request.
9. A data acquisition device, applied to a terminal on which a management client is installed, the device comprising:
an adding unit, configured to add an entry tag to a page code of a webpage in the process of displaying the webpage by a browser, wherein the entry tag comprises a tag attribute; and to set the value of the tag attribute to a file address of a first script, wherein the file address of the first script is issued to the management client by a server;
a first obtaining module, configured to send a first file acquisition request to the server when the file address of the first script in the entry tag is run, wherein the first file acquisition request comprises the file address of the first script; and to receive the first script sent by the server according to the first file acquisition request;
a second obtaining module, configured to obtain a second script corresponding to the webpage from the server when the first script obtained by the first obtaining module is run, wherein the second script is sent when the server determines that the second script has not been issued within a preset time length, the first script is a script for providing a running environment for the second script, and the second script is a script for acquiring data of the webpage;
and a data acquisition module, configured to acquire data of the webpage when the second script obtained by the second obtaining module is run, wherein the acquired data comprises webpage content of the webpage.
10. A data acquisition device, the device comprising:
a second receiving unit, configured to receive a first file acquisition request sent by at least one terminal when a browser runs an entry tag, wherein the first file acquisition request comprises a file address of a first script, the entry tag is added to a page code of a webpage by a management client in the process of displaying the webpage by the browser, and the value of a tag attribute of the entry tag is set to the file address of the first script by the management client;
a first sending module, configured to send the first script to the terminal according to the first file acquisition request;
a detection module, configured to detect whether a second script is issued within a preset time length;
a second sending module, configured to send the second script to the terminal if the second script is not issued within the preset time length, wherein the second script is requested when the browser runs the first script, the first script is a script for providing a running environment for the second script, and the second script is a script for acquiring data of the webpage;
and an acquisition module, configured to obtain the data of the webpage collected by the second script, wherein the collected data comprises the webpage content of the webpage.
11. A computer device, comprising a processor and a memory, wherein at least one program is stored in the memory, and the at least one program is loaded and executed by the processor to implement the data acquisition method according to any one of claims 1 to 5 or the data acquisition method according to any one of claims 6 to 8.
12. A data acquisition system, comprising a terminal and a server, wherein:
the terminal comprises the data acquisition device of claim 9;
the server comprises the data acquisition device of claim 10.
13. A computer-readable storage medium, in which at least one program is stored, which is loaded and executed by a processor to implement the data acquisition method according to any one of claims 1 to 5 or the data acquisition method according to any one of claims 6 to 8.
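The four requests recited in the claims can be summarized, as a rough sketch, by the following Python data structures. The field types, class names, and the helper `request_sequence` are hypothetical; the claims name the requests and their contents but do not prescribe any encoding or transport.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FirstFileAddressRequest:
    """Management client -> server: asks for the first script's file address."""
    address_identifier: str
    webpage_identifier: Optional[str] = None


@dataclass(frozen=True)
class FirstFileAcquisitionRequest:
    """Browser -> server: asks for the first script itself."""
    first_script_address: str


@dataclass(frozen=True)
class SecondFileAddressRequest:
    """Browser (running the first script) -> server: asks for the second script's address."""
    environment_parameters: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class SecondFileAcquisitionRequest:
    """Browser -> server: asks for the second script itself."""
    second_script_address: str


def request_sequence(address_id: str, webpage_id: str, first_address: str,
                     environment: Dict[str, str], second_address: str) -> List[object]:
    """Return the requests in the order in which the claims describe them."""
    return [
        FirstFileAddressRequest(address_id, webpage_id),
        FirstFileAcquisitionRequest(first_address),
        SecondFileAddressRequest(dict(environment)),
        SecondFileAcquisitionRequest(second_address),
    ]


# Hypothetical values, for illustration only.
sequence = request_sequence(
    "addr-1", "page-1", "https://example.invalid/first.js",
    {"userAgent": "example"}, "https://example.invalid/second.js",
)
```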
CN201710461347.XA 2017-06-15 2017-06-15 Data acquisition method and device, computer equipment and system Active CN109145182B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710461347.XA CN109145182B (en) 2017-06-15 2017-06-15 Data acquisition method and device, computer equipment and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710461347.XA CN109145182B (en) 2017-06-15 2017-06-15 Data acquisition method and device, computer equipment and system

Publications (2)

Publication Number Publication Date
CN109145182A CN109145182A (en) 2019-01-04
CN109145182B true CN109145182B (en) 2022-07-12

Family

ID=64804178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710461347.XA Active CN109145182B (en) 2017-06-15 2017-06-15 Data acquisition method and device, computer equipment and system

Country Status (1)

Country Link
CN (1) CN109145182B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113898254B (en) * 2021-10-11 2022-07-22 石家庄华泰电力工具有限公司 Remote control method and management system suitable for intelligent safety management cabinet

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101620630A (en) * 2009-06-29 2010-01-06 北京黑米天成科技有限公司 WEB action data collecting model based on JS script
CN103412890A (en) * 2013-07-19 2013-11-27 北京亿赞普网络技术有限公司 Webpage loading method and device
CN103714082A (en) * 2012-10-09 2014-04-09 上海博路信息技术有限公司 Mobile dynamic data engine based internet access analysis system
CN104504125A (en) * 2014-12-30 2015-04-08 北京国双科技有限公司 Web page data monitoring method and device
CN106469185A (en) * 2016-08-29 2017-03-01 浪潮电子信息产业股份有限公司 A kind of method carrying out data collection in website statistics

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020083182A1 (en) * 2000-12-18 2002-06-27 Alvarado Juan C. Real-time streamed data download system and method

Also Published As

Publication number Publication date
CN109145182A (en) 2019-01-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant