CN105701153A - Method and device for reading webpage resources and electronic equipment - Google Patents

Method and device for reading webpage resources and electronic equipment

Info

Publication number
CN105701153A
Authority
CN
China
Prior art keywords
web page
page resources
information
code
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201511020458.4A
Other languages
Chinese (zh)
Other versions
CN105701153B (en)
Inventor
徐光圣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Internet Security Software Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Internet Security Software Co Ltd
Priority to CN201511020458.4A
Publication of CN105701153A
Application granted
Publication of CN105701153B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

Embodiments of the invention disclose a method, a device, and electronic equipment for reading web page resources. The method is applied to the WebView control of Android operating system versions 4.0 to 4.3 and comprises: if the loading state of the web page resource to be captured, corresponding to a received web page capture request, is loading complete, acquiring the URL information of the web page resource to be captured; acquiring, according to the package name of the application that builds the current web page, the resource cache file path mapped to the package name; extracting the binary data file under the resource cache file path and traversing it to obtain the information field that matches the URL information; and querying the information preceding the matched information field to obtain preset flag information, obtaining the web page resource file corresponding to the URL information according to the information preceding the flag information and a preset file name calculation strategy, and reading the web page resource file under the resource cache file path. The invention can improve the utilization efficiency of network resources.

Description

Method, device, and electronic equipment for reading web page resources
Technical field
The present invention relates to computer network resource technology, and in particular to a method, a device, and electronic equipment for reading web page resources.
Background
With the development of computer communication and Internet technology, electronic equipment such as smartphones, personal digital assistants, palmtop computers, and desktop computers is used more and more widely, and more and more applications (APP, Application) and browser controls are installed on such equipment to meet users' diverse needs. A browser control is an indispensable network tool for the browser installed on electronic equipment. Examples include the web browser control (WebBrowser) for browsers on the Windows operating system and the network view control (WebView) for browsers on the Android operating system and on Apple's mobile operating system (iOS). Browser controls on different operating systems implement similar functions and differ only in engine and implementation. WebView, the basic component built into the Android operating system for loading, rendering, and displaying web pages, is widely used.
WebView loads and displays a web page roughly as follows. After receiving a web page loading request from the user, it first obtains the network resources corresponding to the request from a cloud server over the network and caches them in local storage; the network resources are written as code. It then parses the cached network resources to obtain the web page elements to be loaded, renders each obtained element into the web page to be displayed, and displays the rendered page. This continues until all web page elements to be loaded have been rendered, yielding the web page that the user browses.
As users' demands on application functions diversify, a user who finds a good web page resource while browsing, such as a picture, audio, video, or animation resource, may want to capture (read) it for subsequent processing, such as editing or storing it. The user can click the web page resource to obtain its URL information, which triggers a download from the cloud server according to that URL information. With this way of obtaining a web page resource from the currently loaded page, however, the resource to be captured must be downloaded over the network and saved to local storage a second time. This repeated download consumes the user's network traffic, lengthens the time needed to capture the web page resource, and lowers the resource utilization efficiency of the network.
Summary of the invention
In view of this, embodiments of the present invention provide a method, a device, and electronic equipment for reading web page resources, which reduce the time a user needs to capture a web page resource and improve the utilization efficiency of network resources.
To achieve the above purpose, embodiments of the present invention adopt the following technical solutions.
In a first aspect, an embodiment of the present invention provides a method for reading web page resources, applied to the WebView control of Android operating system versions 4.0 to 4.3, including:
receiving a web page resource capture request, and obtaining the loading state of the web page resource to be captured that corresponds to the capture request;
if the loading state of the web page resource to be captured is loading complete, obtaining the URL information of the web page resource to be captured;
obtaining, according to the package name of the application that builds the current web page, the resource cache file path mapped to the package name;
extracting the binary data file under the obtained resource cache file path, and traversing the binary data file to obtain the information field that matches the URL information of the web page resource to be captured;
querying the information preceding the matched information field to obtain preset flag information, obtaining the web page resource file corresponding to the URL information of the web page resource to be captured according to the information preceding the flag information and a preset file name calculation strategy, and reading the web page resource file under the resource cache file path.
Optionally, receiving the web page resource capture request and obtaining the loading state of the web page resource to be captured that corresponds to the capture request includes:
injecting a preset capture listening event into the WebView control;
when the WebView control loads a web page, triggering and starting the capture listening event to listen for web page resource capture requests;
after a web page resource capture request is detected, obtaining the loading state of the web page resource to be captured that corresponds to the capture request.
Optionally, extracting the binary data file under the obtained resource cache file path and traversing the binary data file to obtain the information field that matches the URL information of the web page resource to be captured includes:
extracting the binary data file under the obtained resource cache file path and converting it to hexadecimal, each row corresponding to 32 bytes and every four hexadecimal digits in a row forming a code group;
traversing the code groups to find the preset flag code group, and extracting the run of consecutive non-zero codes after each flag code group to obtain the corresponding web page resource code;
converting the obtained web page resource code to American Standard Code for Information Interchange (ASCII) text and, if the converted ASCII text matches the URL information of the web page resource to be captured, obtaining the information field that matches the URL information of the web page resource to be captured.
Optionally, the binary data file is: /data/data/a.b.c/Cache/webviewCacheChromium/data_1.
Optionally, the flag code group is 0080.
Optionally, extracting the binary data file under the obtained resource cache file path and traversing the binary data file to obtain the information field that matches the URL information of the web page resource to be captured includes:
extracting the binary data file under the obtained resource cache file path and converting it to hexadecimal, each row corresponding to 32 bytes and every four hexadecimal digits in a row forming a code group;
converting the URL information of the web page resource to be captured to a hexadecimal web page resource code;
traversing the hexadecimal-converted binary data file to find the code segment that matches the hexadecimal web page resource code, thereby obtaining the information field that matches the URL information of the web page resource to be captured.
Optionally, obtaining the web page resource file corresponding to the URL information of the web page resource to be captured according to the information preceding the flag information and the preset file name calculation strategy includes:
extracting the row containing the flag code group, and numbering the positions within that row in units of two hexadecimal digits;
extracting, in that order, the 15th, 14th, and 13th hexadecimal codes of the row containing the flag code group to obtain the first part of the web page resource file name;
prepending the characters f_ to the first part of the web page resource file name to obtain the web page resource file.
Optionally, the resource cache file path is: /data/data/a.b.c/Cache/webviewCacheChromium, where a.b.c is the package name.
Optionally, the web page resource includes one or any combination of a picture resource, an audio resource, a video resource, and an animation resource.
In a second aspect, an embodiment of the present invention provides a device for reading web page resources, applied to the WebView control of Android operating system versions 4.0 to 4.3, including a web page resource state acquisition module, a URL information acquisition module, a file path acquisition module, an information field matching module, and a web page resource reading module, wherein:
the web page resource state acquisition module is configured to receive a web page resource capture request and obtain the loading state of the web page resource to be captured that corresponds to the capture request;
the URL information acquisition module is configured to obtain the URL information of the web page resource to be captured if its loading state is loading complete;
the file path acquisition module is configured to obtain, according to the package name of the application that builds the current web page, the resource cache file path mapped to the package name;
the information field matching module is configured to extract the binary data file under the obtained resource cache file path and traverse the binary data file to obtain the information field that matches the URL information of the web page resource to be captured;
the web page resource reading module is configured to query the information preceding the matched information field to obtain preset flag information, obtain the web page resource file corresponding to the URL information of the web page resource to be captured according to the information preceding the flag information and a preset file name calculation strategy, and read the web page resource file under the resource cache file path.
Optionally, the web page resource state acquisition module includes an injection unit, a listening unit, and a web page resource state acquisition unit, wherein:
the injection unit is configured to inject a preset capture listening event into the WebView control;
the listening unit is configured to trigger and start the capture listening event when the WebView control loads a web page, so as to listen for web page resource capture requests;
the web page resource state acquisition unit is configured to obtain, after a web page resource capture request is detected, the loading state of the web page resource to be captured that corresponds to the capture request.
Optionally, the information field matching module includes a conversion unit, a web page resource code extraction unit, and an information field matching unit, wherein:
the conversion unit is configured to extract the binary data file under the obtained resource cache file path and convert it to hexadecimal, each row corresponding to 32 bytes and every four hexadecimal digits in a row forming a code group;
the web page resource code extraction unit is configured to traverse the code groups to find the preset flag code group, and extract the run of consecutive non-zero codes after each flag code group to obtain the corresponding web page resource code;
the information field matching unit is configured to convert the obtained web page resource code to American Standard Code for Information Interchange (ASCII) text and, if the converted ASCII text matches the URL information of the web page resource to be captured, obtain the information field that matches the URL information of the web page resource to be captured.
Optionally, the binary data file is: /data/data/a.b.c/Cache/webviewCacheChromium/data_1.
Optionally, the flag code group is 0080.
Optionally, the information field matching module includes a first conversion unit, a second conversion unit, and a traversal matching unit, wherein:
the first conversion unit is configured to extract the binary data file under the obtained resource cache file path and convert it to hexadecimal, each row corresponding to 32 bytes and every four hexadecimal digits in a row forming a code group;
the second conversion unit is configured to convert the URL information of the web page resource to be captured to a hexadecimal web page resource code;
the traversal matching unit is configured to traverse the hexadecimal-converted binary data file to find the code segment that matches the hexadecimal web page resource code, thereby obtaining the information field that matches the URL information of the web page resource to be captured.
Optionally, the web page resource reading module includes a query unit, an extraction unit, a first web page resource file generation unit, a second web page resource file generation unit, and a reading unit, wherein:
the query unit is configured to query the information preceding the matched information field to obtain the preset flag information;
the extraction unit is configured to extract the row containing the flag code group and number the positions within that row in units of two hexadecimal digits;
the first web page resource file generation unit is configured to extract, in that order, the 15th, 14th, and 13th hexadecimal codes of the row containing the flag code group to obtain the first part of the web page resource file name;
the second web page resource file generation unit is configured to prepend the characters f_ to the first part of the web page resource file name to obtain the web page resource file;
the reading unit is configured to read the web page resource file under the resource cache file path.
Optionally, the resource cache file path is: /data/data/a.b.c/Cache/webviewCacheChromium, where a.b.c is the package name.
Optionally, the web page resource includes one or any combination of a picture resource, an audio resource, a video resource, and an animation resource.
In a third aspect, an embodiment of the present invention provides electronic equipment including a housing, a processor, a memory, a circuit board, and a power supply circuit, wherein the circuit board is placed inside the space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit is configured to supply power to each circuit or component of the electronic equipment; the memory is configured to store executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform any of the foregoing methods for reading web page resources.
The method, device, and electronic equipment for reading web page resources provided by the embodiments of the present invention are based on a study of the WebView control in Android operating system versions 4.0 to 4.3. The mapping between the package name and the resource cache file path is analyzed, and the binary data file under the resource cache file path is processed to find the information field that matches the URL information of the web page resource to be captured. The web page resource file corresponding to that URL information is then obtained from the row containing the flag information that precedes the matched information field, which resolves the mapping between the URL information of the web page resource to be captured and the web page resource file. The locally cached web page resource file can then be read directly, which effectively avoids repeated downloads, saves the user's network traffic, shortens the time needed to capture the web page resource, and improves the resource utilization efficiency of the network.
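The file name calculation strategy described above (number the row in two-digit units, take the 15th, 14th, and 13th units in that order, and prepend f_) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name cache_file_name is hypothetical, and the sample row is the flag row from the hexadecimal example given in the detailed description.

```python
def cache_file_name(flag_row: str) -> str:
    """Derive the cached file name from the hex row that holds the flag code group."""
    # Split the row into two-digit units; unit 1 is the leftmost pair.
    units = [flag_row[i:i + 2] for i in range(0, len(flag_row), 2)]
    if len(units) < 15:
        raise ValueError("row is too short to contain units 13 to 15")
    # Take the 15th, 14th, and 13th units, in that order (1-based positions).
    first_part = units[14] + units[13] + units[12]
    # Prepend the fixed prefix f_ to obtain the file name.
    return "f_" + first_part


# Flag row from the hexadecimal example; its last code group is 0080.
row = "0000000000000000ce1a01a1ec020080"
file_name = cache_file_name(row)
print(file_name)
```

For the sample row the units at positions 13 to 15 are ec, 02, and 00, so the sketch yields f_0002ec, which would then be looked up under the resource cache file path.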
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the method for reading web page resources according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of receiving a web page resource capture request and obtaining the loading state of the corresponding web page resource to be captured, according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of extracting the binary data file under the obtained resource cache file path and traversing it to obtain the information field that matches the URL information of the web page resource to be captured, according to a first embodiment of the present invention;
Fig. 4 is a schematic flowchart of extracting the binary data file under the obtained resource cache file path and traversing it to obtain the information field that matches the URL information of the web page resource to be captured, according to a second embodiment of the present invention;
Fig. 5 is a schematic flowchart of obtaining the web page resource file corresponding to the URL information of the web page resource to be captured according to the information preceding the flag information and the preset file name calculation strategy, according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the device for reading web page resources according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an embodiment of the electronic equipment of the present invention.
Detailed description of the embodiments
The embodiments of the present invention are described in detail below with reference to the accompanying drawings.
It should be understood that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a schematic flowchart of the method for reading web page resources according to an embodiment of the present invention. Referring to Fig. 1, the method is applied to the WebView control of Android operating system versions 4.0 to 4.3 and includes:
Step 11: receive a web page resource capture request, and obtain the loading state of the web page resource to be captured that corresponds to the capture request;
In this step, when the WebView control loads and displays a web page, each web page resource (web page element to be loaded) is rendered into the current web page and displayed as soon as it is parsed. For web page resources such as picture, audio, video, and animation resources, if the parsed resource has Uniform Resource Locator (URL) information, the parsed URL information is loaded first, the parsed web page resource is then loaded and rendered, and the resource is displayed in the web page once rendering completes. In the embodiment of the present invention, therefore, the details of a web page resource, such as the aforementioned URL information, can be obtained only after the resource has finished loading. If loading has not completed, the details cannot be obtained and the subsequent flow cannot be performed.
In the embodiment of the present invention, the network resources correspond to all web page resources contained in the entire web page and include multiple files. The specific file structure of the network resources is known in the art, and a detailed description is omitted here.
As an optional embodiment, a web page resource is a resource with URL information, including but not limited to one or any combination of picture resources, audio resources, video resources, animation resources, and the like.
As an optional embodiment, Fig. 2 is a schematic flowchart of receiving a web page resource capture request and obtaining the loading state of the corresponding web page resource to be captured, according to an embodiment of the present invention. Referring to Fig. 2, the flow includes:
Step 21: inject a preset capture listening event into the WebView control;
In this step, the capture listening event is preferably implemented with JavaScript (JS) script code. For example, the JS script code of the embodiment of the present invention is implanted in the WebView control in advance and is set to start when the WebView control loads a web page, so that the running JS script code listens for user behavior that captures a web page resource, for example, the user clicking a web page resource (a web page resource capture request). JavaScript is an object-based, event-driven client-side scripting language with relative security. It is widely used in client-side web development and can respond to various user operations by adding dynamic functions.
Step 22: when the WebView control loads a web page, trigger and start the capture listening event to listen for web page resource capture requests;
Step 23: after a web page resource capture request is detected, obtain the loading state of the web page resource to be captured that corresponds to the capture request.
In this step, if a user operation is detected, for example a click or touch on a web page resource with URL information in the current web page, it is confirmed that a web page resource capture request has been detected, and the clicked web page resource is the web page resource to be captured.
In the embodiment of the present invention, the loading state is either loading incomplete or loading complete.
Step 12: if the loading state of the web page resource to be captured is loading complete, obtain the URL information of the web page resource to be captured;
In this step, when a web page resource with URL information is clicked in the web page, the URL information of that resource can be obtained. For a certain picture resource, for example, the URL information is: http://www.xxx.com/xx.jpg.
Step 13: obtain, according to the package name of the application that builds the current web page, the resource cache file path mapped to the package name;
In this step, after obtaining the network resources, the WebView control by default caches them under the resource cache file path in local storage and builds the mapping between the package name of the application corresponding to the web page and the resource cache file path.
In the embodiment of the present invention, the network resources comprise multiple files, including the individual web page resource files and binary data files, and there is no explicit mapping between a web page resource file and a web page resource. In addition, the current WebView control provides no interface for accessing the network resources stored in local storage. A web page resource therefore cannot be read directly by accessing the stored network resources; instead, the URL information of the web page resource must be used again to download the corresponding resource from the cloud server over the network, which wastes the user's network traffic.
In the embodiment of the present invention, for the WebView control of Android operating system versions 4.0 to 4.3, the mapping between the package name and the resource cache file path is stored in the private directory corresponding to the package name of the application (APP, Application) that builds the current web page. For mobile electronic equipment, for example, the application that builds the current web page is the application that converts the web site format of the network into a mobile page format. For example, if the package name of an application is a.b.c, the resource cache file path mapped to it is: /data/data/a.b.c/Cache/webviewCacheChromium.
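The mapping from package name to resource cache file path can be illustrated with a short Python sketch. The function names resource_cache_dir and binary_data_file are hypothetical; the directory layout and the data_1 file name are taken from the text, and nothing here touches a real device.

```python
from pathlib import PurePosixPath


def resource_cache_dir(package_name: str) -> PurePosixPath:
    """Build the cache directory that the text maps to a package name."""
    if not package_name:
        raise ValueError("package name must not be empty")
    # Layout as given in the text: /data/data/<package>/Cache/webviewCacheChromium
    return PurePosixPath("/data/data") / package_name / "Cache" / "webviewCacheChromium"


def binary_data_file(package_name: str) -> PurePosixPath:
    """Path of the binary data file (data_1) under the cache directory."""
    return resource_cache_dir(package_name) / "data_1"


print(resource_cache_dir("a.b.c"))
print(binary_data_file("a.b.c"))
```

PurePosixPath is used so that the sketch produces Android-style paths regardless of the host operating system.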
Step 14: extract the binary data file under the obtained resource cache file path, and traverse the binary data file to obtain the information field that matches the URL information of the web page resource to be captured;
In this step, as an optional embodiment, Fig. 3 is a schematic flowchart of extracting the binary data file under the obtained resource cache file path and traversing it to obtain the information field that matches the URL information of the web page resource to be captured, according to a first embodiment of the present invention. Referring to Fig. 3, the flow includes:
Step 31: extract the binary data file under the obtained resource cache file path and convert it to hexadecimal, each row corresponding to 32 bytes and every four hexadecimal digits in a row forming a code group;
In this step, the binary data file is: /data/data/a.b.c/Cache/webviewCacheChromium/data_1.
In the embodiment of the present invention, the binary code in the binary file is converted to hexadecimal code, each row representing 256 bits, that is, 32 bytes. For example, a converted hexadecimal code is as follows:
... (preceding content omitted)
0000000000000000ce1a01a1ec020080
00000000000000000000000000000000
00000000000000000000000000000000
687474703a2f2f622e686970686f746f
732e62616964752e636f6d2f696d6167
652f7069632f6974656d2f6561633462
37343534336139383232363636393134
63663438653832623930313462393065
6239352e6a7067000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
... (following content omitted)
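The hexadecimal conversion of Step 31 can be sketched in Python as follows. The helper names hex_rows and code_groups are hypothetical. Each row printed in the example above contains 32 hexadecimal digits, so the sketch assumes that row width as its default; the width is a parameter and can be changed.

```python
def hex_rows(data: bytes, digits_per_row: int = 32) -> list:
    """Convert binary data to hex text and split it into fixed-width rows."""
    text = data.hex()
    return [text[i:i + digits_per_row] for i in range(0, len(text), digits_per_row)]


def code_groups(row: str, digits_per_group: int = 4) -> list:
    """Split one hex row into code groups of four hexadecimal digits."""
    return [row[i:i + digits_per_group] for i in range(0, len(row), digits_per_group)]


# First row of the example followed by one all-zero row.
sample = bytes.fromhex("0000000000000000ce1a01a1ec020080") + bytes(16)
rows = hex_rows(sample)
for row in rows:
    print(row, code_groups(row))
```

In a real run the bytes would come from reading the data_1 file rather than from a literal.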
Step 32: traverse the code groups to find the preset flag code group, and extract the run of consecutive non-zero codes after each flag code group to obtain the corresponding web page resource code;
In this step, the flag code group is 0080. 0080 is a marker string with no specific meaning. The code segment that starts at the first non-zero code after 0080 and ends where the code becomes zero is the web page resource code.
In the embodiment of the present invention, the code groups may contain one or more flag code groups, and one web page resource code can be obtained from each flag code group. For the example above, the web page resource code obtained is as follows:
687474703a2f2f622e686970686f746f
732e62616964752e636f6d2f696d6167
652f7069632f6974656d2f6561633462
37343534336139383232363636393134
63663438653832623930313462393065
6239352e6a7067
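Step 32 can be sketched in Python as a scan over the raw bytes. This is an illustration only: the function name resource_codes is hypothetical, the alignment check (the flag must start on a four-digit code group boundary) is an assumption, and the sample data uses the placeholder URL from the text rather than real cache contents.

```python
FLAG = bytes.fromhex("0080")  # flag code group named in the text


def resource_codes(data: bytes, flag: bytes = FLAG) -> list:
    """Return (flag_offset, code_bytes) for each flag followed by a non-zero run."""
    found = []
    pos = 0
    while True:
        idx = data.find(flag, pos)
        if idx == -1:
            break
        if idx % 2:  # not aligned to a four-digit code group; keep searching
            pos = idx + 1
            continue
        start = idx + len(flag)
        while start < len(data) and data[start] == 0:  # skip zero padding
            start += 1
        end = start
        while end < len(data) and data[end] != 0:  # consume the non-zero run
            end += 1
        if end > start:
            found.append((idx, data[start:end]))
        pos = end
    return found


# Flag row, two all-zero rows, the placeholder URL, then zero padding.
sample = (bytes.fromhex("0000000000000000ce1a01a1ec020080") + bytes(32)
          + b"http://www.xxx.com/xx.jpg" + bytes(7))
print(resource_codes(sample))
```

Keeping the flag offset alongside each extracted code makes it possible to go back to the row that holds the flag when the file name is calculated later.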
Step 33: convert the obtained web page resource code to American Standard Code for Information Interchange (ASCII) text and, if the converted ASCII text matches the URL information of the web page resource to be captured, obtain the information field that matches the URL information of the web page resource to be captured.
In this step, suppose the ASCII (American Standard Code for Information Interchange) text converted from the above web page resource code is: http://www.xxx.com/xx.jpg. This shows that the ASCII text matches the URL information of the web page resource to be captured (http://www.xxx.com/xx.jpg).
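The ASCII conversion and comparison of Step 33 can be sketched in Python as follows. The function names to_ascii and matches_url are hypothetical. The hexadecimal rows are copied from the example; decoded literally they spell a longer picture URL on b.hiphotos.baidu.com, whereas the text uses http://www.xxx.com/xx.jpg as a stand-in.

```python
# Web page resource code copied from the example above.
SAMPLE_CODE = "".join([
    "687474703a2f2f622e686970686f746f",
    "732e62616964752e636f6d2f696d6167",
    "652f7069632f6974656d2f6561633462",
    "37343534336139383232363636393134",
    "63663438653832623930313462393065",
    "6239352e6a7067",
])


def to_ascii(hex_code: str):
    """Decode a hex string to ASCII text; return None if it is not valid ASCII hex."""
    try:
        return bytes.fromhex(hex_code).decode("ascii")
    except ValueError:  # bad hex digits or non-ASCII bytes
        return None


def matches_url(hex_code: str, url: str) -> bool:
    """True if the decoded resource code equals the URL of the resource to be captured."""
    return to_ascii(hex_code) == url


decoded = to_ascii(SAMPLE_CODE)
print(decoded)
print(matches_url(SAMPLE_CODE, decoded))
```

An exact string comparison is used here; whether the original method normalizes URLs before comparing is not stated in the text.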
As another optional embodiment, Fig. 4 is a schematic flowchart, according to the second embodiment of the present invention, of extracting the binary data file under the obtained resource cache file path and traversing the binary data file to obtain the information field that matches the URL information of the web page resource to be captured. Referring to Fig. 4, the flow includes:
Step 41: extract the binary data file under the obtained resource cache file path and convert it to hexadecimal, where each row corresponds to 32 bytes and every four hexadecimal digits form one code group in a row.
Step 42: convert the URL information of the web page resource to be captured into a hexadecimal web page resource code.
In this step, taking http://www.xxx.com/xx.jpg as the URL information of the web page resource to be captured, the converted hexadecimal web page resource code is as follows:
687474703a2f2f622e686970686f746f
732e62616964752e636f6d2f696d6167
652f7069632f6974656d2f6561633462
37343534336139383232363636393134
63663438653832623930313462393065
6239352e6a7067
Step 43: traverse the hexadecimal-converted binary data file to find the code segment that matches the hexadecimal web page resource code, thereby obtaining the information field that matches the URL information of the web page resource to be captured.
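Steps 42 and 43 amount to encoding the URL as hexadecimal and searching for it in the hexadecimal form of the file. The following Python sketch shows one way this could look. The names `url_to_hex` and `find_url_field` are hypothetical, and the byte-boundary check is an added assumption to avoid matches that straddle two bytes.

```python
# Sketch only: names are hypothetical, not taken from the patent.

def url_to_hex(url: str) -> str:
    """Step 42: convert the URL into its hexadecimal code."""
    return url.encode("ascii").hex()


def find_url_field(data: bytes, url: str) -> int:
    """Step 43: return the byte offset of the URL in data, or -1 if absent."""
    hex_dump = data.hex()
    needle = url_to_hex(url)
    idx = hex_dump.find(needle)
    # A valid hit must start on a byte boundary (even hex offset).
    while idx != -1 and idx % 2 == 1:
        idx = hex_dump.find(needle, idx + 1)
    return -1 if idx == -1 else idx // 2
```

For example, with 16 zero bytes placed in front of the URL bytes, `find_url_field` returns 16, the offset at which the matched information field begins.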
Step 15: query the information preceding the matched information field to obtain the preset flag information; according to the information preceding the flag information and a preset file name calculation strategy, obtain the web page resource file corresponding to the URL information of the web page resource to be captured; and read that web page resource file under the resource cache file path.
In this step, the flag information is the flag code group. As an optional embodiment, Fig. 5 is a schematic flowchart, according to an embodiment of the present invention, of obtaining the web page resource file corresponding to the URL information of the web page resource to be captured according to the information preceding the flag information and the preset file name calculation strategy. Referring to Fig. 5, the flow includes:
Step 51: extract the row containing the flag code group, and label the positions in that row in units of two hexadecimal digits.
In this step, the information about the web page resource file corresponding to the URL information of the web page resource to be captured is stored in the row containing the flag code group. The code segment after position labeling is:
00 00 00 00 00 00 00 00 ce 1a 01 a1 ec 02 00 80
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Step 52: extract, in order, the 15th, 14th and 13th hexadecimal codes of the row containing the flag code group to obtain the first part of the web page resource file name.
In this step, the 15th, 14th and 13th hexadecimal strings of the row containing the flag code group are concatenated in that order, giving the hexadecimal string 0002ec.
Step 53: prepend the characters f_ to the first part obtained, to get the web page resource file.
In this step, "f_" is prepended to the string obtained, giving f_0002ec, which is the web page resource file corresponding to the URL information of the web page resource to be captured. This web page resource file is stored under the resource cache file path, namely:
/data/data/a.b.c/Cache/webviewCacheChromium/f_0002ec
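The file name calculation in steps 51 to 53 can be sketched in Python as below. The function name `cache_file_name` is hypothetical, and the check that the row ends with the flag 0080 is an assumption based on the single example row given in the text.

```python
# Sketch only: the function name is hypothetical, not taken from the patent.

def cache_file_name(row_hex: str) -> str:
    """Derive the cached file name from the row that holds the flag code group."""
    # Step 51: label the row in units of two hexadecimal digits.
    pairs = [row_hex[i:i + 2] for i in range(0, len(row_hex), 2)]
    # Assumption: the row has 16 positions and ends with the flag 0080.
    if len(pairs) != 16 or pairs[14] + pairs[15] != "0080":
        raise ValueError("row does not end with the flag code group 0080")
    # Step 52: positions in the text are 1-based, so the 15th, 14th and 13th
    # codes are list indices 14, 13 and 12.
    first_part = pairs[14] + pairs[13] + pairs[12]
    # Step 53: prepend "f_".
    return "f_" + first_part


name = cache_file_name("0000000000000000ce1a01a1ec020080")  # "f_0002ec"
```

For the example row, positions 15, 14 and 13 hold 00, 02 and ec, so the sketch reproduces the file name f_0002ec given in the text.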
In the embodiment of the present invention, the WebView control of Android operating system versions 4.0 to 4.3 is studied and its resource caching characteristics are analyzed, namely the mapping between the package name and the resource cache file path. The binary data file under the resource cache file path is processed to find the information segment that matches the URL information of the web page resource to be captured. From the row containing the flag information that precedes the matched segment, the web page resource file corresponding to that URL information is obtained, which resolves the mapping between the URL information of the web page resource to be captured and the web page resource file. The locally cached web page resource file can then be read directly, without downloading the resource over the network again and saving it to local storage. This avoids repeated downloads, saves the user's network traffic, shortens the time the user needs to capture a web page resource (that is, the time to obtain a resource from an already loaded page), and improves network resource utilization.
Fig. 6 is a schematic structural diagram of a device for reading web page resources according to an embodiment of the present invention. Referring to Fig. 6, the device is applied to the WebView control of Android operating system versions 4.0 to 4.3 and includes: a web page resource state acquisition module 61, a URL information acquisition module 62, a file path acquisition module 63, an information field matching module 64 and a web page resource reading module 65, wherein:
The web page resource state acquisition module 61 is configured to receive a web page resource capture request and obtain the loading state of the web page resource to be captured that corresponds to the capture request.
In the embodiment of the present invention, the loading state is either not yet loaded or loaded.
As an optional embodiment, the web page resource state acquisition module 61 includes an injection unit, a monitoring unit and a web page resource state acquiring unit (not shown), wherein:
The injection unit is configured to inject a preset capture listening event into the WebView control.
In the embodiment of the present invention, the capture listening event is implemented in JavaScript code.
The monitoring unit is configured to trigger and start the capture listening event when the WebView control loads a web page, so as to listen for web page resource capture requests.
The web page resource state acquiring unit is configured to obtain, after a web page resource capture request is detected, the loading state of the web page resource to be captured that corresponds to the capture request.
The URL information acquisition module 62 is configured to obtain the URL information of the web page resource to be captured if the loading state of the web page resource to be captured is loaded.
In the embodiment of the present invention, the web page resource is a resource that has URL information, including one or any combination of picture resources, audio resources, video resources and animation resources.
The file path acquisition module 63 is configured to obtain, according to the package name of the application that builds the current web page, the resource cache file path to which the package name maps.
In the embodiment of the present invention, for the WebView control of Android operating system versions 4.0 to 4.3, the mapping between the package name and the resource cache file path is stored in the private directory corresponding to the package name of the application that builds the current web page.
As an optional embodiment, the resource cache file path is /data/data/a.b.c/Cache/webviewCacheChromium, where a.b.c is the package name.
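The mapping from package name to resource cache file path described above can be illustrated with a short Python sketch. The helper names `cache_dir` and `cached_resource_path` are hypothetical, and the path layout is simply the one quoted in the text for Android 4.0 to 4.3; it is not verified here against other versions.

```python
from pathlib import PurePosixPath

# Sketch only: helper names are hypothetical, not taken from the patent.


def cache_dir(package_name: str) -> PurePosixPath:
    """Map a package name to the resource cache file path quoted in the text."""
    return PurePosixPath("/data/data") / package_name / "Cache" / "webviewCacheChromium"


def cached_resource_path(package_name: str, file_name: str) -> PurePosixPath:
    """Locate a cached web page resource file under the cache path."""
    return cache_dir(package_name) / file_name
```

With the placeholder package name a.b.c, `cache_dir("a.b.c")` gives /data/data/a.b.c/Cache/webviewCacheChromium. `PurePosixPath` is used so that the result does not depend on the host operating system.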
The information field matching module 64 is configured to extract the binary data file under the obtained resource cache file path and traverse the binary data file to obtain the information field that matches the URL information of the web page resource to be captured.
In the embodiment of the present invention, as an optional embodiment, the information field matching module includes a converting unit, a web page resource code extraction unit and an information field matching unit (not shown), wherein:
The converting unit is configured to extract the binary data file under the obtained resource cache file path and convert it to hexadecimal, where each row corresponds to 32 bytes and every four hexadecimal digits form one code group in a row.
The web page resource code extraction unit is configured to traverse the code groups to find the preset flag code group, and to extract the continuous non-zero codes that follow each flag code group to obtain the corresponding web page resource code.
The information field matching unit is configured to convert the obtained web page resource code to ASCII and, if the converted ASCII text matches the URL information of the web page resource to be captured, obtain the information field that matches the URL information of the web page resource to be captured.
In the embodiment of the present invention, as an optional embodiment, the binary data file is /data/data/a.b.c/Cache/webviewCacheChromium/data_1.
In the embodiment of the present invention, the flag code group is 0080. The code segment that starts at the first non-zero code after the flag code group 0080 and ends where the code becomes zero is the web page resource code.
In the embodiment of the present invention, the code groups may include one or more flag code groups, and one web page resource code can be obtained for each flag code group.
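The hexadecimal conversion performed by the converting unit can be sketched in Python as follows. The function names `to_hex_rows` and `code_groups` are hypothetical. The text states that each row corresponds to 32 bytes, while the sample rows in the description show 32 hexadecimal characters; the sketch assumes the latter and exposes the row width as a parameter.

```python
# Sketch only: names and the default row width are assumptions.

def to_hex_rows(data: bytes, row_chars: int = 32) -> list:
    """Convert binary data to hexadecimal text split into fixed-width rows."""
    hex_text = data.hex()
    return [hex_text[i:i + row_chars] for i in range(0, len(hex_text), row_chars)]


def code_groups(row: str) -> list:
    """Split one row into code groups of four hexadecimal digits."""
    return [row[i:i + 4] for i in range(0, len(row), 4)]


rows = to_hex_rows(bytes(8) + bytes.fromhex("ce1a01a1ec020080") + b"http")
groups = code_groups(rows[0])  # the last group is the flag "0080"
```

Scanning each row's code groups for 0080 is then a simple membership test on the list returned by `code_groups`.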
In the embodiment of the present invention, as another optional embodiment, the information field matching module includes a first converting unit, a second converting unit and a traversal matching unit (not shown), wherein:
The first converting unit is configured to extract the binary data file under the obtained resource cache file path and convert it to hexadecimal, where each row corresponds to 32 bytes and every four hexadecimal digits form one code group in a row.
The second converting unit is configured to convert the URL information of the web page resource to be captured into a hexadecimal web page resource code.
The traversal matching unit is configured to traverse the hexadecimal-converted binary data file to find the code segment that matches the hexadecimal web page resource code, thereby obtaining the information field that matches the URL information of the web page resource to be captured.
The web page resource reading module 65 is configured to query the information preceding the matched information field to obtain the preset flag information; to obtain, according to the information preceding the flag information and a preset file name calculation strategy, the web page resource file corresponding to the URL information of the web page resource to be captured; and to read that web page resource file under the resource cache file path.
In the embodiment of the present invention, as an optional embodiment, the web page resource reading module includes a query unit, an extraction unit, a first web page resource file generating unit, a second web page resource file generating unit and a reading unit, wherein:
The query unit is configured to query the information preceding the matched information field to obtain the preset flag information.
The extraction unit is configured to extract the row containing the flag code group and label the positions in that row in units of two hexadecimal digits.
In the embodiment of the present invention, the information about the web page resource file corresponding to the URL information of the web page resource to be captured is stored in the row containing the flag code group.
The first web page resource file generating unit is configured to extract, in order, the 15th, 14th and 13th hexadecimal codes of the row containing the flag code group to obtain the first part of the web page resource file name.
In the embodiment of the present invention, the 15th, 14th and 13th hexadecimal strings of the row containing the flag code group are concatenated in that order to obtain a hexadecimal string.
The second web page resource file generating unit is configured to prepend the characters f_ to the first part obtained, to get the web page resource file.
In the embodiment of the present invention, the web page resource file is stored under the resource cache file path /data/data/a.b.c/Cache/webviewCacheChromium.
The reading unit is configured to read the web page resource file under the resource cache file path.
An embodiment of the present invention also provides an electronic device that includes the device described in any of the foregoing embodiments.
Fig. 7 is a schematic structural diagram of one embodiment of the electronic device of the present invention, which can implement the flows of the embodiments shown in Figs. 1-6. As shown in Fig. 7, the electronic device may include a housing 71, a processor 72, a memory 73, a circuit board 74 and a power supply circuit 75. The circuit board 74 is placed inside the space enclosed by the housing 71, and the processor 72 and the memory 73 are arranged on the circuit board 74. The power supply circuit 75 supplies power to each circuit or component of the electronic device. The memory 73 stores executable program code. The processor 72 reads the executable program code stored in the memory 73 and runs the corresponding program, so as to perform the method for reading web page resources described in any of the foregoing embodiments.
For the specific execution of the above steps by the processor 72, and for the further steps the processor 72 performs by running the executable program code, refer to the description of the embodiments shown in Figs. 1-6; they are not repeated here.
The electronic device exists in a variety of forms, including but not limited to:
(1) Mobile communication devices: these devices are characterized by mobile communication capability, with voice and data communication as their main purpose. Such terminals include smartphones (such as the iPhone), multimedia phones, feature phones and low-end phones.
(2) Ultra-mobile personal computer devices: these devices belong to the category of personal computers, have computing and processing capability, and generally also support mobile Internet access. Such terminals include PDA, MID and UMPC devices, such as the iPad.
(3) Portable entertainment devices: these devices can display and play multimedia content. They include audio and video players (such as the iPod), handheld devices, e-book readers, smart toys and portable in-vehicle navigation devices.
(4) Servers: devices that provide computing services. A server is composed of a processor, hard disk, memory, system bus and so on. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, it has higher requirements for processing capability, stability, reliability, security, scalability and manageability.
(5) Other electronic devices with data interaction capability.
A person of ordinary skill in the art will understand that all or part of the flows in the above method embodiments can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the flows of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited to them. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims (10)

1. A method for reading web page resources, characterized in that the method is applied to the WebView control of Android operating system versions 4.0 to 4.3 and comprises:
receiving a web page resource capture request, and obtaining the loading state of the web page resource to be captured that corresponds to the capture request;
if the loading state of the web page resource to be captured is loaded, obtaining the URL information of the web page resource to be captured;
obtaining, according to the package name of the application that builds the current web page, the resource cache file path to which the package name maps;
extracting the binary data file under the obtained resource cache file path, and traversing the binary data file to obtain the information field that matches the URL information of the web page resource to be captured;
querying the information preceding the matched information field to obtain preset flag information; obtaining, according to the information preceding the flag information and a preset file name calculation strategy, the web page resource file corresponding to the URL information of the web page resource to be captured; and reading that web page resource file under the resource cache file path.
2. The method according to claim 1, characterized in that receiving the web page resource capture request and obtaining the loading state of the web page resource to be captured that corresponds to the capture request comprises:
injecting a preset capture listening event into the WebView control;
when the WebView control loads a web page, triggering and starting the capture listening event to listen for web page resource capture requests;
after a web page resource capture request is detected, obtaining the loading state of the web page resource to be captured that corresponds to the capture request.
3. The method according to claim 1, characterized in that extracting the binary data file under the obtained resource cache file path and traversing the binary data file to obtain the information field that matches the URL information of the web page resource to be captured comprises:
extracting the binary data file under the obtained resource cache file path and converting it to hexadecimal, where each row corresponds to 32 bytes and every four hexadecimal digits form one code group in a row;
traversing the code groups to find the preset flag code group, and extracting the continuous non-zero codes that follow each flag code group to obtain the corresponding web page resource code;
converting the obtained web page resource code to ASCII and, if the converted ASCII text matches the URL information of the web page resource to be captured, obtaining the information field that matches the URL information of the web page resource to be captured.
4. The method according to claim 3, characterized in that the binary data file is /data/data/a.b.c/Cache/webviewCacheChromium/data_1.
5. The method according to claim 3, characterized in that the flag code group is 0080.
6. The method according to claim 1, characterized in that extracting the binary data file under the obtained resource cache file path and traversing the binary data file to obtain the information field that matches the URL information of the web page resource to be captured comprises:
extracting the binary data file under the obtained resource cache file path and converting it to hexadecimal, where each row corresponds to 32 bytes and every four hexadecimal digits form one code group in a row;
converting the URL information of the web page resource to be captured into a hexadecimal web page resource code;
traversing the hexadecimal-converted binary data file to find the code segment that matches the hexadecimal web page resource code, thereby obtaining the information field that matches the URL information of the web page resource to be captured.
7. The method according to claim 1, characterized in that obtaining, according to the information preceding the flag information and the preset file name calculation strategy, the web page resource file corresponding to the URL information of the web page resource to be captured comprises:
extracting the row containing the flag code group, and labeling the positions in that row in units of two hexadecimal digits;
extracting, in order, the 15th, 14th and 13th hexadecimal codes of the row containing the flag code group to obtain the first part of the web page resource file name;
prepending the characters f_ to the first part obtained, to get the web page resource file.
8. The method according to any one of claims 1 to 7, characterized in that the resource cache file path is /data/data/a.b.c/Cache/webviewCacheChromium, where a.b.c is the package name.
9. The method according to any one of claims 1 to 7, characterized in that the web page resource includes one or any combination of picture resources, audio resources, video resources and animation resources.
10. A device for reading web page resources, characterized in that the device is applied to the WebView control of Android operating system versions 4.0 to 4.3 and comprises: a web page resource state acquisition module, a URL information acquisition module, a file path acquisition module, an information field matching module and a web page resource reading module, wherein:
the web page resource state acquisition module is configured to receive a web page resource capture request and obtain the loading state of the web page resource to be captured that corresponds to the capture request;
the URL information acquisition module is configured to obtain the URL information of the web page resource to be captured if the loading state of the web page resource to be captured is loaded;
the file path acquisition module is configured to obtain, according to the package name of the application that builds the current web page, the resource cache file path to which the package name maps;
the information field matching module is configured to extract the binary data file under the obtained resource cache file path and traverse the binary data file to obtain the information field that matches the URL information of the web page resource to be captured;
the web page resource reading module is configured to query the information preceding the matched information field to obtain preset flag information; to obtain, according to the information preceding the flag information and a preset file name calculation strategy, the web page resource file corresponding to the URL information of the web page resource to be captured; and to read that web page resource file under the resource cache file path.
CN201511020458.4A 2015-12-29 2015-12-29 Method and device for reading webpage resources and electronic equipment Active CN105701153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511020458.4A CN105701153B (en) 2015-12-29 2015-12-29 Method and device for reading webpage resources and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201511020458.4A CN105701153B (en) 2015-12-29 2015-12-29 Method and device for reading webpage resources and electronic equipment

Publications (2)

Publication Number Publication Date
CN105701153A true CN105701153A (en) 2016-06-22
CN105701153B CN105701153B (en) 2019-03-22

Family

ID=56226040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511020458.4A Active CN105701153B (en) 2015-12-29 2015-12-29 Method and device for reading webpage resources and electronic equipment

Country Status (1)

Country Link
CN (1) CN105701153B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108008984A (en) * 2017-11-15 2018-05-08 武汉斗鱼网络科技有限公司 A kind of resource file downloading updating method and device
CN108200191A (en) * 2018-01-29 2018-06-22 杭州电子科技大学 Utilize the client dynamic URL associated script character string detecting systems of perturbation method
CN108197291A (en) * 2018-01-19 2018-06-22 北京小米移动软件有限公司 Operation performs method and device
CN110727891A (en) * 2019-09-09 2020-01-24 中国平安财产保险股份有限公司 Browser cache management method and device and computer readable storage medium
CN110795650A (en) * 2019-09-18 2020-02-14 平安银行股份有限公司 Webpage opening method and device and computer readable storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1290898A (en) * 1999-06-15 2001-04-11 太阳微系统公司 Net page of high speed buffer storage reducing form on small footprint size device
CN101369284A (en) * 2008-09-28 2009-02-18 北京搜狗科技发展有限公司 Method and apparatus for loading web pages
CN102033917A (en) * 2010-12-09 2011-04-27 广州市动景计算机科技有限公司 Webpage browsing method for mobile terminal and mobile terminal applying same
CN102253941A (en) * 2010-05-21 2011-11-23 卓望数码技术(深圳)有限公司 Cache updating method and cache updating device
CN102654882A (en) * 2011-03-02 2012-09-05 北京千橡网景科技发展有限公司 Method and apparatus for page loading
CN102663074A (en) * 2012-03-31 2012-09-12 奇智软件(北京)有限公司 Method and device for connecting link in search result webpage
CN102833111A (en) * 2012-08-30 2012-12-19 北京锐安科技有限公司 Visual hyper text transfer protocol (HTTP) data supervising method and device
CN102855253A (en) * 2011-06-30 2013-01-02 腾讯科技(深圳)有限公司 Browser and browsing method thereof
WO2014187159A1 (en) * 2013-05-23 2014-11-27 Tencent Technology (Shenzhen) Company Limited A method and an apparatus for performing offline access to web pages
US20140372936A1 (en) * 2013-06-13 2014-12-18 Tencent Technology (Shenzhen) Company Limited Method and apparatus for displaying tag data

Also Published As

Publication number Publication date
CN105701153B (en) 2019-03-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant