CN103577482A - Web page collecting method and device as well as browser - Google Patents

Web page collecting method and device as well as browser

Info

Publication number
CN103577482A
Authority
CN
China
Prior art keywords
web page
webpage
page contents
contents
interlinkage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210278582.0A
Other languages
Chinese (zh)
Other versions
CN103577482B (en)
Inventor
刘刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201210278582.0A priority Critical patent/CN103577482B/en
Publication of CN103577482A publication Critical patent/CN103577482A/en
Application granted granted Critical
Publication of CN103577482B publication Critical patent/CN103577482B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/955Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562Bookmark management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention is suitable for the technical field of computers and provides a web page collecting method and device as well as a browser. The web page collecting method comprises the following steps of receiving a web page collecting command, and obtaining a web page link corresponding to a web page; calling a web page grabbing server in a cloud server farm to grab web page content corresponding to the web page link according to the web page link; saving the web page content in a cloud storage server in the cloud server farm. According to the web page collecting method and device as well as the browser, which are disclosed by the invention, the cloud storage of the collected web page content is realized, and the long-term validity of the collected web page is ensured, so that the collected web page content is not limited by time and access addresses, and the function of a bookmark of the browser is expanded.

Description

Web page collecting method, device and browser
Technical field
The invention belongs to the field of computer technology, and in particular relates to a web page collecting method, device and browser.
Background
Collecting (bookmarking) is a basic browser function: the links of websites or web pages that a user needs to visit often are saved on the local computer terminal, and clicking a saved link directly opens the corresponding website or page to access the corresponding resource. However, after the operating system of the terminal is reinstalled, the locally saved collection data is lost and the collected pages can no longer be opened. In addition, the links saved on the local computer terminal cannot be obtained on other terminals.
To ensure that collected links can still be opened on a terminal whose operating system has been reinstalled, or on other terminals, network favorites were proposed, so that collected website links can be saved on the network. However, a network link has its own life cycle. After a period of time, because the corresponding website server has migrated or the link path of the web page has been adjusted, the collected web page link may no longer reach the corresponding resource. Even if the link can still be accessed, the page may no longer show the content it had when it was collected. Long-term collection and preservation of web page content therefore cannot be achieved.
Summary of the invention
The object of the embodiments of the present invention is to provide a web page collecting method, intended to solve the problem that the prior art offers no effective web page collecting method, so that long-term collection and preservation of web page content cannot be achieved.
The embodiments of the present invention are realized as a web page collecting method comprising the following steps:
Receiving an instruction to collect a web page, and obtaining the web page link corresponding to the web page;
Calling, according to the web page link, a web page grabbing server in a cloud server farm to grab the web page content corresponding to the web page link;
Saving the web page content to a cloud storage server in the cloud server farm.
Another object of the embodiments of the present invention is to provide a web page collecting device, the device comprising:
A link acquiring unit, for receiving an instruction to collect a web page and obtaining the web page link corresponding to the web page;
A web page content grabbing unit, for calling, according to the web page link, a web page grabbing server in a cloud server farm to grab the web page content corresponding to the web page link; and
A web page content storage unit, for saving the web page content to a cloud storage server in the cloud server farm.
Another object of the embodiments of the present invention is to provide a browser comprising the above web page collecting device.
In the embodiments of the present invention, an instruction to collect a web page is received, the web page link corresponding to the web page is obtained, a web page grabbing server in the cloud server farm is called according to the web page link to grab the corresponding web page content, and the web page content is saved to a cloud storage server in the cloud server farm. This realizes cloud storage of collected web page content and ensures that collected web pages remain valid over the long term, so that collected web page content is not limited by time or access location, and the bookmark function of the browser is extended.
Brief description of the drawings
Fig. 1 is a flow chart of the web page collecting method provided by embodiment one of the present invention;
Fig. 2 is a flow chart of the web page collecting method provided by embodiment two of the present invention;
Fig. 3 is a structural diagram of the web page collecting device provided by embodiment three of the present invention; and
Fig. 4 is a structural diagram of the web page collecting device provided by embodiment four of the present invention.
Detailed description of the embodiments
To make the object, technical solution and advantages of the present invention clearer, the present invention is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
The specific implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment one:
Fig. 1 shows the flow of the web page collecting method provided by embodiment one of the present invention, detailed as follows:
In step S101, an instruction to collect a web page is received, and the web page link corresponding to the web page is obtained.
In the embodiments of the present invention, the instruction to collect a web page contains the corresponding web page link; when the instruction is received, the web page link is obtained from it.
In step S102, a web page grabbing server in the cloud server farm is called according to the web page link to grab the web page content corresponding to the web page link.
In the embodiments of the present invention, to improve the speed at which web page content is grabbed and to balance the load across the web page grabbing servers in the cloud server farm, the call may, by way of example, proceed as follows. First, the load information of the web page grabbing servers in the cloud server farm is obtained. Then, according to that load information and the web page link, a web page grabbing server that meets a preset load condition is called to grab the web page content corresponding to the web page link. The preset load condition may include, for example, the CPU load, memory load and the like being below preset values; among the web page grabbing servers that meet the preset load condition, one with a lighter load is chosen to grab the web page content. This example is not intended to limit the present invention.
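The selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `GrabServer` record, the `pick_grab_server` function, the 0.8 thresholds and the "lighter load" ranking (the smaller of each server's worst load figure) are hypothetical and are not specified by the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GrabServer:
    """Hypothetical load report for one web page grabbing server."""
    name: str
    cpu_load: float      # fraction of CPU in use, 0.0 to 1.0
    memory_load: float   # fraction of memory in use, 0.0 to 1.0


def pick_grab_server(servers: List[GrabServer],
                     cpu_limit: float = 0.8,
                     memory_limit: float = 0.8) -> Optional[GrabServer]:
    """Return a lightly loaded server that meets the preset load condition.

    The preset condition here (both loads below assumed limits) is an
    illustrative choice; None is returned when no server qualifies.
    """
    eligible = [s for s in servers
                if s.cpu_load < cpu_limit and s.memory_load < memory_limit]
    if not eligible:
        return None
    # Rank by the worse of the two load figures and take the lightest.
    return min(eligible, key=lambda s: max(s.cpu_load, s.memory_load))


servers = [
    GrabServer("grab-a", cpu_load=0.9, memory_load=0.4),  # fails the CPU limit
    GrabServer("grab-b", cpu_load=0.3, memory_load=0.5),
    GrabServer("grab-c", cpu_load=0.6, memory_load=0.2),
]
chosen = pick_grab_server(servers)
```

Other rankings (for example a weighted sum of CPU and memory load) would fit the description equally well; the text only requires that the chosen server meet the preset load condition.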
In step S103, the web page content is saved to the cloud storage server in the cloud server farm.
In the embodiments of the present invention, after the web page link is obtained, a web page grabbing server in the cloud server farm is called according to the web page link to grab the corresponding web page content, and the web page content is saved to the cloud storage server in the cloud server farm. This realizes cloud storage of collected web page content and ensures that collected web pages remain valid over the long term, so that collected web page content is not limited by time or access location, and the bookmark function of the browser is extended.
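As a rough end-to-end sketch of steps S101 to S103, assuming an in-memory stand-in for the cloud storage server and a fake grabbing function in place of a real grabbing server (the names `CloudStorage`, `collect_page` and `fake_grab`, and the dictionary-shaped instruction, are all hypothetical):

```python
from typing import Callable, Dict


class CloudStorage:
    """Hypothetical in-memory stand-in for the cloud storage server."""

    def __init__(self) -> None:
        self._pages: Dict[str, bytes] = {}

    def save(self, link: str, content: bytes) -> None:
        self._pages[link] = content

    def load(self, link: str) -> bytes:
        return self._pages[link]


def collect_page(instruction: dict,
                 grab: Callable[[str], bytes],
                 storage: CloudStorage) -> str:
    """Run the three steps for one collect instruction and return the link."""
    link = instruction["link"]    # S101: the instruction carries the link
    content = grab(link)          # S102: a grabbing server fetches the content
    storage.save(link, content)   # S103: the content goes to cloud storage
    return link


def fake_grab(link: str) -> bytes:
    """Stand-in for a grabbing server; a real one would fetch over HTTP."""
    return f"<html>snapshot of {link}</html>".encode("utf-8")


storage = CloudStorage()
saved_link = collect_page({"link": "http://example.com/article"},
                          fake_grab, storage)
```

Passing the grabbing function in as a parameter keeps the sketch independent of how a grabbing server is actually chosen or called.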
Embodiment two:
Fig. 2 shows the flow of the web page collecting method provided by embodiment two of the present invention, detailed as follows:
In step S201, an instruction to collect a web page is received, and the web page link corresponding to the web page is obtained.
In the embodiments of the present invention, the instruction to collect a web page contains the corresponding web page link; when the instruction is received, the web page link is obtained from it.
In step S202, a web page grabbing server in the cloud server farm is called according to the web page link to grab the web page content corresponding to the web page link.
In the embodiments of the present invention, to improve the speed at which web page content is grabbed and to balance the load across the web page grabbing servers in the cloud server farm, the call may, by way of example, proceed as follows. First, the load information of the web page grabbing servers in the cloud server farm is obtained. Then, according to that load information and the web page link, a web page grabbing server that meets a preset condition is called to grab the web page content corresponding to the web page link. The preset condition may include, for example, the CPU load, memory load and the like being below preset values. This example is not intended to limit the present invention.
In step S203, the grabbed web page content is aggregated into a single web page file of a preset format.
In the embodiments of the present invention, web page content comprises multiple elements, such as images, text and audio. To improve the efficiency of storing and transferring the web page content, the grabbed content can be aggregated into a single web page file of a preset format before it is saved to the cloud storage server in the cloud server farm. The preset format may be the mht format, or a file generated according to Request for Comments (RFC) 2557.
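One possible way to aggregate a page and its resources into a single file is an RFC 2557 style `multipart/related` document, sketched below with Python's standard `email` package. The function name `aggregate_to_mht`, its parameters and the sample data are assumptions for illustration; the patent does not describe how the file is built.

```python
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Dict, Tuple


def aggregate_to_mht(page_url: str, html: str,
                     images: Dict[str, Tuple[str, bytes]]) -> bytes:
    """Bundle an HTML page and its images into one multipart/related file.

    `images` maps each resource URL to (image subtype, raw bytes). Each
    part carries a Content-Location header so that references inside the
    HTML can be resolved against the bundled parts.
    """
    root = MIMEMultipart("related", type="text/html")
    root["Content-Location"] = page_url

    html_part = MIMEText(html, "html", "utf-8")
    html_part["Content-Location"] = page_url
    root.attach(html_part)

    for url, (subtype, data) in images.items():
        image_part = MIMEImage(data, _subtype=subtype)
        image_part["Content-Location"] = url
        root.attach(image_part)

    return root.as_bytes()


# Placeholder bytes stand in for a real PNG file.
logo_bytes = b"\x89PNG placeholder bytes"
mht_bytes = aggregate_to_mht(
    "http://example.com/a.html",
    "<html><body><img src='http://example.com/logo.png'></body></html>",
    {"http://example.com/logo.png": ("png", logo_bytes)},
)

# Parse the file back to confirm the parts survived the round trip.
parsed = message_from_bytes(mht_bytes)
parts = parsed.get_payload()
```

Audio or other resource types could be attached in the same way with the matching MIME classes; only images are shown to keep the sketch short.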
In step S204, the web page file is saved to the cloud storage server in the cloud server farm.
In step S205, the feature information and the web page link are saved to a preset database.
In the embodiments of the present invention, to support quick retrieval and classification of collected web pages, feature information corresponding to the web page content is received before the web page content is saved to the cloud storage server in the cloud server farm. The feature information is the feature value corresponding to a preset feature item, and a feature item defines a feature of the collected web page; for example, the feature items may include the topic, purpose and field of the web page content. The feature information can be extracted from the web page content in a preset way, for example by keyword extraction, or it can be defined by the user.
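The patent names keyword extraction as one way to obtain feature information but does not specify an algorithm. A minimal sketch, assuming a plain word-frequency count over English text with a small hypothetical stop-word list:

```python
import re
from collections import Counter
from typing import List

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "and", "for", "that", "with", "this"}


def extract_keywords(text: str, top_n: int = 3) -> List[str]:
    """Return the most frequent words, ties broken alphabetically."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words
                     if len(w) > 2 and w not in STOP_WORDS)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


page_text = ("Cloud storage keeps a bookmark alive. A bookmark stored in "
             "cloud storage survives reinstalling; the bookmark stays valid.")
keywords = extract_keywords(page_text)

# Feature information: extracted keywords plus user-defined feature values.
feature_info = {"topic": keywords[0], "purpose": "reference",
                "keywords": keywords}
```

Chinese-language pages would need a word segmenter in place of the regular expression; that step is outside this sketch.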
In the embodiments of the present invention, the feature information is also associated with the corresponding web page link and saved to the preset database, which makes web page collecting tasks easier to manage and speeds up retrieval of collected web page content. In this way, when an instruction from the user to collect a web page is received, the database is searched; if the same web page link is found, the requested web page is not collected again, and the user is prompted that this web page has already been collected.
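The association and duplicate check described above might look like the following sketch, which uses an in-memory SQLite database. The table layout, the column names and the functions `collect` and `find_by_topic` are illustrative assumptions; the patent only refers to a "preset database".

```python
import sqlite3
from typing import List


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with one row per collected link."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE collected ("
        "link TEXT PRIMARY KEY, topic TEXT, purpose TEXT, field TEXT)"
    )
    return conn


def collect(conn: sqlite3.Connection, link: str,
            topic: str, purpose: str, field: str) -> bool:
    """Save the link with its feature values.

    Returns False without changing anything when the link is already in
    the database, which is where the user would be prompted that the
    page has already been collected.
    """
    row = conn.execute(
        "SELECT 1 FROM collected WHERE link = ?", (link,)
    ).fetchone()
    if row is not None:
        return False
    conn.execute(
        "INSERT INTO collected VALUES (?, ?, ?, ?)",
        (link, topic, purpose, field),
    )
    conn.commit()
    return True


def find_by_topic(conn: sqlite3.Connection, topic: str) -> List[str]:
    """Retrieve collected links by one feature item."""
    rows = conn.execute(
        "SELECT link FROM collected WHERE topic = ? ORDER BY link", (topic,)
    )
    return [r[0] for r in rows]


conn = open_db()
first = collect(conn, "http://example.com/a", "cloud", "reference", "computing")
again = collect(conn, "http://example.com/a", "cloud", "reference", "computing")
links = find_by_topic(conn, "cloud")
```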
In step S206, the web page links in the preset database are checked at a preset time interval.
In step S207, when it is detected that the web page corresponding to a web page link has been updated, prompt information is output asking whether to collect the updated web page content.
In the embodiments of the present invention, a time interval is preset and the web page links in the preset database are checked at that interval to determine whether the web page corresponding to each saved link has been updated since it was collected. When an update is detected, prompt information is output asking whether to collect the updated web page content, which provides automatic monitoring of collected web pages.
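The patent does not say how an update is detected. One simple assumption is to store a digest of the content at collection time and compare it with a digest of the live page on each check, as in this sketch (the names `digest`, `find_updated_links` and `fake_fetch` are hypothetical, and the periodic scheduling itself is left out):

```python
import hashlib
from typing import Callable, Dict, List


def digest(content: bytes) -> str:
    """Fingerprint of page content, stored when the page is collected."""
    return hashlib.sha256(content).hexdigest()


def find_updated_links(saved_digests: Dict[str, str],
                       fetch: Callable[[str], bytes]) -> List[str]:
    """Return the links whose live content differs from the saved digest.

    A scheduler would call this once per preset time interval and prompt
    the user for each returned link.
    """
    updated = []
    for link, old_digest in saved_digests.items():
        if digest(fetch(link)) != old_digest:
            updated.append(link)
    return updated


# Digests recorded at collection time.
saved = {
    "http://example.com/a": digest(b"<html>version 1</html>"),
    "http://example.com/b": digest(b"<html>stable</html>"),
}

# Fake live pages: page "a" has changed since it was collected.
live_pages = {
    "http://example.com/a": b"<html>version 2</html>",
    "http://example.com/b": b"<html>stable</html>",
}


def fake_fetch(link: str) -> bytes:
    return live_pages[link]


updated_links = find_updated_links(saved, fake_fetch)
```

A byte-level digest flags any change, including rotating advertisements or timestamps, so a practical check would probably compare only the main content of the page.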
In the embodiments of the present invention, if the terminal user requests to collect web page content that has already been downloaded to the local computer system through the browser, a web page content upload command can be sent to upload that content, so that the web page content is saved to the cloud storage server in the cloud server farm.
In the embodiments of the present invention, further, after the web page content has been saved to the cloud storage server in the cloud server farm, if the terminal user clicks a collected web page link, the web page content corresponding to the collected web page is downloaded directly from the cloud storage server in the cloud server farm. This ensures that the collected web page link stays valid permanently, so that access to the collected web page content is not limited by access time or access location.
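The upload and later retrieval described in the two paragraphs above can be sketched as follows, with a plain dictionary standing in for the cloud storage server. The function names are hypothetical, and a real implementation would transfer the content over the network.

```python
from typing import Dict, Optional


def upload_local_page(storage: Dict[str, bytes], link: str,
                      local_content: bytes) -> None:
    """Handle an upload command: save locally downloaded content to the cloud."""
    storage[link] = local_content


def open_collected_page(storage: Dict[str, bytes],
                        link: str) -> Optional[bytes]:
    """Serve a clicked link from the cloud copy, not from the original site.

    Returns None when nothing has been collected under this link.
    """
    return storage.get(link)


cloud_storage: Dict[str, bytes] = {}
upload_local_page(cloud_storage, "http://example.com/old-article",
                  b"<html>content as collected</html>")

# The saved copy is returned even if the original site has since moved.
snapshot = open_collected_page(cloud_storage, "http://example.com/old-article")
missing = open_collected_page(cloud_storage, "http://example.com/never-collected")
```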
One of ordinary skill in the art will appreciate that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc.
Embodiment three:
Fig. 3 shows the structure of the web page collecting device provided by embodiment three of the present invention. For convenience of explanation, only the parts relevant to the embodiment are shown, comprising:
A link acquiring unit 31, for receiving an instruction to collect a web page and obtaining the web page link corresponding to the web page.
A web page content grabbing unit 32, for calling, according to the web page link, a web page grabbing server in the cloud server farm to grab the web page content corresponding to the web page link.
A web page content storage unit 33, for saving the web page content to the cloud storage server in the cloud server farm.
In the embodiments of the present invention, to improve the speed at which web page content is grabbed and to balance the load across the web page grabbing servers in the cloud server farm, the call may, by way of example, proceed as follows. First, the load information of the web page grabbing servers in the cloud server farm is obtained. Then, according to that load information and the web page link, a web page grabbing server that meets a preset condition is called to grab the web page content corresponding to the web page link. The preset condition may include, for example, the CPU load, memory load and the like being below preset values. This example is not intended to limit the present invention. Accordingly, the web page content grabbing unit 32 may comprise:
A load information acquiring subunit 321, for obtaining the load information of the web page grabbing servers in the cloud server farm; and
A web page content grabbing subunit 322, for calling, according to the load information and the web page link, a web page grabbing server that meets the preset condition to grab the web page content corresponding to the web page link.
In the embodiments of the present invention, after the web page link is obtained, a web page grabbing server in the cloud server farm is called according to the web page link to grab the corresponding web page content, and the web page content is saved to the cloud storage server in the cloud server farm. This realizes cloud storage of collected web page content and ensures that collected web pages remain valid over the long term, so that collected web page content is not limited by time or access location, and the bookmark function of the browser is extended.
Embodiment four:
Fig. 4 shows the structure of the web page collecting device provided by embodiment four of the present invention. For convenience of explanation, only the parts relevant to the embodiment are shown, comprising:
A link acquiring unit 41, for receiving an instruction to collect a web page and obtaining the web page link corresponding to the web page.
A web page content grabbing unit 42, for calling, according to the web page link, a web page grabbing server in the cloud server farm to grab the web page content corresponding to the web page link.
A content aggregating unit 43, for aggregating the grabbed web page content into a single web page file of a preset format.
In the embodiments of the present invention, web page content comprises multiple elements, such as images, text and audio. To make the web page content easier to store and transfer, the grabbed content can be aggregated into a single web page file of a preset format before it is saved to the cloud storage server in the cloud server farm. The preset format may be the mht format, or a file generated according to Request for Comments (RFC) 2557.
A web page content storage unit 44, for saving the web page content to the cloud storage server in the cloud server farm.
In the embodiments of the present invention, when the content aggregating unit 43 aggregates the grabbed web page content into a single web page file of a preset format, the web page content storage unit 44 may comprise a web page content saving subunit 441, for saving the web page file to the cloud storage server in the cloud server farm.
A link detecting unit 45, for checking the web page links in the preset database at a preset time interval.
An update prompt output unit 46, for outputting, when it is detected that the web page corresponding to a web page link has been updated, prompt information asking whether to collect the updated web page content.
In the embodiments of the present invention, a time interval can also be preset, and the web page links in the preset database are checked at that interval to determine whether the web page corresponding to each saved link has been updated since it was collected. When an update is detected, prompt information is output asking whether to collect the updated web page content, which provides automatic monitoring of collected web pages.
A feature information receiving unit 47, for receiving the feature information corresponding to the web page content.
In the embodiments of the present invention, to support quick retrieval and classification of collected web pages, feature information corresponding to the web page content is received before the web page content is saved to the cloud storage server in the cloud server farm. The feature information is the feature value corresponding to a preset feature item, and a feature item defines a feature of the collected web page; for example, the feature items may include the topic, purpose and field of the web page content. The feature information can be extracted from the web page content in a preset way, for example by keyword extraction, or it can be defined by the user.
A data storage unit 48, for saving the feature information and the web page link to the preset database.
In the embodiments of the present invention, the feature information is also associated with the corresponding web page link and saved to the preset database, which makes web page collecting tasks easier to manage and speeds up retrieval of collected web page content. In this way, when an instruction from the user to collect a web page is received, the database is searched; if the same web page link is found, the requested web page is not collected again, and the user is prompted that this web page has already been collected.
A web page uploading unit 49, for receiving a web page content upload command and saving the web page content to the cloud storage server in the cloud server farm.
In the embodiments of the present invention, if the terminal user requests to collect web page content that has already been downloaded to the local computer system through the browser, a web page content upload command can be sent to upload that content, so that the web page content is saved to the cloud storage server in the cloud server farm.
The embodiments of the present invention also provide a browser comprising the web page collecting device described in embodiment three or four, so that the terminal user can use cloud collection of web pages through the browser.
In the embodiments of the present invention, after the web page link is obtained, a web page grabbing server in the cloud server farm is called according to the web page link to grab the corresponding web page content, and the web page content is saved to the cloud storage server in the cloud server farm. This realizes cloud storage of collected web page content and ensures that collected web pages remain valid over the long term, so that collected web page content is not limited by time or access location, and the bookmark function of the browser is extended. In addition, if the terminal user requests to collect web page content that has already been downloaded to the local computer system through the browser, a web page content upload command can be sent to upload that content to the cloud storage server in the cloud server farm, which realizes cloud collection of locally accessed web page content.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (15)

1. A web page collecting method, characterized in that the method comprises the following steps:
Receiving an instruction to collect a web page, and obtaining the web page link corresponding to the web page;
Calling, according to the web page link, a web page grabbing server in a cloud server farm to grab the web page content corresponding to the web page link;
Saving the web page content to a cloud storage server in the cloud server farm.
2. The method as claimed in claim 1, characterized in that, before the step of saving the web page content to the cloud storage server in the cloud server farm, the method further comprises:
Receiving feature information corresponding to the web page content, the feature information being the feature value corresponding to a preset feature item.
3. The method as claimed in claim 2, characterized in that the method further comprises:
Saving the feature information and the web page link to a preset database.
4. The method as claimed in claim 3, characterized in that the method further comprises:
Checking the web page links in the preset database at a preset time interval;
When the web page corresponding to a checked web page link has been updated, outputting prompt information asking whether to collect the updated web page content.
5. The method as claimed in claim 1, characterized in that the step of calling, according to the web page link, a web page grabbing server in the cloud server farm to grab the web page content corresponding to the web page link comprises:
Obtaining the load information of the web page grabbing servers in the cloud server farm;
Calling, according to the load information and the web page link, a web page grabbing server that meets a preset load condition to grab the web page content corresponding to the web page link.
6. The method as claimed in claim 1, characterized in that, after the step of calling, according to the web page link, a web page grabbing server in the cloud server farm to grab the web page content corresponding to the web page link, and before the step of saving the web page content to the cloud storage server in the cloud server farm, the method further comprises:
Aggregating the grabbed web page content into a web page file of a preset format;
The step of saving the web page content to the cloud storage server in the cloud server farm being specifically:
Saving the web page file to the cloud storage server in the cloud server farm.
7. The method as claimed in claim 1, characterized in that the method further comprises:
Receiving a web page content upload command, and saving the web page content to the cloud storage server in the cloud server farm.
8. A web page collecting device, characterized in that the device comprises:
A link acquiring unit, for receiving an instruction to collect a web page and obtaining the web page link corresponding to the web page;
A web page content grabbing unit, for calling, according to the web page link, a web page grabbing server in a cloud server farm to grab the web page content corresponding to the web page link; and
A web page content storage unit, for saving the web page content to a cloud storage server in the cloud server farm.
9. The device as claimed in claim 8, characterized in that the device further comprises:
A feature information receiving unit, for receiving feature information corresponding to the web page content, the feature information being the feature value corresponding to a preset feature item.
10. The device as claimed in claim 9, characterized in that the device further comprises:
A data storage unit, for saving the feature information and the web page link to a preset database.
11. The device as claimed in claim 10, characterized in that the device further comprises:
A link detecting unit, for checking the web page links in the preset database at a preset time interval; and
An update prompt output unit, for outputting, when the web page corresponding to a checked web page link has been updated, prompt information asking whether to collect the updated web page content.
12. The device as claimed in claim 8, characterized in that the web page content grabbing unit comprises:
A load information acquiring subunit, for obtaining the load information of the web page grabbing servers in the cloud server farm; and
A web page content grabbing subunit, for calling, according to the load information and the web page link, a web page grabbing server that meets a preset load condition to grab the web page content corresponding to the web page link.
13. The device as claimed in claim 8, characterized in that the device further comprises:
A content aggregating unit, for aggregating the grabbed web page content into a web page file of a preset format;
The web page content storage unit comprising:
A web page content saving subunit, for saving the web page file to the cloud storage server in the cloud server farm.
14. The device as claimed in claim 8, characterized in that the device further comprises:
A web page uploading unit, for receiving a web page content upload command and saving the web page content to the cloud storage server in the cloud server farm.
15. A browser, characterized in that the browser comprises the web page collecting device as claimed in any one of claims 8 to 14.
CN201210278582.0A 2012-08-07 2012-08-07 A kind of webpage collection method, device and browser Active CN103577482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210278582.0A CN103577482B (en) 2012-08-07 2012-08-07 A kind of webpage collection method, device and browser

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210278582.0A CN103577482B (en) 2012-08-07 2012-08-07 A kind of webpage collection method, device and browser

Publications (2)

Publication Number Publication Date
CN103577482A true CN103577482A (en) 2014-02-12
CN103577482B CN103577482B (en) 2017-12-15

Family

ID=50049280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210278582.0A Active CN103577482B (en) 2012-08-07 2012-08-07 A kind of webpage collection method, device and browser

Country Status (1)

Country Link
CN (1) CN103577482B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156397A (en) * 2014-07-16 2014-11-19 百度在线网络技术(北京)有限公司 Method and device for collecting pages
CN104536993A (en) * 2014-12-10 2015-04-22 北京奇虎科技有限公司 Collected webpage processing method, collected webpage processing device and client-side
CN105468270A (en) * 2014-08-18 2016-04-06 腾讯科技(深圳)有限公司 Terminal application control method and device
CN107193976A (en) * 2017-05-25 2017-09-22 北京小米移动软件有限公司 Information resources display methods, device and computer-readable recording medium
CN107203630A (en) * 2017-05-31 2017-09-26 北京安云世纪科技有限公司 Application program page collecting method, device and corresponding mobile terminal
CN107229527A (en) * 2017-05-25 2017-10-03 北京小米移动软件有限公司 Information resources collecting method, device and computer-readable recording medium
CN107832422A (en) * 2015-02-12 2018-03-23 广东欧珀移动通信有限公司 Data collection method and device of favorites
CN108959446A (en) * 2018-06-13 2018-12-07 佛山市车品匠汽车用品有限公司 Web browsing method and system for a mobile terminal
CN110213360A (en) * 2019-05-24 2019-09-06 维沃移动通信有限公司 Content storage method and terminal device thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100114914A1 (en) * 2008-10-30 2010-05-06 International Business Machines Corporation Selective Home Page Manager
CN101887421A (en) * 2009-05-13 2010-11-17 北京博越世纪科技有限公司 Technology for converting unformatted data into formatted data in web website
CN102484653A (en) * 2009-08-31 2012-05-30 思科技术公司 Measuring attributes of client-server applications
CN101674329A (en) * 2009-09-27 2010-03-17 卓望数码技术(深圳)有限公司 Internet access method and Internet access system
CN102624910A (en) * 2012-03-15 2012-08-01 华为技术有限公司 Method, device and system for processing webpage content selected by user

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
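安伦: "Research and Implementation of Dynamic Balancing of Computing Resources in Online Web Mining Based on a Cloud Platform", China Master's Theses Full-text Database, Information Science and Technology *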

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156397A (en) * 2014-07-16 2014-11-19 百度在线网络技术(北京)有限公司 Method and device for collecting pages
CN105468270A (en) * 2014-08-18 2016-04-06 腾讯科技(深圳)有限公司 Terminal application control method and device
CN104536993B (en) * 2014-12-10 2018-03-20 北京奇虎科技有限公司 Collect the processing method of webpage, collect the processing unit and client of webpage
CN104536993A (en) * 2014-12-10 2015-04-22 北京奇虎科技有限公司 Collected webpage processing method, collected webpage processing device and client-side
CN107832422B (en) * 2015-02-12 2022-01-28 Oppo广东移动通信有限公司 Data collection method and device of favorites
CN107832422A (en) * 2015-02-12 2018-03-23 广东欧珀移动通信有限公司 Data collection method and device of favorites
CN107193976A (en) * 2017-05-25 2017-09-22 北京小米移动软件有限公司 Information resources display methods, device and computer-readable recording medium
CN107229527A (en) * 2017-05-25 2017-10-03 北京小米移动软件有限公司 Information resources collecting method, device and computer-readable recording medium
US11216525B2 (en) 2017-05-25 2022-01-04 Beijing Xiaomi Mobile Software Co., Ltd. Information resource collection method, device, and computer-readable storage medium
CN107193976B (en) * 2017-05-25 2024-03-29 北京小米移动软件有限公司 Information resource display method, device and computer readable storage medium
CN107203630B (en) * 2017-05-31 2020-11-24 北京安云世纪科技有限公司 Application page collection method and device and corresponding mobile terminal
CN107203630A (en) * 2017-05-31 2017-09-26 北京安云世纪科技有限公司 Application program page collecting method, device and corresponding mobile terminal
CN108959446A (en) * 2018-06-13 2018-12-07 佛山市车品匠汽车用品有限公司 Web browsing method and system for a mobile terminal
CN110213360A (en) * 2019-05-24 2019-09-06 维沃移动通信有限公司 Content storage method and terminal device thereof
CN110213360B (en) * 2019-05-24 2021-06-15 维沃移动通信有限公司 Content storage method and terminal equipment thereof

Also Published As

Publication number Publication date
CN103577482B (en) 2017-12-15

Similar Documents

Publication Publication Date Title
CN103577482A (en) Web page collecting method and device as well as browser
US10652265B2 (en) Method and apparatus for network forensics compression and storage
CN102930059B (en) Method for designing focused crawler
WO2015196907A1 (en) Search pushing method and device which mine user requirements
CN102761627B (en) Cloud-based web address recommendation method, system and related device based on terminal access statistics
CN103970788A (en) Webpage-crawling-based crawler technology
CN102436513B (en) Distributed search method and system
CN108847977A (en) Service data monitoring method, storage medium and server
CN105512201A (en) Data collection and processing method and device
CN102752288A (en) Method and device for identifying network access action
CN102521251A (en) Method for directly realizing personalized search, device for realizing method, and search server
CN110083391A (en) Call request monitoring method, device, equipment and storage medium
CN106445944A (en) Data query request processing method and apparatus, and electronic device
CN104516982A (en) Method and system for extracting Web information based on Nutch
CN102780726A (en) Log analysis method and log analysis system based on WEB platform
CN102957571A (en) Method and system for monitoring network flows
CN103179164A (en) Method and communication terminal of storing page information
CN104301304A (en) Vulnerability detection system based on large ISP interconnection port and method thereof
CN105302801A (en) Resource caching method and apparatus
CN105550179B (en) Webpage collection method and browser plug-in
CN105893636A (en) Historical sharing recording method and device
CN104035943B (en) Method for storing data and corresponding server
CN103455597A (en) Distributed information hiding detection method facing mass web images
CN102902784B (en) Web page classification storage system and method
CN109213824B (en) Data capture system, method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant