CN104391978B - Web page storage processing method and processing device for browser - Google Patents

Web page storage processing method and processing device for browser

Info

Publication number
CN104391978B
Authority
CN
China
Prior art keywords
browser
webpage
collection
text
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410742954.XA
Other languages
Chinese (zh)
Other versions
CN104391978A (en)
Inventor
伯诺克
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201410742954.XA
Publication of CN104391978A
Application granted
Publication of CN104391978B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/955: Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Abstract

The invention discloses a web page storage processing method and processing device for a browser. The method includes: receiving a search keyword, where the search keyword is used to find the webpage to be browsed among the browser's collected webpages; matching the search keyword against the browser's collected webpages to obtain the address of the matching collected webpage; and outputting the address of the matching collected webpage. The invention addresses the low efficiency of finding a target webpage among a browser's collected webpages, and thereby makes that lookup more efficient.

Description

Web page storage processing method and processing device for browser
Technical field
The present invention relates to the Internet field, and in particular to a web page storage processing method and device for a browser.
Background art
Existing browsers provide a function for collecting (bookmarking) webpages. The browser's collection folder records the URL and the title of each webpage the user has saved. When the user wants to revisit a collected webpage, the user can find it by its address or title in the collection. Although this lets the user find collected webpages, when the collection holds many records the user can only identify the needed webpage by the titles in the collection. A webpage's title, however, often does not represent its content, and the keywords of the content the user cares about may not appear in the title, so it is difficult for the user to quickly find the needed webpage among a large number of collected webpages.
No effective solution has yet been proposed for the low efficiency, in the related art, of finding a target webpage among a browser's collected webpages.
Summary of the invention
The main object of the present invention is to provide a web page storage processing method and device for a browser, so as to solve the problem that finding a target webpage among a browser's collected webpages is inefficient.
To achieve this object, according to one aspect of the present invention, a web page storage processing method for a browser is provided.
The web page storage processing method for a browser according to the present invention includes: receiving a search keyword, where the search keyword is used to find the webpage to be browsed among the browser's collected webpages; matching the search keyword against the browser's collected webpages to obtain the address of the matching collected webpage; and outputting the address of the matching collected webpage.
Further, matching the search keyword against the browser's collected webpages includes: obtaining the title and text content of a collected webpage; and matching the title and text content of the collected webpage against the search keyword, where, if the title and text content of the collected webpage match the search keyword, it is determined that the search keyword matches the collected webpage, and if they do not match the search keyword, it is determined that the search keyword does not match the collected webpage.
Further, before the title and text content of the collected webpage are matched against the search keyword, the method further includes: obtaining the text content of the collected webpage; obtaining the URL and title of the collected webpage; and storing the text content, URL, and title of the collected webpage.
Further, obtaining the text content of the collected webpage includes: obtaining the address of the collected webpage; accessing the collected webpage according to its address; and crawling text content from the collected webpage while accessing it, to obtain the text content of the collected webpage.
Further, crawling text content from the collected webpage while accessing it, to obtain the text content of the collected webpage, includes: filtering out the Hypertext Markup Language tags of the collected webpage; and crawling text content from the collected webpage whose Hypertext Markup Language tags have been filtered out, to obtain the text content of the collected webpage.
Further, after text content is crawled from the collected webpage while accessing it, the method further includes: extracting keywords from the text content of the collected webpage to obtain the keywords of the collected webpage; and storing the keywords, URL, and title of the collected webpage. In this case, matching the title and text content of the collected webpage against the search keyword includes: matching the keywords and title of the collected webpage against the search keyword.
To achieve the above object, according to another aspect of the present invention, a web page storage processing device for a browser is provided.
The web page storage processing device for a browser according to the present invention includes: a receiving unit, configured to receive a search keyword, where the search keyword is used to find the webpage to be browsed among the browser's collected webpages; a matching unit, configured to match the search keyword against the browser's collected webpages to obtain the address of the matching collected webpage; and an output unit, configured to output the address of the matching collected webpage.
Further, the matching unit includes: a first acquisition module, configured to obtain the title and text content of a collected webpage; and a matching module, configured to match the title and text content of the collected webpage against the search keyword, where, if the title and text content of the collected webpage match the search keyword, it is determined that the search keyword matches the collected webpage, and if they do not match the search keyword, it is determined that the search keyword does not match the collected webpage.
Further, the device also includes: a first acquisition unit, configured to obtain the text content of the collected webpage; a second acquisition unit, configured to obtain the URL and title of the collected webpage; and a storage unit, configured to store the text content, URL, and title of the collected webpage.
Further, the first acquisition unit includes: a second acquisition module, configured to obtain the address of the collected webpage; an access module, configured to access the collected webpage according to its address; and a crawling module, configured to crawl text content from the collected webpage while it is being accessed, to obtain the text content of the collected webpage.
With the present invention, the collected webpage to be accessed is found among the browser's collected webpages by retrieval. This solves the problem that finding a target webpage among a browser's collected webpages is inefficient, and thereby makes that lookup more efficient.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the present invention. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the web page storage processing method for a browser according to an embodiment of the present invention; and
Fig. 2 is a schematic diagram of the web page storage processing device for a browser according to an embodiment of the present invention.
Detailed description
It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments can be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application, without creative work, shall fall within the scope of protection of this application.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of this application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data used in this way can be interchanged where appropriate, so that the embodiments of this application described here can be implemented. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
An embodiment of the present invention provides a web page storage processing method for a browser. Fig. 1 is a flowchart of the web page storage processing method for a browser according to an embodiment of the present invention.
As shown in Fig. 1, the method includes the following steps S102 to S106:
Step S102: Receive a search keyword, where the search keyword is used to find the webpage to be browsed among the browser's collected webpages.
The search keyword can be any keyword used to find the webpage to be browsed among the browser's collected webpages, and it can be a single keyword or several keywords. Specifically, a search box can be placed in the area of the browser that shows the collected webpages, and the search keyword entered by the user is received through that search box.
Step S104: Match the search keyword against the browser's collected webpages to obtain the address of the matching collected webpage.
The browser's collected webpages are usually kept in the browser's collection, and the collection of an existing browser stores the address and the title of each collected webpage. Matching the search keyword against the collected webpages can mean matching the search keyword against the titles of the collected webpages: if the search keyword appears in the title of a collected webpage, that webpage is related to the webpage the user wants to access. The collected webpages that match the search keyword are recorded. Preferably, to make it more accurate to find the needed collected webpage by search keyword, matching the search keyword against the browser's collected webpages includes: obtaining the title and text content of a collected webpage; and matching the title and text content of the collected webpage against the search keyword, where, if the title and text content of the collected webpage match the search keyword, it is determined that the search keyword matches the collected webpage, and if they do not match the search keyword, it is determined that the search keyword does not match the collected webpage.
The content of a collected webpage can be obtained by accessing the webpage, or the text content of each collected webpage can be stored in advance in a local database or another storage area and then read from there. The text content of a collected webpage can be its full text, or keywords extracted from its full text. The title of a collected webpage sometimes does not represent its content, and the keywords of the content the user cares about may not appear in the title. In that case, matching the search keyword only against the titles may fail to retrieve the needed collected webpage, and the user may still fail to retrieve it after several searches with different keywords. Matching both the title and the text content of the collected webpage against the search keyword avoids this problem. Specifically, the title of the collected webpage can be matched against the search keyword first. If the title matches, the content of the collected webpage no longer needs to be matched; if the title does not match, the content of the collected webpage is then matched against the search keyword. This raises the probability that a collected webpage matches the search keyword and further improves the accuracy of finding the needed collected webpage by search keyword.
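The title-first matching order described above can be sketched as follows. This is only an illustrative sketch: the `Bookmark` class, the function names, and the choice of case-insensitive substring matching are assumptions, since the description does not fix a particular matching rule.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    """Hypothetical record for one collected webpage."""
    url: str
    title: str
    text: str  # plain-text content previously obtained from the page


def matches(bookmark: Bookmark, keyword: str) -> bool:
    """Check the title first; consult the text content only if the title misses."""
    needle = keyword.casefold()
    if needle in bookmark.title.casefold():
        return True  # title matched, so the content is not examined
    return needle in bookmark.text.casefold()


def search(bookmarks: list[Bookmark], keyword: str) -> list[str]:
    """Return the addresses of all collected webpages that match the keyword."""
    return [b.url for b in bookmarks if matches(b, keyword)]
```

Checking the short title before the longer text keeps the common case cheap, which is the motivation the description gives for this ordering.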
Preferably, to make obtaining the title and text content of the collected webpages more efficient, before the title and text content of a collected webpage are matched against the search keyword, the method further includes: obtaining the text content of the collected webpage; obtaining the URL and title of the collected webpage; and storing the text content, URL, and title of the collected webpage.
Before the browser's collected webpages are searched, the text content, URL, and title of each collected webpage are obtained in advance and stored in a local storage area, such as a local database. Specifically, when the text content, URL, and title are stored, they can be associated with one another; that is, a correspondence is established among the text content, URL, and title that belong to the same collected webpage. With this approach, when the user searches the collected webpages, the text content and title of each collected webpage can be obtained quickly and matched against the search keyword, and when a collected webpage matches the search keyword its address can be obtained quickly, which improves retrieval efficiency.
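One way to keep the text content, URL, and title of each collected webpage associated in a local database is a single table keyed by URL. The sketch below uses Python's standard `sqlite3` module; the table name `collected_pages`, the column names, and the helper functions are assumptions made for illustration, not part of the described method.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local store; one row ties URL, title, and text together."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collected_pages ("
        " url TEXT PRIMARY KEY, title TEXT NOT NULL, text TEXT NOT NULL)"
    )
    return conn


def save_page(conn: sqlite3.Connection, url: str, title: str, text: str) -> None:
    """Insert the page, or refresh its title and text if the URL is already stored."""
    conn.execute(
        "INSERT OR REPLACE INTO collected_pages (url, title, text) VALUES (?, ?, ?)",
        (url, title, text),
    )
    conn.commit()


def find_urls(conn: sqlite3.Connection, keyword: str) -> list[str]:
    """Return the URLs whose title or text contains the keyword."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT url FROM collected_pages"
        " WHERE title LIKE ? OR text LIKE ? ORDER BY url",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

Note that SQLite's `LIKE` is case-insensitive for ASCII letters by default and treats `%` and `_` in the keyword as wildcards, so a real implementation would probably escape user input.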
Optionally, obtaining the text content of a collected webpage includes: obtaining the address of the collected webpage; accessing the collected webpage according to its address; and crawling text content from the collected webpage while accessing it, to obtain the text content of the collected webpage.
The URL and title of each collected webpage are already stored in the browser's collection. Specifically, the address of a collected webpage, that is, its Uniform Resource Locator (URL), can be obtained by calling the application programming interface (API) that the browser provides for obtaining the addresses of collected webpages. The collected webpage can be accessed through its address, and text content is crawled from it during that access to obtain the text content of the collected webpage. Specifically, a web crawler can crawl the text content from the collected webpage. A web crawler is a program or script that automatically crawls information on the network according to set rules. For example, a web crawler can be set to crawl only the text content of a webpage, or only the pictures on a webpage, and so on. In the embodiments of the present invention, the web crawler crawls only the text content of the collected webpage. Preferably, to make crawling the text content of a collected webpage more efficient, crawling text content from the collected webpage while accessing it, to obtain the text content of the collected webpage, includes: filtering out the Hypertext Markup Language tags of the collected webpage; and crawling text content from the collected webpage whose Hypertext Markup Language tags have been filtered out, to obtain the text content of the collected webpage.
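As a rough illustration of keeping only the text of a fetched page, the sketch below discards markup with Python's standard `html.parser` module. Using a parser is a swapped-in technique chosen for this example, not the approach named in the description; fetching the page over the network is left out; and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text and skip the bodies of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def extract_text(html: str) -> str:
    """Return the text content of an HTML document given as a string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The HTML string passed to `extract_text` would be the server's response for the collected webpage's address.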
A Hypertext Markup Language (HTML) tag is the smallest unit of HTML. HTML tags set the display format of a webpage; for example, they set where the title, keywords, and content of the webpage are displayed. Specifically, after access to the webpage is requested from the server using the address of the collected webpage, the content returned by the server can be matched against a preset regular expression to filter out the HTML tags of the collected webpage. A regular expression uses a single string to describe and match a set of strings that satisfy a given syntactic rule. For example, a regular expression for matching a Chinese postal code is "[1-9]\d{5}(?!\d)". If the string to be matched is "Chinabeijing100081haidian", this regular expression quickly matches the characters "100081", which represent the postal code, and the other characters are filtered out.
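The postal-code example can be reproduced with Python's `re` module, assuming the intended pattern is `[1-9]\d{5}(?!\d)` (six digits, the first non-zero, not followed by another digit). The tag-stripping pattern `<[^>]+>` is an additional assumption added for illustration; it is a naive filter that does not handle comments or script bodies.

```python
import re

# Six digits starting with 1-9 and not followed by a further digit.
POSTCODE = re.compile(r"[1-9]\d{5}(?!\d)")

# Naive tag pattern (assumption): anything between '<' and the next '>'.
TAG = re.compile(r"<[^>]+>")


def find_postcode(s: str):
    """Return the first postal code found in s, or None if there is none."""
    m = POSTCODE.search(s)
    return m.group(0) if m else None


def strip_tags(html: str) -> str:
    """Replace tags with spaces, then collapse runs of whitespace."""
    return re.sub(r"\s+", " ", TAG.sub(" ", html)).strip()
```

For the string used in the description, `find_postcode("Chinabeijing100081haidian")` returns `"100081"`.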
Preferably, after text content is crawled from the collected webpage while accessing it, the method further includes: extracting keywords from the text content of the collected webpage to obtain the keywords of the collected webpage; and storing the keywords, URL, and title of the collected webpage. In this case, matching the title and text content of the collected webpage against the search keyword includes: matching the keywords and title of the collected webpage against the search keyword.
The keywords of a collected webpage can be words that occur many times in its text content, or words positioned near the front of the text content, such as a summary of the text content. The embodiment of the present invention is described using the words that occur many times in the text content as the keywords of the collected webpage. After the text content of the collected webpage is obtained, it can be segmented, that is, divided into individual words. Words without substantive meaning, such as stop words (modal particles, conjunctions, and the like), can be filtered out first. The words that remain after filtering form a word set. The words that repeat in the word set and the number of times each one occurs are counted, and if the number of occurrences of a repeated word is greater than a preset threshold, that word is taken as a keyword of the collected webpage. After the keywords of the collected webpage are obtained, a correspondence among the keywords, URL, and title of the collected webpage can likewise be established when they are stored. Because the text content of a collected webpage may be long, matching the search keyword against the full text content is time-consuming. It may also produce too many false matches, in which the collected webpage that matches the search keyword is not the one the user wants to access. Extracting keywords from the text content of the collected webpage and matching them against the search keyword improves both the efficiency of matching and the accuracy of the matching result.
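The frequency-based keyword extraction described above can be sketched as follows. The tokenizer here is a simple regular expression suited to English text, the stop-word list is a tiny placeholder, and the default threshold is arbitrary; all three are assumptions for illustration. Chinese text would need a proper word segmenter instead.

```python
import re
from collections import Counter

# Placeholder stop-word list (assumption); a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for"}


def extract_keywords(text: str, threshold: int = 2) -> list[str]:
    """Return the words that occur more than `threshold` times, stop words excluded."""
    words = re.findall(r"[a-z0-9]+", text.lower())       # segment into words
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return sorted(w for w, n in counts.items() if n > threshold)
```

The resulting keyword list would be stored alongside the URL and title, so that a search only has to compare the search keyword with a handful of words per page rather than with the full text.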
Step S106: Output the address of the matching collected webpage.
The above steps yield the address of the collected webpage that matches the search keyword among the browser's collected webpages. The address of the matching collected webpage is output for the user to view.
As can be seen from the above description, the present invention achieves the following technical effects:
The embodiment of the present invention receives a search keyword, matches the search keyword against the browser's collected webpages to obtain the address of the matching collected webpage, and outputs that address. The collected webpage to be accessed is thus found among the browser's collected webpages by retrieval. Compared with the prior art, in which the user opens the browser's collected webpages one by one to look for it, this makes finding a target webpage among the browser's collected webpages more efficient and solves the problem in the related art that such a lookup is inefficient.
It should be noted that the steps shown in the flowchart of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described can be performed in a different order.
According to another aspect of the embodiments of the present invention, a web page storage processing device for a browser is provided. The device can be used to perform the web page storage processing method for a browser of the embodiments of the present invention, and that method can also be performed by this device.
Fig. 2 is a schematic diagram of the web page storage processing device for a browser according to an embodiment of the present invention. As shown in Fig. 2, the device includes a receiving unit 10, a matching unit 20, and an output unit 30.
The receiving unit 10 is configured to receive a search keyword, where the search keyword is used to find the webpage to be browsed among the browser's collected webpages.
The search keyword can be any keyword used to find the webpage to be browsed among the browser's collected webpages, and it can be a single keyword or several keywords. Specifically, a search box can be placed in the area of the browser that shows the collected webpages, and the search keyword entered by the user is received through that search box.
The matching unit 20 is configured to match the search keyword against the browser's collected webpages to obtain the address of the matching collected webpage.
The browser's collected webpages are usually kept in the browser's collection, and the collection of an existing browser stores the address and the title of each collected webpage. Matching the search keyword against the collected webpages can mean matching the search keyword against the titles of the collected webpages: if the search keyword appears in the title of a collected webpage, that webpage is related to the webpage the user wants to access.
The output unit 30 is configured to output the address of the matching collected webpage.
After the address of the collected webpage that matches the search keyword is obtained, the address of the matching collected webpage is output for the user to view.
In the embodiment of the present invention, the receiving unit 10 receives the search keyword, the matching unit 20 matches the search keyword against the browser's collected webpages to obtain the address of the matching collected webpage, and the output unit 30 outputs that address. The collected webpage to be accessed is thus found among the browser's collected webpages by retrieval. Compared with the prior art, in which the user opens the browser's collected webpages one by one to look for it, this makes finding a target webpage among the browser's collected webpages more efficient and solves the problem in the related art that such a lookup is inefficient.
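The division into a receiving unit, a matching unit, and an output unit might be mirrored in software roughly as below. The class names follow the units in the description, but the method names, the whitespace splitting of multiple keywords, and the rule that a title matches when it contains any one of the keywords are assumptions made for this sketch.

```python
class ReceivingUnit:
    """Receives the user's input and splits it into one or more keywords."""

    def receive(self, raw_input: str) -> list[str]:
        return raw_input.split()


class MatchingUnit:
    """Matches keywords against the titles stored in the browser's collection."""

    def __init__(self, collection: list[tuple[str, str]]) -> None:
        self.collection = collection  # (url, title) pairs

    def match(self, keywords: list[str]) -> list[str]:
        lowered = [k.casefold() for k in keywords]
        return [
            url
            for url, title in self.collection
            if any(k in title.casefold() for k in lowered)  # assumed "any keyword" rule
        ]


class OutputUnit:
    """Formats the matching addresses for display, one per line."""

    def output(self, urls: list[str]) -> str:
        return "\n".join(urls)
```

Chaining the three units, input text flows from `ReceivingUnit.receive` to `MatchingUnit.match` and then to `OutputUnit.output`.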
Preferably, the matching unit 20 includes: a first acquisition module, configured to obtain the title and text content of a collected webpage; and a matching module, configured to match the title and text content of the collected webpage against the search keyword, where, if the title and text content of the collected webpage match the search keyword, it is determined that the search keyword matches the collected webpage, and if they do not match the search keyword, it is determined that the search keyword does not match the collected webpage.
Preferably, the device further includes: a first acquisition unit, configured to obtain the text content of the collected webpage; a second acquisition unit, configured to obtain the URL and title of the collected webpage; and a storage unit, configured to store the text content, URL, and title of the collected webpage.
Preferably, the first acquisition unit includes: a second acquisition module, configured to obtain the address of the collected webpage; an access module, configured to access the collected webpage according to its address; and a crawling module, configured to crawl text content from the collected webpage while it is being accessed, to obtain the text content of the collected webpage.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; or they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is therefore not restricted to any specific combination of hardware and software.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention. For those skilled in the art, the invention may be modified and varied in many ways. Any modification, equivalent substitution, improvement, or the like made within the spirit and principles of the invention shall fall within the scope of protection of the present invention.

Claims (5)

  1. A web page storage processing method for a browser, characterized by comprising:
    receiving a search keyword, wherein the search keyword is used to find the webpage to be browsed among the collected webpages of a browser, and the titles and addresses of the collected webpages are stored in the collection of the browser;
    if the title of a collected webpage matches the search keyword, directly outputting the URL of the collected webpage that matches the search keyword;
    if the title of the collected webpage does not match the search keyword, matching the text content of the collected webpage against the search keyword to obtain the address of the matching collected webpage; and
    outputting the address of the matching collected webpage,
    wherein, before the text content of the collected webpage of the browser is matched against the search keyword, the method further comprises:
    obtaining the text content of the collected webpage of the browser;
    wherein obtaining the text content of the collected webpage of the browser comprises:
    obtaining the address of the collected webpage of the browser;
    accessing the collected webpage according to the address of the collected webpage of the browser; and
    crawling text content from the collected webpage while accessing the collected webpage, to obtain the text content of the collected webpage of the browser,
    wherein crawling text content from the collected webpage while accessing the collected webpage, to obtain the text content of the collected webpage of the browser, comprises:
    filtering out the Hypertext Markup Language tags of the collected webpage of the browser; and
    crawling text content from the collected webpage of the browser whose Hypertext Markup Language tags have been filtered out, to obtain the text content of the collected webpage of the browser.
  2. The web page storage processing method for a browser according to claim 1, characterized in that matching the search keyword against the bookmarked web page of the browser comprises:
    obtaining the title and text content of the bookmarked web page of the browser; and
    matching the title and text content of the bookmarked web page of the browser against the search keyword,
    wherein, if the title and text content of the bookmarked web page of the browser match the search keyword, it is determined that the search keyword matches the bookmarked web page of the browser, and if the title and text content of the bookmarked web page of the browser do not match the search keyword, it is determined that the search keyword does not match the bookmarked web page of the browser.
  3. The web page storage processing method for a browser according to claim 1, characterized in that
    after crawling text content from the bookmarked web page while accessing the bookmarked web page, to obtain the text content of the bookmarked web page of the browser, the method further comprises: extracting keywords from the text content of the bookmarked web page of the browser, to obtain the keywords of the bookmarked web page of the browser; and storing the keywords, address and title of the bookmarked web page of the browser,
    wherein matching the title and text content of the bookmarked web page of the browser against the search keyword comprises: matching the keywords and title of the bookmarked web page of the browser against the search keyword.
  4. A web page storage processing device for a browser, characterized by comprising:
    a receiving unit, configured to receive a search keyword, wherein the search keyword is used to find, among the bookmarked web pages of a browser, a web page to be browsed, and wherein the titles and addresses of the bookmarked web pages are stored in the bookmarks of the browser;
    a matching unit, configured to, if the title of a bookmarked web page matches the search keyword, directly output the address of the bookmarked web page that matches the search keyword, and, if the title of the bookmarked web page does not match the search keyword, match the text content of the bookmarked web page against the search keyword to obtain the address of the matching bookmarked web page; and
    an output unit, configured to output the address of the matching bookmarked web page,
    wherein the device further comprises:
    a first acquisition unit, configured to obtain the text content of the bookmarked web page of the browser;
    wherein the first acquisition unit comprises:
    a second acquisition module, configured to obtain the address of the bookmarked web page of the browser;
    an access module, configured to access the bookmarked web page according to the address of the bookmarked web page of the browser; and
    a crawling module, configured to crawl text content from the bookmarked web page while the bookmarked web page is being accessed, to obtain the text content of the bookmarked web page of the browser,
    wherein the crawling module is configured to:
    filter out the Hypertext Markup Language (HTML) tags of the bookmarked web page of the browser; and
    crawl text content from the bookmarked web page of the browser whose HTML tags have been filtered out, to obtain the text content of the bookmarked web page of the browser.
  5. The web page storage processing device for a browser according to claim 4, characterized in that the matching unit comprises:
    a first acquisition module, configured to obtain the title and text content of the bookmarked web page of the browser; and
    a matching module, configured to match the title and text content of the bookmarked web page of the browser against the search keyword,
    wherein, if the title and text content of the bookmarked web page of the browser match the search keyword, it is determined that the search keyword matches the bookmarked web page of the browser, and if the title and text content of the bookmarked web page of the browser do not match the search keyword, it is determined that the search keyword does not match the bookmarked web page of the browser.
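
The two-stage lookup recited in claim 1 can be illustrated with a minimal Python sketch. It is only one possible reading of the claim, not the patented implementation: titles of the bookmarked ("collection") web pages are checked first, and page text is fetched, stripped of HTML tags, and searched only when no title matches. The names `Bookmark`, `strip_html_tags`, `search_bookmarks` and the injected `fetch_html` callable are hypothetical, matching is assumed to be a case-insensitive substring test, and skipping `<script>` and `<style>` bodies is an added assumption that the claims do not mention.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Callable, Iterable, List


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops tags; script/style bodies are skipped (assumption)."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def strip_html_tags(html: str) -> str:
    """Filter out HTML tags and return the remaining text content."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


@dataclass
class Bookmark:
    """A bookmarked page as stored by the browser: title and address only."""
    title: str
    address: str


def search_bookmarks(
    keyword: str,
    bookmarks: Iterable[Bookmark],
    fetch_html: Callable[[str], str],
) -> List[str]:
    """Return addresses of bookmarks matching the keyword.

    Stage 1 matches titles only. Stage 2 (fetching each page and matching
    its tag-free text) runs only when stage 1 finds nothing.
    """
    bookmarks = list(bookmarks)
    needle = keyword.casefold()

    title_hits = [b.address for b in bookmarks if needle in b.title.casefold()]
    if title_hits:
        return title_hits

    content_hits = []
    for b in bookmarks:
        page_text = strip_html_tags(fetch_html(b.address))
        if needle in page_text.casefold():
            content_hits.append(b.address)
    return content_hits
```

Passing the page fetcher in as `fetch_html` keeps the sketch independent of any particular HTTP client; a real browser extension would supply its own network call there, and could cache the extracted text (or keywords, as in claim 3) instead of re-fetching on every search.
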
CN201410742954.XA 2014-12-05 2014-12-05 Web page storage processing method and processing device for browser Active CN104391978B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410742954.XA CN104391978B (en) 2014-12-05 2014-12-05 Web page storage processing method and processing device for browser

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410742954.XA CN104391978B (en) 2014-12-05 2014-12-05 Web page storage processing method and processing device for browser

Publications (2)

Publication Number Publication Date
CN104391978A CN104391978A (en) 2015-03-04
CN104391978B true CN104391978B (en) 2018-05-15

Family

ID=52609882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410742954.XA Active CN104391978B (en) 2014-12-05 2014-12-05 Web page storage processing method and processing device for browser

Country Status (1)

Country Link
CN (1) CN104391978B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426224A (en) * 2015-12-28 2016-03-23 上海银天下科技有限公司 Method and device for opening web pages in application program
CN105740417A (en) * 2016-01-29 2016-07-06 青岛海信移动通信技术股份有限公司 Webpage based target data search method and module, browser and terminal
CN106547821A (en) * 2016-09-29 2017-03-29 广东工业大学 A method for searching related web pages by keyword in a browser
CN107229705A (en) * 2017-05-25 2017-10-03 北京小米移动软件有限公司 Information resources lookup method, device and computer-readable recording medium
CN110020335B (en) * 2017-07-28 2022-04-26 北京搜狗科技发展有限公司 Favorite processing method and device
CN110069667B (en) * 2017-11-03 2022-07-19 北京搜狗科技发展有限公司 Searching method, searching device and searching device
CN108491420A (en) * 2018-02-06 2018-09-04 平安科技(深圳)有限公司 Web page crawling configuration method, application server and computer-readable storage medium
CN109657168B (en) * 2018-11-30 2021-04-23 维沃移动通信有限公司 Collection record display method and device
CN113268184A (en) * 2021-05-29 2021-08-17 五八到家有限公司 Browser tab switching method and device, electronic equipment and readable medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010115003A1 (en) * 2009-04-03 2010-10-07 Avichai Flombaum System and method for identifying and retrieving targeted advertisements or other related documents
CN102830894A (en) * 2012-05-11 2012-12-19 北京奇虎科技有限公司 Method and apparatus for bookmarking webpage
CN102982134A (en) * 2012-11-16 2013-03-20 北京奇虎科技有限公司 System enabling recommended web site information to be displayed in browser address bar
CN103246746A (en) * 2013-05-23 2013-08-14 百度在线网络技术(北京)有限公司 Method, device and system for searching information

Also Published As

Publication number Publication date
CN104391978A (en) 2015-03-04

Similar Documents

Publication Publication Date Title
CN104391978B (en) Web page storage processing method and processing device for browser
CN101908071B (en) Method and device thereof for improving search efficiency of search engine
CN102930059B (en) Method for designing focused crawler
US9183281B2 (en) Context-based document unit recommendation for sensemaking tasks
US7499965B1 (en) Software agent for locating and analyzing virtual communities on the world wide web
CN100394427C (en) Web search system and method thereof
CN103823824B (en) A method and system for automatically building a text classification corpus from the Internet
CN100462969C (en) Method for providing and inquiry information for public by interconnection network
US20070198727A1 (en) Method, apparatus and system for extracting field-specific structured data from the web using sample
US20160034514A1 (en) Providing search results based on an identified user interest and relevance matching
US8560518B2 (en) Method and apparatus for building sales tools by mining data from websites
CN103631794A (en) Method, device and equipment for sorting search results
CN102779169A (en) Extracting method and device for webpage content based on HTML (Hypertext Markup Language) label
US20160103913A1 (en) Method and system for calculating a degree of linkage for webpages
CN105631007A (en) Industry technical information collecting method and system
EP2933734A1 (en) Method and system for the structural analysis of websites
KR101224800B1 (en) Crawling database for infomation
CN105095175A (en) Method and device for obtaining truncated web title
Patil et al. Search engine optimization technique importance
Klein et al. Evaluating methods to rediscover missing web pages from the web infrastructure
CN104778232B (en) Searching result optimizing method and device based on long query
CN103617225B (en) An associated web page search method and system
CN106959995A (en) Compatible two-way automatic web page contents acquisition method
CN104077353B (en) A method and device for detecting black links
CN110110182A (en) A data collection method and system suitable for batch crawling

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and device for storing and processing web pages of browsers

Effective date of registration: 20190531

Granted publication date: 20180515

Pledgee: Shenzhen Black Horse World Investment Consulting Co., Ltd.

Pledgor: Beijing Guoshuang Technology Co.,Ltd.

Registration number: 2019990000503

CP02 Change in the address of a patent holder

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Patentee after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Beijing city Haidian District Shuangyushu Area No. 76 Zhichun Road cuigongfandian 8 layer A

Patentee before: Beijing Guoshuang Technology Co.,Ltd.