CN102929984A - Website failure searching method and device - Google Patents

Publication number
CN102929984A
CN102929984A CN2012103979842A CN201210397984A CN102929984A CN 102929984 A CN102929984 A CN 102929984A CN 2012103979842 A CN2012103979842 A CN 2012103979842A CN 201210397984 A CN201210397984 A CN 201210397984A CN 102929984 A CN102929984 A CN 102929984A
Authority
CN
China
Prior art keywords
network address
server
browser
search results
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012103979842A
Other languages
Chinese (zh)
Other versions
CN102929984B (en)
Inventor
赵飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qizhi Business Consulting Co ltd
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd
Priority to CN201210397984.2A
Publication of CN102929984A
Application granted
Publication of CN102929984B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a website failure searching method and a website failure searching device. The device comprises a website information acquisition module, a search request receiving module, a website failure judging module and a website snapshot acquisition module. The search request receiving module comprises a search request transmitting submodule, a search result return submodule and a search result display submodule. The search request transmitting submodule is located in a browser and is used for receiving a search request and transmitting it to a server; the search result return submodule is located in the server and is used for fetching web pages related to the search request to form a search result and returning the search result to the browser; and the search result display submodule is located in the browser and is used for displaying the search result. With the method and the device, a user can still browse the content of a web page normally when clicking the corresponding search result fails.

Description

Invalid URL search method and device
Technical field
The present invention relates to the technical field of Internet access, and in particular to an invalid URL search method and an invalid URL search device.
Background
With the spread of the Internet and the explosive growth of online information, search engines are attracting more and more attention. At present, search engine technology has become the second core technology of the Internet, second only to portals.
When a search engine is used for web search, clicking a search result may lead to a page that cannot be accessed. This happens because web pages on the Internet change frequently: when a page that was found has since been deleted or has become a dead link, clicking the link directly cannot show the page content.
In this case, if the user still needs to view the content of the inaccessible page, the user has to look up the corresponding URL again or search for related content. Search efficiency is low, the user experience is very poor, and the resource consumption of both client and server increases.
Therefore, a technical problem that those skilled in the art need to solve is to provide a search mechanism that lets the user browse the content of a web page normally when clicking the search result fails.
Summary of the invention
In view of the above problems, the present invention is proposed to provide an invalid URL search method and a corresponding search device that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, an invalid URL search method is provided, comprising:
Collecting URL information from the browser favorites of multiple user devices and saving the URL information to a database, the URL information including a web page snapshot of each URL;
The browser receiving a search request and sending the search request to a server;
The server fetching, from the database, web pages related to the search request to form search results and returning them to the browser;
The browser displaying the search results;
Judging whether the URL of a search result being accessed is an invalid URL;
If the URL of the search result is an invalid URL, the server looking up the matching web page snapshot in the database and returning it to the browser.
Optionally, the web page snapshot is generated by the server obtaining and saving the code of the web page, or, when the server fails to obtain and save the code of the web page, by the server notifying the browser to upload the code of the corresponding web page.
Optionally, the step of judging whether the URL of a search result being accessed is an invalid URL comprises:
The browser sending the URL of the search result to the server;
The server resolving the URL of the search result, generating a response message, and returning it to the browser;
The browser parsing the response message and extracting the HTTP status code of the corresponding URL;
The browser judging, according to the HTTP status code, whether the URL access request is an access request for an invalid URL.
Optionally, the step of judging whether the URL of a search result being accessed is an invalid URL comprises:
The browser sending the URL of the search result to the server;
The server resolving the URL of the search result and extracting the HTTP status code of the corresponding URL;
The server judging, according to the HTTP status code, whether the URL access request is an access request for an invalid URL.
According to another aspect of the present invention, an invalid URL search device is provided, comprising:
A URL information collection module, adapted to collect URL information from the browser favorites of multiple user devices and save the URL information to a database, the URL information including a web page snapshot of each URL;
A search request receiving module, adapted to receive a search request and return search results according to the search request;
An invalid URL judging module, adapted to judge whether the URL of a search result being accessed is an invalid URL;
A web page snapshot acquisition module, adapted so that, when the URL of the search result is an invalid URL, the server looks up the matching web page snapshot in the database and returns it to the browser;
Wherein the search request receiving module comprises:
A search request sending submodule located in the browser, adapted to receive a search request and send the search request to the server;
A search result returning submodule located in the server, adapted to fetch, from the database, web pages related to the search request to form search results and return them to the browser;
A search result display submodule located in the browser, adapted to display the search results.
Optionally, the web page snapshot is generated by the server obtaining and saving the code of the web page, or, when the server fails to obtain and save the code of the web page, by the server notifying the browser to upload the code of the corresponding web page.
Optionally, the invalid URL judging module comprises:
A first URL sending submodule located in the browser, adapted to send the URL of the search result to the server;
A response message returning submodule located in the server, adapted to resolve the URL of the search result, generate a response message, and return it to the browser;
An HTTP status code obtaining submodule located in the browser, adapted to parse the response message and extract the HTTP status code of the corresponding URL;
A URL judging submodule located in the browser, adapted to judge, according to the HTTP status code, whether the URL access request is an access request for an invalid URL.
Optionally, the invalid URL judging module comprises:
A second URL sending submodule located in the browser, adapted to send the URL of the search result to the server;
An HTTP status code obtaining submodule located in the server, adapted to resolve the URL of the search result and extract the HTTP status code of the corresponding URL;
A URL judging submodule located in the server, adapted to judge, according to the HTTP status code, whether the URL access request is an access request for an invalid URL.
The favorites-based search method according to the present invention provides a favorites-based collection mechanism. It thereby solves the problem that search results obtained for a search request cannot be accessed normally, ensures that the user can still browse the content of such search results, and achieves the beneficial effect of improved search efficiency.
The above description is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Description of drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows a flow chart of the steps of an embodiment of an invalid URL search method according to an embodiment of the present invention;
Fig. 2 shows a structural block diagram of an embodiment of an invalid URL search device according to an embodiment of the present invention.
Embodiment
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
One of the core ideas of the embodiments of the present invention is to collect the URL information in the browser favorites of multiple user devices, together with the web page snapshot corresponding to each URL, and to save both to a database. When search results are returned for a search request, it is judged whether the URL of a search result is an invalid URL; if so, the server returns the web page snapshot corresponding to that URL to the browser.
Referring to Fig. 1, a flow chart of the steps of an embodiment of an invalid URL search method according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 101: collect URL information from the browser favorites of multiple user devices and save the URL information to a database, the URL information including a web page snapshot of each URL;
A web page snapshot (English name: Web Cache) is a cached copy of a web page. When a search engine indexes a web page, it backs the page up and stores the copy in its own server cache. When the user clicks the "web page snapshot" link in the search engine, the search engine displays the page content that its Spider (crawler) system fetched and saved at that time; this is called a "web page snapshot". In the present invention, the web page snapshot may be generated by the server obtaining and saving the code of the web page, or, when the server fails to obtain and save the code of the web page, by the server notifying the browser to upload the code of the corresponding web page. In other words, on the server side a web page snapshot takes the form of web page code.
Web page code refers to the special "languages" used in web page design. The designer arranges these "languages" to produce a web page, and what we finally see is the result of the browser "translating" that code. Code commonly used in web design at present includes HTML, JavaScript, ASP, PHP, CGI and so on, of which HTML is the most basic web page code. The web page code may be obtained directly by the server when it parses the request message of the browser; alternatively, it may be obtained by the browser when it parses the response message returned by the server, after which the browser uploads the code to the server. The benefit of having the server obtain the web page code is that it saves the user's traffic and consumes as little user bandwidth as possible. When the server fails to save the web page code, it can notify the browser to obtain the code and upload it, and the server then saves it. When uploading, the browser may compress the code, which also reduces upload traffic and bandwidth use.
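The compressed upload described above can be sketched as follows. This is only an illustration under stated assumptions: the page code is a UTF-8 string, zlib (DEFLATE) stands in for whatever compression the implementation would use (the specification names none), and the function names `compress_page_code` and `decompress_page_code` are hypothetical.

```python
import zlib


def compress_page_code(page_code: str) -> bytes:
    """Browser side (assumed): compress page source before uploading it."""
    # Level 9 trades CPU time for the smallest payload.
    return zlib.compress(page_code.encode("utf-8"), 9)


def decompress_page_code(payload: bytes) -> str:
    """Server side (assumed): restore the page source before saving it."""
    return zlib.decompress(payload).decode("utf-8")


if __name__ == "__main__":
    page = "<html><body>" + "<p>hello</p>" * 200 + "</body></html>"
    payload = compress_page_code(page)
    print(len(page.encode("utf-8")), "bytes before,", len(payload), "bytes after")
```

Repetitive markup such as HTML usually compresses well, which is consistent with the stated goal of reducing upload traffic.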
In a specific implementation, one situation in which the server fails to save the web page code is that some websites, to prevent their content from being maliciously misappropriated by others, place access restrictions on their own servers, for example limiting how frequently other machines may access them, so that the server cannot save the web page code directly. In a specific implementation, the server may apply a hash algorithm to the web page code to obtain a website content check string and compare it with the website content check strings in a preset save-check interface to judge whether the server saved the web page code successfully: if the check string is present in the preset save-check interface, the save succeeded; otherwise the save failed. Those skilled in the art may also adopt other approaches, and the present invention places no restriction on this.
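The hash-based save check can be sketched roughly as below. The specification does not say which hash algorithm is used or what the "preset save-check interface" looks like, so SHA-256 and a plain in-memory set of check strings are assumptions, and the names `content_check_string` and `save_succeeded` are hypothetical.

```python
import hashlib


def content_check_string(page_code: str) -> str:
    """Hash the page code into a website content check string (SHA-256 assumed)."""
    return hashlib.sha256(page_code.encode("utf-8")).hexdigest()


def save_succeeded(page_code: str, saved_check_strings: set) -> bool:
    """The save counts as successful if the check string is already registered.

    `saved_check_strings` stands in for the preset save-check interface.
    """
    return content_check_string(page_code) in saved_check_strings
```

If `save_succeeded` returns False, the server would fall back to asking the browser to upload the page code, as described above.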
After the URL information in the browser favorites of multiple user devices has been collected, the browser saves the URL information to the database for use in subsequent searches. In a specific implementation, the present invention may store the URL information in two databases: a content database and a web page snapshot database. The web page snapshot database stores the web page snapshots of the URLs, and the content database stores the other information about each URL besides the snapshot. Alternatively, the present invention may set up a single database containing two tables, one for storing web page snapshots and one for storing the content other than snapshots. Those skilled in the art will understand that the above ways of storing URL information are only examples of the present invention; other storage methods may be adopted, and the present invention places no restriction on this.
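The single-database, two-table variant might look like the following sketch. SQLite is used purely for illustration; the table names, column names, and helper functions (`create_store`, `save_url_info`, `load_snapshot`) are hypothetical and not taken from the specification.

```python
import sqlite3


def create_store(conn: sqlite3.Connection) -> None:
    """Create one table for snapshots and one for all other URL information."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS content (
            url    TEXT PRIMARY KEY,
            title  TEXT NOT NULL,
            weight REAL NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS snapshot (
            url       TEXT PRIMARY KEY,
            page_code TEXT NOT NULL
        );
        """
    )


def save_url_info(conn, url, title, weight, page_code) -> None:
    """Store the non-snapshot fields and the snapshot in their own tables."""
    conn.execute(
        "INSERT OR REPLACE INTO content (url, title, weight) VALUES (?, ?, ?)",
        (url, title, weight),
    )
    conn.execute(
        "INSERT OR REPLACE INTO snapshot (url, page_code) VALUES (?, ?)",
        (url, page_code),
    )
    conn.commit()


def load_snapshot(conn, url):
    """Return the saved page code for a URL, or None if no snapshot exists."""
    row = conn.execute(
        "SELECT page_code FROM snapshot WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None
```

Keeping snapshots in their own table means the search path only touches the smaller content table, and the larger page code is read only when a URL turns out to be invalid.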
Step 102: the browser receives a search request and sends the search request to the server;
Step 103: the server fetches, from the database, web pages related to the search request to form search results and returns them to the browser;
For example, when the user performs a keyword search in the browser, the browser receives the user's search keyword and sends it to the server. Based on the keyword, the server fetches the web page content related to the keyword from the content database, forms search results, and returns them to the browser. In a specific implementation, the search results may be sorted by web page weight before being returned, or sorted by other methods; the present invention places no restriction on this.
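As a rough illustration of the keyword lookup and weight-based ordering, the sketch below treats the content database as a list of dictionaries. The field names `title`, `text` and `weight`, the substring match, and the function name `search` are all assumptions made for the example; the specification does not define how relevance or weight is computed.

```python
def search(content_db, keyword):
    """Return entries whose title or text contains the keyword, heaviest first."""
    needle = keyword.lower()
    hits = [
        entry
        for entry in content_db
        if needle in entry["title"].lower() or needle in entry["text"].lower()
    ]
    # Higher weight means the result is shown earlier.
    return sorted(hits, key=lambda entry: entry["weight"], reverse=True)
```

Any other ranking could be swapped in by changing the sort key.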
Step 104: the browser displays the search results.
Step 105: judge whether the URL of a search result being accessed is an invalid URL;
When the user wants to view a search result, the browser or the server first judges whether the URL corresponding to the search result can be accessed normally. If the URL cannot be accessed normally, the web page snapshot corresponding to the URL is shown to the user.
In general, the HTTP status code (HTTP Status Code) is used to judge the validity of a URL. An HTTP status code consists of three decimal digits that indicate the success or failure of a web access request and, on failure, the cause. HTTP status codes fall into five classes, identified by the first digit:
Three-digit codes beginning with 1, including 100 (the client should continue sending the request), 101 (the server has understood the client's request and will use the Upgrade message header to tell the client to switch to a different protocol to complete the request) and 102 (a status code extended by WebDAV (Web-based Distributed Authoring and Versioning, a communication protocol based on HTTP 1.1), indicating that processing will continue), indicate that the request has been accepted and needs further processing. Responses of this class are provisional: they contain only a status line and some optional response headers and end with an empty line. Because the HTTP/1.0 protocol does not define any status code beginning with 1, servers must not send such responses to HTTP/1.0 clients except under certain test conditions;
Three-digit codes beginning with 2, including 200 (the request succeeded, and the requested response headers or data body are returned with this response), 201 (the request has been fulfilled and a new resource has been created as requested), 202 (the server has accepted the request but has not yet processed it), 203 (the server has successfully processed the request, but the returned entity header metainformation is not the definitive set from the origin server; it comes from a local or third-party copy), 204 (the server has successfully processed the request but does not need to return any entity content, and may return updated metainformation), 205 (the server has successfully processed the request and returns no content), 206 (the server has successfully processed a partial GET request) and 207 (a status code extended by WebDAV (RFC 2518), indicating that the message body that follows is an XML message), indicate that the request was successfully received, understood and accepted by the server;
Three-digit codes beginning with 3, including 300 (the user or browser can choose a preferred address for redirection), 301 (the requested resource has been permanently moved to a new location, and any future reference to this resource should use one of the URIs (Uniform Resource Identifiers) returned in this response), 302 (the requested resource temporarily responds from a different URI), 303 (the response to the current request can be found at another URI, and the client should access that resource using GET), 304 (the server should return this status code if the client sent a conditional GET request, the request is allowed, and the content of the document has not changed since the last access or according to the request condition), 305 (the requested resource must be accessed through the specified proxy), 306 (no longer used in the latest version of the specification) and 307 (the requested resource temporarily responds from a different URI), indicate that the client must take further action to complete the request. These status codes are generally used for redirection, and the follow-up request address (the redirect target) is given in the Location field of the response;
Three-digit codes beginning with 4, including 400 (the request has a semantic error and cannot be understood by the server, or the request parameters are wrong), 401 (the current request requires user authentication), 402 (reserved for possible future use), 403 (the server has understood the request but refuses to execute it), 404 (the request failed; the requested resource was not found on the server), 405 (the request method specified in the request line cannot be used for the requested resource), 406 (the content characteristics of the requested resource cannot satisfy the conditions in the request headers, so no response entity can be generated), 407 (similar to 401, except that the client must authenticate with the proxy server), 408 (the request timed out), 409 (the request cannot be completed because it conflicts with the current state of the requested resource), 410 (the requested resource is no longer available on the server and no forwarding address is known), 411 (the server refuses to accept the request without a defined Content-Length header), 412 (the server failed to satisfy one or more of the preconditions given in the request header fields), 413 (the server refuses to process the current request because the entity data submitted is larger than the server is willing or able to handle), 414 (the request URI is longer than the server can interpret, so the server refuses to serve the request), 415 (for the current request method and requested resource, the entity submitted in the request is not in a format the server supports, so the request is rejected), 416 (the server should return this status code if the request contains a Range header, none of the ranges specified in it overlaps the available range of the current resource, and the request does not define an If-Range header), 417 (the expectation specified in the Expect request header cannot be met by the server, or the server is a proxy that has clear evidence that the expectation cannot be met at the next node on the current route), 421 (the number of connections from the client's IP address to the server exceeds the maximum the server permits), 422 (the request is well formed but cannot be responded to because it contains semantic errors), 424 (the current request failed because of an error in a previous request), 425 (defined in the WebDAV Advanced Collections draft but not present in the "WebDAV Ordered Collections Protocol" (RFC 3648)), 426 (the client should switch to TLS/1.0) and 449 (a Microsoft extension, indicating that the request should be retried after the appropriate action has been performed), indicate that the client appears to have made an error that prevents the server from processing the request;
Three-digit codes beginning with 5, including 500 (the server encountered an unexpected condition that prevented it from completing the request), 501 (the server does not support a function required by the current request), 502 (a server acting as a gateway or proxy received an invalid response from the upstream server while attempting to execute the request), 503 (the server currently cannot process the request because of temporary maintenance or overload), 504 (a server acting as a gateway or proxy did not receive a timely response from the upstream server while attempting to execute the request), 505 (the server does not support, or refuses to support, the HTTP version used in the request), 506 (extended by the "Transparent Content Negotiation" protocol (RFC 2295), indicating an internal configuration error on the server), 507 (the server cannot store the content needed to complete the request), 509 (the server has reached its bandwidth limit) and 510 (the policy required to obtain the resource has not been satisfied), indicate that an error or abnormal state occurred while the server was processing the request, or that the server recognizes that it cannot complete the request with its current hardware and software resources.
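Since the class of a status code is determined by its first digit, the five-way division above reduces to an integer division by 100. The sketch below shows this; the class labels and the function name `status_class` are chosen for the example and are not part of the specification.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a three-digit HTTP status code to its class via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]
```

For instance, `status_class(404)` yields the client error class and `status_class(503)` the server error class.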
In a preferred embodiment of the present invention, step 105 may include the following sub-steps:
Sub-step S21: the browser sends the URL of the search result to the server;
Sub-step S22: the server resolves the URL of the search result, generates a response message, and returns it to the browser;
Sub-step S23: the browser parses the response message and extracts the HTTP status code of the corresponding URL;
Sub-step S24: the browser judges, according to the HTTP status code, whether the URL access request is an access request for an invalid URL.
In another preferred embodiment of the present invention, step 105 may include the following sub-steps:
Sub-step S31: the browser sends the URL of the search result to the server;
Sub-step S32: the server resolves the URL of the search result and extracts the HTTP status code of the corresponding URL;
Sub-step S33: the server judges, according to the HTTP status code, whether the URL access request is an access request for an invalid URL.
As a preferred example of this embodiment, status codes 200, 301, 302 and 304 can be regarded as indicating that the website link succeeded and the web page opened normally, and all remaining status codes can be regarded as invalid URL status codes.
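The judgment in the preferred example can be sketched as below. The set of valid codes comes from the text; everything else is an assumption for illustration: the function names `fetch_status_code` and `is_invalid_url` are hypothetical, a HEAD request via Python's `http.client` is only one possible way to obtain the status code, and treating an unreachable host (no status code at all) as invalid is a choice made here, not something the specification states.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit

# Status codes the preferred example treats as "page opened normally".
VALID_STATUS_CODES = frozenset({200, 301, 302, 304})


def fetch_status_code(url: str, timeout: float = 5.0) -> Optional[int]:
    """Send a HEAD request without following redirects; None if unreachable."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def is_invalid_url(status_code: Optional[int]) -> bool:
    """Any code outside the valid set, or no response at all, counts as invalid."""
    return status_code not in VALID_STATUS_CODES
```

The same `is_invalid_url` check applies whether the status code is extracted on the browser side (sub-steps S21 to S24) or on the server side (sub-steps S31 to S33).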
In practice, the HTTP status code may be obtained on the browser side or the server side by creating an independent thread or process on that side to capture the HTTP status code. Those skilled in the art should understand that the above way of obtaining the HTTP status code is only an example; other approaches are equally possible, and the present invention places no restriction on this.
Step 106: if the URL of the search result is an invalid URL, the server looks up the matching web page snapshot in the database and returns it to the browser.
In practice, if the browser side judges that the URL of the search result being accessed is an invalid URL, the browser sends a snapshot retrieval request for the URL of the search result to the server, and the server looks up the web page snapshot matching the request in the web page snapshot database and returns it to the browser;
If the server side judges that the URL of the search result being accessed is an invalid URL, the server directly looks up the matching web page snapshot in the web page snapshot database and returns it to the browser.
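Steps 105 and 106 together amount to a simple fallback: serve the live page when the status code is acceptable, otherwise serve the stored snapshot. The sketch below models the web page snapshot database as a dictionary keyed by URL. The function name `resolve_click`, the tuple return format, and the "unavailable" outcome for a URL with no stored snapshot are assumptions added for the example; the specification does not describe that last case.

```python
# Status codes the preferred example treats as "page opened normally".
VALID_STATUS_CODES = frozenset({200, 301, 302, 304})


def resolve_click(url, status_code, snapshot_db):
    """Decide what the browser should show for a clicked search result."""
    if status_code in VALID_STATUS_CODES:
        return ("live", url)
    # Invalid URL: look up the matching snapshot by URL.
    page_code = snapshot_db.get(url)
    if page_code is None:
        return ("unavailable", None)
    return ("snapshot", page_code)
```

A caller would pass the status code obtained on either the browser side or the server side, so the same fallback covers both variants described above.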
It should be noted that, for simplicity of description, the method embodiments are expressed as a series of combined actions, but those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
With reference to Fig. 2, show the structured flowchart of the address searching device embodiment that lost efficacy according to an embodiment of the invention, specifically can comprise with lower module:
Website information acquisition module 201 is suitable for gathering the website information of the browser collection folder of many subscriber equipmenies, preserves described website information to database, and described website information comprises the snapshots of web pages of network address;
Snapshots of web pages, English name be Web Cache, the webpage buffer memory.Search engine is when webpage, webpage is backed up, exist in the server buffer of oneself, when the user clicks " snapshots of web pages " link in search engine, the Web page content revealing that search engine grasped and preserved at that time Spider (spider) system out is called " snapshots of web pages ".In the present invention, described snapshots of web pages can preserve be generated by the code that server obtain described webpage, perhaps, can preserve when unsuccessful at the code that described server obtains this webpage, and the notice browser is uploaded generation with the code of the webpage of correspondence.That is to say, snapshots of web pages is at some web page codes that are presented as of server side.
Web page code need just to refer to some special " language " of using in Web Page Design, then the designer is carried out being only the effect that we finally see after " translation " to code by browser by webpage is produced in these " language " tissue layouts.Code commonly used during present Web-Designing has HTML, JavaScript, and ASP, PHP, CGI etc., wherein HTML is most basic web page code.Described web page code can directly be obtained when resolving the request message of browser by server; Perhaps, described web page code also can obtain when the response message that the browser resolves server returns, and then web page code is uploaded onto the server.The benefit of obtaining web page code with server is to save like this user's surfing flow, minimally consume user bandwidth, when server is preserved the web page code failure, can notify browser to obtain web page code uploads, server is preserved described web page code again, can adopt the mode of compressed code that described web page code is uploaded when browser is uploaded described web page code, so also can reduce the roam of uploading, reduce bandwidth.
After the website information of the browser favorites of multiple user devices has been collected, it is saved in a database for use in subsequent searches. In a specific implementation, the present invention may save the website information in two databases, a content database and a web page snapshot database: the web page snapshot database stores the web page snapshots of the URLs, and the content database stores the other information about the URLs apart from the snapshots. Alternatively, the present invention may set up a single database containing two tables, one for storing the web page snapshots and one for storing the content other than the snapshots. Those skilled in the art will understand that the above ways of storing website information are merely examples of the present invention; other storage methods may be used, and the present invention is not limited in this respect.
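The single-database, two-table variant can be sketched as follows using Python's built-in sqlite3 module. The table names, column names, and helper functions are hypothetical and chosen only for illustration; the specification does not prescribe a schema or a database engine.

```python
import sqlite3
from typing import Optional


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with one content table and one snapshot table."""
    conn = sqlite3.connect(":memory:")
    # Information about a URL other than its snapshot (here only a title).
    conn.execute("CREATE TABLE content (url TEXT PRIMARY KEY, title TEXT)")
    # The web page snapshot, stored as web page code.
    conn.execute("CREATE TABLE snapshot (url TEXT PRIMARY KEY, page_code TEXT)")
    return conn


def save_website_info(
    conn: sqlite3.Connection, url: str, title: str, page_code: str
) -> None:
    """Save one favorite: metadata in one table, snapshot code in the other."""
    conn.execute("INSERT OR REPLACE INTO content VALUES (?, ?)", (url, title))
    conn.execute("INSERT OR REPLACE INTO snapshot VALUES (?, ?)", (url, page_code))
    conn.commit()


def load_snapshot(conn: sqlite3.Connection, url: str) -> Optional[str]:
    """Return the stored snapshot code for url, or None if there is none."""
    row = conn.execute(
        "SELECT page_code FROM snapshot WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None
```

Keeping the snapshots in their own table means that searches over the content table never have to read the comparatively large page code.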
The search request receiving module 202 is adapted to receive a search request and return search results according to the search request;
In a preferred embodiment of the present invention, the search request receiving module 202 may include the following submodules:
A search request transmitting submodule located in the browser, adapted to receive the search request and send it to the server;
A search result return submodule located in the server, adapted to retrieve from the database the web pages related to the search request, form search results from them, and return the search results to the browser;
A search result display submodule located in the browser, adapted to display the search results.
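As a rough illustration of the server-side search result return submodule, the sketch below matches the search request against the titles stored in a content table. The table layout, the function name, and the use of a simple substring match are assumptions; the specification does not state how relevance to the search request is determined.

```python
import sqlite3
from typing import List, Tuple


def search_favorites(conn: sqlite3.Connection, query: str) -> List[Tuple[str, str]]:
    """Return (url, title) rows whose title contains the query text.

    Assumes a hypothetical table content(url, title). The query is bound as a
    parameter, so it cannot inject SQL, although the characters % and _ still
    act as LIKE wildcards.
    """
    pattern = f"%{query}%"
    cursor = conn.execute(
        "SELECT url, title FROM content WHERE title LIKE ? ORDER BY url",
        (pattern,),
    )
    return cursor.fetchall()
```

The browser-side submodules would send the query string to the server and render the returned list; they are omitted here.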
The website failure judging module 203 is adapted to judge whether the URL of an accessed search result is a failed URL;
When the user wants to view a search result, the browser or the server first judges whether the URL corresponding to that search result can be accessed normally; if it cannot, the web page snapshot corresponding to the URL is displayed to the user.
Generally, the HTTP status code is used to judge the validity of a URL. An HTTP status code consists of three decimal digits that indicate whether a web access request succeeded or failed and, on failure, the cause.
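The judgment based on the HTTP status code can be sketched as below. The rule used here, treating client-error (4xx) and server-error (5xx) codes as failures, is an assumption for illustration; the specification does not list which status codes mark a URL as failed.

```python
def is_failed_status(status_code: int) -> bool:
    """Return True if the status code indicates a failed URL.

    Assumption: 4xx (client error) and 5xx (server error) count as failures,
    while 1xx, 2xx, and 3xx do not.
    """
    if not 100 <= status_code <= 599:
        raise ValueError(f"not a three-digit HTTP status code: {status_code}")
    return status_code >= 400
```

For example, under this assumption 404 (Not Found) and 503 (Service Unavailable) are treated as failures, while 200 (OK) and 301 (Moved Permanently) are not.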
In a preferred embodiment of the present invention, the website failure judging module 203 may include the following submodules:
A first URL sending submodule located in the browser, adapted to send the URL of the search result to the server;
A response message return submodule located in the server, adapted to resolve the URL of the search result, generate a response message, and return it to the browser;
An HTTP status code acquisition submodule located in the browser, adapted to parse the response message and extract the HTTP status code of the corresponding URL;
A URL judging submodule located in the browser, adapted to judge, according to the HTTP status code, whether the URL access request is an access request for a failed URL.
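The browser-side submodules above can be sketched as parsing the status line of a raw HTTP/1.x response message. The function names and the 4xx/5xx failure rule are assumptions for illustration; a real browser would obtain the status code from its own network stack rather than from a string.

```python
def extract_status_code(response_message: str) -> int:
    """Extract the three-digit status code from an HTTP/1.x response message.

    The status line has the form: HTTP-version SP status-code SP reason-phrase.
    """
    status_line = response_message.split("\r\n", 1)[0]
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"malformed status line: {status_line!r}")
    code = parts[1]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"malformed status code: {code!r}")
    return int(code)


def is_failed_url_request(response_message: str) -> bool:
    """Judge the access request as failed for 4xx and 5xx codes (an assumption)."""
    return extract_status_code(response_message) >= 400
```

A response beginning with `HTTP/1.1 404 Not Found` would thus be judged an access request for a failed URL.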
In another preferred embodiment of the present invention, the website failure judging module 203 may include the following submodules:
A second URL sending submodule located in the browser, adapted to send the URL of the search result to the server;
An HTTP status code acquisition submodule located in the server, adapted to resolve the URL of the search result and extract the HTTP status code for the corresponding URL;
A URL judging submodule located in the server, adapted to judge, according to the HTTP status code, whether the URL access request is an access request for a failed URL.
In practice, the HTTP status code may be obtained on the browser side or the server side by spawning an independent thread or process there to capture it. Those skilled in the art should understand that the above ways of obtaining the HTTP status code are merely examples; other ways may equally be used, and the present invention is not limited in this respect.
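Capturing status codes in independent threads can be sketched as follows. The function name and the injected `fetch_status` callable are hypothetical; the callable stands in for whatever mechanism actually issues the request, so the sketch runs without network access.

```python
import queue
import threading
from typing import Callable, Dict, Iterable


def capture_status_codes(
    urls: Iterable[str], fetch_status: Callable[[str], int]
) -> Dict[str, int]:
    """Capture the HTTP status code of each URL in its own thread."""
    results = queue.Queue()  # thread-safe hand-off from workers to the caller

    def worker(url: str) -> None:
        results.put((url, fetch_status(url)))

    threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All workers have finished, so the queue can be drained safely.
    captured: Dict[str, int] = {}
    while not results.empty():
        url, status_code = results.get()
        captured[url] = status_code
    return captured
```

If `fetch_status` raises for some URL, that worker thread ends without reporting a result and the URL is simply absent from the returned mapping; a production version would need to decide how to treat such cases.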
The web page snapshot acquisition module 204 is adapted so that, when the URL of the search result is a failed URL, the server looks up the matching web page snapshot in the database and returns it to the browser.
Specifically, if the browser side judges that the URL of the accessed search result is a failed URL, the browser sends to the server a snapshot acquisition request corresponding to the URL of the search result, and the server looks up, in the web page snapshot database, the web page snapshot matching the snapshot acquisition request and returns it to the browser;
If the server side judges that the URL of the accessed search result is a failed URL, the server directly looks up the matching web page snapshot in the web page snapshot database and returns it to the browser.
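The server-side variant of this flow can be sketched as below: judge the URL, and return the stored snapshot only when the URL has failed. The function name, the dictionary standing in for the web page snapshot database, the injected `fetch_status` callable, and the 4xx/5xx failure rule are all assumptions for illustration.

```python
from typing import Callable, Dict, Optional


def resolve_search_result(
    url: str,
    fetch_status: Callable[[str], int],
    snapshots: Dict[str, str],
) -> Dict[str, Optional[str]]:
    """Decide whether the browser should load the live page or a snapshot.

    snapshots maps each URL to its stored web page code.
    """
    status_code = fetch_status(url)
    if status_code < 400:
        # The URL is still valid, so the browser can load it directly.
        return {"source": "live", "url": url, "page_code": None}
    # Failed URL: look up the matching snapshot (None if none was saved).
    return {"source": "snapshot", "url": url, "page_code": snapshots.get(url)}
```

In the browser-side variant, the same lookup would instead be triggered by the snapshot acquisition request that the browser sends after making the judgment itself.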
Because the system embodiment of Fig. 2 is substantially similar to the method embodiment of Fig. 1, its description is relatively brief; for the relevant details, refer to the description of the method embodiment.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and that the above description of a specific language is given in order to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided herein. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, the inventive aspects lie in fewer than all the features of a single embodiment disclosed above. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components in an embodiment may be combined into one module or unit or component, and may furthermore be divided into a plurality of submodules or subunits or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all the features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all the processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include some features but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the favorites-based search device according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.

Claims (8)

1. A failed-URL searching method, comprising:
Collecting website information from the browser favorites of multiple user devices and saving the website information to a database, the website information including a web page snapshot of each URL;
Receiving, by a browser, a search request and sending the search request to a server;
Retrieving, by the server, from the database the web pages related to the search request, forming search results from them, and returning the search results to the browser;
Displaying, by the browser, the search results;
Judging whether the URL of an accessed search result is a failed URL;
If the URL of the search result is a failed URL, looking up, by the server, the matching web page snapshot in the database and returning it to the browser.
2. The method of claim 1, wherein the web page snapshot is generated by the server obtaining and saving the code of the web page, or, when the server fails to obtain and save the code of the web page, by notifying the browser to upload the code of the corresponding web page.
3. The method of claim 1 or 2, wherein the step of judging whether the URL of an accessed search result is a failed URL comprises:
Sending, by the browser, the URL of the search result to the server;
Resolving, by the server, the URL of the search result, generating a response message, and returning it to the browser;
Parsing, by the browser, the response message and extracting the HTTP status code of the corresponding URL;
Judging, by the browser, according to the HTTP status code, whether the URL access request is an access request for a failed URL.
4. The method of claim 1 or 2, wherein the step of judging whether the URL of an accessed search result is a failed URL comprises:
Sending, by the browser, the URL of the search result to the server;
Resolving, by the server, the URL of the search result and extracting the HTTP status code for the corresponding URL;
Judging, by the server, according to the HTTP status code, whether the URL access request is an access request for a failed URL.
5. A failed-URL searching device, comprising:
A website information acquisition module, adapted to collect website information from the browser favorites of multiple user devices and save the website information to a database, the website information including a web page snapshot of each URL;
A search request receiving module, adapted to receive a search request and return search results according to the search request;
A website failure judging module, adapted to judge whether the URL of an accessed search result is a failed URL;
A web page snapshot acquisition module, adapted so that, when the URL of the search result is a failed URL, a server looks up the matching web page snapshot in the database and returns it to a browser;
Wherein the search request receiving module comprises:
A search request transmitting submodule located in the browser, adapted to receive the search request and send it to the server;
A search result return submodule located in the server, adapted to retrieve from the database the web pages related to the search request, form search results from them, and return the search results to the browser;
A search result display submodule located in the browser, adapted to display the search results.
6. The device of claim 5, wherein the web page snapshot is generated by the server obtaining and saving the code of the web page, or, when the server fails to obtain and save the code of the web page, by notifying the browser to upload the code of the corresponding web page.
7. The device of claim 5 or 6, wherein the website failure judging module comprises:
A first URL sending submodule located in the browser, adapted to send the URL of the search result to the server;
A response message return submodule located in the server, adapted to resolve the URL of the search result, generate a response message, and return it to the browser;
An HTTP status code acquisition submodule located in the browser, adapted to parse the response message and extract the HTTP status code of the corresponding URL;
A URL judging submodule located in the browser, adapted to judge, according to the HTTP status code, whether the URL access request is an access request for a failed URL.
8. The device of claim 5 or 6, wherein the website failure judging module comprises:
A second URL sending submodule located in the browser, adapted to send the URL of the search result to the server;
An HTTP status code acquisition submodule located in the server, adapted to resolve the URL of the search result and extract the HTTP status code for the corresponding URL;
A URL judging submodule located in the server, adapted to judge, according to the HTTP status code, whether the URL access request is an access request for a failed URL.
CN201210397984.2A 2012-10-18 2012-10-18 Inefficacy address searching method and apparatus Active CN102929984B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210397984.2A CN102929984B (en) 2012-10-18 2012-10-18 Inefficacy address searching method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210397984.2A CN102929984B (en) 2012-10-18 2012-10-18 Inefficacy address searching method and apparatus

Publications (2)

Publication Number Publication Date
CN102929984A true CN102929984A (en) 2013-02-13
CN102929984B CN102929984B (en) 2016-06-22

Family

ID=47644782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210397984.2A Active CN102929984B (en) 2012-10-18 2012-10-18 Inefficacy address searching method and apparatus

Country Status (1)

Country Link
CN (1) CN102929984B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102945259A (en) * 2012-10-18 2013-02-27 北京奇虎科技有限公司 Searching method and device based on favorites
CN103546830A (en) * 2013-10-28 2014-01-29 Tcl集团股份有限公司 Method and system for processing video address failure
CN103645968A (en) * 2013-12-02 2014-03-19 北京奇虎科技有限公司 Browser status restoration method and device
CN103796046A (en) * 2013-12-24 2014-05-14 Tcl集团股份有限公司 Video source address detection method and device
WO2014206069A1 (en) * 2013-06-28 2014-12-31 Tencent Technology (Shenzhen) Company Limited Method and apparatus for saving web page content
CN104504071A (en) * 2014-12-22 2015-04-08 北京奇虎科技有限公司 SE (search engine)-based web cache providing method and web search client and server
CN104915404A (en) * 2015-06-01 2015-09-16 安一恒通(北京)科技有限公司 Method and device for accessing invalid website
CN104965926A (en) * 2015-07-14 2015-10-07 安一恒通(北京)科技有限公司 Webpage providing method and device
CN105872090A (en) * 2016-05-27 2016-08-17 四川长虹电器股份有限公司 HTTP communication method based on extension status codes
CN106066850A (en) * 2016-05-30 2016-11-02 乐视控股(北京)有限公司 A kind of content processing method and device
CN106682223A (en) * 2017-01-04 2017-05-17 上海智臻智能网络科技股份有限公司 Method and device for detecting data validity and method and device for intelligent interaction
CN106919600A (en) * 2015-12-25 2017-07-04 青岛海信移动通信技术股份有限公司 One kind failure network address access method and terminal
CN109740076A (en) * 2018-12-28 2019-05-10 北京字节跳动网络技术有限公司 Webpage display process and device
CN111444408A (en) * 2020-03-26 2020-07-24 腾讯科技(深圳)有限公司 Network search processing method and device and electronic equipment
CN113282817A (en) * 2021-05-31 2021-08-20 武汉野途电子商务有限公司 Webpage content intelligent collection processing method and system based on webpage search engine data analysis and computer storage medium
CN116389572A (en) * 2023-03-09 2023-07-04 数影星球(杭州)科技有限公司 Web site downloading redirection method and system based on browser

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101957818A (en) * 2009-07-13 2011-01-26 北京搜狗科技发展有限公司 Method and system for collecting webpages in batches
US20110060727A1 (en) * 2009-09-10 2011-03-10 Oracle International Corporation Handling of expired web pages
CN102945259A (en) * 2012-10-18 2013-02-27 北京奇虎科技有限公司 Searching method and device based on favorites

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
小痛: "百度收藏 让我的网络收藏更实在", 《电脑迷》 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102945259A (en) * 2012-10-18 2013-02-27 北京奇虎科技有限公司 Searching method and device based on favorites
CN102945259B (en) * 2012-10-18 2016-06-22 北京奇虎科技有限公司 A kind of searching method based on collection and searcher
WO2014206069A1 (en) * 2013-06-28 2014-12-31 Tencent Technology (Shenzhen) Company Limited Method and apparatus for saving web page content
CN103546830A (en) * 2013-10-28 2014-01-29 Tcl集团股份有限公司 Method and system for processing video address failure
CN103546830B (en) * 2013-10-28 2017-08-08 Tcl集团股份有限公司 A kind of processing method and system of video address failure
CN103645968B (en) * 2013-12-02 2017-03-15 北京奇虎科技有限公司 A kind of browser status restored method and device
CN103645968A (en) * 2013-12-02 2014-03-19 北京奇虎科技有限公司 Browser status restoration method and device
CN103796046A (en) * 2013-12-24 2014-05-14 Tcl集团股份有限公司 Video source address detection method and device
CN103796046B (en) * 2013-12-24 2018-08-31 Tcl集团股份有限公司 A kind of video source address detection method and device
CN104504071A (en) * 2014-12-22 2015-04-08 北京奇虎科技有限公司 SE (search engine)-based web cache providing method and web search client and server
CN104915404A (en) * 2015-06-01 2015-09-16 安一恒通(北京)科技有限公司 Method and device for accessing invalid website
CN104965926A (en) * 2015-07-14 2015-10-07 安一恒通(北京)科技有限公司 Webpage providing method and device
CN104965926B (en) * 2015-07-14 2019-03-26 安一恒通(北京)科技有限公司 Webpage providing method and device
CN106919600A (en) * 2015-12-25 2017-07-04 青岛海信移动通信技术股份有限公司 One kind failure network address access method and terminal
CN105872090B (en) * 2016-05-27 2019-05-07 四川长虹电器股份有限公司 Http communication method based on extended mode code
CN105872090A (en) * 2016-05-27 2016-08-17 四川长虹电器股份有限公司 HTTP communication method based on extension status codes
CN106066850A (en) * 2016-05-30 2016-11-02 乐视控股(北京)有限公司 A kind of content processing method and device
CN106682223A (en) * 2017-01-04 2017-05-17 上海智臻智能网络科技股份有限公司 Method and device for detecting data validity and method and device for intelligent interaction
CN109740076A (en) * 2018-12-28 2019-05-10 北京字节跳动网络技术有限公司 Webpage display process and device
CN111444408A (en) * 2020-03-26 2020-07-24 腾讯科技(深圳)有限公司 Network search processing method and device and electronic equipment
CN111444408B (en) * 2020-03-26 2021-09-14 腾讯科技(深圳)有限公司 Network search processing method and device and electronic equipment
CN113282817A (en) * 2021-05-31 2021-08-20 武汉野途电子商务有限公司 Webpage content intelligent collection processing method and system based on webpage search engine data analysis and computer storage medium
CN113282817B (en) * 2021-05-31 2022-08-23 喀斯玛(北京)科技有限公司 Webpage content collection processing method and processing system
CN116389572A (en) * 2023-03-09 2023-07-04 数影星球(杭州)科技有限公司 Web site downloading redirection method and system based on browser
CN116389572B (en) * 2023-03-09 2024-01-30 数影星球(杭州)科技有限公司 Web site downloading redirection method and system based on browser

Also Published As

Publication number Publication date
CN102929984B (en) 2016-06-22

Similar Documents

Publication Publication Date Title
CN102929984A (en) Website failure searching method and device
CN102945259A (en) Searching method and device based on favorites
CN102929985A (en) Method and system for displaying collected webpage
US20200106850A1 (en) System and method for mobile application deep linking
US9646100B2 (en) Methods and systems for providing content provider-specified URL keyword navigation
CN102937981A (en) Webpage representing system and method
CN100367276C (en) Method and appts for searching within a computer network
CN101147145B (en) Embedded web-based management method
US20080228920A1 (en) System and method for resource aggregation and distribution
EP2724251B1 (en) Methods for making ajax web applications bookmarkable and crawlable and devices thereof
US9223895B2 (en) System and method for contextual commands in a search results page
US20040205076A1 (en) System and method to automate the management of hypertext link information in a Web site
CN104063460A (en) Method and device for loading webpage in browser
CN102970284A (en) User information processing method and server
CN102761532A (en) Information processing system and method for network video
CN110266661A (en) A kind of authorization method, device and equipment
CN103258056B (en) Process the method for style design table, server, client and system
CN105939313A (en) State code redirecting method and device
CN102855334A (en) Browser and method for acquiring domain name system (DNS) resolving data
US20090089245A1 (en) System and method for contextual commands in a search results page
CN104065736A (en) URL redirection method, device, and system
CN1960371B (en) Method and system for accessing file of Web application program
EP2711852A1 (en) Methods and systems for providing content provider-specified URL keyword navigation
CN101727471A (en) Website content retrieval system and method
CN103618742A (en) Method and system for acquiring sub domain names and webmaster permission verification method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee after: Beijing Qizhi Business Consulting Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240111

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Beijing Qizhi Business Consulting Co.,Ltd.