Summary of the invention
The technical problem to be solved by the present invention is to provide a website navigation method and system that address two shortcomings of existing website navigation: the classification hierarchy has too many levels, and it is difficult to filter and extract information quickly and effectively from a large volume of data.
To solve the above technical problem, the present invention provides a website navigation method, comprising:
providing a plurality of sorting rules for the user to select from;
periodically reading a website list;
sorting the websites in the website list according to each of the sorting rules offered to the user, wherein, when the sorting rule is to sort by the degree of change in website home page content, the sorting step specifically comprises: fetching the home page content of the website; filtering the HTML tags out of the content and separating the remaining content into segments; comparing the processing result with the most recently saved data, calculating a change rate, and saving the processing result; and sorting the websites by change rate from high to low; and
displaying, according to the sorting rule selected by the user, the corresponding sorted results under the first-level categories.
Wherein, when the sorting rule is to sort by the PageRank (PR) value calculated by Google, the following sorting steps are performed: looking up the PageRank value of each website according to its URL in the website list; and sorting the websites by PageRank value from high to low.
Wherein, when the sorting rule is to sort by Alexa rank, the following sorting steps are performed: looking up the Alexa rank value of each website according to its URL in the website list; and sorting the websites by Alexa rank value from low to high.
Preferably, only the sorted results within a preset range are displayed.
The present invention also provides a website navigation system, comprising:
a sorting rule providing unit, configured to provide a plurality of sorting rules for the user to select from;
a storage unit, configured to store the website list and the sorted results;
a data reading unit, configured to periodically read the website list;
a sorting unit, configured to sort the websites in the website list according to each of the sorting rules offered to the user, wherein, when the sorting rule is to sort by the degree of change in website home page content, the sorting unit is specifically configured to: fetch the home page content of the website; filter the HTML tags out of the content and separate the remaining content into segments; compare the processing result with the most recently saved data, calculate a change rate, and save the processing result; and sort the websites by change rate from high to low; and
a display unit, configured to display, according to the sorting rule selected by the user, the corresponding sorted results under the first-level categories.
Wherein, when the sorting rule is to sort by the PageRank (PR) value calculated by Google, the sorting unit performs the following sorting steps: looking up the PageRank value of each website according to its URL in the website list; and sorting the websites by PageRank value from high to low.
Wherein, when the sorting rule is to sort by Alexa rank, the sorting unit performs the following sorting steps: looking up the Alexa rank value of each website according to its URL in the website list; and sorting the websites by Alexa rank value from low to high.
Preferably, the display unit displays only the sorted results within a preset range.
Compared with the prior art, the present invention has the following advantages:
First, the present invention periodically calculates, according to a given rule, a score for each website under a category of the navigation site, and then displays the most suitable websites first according to their scores. Because the arrangement of the websites changes periodically, and each sorting rule reflects a particular browsing need of the user, only a first-level navigation directory is required. Without adding classification levels, the websites the user wants to visit can be displayed at the front, so the user does not have to work through multiple levels of categories. In particular, websites whose scores change substantially can be recommended to the user according to how those websites are actually operating. Moreover, because each first-level category covers a relatively broad range, problems caused by users understanding website classification differently are avoided. The method of the present invention gives the user a better experience: the user can pick a website quickly while browsing the first-level directory of the navigation site.
Second, the present invention provides multiple sorting modes for the user to select from, meeting a variety of user needs. Moreover, the sorting modes are not limited to those described herein, namely sorting by degree of home page content change, sorting by the PageRank (PR) value calculated by Google, and sorting by Alexa rank; other sorting rules may be added according to users' differing needs, giving users more choices.
Third, when the sorted websites are displayed to the user, a different threshold is set for each sorting mode, and only websites whose scores fall within the threshold range are displayed, so that the user receives meaningful website recommendations under every sorting mode. Websites outside the threshold range are omitted from the display because they offer little value as choices.
Embodiment
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
The core idea of the present invention is as follows: only a first-level navigation directory is provided; multiple sorting modes are offered; each mode calculates a score for each website according to a given rule; after sorting, the websites the user wants to visit are pushed to the front of the display under the first-level category; and websites whose scores fall outside the threshold range are not displayed.
The present invention proposes a novel website navigation site. Fig. 1 is a flowchart of a user using the website navigation site of the present invention.
Step 101: the user accesses the website and selects a category to view. The home page of the website navigation site lists, by category, the websites under the first-level directory. The first-level directory classifies all websites shown on the page into the broadest categories, such as entertainment, news, and finance. After entering the home page, the user selects the type of content to view. To see a wider selection of websites within a category, the user can click the category keyword to enter the first-level directory. For example, after the user clicks "Entertainment", the page shows links to all entertainment websites included in the navigation site.
Step 102: the user judges whether the current page content suits his or her viewing habits. When the user visits the navigation site for the first time, the websites under the first-level directory are arranged according to the system default setting. The present invention provides three arrangement modes: by degree of home page content change, by the evaluated "importance" of the website, and by website traffic. If the arrangement shown on the page meets the user's browsing needs, step 103 is performed; otherwise, step 104 is performed.
Step 103: the user clicks a website link to jump to the selected website and view more content. If the user wishes to choose a website by degree of home page content change, by evaluated "importance", or by page traffic, and the current arrangement matches that need, the user clicks a website link directly, and can thus quickly pick out, from many websites, the one whose home page content has changed the most, the most valuable one, or the one with the highest traffic. When the number of websites under the first-level directory exceeds one page, the user can click the next-page link to continue viewing.
Step 104: the user selects another arrangement mode. If the current arrangement does not meet the user's browsing needs, the user can select another arrangement mode from a drop-down list. For example, if the current page is sorted by website traffic and the user wants to see websites whose home page content has changed substantially, the user can choose from the arrangement modes the navigation site provides. Instead of a drop-down list, the arrangement modes may also be offered as several ranking buttons or links placed directly on the page.
Step 105: the system displays the ranking results to the user according to the arrangement mode the user selected. In the example above, the system presents the websites arranged by degree of home page content change. After step 105, the process returns to step 102, where the user checks whether the newly displayed page meets his or her browsing purpose; the user can therefore change the sorting repeatedly. A page that meets the user's needs directly lists links to multiple websites with similar content, so the user can select the website to visit through the first-level navigation directory alone.
The above describes the front-end operation when a user visits the website navigation site. For the data processing on the back end of the site, refer to Fig. 2, which is a system processing flowchart of the website navigation method of the present invention.
Step 201: periodically read the list of websites to be processed. The website list contains the URLs of all websites included in the navigation site. The back-end system reads the website list periodically (typically once a day) and updates the scores of the listed websites.
Step 202: judge whether the PR values of the listed websites have been updated. PR (PageRank) is a method used by the Google search engine to evaluate the "importance" of a web page. After taking into account all other factors, such as the Title tag and the Keywords tag, Google adjusts the results by PR so that pages with greater "importance" rank higher in the search results, improving the relevance and quality of the results. Accordingly, the higher the score (the PR value) that Google calculates for a website from multiple indicators, the more valuable the website is considered to be. One sorting mode that the present invention offers to the user sorts by the PR value calculated by Google. If someone has already updated a website's PR value by other means before the system's automatic update, the system judges that the PR value has been updated and proceeds to step 205; if the PR value has not yet been updated that day, step 203 is performed and the system updates the PR value automatically.
Step 203: query the PR value according to the website URL. On a website that provides a query service, such as the Google PageRank (PR value) online query at http://www.123cha.com/google_pagerank/, entering the URL of the website to be queried returns the PR value of that website. As noted above, the higher the value found, the more valuable the website. Every day, the system queries the PR value of each website included in the website list.
Step 204: if the query succeeds, record the latest PR value in the database and set the update status to success; if the query fails because of a system failure, a network interruption, or a similar cause, set the update status to failure. When a query fails, several further attempts (for example, three) may be made; if none succeeds, the query is abandoned and the status is recorded as failure. After step 204 is completed, step 205 is performed.
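The retry behavior of step 204 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `query_with_retry`, the `lookup` callable, and the status strings are hypothetical, and the sketch assumes that "three" means three further attempts after the first failed query, which is only one reading of the text.

```python
from typing import Callable, Optional, Tuple


def query_with_retry(
    lookup: Callable[[str], int],
    url: str,
    max_retries: int = 3,  # assumption: retries allowed after the first attempt
) -> Tuple[Optional[int], str]:
    """Query a score for `url`; return (value, update_status)."""
    for _ in range(1 + max_retries):
        try:
            # `lookup` stands in for whatever service returns the PR value.
            return lookup(url), "success"
        except Exception:
            # System failure, network interruption, and so on: try again.
            continue
    # Every attempt failed: give up and record the failure status.
    return None, "failure"
```

The same helper would apply unchanged to the Alexa query in step 207, since both steps share the success and failure bookkeeping.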
Step 205: judge whether the Alexa rank values of the listed websites have been updated. Another sorting mode that the present invention offers to the user sorts by Alexa rank. Alexa is a well-known website that publishes worldwide website rankings and also provides search and category navigation; the Alexa rank is currently a frequently cited indicator of a website's traffic. Unlike the PR value, the smaller the Alexa rank value, the greater the traffic of the corresponding website. In the present invention, if someone has already updated a website's Alexa rank by other means before the system's automatic update, the system judges that the Alexa rank has been updated and proceeds to step 208; if the Alexa rank has not yet been updated that day, step 206 is performed and the system updates it automatically.
Step 206: query the Alexa rank value according to the website URL. As above, on a website that provides a query service, such as the online query at http://www.123cha.com/alexa/, entering the URL of the website to be queried returns the Alexa rank value of that website. Every day, the system queries the Alexa rank value of each website included in the website list.
Step 207: as above, if the query succeeds, record the latest Alexa rank value in the database and set the update status to success; if the query fails because of a system failure, a network interruption, or a similar cause, set the update status to failure. When a query fails, several further attempts (for example, three) may be made; if none succeeds, the query is abandoned and the status is recorded as failure. After step 207 is completed, step 208 is performed.
Step 208: judge whether the degree of home page change of the listed websites has been calculated. If it has, step 215 is performed; if not, step 209 is performed. The third sorting mode provided by the present invention is the sorting by degree of website home page content change; this method calculates the daily change in each website's home page.
Step 209: read the home page content. Each day the system can fetch the home page content of a website with an existing web crawling tool; for a system built on the Java language, it can also use the programming interfaces that Java provides to access the home page directly by passing its URL and obtain the returned home page code. The system then judges whether the page was fetched successfully; if so, step 210 is performed; otherwise, step 214 is performed.
Step 210: filter the HTML tags. After the returned home page code is obtained, a predefined set of HTML tags is used to find and remove all HTML code, that is, to filter out the HTML markup.
Step 211: remove the spaces from the content that remains after the HTML tags are filtered out, and separate the content into segments at punctuation marks.
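Steps 210 and 211 can be sketched as below. This is a simplified illustration under stated assumptions: the text describes a predefined HTML tag set, whereas this sketch substitutes a single regular expression that strips anything shaped like a tag (so the text inside script or style elements would survive). The name `extract_segments` and the particular set of punctuation marks are hypothetical choices.

```python
import re
from typing import List

# Assumption: a generic tag pattern replaces the predefined tag set.
TAG_RE = re.compile(r"<[^>]+>")
# Assumption: a small set of Western and Chinese punctuation marks.
PUNCT_RE = re.compile(r"[,.;:!?，。；：！？、]+")


def extract_segments(html: str) -> List[str]:
    """Strip tags, remove whitespace, and split the text at punctuation."""
    text = TAG_RE.sub("", html)        # step 210: filter out HTML tags
    text = re.sub(r"\s+", "", text)    # step 211: remove spaces
    # step 211: separate at punctuation, dropping empty pieces
    return [seg for seg in PUNCT_RE.split(text) if seg]
```

For example, `extract_segments("<p>Hello, world.</p> <b>New item!</b>")` yields `["Hello", "world", "Newitem"]`. Removing spaces merges English words, which matters less for the Chinese-language pages the method appears to target.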
Step 212: read the most recent data, compare, and calculate the change rate. The most recent data is the home page content saved at the last successful update. For example, if yesterday's update succeeded, the most recent data is yesterday's data; if yesterday's update failed, the most recent data is the data from the last update that did succeed, such as that of the day before yesterday. First, each separated segment is searched for in the saved most recent content: if it is found, that content has not changed; if it is not found, that content is new or modified. Content that was deleted relative to the most recent data is not counted. Then, based on the line-by-line comparison, the number of changed lines is divided by today's total number of lines to obtain today's change rate.
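The change-rate calculation of step 212 can be sketched as follows, treating each separated segment as one "line". The function name `change_rate` is hypothetical, and returning 0.0 for an empty page is an assumption, since the text does not say how an empty home page is handled.

```python
from typing import Iterable, List


def change_rate(today: List[str], previous: Iterable[str]) -> float:
    """Fraction of today's segments that are absent from the saved data."""
    if not today:
        return 0.0  # assumption: an empty page counts as unchanged
    saved = set(previous)
    # A segment not found in the saved content is new or modified.
    changed = sum(1 for seg in today if seg not in saved)
    # Deleted segments (in `previous` only) are deliberately not counted.
    return changed / len(today)
```

With `today = ["a", "b", "c", "d"]` and `previous = ["a", "b", "x"]`, two of today's four segments are new, so the rate is 0.5; the deleted segment `"x"` does not affect the result.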
Step 213: save the filtered and separated home page content for use in the next comparison.
Step 214: as above, if the preceding operations succeed, record the latest change rate in the database and set the update status to success; if the operation fails because of a system failure, a network interruption, or a similar cause, set the update status to failure. When an operation fails, several further attempts (for example, three) may be made; if none succeeds, the operation is abandoned and the status is recorded as failure.
The multiple sorting modes that the present invention offers for the user to select from meet a variety of user needs. The sorting modes are not limited to those described above, however; other sorting rules may be added according to users' differing needs, giving users more choices.
Step 215: sort according to the different sorting modes, based on the scores obtained. After the back-end system has updated the website scores as described above, it hands the sorting operation over to the database. When the user selects a sorting mode, the system accesses the database, and the database returns the website ranking for that sorting mode, based on the updated scores, for display to the user. To save system resources, the database sorts only once a day and saves the sorted results, which are displayed directly when a user visits.
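The once-a-day sorting described in step 215 amounts to caching the sorted result per day. The sketch below illustrates that idea in application code rather than in the database, which is where the text places it; the class name `DailyRankingCache` and its interface are hypothetical.

```python
import datetime
from typing import Any, Callable, Optional


class DailyRankingCache:
    """Recompute a ranking at most once per calendar day."""

    def __init__(self, compute: Callable[[], Any]) -> None:
        self._compute = compute  # the expensive sorting operation
        self._day: Optional[datetime.date] = None
        self._result: Any = None

    def get(self, today: datetime.date) -> Any:
        # Sort only when no ranking has been saved for `today`.
        if self._day != today:
            self._result = self._compute()
            self._day = today
        # Otherwise serve the saved ranking directly.
        return self._result
```

Passing the date in explicitly, instead of calling `datetime.date.today()` inside `get`, keeps the sketch easy to exercise with fixed dates.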
For sorting by the PR value calculated by Google, the websites are arranged within each first-level category in descending order of PR value, with the website having the highest PR value first. For sorting by Alexa rank, the websites are arranged within each first-level category in ascending order of Alexa rank value, with the website having the smallest Alexa rank value first. For sorting by degree of home page content change, the websites are arranged within each first-level category in descending order of change rate, with the website having the highest change rate first.
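The three orderings can be sketched as a small table of sort keys applied within each first-level category. The dictionary field names (`category`, `pr`, `alexa`, `change_rate`) and the rule names are hypothetical; only the sort directions come from the text.

```python
from collections import defaultdict
from typing import Any, Dict, List

Site = Dict[str, Any]

# rule name -> (score field, sort descending?)
SORT_RULES = {
    "pagerank": ("pr", True),              # highest PR value first
    "alexa": ("alexa", False),             # smallest Alexa rank value first
    "change_rate": ("change_rate", True),  # largest change rate first
}


def rank_by_category(sites: List[Site], rule: str) -> Dict[str, List[Site]]:
    """Group sites by first-level category and sort each group by `rule`."""
    field, descending = SORT_RULES[rule]
    grouped: Dict[str, List[Site]] = defaultdict(list)
    for site in sites:
        grouped[site["category"]].append(site)
    return {
        category: sorted(items, key=lambda s: s[field], reverse=descending)
        for category, items in grouped.items()
    }
```

Keeping the rules in a table means that adding a further sorting rule only requires one more entry in `SORT_RULES`.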
The periodically updated sorted results are displayed under the first-level directory according to the user's selection or the default setting. If a category lists too many websites, the websites ranked near the end are of little value and would require multiple pages to display, so a threshold is usually set: only meaningful websites are recommended to the user, and websites ranked beyond the threshold are not displayed. For example, for the PR value the threshold may be set to 2, so that websites with a PR value below 2 are not displayed; for the Alexa rank value the threshold may be set to 50, so that websites ranked after 50 are not displayed.
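The threshold filtering can be sketched with the two example values from the text. The field names and the function name `filter_for_display` are hypothetical, and because the text gives no threshold for the change-rate mode, the sketch leaves that mode unfiltered.

```python
from typing import Any, Dict, List

Site = Dict[str, Any]

PR_MIN = 2      # example from the text: hide sites with a PR value below 2
ALEXA_MAX = 50  # example from the text: hide sites ranked after 50


def filter_for_display(sites: List[Site], rule: str) -> List[Site]:
    """Keep only the sites that fall within the threshold for `rule`."""
    if rule == "pagerank":
        return [s for s in sites if s["pr"] >= PR_MIN]
    if rule == "alexa":
        return [s for s in sites if s["alexa"] <= ALEXA_MAX]
    # Assumption: no threshold is stated for other modes, so show everything.
    return list(sites)
```

The filter preserves the input order, so it can be applied to an already sorted list without disturbing the ranking.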
As can be seen from the above, because the arrangement of websites in the present invention changes periodically, and each sorting rule reflects a particular browsing need of the user, only a first-level navigation directory is required. Without adding classification levels, the websites the user wants to visit can be displayed at the front, so the user does not have to work through multiple levels of categories. In particular, websites whose scores change substantially can be recommended to the user according to how those websites are actually operating. Moreover, because each first-level category covers a relatively broad range, problems caused by users understanding website classification differently are avoided. The method of the present invention gives the user a better experience: the user can pick a website quickly while browsing the first-level directory of the navigation site.
The present invention also provides a website navigation system. Fig. 3 is a structural diagram of the website navigation system of the present invention. The system comprises a storage unit 301, a data reading unit 302, a sorting unit 303, and a display unit 304.
Storage unit 301 is configured to store data such as the website list, the score and update status of each website, and the sorted results. The website list contains the URLs of all websites included in the navigation site; data reading unit 302 reads the website list periodically so that the scores of the listed websites are updated. The score of a website is calculated according to a given rule and is used to sort the websites by score so that the most suitable websites are placed first. The update status indicates whether the periodic update of a score in the website list has been carried out, and takes one of two values, success or failure. The sorted results are the website arrangements obtained by sorting the websites in the website list under the different sorting modes.
Data reading unit 302 is configured to periodically read the website list from storage unit 301, typically once a day, and then to trigger sorting unit 303 to sort the URLs that were read. The arrangement of websites in the present invention therefore changes periodically.
Sorting unit 303 is configured to sort the websites included in the website list according to the sorting rules offered to the user. The present invention provides three arrangement modes: sorting by the PR value calculated by Google, sorting by Alexa rank, and sorting by degree of home page content change. For the first two modes, the PR value or the Alexa rank value can be obtained directly by querying a website that provides a query service with the website's URL; the websites are then arranged by PR value from high to low, or by Alexa rank value from low to high, because a higher PR value indicates a more valuable website and a smaller Alexa rank value indicates higher traffic.
For the third sorting mode, the home page content of each website is first fetched periodically with an existing web crawling tool; for a system built on the Java language, the programming interfaces that Java provides can also be used to access the home page directly by passing its URL and obtain the returned home page code. A predefined set of HTML tags is then used to find and remove all HTML code, that is, to filter out the HTML markup. The spaces are removed from the remaining content, and the content is separated into segments at punctuation marks. The separated content is compared line by line with the most recent data to obtain the number of lines changed today, which is divided by today's total number of lines to obtain today's change rate. After the change rate of the home page has been calculated, the filtered and separated content is saved in storage unit 301 for use in the next comparison. When sorting, the websites are arranged by change rate from high to low; the higher the change rate, the nearer the front the website is placed.
Sorting unit 303 periodically updates the scores and re-sorts the websites for all three sorting modes, and saves the sorted results in storage unit 301, so that the latest sorted results of the day are displayed to the user. After each score update operation finishes, the latest score and the update status are recorded in storage unit 301. All sorting is performed under the first-level directory of the navigation site, that is, a ranking for the entertainment category, a ranking for the sports category, and so on. Because the arrangement of websites changes periodically, and each sorting rule reflects a particular browsing need of the user, the present invention requires only a first-level navigation directory; without adding classification levels, the websites the user wants to visit can be displayed first, giving the user a better experience. In addition, the sorting modes are not limited to those described herein; other sorting rules may be added according to users' differing needs, giving users more choices.
Display unit 304 is configured to display the corresponding sorted results produced by sorting unit 303 according to the sorting rule selected by the user or the default setting. After the user visits the navigation site and views a category, one set of sorted results is first displayed under the first-level categories according to the default setting; the user can then, depending on his or her browsing purpose, choose among the sorting modes the system provides in order to select a website. Different thresholds are also set for the sorted results according to the sorting mode: because the websites ranked near the end are of little value and would require multiple pages to display, usually only meaningful websites are recommended to the user, and websites ranked beyond the threshold are not displayed.
The website navigation method and system provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is intended only to aid understanding of the method of the present invention and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific embodiments and to the scope of application. In summary, the content of this description should not be construed as limiting the present invention.