Summary of the invention
The technical problem to be solved by this application is to provide a website navigation page generation method and apparatus that address the inability to display the websites a user visits most often across multiple categories.
To solve this problem, the application discloses a website navigation page generation method, comprising:
A data acquisition step: obtaining the user's URL browsing records from the browsing history;
A classification step: classifying the URLs in the records according to URL classification rules, which comprises assigning the URLs in the records to preset URL categories;
A URL merging step: merging, according to a merge URL list, the URLs in each URL category that are related to the merge URL list;
A statistics step: counting, according to the records, the access frequency of each URL category. The statistics step comprises: step S1, counting the access frequency of each URL according to the records; and step S2, accumulating the access frequencies of the URLs within each category to obtain the access frequency of each category;
A sorting step: sorting the URL categories by the user's counted access frequency for each URL category;
A position allocation step: selecting the top-ranked URL categories and allocating them to assigned positions;
A URL display step: selecting, according to a rule, URLs in each URL category that has been allocated a position, and placing them in the corresponding position for display.
Preferably, the frequency is defined as follows: all of the user's access records for one URL within a given period of time together count as a fixed value.
Preferably, the position allocation step comprises:
According to the ranking of the URL categories, allocating one position in the tab page to each of the top-ranked URL categories in order;
Or, based on the share of the access frequency held by each URL category, allocating multiple positions in the tab page correspondingly according to a rule.
Preferably, the method further comprises:
A category adding step: when the total access frequency of all clients for URLs related to an event is greater than a threshold, adding the event to the classification rules as a new category.
Preferably, in the URL display step:
For each URL category, the URLs in that category are sorted by access frequency, and one or more of the top-ranked URLs are placed in the corresponding position for display.
Preferably, before the URL display step the method further comprises:
Step A1: counting, according to the records, the number of visits to each URL and/or the access modes used;
Step A2: sorting the URLs according to their counted numbers of visits and/or access modes.
Preferably, after the statistics step the method further comprises:
A merge URL list updating step: when the access frequency of a URL is greater than a threshold, adding that URL to the merge URL list.
Preferably, after the sorting by access frequency of the URLs within the URL category, the method further comprises:
A URL deletion step: deleting from the sorted URLs, according to a blacklist, those URLs that belong to the blacklist;
And/or a blacklist removal step: for a URL in the blacklist set by the user, when the user's access frequency for that URL within a period of time is greater than a threshold, deleting it from the blacklist.
Correspondingly, the application also discloses a website navigation page generating apparatus, comprising:
A data acquisition module, configured to obtain the user's URL browsing records from the browsing history;
A classification module, configured to classify the URLs in the records according to URL classification rules, which comprises assigning the URLs in the records to preset URL categories;
A URL merging module, configured to merge, according to a merge URL list, the URLs in each URL category that are related to the merge URL list;
A statistics module, configured to count, according to the records, the access frequency of each URL category. The statistics module comprises: a first statistics module, configured to count the access frequency of each URL according to the records; and a second statistics module, configured to accumulate the access frequencies of the URLs within each category to obtain the total access frequency of each category;
A sorting module, configured to sort the URL categories by the user's counted access frequency for each URL category;
A position allocation module, configured to select the top-ranked URL categories and allocate them to assigned positions;
A URL display module, configured to select, according to a rule, URLs in each URL category that has been allocated a position, and to place them in the corresponding position for display.
Preferably, after the classification module the apparatus further comprises:
A category adding module, configured to add an event to the classification rules as a new category when the total access frequency of all clients for URLs related to that event is greater than a threshold.
Compared with the prior art, the application has the following advantages:
The application first classifies each URL the user has visited according to URL classification rules, then counts access frequency category by category, and then allocates display positions to the top-ranked categories according to that frequency ranking. In this way the websites of the several categories the user visits most often can be presented to the user more objectively, more accurately and more comprehensively.
Detailed description of the embodiments
To make the above objects, features and advantages of the application clearer, the application is described in further detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, which shows a schematic flow chart of an embodiment of the website navigation page generation method of the application, the method comprises:
Data acquisition step 110: obtaining the user's URL browsing records.
After the user opens the browser, the application obtains the user's URL browsing records from the browsing history and then calculates the URLs the user visits most often. In general, the preset number of URLs calculated is larger than the number displayed at the assigned positions. For example, each day after the user opens the browser, the browsing records of the previous 10 days are obtained from the browsing history and the 30 most visited URLs are calculated, while the assigned positions display the top 9 by default.
For example, all history records are generally stored in the user directory, such as C:\Users\[user name]\AppData\Roaming\360se\data\history.dat under Windows 7, in the form of a custom database.
Classification step 120: classifying the URLs in the records according to URL classification rules.
For example, URL categories such as sports, music, film and television, news and games can be preset, and the URLs obtained from the user's browsing records are then assigned to these categories according to a certain rule. For example, for each URL in the user's browsing records, the keyword set of the URL is first collected and then matched for similarity against the keyword set of each URL category; if the similarity is greater than a threshold, the URL belongs to that URL category.
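The keyword-matching rule described above can be sketched as follows. This is only an illustration: the source does not name a similarity measure, so Jaccard similarity is assumed here, and the keyword sets, the threshold value and the function names are hypothetical.

```python
# Hypothetical keyword sets; the source names the categories but not their keywords.
CATEGORY_KEYWORDS = {
    "sports": {"football", "basketball", "match", "league"},
    "music": {"song", "album", "singer", "playlist"},
    "news": {"headline", "report", "politics", "world"},
}


def jaccard(a, b):
    """Similarity of two keyword sets: |a & b| / |a | b| (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(url_keywords, threshold=0.2):
    """Return every category whose similarity to the URL's keywords exceeds the threshold."""
    return [
        name
        for name, keywords in CATEGORY_KEYWORDS.items()
        if jaccard(url_keywords, keywords) > threshold
    ]


print(classify({"football", "league", "transfer"}))
```

A URL whose keywords overlap with no category stays unclassified under this sketch; how such URLs are handled is not stated in the source.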
After the classification step the method further comprises:
A URL merging step: merging, according to the merge URL list, the URLs in each URL category that are related to the merge URL list.
After the user's URL browsing records are obtained, for the URLs in the records of each URL category that are related to the merge URL list, the records of subordinate URLs that belong to the same URL in the merge URL list are merged under that URL according to the merge URL list.
For example, to prevent weibo.com and weibo.com/atme from being counted separately, weibo.com/atme is merged into weibo.com for counting. The merge URL list mechanism works as follows: weibo.com/atme is merged into weibo.com, that is, when the user visits weibo.com/atme the system credits the visit to weibo.com. The merge URL list is installed on the user's machine together with the browser installation files, and the list file can be updated at any time as the browser is upgraded.
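A minimal sketch of the merge lookup, assuming that "related to the merge URL list" means that a merge-list entry is a path prefix of the visited URL. The helper names and the normalisation rules (dropping the scheme, a leading "www." and a trailing slash) are assumptions made for the example, not details from the source.

```python
from urllib.parse import urlsplit

MERGE_LIST = ["weibo.com"]  # hypothetical list content


def normalize(url):
    """Drop the scheme, a leading 'www.' and any trailing slash."""
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return (host + parts.path).rstrip("/")


def merge_target(url, merge_list=MERGE_LIST):
    """Return the longest merge-list entry that prefixes the URL, else the URL itself."""
    key = normalize(url)
    matches = [m for m in merge_list if key == m or key.startswith(m + "/")]
    return max(matches, key=len) if matches else key


print(merge_target("http://weibo.com/atme"))
```

Choosing the longest matching entry lets a portal home page and one of its sections both appear in the list, with visits credited to the more specific entry.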
The merge URL list may be set manually or may be built by intelligent learning. For example, for a well-known portal such as NetEase, both the NetEase home page URL and the NetEase News URL can be put directly into the merge URL list on the basis of experience. If the NetEase News URL and its sub-links are visited directly, rather than through a link on the NetEase home page, those access records are merged under the NetEase News URL; if the NetEase News URL and its sub-links are not visited directly, the records are merged into the visits to the NetEase home page. Other situations can be handled on a similar principle. Alternatively, a URL is added to the merge URL list when its access frequency is very high. For example, suppose that initially the only NetEase-related URL in the merge URL list is the NetEase home page, so that all NetEase-related access records are merged into the NetEase home page URL. If a user's frequency of directly visiting the NetEase News URL is far greater than the frequency of visiting the NetEase home page, the engine adds NetEase News to the merge URL list through intelligent learning, and from then on the records of direct visits to NetEase News are merged under the NetEase News URL.
Statistics step 130: counting, according to the records, the access frequency of each URL category.
The statistics step comprises:
Step S1: counting the access frequency of each URL according to the records.
In this step, the number of visits to each URL and/or the access modes used can also be counted according to the records.
For the URL browsing records obtained above, the access frequency of each URL is counted. Here the frequency is defined as follows: all of the user's access records for one URL within a given period of time together count as a fixed value. For example, if the user visited a URL within the period, the frequency is recorded as 1; if the user did not visit the URL within the period, the frequency is recorded as 0. As a further example, taking one day as the unit, no matter how many times a URL is visited within one day, the frequency recorded for the merged URL is 1; if no URL related to it was visited that day, the frequency recorded for the merged URL is 0. The frequency can of course be counted in the same way using 12 hours, 6 hours and so on as the unit; the period only needs to span a certain range. The number of visits to each URL, the access modes and so on can also be counted.
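The per-period frequency described in step S1 can be sketched as below, taking one day as the period. The input shape, a list of (URL, date) pairs, and the function name are assumptions for illustration; the source does not specify how the browsing records are stored.

```python
from collections import defaultdict
from datetime import date


def daily_frequency(visits):
    """Count each URL at most once per day, however often it was visited that day."""
    days_seen = defaultdict(set)
    for url, day in visits:
        days_seen[url].add(day)
    return {url: len(days) for url, days in days_seen.items()}


visits = [
    ("a.example", date(2024, 1, 1)),
    ("a.example", date(2024, 1, 1)),  # second visit on the same day adds nothing
    ("a.example", date(2024, 1, 2)),
    ("b.example", date(2024, 1, 2)),
]
print(daily_frequency(visits))
```

For a 12-hour or 6-hour unit, the same idea applies with a finer bucket key in place of the calendar date.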
Step S2: accumulating, according to the access frequency of each URL, the access frequencies of the URLs within each category to obtain the total access frequency of each category.
The access frequencies of all URLs in the same URL category are added together to obtain the access frequency of that URL category. For example, if within a period of time the user's access frequency for sports website A is 4 and for website B in the same category is 5, the user's access frequency for the sports category is 9. The access frequency of every URL category is counted in this way.
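A short sketch of step S2, followed by a ranking of the categories by their totals. The dictionaries and names are hypothetical; the figures for sites A and B follow the example above, while site M is added only to give a second category.

```python
from collections import Counter


def category_totals(url_frequency, url_category):
    """Sum per-URL access frequencies into per-category totals."""
    totals = Counter()
    for url, freq in url_frequency.items():
        totals[url_category[url]] += freq
    return totals


url_frequency = {"A": 4, "B": 5, "M": 3}
url_category = {"A": "sports", "B": "sports", "M": "music"}

totals = category_totals(url_frequency, url_category)
ranked = sorted(totals, key=totals.get, reverse=True)
print(dict(totals), ranked)
```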
In addition, the access frequency of each URL category and the number of visits made through each access mode can finally be obtained from the above counts of visits and access modes for each URL.
Sorting step 140: sorting the URL categories by the user's counted access frequency for each URL category.
For example, if the user's access frequency is 50 for the sports category, 28 for the music category and 90 for the film and television category, the order after sorting is: film and television, sports, music.
Position allocation step 150: selecting the top-ranked URL categories and allocating them to assigned positions.
Preferably, according to the ranking of the URL categories, one position in the tab page is allocated to each of the top-ranked URL categories in order.
For example, a newly opened blank page is used as the tab page and is divided into nine cells (that is, nine positions), each of which can correspond to one URL category. Once the ranking of the URL categories has been obtained by the preceding steps, the nine top-ranked URL categories can be allocated to these nine display positions. The order in which the nine top-ranked URL categories are allocated can be preset, for example row by row starting from the top row, from upper left to lower right; or column by column starting from the left column, from top to bottom; or the nine cells can be numbered and the categories placed by rank. Alternatively, a tab bar is added to a start page and the top-ranked URL categories are allocated in order to the positions of this tab bar.
Alternatively, multiple positions in the tab page can be allocated correspondingly according to a rule, based on the share of the access frequency held by each URL category.
For example, a newly opened blank page is used as the tab page and is divided into nine cells (that is, nine positions), each of which can correspond to one URL category. Among the top-ranked URL categories, if the access frequency of the first-ranked category accounts for more than 50% of the total access frequency, more positions are allocated to it, for example 4 positions; the second-ranked category accounts for only 20% and is allocated 2 positions; the third-ranked category accounts for 15% and is allocated 1 position; the fourth-ranked category accounts for 8% and is allocated 1 position; and the fifth-ranked category accounts for 5% and is allocated 1 position. Alternatively, a tab bar is added to a start page and the positions of this tab bar are allocated to the top-ranked URL categories in the same proportions.
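The source gives the resulting split (4, 2, 1, 1, 1) but not the exact allocation rule, so the sketch below assumes one possible rule: every top-ranked category first receives one cell, and each remaining cell then goes to the category with the highest frequency per cell already held. The function name, the category names and the 52% figure standing in for "more than 50%" are hypothetical.

```python
def allocate_slots(category_freq, total_slots=9):
    """Split display cells among categories roughly in proportion to frequency.

    category_freq: list of (category, frequency) pairs with positive frequencies.
    Assumed rule: one cell each, then extra cells to the highest frequency-per-cell.
    """
    ranked = sorted(category_freq, key=lambda item: item[1], reverse=True)[:total_slots]
    slots = {name: 1 for name, _ in ranked}
    for _ in range(total_slots - len(ranked)):
        best = max(ranked, key=lambda item: item[1] / slots[item[0]])[0]
        slots[best] += 1
    return slots


shares = [("film", 52), ("sports", 20), ("music", 15), ("news", 8), ("games", 5)]
print(allocate_slots(shares))
```

With these shares the assumed rule happens to reproduce the split in the example above; other rules, such as fixed percentage bands, would serve equally well.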
URL display step 160: selecting, according to a rule, URLs in each URL category that has been allocated a position, and placing them in the corresponding position for display.
Preferably, for each URL category, the URLs in that category are sorted by access frequency, and one or more of the top-ranked URLs are placed in the corresponding position for display.
The method further comprises step A1: counting, according to the records, the number of visits to each URL and/or the access modes used;
Step A2: sorting the URLs according to their counted numbers of visits and/or access modes.
Table 1 takes the 7-day URL browsing records of one URL category as an example.
| Website    | Day 1   | Day 2   | Day 3   | Day 4   | Day 5     | Day 6   | Day 7   |
| A website  | 1 time  | 1 time  | 1 time  | 1 time  | 1 time    | 1 time  | 1 time  |
| B website  | 4 times | 4 times |         |         | 4 times   | 4 times | 4 times |
| C website  |         |         | 2 times | 2 times |           |         |         |
| C1 website | 2 times | 2 times |         |         |           |         |         |
| D website  |         |         |         |         | 100 times |         |         |
Table 1
There are currently five URL entries, A, B, C, C1 and D, where C1 is a sub-URL of C.
URL A was visited only once per day during the past week, but it was visited on every day of that week;
URL B was visited on 5 days of the past week, 4 times on each of those days;
URL C was visited on 2 days of the past week, 2 times on each of those days;
URL C1 is a sub-URL of URL C and was visited on 2 days of the past week, 2 times on each of those days;
URL D was visited on only one day of the past week, 100 times on that day.
The merge URL list contains: URL A, URL B, URL C and URL D.
The records of URL C1 therefore need to be merged under the records of URL C, so that URL C was visited on 4 days of the past week, 2 times on each of those days.
The calculation takes the user's access frequency as the standard: each day on which the user visited a URL at least once scores 1 for that URL.
Each URL thus has a frequency value F:
F of URL A: 1+1+1+1+1+1+1 = 7, so F = 7
F of URL B: 1+1+0+0+1+1+1 = 5, so F = 5
F of URL C: 1+1+1+1+0+0+0 = 4, so F = 4
F of URL D: 0+0+0+0+1+0+0 = 1, so F = 1
Although URL D was visited 100 times in the week and URL A only 7 times, URL A has the higher F value.
The final ranking is therefore A, B, C, D, and the top-ranked URLs are placed in the assigned positions for display.
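The Table 1 calculation can be checked with a few lines of code. The data layout (one list of seven daily visit counts per URL) and the helper names are assumptions made for the example; the numbers come from Table 1.

```python
# Daily visit counts for days 1..7, taken from Table 1 (blank cells are 0).
TABLE_1 = {
    "A": [1, 1, 1, 1, 1, 1, 1],
    "B": [4, 4, 0, 0, 4, 4, 4],
    "C": [0, 0, 2, 2, 0, 0, 0],
    "C1": [2, 2, 0, 0, 0, 0, 0],
    "D": [0, 0, 0, 0, 100, 0, 0],
}
MERGE_INTO = {"C1": "C"}  # C1 is a sub-URL of C


def merged_counts(table, merge_into):
    """Add the daily counts of each sub-URL to the URL it is merged under."""
    merged = {}
    for url, counts in table.items():
        target = merge_into.get(url, url)
        previous = merged.get(target, [0] * len(counts))
        merged[target] = [p + c for p, c in zip(previous, counts)]
    return merged


def frequency(counts):
    """F value: the number of days with at least one visit."""
    return sum(1 for c in counts if c > 0)


merged = merged_counts(TABLE_1, MERGE_INTO)
f_values = {url: frequency(counts) for url, counts in merged.items()}
ranking = sorted(f_values, key=f_values.get, reverse=True)
print(f_values, ranking)
```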
Table 2 shows an example in which, within the same URL category, some URLs have the same access frequency and those URLs are then sorted by their number of visits.
| Website    | Day 1   | Day 2   | Day 3   | Day 4   | Day 5     | Day 6   | Day 7   |
| A website  | 1 time  | 1 time  | 1 time  | 1 time  | 1 time    | 1 time  | 1 time  |
| B website  | 4 times | 4 times |         |         | 4 times   | 4 times | 4 times |
| C website  |         |         | 2 times | 2 times | 2 times   |         |         |
| C1 website | 2 times | 2 times |         |         |           |         |         |
| D website  |         |         |         |         | 100 times |         |         |
Table 2
There are currently five URL entries, A, B, C, C1 and D, where C1 is a sub-URL of C.
URL A was visited only once per day during the past week, but it was visited on every day of that week;
URL B was visited on 5 days of the past week, 4 times on each of those days;
URL C was visited on 3 days of the past week, 2 times on each of those days;
URL C1 is a sub-URL of URL C and was visited on 2 days of the past week, 2 times on each of those days;
URL D was visited on only one day of the past week, 100 times on that day.
The merge URL list contains: URL A, URL B, URL C and URL D.
The records of URL C1 therefore need to be merged under the records of URL C, so that URL C was visited on 5 days of the past week, 2 times on each of those days.
The calculation takes the user's access frequency as the standard: each day on which the user visited a URL at least once scores 1 for that URL.
The number of visits to each URL is then totalled to give a count value C:
URL A: F is 1+1+1+1+1+1+1 = 7, so F = 7; C is 1+1+1+1+1+1+1 = 7, so C = 7
URL B: F is 1+1+0+0+1+1+1 = 5, so F = 5; C is 4+4+0+0+4+4+4 = 20, so C = 20
URL C: F is 1+1+1+1+1+0+0 = 5, so F = 5; C is 2+2+2+2+2+0+0 = 10, so C = 10
URL D: F is 0+0+0+0+1+0+0 = 1, so F = 1; C is 0+0+0+0+100+0+0 = 100, so C = 100
Although URL D was visited 100 times in the week and URL A only 7 times, URL A ranks first because its F value is greater than those of B, C and D. The F value of URL B is the same as that of URL C, but the C value of URL B is greater than that of URL C, so the final ranking is still A, B, C, D.
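The two-level ordering of Table 2 (frequency F first, visit count C as tie-breaker) can be sketched with a tuple sort key. The data are the Table 2 counts after C1 has been merged into C; the variable and function names are hypothetical.

```python
# Daily visit counts for days 1..7 from Table 2, with C1 already merged into C.
MERGED_TABLE_2 = {
    "A": [1, 1, 1, 1, 1, 1, 1],
    "B": [4, 4, 0, 0, 4, 4, 4],
    "C": [2, 2, 2, 2, 2, 0, 0],
    "D": [0, 0, 0, 0, 100, 0, 0],
}


def f_and_c(counts):
    """Return (F, C): days with at least one visit, and total visits."""
    return (sum(1 for c in counts if c > 0), sum(counts))


scores = {url: f_and_c(counts) for url, counts in MERGED_TABLE_2.items()}
# Tuples compare element by element, so C only matters when F is equal.
ranking = sorted(scores, key=lambda url: scores[url], reverse=True)
print(scores, ranking)
```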
When the access frequencies of several URLs are the same, the number of visits may be used as the first tie-breaker and the access mode as the second. In that case, when both the user's access frequency and the number of visits are the same for several URLs, those URLs are sorted by their counted access modes. The access modes include, for example, direct entry in the address bar, access through bookmarks, and opening by clicking a link on a page. For example, when the user's access frequency and number of visits are the same for several URLs, the URLs can be sorted first by the count of direct address-bar entries, or by the count of bookmark accesses, or by the count of page-link clicks. Furthermore, when the access frequency, the number of visits and the count of direct address-bar entries are all the same for several URLs, the URLs can then be sorted by the count of bookmark accesses. Other situations can be handled by analogy on the same principle. A priority can also be set among the access modes: for example, the URLs are sorted first by the count of direct address-bar entries; if ties remain, they are sorted by the count of bookmark accesses; and if ties still remain, they are sorted by the count of visits made by clicking page links.
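The priority chain described above maps naturally onto a tuple sort key. The sketch below assumes the priority order address bar, then bookmarks, then page links; the class, field names and sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UrlStats:
    url: str
    frequency: int   # F value
    visits: int      # C value
    typed: int       # visits by direct entry in the address bar
    bookmarked: int  # visits through bookmarks
    linked: int      # visits by clicking a page link


def sort_key(s):
    """Frequency first, then visit count, then access modes in priority order."""
    return (s.frequency, s.visits, s.typed, s.bookmarked, s.linked)


stats = [
    UrlStats("x.example", 5, 20, typed=3, bookmarked=10, linked=7),
    UrlStats("y.example", 5, 20, typed=8, bookmarked=2, linked=10),
    UrlStats("z.example", 5, 20, typed=8, bookmarked=5, linked=7),
]
ranked = [s.url for s in sorted(stats, key=sort_key, reverse=True)]
print(ranked)
```

Here all three URLs tie on frequency and visit count; y and z also tie on address-bar entries, so the bookmark count decides between them.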
After the sorting by access frequency of the URLs within the URL category, the method further comprises:
A URL deletion step: deleting from the sorted URLs, according to a blacklist, those URLs that belong to the blacklist. In general, after the user deletes a URL from the display page, the engine adds that URL to the blacklist. When the ranked list of the user's most visited URLs is subsequently obtained, if this URL appears in it, the URL is deleted from the ranking and each URL ranked after it moves up one place.
The method further comprises a blacklist removal step: for a URL in the blacklist set by the user, when the user's access frequency for that URL within a period of time is greater than a threshold, deleting it from the blacklist.
For a URL in the blacklist, if the user visits it frequently within a certain period (for example within 10 days), the engine evaluates it: when the user's access frequency for this URL within the period is greater than a threshold, the URL is deleted from the blacklist, and if it ranks high enough and meets the requirements, it can be placed in an assigned position for display.
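The two blacklist operations can be sketched as below. The function names, the sample URLs and the threshold of 5 are hypothetical; the source gives only the 10-day window as an example and no threshold value.

```python
def apply_blacklist(ranked_urls, blacklist):
    """Drop blacklisted URLs; later entries move up automatically."""
    return [url for url in ranked_urls if url not in blacklist]


def prune_blacklist(blacklist, recent_frequency, threshold):
    """Keep only blacklisted URLs whose recent access frequency is at or below the threshold."""
    return {url for url in blacklist if recent_frequency.get(url, 0) <= threshold}


blacklist = {"b.example", "d.example"}
# Assumed input: access frequency of each URL over the last 10 days.
recent_frequency = {"b.example": 8, "d.example": 1}

blacklist = prune_blacklist(blacklist, recent_frequency, threshold=5)
shown = apply_blacklist(["a.example", "b.example", "c.example", "d.example"], blacklist)
print(blacklist, shown)
```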
In addition, a position may directly display the URL with the highest access frequency among all clients in the corresponding URL category.
In addition, after the statistics step the method further comprises:
A merge URL list updating step: when the access frequency of a URL is greater than a threshold, adding that URL to the merge URL list.
In addition, the application further comprises a category adding step: when the total access frequency of all clients for URLs related to an event is greater than a threshold, adding the event to the classification rules as a new category.
For example, when an event attracts great attention on the network during a period of time, a large number of clients frequently browse the web pages of various websites related to that event, such as web pages related to the "2008 Beijing Olympic Games". When the access frequency of all terminals for this event is higher than the threshold, the web pages related to the "2008 Beijing Olympic Games" are made a separate category, and the web pages related to the event in the user's URL browsing records are assigned to this category.
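A minimal sketch of the category adding step, assuming that each client reports an access frequency per event and that these are simply summed across clients. The input shape, the function name and the threshold are hypothetical; how visits are attributed to an event is not specified in the source.

```python
def new_event_categories(frequency_by_client, threshold, existing_categories):
    """Return events whose summed access frequency across all clients exceeds the threshold."""
    totals = {}
    for per_event in frequency_by_client.values():
        for event, freq in per_event.items():
            totals[event] = totals.get(event, 0) + freq
    return sorted(
        event
        for event, total in totals.items()
        if total > threshold and event not in existing_categories
    )


frequency_by_client = {
    "client-1": {"2008 Beijing Olympic Games": 40, "local fair": 2},
    "client-2": {"2008 Beijing Olympic Games": 70},
}
added = new_event_categories(frequency_by_client, threshold=100,
                             existing_categories={"sports", "music"})
print(added)
```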
Referring to Fig. 2, which is a schematic structural diagram of an embodiment of the website navigation page generating apparatus of the application, the apparatus comprises:
A data acquisition module 410, configured to obtain the user's URL browsing records;
A classification module 420, configured to classify the URLs in the records according to URL classification rules;
A statistics module 430, configured to count, according to the records, the access frequency of each URL category;
A sorting module 440, configured to sort the URL categories by the user's counted access frequency for each URL category;
A position allocation module 450, configured to select the top-ranked URL categories and allocate them to assigned positions;
A URL display module 460, configured to select, according to a rule, URLs in each URL category that has been allocated a position, and to place them in the corresponding position for display.
The statistics module comprises:
A first statistics module, configured to count the access frequency of each URL according to the records;
A second statistics module, configured to accumulate, according to the access frequency of each URL, the access frequencies of the URLs within each category to obtain the total access frequency of each category.
After the classification module the apparatus further comprises:
A category adding module, configured to add an event to the classification rules as a new category when the total access frequency of all clients for URLs related to that event is greater than a threshold.
After the classification module the apparatus further comprises:
A URL merging module, configured to merge, according to the merge URL list, the URLs in each URL category that are related to the merge URL list.
Preferably, the URL display module is configured to sort, for each URL category, the URLs in that category by access frequency and to place one or more of the top-ranked URLs in the corresponding position for display.
Before the URL display module the apparatus further comprises:
A first submodule, configured to count, according to the records, the number of visits to each URL and/or the access modes used;
A second submodule, configured to sort the URLs according to their counted numbers of visits and/or access modes.
After the statistics module the apparatus further comprises:
A merge URL list updating module, configured to add a URL to the merge URL list when the access frequency of that URL is greater than a threshold.
After the sorting by access frequency of the URLs within the URL category, the apparatus further comprises:
A URL deletion module, configured to delete from the sorted URLs, according to a blacklist, those URLs that belong to the blacklist;
And/or a blacklist removal module, configured to delete a URL from the blacklist set by the user when the user's access frequency for that URL within a period of time is greater than a threshold.
Since the apparatus embodiment is basically similar to the method embodiment, its description is relatively brief; for the relevant parts, refer to the description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another.
The website navigation page generation method and apparatus provided by the application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the above description of the embodiments is intended only to help in understanding the method of the application and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the application, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the application.