CN103324645A - Method and device for recommending webpage - Google Patents

Info

Publication number
CN103324645A
Authority
CN
China
Prior art keywords
webpage
interest
user
keyword
degree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100808315A
Other languages
Chinese (zh)
Other versions
CN103324645B (en)
Inventor
王犇
何军
杨志峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shiji Guangsu Information Technology Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201210080831.5A priority Critical patent/CN103324645B/en
Publication of CN103324645A publication Critical patent/CN103324645A/en
Application granted granted Critical
Publication of CN103324645B publication Critical patent/CN103324645B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a method and a device for recommending a webpage. The method comprises the following steps: a click query log is acquired, the click query log comprising user IDs (identifiers), keywords and webpage IDs; the keyword information of each user ID is aggregated, and an interest model of the user ID is established; the webpage IDs of all the user IDs are aggregated, the keyword information in the webpage corresponding to each webpage ID is acquired, and an interest model of the webpage ID is established; the degrees of association between the user IDs and the webpage IDs are determined according to the interest models of the user IDs and the interest models of the webpage IDs; when a search result click command of a user is received and a wireless web search transcoding page is entered, a first preset number of webpage IDs are selected in descending order of their degree of association with the user ID, and the webpage corresponding to each selected webpage ID is recommended in the transcoding page. With the method and the device, the target webpage can be found rapidly.

Description

Webpage recommendation method and device
Technical field
The present invention relates to the field of data mining technology, and in particular to a webpage recommendation method and device.
Background
As the number of mobile internet users grows, more and more searches are performed from mobile terminals. To help users find the information they need, a wireless search engine usually provides, in the wireless web search transcoding page that the user has clicked into, some keywords related to that webpage for the user to click and query, or provides keywords related to the current query string for the user to click and query.
However, the current industry practice of offering related keywords for click-through queries while the user searches and browses a wireless web search transcoding page essentially narrows the search scope and improves search accuracy, helping the user obtain better search results. It still requires the user to select a query string, search again, and review the search results again before finding a webpage of interest, so the intermediate process is long.
Summary of the invention
In view of this, the present invention provides a webpage recommendation method that enables a target webpage to be found quickly.
To achieve the above object, the present invention provides a webpage recommendation method, comprising:
obtaining a click query log, the click query log comprising user IDs, keywords and webpage IDs;
aggregating the keyword information of each user ID and establishing an interest model of the user ID; aggregating the webpage IDs of all user IDs, obtaining the keyword information in the webpage corresponding to each webpage ID, and establishing an interest model of the webpage ID; determining the degree of association between a user ID and a webpage ID according to the interest model of the user ID and the interest model of the webpage ID;
when a search result click command of a user is received and a wireless web search transcoding page is entered, selecting a first preset number of webpage IDs in descending order of their degree of association with the user ID, and recommending the webpage corresponding to each selected webpage ID in the transcoding page.
The present invention also provides a webpage recommendation device, comprising: a log acquisition unit, a first analysis unit and a recommendation unit;
the log acquisition unit is configured to obtain a click query log, the click query log comprising user IDs, keywords and webpage IDs;
the first analysis unit is configured to aggregate the keyword information of each user ID in the click query log and establish an interest model of the user ID; to aggregate the webpage IDs of all user IDs in the click query log, obtain the keyword information in the webpage corresponding to each webpage ID, and establish an interest model of the webpage ID; and to determine the degree of association between a user ID and a webpage ID according to the interest model of the user ID and the interest model of the webpage ID;
the recommendation unit is configured to, when a search result click command of a user is received and a wireless web search transcoding page is entered, select a first preset number of webpage IDs in descending order of their degree of association with the user ID, and recommend the webpage corresponding to each selected webpage ID in the transcoding page.
As can be seen from the above technical solution, the present invention analyzes the click query log, establishes interest models of user IDs and interest models of webpage IDs, and determines the degree of association between user IDs and webpage IDs from those models. When a user clicks a search result and enters a wireless web search transcoding page, a first preset number of webpage IDs are selected in descending order of their degree of association with the user ID, and the webpage corresponding to each selected webpage ID is recommended in the transcoding page, so that the user can quickly find the target webpage.
Description of drawings
Fig. 1 is a flowchart of the webpage recommendation method according to an embodiment of the present invention;
Fig. 2 is a structural diagram of the webpage recommendation device according to an embodiment of the present invention.
Embodiment
To make the purpose, technical solution and advantages of the present invention clearer, the technical solution of the present invention is described in detail below with reference to the accompanying drawings and embodiments.
Referring to Fig. 1, Fig. 1 is a flowchart of the webpage recommendation method according to an embodiment of the present invention, which comprises the following steps:
Step 101: obtain a click query log, the click query log comprising user IDs, keywords and webpage IDs.
A click query log is a record of a user's search behavior made when the user queries information with a search engine. It may include information such as a user identifier (ID), a keyword and a webpage identifier (ID). One click query log entry may be recorded each time the user clicks a search result. For example, when a user searches for "way is seen", the search engine returns multiple search results; if user A clicks the webpage whose webpage ID is 1234, a click query log entry such as "user A, way is seen, 1234" may be recorded. In practice, when a user queries information with a search engine, the provider of the search service generally logs the user's search behavior. Here, the keyword is the keyword the user queried in the search engine, the webpage ID is the ID of the webpage the user clicked in the search results for that keyword, and each webpage has a unique webpage ID.
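The log entry described above can be sketched as a small record type. This is only an illustration: the comma-separated text layout and the names `ClickRecord` and `parse_click_line` are assumptions, since the specification states only which three fields an entry contains.

```python
from typing import NamedTuple


class ClickRecord(NamedTuple):
    """One click query log entry: who clicked, for which query, on which page."""
    user_id: str
    keyword: str
    page_id: str


def parse_click_line(line: str) -> ClickRecord:
    """Parse a 'user_id, keyword, page_id' line (assumed comma-separated layout)."""
    user_id, keyword, page_id = (field.strip() for field in line.split(","))
    return ClickRecord(user_id, keyword, page_id)


record = parse_click_line("user A, way is seen, 1234")
print(record.user_id, record.keyword, record.page_id)
```

A real log would likely carry further fields such as a timestamp, which the later time-window analysis would need; they are left out here because the specification does not list them.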
Step 102: aggregate the keyword information of each user ID and establish an interest model of the user ID; aggregate the webpage IDs of all user IDs, obtain the keyword information in the webpage corresponding to each webpage ID, and establish an interest model of the webpage ID; determine the degree of association between a user ID and a webpage ID according to the interest model of the user ID and the interest model of the webpage ID.
If a user is interested in the search results for a certain keyword, this indicates to some extent that the user is also interested in the keyword. The number of times the user clicks search results under each keyword can therefore serve as an index of the user's interest level in that keyword. Likewise, if a user is interested in the search results under a certain type of keyword, this indicates to some extent that the user is also interested in that keyword type, so the number of times the user clicks search results for each keyword type can serve as an index of the user ID's interest level in that keyword type. Accordingly, the interest model of a user ID can be established from the user ID's interest level in each keyword, from its interest level in each keyword type, or from both combined.
By the same reasoning, if a keyword occurs many times in the webpage corresponding to a webpage ID, this indicates to some extent that the content of that webpage is more relevant to the keyword. The occurrence count of each keyword in the webpage can therefore serve as an index of the webpage ID's interest level in that keyword. Likewise, if keywords of a certain type occur many times in the webpage corresponding to a webpage ID, the content of that webpage is likely to be more relevant to that keyword type, so the occurrence count of each keyword type in the webpage can serve as an index of the webpage ID's interest level in that keyword type. Accordingly, the interest model of a webpage ID can be established from the occurrence count of each keyword in the corresponding webpage, from the occurrence count of each keyword type in the corresponding webpage, or from both combined.
The methods of establishing the interest model of a user ID and the interest model of a webpage ID are described below.
First, the interest model can be established from the user ID's interest level in each keyword:
In this case, the interest model of the user ID includes only a first interest item, which relates to keywords. The first interest item may comprise multiple first interest sub-items, each representing the user ID's interest in one keyword and containing the keyword and the user ID's interest degree in that keyword;
Aggregating the keyword information of each user ID and establishing the interest model of the user ID may specifically comprise: aggregating all keywords queried by the user corresponding to the user ID, counting the number of webpage IDs the user clicked when querying each keyword, and determining the user ID's interest degree in the keyword from that number.
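A minimal sketch of this aggregation follows. The specification says only that the interest degree is determined from the click count; dividing by the user's largest count so that values fall in (0, 1] is an assumption made here for illustration, as is the function name `build_user_keyword_interest`.

```python
from collections import Counter, defaultdict


def build_user_keyword_interest(records):
    """Map each user ID to {keyword: interest degree}.

    records: iterable of (user_id, keyword, page_id) click log entries.
    Interest degree = clicks under the keyword / clicks under the user's
    most-clicked keyword (assumed normalisation, not from the specification).
    """
    clicks = defaultdict(Counter)
    for user_id, keyword, _page_id in records:
        clicks[user_id][keyword] += 1

    models = {}
    for user_id, counts in clicks.items():
        peak = max(counts.values())
        models[user_id] = {kw: n / peak for kw, n in counts.items()}
    return models


log = [
    ("user A", "way is seen", "1234"),
    ("user A", "way is seen", "2345"),
    ("user A", "grilled fish", "3456"),
    ("user B", "grilled fish", "3456"),
]
models = build_user_keyword_interest(log)
print(models["user A"])
```

Each entry in the returned dictionary plays the role of one first interest sub-item: a keyword paired with the user ID's interest degree in it.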
Correspondingly, the interest model of a webpage ID can be established from the occurrence count of each keyword in the corresponding webpage:
In this case, the interest model of the webpage ID comprises a second interest item, which relates to keywords. The second interest item may comprise multiple second interest sub-items, each representing the webpage ID's interest in one keyword and containing the keyword and the webpage ID's interest degree in that keyword;
Aggregating the webpage IDs of all user IDs, obtaining the keyword information of the webpage corresponding to each webpage ID, and establishing the interest model of the webpage ID comprises: segmenting the content of the webpage corresponding to the webpage ID into words, removing invalid words, counting the occurrences of each remaining keyword in the webpage, and determining the webpage ID's interest degree in the keyword from its occurrence count.
It should be noted that invalid words in this specification may include prepositions, adverbs, interjections, adjectives, and words whose occurrence proportion is below a first preset ratio and/or above a second preset ratio (that is, words that occur too rarely or too often in the webpage are treated as invalid), where the first preset ratio is less than the second preset ratio. In addition, the content of the webpage corresponding to a webpage ID may be the title and summary of the webpage, or the title and body text of the webpage.
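The segmentation and invalid-word filtering described above might look roughly like the sketch below. Several points are assumptions: whitespace splitting stands in for a real word segmenter (Chinese text would need one), `STOP_WORDS` is a tiny hypothetical list standing in for part-of-speech filtering, the ratio is measured against the tokens left after stop-word removal, and the threshold defaults are arbitrary.

```python
from collections import Counter

# Hypothetical stand-in for removing prepositions, adverbs, interjections, etc.
STOP_WORDS = {"the", "a", "of", "very", "oh"}


def build_page_keyword_interest(text, min_ratio=0.05, max_ratio=0.5):
    """Return {keyword: interest degree} for one webpage's text.

    Words whose share of the remaining tokens is below min_ratio or above
    max_ratio are dropped as invalid; the rest are scaled by the largest count.
    """
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = len(tokens)
    kept = {w: n for w, n in counts.items() if min_ratio <= n / total <= max_ratio}
    if not kept:
        return {}
    peak = max(kept.values())
    return {w: n / peak for w, n in kept.items()}


page_text = "SUV review SUV test drive of the SUV"
model = build_page_keyword_interest(page_text)
strict = build_page_keyword_interest(page_text, max_ratio=0.4)
print(model)
print(strict)
```

Lowering `max_ratio` in the second call removes the dominant word, which shows how the second preset ratio suppresses words that occur too often.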
When the interest model of the user ID is established from the user ID's interest degree in each keyword, and the interest model of the webpage ID is established from the occurrence count of each keyword in the corresponding webpage, the interest model of each user ID can be mapped to an N-dimensional vector $V_{K1}$, in which each dimension represents the user ID's interest degree in one keyword, and the interest model of each webpage ID can be mapped to an N-dimensional vector $V_{K2}$, in which each dimension represents the webpage ID's interest degree in one keyword. The degree of association between the user ID and the webpage ID is determined by calculating the distance $D_K$ between $V_{K1}$ and $V_{K2}$. The distance between $V_{K1}$ and $V_{K2}$ can be calculated with a method from the prior art, for example the cosine distance between the two.
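As an illustration of the vector comparison, the sketch below computes the cosine of the angle between two interest models stored as sparse dictionaries, which avoids building the full N-dimensional vectors. Using cosine similarity (higher means more closely associated) as the measure is an interpretation of the "cosine distance" named in the text; the example values are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors given as {key: weight}."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty model is treated as unrelated to everything
    return dot / (norm_u * norm_v)


user_model = {"suv": 0.9, "sedan": 0.8, "noodles": 0.5}
page_model = {"suv": 0.9, "review": 0.8, "test": 0.6}
print(cosine_similarity(user_model, page_model))
```

Keywords missing from one model count as zero in that dimension, so only the shared keywords contribute to the dot product.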
Second, the interest model can be established from the user ID's interest level in each keyword type:
In this case, the interest model of the user ID includes only a third interest item, which relates to keyword types. The third interest item may comprise multiple third interest sub-items, each representing the user ID's interest in one keyword type and containing the keyword type and the user ID's interest degree in that keyword type;
Aggregating the keyword information of each user ID and establishing the interest model of the user ID may specifically comprise: aggregating all keywords queried by the user corresponding to the user ID and determining the type to which each keyword belongs; counting the number of webpage IDs the user clicked when querying keywords of each type, and determining the user ID's interest degree in the keyword type from that number.
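A sketch of the type-level aggregation for a single user is shown below. The keyword-to-type lookup table `KEYWORD_TYPE` is hypothetical (the specification does not say how a keyword's type is determined), and expressing each interest degree as a share of the user's total clicks is an assumption.

```python
from collections import Counter

# Hypothetical keyword-to-type lookup; a real system would need a classifier
# or a curated dictionary.
KEYWORD_TYPE = {
    "way is seen": "automobile",
    "ix35": "automobile",
    "grilled fish": "cuisines",
}


def build_user_type_interest(user_clicks, keyword_type=KEYWORD_TYPE):
    """Return {keyword type: interest degree} for one user.

    user_clicks: iterable of (keyword, page_id) pairs for that user.
    Interest degree = share of the user's clicks that fall under the type.
    """
    counts = Counter()
    for keyword, _page_id in user_clicks:
        kw_type = keyword_type.get(keyword)
        if kw_type is not None:  # keywords of unknown type are skipped
            counts[kw_type] += 1
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}


clicks = [
    ("way is seen", "1234"), ("way is seen", "2345"),
    ("ix35", "3456"), ("ix35", "4567"),
    ("grilled fish", "5678"),
]
type_model = build_user_type_interest(clicks)
print(type_model)
```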
Correspondingly, the interest model of a webpage ID can be established from the occurrence count of each keyword type in the corresponding webpage:
In this case, the interest model of the webpage ID comprises a fourth interest item, which relates to keyword types. The fourth interest item may comprise multiple fourth interest sub-items, each representing the webpage ID's interest in one keyword type and containing the keyword type and the webpage ID's interest degree in that keyword type;
Aggregating the webpage IDs of all user IDs, obtaining the keyword information in the webpage corresponding to each webpage ID, and establishing the interest model of the webpage ID comprises: segmenting the content of the webpage corresponding to the webpage ID into words, removing invalid words, determining the type to which each remaining keyword belongs, counting the occurrences of each keyword type in the webpage, and determining the webpage ID's interest degree in the keyword type from its occurrence count.
When the interest model of the user ID is established from the user ID's interest degree in each keyword type, and the interest model of the webpage ID is established from the occurrence count of each keyword type in the corresponding webpage, the interest model of each user ID can be mapped to an N-dimensional vector $V_{C1}$, in which each dimension represents the user ID's interest degree in one keyword type, and the interest model of each webpage ID can be mapped to an N-dimensional vector $V_{C2}$, in which each dimension represents the webpage ID's interest degree in one keyword type. The degree of association between the user ID and the webpage ID is determined by calculating the distance $D_C$ between $V_{C1}$ and $V_{C2}$. The distance between $V_{C1}$ and $V_{C2}$ can be calculated with a method from the prior art, for example the cosine distance between the two.
Finally, the interest model can be established from the user ID's interest level in both each keyword and each keyword type:
In this case, the interest model of the user ID comprises the first interest item relating to keywords and the third interest item relating to keyword types;
Aggregating the keyword information of each user ID and establishing the interest model of the user ID may specifically comprise: aggregating all keywords queried by the user corresponding to the user ID and determining the type to which each keyword belongs; counting the number of webpage IDs the user clicked when querying each keyword, and determining the user ID's interest degree in the keyword from that number; counting the number of webpage IDs the user clicked when querying keywords of each type, and determining the user ID's interest degree in the keyword type from that number.
An example of an interest model established from a user ID's interest level in each keyword and each keyword type is as follows:
[way is seen: 0.9, ix35: 0.8, pork braised in brown sauce: 0.6, grilled fish: 0.5] [automobile: 0.8, cuisines: 0.2]
In the first bracket, "way is seen", "ix35", "pork braised in brown sauce" and "grilled fish" are keywords, and the number after the colon following each keyword is the user ID's interest degree in that keyword. In the second bracket, "automobile" and "cuisines" are keyword types, and the number after the colon following each keyword type is the user ID's interest degree in that keyword type.
Correspondingly, the interest model of a webpage ID can be established from the occurrence counts of both each keyword and each keyword type in the corresponding webpage:
In this case, the interest model of the webpage ID comprises the second interest item relating to keywords and the fourth interest item relating to keyword types;
Aggregating the webpage IDs of all user IDs, obtaining the keyword information in the webpage corresponding to each webpage ID, and establishing the interest model of the webpage ID comprises: segmenting the content of the webpage corresponding to the webpage ID into words, removing invalid words, counting the occurrences of each remaining keyword in the webpage, and determining the webpage ID's interest degree in the keyword from its occurrence count; determining the type to which each keyword belongs, counting the occurrences of each keyword type in the webpage, and determining the webpage ID's interest degree in the keyword type from its occurrence count.
An example of an interest model established from the occurrence counts of each keyword and each keyword type in the webpage corresponding to a webpage ID is as follows:
[way is seen: 0.9, evaluation and test: 0.8, test ride: 0.6] [automobile: 0.8]
In the first bracket, "way is seen", "evaluation and test" and "test ride" are keywords, and the number after the colon following each keyword is the webpage ID's interest degree in that keyword. In the second bracket, "automobile" is a keyword type, and the number after the colon is the webpage ID's interest degree in that keyword type.
When the interest model of the user ID is established from its interest degrees in each keyword and each keyword type, and the interest model of the webpage ID is established from the occurrence counts of each keyword and each keyword type in the corresponding webpage, the user ID's interest degrees in keywords and in keyword types can be mapped to N-dimensional vectors $V_{K1}$ and $V_{C1}$ respectively, and the webpage ID's interest degrees in keywords and in keyword types can be mapped to N-dimensional vectors $V_{K2}$ and $V_{C2}$ respectively. The distance $D_K$ between $V_{K1}$ and $V_{K2}$ and the distance $D_C$ between $V_{C1}$ and $V_{C2}$ are calculated, and the degree of association between the user ID and the webpage ID is determined by a weighted calculation over $D_K$ and $D_C$, for example using the formula $D = a \times D_K + (1 - a) \times D_C$, where $D$ is the degree of association between the user ID and the webpage ID and $a$ is a preset real number greater than 0 and less than 1.
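The weighted combination can be written directly from the formula. The function name `combined_association` and the default weight are illustrative choices; the constraint that the weight lies strictly between 0 and 1 comes from the text.

```python
def combined_association(d_k, d_c, a=0.5):
    """Weighted degree of association D = a * D_K + (1 - a) * D_C.

    d_k: keyword-level score, d_c: keyword-type-level score,
    a: preset weight, required to satisfy 0 < a < 1.
    """
    if not 0 < a < 1:
        raise ValueError("a must be greater than 0 and less than 1")
    return a * d_k + (1 - a) * d_c


# With a = 0.75 the keyword-level score dominates the result.
score = combined_association(0.9, 0.5, a=0.75)
print(score)
```

A larger `a` gives more weight to exact keyword overlap, while a smaller `a` favours the broader keyword-type match.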
An example of the calculated degrees of association between a user ID and webpage IDs is as follows:
User A -> webpage A: 0.9 -> webpage B: 0.7 -> webpage C: 0.3
In this example, the degrees of association between user A and webpage A, webpage B and webpage C are 0.9, 0.7 and 0.3, respectively.
Step 103: when a search result click command of a user is received and a wireless web search transcoding page is entered, select a first preset number of webpage IDs in descending order of their degree of association with the user ID, and recommend the webpage corresponding to each selected webpage ID in the transcoding page.
In this step, when the user wants to view a search result, the user clicks it. The background server receives the search result click command, provides the user with the wireless web search transcoding page requested by the command, and at the same time makes webpage recommendations in that transcoding page. Using the degrees of association between user IDs and webpage IDs established in step 102, the webpage IDs are sorted by their degree of association with the user ID, the first preset number of top-ranked webpage IDs are selected, and the webpage corresponding to each selected webpage ID is recommended in the transcoding page. Recommending the first preset number of webpages most strongly associated with the user ID presents the user with the webpages the user is most likely to be interested in, so that the user can find a webpage of interest without selecting a query string and searching again, and can therefore find the target webpage quickly. For example, when user A clicks a search result and enters a wireless web search transcoding page, the webpages corresponding to the two webpage IDs most strongly associated with user A, namely webpage A and webpage B, can be recommended in the transcoding page.
Here, recommending a webpage in the transcoding page actually means placing the link address of the webpage in the transcoding page, so that the user can reach the recommended webpage of interest by clicking the link address.
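The selection in step 103 amounts to a top-N query over the association scores. The sketch below uses the example values given for user A; the function name `recommend_top_pages` is illustrative.

```python
import heapq


def recommend_top_pages(associations, n):
    """Return the n webpage IDs with the highest degree of association.

    associations: {page_id: degree of association with the current user}.
    """
    return heapq.nlargest(n, associations, key=associations.get)


user_a_scores = {"webpage A": 0.9, "webpage B": 0.7, "webpage C": 0.3}
recommended = recommend_top_pages(user_a_scores, 2)
print(recommended)  # ['webpage A', 'webpage B']
```

`heapq.nlargest` avoids sorting the full score table when only a small preset number of pages is needed.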
In practice, after a user clicks a search result and enters a wireless web search transcoding page, if the user is interested in the content of the transcoding page, the user's interest in other webpages with similar content is also likely to be relatively high. Other webpages whose content is close to that of the transcoding page can therefore also be recommended to the user ID.
A user's interests are time-dependent: within a given period the user is often interested in only one kind of content. For example, a user currently interested in automobile content will search entirely for automobile-related content. The search and search-result click behavior of a user ID within a period therefore characterizes that user ID's interests during the period. In addition, if two user IDs have searched for the same keyword, they can be considered similar in interest; for example, if user A and user B have both searched for "way is seen", both can be considered interested in the "way is seen" automobile. If the two users have also clicked the same webpage IDs, this further indicates similarity in interest. If two users are similar in interest, the webpage IDs they clicked can be considered to have a certain association as well.
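For example, the search and click behavior of user A within a preset time is shown in Table 1:
[Table 1: search and click behavior of user A (table image not available)]
The search and click behavior of user B within the preset time is shown in Table 2:
[Table 2: search and click behavior of user B (table image not available)]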
Here, user A and user B have both searched for "way is seen" and have both clicked the webpages whose webpage IDs are 1234 and 2345. User A and user B are therefore similar in interest, and the webpage IDs they clicked are also associated.
Because the webpage IDs clicked by user A and user B are associated, the webpage IDs clicked by user A and user B can be clustered. Different strategies can be used for the clustering. For example, the webpage IDs clicked by each user can be clustered separately, giving the cluster [1234 2345 3456 4567 5678 6789] formed by the webpage IDs clicked by user A and the cluster [1234 2345 7890 8901] formed by the webpage IDs clicked by user B. Alternatively, all webpage IDs clicked by user A and user B can be clustered together, giving [1234 2345 3456 4567 5678 6789 7890 8901]. The webpage IDs clicked by both user A and user B can also be clustered, giving [1234 2345].
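The three strategies correspond to keeping each user's click set as is, taking the union, and taking the intersection. A short sketch using the webpage IDs from the example (the variable names are illustrative):

```python
# Webpage IDs clicked by each user, taken from the example above.
clicked_by_a = {1234, 2345, 3456, 4567, 5678, 6789}
clicked_by_b = {1234, 2345, 7890, 8901}

# Strategy 1: one cluster per user.
per_user_clusters = [sorted(clicked_by_a), sorted(clicked_by_b)]

# Strategy 2: every page clicked by either user (set union).
all_clicked = sorted(clicked_by_a | clicked_by_b)

# Strategy 3: only pages clicked by both users (set intersection).
commonly_clicked = sorted(clicked_by_a & clicked_by_b)

print(per_user_clusters)
print(all_clicked)
print(commonly_clicked)  # [1234, 2345]
```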
Of the webpage clusters obtained with the three methods above, the cluster obtained with the last method has the highest degree of association between its webpage IDs. The degree of association between webpage IDs can therefore be determined from users' search and click behavior. If two user IDs have searched for the same keyword and clicked the same webpage IDs, the webpage IDs clicked by both users have a relatively high degree of association. In the example of user A and user B above, both users clicked the webpages whose webpage IDs are 1234 and 2345, so the two webpages with webpage IDs 1234 and 2345 have a relatively high degree of association.
Therefore, the embodiment of the present invention shown in Fig. 1 may further comprise: aggregating each keyword queried within a preset time by the users corresponding to all user IDs and, for each user who queried the keyword, establishing an association relationship between every two webpage IDs that the user clicked; aggregating the webpage IDs clicked within the preset time by all users who searched for the keyword and, for each webpage ID, counting the occurrences of each association relationship involving the webpage ID, and determining from those counts the degree of association between the webpage ID and each webpage ID with which it has an association relationship. In this way, when a search result click command of a user is received and a wireless web search transcoding page is entered, a second preset number of webpage IDs can additionally be selected in descending order of their degree of association with the transcoding page, and the webpage corresponding to each selected webpage ID is recommended in the transcoding page.
The method of determining the degree of association between webpage IDs is illustrated below. Suppose user A clicked webpages 1, 2 and 3 within the preset time, which indicates that webpages 1, 2 and 3 are associated; a group of association relationships [1-2], [2-3], [1-3] can therefore be generated. Suppose user B clicked webpages 1, 2 and 4 within the preset time, which indicates that webpages 1, 2 and 4 are associated; another group of association relationships [1-2], [2-4], [1-4] can therefore be generated. For webpage 1, the occurrences of its association relationships [1-2], [1-3] and [1-4] can then be counted: [1-2] occurs twice, while [1-3] and [1-4] each occur once. The degree of association between webpage 1 and webpage 2 can therefore be considered higher than that between webpage 1 and webpage 3 and that between webpage 1 and webpage 4. When the user clicks a search result and enters webpage 1, webpage 2 can be recommended to the user in webpage 1.
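The counting in this example can be sketched as pairwise co-click counting. The function names are illustrative, and the sketch omits the per-keyword and preset-time grouping described earlier for brevity.

```python
from collections import Counter
from itertools import combinations


def count_associations(clicks_by_user):
    """Count how many users clicked each unordered pair of webpage IDs."""
    pair_counts = Counter()
    for pages in clicks_by_user.values():
        # Sorting makes (1, 2) and (2, 1) the same association relationship.
        for pair in combinations(sorted(set(pages)), 2):
            pair_counts[pair] += 1
    return pair_counts


def related_pages(page_id, pair_counts):
    """Pages associated with page_id, most frequently co-clicked first."""
    scores = Counter()
    for (p, q), n in pair_counts.items():
        if p == page_id:
            scores[q] += n
        elif q == page_id:
            scores[p] += n
    return scores.most_common()


pair_counts = count_associations({"user A": [1, 2, 3], "user B": [1, 2, 4]})
ranking = related_pages(1, pair_counts)
print(ranking[0])  # (2, 2): webpage 2 was co-clicked with webpage 1 twice
```

Taking the first entries of the ranking corresponds to selecting the second preset number of webpage IDs for the transcoding page.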
In practice, the webpage IDs in the search results for the same keyword are also associated, and the degree of association differs according to the content of the webpage corresponding to each webpage ID. For example, the search results for the keyword "Highlander" (format: search rank, title, webpage ID) include the following:
1: Highlander authoritative evaluation: 123
2: Highlander VS Odyssey VS Mazda 8: 234
3: "Toyota Highlander" latest pictures and quotation: 345
4: Highlander comprehensive discount of 15,000 yuan, lowest price 288,800 yuan: 456
5: Highlander pros and cons _ Automobile China Highlander review: 567
Here, the webpages with webpage IDs 123, 234 and 567 mainly focus on evaluating the Highlander, so the degree of association among these three webpages is relatively high, while the webpages with webpage IDs 345 and 456 mainly focus on Highlander price quotations, so the degree of association between these two webpages is relatively high. The search results for the same keyword can therefore also be clustered, with webpage IDs in the same webpage cluster having a higher degree of association. In the above example, clustering yields the webpage clusters [123 234 567] and [345 456] for the keyword "Highlander".
In fact; when the user uses search engine inquiry information; usually can carry out log recording to keyword and the corresponding Search Results of user search; for example keyword and search result list corresponding to keyword are noted down in the Search Results displaying daily record, wherein can be comprised one or more webpage ID in the search result list.Like this, just can show search result list corresponding to each keyword of log acquisition by Search Results.In addition, also can when searching key word, directly obtain search result list corresponding to keyword that search engine returns.
After the search result list for a keyword has been obtained, the content of the webpage corresponding to each webpage ID in the search result list can be analyzed to obtain the key word information in that webpage and generate the feature vector corresponding to this webpage ID; one or more webpage clusters corresponding to this keyword can then be generated according to the feature vectors of all webpage IDs in the search result list for this keyword. In this way, when the user's command of clicking a search result is received and the wireless web search transcoding page is entered, the webpage cluster containing this transcoding page can be looked up among the one or more webpage clusters corresponding to the keyword queried by this user, a third preset number of webpage IDs can be selected from the webpage cluster found, and the webpage corresponding to each selected webpage ID can be recommended in this transcoding page.
Here, the method of obtaining the key word information in the webpage corresponding to a webpage ID and generating the feature vector corresponding to this webpage ID may specifically be: perform word segmentation on the content of the webpage corresponding to this webpage ID, remove invalid words, count the number of occurrences of each remaining keyword in this webpage, and generate the feature vector corresponding to this webpage ID according to the number of occurrences of each keyword in this webpage. The content of the webpage may be the title and summary information of the webpage corresponding to the webpage ID, or its title and body text. If every word is used as a dimension, the resulting feature vector may be very large; a dimensionality reduction technique can be adopted to convert the high-dimensional vector into a low-dimensional vector.
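The feature vector generation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the regular-expression tokenizer stands in for a real word segmenter (Chinese text would need one), and the `STOP_WORDS` set, the `vocabulary` list and the function name `feature_vector` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical list of "invalid words"; the text does not enumerate them.
STOP_WORDS = {"the", "a", "of", "and", "vs"}

def feature_vector(text, vocabulary):
    """Return the occurrence count of each vocabulary keyword in the page text."""
    # Crude segmentation: lower-case word characters (assumption, see lead-in).
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in STOP_WORDS]
    counts = Counter(tokens)
    # One dimension per vocabulary keyword, in vocabulary order.
    return [counts[word] for word in vocabulary]

vocabulary = ["highlander", "evaluation", "quotation"]
vec = feature_vector("Highlander evaluation: the Highlander vs Odyssey", vocabulary)
```

Fixing the vocabulary in advance keeps all vectors the same length; a dimensionality reduction step, if used, would be applied to these count vectors afterwards.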
In addition, the method of generating one or more webpage clusters corresponding to a keyword according to the feature vectors of all webpage IDs in the search result list for that keyword may specifically be: adopt the K-nearest neighbor (KNN) classification algorithm to cluster the feature vectors of all webpage IDs in the search result list for the keyword.
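The text names KNN but does not detail how it is applied to clustering. The sketch below is only one possible reading, offered as an assumption: each webpage is linked to its single nearest neighbor by cosine similarity, and linked webpages are merged into one cluster. The toy vectors, the two-keyword vocabulary and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest_neighbour_clusters(vectors):
    """vectors: dict webpage_id -> feature vector. Returns sorted clusters."""
    parent = {p: p for p in vectors}

    def find(p):
        # Union-find root lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, vec in vectors.items():
        others = [q for q in vectors if q != p]
        if not others:
            continue
        nearest = max(others, key=lambda q: cosine(vec, vectors[q]))
        parent[find(p)] = find(nearest)  # merge p with its nearest neighbour

    clusters = {}
    for p in vectors:
        clusters.setdefault(find(p), []).append(p)
    return sorted(sorted(c) for c in clusters.values())

# Toy vectors over a hypothetical vocabulary [evaluation, quotation].
pages = {123: [3, 0], 234: [2, 0], 567: [4, 1], 345: [0, 3], 456: [1, 4]}
clusters = nearest_neighbour_clusters(pages)
```

With these toy vectors the grouping reproduces the clusters of the "Highlander" example; real data would likely need a similarity threshold or a larger K.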
The webpage recommending method of the embodiment of the invention has been described in detail above. The invention also provides a webpage recommending device, which enables a user to find a target webpage quickly.
Referring to Fig. 2, which is a structural diagram of the webpage recommending device of the embodiment of the invention, the device comprises: a log acquisition unit 201, a first analytic unit 202 and a recommendation unit 203; wherein,
The log acquisition unit 201 is used to obtain a click inquiry log, the click inquiry log comprising user IDs, keywords and webpage IDs;
The first analytic unit 202 is used to: gather the key word information of each user ID in the click inquiry log and set up the interest model of this user ID; gather the webpage IDs of all user IDs in the click inquiry log, obtain the key word information in the webpage corresponding to each webpage ID, and set up the interest model of this webpage ID; and determine the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID;
The recommendation unit 203 is used to: when the user's command of clicking a search result is received and the wireless web search transcoding page is entered, select a first preset number of webpage IDs in descending order of their degree of association with the user ID, and recommend the webpage corresponding to each selected webpage ID in this transcoding page.
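The selection step of the recommendation unit can be sketched as a top-N pick over precomputed association degrees. The function name `recommend` and the sample scores are hypothetical; the sketch assumes a larger value means a stronger association.

```python
import heapq

def recommend(association, first_preset_number):
    """association: dict webpage_id -> degree of association with the current user.
    Returns the webpage IDs with the highest degrees, highest first."""
    return heapq.nlargest(first_preset_number, association, key=association.get)

top = recommend({123: 0.91, 234: 0.40, 345: 0.75, 456: 0.10}, 2)
```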
In the above device,
The interest model of the user ID comprises a first item of interest, the first item of interest comprises a plurality of first interest subitems, and each first interest subitem comprises a keyword and the interest-degree of the user ID in the keyword;
When gathering the key word information of each user ID in the click inquiry log and setting up the interest model of this user ID, the first analytic unit 202 is used to: gather all keywords queried by the user corresponding to this user ID, count the number of webpage IDs clicked by this user when querying each keyword, and determine the interest-degree of this user ID in this keyword according to the number of clicked webpage IDs;
The interest model of the webpage ID comprises a second item of interest, the second item of interest comprises a plurality of second interest subitems, and each second interest subitem comprises a keyword and the interest-degree of the webpage ID in the keyword;
When gathering the webpage IDs of all user IDs in the click inquiry log, obtaining the key word information in the webpage corresponding to each webpage ID and setting up the interest model of this webpage ID, the first analytic unit 202 is used to: perform word segmentation on the content of the webpage corresponding to this webpage ID, remove invalid words, count the number of occurrences of each remaining keyword in this webpage, and determine the interest-degree of this webpage ID in this keyword according to the number of occurrences of this keyword.
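As a rough illustration of the user-side interest model described above, the sketch below counts clicked webpage IDs per keyword for each user. Normalizing the counts into shares is an assumption made here for illustration; the text only says the interest-degree is determined according to the number of clicked webpage IDs. The record layout and names are hypothetical.

```python
from collections import defaultdict

def user_interest_model(click_log):
    """click_log: iterable of (user_id, keyword, webpage_id) records.
    Returns user_id -> {keyword: interest-degree}."""
    clicks = defaultdict(lambda: defaultdict(int))
    for user_id, keyword, _webpage_id in click_log:
        clicks[user_id][keyword] += 1  # one clicked webpage ID under this keyword
    model = {}
    for user_id, per_keyword in clicks.items():
        total = sum(per_keyword.values())
        # Assumption: interest-degree = share of the user's clicks for the keyword.
        model[user_id] = {kw: n / total for kw, n in per_keyword.items()}
    return model

log = [("u1", "Highlander", 123), ("u1", "Highlander", 234),
       ("u1", "Odyssey", 900), ("u2", "Highlander", 345)]
model = user_interest_model(log)
```

The webpage-side model could be built the same way from keyword occurrence counts in the page content instead of click counts.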
When determining the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID, the first analytic unit 202 is used to:
Generate an N-dimensional vector $V_{K1}$ according to the interest-degree of the user ID in each keyword in the interest model of each user ID;
Generate an N-dimensional vector $V_{K2}$ according to the interest-degree of the webpage ID in each keyword in the interest model of each webpage ID;
Calculate the distance $D_K$ between the N-dimensional vectors $V_{K1}$ and $V_{K2}$, and record $D_K$ as the degree of association between this user ID and this webpage ID.
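The three steps above can be sketched as follows. The text does not say which distance measure is used; cosine similarity is swapped in here purely as an assumption, chosen because a higher value then means a stronger association, which fits selecting webpage IDs from high to low. The function names and sample interest values are hypothetical.

```python
import math

def to_vector(interest, keywords):
    """Lay out interest-degrees in a fixed keyword order, 0.0 for missing keywords."""
    return [interest.get(k, 0.0) for k in keywords]

def association_degree(user_interest, page_interest):
    """Cosine similarity between the user vector V_K1 and the webpage vector V_K2."""
    # N is the size of the combined keyword set of the two models.
    keywords = sorted(set(user_interest) | set(page_interest))
    v_k1 = to_vector(user_interest, keywords)
    v_k2 = to_vector(page_interest, keywords)
    dot = sum(a * b for a, b in zip(v_k1, v_k2))
    n1 = math.sqrt(sum(a * a for a in v_k1))
    n2 = math.sqrt(sum(b * b for b in v_k2))
    return dot / (n1 * n2) if n1 and n2 else 0.0

d_k = association_degree({"Highlander": 1.0},
                         {"Highlander": 0.6, "quotation": 0.8})
```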
In the above device,
The interest model of the user ID comprises a third item of interest, the third item of interest comprises a plurality of third interest subitems, and each third interest subitem comprises a keyword type and the interest-degree of the user ID in the keyword type;
When gathering the key word information of each user ID in the click inquiry log and setting up the interest model of this user ID, the first analytic unit 202 is specifically used to: gather all keywords queried by the user corresponding to this user ID and determine the type to which each keyword belongs; count the number of webpage IDs clicked by this user when querying each class of keywords, and determine the interest-degree of this user ID in this class of keywords according to the number of webpage IDs clicked by this user;
The interest model of the webpage ID comprises a fourth item of interest, the fourth item of interest comprises a plurality of fourth interest subitems, and each fourth interest subitem comprises a keyword type and the interest-degree of the webpage ID in the keyword type;
When gathering the webpage IDs of all user IDs in the click inquiry log, obtaining the key word information in the webpage corresponding to each webpage ID and setting up the interest model of this webpage ID, the first analytic unit 202 is used to: perform word segmentation on the content of the webpage corresponding to this webpage ID, remove invalid words, determine the type to which each remaining keyword belongs, count the number of occurrences of each class of keywords in this webpage, and determine the interest-degree of this webpage ID in this class of keywords according to the number of occurrences of this class of keywords.
When determining the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID, the first analytic unit 202 is used to:
Generate an N-dimensional vector $V_{C1}$ according to the interest-degree of the user ID in each class of keywords in the interest model of each user ID;
Generate an N-dimensional vector $V_{C2}$ according to the interest-degree of the webpage ID in each class of keywords in the interest model of each webpage ID;
Calculate the distance $D_C$ between the N-dimensional vectors $V_{C1}$ and $V_{C2}$, and record $D_C$ as the degree of association between this user ID and this webpage ID.
In the above device,
The interest model of the user ID comprises a first item of interest and a third item of interest; the first item of interest comprises a plurality of first interest subitems, and each first interest subitem comprises a keyword and the interest-degree of the user ID in the keyword; the third item of interest comprises a plurality of third interest subitems, and each third interest subitem comprises a keyword type and the interest-degree of the user ID in the keyword type;
When gathering the key word information of each user ID in the click inquiry log and setting up the interest model of this user ID, the first analytic unit 202 is used to: gather all keywords queried by the user corresponding to this user ID and determine the type to which each keyword belongs; count the number of webpage IDs clicked by this user when querying each keyword, and determine the interest-degree of this user ID in this keyword according to the number of webpage IDs clicked by this user; count the number of webpage IDs clicked by this user when querying each class of keywords, and determine the interest-degree of this user ID in this class of keywords according to the number of webpage IDs clicked by this user;
The interest model of the webpage ID comprises a second item of interest and a fourth item of interest; the second item of interest comprises a plurality of second interest subitems, and each second interest subitem comprises a keyword and the interest-degree of the webpage ID in the keyword; the fourth item of interest comprises a plurality of fourth interest subitems, and each fourth interest subitem comprises a keyword type and the interest-degree of the webpage ID in the keyword type;
When gathering the webpage IDs of all user IDs in the click inquiry log, obtaining the key word information in the webpage corresponding to each webpage ID and setting up the interest model of this webpage ID, the first analytic unit 202 is used to: perform word segmentation on the content of the webpage corresponding to this webpage ID, remove invalid words, count the number of occurrences of each remaining keyword in this webpage, and determine the interest-degree of this webpage ID in this keyword according to the number of occurrences of this keyword; determine the type to which each keyword belongs, count the number of occurrences of each class of keywords in this webpage, and determine the interest-degree of this webpage ID in this class of keywords according to the number of occurrences of this class of keywords.
When determining the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID, the first analytic unit 202 is used to:
Generate an N-dimensional vector $V_{K1}$ according to the interest-degree of the user ID in each keyword in the interest model of each user ID, and generate an N-dimensional vector $V_{C1}$ according to the interest-degree of the user ID in each class of keywords in the interest model of each user ID;
Generate an N-dimensional vector $V_{K2}$ according to the interest-degree of the webpage ID in each keyword in the interest model of each webpage ID, and generate an N-dimensional vector $V_{C2}$ according to the interest-degree of the webpage ID in each class of keywords in the interest model of each webpage ID;
Calculate the distance $D_K$ between the N-dimensional vectors $V_{K1}$ and $V_{K2}$ and the distance $D_C$ between the N-dimensional vectors $V_{C1}$ and $V_{C2}$, and perform a weighted calculation on $D_K$ and $D_C$ to obtain the degree of association between this user ID and this webpage ID.
The first analytic unit 202 adopts the following formula to perform the weighted calculation on $D_K$ and $D_C$ and obtain the degree of association between this user ID and this webpage ID:
$D = a \times D_K + (1 - a) \times D_C$, wherein $D$ is the degree of association between this user ID and this webpage ID, and $a$ is a preset real number greater than 0 and less than 1.
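The weighted combination can be written directly from the formula. In this small sketch the function name `combined_association` and the default weight of 0.5 are hypothetical; only the formula and the constraint 0 < a < 1 come from the text.

```python
def combined_association(d_k, d_c, a=0.5):
    """Return D = a * D_K + (1 - a) * D_C for a preset weight a in (0, 1)."""
    if not 0 < a < 1:
        raise ValueError("a must be strictly between 0 and 1")
    return a * d_k + (1 - a) * d_c

# Keyword-level association weighted three times as heavily as type-level.
d = combined_association(0.8, 0.4, a=0.75)
```

A larger a gives more weight to the keyword-level association, a smaller a to the keyword-type-level association.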
In addition, the device further comprises a second analytic unit 204;
The second analytic unit 204 is used to: gather each keyword queried within a preset time by the users corresponding to all user IDs in the click inquiry log; for each user who queried this keyword, set up an incidence relation between any two webpage IDs clicked by this user; gather the webpage IDs clicked within the preset time by all users who queried this keyword; for each webpage ID, count the number of occurrences of each incidence relation corresponding to this webpage ID, and determine, according to the number of occurrences of each incidence relation, the degree of association between this webpage ID and each webpage ID that has an incidence relation with it;
When the user's command of clicking a search result is received and the wireless web search transcoding page is entered, the recommendation unit 203 further selects a second preset number of webpage IDs in descending order of their degree of association with this transcoding page, and recommends the webpage corresponding to each selected webpage ID in this transcoding page.
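The co-click association handled by the second analytic unit can be sketched as pair counting. As an assumption, the sketch uses the raw number of occurrences of an incidence relation as the degree of association; the text only says the degree is determined according to that number. The record layout and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def co_click_counts(click_log, keyword):
    """click_log: (user_id, keyword, webpage_id) records inside the preset time.
    Returns a Counter mapping (page_a, page_b) pairs to their occurrence count."""
    clicked = defaultdict(set)
    for user_id, kw, webpage_id in click_log:
        if kw == keyword:
            clicked[user_id].add(webpage_id)
    pair_counts = Counter()
    for pages in clicked.values():
        # Any two webpage IDs clicked by the same user form one incidence relation.
        for a, b in combinations(sorted(pages), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

def related_pages(pair_counts, page_id, second_preset_number):
    """Webpage IDs most often co-clicked with page_id, most frequent first."""
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a == page_id:
            scores[b] += n
        elif b == page_id:
            scores[a] += n
    return [p for p, _ in scores.most_common(second_preset_number)]

log = [("u1", "Highlander", 123), ("u1", "Highlander", 234),
       ("u2", "Highlander", 123), ("u2", "Highlander", 234),
       ("u2", "Highlander", 567), ("u3", "Highlander", 345),
       ("u3", "Highlander", 456)]
counts = co_click_counts(log, "Highlander")
best = related_pages(counts, 123, 1)
```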
In addition, the device further comprises a third analytic unit 205;
The log acquisition unit 201 is further used to obtain the search result list corresponding to each keyword;
The third analytic unit 205 is used to: for each webpage ID in the search result list corresponding to each keyword, obtain the key word information in the webpage corresponding to this webpage ID and generate the feature vector corresponding to this webpage ID; and generate one or more webpage clusters corresponding to this keyword according to the feature vectors of all webpage IDs in the search result list corresponding to the keyword;
When the user's command of clicking a search result is received and the wireless web search transcoding page is entered, the recommendation unit 203 further looks up the webpage cluster containing this transcoding page among the one or more webpage clusters corresponding to the keyword searched by this user, selects a third preset number of webpage IDs from the webpage cluster found, and recommends the webpage corresponding to each selected webpage ID in this transcoding page.
When obtaining the key word information in the webpage corresponding to this webpage ID and generating the feature vector corresponding to this webpage ID, the third analytic unit 205 is used to: perform word segmentation on the content of the webpage corresponding to this webpage ID, remove invalid words, count the number of occurrences of each remaining keyword in this webpage, and generate the feature vector corresponding to this webpage ID according to the number of occurrences of each keyword in this webpage;
When generating one or more webpage clusters corresponding to a keyword according to the feature vectors of all webpage IDs in the search result list corresponding to the keyword, the third analytic unit 205 is used to: cluster the feature vectors of all webpage IDs in the search result list corresponding to this keyword with the K-nearest neighbor (KNN) classification algorithm.
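The cluster lookup performed by the recommendation unit can be sketched as below. Excluding the page currently being viewed from the recommendations, and taking the first entries of the cluster in their stored order, are assumptions; the text only says a third preset number of webpage IDs is selected from the cluster found. The function name is hypothetical.

```python
def recommend_from_cluster(clusters, current_page_id, third_preset_number):
    """clusters: list of lists of webpage IDs for the keyword the user searched.
    Returns up to third_preset_number other webpage IDs from the same cluster."""
    for cluster in clusters:
        if current_page_id in cluster:
            # Assumption: do not recommend the page the user is already on.
            candidates = [p for p in cluster if p != current_page_id]
            return candidates[:third_preset_number]
    return []  # the current page belongs to no known cluster

picked = recommend_from_cluster([[123, 234, 567], [345, 456]], 234, 2)
```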
The above are only preferred embodiments of the invention and are not intended to limit the invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the invention shall be included within the scope of protection of the invention.

Claims (22)

1. A webpage recommending method, characterized in that the method comprises:
Obtaining a click inquiry log, the click inquiry log comprising user IDs, keywords and webpage IDs;
Gathering the key word information of each user ID and setting up the interest model of this user ID; gathering the webpage IDs of all user IDs, obtaining the key word information in the webpage corresponding to each webpage ID, and setting up the interest model of this webpage ID; determining the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID;
When the user's command of clicking a search result is received and the wireless web search transcoding page is entered, selecting a first preset number of webpage IDs in descending order of their degree of association with the user ID, and recommending the webpage corresponding to each selected webpage ID in this transcoding page.
2. The webpage recommending method according to claim 1, characterized in that,
The interest model of the user ID comprises a first item of interest, the first item of interest comprises a plurality of first interest subitems, and each first interest subitem comprises a keyword and the interest-degree of the user ID in the keyword;
Gathering the key word information of each user ID and setting up the interest model of this user ID comprises: gathering all keywords queried by the user corresponding to this user ID, counting the number of webpage IDs clicked by this user when querying each keyword, and determining the interest-degree of this user ID in this keyword according to the number of clicked webpage IDs;
The interest model of the webpage ID comprises a second item of interest, the second item of interest comprises a plurality of second interest subitems, and each second interest subitem comprises a keyword and the interest-degree of the webpage ID in the keyword;
Gathering the webpage IDs of all user IDs, obtaining the key word information in the webpage corresponding to each webpage ID and setting up the interest model of this webpage ID comprises: performing word segmentation on the content of the webpage corresponding to this webpage ID, removing invalid words, counting the number of occurrences of each remaining keyword in this webpage, and determining the interest-degree of this webpage ID in this keyword according to the number of occurrences of this keyword.
3. The webpage recommending method according to claim 2, characterized in that,
Determining the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID comprises:
Generating an N-dimensional vector $V_{K1}$ according to the interest-degree of the user ID in each keyword in the interest model of each user ID;
Generating an N-dimensional vector $V_{K2}$ according to the interest-degree of the webpage ID in each keyword in the interest model of each webpage ID;
Calculating the distance $D_K$ between the N-dimensional vectors $V_{K1}$ and $V_{K2}$, and recording $D_K$ as the degree of association between this user ID and this webpage ID.
4. The webpage recommending method according to claim 1, characterized in that,
The interest model of the user ID comprises a third item of interest, the third item of interest comprises a plurality of third interest subitems, and each third interest subitem comprises a keyword type and the interest-degree of the user ID in the keyword type;
Gathering the key word information of each user ID and setting up the interest model of this user ID comprises: gathering all keywords queried by the user corresponding to this user ID and determining the type to which each keyword belongs; counting the number of webpage IDs clicked by this user when querying each class of keywords, and determining the interest-degree of this user ID in this class of keywords according to the number of clicked webpage IDs;
The interest model of the webpage ID comprises a fourth item of interest, the fourth item of interest comprises a plurality of fourth interest subitems, and each fourth interest subitem comprises a keyword type and the interest-degree of the webpage ID in the keyword type;
Gathering the webpage IDs of all user IDs, obtaining the key word information in the webpage corresponding to each webpage ID and setting up the interest model of this webpage ID comprises: performing word segmentation on the content of the webpage corresponding to this webpage ID, removing invalid words, determining the type to which each remaining keyword belongs, counting the number of occurrences of each class of keywords in this webpage, and determining the interest-degree of this webpage ID in this class of keywords according to the number of occurrences of this class of keywords.
5. The webpage recommending method according to claim 4, characterized in that,
Determining the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID comprises:
Generating an N-dimensional vector $V_{C1}$ according to the interest-degree of the user ID in each class of keywords in the interest model of each user ID;
Generating an N-dimensional vector $V_{C2}$ according to the interest-degree of the webpage ID in each class of keywords in the interest model of each webpage ID;
Calculating the distance $D_C$ between the N-dimensional vectors $V_{C1}$ and $V_{C2}$, and recording $D_C$ as the degree of association between this user ID and this webpage ID.
6. The webpage recommending method according to claim 1, characterized in that,
The interest model of the user ID comprises a first item of interest and a third item of interest; the first item of interest comprises a plurality of first interest subitems, and each first interest subitem comprises a keyword and the interest-degree of the user ID in the keyword; the third item of interest comprises a plurality of third interest subitems, and each third interest subitem comprises a keyword type and the interest-degree of the user ID in the keyword type;
Gathering the key word information of each user ID and setting up the interest model of this user ID comprises: gathering all keywords queried by the user corresponding to this user ID and determining the type to which each keyword belongs; counting the number of webpage IDs clicked by this user when querying each keyword, and determining the interest-degree of this user ID in this keyword according to the number of webpage IDs clicked by this user; counting the number of webpage IDs clicked by this user when querying each class of keywords, and determining the interest-degree of this user ID in this class of keywords according to the number of webpage IDs clicked by this user;
The interest model of the webpage ID comprises a second item of interest and a fourth item of interest; the second item of interest comprises a plurality of second interest subitems, and each second interest subitem comprises a keyword and the interest-degree of the webpage ID in the keyword; the fourth item of interest comprises a plurality of fourth interest subitems, and each fourth interest subitem comprises a keyword type and the interest-degree of the webpage ID in the keyword type;
Gathering the webpage IDs of all user IDs, obtaining the key word information of the webpage corresponding to each webpage ID and setting up the interest model of this webpage ID comprises: performing word segmentation on the content of the webpage corresponding to this webpage ID, removing invalid words, counting the number of occurrences of each remaining keyword in this webpage, and determining the interest-degree of this webpage ID in this keyword according to the number of occurrences of this keyword; determining the type to which each keyword belongs, counting the number of occurrences of each class of keywords in this webpage, and determining the interest-degree of this webpage ID in this class of keywords according to the number of occurrences of this class of keywords.
7. The webpage recommending method according to claim 6, characterized in that,
Determining the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID comprises:
Generating an N-dimensional vector $V_{K1}$ according to the interest-degree of the user ID in each keyword in the interest model of each user ID, and generating an N-dimensional vector $V_{C1}$ according to the interest-degree of the user ID in each class of keywords in the interest model of each user ID;
Generating an N-dimensional vector $V_{K2}$ according to the interest-degree of the webpage ID in each keyword in the interest model of each webpage ID, and generating an N-dimensional vector $V_{C2}$ according to the interest-degree of the webpage ID in each class of keywords in the interest model of each webpage ID;
Calculating the distance $D_K$ between the N-dimensional vectors $V_{K1}$ and $V_{K2}$ and the distance $D_C$ between the N-dimensional vectors $V_{C1}$ and $V_{C2}$, and performing a weighted calculation on $D_K$ and $D_C$ to obtain the degree of association between this user ID and this webpage ID.
8. The webpage recommending method according to claim 7, characterized in that:
The weighted calculation on $D_K$ and $D_C$ to obtain the degree of association between this user ID and this webpage ID adopts the following formula:
$D = a \times D_K + (1 - a) \times D_C$, wherein $D$ is the degree of association between this user ID and this webpage ID, and $a$ is a preset real number greater than 0 and less than 1.
9. The webpage recommending method according to claim 1, characterized in that the method further comprises: gathering each keyword queried within a preset time by the users corresponding to all user IDs; for each user ID that queried this keyword, setting up an incidence relation between any two webpage IDs clicked by this user; gathering the webpage IDs clicked within the preset time by all users who queried this keyword; for each webpage ID, counting the number of occurrences of each incidence relation corresponding to this webpage ID, and determining, according to the number of occurrences of each incidence relation, the degree of association between this webpage ID and each webpage ID that has an incidence relation with it;
When the user's command of clicking a search result is received and the wireless web search transcoding page is entered, further selecting a second preset number of webpage IDs in descending order of their degree of association with this transcoding page, and recommending the webpage corresponding to each selected webpage ID in this transcoding page.
10. The webpage recommending method according to claim 1, characterized in that the method further comprises:
Obtaining the search result list corresponding to each keyword; for each webpage ID in the search result list corresponding to this keyword, obtaining the key word information of the webpage corresponding to this webpage ID and generating the feature vector corresponding to this webpage ID;
Generating one or more webpage clusters corresponding to this keyword according to the feature vectors of all webpage IDs in the search result list corresponding to the keyword;
When the user's command of clicking a search result is received and the wireless web search transcoding page is entered, further looking up the webpage cluster containing this transcoding page among the one or more webpage clusters corresponding to the keyword searched by this user, selecting a third preset number of webpage IDs from the webpage cluster found, and recommending the webpage corresponding to each selected webpage ID in this transcoding page.
11. The webpage recommending method according to claim 10, characterized in that,
The method of obtaining the key word information of the webpage corresponding to this webpage ID and generating the feature vector corresponding to this webpage ID is: performing word segmentation on the content of the webpage corresponding to this webpage ID, removing invalid words, counting the number of occurrences of each remaining keyword in this webpage, and generating the feature vector corresponding to this webpage ID according to the number of occurrences of each keyword in this webpage;
The method of generating one or more webpage clusters corresponding to a keyword according to the feature vectors of all webpage IDs in the search result list corresponding to the keyword is: clustering the feature vectors of all webpage IDs in the search result list corresponding to this keyword with the K-nearest neighbor (KNN) classification algorithm.
12. A webpage recommending device, characterized in that the device comprises: a log acquisition unit, a first analytic unit and a recommendation unit;
The log acquisition unit is used to obtain a click inquiry log, the click inquiry log comprising user IDs, keywords and webpage IDs;
The first analytic unit is used to: gather the key word information of each user ID in the click inquiry log and set up the interest model of this user ID; gather the webpage IDs of all user IDs in the click inquiry log, obtain the key word information in the webpage corresponding to each webpage ID, and set up the interest model of this webpage ID; and determine the degree of association of the user ID and the webpage ID according to the interest model of the user ID and the interest model of the webpage ID;
The recommendation unit is used to: when the user's command of clicking a search result is received and the wireless web search transcoding page is entered, select a first preset number of webpage IDs in descending order of their degree of association with the user ID, and recommend the webpage corresponding to each selected webpage ID in this transcoding page.
13. The webpage recommending device according to claim 12, characterized in that,
The interest model of the user ID comprises a first item of interest, the first item of interest comprises a plurality of first interest subitems, and each first interest subitem comprises a keyword and the interest-degree of the user ID in the keyword;
When gathering the key word information of each user ID in the click inquiry log and setting up the interest model of this user ID, the first analytic unit is used to: gather all keywords queried by the user corresponding to this user ID, count the number of webpage IDs clicked by this user when querying each keyword, and determine the interest-degree of this user ID in this keyword according to the number of clicked webpage IDs;
The interest model of the webpage ID comprises a second item of interest, the second item of interest comprises a plurality of second interest subitems, and each second interest subitem comprises a keyword and the interest-degree of the webpage ID in the keyword;
When gathering the webpage IDs of all user IDs in the click inquiry log, obtaining the key word information in the webpage corresponding to each webpage ID and setting up the interest model of this webpage ID, the first analytic unit is used to: perform word segmentation on the content of the webpage corresponding to this webpage ID, remove invalid words, count the number of occurrences of each remaining keyword in this webpage, and determine the interest-degree of this webpage ID in this keyword according to the number of occurrences of this keyword.
14. The webpage recommending device according to claim 13, characterized in that,
The first analysis unit, when determining the degree of association between a user ID and a webpage ID according to the interest model of the user ID and the interest model of the webpage ID, is configured to:
Generate an N-dimensional vector V_K1 according to the interest degree of the user ID in each keyword in the interest model of each user ID;
Generate an N-dimensional vector V_K2 according to the interest degree of the webpage ID in each keyword in the interest model of each webpage ID;
Calculate the distance D_K between the N-dimensional vectors V_K1 and V_K2, and record D_K as the degree of association between the user ID and the webpage ID.
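The vector comparison in this claim can be sketched as follows. The claim does not name a distance metric; cosine similarity is assumed here so that a larger value means a stronger association, which matches the later step of selecting webpage IDs from high to low degree of association. The function names and the choice of the shared vocabulary as the N dimensions are likewise assumptions.

```python
import math


def to_vector(interest, vocabulary):
    """Project a {keyword: interest} dict onto a fixed, ordered vocabulary."""
    return [float(interest.get(keyword, 0)) for keyword in vocabulary]


def cosine(v1, v2):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


def association(user_interest, page_interest):
    """Degree of association between one user model and one webpage model."""
    # Assumption: N is the size of the union of both keyword sets.
    vocabulary = sorted(set(user_interest) | set(page_interest))
    v_k1 = to_vector(user_interest, vocabulary)
    v_k2 = to_vector(page_interest, vocabulary)
    return cosine(v_k1, v_k2)
```

If a true distance such as Euclidean distance were used instead, the ranking would have to be inverted, since a smaller distance would then indicate a closer match.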
15. The webpage recommending device according to claim 12, characterized in that,
The interest model of the user ID comprises a third interest item; the third interest item comprises a plurality of third interest sub-items, and each third interest sub-item comprises a keyword type and the interest degree of the user ID in that keyword type;
The first analysis unit, when aggregating the keyword information of each user ID in the click query log and establishing the interest model of that user ID, is configured to: aggregate all keywords queried by the user corresponding to the user ID and determine the type to which each keyword belongs; count the number of webpage IDs clicked when the user queried each type of keyword, and determine the interest degree of the user ID in that type of keyword according to the number of webpage IDs clicked by the user;
The interest model of the webpage ID comprises a fourth interest item; the fourth interest item comprises a plurality of fourth interest sub-items, and each fourth interest sub-item comprises a keyword type and the interest degree of the webpage ID in that keyword type;
The first analysis unit, when aggregating the webpage IDs of all user IDs in the click query log, obtaining the keyword information in the webpage corresponding to each webpage ID, and establishing the interest model of that webpage ID, is configured to: segment the content of the webpage corresponding to the webpage ID into words, remove invalid words, determine the type to which each remaining keyword belongs, count the number of occurrences of each type of keyword in the webpage, and determine the interest degree of the webpage ID in that type of keyword according to the number of occurrences of that type of keyword.
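The type-level models of this claim can be derived from keyword-level counts, as in the Python sketch below. The KEYWORD_TYPES table, the "other" fallback type and the function name are hypothetical; the claim does not say how the type of a keyword is determined. Summing per-keyword counts is also a simplification: for the user model it can count the same webpage twice when it was clicked under two keywords of one type.

```python
from collections import Counter

# Hypothetical keyword-to-type table; a real system would need a classifier
# or a curated dictionary here.
KEYWORD_TYPES = {
    "iphone": "electronics",
    "laptop": "electronics",
    "sedan": "automotive",
}


def to_type_interest(keyword_interest, keyword_types=KEYWORD_TYPES, default_type="other"):
    """Aggregate {keyword: interest} into {keyword_type: interest}."""
    totals = Counter()
    for keyword, degree in keyword_interest.items():
        totals[keyword_types.get(keyword, default_type)] += degree
    return dict(totals)
```

The same function serves both the user side (third interest item) and the webpage side (fourth interest item), because both start from a keyword-to-count mapping.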
16. The webpage recommending device according to claim 15, characterized in that,
The first analysis unit, when determining the degree of association between a user ID and a webpage ID according to the interest model of the user ID and the interest model of the webpage ID, is configured to:
Generate an N-dimensional vector V_C1 according to the interest degree of the user ID in each type of keyword in the interest model of each user ID;
Generate an N-dimensional vector V_C2 according to the interest degree of the webpage ID in each type of keyword in the interest model of each webpage ID;
Calculate the distance D_C between the N-dimensional vectors V_C1 and V_C2, and record D_C as the degree of association between the user ID and the webpage ID.
17. The webpage recommending device according to claim 12, characterized in that,
The interest model of the user ID comprises a first interest item and a third interest item; the first interest item comprises a plurality of first interest sub-items, and each first interest sub-item comprises a keyword and the interest degree of the user ID in that keyword; the third interest item comprises a plurality of third interest sub-items, and each third interest sub-item comprises a keyword type and the interest degree of the user ID in that keyword type;
The first analysis unit, when aggregating the keyword information of each user ID in the click query log and establishing the interest model of that user ID, is configured to: aggregate all keywords queried by the user corresponding to the user ID and determine the type to which each keyword belongs; count the number of webpage IDs clicked when the user queried each keyword, and determine the interest degree of the user ID in that keyword according to the number of webpage IDs clicked by the user; count the number of webpage IDs clicked when the user queried each type of keyword, and determine the interest degree of the user ID in that type of keyword according to the number of webpage IDs clicked by the user;
The interest model of the webpage ID comprises a second interest item and a fourth interest item; the second interest item comprises a plurality of second interest sub-items, and each second interest sub-item comprises a keyword and the interest degree of the webpage ID in that keyword; the fourth interest item comprises a plurality of fourth interest sub-items, and each fourth interest sub-item comprises a keyword type and the interest degree of the webpage ID in that keyword type;
The first analysis unit, when aggregating the webpage IDs of all user IDs in the click query log, obtaining the keyword information in the webpage corresponding to each webpage ID, and establishing the interest model of that webpage ID, is configured to: segment the content of the webpage corresponding to the webpage ID into words, remove invalid words, count the number of occurrences of each remaining keyword in the webpage, and determine the interest degree of the webpage ID in that keyword according to the number of occurrences of the keyword; determine the type to which each keyword belongs, count the number of occurrences of each type of keyword in the webpage, and determine the interest degree of the webpage ID in that type of keyword according to the number of occurrences of that type of keyword.
18. The webpage recommending device according to claim 17, characterized in that,
The first analysis unit, when determining the degree of association between a user ID and a webpage ID according to the interest model of the user ID and the interest model of the webpage ID, is configured to:
Generate an N-dimensional vector V_K1 according to the interest degree of the user ID in each keyword in the interest model of each user ID, and generate an N-dimensional vector V_C1 according to the interest degree of the user ID in each type of keyword in the interest model of each user ID;
Generate an N-dimensional vector V_K2 according to the interest degree of the webpage ID in each keyword in the interest model of each webpage ID, and generate an N-dimensional vector V_C2 according to the interest degree of the webpage ID in each type of keyword in the interest model of each webpage ID;
Calculate the distance D_K between the N-dimensional vectors V_K1 and V_K2 and the distance D_C between the N-dimensional vectors V_C1 and V_C2, and compute the degree of association between the user ID and the webpage ID as a weighted combination of D_K and D_C.
19. The webpage recommending device according to claim 18, characterized in that,
The first analysis unit computes the degree of association between the user ID and the webpage ID as a weighted combination of D_K and D_C using the following formula:
D = a * D_K + (1 - a) * D_C, where D is the degree of association between the user ID and the webpage ID, and a is a preset real number greater than 0 and less than 1.
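The weighting formula of this claim translates directly into code. The sketch below is illustrative only; the function name and the default a = 0.5 are assumptions, since the claim only requires a preset value strictly between 0 and 1.

```python
def combined_association(d_k, d_c, a=0.5):
    """Return D = a * D_K + (1 - a) * D_C.

    d_k: keyword-level degree of association (D_K)
    d_c: keyword-type-level degree of association (D_C)
    a:   preset weight; the claim requires 0 < a < 1
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must be greater than 0 and less than 1")
    return a * d_k + (1.0 - a) * d_c
```

A value of a close to 1 lets exact keyword matches dominate, while a value close to 0 favors the coarser type-level match, which can help when a user's keywords rarely appear verbatim in a webpage.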
20. The webpage recommending device according to claim 12, characterized in that the device further comprises a second analysis unit;
The second analysis unit is configured to: aggregate each keyword queried, within a preset time in the click query log, by the users corresponding to all user IDs; for each user who queried the keyword, establish an association relationship between any two webpage IDs clicked by that user; aggregate the webpage IDs clicked within the preset time by all users who queried the keyword; and, for each webpage ID, count the number of occurrences of each association relationship involving that webpage ID and determine, according to those counts, the degree of association between that webpage ID and each webpage ID with which it has an association relationship;
The recommendation unit, when a user's command to click a search result is received and the wireless web search transcoding page is entered, is further configured to select a second preset number of webpage IDs in descending order of their degree of association with the transcoding page, and to recommend the webpages corresponding to the selected webpage IDs in the transcoding page.
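The co-click association of this claim can be sketched in Python as below. Several simplifications are assumed and are not taken from the claim: the log is taken to be already restricted to the preset time window, pair counts are pooled across keywords rather than kept per keyword, the raw pair count is used directly as the degree of association, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_click_associations(click_log):
    """Count how often two webpage IDs were clicked by the same user for the same keyword.

    click_log: iterable of (user_id, keyword, webpage_id) records.
    Returns a Counter keyed by sorted (page_a, page_b) pairs.
    """
    pages_by_query = defaultdict(set)
    for user_id, keyword, webpage_id in click_log:
        pages_by_query[(user_id, keyword)].add(webpage_id)

    pair_counts = Counter()
    for pages in pages_by_query.values():
        # One association relationship per unordered pair of clicked pages.
        for page_a, page_b in combinations(sorted(pages), 2):
            pair_counts[(page_a, page_b)] += 1
    return pair_counts


def related_pages(pair_counts, page_id, top_n):
    """Return up to top_n webpage IDs most strongly associated with page_id."""
    scores = Counter()
    for (page_a, page_b), count in pair_counts.items():
        if page_a == page_id:
            scores[page_b] += count
        elif page_b == page_id:
            scores[page_a] += count
    # Highest count first; ties broken by webpage ID for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:top_n]]
```

Here related_pages plays the role of the recommendation unit: page_id is the transcoding page the user has just opened, and top_n corresponds to the second preset number.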
21. The webpage recommending device according to claim 12, characterized in that the device further comprises a third analysis unit;
The log acquisition unit is further configured to obtain the search result list corresponding to each keyword;
The third analysis unit is configured to: for each webpage ID in the search result list corresponding to each keyword, obtain the keyword information in the webpage corresponding to that webpage ID and generate the feature vector corresponding to that webpage ID; and generate one or more webpage clusters corresponding to the keyword according to the feature vectors of all webpage IDs in the search result list corresponding to that keyword;
The recommendation unit, when a user's command to click a search result is received and the wireless web search transcoding page is entered, is further configured to: look up, among the one or more webpage clusters corresponding to the keyword searched by the user, the webpage cluster to which the transcoding page belongs; select a third preset number of webpage IDs from the cluster found; and recommend the webpages corresponding to the selected webpage IDs in the transcoding page.
22. The webpage recommending device according to claim 21, characterized in that,
The third analysis unit, when obtaining the keyword information in the webpage corresponding to the webpage ID and generating the feature vector corresponding to that webpage ID, is configured to: segment the content of the webpage corresponding to the webpage ID into words, remove invalid words, count the number of occurrences of each remaining keyword in the webpage, and generate the feature vector corresponding to the webpage ID according to the number of occurrences of each keyword in the webpage;
The third analysis unit, when generating one or more webpage clusters corresponding to a keyword according to the feature vectors of all webpage IDs in the search result list corresponding to that keyword, is configured to: cluster the feature vectors of all webpage IDs in the search result list corresponding to the keyword using the K-nearest neighbor (KNN) classification algorithm.
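The grouping step of this claim can be illustrated with a plain K-nearest-neighbor vote. KNN is a classification method and needs labeled examples, and the claim does not say where such labels come from; the sketch therefore assumes a small set of seed webpages whose cluster labels are already known. The Euclidean metric, the default k = 3 and the function names are also assumptions.

```python
import math
from collections import Counter


def euclidean(v1, v2):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


def knn_assign(seeds, vector, k=3):
    """Assign a webpage feature vector to a cluster by majority vote.

    seeds: list of (feature_vector, cluster_label) pairs with known labels
           (an assumption of this sketch).
    vector: feature vector of the webpage to place.
    """
    nearest = sorted(seeds, key=lambda seed: euclidean(seed[0], vector))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Once every webpage in a keyword's search result list has a label, the recommendation unit of claim 21 only needs to find the label of the transcoding page and pick the third preset number of other webpages carrying the same label.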
CN201210080831.5A 2012-03-23 2012-03-23 A kind of webpage recommending method and device Active CN103324645B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210080831.5A CN103324645B (en) 2012-03-23 2012-03-23 A kind of webpage recommending method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210080831.5A CN103324645B (en) 2012-03-23 2012-03-23 A kind of webpage recommending method and device

Publications (2)

Publication Number Publication Date
CN103324645A true CN103324645A (en) 2013-09-25
CN103324645B CN103324645B (en) 2018-10-09

Family

ID=49193392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210080831.5A Active CN103324645B (en) 2012-03-23 2012-03-23 A kind of webpage recommending method and device

Country Status (1)

Country Link
CN (1) CN103324645B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101551806A (en) * 2008-04-03 2009-10-07 北京搜狗科技发展有限公司 Personalized website navigation method and system
CN101819572A (en) * 2009-09-15 2010-09-01 电子科技大学 Method for establishing user interest model
CN101789018A (en) * 2010-02-09 2010-07-28 清华大学 Method and device for constructing webpage click describing files based on mutual information
US20110302155A1 (en) * 2010-06-03 2011-12-08 Microsoft Corporation Related links recommendation
CN101853308A (en) * 2010-06-11 2010-10-06 中兴通讯股份有限公司 Method and application terminal for personalized meta-search
CN102364467A (en) * 2011-09-29 2012-02-29 北京亿赞普网络技术有限公司 Network search method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王宇: "基于搜索历史的用户兴趣建模", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559265A (en) * 2013-11-04 2014-02-05 北京中搜网络技术股份有限公司 Individualized push method of cell phone client
CN103678710A (en) * 2013-12-31 2014-03-26 同济大学 Information recommendation method based on user behaviors
CN104063443A (en) * 2014-06-13 2014-09-24 百度在线网络技术(北京)有限公司 Method and device for providing search result
CN104268268B (en) * 2014-10-13 2018-05-22 宁波公众信息产业有限公司 A kind of webpage information correlating method and system
CN104268268A (en) * 2014-10-13 2015-01-07 宁波公众信息产业有限公司 Method and system for associating webpage information
CN105989020B (en) * 2015-01-29 2019-09-10 北京灵集科技有限公司 A kind of matched method and apparatus of call network multi-data source
CN105989020A (en) * 2015-01-29 2016-10-05 北京灵集科技有限公司 Method and device for multi-data source matching of call network
CN106156106B (en) * 2015-04-03 2019-10-22 阿里巴巴集团控股有限公司 The calculation method and device of user characteristic data
CN106156106A (en) * 2015-04-03 2016-11-23 阿里巴巴集团控股有限公司 The computational methods of user characteristic data and device
CN105488205A (en) * 2015-12-09 2016-04-13 百度在线网络技术(北京)有限公司 Page generation method and page generation apparatus
CN105488205B (en) * 2015-12-09 2019-05-03 百度在线网络技术(北京)有限公司 Page generation method and device
CN105608071A (en) * 2015-12-21 2016-05-25 北京奇虎科技有限公司 Generation method and device for determining machine learning algorithm of head word
CN105528456B (en) * 2015-12-25 2019-04-26 北京奇虎科技有限公司 Search interface methods of exhibiting and device based on user type
CN105528456A (en) * 2015-12-25 2016-04-27 北京奇虎科技有限公司 User type based search interface showing method and device
CN105589971B (en) * 2016-01-08 2018-12-18 车智互联(北京)科技有限公司 The method, apparatus and recommender system of training recommended models
CN105678335A (en) * 2016-01-08 2016-06-15 车智互联(北京)科技有限公司 Click rate pre-estimation method, device and calculating equipment
CN105678335B (en) * 2016-01-08 2019-07-02 车智互联(北京)科技有限公司 It estimates the method, apparatus of clicking rate and calculates equipment
CN105589971A (en) * 2016-01-08 2016-05-18 车智互联(北京)科技有限公司 Method and device for training recommendation model, and recommendation system
CN107544980B (en) * 2016-06-24 2020-07-24 北京国双科技有限公司 Method and device for searching webpage
CN107544980A (en) * 2016-06-24 2018-01-05 北京国双科技有限公司 A kind of method and device for searching webpage
CN106294596A (en) * 2016-07-29 2017-01-04 北京小米移动软件有限公司 The method and device of information search
CN106844680A (en) * 2017-01-25 2017-06-13 百度在线网络技术(北京)有限公司 The methods of exhibiting and device of recommendation information
CN108153857A (en) * 2017-12-22 2018-06-12 北京奇虎科技有限公司 A kind of method and system for being used to be associated network access data processing
CN109241403A (en) * 2018-08-03 2019-01-18 腾讯科技(深圳)有限公司 Item recommendation method, device, machinery equipment and computer readable storage medium
CN109241403B (en) * 2018-08-03 2022-11-22 腾讯科技(北京)有限公司 Project recommendation method and device, machine equipment and computer-readable storage medium
CN109685539A (en) * 2018-08-21 2019-04-26 平安普惠企业管理有限公司 Homepage methods of exhibiting, equipment, storage medium and device based on data processing
CN109871380B (en) * 2019-01-14 2022-11-11 深圳市东信时代信息技术有限公司 Crowd pack application method and system based on Redis
CN109871380A (en) * 2019-01-14 2019-06-11 深圳市东信时代信息技术有限公司 A kind of crowd's packet application method and system based on Redis
CN110990571A (en) * 2019-12-02 2020-04-10 精硕科技(北京)股份有限公司 Method and device for obtaining discussion occupation ratio, storage medium and electronic equipment
CN110990571B (en) * 2019-12-02 2024-04-02 北京秒针人工智能科技有限公司 Method and device for acquiring discussion duty ratio, storage medium and electronic equipment
CN112507230A (en) * 2020-12-16 2021-03-16 平安银行股份有限公司 Webpage recommendation method and device based on browser, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN103324645B (en) 2018-10-09

Similar Documents

Publication Publication Date Title
CN103324645A (en) Method and device for recommending webpage
US10102307B2 (en) Method and system for multi-phase ranking for content personalization
CN102929928B (en) Multidimensional-similarity-based personalized news recommendation method
CN103473273B (en) Information search method, device and server
JP5860456B2 (en) Determination and use of search term weighting
CN107766399B (en) Method and system for matching images to content items and machine-readable medium
US8150979B1 (en) Supporting multiple landing pages
CN106339394B (en) Information processing method and device
CN105701216A (en) Information pushing method and device
Rakesh et al. Personalized recommendation of twitter lists using content and network information
CN102073699A (en) Method, device and equipment for improving search result based on user behaviors
CN103365839A (en) Recommendation search method and device for search engines
CN104008109A (en) User interest based Web information push service system
CN103631794A (en) Method, device and equipment for sorting search results
WO2014149840A1 (en) Method and system for discovery of user unknown interests
CN101894170A (en) Semantic relationship network-based cross-mode information retrieval method
CN101256596A (en) Method and system for instation guidance
CN105721944A (en) News information recommendation method for smart television
CN108319376B (en) Input association recommendation method and device for optimizing commercial word promotion
CN102364467A (en) Network search method and system
CN102214207A (en) Method and equipment for sorting attribute sets in information entities
CN104503988A (en) Searching method and device
CN104123321B (en) A kind of determining method and device for recommending picture
CN103955480A (en) Method and equipment for determining target object information corresponding to user
US20140280350A1 (en) Method and system for user profiling via mapping third party interests to a universal interest space

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
ASS Succession or assignment of patent right

Owner name: SHENZHEN SHIJI LIGHT SPEED INFORMATION TECHNOLOGY

Free format text: FORMER OWNER: TENGXUN SCI-TECH (SHENZHEN) CO., LTD.

Effective date: 20131028

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 518044 SHENZHEN, GUANGDONG PROVINCE TO: 518057 SHENZHEN, GUANGDONG PROVINCE

TA01 Transfer of patent application right

Effective date of registration: 20131028

Address after: 518057 Tencent Building, 16, Nanshan District hi tech park, Guangdong, Shenzhen

Applicant after: Shenzhen Shiji Guangsu Information Technology Co., Ltd.

Address before: Shenzhen Futian District City, Guangdong province 518044 Zhenxing Road, SEG Science Park 2 East Room 403

Applicant before: Tencent Technology (Shenzhen) Co., Ltd.

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant