CN103136224A - Recommendation method and device for keywords - Google Patents

Info

Publication number
CN103136224A
Authority
CN
China
Prior art keywords
words
word
recommended
black horse
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201110379470XA
Other languages
Chinese (zh)
Inventor
广宇昊
鲍鹏飞
陈华良
冯幼乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu com Times Technology Beijing Co Ltd
Original Assignee
Baidu com Times Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu com Times Technology Beijing Co Ltd
Priority to CN201110379470XA
Publication of CN103136224A
Current legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a keyword recommendation method and device. The method comprises the following steps: mining, as black horse words, search terms whose average number of promotion results is less than a preset maximum number of promotion results and whose estimated utilization rate is greater than a preset utilization rate threshold, and adding the black horse words to a recommended word bank; and determining the words in the recommended word bank whose relevance to the client's business reaches a preset relevance threshold as the keywords recommended to the client. The method and device can improve the utilization rate of promotion resources, reduce their waste, increase the revenue of search engine providers, and better meet the needs of clients and users.

Description

Keyword recommendation method and device
[ technical field ]
The invention relates to the technical field of computer networks, in particular to a keyword recommendation method and device.
[ background of the invention ]
Search promotion is a successful form of online promotion with prominent commercial value, and it is widely adopted by search engine providers. To promote or market on the network, a client (in the invention, a client is an enterprise that promotes or markets through the network) purchases a keyword from a search engine provider. When a user (in the invention, a user is an ordinary network user) searches on the search engine with that keyword, the main search results are displayed together with the promotions of the clients who purchased the keyword. A promotion is usually placed above or to the right of the main search results, and its ranking may vary with the purchase and payment status of the other clients who purchased the keyword. A promotion typically takes the form of a link to the client's website.
In the existing search promotion mode, when a client inputs a keyword, other keywords related to it are recommended to the client to select and purchase. This recommendation is based only on the relevance between the input keyword and the candidate keywords. In actual search promotion, however, the promotion positions of some keywords are often not full: when a user searches with such a keyword, some of its promotion positions are vacant. This situation has the following defects:
First, promotion resources are wasted. When a search on such a keyword occurs, the promotion resources are not fully used, which wastes resources and also reduces the revenue of the search engine provider.
Second, client needs are not well met. When purchasing keywords, a client usually wants to buy and use a keyword ahead of its peers, and keywords with vacant promotion positions are exactly the "value gaps" that clients try to find.
Third, user needs are not well met. When a user searches with a keyword, a shortage of promotion results may mean that some relevant clients are not shown in the promotion positions, so the user's actual needs may go unmet.
Clearly, the existing keyword recommendation technology cannot find keywords whose promotion positions are not full, and therefore cannot remedy these defects.
[ summary of the invention ]
The invention provides a keyword recommendation method and device, which are used for improving the utilization rate of popularization resources and better meeting the requirements of customers and users.
The specific technical scheme is as follows:
a keyword recommendation method comprises the following steps:
S1, mining from search logs, as black horse words, the search terms whose average number of promotion results is smaller than a preset maximum number of promotion results and whose estimated utilization rate is greater than a preset utilization rate threshold, and adding the black horse words to a recommended word bank;
S2, determining the words in the recommended word bank whose relevance to the client's business reaches a preset relevance threshold as the keywords recommended to the client.
In step S1, mining the black horse words specifically includes:
S11, acquiring the search logs of a set time period;
S12, calculating the average number of promotion results of each search term in the search logs, wherein the average number of promotion results of a search term is the ratio of the number of promotion results shown for the search term to the number of searches of the search term in which promotion results were shown;
S13, determining the search terms whose average number of promotion results is smaller than the preset maximum number of promotion results as candidate black horse words;
S14, determining the estimated utilization rate of each candidate black horse word, and selecting the candidate black horse words whose estimated utilization rate is greater than the preset utilization rate threshold as the black horse words.
Preferably, between steps S11 and S12, the method further includes: S16, determining from the search logs the search terms whose number of searches is greater than a preset search count threshold;
in this case, calculating the average number of promotion results of each search term in the search logs in step S12 is: calculating the average number of promotion results of each search term determined in step S16.
Specifically, in step S14, the determining of the expected utilization rate of each candidate black horse word is:
determining the estimated utilization rate of each candidate black horse word according to the utilization rate parameter values of the keywords in the existing recommended word library and the correlation degree of the candidate black horse words and the recommended word library, wherein the utilization rate parameter values at least comprise: number of clicks or purchases.
The estimated utilization rate score(w) of a candidate black horse word w is calculated according to

$$\mathrm{score}(w) = \sum_{i=1}^{M} \alpha_i \cdot \mathrm{avg}(P_i)$$

where P_i is the i-th utilization rate parameter, M is the number of utilization rate parameters, α_i is the weight of the i-th utilization rate parameter, and avg(P_i) is the average value of P_i over the keywords in the existing recommended word bank that meet a preset relevance requirement with the candidate black horse word w.
Preferably, the selecting the candidate black horse word with the estimated utilization rate greater than the preset utilization rate threshold as the black horse word in step S14 includes:
selecting the candidate black horse words whose estimated utilization rate is greater than the preset utilization rate threshold, and filtering the selected candidate black horse words according to a preset filtering strategy to obtain the black horse words.
Wherein, the step S2 specifically includes:
determining words in the recommended word bank, wherein the correlation degree between the words and the keywords input by the client reaches a preset correlation degree threshold value, and taking the determined words as the keywords recommended to the client; or,
determining words of which the correlation degree with the keywords purchased by the customer in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as the keywords recommended to the customer; or,
extracting a characteristic vector from the business data of the customer, determining words of which the correlation degree with the characteristic vector in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as keywords recommended to the customer.
Further, the method also includes: ranking the keywords recommended to the client;
the ranking includes: placing the black horse words among the keywords recommended to the client at the front; or taking a weight for black horse words into account in the ranking algorithm for the keywords recommended to the client.
Preferably, the region attribute of each word in the recommended word bank is identified; in step S2, the region attribute of the client is extracted, and the words in the recommended word bank that have the same region attribute as the client and whose relevance to the client's business reaches the preset relevance threshold are determined as the keywords recommended to the client.
Further, the method also includes: providing a black horse word identifier or a recommendation reason for the black horse words among the keywords recommended to the client.
A keyword recommendation device, the device comprising:
a black horse word mining unit, used for mining from search logs, as black horse words, the search terms whose average number of promotion results is smaller than a preset maximum number of promotion results and whose estimated utilization rate is greater than a preset utilization rate threshold, and adding the black horse words to a recommended word bank;
and a keyword recommendation unit, used for determining the words in the recommended word bank whose relevance to the client's business reaches a preset relevance threshold as the keywords recommended to the client.
The black horse word mining unit specifically comprises:
a log obtaining subunit, used for obtaining the search logs of a set time period;
a calculation subunit, used for calculating the average number of promotion results of each search term in the search logs, wherein the average number of promotion results of a search term is the ratio of the number of promotion results shown for the search term to the number of searches of the search term in which promotion results were shown;
a first selection subunit, used for determining the search terms whose average number of promotion results is smaller than the preset maximum number of promotion results as candidate black horse words;
a second selection subunit, used for determining the estimated utilization rate of each candidate black horse word, selecting the candidate black horse words whose estimated utilization rate is greater than the preset utilization rate threshold, and providing them to the word bank adding subunit;
and a word bank adding subunit, used for adding the received candidate black horse words to the recommended word bank as black horse words.
Preferably, the calculation subunit determines, from the search logs, the search terms whose number of searches is greater than a preset search count threshold, and calculates the average number of promotion results of each search term so determined.
Specifically, the second selecting subunit determines the expected utilization rate of each candidate black horse word according to the utilization rate parameter value of the keyword in the existing recommended word bank and the correlation between the candidate black horse word and the recommended word bank, where the utilization rate parameter value at least includes: number of clicks or purchases.
The second selection subunit calculates the estimated utilization rate score(w) of a candidate black horse word w according to

$$\mathrm{score}(w) = \sum_{i=1}^{M} \alpha_i \cdot \mathrm{avg}(P_i)$$

where P_i is the i-th utilization rate parameter, M is the number of utilization rate parameters, α_i is the weight of the i-th utilization rate parameter, and avg(P_i) is the average value of P_i over the keywords in the existing recommended word bank that meet a preset relevance requirement with the candidate black horse word w.
Still further, the black horse word mining unit further includes:
a filtering subunit, used for filtering, according to a preset filtering strategy, the candidate black horse words output by the second selection subunit, and then providing the filtered candidate black horse words to the word bank adding subunit.
The keyword recommending unit determines words in the recommended word bank, the relevancy of which to the keywords input by the client reaches a preset relevancy threshold, and takes the determined words as the keywords recommended to the client; or,
determining words of which the correlation degree with the keywords purchased by the customer in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as the keywords recommended to the customer; or,
extracting a characteristic vector from the business data of the customer, determining words of which the correlation degree with the characteristic vector in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as keywords recommended to the customer.
Further, the device also comprises: a keyword ranking unit, used for ranking the keywords recommended by the keyword recommendation unit;
the ranking includes: placing the black horse words among the keywords recommended to the client at the front; or taking a weight for black horse words into account in the ranking algorithm for the keywords recommended to the client.
Preferably, the keyword recommendation unit is further used for identifying the region attribute of each word in the recommended word bank and, when determining the keywords recommended to the client, determining as those keywords the words in the recommended word bank that have the same region attribute as the client and whose relevance to the client's business reaches the preset relevance threshold.
In addition, the keyword recommendation unit is further used for providing the black horse word with a black horse word identifier or a recommendation reason in the keywords recommended to the client.
According to the above technical scheme, search terms whose average number of promotion results is smaller than the preset maximum number of promotion results and whose estimated utilization rate is greater than the preset utilization rate threshold are mined from the search logs and added to the recommended word bank, so that keywords with unfilled promotion positions and a high estimated utilization rate can be recommended to clients. This improves the utilization rate of promotion resources and reduces their waste, lets clients quickly find "value gaps" and seize opportunities first, and ensures that enough promotion results are provided to users when they search with these keywords, meeting the users' actual needs.
[ description of the drawings ]
FIG. 1 is a flowchart of the main method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a method for mining black horse words according to a second embodiment of the present invention;
FIG. 3 is an example diagram of recommending keywords to a client according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a device according to a fourth embodiment of the present invention.
[ detailed description of the embodiments ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Embodiment 1
Fig. 1 is a flowchart of a main method according to an embodiment of the present invention, and as shown in fig. 1, the method may include the following steps:
Step 101: Mine from the search logs, as black horse words, the search terms whose average number of promotion results is smaller than the preset maximum number of promotion results and whose estimated utilization rate is greater than the preset utilization rate threshold, and add the determined black horse words to the recommended word bank.
This step mines the keywords to be recommended: keywords whose promotion positions are vacant and whose estimated utilization rate is high (called black horse words in the embodiments of the invention) are mined and added to the recommended word bank so that they can be recommended to clients. The specific method for mining black horse words is described in detail in the second embodiment.
The recommended word bank in the embodiments of the invention may include only the black horse words, or may include, in addition to the black horse words, all keywords already purchased by clients.
Step 102: Determine the words in the recommended word bank whose relevance to the client's business reaches a preset relevance threshold as the keywords recommended to the client.
This step recommends keywords: keywords suitable for recommending to the client are found in the recommended word bank and are recommended to the client through active or passive triggering.
Embodiment 2
Fig. 2 is a flowchart of the method for mining black horse words according to the second embodiment of the present invention, that is, a specific implementation of step 101. As shown in fig. 2, the mining method may include the following steps:
Step 201: Acquire the search logs of a set time period.
Step 202: From the acquired search logs, determine the search terms whose number of searches (PV) is greater than a preset search count threshold (PV_th).
Generally, the number of searches of a search term reflects its value: the fewer the searches, the lower the value. Search terms whose number of searches is less than or equal to the preset threshold are therefore excluded from the selection of black horse words. Step 202 may of course be skipped if the number of searches is not taken into account.
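As an illustration of step 202, the following minimal Python sketch counts the number of searches (PV) per search term and keeps only the terms above the threshold. The function name `frequent_terms` and the representation of the log as a plain list of search terms are assumptions made for this example; they are not specified by the described method.

```python
from collections import Counter


def frequent_terms(log_terms, pv_threshold):
    """Return {term: PV} for terms searched more than pv_threshold times.

    log_terms is a hypothetical flat list with one entry per logged search.
    """
    pv = Counter(log_terms)  # PV = number of searches per term
    # Strictly greater than the threshold, as in step 202.
    return {term: count for term, count in pv.items() if count > pv_threshold}


log_terms = ["flower delivery", "flower delivery", "flower delivery", "rare term"]
print(frequent_terms(log_terms, pv_threshold=2))  # -> {'flower delivery': 3}
```

Terms whose count equals the threshold are dropped, matching the "less than or equal to" exclusion described above.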
Step 203: the average number of promotional results (ASN) for each search term determined in step 202 is calculated.
In this step, the following formula can be adopted for calculation:
$$\mathrm{ASN} = \frac{\mathrm{PV\_show}}{\mathrm{epv}} \qquad (1)$$
where PV_show is the total number of promotion results shown for the search term, and epv is the number of searches of the search term in which promotion results were shown.
For example, if the search log shows 200 searches of the search term "flower express delivery" in which promotion results appeared, and a total of 800 promotion results were shown across those searches, the average number of promotion results of the search term is 800 / 200 = 4. The average number of promotion results reflects how fully the promotion positions of the search term are used.
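The ASN calculation can be sketched in Python as follows. The record layout (one `(search_term, number_of_promotion_results)` pair per logged search) and the function name are assumptions for illustration. The text does not say how to treat a term that never showed any promotion result (epv = 0); this sketch simply omits such terms.

```python
from collections import defaultdict


def average_promotion_results(records):
    """Compute ASN = PV_show / epv per search term.

    records: iterable of (search_term, n_promotion_results), one per search.
    Only searches that showed at least one promotion result count toward epv.
    """
    pv_show = defaultdict(int)  # total promotion results shown per term
    epv = defaultdict(int)      # searches in which promotion results appeared
    for term, n_results in records:
        if n_results > 0:
            pv_show[term] += n_results
            epv[term] += 1
    return {term: pv_show[term] / epv[term] for term in epv}


# 200 searches showing 4 promotion results each: 800 / 200 = 4.0
records = [("flower express delivery", 4)] * 200
print(average_promotion_results(records))
```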
Step 204: Determine the search terms whose average number of promotion results is smaller than the preset maximum number of promotion results as candidate black horse words.
Continuing the above example, if the maximum number of promotion results for the search term "flower express delivery" as a keyword is 5, that is, 5 promotion positions are set for "flower express delivery", then its average of 4 promotion results is below the maximum. This indicates that promotion positions of "flower express delivery" may be vacant, so it is taken as a candidate black horse word.
This step in effect mines the vacant promotion resources. Recommending them to clients as candidate black horse words reduces the waste of vacant promotion resources.
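A minimal sketch of the candidate selection in step 204 is shown below. It assumes a single preset maximum shared by all search terms; the text leaves open whether the maximum is global or set per keyword, and the function name is hypothetical.

```python
def candidate_black_horse_words(asn_by_term, max_promotion_results):
    """Return the terms whose ASN is strictly below the preset maximum."""
    return sorted(
        term for term, asn in asn_by_term.items()
        if asn < max_promotion_results
    )


asn_by_term = {"flower express delivery": 4.0, "smart phone": 5.0}
print(candidate_black_horse_words(asn_by_term, max_promotion_results=5))
# -> ['flower express delivery']
```

A term whose ASN equals the maximum has no vacant positions on average, so it is not a candidate.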
Step 205: Determine the estimated utilization rate of each candidate black horse word, and select the candidate black horse words whose estimated utilization rate is greater than a preset utilization rate threshold.
In this step, the estimated utilization rate of each candidate black horse word can be determined from the utilization rate parameter values of the keywords in the existing recommended word bank and the relevance between the candidate black horse word and the recommended word bank. A machine learning method can be used: the keywords in the existing recommended word bank serve as a training set, which records the utilization rate parameters of those keywords, such as the number of clicks and the number of purchases. The training set reflects the actual behavior of clients and users and can be updated in a timely manner. Each candidate black horse word is then evaluated against the training set: the relevance between the candidate black horse word and the keywords of the training set is calculated, the keywords in the training set that meet a preset relevance requirement with the candidate are identified, and the utilization rate parameters of those keywords are used to calculate the estimated utilization rate of the candidate black horse word.
Wherein, the relevance of the candidate black horse word and the training set can be calculated by adopting the existing relevance calculation method such as BM25 algorithm.
For example, when the BM25 algorithm is used to calculate the correlation between the candidate black horse word and the training set, the formula is:
$$\mathrm{Score}(q,d) = \sum_{w \in d \cap q} \left( \ln\frac{N - df(w) + 0.5}{df(w) + 0.5} \times \frac{(k_1 + 1) \times c(w,d)}{k_1\left((1-b) + b\,\frac{|d|}{avgdl} + c(w,d)\right)} \times \frac{(k_3 + 1) \times c(w,q)}{k_3 + c(w,q)} \right)$$
where q is the candidate black horse word, d is a keyword in the training set, N is the total number of keywords in the training set, c(w, d) is the number of occurrences of the word w in d, c(w, q) is the number of occurrences of the word w in q, df(w) is the number of documents containing the word w in a large-scale document set, |d| is the length of d, and avgdl is the average length of the keywords in the training set. b, k_1 and k_3 are preset parameters: b controls the degree of the length penalty, k_1 controls the contribution of the number of occurrences of the word w in d, and k_3 controls the contribution of the number of occurrences of the word w in the candidate black horse word.
In calculating the estimated utilization rate of the candidate black horse words, the following formula can be adopted:
$$\mathrm{score}(w) = \sum_{i=1}^{M} \alpha_i \cdot \mathrm{avg}(P_i)$$
where score(w) is the estimated utilization rate of the candidate black horse word w, P_i is the i-th utilization rate parameter, M is the number of utilization rate parameters, α_i is the weight of the i-th utilization rate parameter, and avg(P_i) is the average value of P_i over the keywords in the training set that meet the preset relevance requirement with the candidate black horse word w.
Assume that there are two utilization rate parameters, the number of clicks and the number of purchases, and that the keywords in the training set meeting the preset relevance requirement with the candidate black horse word w are A, B and C. The average of the numbers of clicks of A, B and C and the average of their numbers of purchases are then combined in a weighted sum to obtain the estimated utilization rate of the candidate black horse word w.
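The weighted sum for keywords A, B and C can be sketched in Python as follows. The click and purchase counts, the weights and the function name are invented for illustration only; returning 0.0 when no keyword meets the relevance requirement is likewise an assumption, since the text does not cover that case.

```python
def estimated_utilization(related_keywords, weights):
    """score(w) = sum over i of alpha_i * avg(P_i).

    related_keywords: list of {parameter_name: value} dicts, one per training
        keyword that meets the relevance requirement with the candidate w.
    weights: {parameter_name: alpha_i}.
    """
    if not related_keywords:
        return 0.0  # assumption: no related keywords means no evidence of use
    score = 0.0
    for param, alpha in weights.items():
        avg = sum(kw[param] for kw in related_keywords) / len(related_keywords)
        score += alpha * avg
    return score


related = [
    {"clicks": 120, "purchases": 6},  # keyword A (made-up values)
    {"clicks": 90, "purchases": 3},   # keyword B
    {"clicks": 30, "purchases": 0},   # keyword C
]
weights = {"clicks": 0.5, "purchases": 10.0}  # hypothetical alpha values
# avg clicks = 80, avg purchases = 3, so 0.5 * 80 + 10 * 3 = 70.0
print(estimated_utilization(related, weights))
```

A candidate is then kept if this value exceeds the preset utilization rate threshold.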
This machine learning method derives the estimated utilization rate of a candidate black horse word from the known utilization rates of existing keywords. Training in this way quickly identifies the candidates with a higher estimated utilization rate, which improves the utilization rate of promotion resources as far as possible: it brings more value to clients on the one hand and provides more promotion results to users on the other.
Before this step, the candidate black horse words may be URL-decoded. Search terms are usually stored in the search log in URL-encoded form, so each candidate black horse word is URL-decoded before machine learning to make it consistent with the form of the keywords in the recommended word bank stored on the server. URL decoding is a common technique and is not described further here.
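For illustration, URL decoding of a logged search term might look like the following Python sketch, which uses the standard library function `urllib.parse.unquote_plus`. It assumes the log uses UTF-8 percent-encoding with `+` for spaces; a real search log may use a different character encoding, in which case the `encoding` argument would need to change.

```python
from urllib.parse import unquote_plus


def decode_search_term(raw_term):
    """Decode a percent-encoded search term (UTF-8 assumed) to plain text."""
    return unquote_plus(raw_term, encoding="utf-8")


print(decode_search_term("flower%20express+delivery"))  # -> flower express delivery
```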
Step 206: Filter the candidate black horse words selected in step 205 according to a preset filtering strategy to obtain the black horse words.
The filtering strategy in this step usually filters out keywords that violate national regulations, such as keywords involving pornographic or reactionary content.
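One simple way to realize such a filtering strategy is a blocklist check, sketched below in Python. The substring matching rule, the function name and the placeholder blocklist entries are assumptions; the text does not specify how the filtering strategy is implemented.

```python
def apply_filter_strategy(candidates, banned_fragments):
    """Drop every candidate that contains any banned fragment."""
    return [
        word for word in candidates
        if not any(fragment in word for fragment in banned_fragments)
    ]


candidates = ["flower express delivery", "banned-topic pictures", "smart phone"]
banned_fragments = ["banned-topic"]  # placeholder blocklist entry
print(apply_filter_strategy(candidates, banned_fragments))
# -> ['flower express delivery', 'smart phone']
```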
Embodiment 3
In step 102, the keywords recommended to the client may be determined in the following three modes:
The first mode: passively triggered recommendation. When the client inputs a keyword, keyword recommendation is triggered: the words in the recommended word bank whose relevance to the keyword input by the client reaches a preset relevance threshold are determined and taken as the keywords recommended to the client.
The second mode: active recommendation. The words in the recommended word bank whose relevance to the keywords already purchased by the client reaches a preset relevance threshold are determined and taken as the keywords recommended to the client.
The third mode: active recommendation. A feature vector is extracted from the business data of the client, and the words in the recommended word bank whose relevance to the feature vector of the client reaches a preset relevance threshold are determined and taken as the keywords recommended to the client.
In this way, feature vectors may be extracted from business data such as a promotion page of a customer, customer information (e.g., a registered name of the customer) or business information (e.g., business content filled by the customer, an industry scope, etc.), and then keywords recommended to the customer may be determined by calculating a correlation between words in the recommended word bank and the feature vectors. The manner of extracting the feature vector is the prior art, and is not described herein again.
The second and third modes can actively recommend keywords to the client when the client logs in or inputs a keyword.
In addition, the three modes may be combined in any way. For example, when a client inputs a keyword, the first and second modes may both be used to determine keywords, and the keywords determined in the two modes are merged and then recommended to the client together.
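Combining the first and second modes can be sketched as below. The relevance measure used here (token overlap between two phrases) is only a stand-in chosen to keep the example self-contained; the text leaves the relevance calculation open. All function names and the sample data are hypothetical.

```python
def token_overlap(a, b):
    """Toy relevance: shared tokens divided by all distinct tokens."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def recommend(word_bank, reference_keywords, threshold):
    """Words whose relevance to any reference keyword reaches the threshold."""
    return [
        word for word in word_bank
        if max((token_overlap(word, ref) for ref in reference_keywords),
               default=0.0) >= threshold
    ]


def recommend_combined(word_bank, typed_keywords, purchased_keywords, threshold):
    """Merge mode 1 (typed keywords) and mode 2 (purchased keywords) results."""
    merged = recommend(word_bank, typed_keywords, threshold)
    for word in recommend(word_bank, purchased_keywords, threshold):
        if word not in merged:  # avoid recommending the same word twice
            merged.append(word)
    return merged


word_bank = ["smart phone wholesale", "second-hand smart phone",
             "flower express delivery", "flower shop"]
print(recommend_combined(word_bank, ["smart phone"], ["flower delivery"], 0.5))
```

With this toy measure, the typed keyword "smart phone" pulls in the two smart phone phrases and the purchased keyword "flower delivery" adds "flower express delivery", while "flower shop" stays below the threshold.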
In addition, the conventional correlation calculation method can be used for the correlation calculation methods in the above three modes, and the present invention is not limited thereto.
After the keywords recommended to the client are determined as described above, they may be ranked in either of the following ways when being recommended to the client:
Ranking mode 1: place the black horse words among the keywords recommended to the client at the front.
If there are black horse words among the keywords recommended to the client, they are recommended first so that the client sees them first, which increases the probability that the black horse words are purchased.
Ranking mode 2: take a weight for black horse words into account in the keyword ranking algorithm.
In the embodiment of the invention, whether a keyword is a black horse word is given a weight that participates in the calculation of the keyword ranking, so that black horse words appear as near the front of the recommended keywords as possible.
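Both ranking modes can be sketched in a few lines of Python. The base scores, the additive form of the black horse weight and the function names are assumptions for illustration; the text only states that a black horse weight participates in the ranking calculation.

```python
def rank_black_horse_first(keywords, black_horse_words):
    """Ranking mode 1: black horse words first, original order otherwise kept."""
    # sorted() is stable, and False sorts before True.
    return sorted(keywords, key=lambda word: word not in black_horse_words)


def rank_with_weight(base_scores, black_horse_words, black_horse_weight):
    """Ranking mode 2: add a bonus to the base score of each black horse word."""
    def final_score(word):
        bonus = black_horse_weight if word in black_horse_words else 0.0
        return base_scores[word] + bonus
    return sorted(base_scores, key=final_score, reverse=True)


black_horse = {"second-hand smart phone"}
keywords = ["smart phone", "business smart phone", "second-hand smart phone"]
print(rank_black_horse_first(keywords, black_horse))
# -> ['second-hand smart phone', 'smart phone', 'business smart phone']

base_scores = {"smart phone": 0.9, "business smart phone": 0.6,
               "second-hand smart phone": 0.5}
print(rank_with_weight(base_scores, black_horse, black_horse_weight=0.3))
# -> ['smart phone', 'second-hand smart phone', 'business smart phone']
```

Mode 1 always puts black horse words on top, whereas mode 2 only lifts them, so a strongly relevant ordinary keyword can still rank first.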
As a preferred embodiment, when keywords are recommended to the client, the black horse words among them can be labeled. The labels may include, but are not limited to, a black horse word identifier and a recommendation reason for the black horse word.
The black horse word identifier may include, but is not limited to: displaying the black horse word in bold, marking it with a special color, or marking it with a graphic mark (for example, a "NEW" icon). A recommendation reason is, for example: "Newly emerging search term. Congratulations on being one step ahead and seizing the business opportunity." The recommendation reason may be displayed whenever the black horse word is recommended, or only when the mouse pointer moves over the black horse word.
As an example, as shown in fig. 3, assume that a client inputs the keyword "smart phone" to trigger keyword recommendation, and that at least one of the first, second and third modes determines the following keywords to recommend to the client: smart phone, business smart phone, buy smart phone, second-hand smart phone, domestic smart phone, smart phone wholesale, and so on. When these are ranked, the black horse words can be placed at the front and labeled with the "NEW" graphic mark. In addition, the daily average search volume of each recommended keyword (as shown in fig. 3), the degree of competition determined from the average number of promotion results, and the like may also be indicated.
Preferably, because clients' promotion needs are often strongly regional, that is, a client may prefer to place promotion content within its own region, region characteristics may be introduced into the recommendation process. Specifically, because the word bank is built from search logs, the region attribute of each word can be obtained when the word bank is formed (in any existing manner). When keywords are recommended to the client, the region attribute of the client is extracted, and the words in the recommended word bank that have the same region attribute as the client and whose relevance to the client's business reaches the preset relevance threshold are determined as the keywords recommended to the client.
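The region-aware selection can be sketched as follows. The tuple layout of the word bank entries, the precomputed relevance values, the region names and the function name are all assumptions made for this example.

```python
def recommend_in_region(word_bank, client_region, threshold):
    """Keep words in the client's region whose relevance reaches the threshold.

    word_bank: list of (word, region_attribute, relevance_to_client_business).
    """
    return [
        word for word, region, relevance in word_bank
        if region == client_region and relevance >= threshold
    ]


word_bank = [
    ("flower express delivery", "Beijing", 0.8),
    ("same-day flower delivery", "Shanghai", 0.9),  # relevant, wrong region
    ("wedding flowers", "Beijing", 0.3),            # right region, not relevant
]
print(recommend_in_region(word_bank, client_region="Beijing", threshold=0.5))
# -> ['flower express delivery']
```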
The above is a detailed description of the method provided by the embodiment of the present invention, and the following is a detailed description of the apparatus provided by the present invention through the fourth embodiment.
Example four,
Fig. 4 is a structural diagram of an apparatus according to a fourth embodiment of the present invention, and as shown in fig. 4, the apparatus includes: a black horse word mining unit 400 and a keyword recommendation unit 410.
The black horse word mining unit 400 mines search words, the average number of the promotion results of which is less than the preset maximum number of the promotion results and the estimated utilization rate of which is greater than the preset utilization rate threshold, from the search logs as black horse words, and adds the black horse words to the recommended word bank.
The keyword recommendation unit 410 determines a word in the recommended word library, the business relevance of which to the client reaches a preset relevance threshold, as a keyword recommended to the client.
The black horse word mining unit may specifically include: a log obtaining sub-unit 401, a calculating sub-unit 402, a first selecting sub-unit 403, a second selecting sub-unit 404, and a thesaurus adding sub-unit 405.
The log acquisition sub-unit 401 acquires a search log in a set period of time.
The calculating subunit 402 calculates the average promotion result number of each search word in the search log, where the average promotion result number of a search word is the ratio of the total number of promotion results that appeared for the search word to the number of searches of that word in which promotion results appeared.
Generally, the value of a search word is reflected by how often it is searched: the fewer the searches, the lower the value. Therefore, to improve the efficiency of mining black horse words, search words whose number of searches is less than or equal to a preset search count threshold can first be excluded from the selection. In this case, the calculating subunit 402 determines, from the search log, the search words whose number of searches is greater than the preset search count threshold, and calculates the average promotion result number only for each search word so determined.
The first selection subunit 403 determines a search word whose average number of promotion results is less than a preset maximum number of promotion results as a candidate black horse word.
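The work of the calculating subunit 402 and the first selection subunit 403 can be sketched as follows. This is only a minimal illustration, not the implementation of the embodiment: the log record layout (one `(search_word, promotion_result_count)` tuple per search), the function name `select_candidates`, and the choice to treat a word that never showed promotion results as having an average of 0 are all assumptions.

```python
from collections import defaultdict


def select_candidates(log_records, min_searches, max_avg_promos):
    """Return {search_word: average promotion result number} for candidate words.

    log_records: iterable of (search_word, promotion_result_count), one per search
                 (hypothetical layout, assumed for this sketch).
    min_searches: preset search count threshold; words at or below it are skipped.
    max_avg_promos: preset maximum promotion result number.
    """
    searches = defaultdict(int)        # total searches per word
    promo_searches = defaultdict(int)  # searches in which promotion results appeared
    promo_results = defaultdict(int)   # total promotion results shown per word

    for word, n_promos in log_records:
        searches[word] += 1
        if n_promos > 0:
            promo_searches[word] += 1
            promo_results[word] += n_promos

    candidates = {}
    for word, n_searches in searches.items():
        if n_searches <= min_searches:
            continue  # low-value word, excluded before any averaging
        shown = promo_searches[word]
        # Assumption: a word that never triggered promotion results averages 0.
        avg = promo_results[word] / shown if shown else 0.0
        if avg < max_avg_promos:
            candidates[word] = avg
    return candidates
```

With a log in which "smart phone" always shows 8 promotion results and "new phone" shows 1 and 3 results in two of its three searches, only "new phone" (average 2.0) falls below a maximum of 3 and becomes a candidate.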
The second selection sub-unit 404 determines the expected utilization rate of each candidate black horse word, selects a candidate black horse word whose expected utilization rate is greater than the preset utilization rate threshold, and provides the candidate black horse word to the word stock addition sub-unit 405.
Specifically, the second selection subunit 404 determines the expected utilization rate of each candidate black horse word from the utilization rate parameter values of the keywords in the existing recommended word bank and the correlation between the candidate black horse word and that word bank. A machine learning method may be used, with the keywords in the existing recommended word bank as the training set. The utilization rate parameters of the keywords in the training set may include the number of clicks, the number of purchases, and the like; because the training set reflects the actual behavior of clients and users, it can be updated in a timely manner. Based on the training set, each candidate black horse word is assigned an expected utilization rate according to its correlation with the training set. Specifically, the keywords in the training set that meet a preset correlation requirement with the candidate black horse word are determined, and the utilization rate parameter values of those keywords are used to calculate the expected utilization rate of the candidate black horse word.
For example, the expected utilization rate score(w) of a candidate black horse word w can be calculated as follows:
$$score(w) = \sum_{i=1}^{M} \alpha_i \cdot avg(P_i)$$
where $P_i$ is the i-th utilization rate parameter, $M$ is the number of utilization rate parameters, $\alpha_i$ is the weight of the i-th utilization rate parameter, and $avg(P_i)$ is the average value of $P_i$ over the keywords in the existing recommended word bank that meet the preset correlation requirement with the candidate black horse word w.
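Reading the symbol definitions above as a weighted sum of per-parameter averages, the score calculation can be sketched as below. The function name `expected_utilization`, the dictionary layout of the keyword statistics, and the fallback score of 0 when no related keyword exists are assumptions made for illustration only.

```python
def expected_utilization(related_keywords, weights):
    """Weighted sum of averaged utilization rate parameters.

    related_keywords: list of dicts mapping parameter name -> value, one dict per
                      keyword in the existing recommended word bank that meets the
                      preset correlation requirement with the candidate word w.
    weights: dict mapping parameter name -> weight alpha_i.
    """
    if not related_keywords:
        return 0.0  # assumption: no related keyword means no evidence of use

    score = 0.0
    for param, alpha in weights.items():
        # avg(P_i): mean of this parameter over the related keywords
        avg = sum(kw.get(param, 0.0) for kw in related_keywords) / len(related_keywords)
        score += alpha * avg
    return score
```

For two related keywords with 100 and 300 clicks and 10 and 30 purchases, and weights of 0.5 for clicks and 2 for purchases, the score is 0.5 × 200 + 2 × 20 = 140.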
The thesaurus adding sub-unit 405 adds the received candidate black horse word as a black horse word to the recommended thesaurus.
Further, the black horse word mining unit 400 may further include: a filtering subunit 406.
The filtering subunit 406 filters the candidate black horse words provided by the second selecting subunit 404 to the thesaurus adding subunit 405 according to a preset filtering policy, and then provides the filtered candidate black horse words to the thesaurus adding subunit 405.
In determining the keywords recommended to the client, the keyword recommendation unit 410 may adopt the following three ways:
The first mode is as follows: determining words in the recommended word bank whose relevance to the keywords input by the client reaches a preset relevance threshold, and taking the determined words as the keywords recommended to the client.
The second mode is as follows: determining words of which the correlation degree with the keywords purchased by the customer in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as the keywords recommended to the customer.
The third mode is as follows: extracting feature vectors from business data of the customers, determining words in the recommended word bank, wherein the correlation degree of the words and the feature vectors reaches a preset correlation degree threshold value, and taking the determined words as keywords recommended to the customers.
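The first and second modes share one pattern: compare each word in the recommended word bank against a set of seed keywords (those entered, or those already purchased) and keep the words that reach the threshold. The sketch below illustrates that pattern. The source does not specify how relevance is computed, so the token-overlap (Jaccard) measure and the names `jaccard` and `recommend` are placeholders assumed for this example.

```python
def jaccard(a, b):
    """Placeholder relevance measure: overlap of whitespace-separated tokens."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def recommend(word_bank, seed_keywords, threshold, relevance=jaccard):
    """Keep word bank entries whose relevance to any seed reaches the threshold.

    seed_keywords holds the keywords input by the client (first mode) or the
    keywords the client has purchased (second mode).
    """
    return [
        word
        for word in word_bank
        if any(relevance(word, seed) >= threshold for seed in seed_keywords)
    ]
```

The third mode would follow the same shape with the seeds replaced by a feature vector extracted from the client's business data and `relevance` replaced by a vector-based measure.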
Still further, the apparatus may also comprise a keyword ranking unit 420, configured to rank the keywords recommended by the keyword recommendation unit 410. The ranking may be done in either of two ways: placing the black horse words first among the keywords recommended to the client; or factoring the weight of the black horse words into the ranking algorithm for the keywords recommended to the client.
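The two ranking options of the keyword ranking unit 420 can be sketched as follows. The base score per keyword, the additive `boost`, and the function name `rank_keywords` are assumptions; the source does not state how the black horse weight enters the ranking algorithm.

```python
def rank_keywords(keywords, black_horse, base_score, boost=None):
    """Rank recommended keywords.

    keywords: list of recommended keywords.
    black_horse: set of keywords that are black horse words.
    base_score: dict keyword -> ranking score (hypothetical, higher is better).
    boost: if None, black horse words are simply placed first; otherwise the
           boost is added to the score of each black horse word (assumed scheme).
    """
    if boost is None:
        # Option 1: black horse words first, then by descending base score.
        return sorted(keywords, key=lambda k: (k not in black_horse, -base_score[k]))
    # Option 2: the black horse weight is one term in the ranking score.
    return sorted(
        keywords,
        key=lambda k: -(base_score[k] + (boost if k in black_horse else 0.0)),
    )
```

Option 1 guarantees that black horse words lead the list, while option 2 lets a strongly relevant ordinary keyword still outrank a weakly relevant black horse word.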
Because the delivery needs of clients are often region-specific, that is, a client may prefer to deliver promotional content within its own region, regional characteristics may be introduced into the recommendation process. In this case, the keyword recommendation unit 410 is further configured to identify the region attribute of each word in the recommended word bank and, when determining the keywords recommended to the client, to determine as recommended keywords the words in the recommended word bank whose relevance to the client's business reaches the preset relevance threshold and whose region attribute is the same as the client's.
The keyword recommendation unit 410 is further configured to provide a black horse word identifier or a recommendation reason for the black horse words among the keywords recommended to the client. The black horse word identifier may include, but is not limited to: displaying the black horse word in bold, marking it with a special color, or labeling it with a graphic mark (for example, a "NEW" icon). An example of a recommendation reason is: "This is a newly emerging search word. Get one step ahead and seize the business opportunity." The recommendation reason may be displayed whenever the black horse word is recommended, or only when the mouse pointer hovers over the black horse word.
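The region-aware selection amounts to adding one equality check to the relevance filter, as in this minimal sketch. The mapping of each word to a single region string, the callable `relevance_to_client`, and the function name are assumptions for illustration.

```python
def recommend_in_region(word_bank, client_region, relevance_to_client, threshold):
    """Keep words that share the client's region and reach the relevance threshold.

    word_bank: dict word -> region attribute (assumed to be one region per word).
    relevance_to_client: callable word -> relevance to the client's business
                         (hypothetical; the source does not define it).
    """
    return [
        word
        for word, region in word_bank.items()
        if region == client_region and relevance_to_client(word) >= threshold
    ]
```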
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (20)

1. A keyword recommendation method is characterized by comprising the following steps:
S1, mining, from the search logs, search words whose average promotion result number is smaller than a preset maximum promotion result number and whose estimated utilization rate is larger than a preset utilization rate threshold as black horse words, and adding the black horse words into a recommended word bank;
and S2, determining the words in the recommended word bank, the business relevance of which to the client reaches a preset relevance threshold, as the keywords recommended to the client.
2. The recommendation method according to claim 1, wherein in step S1, the mining of the black horse words specifically comprises:
S11, acquiring a search log in a set time period;
S12, calculating the average promotion result number of each search word in the search log, wherein the average promotion result number of a search word is the ratio of the number of promotion results appearing for the search word to the number of searches of the search word in which promotion results appeared;
S13, determining the search words with the average promotion result number smaller than the preset maximum promotion result number as candidate black horse words;
and S14, determining the estimated utilization rate of each candidate black horse word, and selecting the candidate black horse word with the estimated utilization rate larger than a preset utilization rate threshold value as the black horse word.
3. The recommendation method according to claim 2, further comprising, between the steps S11 and S12: s16, determining search words with the search times larger than a preset search time threshold value from the search log;
in step S12, the average number of the promotion results of each search term in the search log is calculated as: calculating the average promotion result number of each search term determined in the step S16.
4. The recommendation method according to claim 2, wherein the step S14 of determining the expected utilization rate of each candidate black horse word is:
determining the estimated utilization rate of each candidate black horse word according to the utilization rate parameter values of the keywords in the existing recommended word library and the correlation degree of the candidate black horse words and the recommended word library, wherein the utilization rate parameter values at least comprise: number of clicks or purchases.
5. The recommendation method according to claim 4, characterized in that the estimated utilization rate score(w) of a candidate black horse word w is calculated according to
$$score(w) = \sum_{i=1}^{M} \alpha_i \cdot avg(P_i)$$
wherein $P_i$ is the i-th utilization rate parameter, $M$ is the number of utilization rate parameters, $\alpha_i$ is the weight of the i-th utilization rate parameter, and $avg(P_i)$ is the average value of $P_i$ over the keywords in the existing recommended word bank that meet the preset correlation requirement with the candidate black horse word w.
6. The recommendation method according to claim 2, wherein the selecting the candidate black horse word with the expected utilization rate greater than the preset utilization rate threshold as the black horse word in step S14 comprises:
and selecting candidate black horse words with the estimated utilization rate larger than a preset utilization rate threshold value, and filtering the selected candidate black horse words according to a preset filtering strategy to obtain the black horse words.
7. The recommendation method according to claim 1, wherein the step S2 specifically includes:
determining words in the recommended word bank, wherein the correlation degree between the words and the keywords input by the client reaches a preset correlation degree threshold value, and taking the determined words as the keywords recommended to the client; or,
determining words of which the correlation degree with the keywords purchased by the customer in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as the keywords recommended to the customer; or,
extracting a characteristic vector from the business data of the customer, determining words of which the correlation degree with the characteristic vector in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as keywords recommended to the customer.
8. The recommendation method according to claim 1 or 7, characterized in that the method further comprises: ranking the keywords recommended to the client;
the sorting mode comprises the following steps: arranging the black horse words in the keywords recommended to the client in front; or, the weight of the black horse words is considered in a ranking algorithm of the keywords recommended to the client.
9. The recommendation method according to claim 1, wherein a geographic attribute of the words in the recommended thesaurus is identified, the geographic attribute of the client is extracted in the step S2, and the words in the recommended thesaurus having a correlation degree with the service of the client up to a preset correlation degree threshold and having the same geographic attribute are determined as the keywords recommended to the client.
10. The recommendation method according to any one of claims 1 to 7, further comprising: providing the black horse words with black horse word identification or recommendation reasons in the keywords recommended to the client.
11. An apparatus for recommending a keyword, the apparatus comprising:
a black horse word mining unit, used for mining, from a search log, search words whose average promotion result number is smaller than a preset maximum promotion result number and whose estimated utilization rate is larger than a preset utilization rate threshold as black horse words, and adding the black horse words into a recommended word bank;
and the keyword recommendation unit is used for determining the words in the recommended word bank, the business relevance of which to the client reaches a preset relevance threshold, as the keywords recommended to the client.
12. The recommendation device according to claim 11, wherein the black horse word mining unit specifically comprises:
the log obtaining subunit is used for obtaining the search logs in a set time period;
the calculation subunit is used for calculating the average promotion result number of each search word in the search log, wherein the average promotion result number of a search word is the ratio of the number of promotion results appearing for the search word to the number of searches of the search word in which promotion results appeared;
the first selection subunit is used for determining that the search words with the average promotion result number smaller than the preset maximum promotion result number serve as candidate black horse words;
the second selection subunit is used for determining the expected utilization rate of each candidate black horse word, selecting the candidate black horse words with the expected utilization rate larger than the preset utilization rate threshold value, and providing the candidate black horse words to the word bank adding subunit;
and the word bank adding subunit is used for adding the received candidate black horse words into the recommended word bank as the black horse words.
13. The recommendation device according to claim 12, wherein the calculation subunit determines search words from the search log, the number of search times of which is greater than a preset threshold number of search times, and calculates an average number of promotion results for each determined search word.
14. The recommendation device according to claim 12, wherein the second selection subunit determines the expected utilization rate of each candidate black horse word specifically according to a utilization rate parameter value of a keyword in an existing recommendation thesaurus and a correlation between the candidate black horse word and the recommendation thesaurus, where the utilization rate parameter value at least includes: number of clicks or purchases.
15. The recommendation device of claim 14, wherein the second selection subunit calculates the estimated utilization rate score(w) of a candidate black horse word w according to
$$score(w) = \sum_{i=1}^{M} \alpha_i \cdot avg(P_i)$$
wherein $P_i$ is the i-th utilization rate parameter, $M$ is the number of utilization rate parameters, $\alpha_i$ is the weight of the i-th utilization rate parameter, and $avg(P_i)$ is the average value of $P_i$ over the keywords in the existing recommended word bank that meet the preset correlation requirement with the candidate black horse word w.
16. The recommendation device of claim 12, wherein the black horse word mining unit further comprises:
and the filtering subunit is used for filtering the candidate black horse words provided by the second selecting subunit to the word bank adding subunit according to a preset filtering strategy and then providing the filtered candidate black horse words to the word bank adding subunit.
17. The recommendation device according to claim 11, wherein the keyword recommendation unit determines a word in the recommended word bank whose degree of correlation with the keyword input by the client reaches a preset degree of correlation threshold, and takes the determined word as the keyword recommended to the client; or,
determining words of which the correlation degree with the keywords purchased by the customer in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as the keywords recommended to the customer; or,
extracting a characteristic vector from the business data of the customer, determining words of which the correlation degree with the characteristic vector in the recommended word bank reaches a preset correlation degree threshold value, and taking the determined words as keywords recommended to the customer.
18. The recommendation device according to claim 11 or 17, further comprising: the keyword sorting unit is used for sorting the keywords recommended by the keyword recommending unit;
the sorting mode comprises the following steps: arranging the black horse words in the keywords recommended to the client in front; or, the weight of the black horse words is considered in a ranking algorithm of the keywords recommended to the client.
19. The recommendation device of claim 11, wherein the keyword recommendation unit is further configured to identify a regional attribute of a word in the recommended thesaurus, and when determining the keyword recommended to the client, determine a word in the recommended thesaurus, which has a service correlation with the client reaching a preset correlation threshold and has the same regional attribute, as the keyword recommended to the client.
20. The recommendation device according to any one of claims 11 to 17, wherein the keyword recommendation unit is further configured to provide a black horse word identifier or a recommendation reason for the black horse word in the keywords recommended to the client.
CN201110379470XA 2011-11-24 2011-11-24 Recommendation method and device for keywords Pending CN103136224A (en)

Publications (1)

Publication Number Publication Date
CN103136224A true CN103136224A (en) 2013-06-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130605