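WO2014056337A1 - Search term acquisition method, server, and search term recommendation system - Google Patents

Search term acquisition method, server, and search term recommendation system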

Info

Publication number
WO2014056337A1
WO2014056337A1 PCT/CN2013/079173 CN2013079173W WO2014056337A1 WO 2014056337 A1 WO2014056337 A1 WO 2014056337A1 CN 2013079173 W CN2013079173 W CN 2013079173W WO 2014056337 A1 WO2014056337 A1 WO 2014056337A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
tag
category
received application
search
Prior art date
Application number
PCT/CN2013/079173
Other languages
English (en)
French (fr)
Inventor
曹远铖
曹越
尹华彬
宁合军
宫建涛
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2014056337A1
Priority to US14/678,355 (US20150213042A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G06F16/90324 Query formulation using system suggestions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Definitions

  • The invention relates to computer network search technology, and in particular to a search term acquisition method, a server, a search term recommendation method and system, and a storage medium.
  • The strategy of a general search engine is to gather as much data as possible, but it processes that data only shallowly.
  • General search engines such as Baidu and Google usually list a large number of search results based on similarity to the entered keywords.
  • The outstanding problems are that the search results contain too much valueless information, too little effective information, no structure, and no personalized mechanism for returning results.
  • As a result, a high proportion of the data in the search results has no value to the user.
  • The vertical search engine is a newer search service model that emerged in response to the general search engine's large information volume, inaccurate queries, and insufficient depth.
  • The model targets a specific field, a specific group of people, or a specific need, and provides valuable information and related services. It is characterized by being "specialized, precise, and deep" and has an industry flavor.
  • Compared with general search engines, vertical search engines are more focused, specific, and in-depth.
  • Because of their industry focus, however, vertical search engines cover a limited amount of data. When users need to search in different fields, they have to use different vertical search engines, which is inconvenient.
  • The embodiments of the invention provide a search term acquisition method, a server, a search term recommendation method and system, and a storage medium, to solve the problems that general search engines process data only shallowly, that vertical search engines are inconvenient to use, and that existing search engines cannot intelligently recommend search terms, and thereby search results, to the user.
  • The present invention provides a search term acquisition method that runs on a server.
  • The method includes:
  • providing a tag library that stores a plurality of tags, a plurality of categories, and a plurality of application keywords; determining whether a received application keyword is a fuzzy keyword;
  • if the received application keyword is a fuzzy keyword, obtaining the tags that match the received application keyword;
  • obtaining the categories corresponding to the matched tags, summarizing the obtained categories to find the category with the most occurrences, and determining a tag corresponding to that category as the recommended search term.
  • The invention also provides a server, which comprises:
  • a tag library that stores a plurality of tags, a plurality of categories, and a plurality of application keywords; a matching unit configured to determine, after receiving an application keyword, whether the received application keyword is a fuzzy keyword and, if it is, to obtain the tags that match the received application keyword;
  • a summary unit configured to obtain the categories corresponding to the matched tags obtained by the matching unit, and to summarize the obtained categories to find the category with the most occurrences; and a recommendation word output unit configured to determine a tag corresponding to the category with the most occurrences as the recommended search term.
  • The invention also proposes a search term recommendation system, which comprises a server and at least one client. The client is configured to send an application keyword to the server, to receive the recommended search term returned by the server, and to present it to the user. The server further comprises:
  • a tag library that stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
  • a matching unit configured to receive the application keyword sent by the client, to determine whether the received application keyword is a fuzzy keyword and, if it is, to obtain the tags that match the received application keyword;
  • a summary unit configured to obtain the categories corresponding to the matched tags obtained by the matching unit, and to summarize the obtained categories to find the category with the most occurrences;
  • a recommendation word output unit configured to determine a tag corresponding to the category with the most occurrences as the recommended search term.
  • The embodiments of the present invention take application keywords, either input directly by the user or derived from the search results of a general search engine, find popular recommendation words with the same functional features, and present them to the user. When the user's subjective search purpose is unclear, the user's potential needs can thus be mined or refined, so that the search results better match the user's intention, which is highly practical.
  • FIG. 1 is a flowchart of a method for acquiring a search term according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a search process according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of another method for acquiring a search word according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a search term recommendation method according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of another search term recommendation method according to an embodiment of the present invention.
  • FIG. 6 is a structural diagram of a server according to an embodiment of the present invention.
  • FIG. 7 is a structural diagram of another server according to an embodiment of the present invention.
  • FIG. 8 is a structural diagram of a search term recommendation system according to an embodiment of the present invention.
  • FIG. 9 is a corresponding relationship diagram of categories, tags, and application keywords according to an embodiment of the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • The specific implementations, methods, steps, and technical effects of the search term acquisition method, server, search term recommendation method and system, and corresponding storage media are described in detail below with reference to the drawings and preferred embodiments.
  • The invention can uncover the user's implied needs from the input keywords and output recommended search terms.
  • Referring to FIG. 1, which is a flowchart of a method for acquiring a search term according to an embodiment of the present invention, the method runs on the server side and includes the following steps S11-S16:
  • The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, wherein each category includes a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.
  • The application keyword refers to the content that the user wants to search for.
  • The tag library configures corresponding tags for each application keyword that may be input, and the tags need to cover the various characteristics of the application keyword. For example, the application keyword "Angry Birds" can be configured with the tags "cartoon", "puzzle", and "throw"; the application keyword "WeChat" can be configured with the tags "intercom", "chat", "voice", "transfer file", and "note". The correspondence between application keywords and tags is configured through data mining combined with manual verification.
  • Each tag corresponds to at least one category, and tags are assigned to categories according to their functional characteristics.
  • For example, the tags "alarm clock", "killing Trojan", and "watching novels" correspond to the category "function label";
  • the tags "3D", "horizontal screen", and "vertical screen" correspond to the category "interface";
  • and the tags "gravity sensing" and "Bluetooth networking" correspond to the category "characteristics".
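  • As a minimal sketch of the relationships described above, the tag library can be pictured as two mappings: application keyword to tags, and category to tags. The variable and function names (`KEYWORD_TAGS`, `CATEGORY_TAGS`, `build_tag_index`) are hypothetical and the in-memory dictionaries are an assumption; the source does not specify a storage format.

```python
# Hypothetical in-memory tag library; every name here is illustrative.
KEYWORD_TAGS = {
    "Angry Birds": ["cartoon", "puzzle", "throw"],
    "WeChat": ["intercom", "chat", "voice", "transfer file", "note"],
}
CATEGORY_TAGS = {
    "function label": ["alarm clock", "killing Trojan", "watching novels"],
    "interface": ["3D", "horizontal screen", "vertical screen"],
    "characteristics": ["gravity sensing", "Bluetooth networking"],
}


def build_tag_index(category_tags):
    """Invert category -> tags into tag -> categories (a tag may have several)."""
    index = {}
    for category, tags in category_tags.items():
        for tag in tags:
            index.setdefault(tag, []).append(category)
    return index


TAG_CATEGORIES = build_tag_index(CATEGORY_TAGS)
```

  • Keeping a tag-to-categories index alongside the category-to-tags mapping makes both lookups used later (category of a matched tag, tags of the winning category) a single dictionary access.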
  • The application keyword may be input directly by the user, or may be an output result of a general search engine or a vertical search engine.
  • For example, the user can directly type "Angry Birds" as the application keyword.
  • Users can also enter "Angry Birds" into a general search engine, which will generate a list of search results (often referred to as an APP feature list).
  • This search result list may contain "Angry Birds Back to School", "Angry Birds Space Edition", and "Angry Birds HD Edition"; each result in the list is then exported as an application keyword.
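  • The export of search results as application keywords might look like the sketch below. The function name `keywords_from_results` is hypothetical, and trimming whitespace and dropping duplicates are assumptions that the source does not state.

```python
def keywords_from_results(search_results):
    """Treat each distinct, non-empty result title as one application keyword."""
    keywords = []
    for title in search_results:
        title = title.strip()
        # Skipping blanks and duplicates is an assumption, not part of the source.
        if title and title not in keywords:
            keywords.append(title)
    return keywords


results = [
    "Angry Birds Back to School",
    "Angry Birds Space Edition",
    "Angry Birds HD Edition",
]
application_keywords = keywords_from_results(results)
```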
  • A fuzzy keyword here is a word whose subjective meaning is ambiguous. Whether an application keyword is fuzzy can be determined by assigning it a relevance score. For example, when the user inputs "QQ2012", the user wants a specific piece of software and the search purpose is relatively clear, so there is no need to present recommendation words; a general search engine can search directly with the application keyword "QQ2012" as the search term. "QQ2012" can therefore be given a higher score. If the user enters "Tencent", the user may want to search for some type of software owned by Tencent. The search purpose is then vague, so "Tencent" can be given a lower score and processing proceeds to the next step.
  • A relevance score threshold may be preset in the tag library.
  • If the relevance score of the application keyword is below the threshold, the application keyword is determined to be a fuzzy keyword; otherwise, the application keyword is not a fuzzy keyword.
  • Other preset criteria can also be used to determine whether the application keyword is a fuzzy keyword. This manner of judgment applies equally to the following embodiments.
  • A relevance score may be stored for each stored application keyword. After an application keyword is received, the stored application keyword identical to it is looked up in the tag library, and the relevance score of that stored keyword is used as the relevance score of the received keyword to determine whether the received keyword is a fuzzy keyword.
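  • The threshold check can be sketched as follows. The score values, the threshold of 0.5, and the name `is_fuzzy_keyword` are illustrative assumptions; the source only says that a clear keyword such as "QQ2012" receives a higher score than a vague one such as "Tencent".

```python
# Hypothetical relevance scores stored alongside each application keyword.
RELEVANCE_SCORES = {"QQ2012": 0.9, "Tencent": 0.2}
RELEVANCE_THRESHOLD = 0.5  # assumed preset value


def is_fuzzy_keyword(keyword, scores=RELEVANCE_SCORES, threshold=RELEVANCE_THRESHOLD):
    """Return True when the stored relevance score falls below the threshold."""
    score = scores.get(keyword)
    if score is None:
        # The source does not say how unknown keywords are handled;
        # treating them as not fuzzy is an assumption.
        return False
    return score < threshold
```

  • With these values, "Tencent" is treated as fuzzy and continues to tag matching, while "QQ2012" is passed through as the search term.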
  • After the application keyword is received, tag matching is performed against the tag library to obtain the tags that match it. Specifically, the stored application keyword identical to the received application keyword is found in the tag library, and the tags of that stored keyword are taken as the tags matching the received keyword. For example, for the application keyword "Angry Birds", the three matching tags "cartoon", "puzzle", and "throw" are obtained.
  • Each tag has its corresponding category, and tags are assigned to categories according to their functional characteristics.
  • One or more categories can be obtained in this step (if the search results of a search engine are used as application keywords, a large number of categories will be obtained).
  • The categories obtained in the previous step are summarized to find the category with the most occurrences.
  • The category with the most occurrences is the category most relevant to the content the user is searching for.
  • The correspondence between tags and categories obtained in step S14 and step S15 can be referred to as the attribute distribution of the tags.
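  • Summarizing the categories amounts to a frequency count over the categories of all matched tags. The sketch below uses `collections.Counter` from the Python standard library; the function name, the sample tags, and the tie-breaking behavior (first category counted wins) are assumptions, since the source does not say how ties are resolved.

```python
from collections import Counter


def most_frequent_category(matched_tags, tag_categories):
    """Count the categories of the matched tags and return the top one.

    Returns (category, counts); category is None when nothing matched.
    """
    counts = Counter(
        category
        for tag in matched_tags
        for category in tag_categories.get(tag, [])
    )
    if not counts:
        return None, counts
    category, _ = counts.most_common(1)[0]
    return category, counts


# Illustrative data only.
TAG_CATEGORIES = {
    "3D": ["interface"],
    "horizontal screen": ["interface"],
    "gravity sensing": ["characteristics"],
}
category, counts = most_frequent_category(
    ["3D", "horizontal screen", "gravity sensing"], TAG_CATEGORIES
)
```

  • The returned `counts` object corresponds to what the text calls the attribute distribution of the tags.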
  • S16: Find a tag corresponding to the category with the most occurrences as the recommended search term; preferably, find from the tag library the popular tag corresponding to that category as the recommended search term.
  • The category with the most occurrences is the category most relevant to the content the user is searching for.
  • Within this category there may be multiple tags, and the popularity of each tag may be set manually or determined from the recorded number of searches. For example, the category "interface" contains the three tags "3D", "horizontal screen", and "vertical screen", of which "3D" is the most popular because it is searched most often. If "interface" is the category with the most occurrences, this step outputs the tag "3D" as the recommended search term.
  • More than one search term can also be output, which can be achieved by setting a popularity threshold for tags.
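  • Selecting the popular tag, or several tags above a popularity threshold, can be sketched as below. The search counts and the name `recommend_tags` are hypothetical; the source only states that popularity may be set manually or derived from search counts.

```python
# Hypothetical search counts used as the popularity measure.
TAG_POPULARITY = {"3D": 120, "horizontal screen": 45, "vertical screen": 30}


def recommend_tags(category_tags, popularity, threshold=None):
    """Return the most popular tag, or every tag at or above the threshold."""
    ranked = sorted(category_tags, key=lambda tag: popularity.get(tag, 0), reverse=True)
    if threshold is None:
        return ranked[:1]
    return [tag for tag in ranked if popularity.get(tag, 0) >= threshold]


interface_tags = ["3D", "horizontal screen", "vertical screen"]
single = recommend_tags(interface_tags, TAG_POPULARITY)
several = recommend_tags(interface_tags, TAG_POPULARITY, threshold=40)
```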
  • Tag 1 is "Intercom",
  • tag 2 is "Chat",
  • tag 3 is "Voice",
  • tag 4 is "Transfer File",
  • and tag 5 is "Notepad". Summarizing the categories of these five tags shows that tag 1, tag 2, and tag 3 belong to one category: attribute 1, that is, "Tencent". Among the five tags, the category "Tencent" therefore appears three times, which makes it the category with the most occurrences.
  • Each application keyword output in the search results of the search engine is likewise matched and summarized, and the recommended words potentially related to the user's search content are presented to the user. The present invention can therefore flexibly mine the user's potential needs, or refine the user's needs, so that the search results better match the user's intention.
  • FIG. 3 is a flowchart of another method for acquiring a search word according to an embodiment of the present invention.
  • The method includes the following steps S31-S36:
  • The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, wherein each category includes a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.
  • The feature library stores a plurality of approximate tags, and the approximate tags correspond to tags in the tag library.
  • Each approximate tag is functionally similar to one or more corresponding tags in the tag library; that is, the approximate tag belongs to the same category as its corresponding tag in the tag library.
  • The existence of the feature library facilitates the expansion and redundancy of the system.
  • A fuzzy keyword here is a word whose subjective meaning is ambiguous. Whether an application keyword is fuzzy can be determined by assigning it a relevance score. For example, when the user inputs "QQ2012", the user wants a specific piece of software and the search purpose is relatively clear, so there is no need to present recommendation words; a general search engine can search directly with the application keyword "QQ2012" as the search term. "QQ2012" can therefore be given a higher score. If the user enters "Tencent", the user may want to search for some type of software owned by Tencent. The search purpose is then vague, so "Tencent" can be given a lower score and processing proceeds to the next step.
  • If the received application keyword is a fuzzy keyword, the matched tags and/or approximate tags are obtained according to the received application keyword; otherwise, the received application keyword is used directly as the search term.
  • S34: Obtain the categories corresponding to the matched tags and/or approximate tags.
  • S36: Find a tag and/or an approximate tag corresponding to the category with the most occurrences as the recommended search term; preferably, find the popular tag and/or the popular approximate tag corresponding to that category as the recommended search term.
  • The category with the most occurrences is the category most relevant to the content the user is searching for.
  • Within this category there may be multiple tags, and the popular tags can be presented to the user as recommended search terms.
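  • The role of the feature library can be sketched as follows. This assumes that an approximate tag is reached through its corresponding tag and that each tag has a single category; the source does not specify how approximate tags are matched to a keyword, and all names and sample values (`APPROXIMATE_TAGS`, "messaging", "communication", and so on) are hypothetical.

```python
# Illustrative data only.
KEYWORD_TAGS = {"WeChat": ["chat", "voice"]}                        # tag library
TAG_CATEGORY = {"chat": "communication", "voice": "communication"}  # tag -> category
APPROXIMATE_TAGS = {"messaging": "chat", "voice call": "voice"}     # feature library


def category_of(label):
    """Resolve a tag or approximate tag to its category.

    An approximate tag shares the category of its corresponding tag.
    """
    base = APPROXIMATE_TAGS.get(label, label)
    return TAG_CATEGORY.get(base)


def match_labels(keyword):
    """Return matched tags followed by their approximate tags (assumed lookup)."""
    tags = KEYWORD_TAGS.get(keyword, [])
    approximate = [a for a, base in APPROXIMATE_TAGS.items() if base in tags]
    return tags + approximate
```

  • Because approximate tags live in a separate mapping, new ones can be added without touching the tag library, which is one way to read the claim that the feature library eases expansion.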
  • The present invention also provides a search term recommendation method, in which the server recommends to the user a search term that matches the user's search intention, so as to fully satisfy the user's search needs.
  • FIG. 4 is a flowchart of a search term recommendation method according to an embodiment of the present invention.
  • The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, wherein each category includes a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category. Tags are assigned to categories according to their functional characteristics.
  • S42: The client sends the application keyword that the user wants to search for to the server.
  • The application keyword refers to the content that the user wants to search for. The tag library configures corresponding tags for the various application keywords that may be input, and the tags need to cover the various characteristics of the application keyword.
  • S43: The server receives the application keyword sent by the client and determines whether the received application keyword is a fuzzy keyword.
  • A fuzzy keyword here is a word whose subjective meaning is ambiguous. As described above, whether an application keyword is fuzzy can be determined by assigning it a relevance score.
  • If the received application keyword is a fuzzy keyword, the server obtains the tags matching the received application keyword; otherwise, the received application keyword is used directly as the search term.
  • After receiving the application keyword, the server performs tag matching against the tag library to obtain the tags that match it. That is, the server finds in the tag library the pre-stored application keyword identical to the received application keyword,
  • and takes the tags of that pre-stored application keyword as the tags matching the received application keyword.
  • S45: The server obtains the categories corresponding to the matched tags.
  • Each tag has its corresponding category, and tags are assigned to categories according to their functional characteristics.
  • S46: The server summarizes the obtained categories to find the category with the most occurrences. The previous step can yield multiple categories; this step summarizes them to find the one that occurs most often, which is the category most relevant to the content the user is searching for.
  • The server finds a tag corresponding to the category with the most occurrences as the recommended search term, and returns the recommended search term to the client.
  • Preferably, the server finds the popular tag corresponding to the category with the most occurrences as the recommended search term, and returns the recommended search term to the client.
  • The category with the most occurrences is the category most relevant to the content the user is searching for.
  • Within this category there may be multiple tags, and the popularity of each tag may be set manually or determined from the recorded number of searches.
  • S48: The client presents the received recommended search term to the user.
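  • The client and server exchange described in this embodiment can be sketched end to end as below. All class names, data values, the threshold, and the treatment of unknown keywords as not fuzzy are illustrative assumptions; networking is replaced by a direct method call, and each tag is given a single category for brevity.

```python
from collections import Counter


class Server:
    """Hypothetical server holding a tiny tag library; all data is illustrative."""

    def __init__(self):
        self.keyword_tags = {"Tencent": ["chat", "voice", "3D"]}
        self.tag_category = {"chat": "communication", "voice": "communication", "3D": "interface"}
        self.popularity = {"chat": 80, "voice": 60, "3D": 120}
        self.relevance = {"Tencent": 0.2, "QQ2012": 0.9}
        self.threshold = 0.5

    def handle(self, keyword):
        # S43: fuzzy when the stored score is below the threshold
        # (unknown keywords default to not fuzzy, which is an assumption).
        if self.relevance.get(keyword, 1.0) >= self.threshold:
            return keyword  # not fuzzy: use the keyword itself as the search term
        tags = self.keyword_tags.get(keyword, [])              # match tags
        counts = Counter(self.tag_category[t] for t in tags)   # S45: categories
        if not counts:
            return keyword
        top_category = counts.most_common(1)[0][0]             # S46: summarize
        candidates = [t for t, c in self.tag_category.items() if c == top_category]
        # Return the most popular tag of the winning category.
        return max(candidates, key=lambda t: self.popularity.get(t, 0))


def client_search(server, keyword):
    # S42 and S48: send the keyword, then present whatever comes back.
    return f"Suggested search term: {server.handle(keyword)}"
```

  • Note that "3D" has the highest popularity overall but is not recommended for "Tencent", because its category "interface" occurs less often than "communication" among the matched tags.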
  • FIG. 5 is a flowchart of another search term recommendation method according to an embodiment of the present invention, where the method includes steps S51-S58:
  • The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, wherein each category includes a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category. Tags are assigned to categories according to their functional characteristics.
  • The feature library stores a plurality of approximate tags, and the approximate tags correspond to tags in the tag library.
  • Each approximate tag is functionally similar to one or more corresponding tags in the tag library; that is, the approximate tag belongs to the same category as its corresponding tag in the tag library.
  • The existence of the feature library facilitates the expansion and redundancy of the system.
  • S52: The client sends the application keyword that the user wants to search for to the server.
  • The application keyword refers to the content that the user wants to search for. The tag library configures corresponding tags for the various application keywords that may be input, and the tags need to cover the various characteristics of the application keywords.
  • S53: The server receives the application keyword sent by the client and determines whether the received application keyword is a fuzzy keyword.
  • A fuzzy keyword here is a word whose subjective meaning is ambiguous. As described above, whether an application keyword is fuzzy can be determined by assigning it a relevance score.
  • If the received application keyword is a fuzzy keyword, the server obtains the tags and/or approximate tags that match the received application keyword; otherwise, the received application keyword is used directly as the search term.
  • S55: The server obtains the categories corresponding to the matched tags and/or approximate tags.
  • S56: The server summarizes the obtained categories to find the category with the most occurrences. The previous step can yield multiple categories; this step summarizes them to find the one that occurs most often, which is the category most relevant to the content the user is searching for.
  • The server finds a tag corresponding to the category with the most occurrences as the recommended search term, and returns the recommended search term to the client.
  • Preferably, the server finds the popular tag corresponding to the category with the most occurrences as the recommended search term, and returns the recommended search term to the client.
  • The category with the most occurrences is the category most relevant to the content the user is searching for.
  • Within this category there may be multiple tags, and the popularity of each tag may be set manually or determined from the recorded number of searches.
  • S58: The client presents the received recommended search term to the user.
  • The present invention also provides a server.
  • The server includes a tag library 41, a matching unit 42, a summary unit 43, and a recommendation word output unit 44.
  • The tag library 41 is connected to the matching unit 42, the summary unit 43, and the recommendation word output unit 44, respectively; the summary unit 43 is connected to the matching unit 42; and the recommendation word output unit 44 is connected to the summary unit 43.
  • The tag library 41 stores a plurality of tags, a plurality of categories, and a plurality of application keywords, wherein each category contains a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.
  • The application keyword refers to the content that the user wants to search for.
  • The tag library 41 configures corresponding tags for the various application keywords that may be input, and the tags need to cover the various characteristics of the application keyword.
  • Tags can be assigned to categories according to their functional characteristics.
  • The correspondence between application keywords and tags can be configured through data mining combined with manual verification. For example, the application keyword "Angry Birds" can be configured with the tags "cartoon", "puzzle", and "throwing", and the application keyword "WeChat" can be configured with tags such as "intercom", "chat", and "voice".
  • Each tag corresponds to at least one category, and tags are assigned to categories according to their functional characteristics. For example, the tags "alarm clock", "killing Trojan", and "watching novels" correspond to the category "function label", and the tags "3D", "horizontal screen", and "vertical screen" correspond to the category "interface".
  • The server of this embodiment can be used alone, receiving an application keyword input by a user, or in conjunction with a general search engine, in which case the search results output by the general search engine are used as the application keywords input to the server.
  • After receiving the application keyword, the matching unit 42 obtains from the tag library 41 the tags that match the application keyword.
  • Each tag has its corresponding category. The summary unit 43 looks up, through the tag library 41, the category corresponding to each tag output by the matching unit 42, and summarizes the categories found to determine the category with the most occurrences.
  • The summary unit 43 outputs the category with the most occurrences to the recommendation word output unit 44, and the recommendation word output unit 44 scans the tag library 41 to find a tag corresponding to that category as the recommended search term, preferably the popular tag corresponding to that category.
  • The category with the most occurrences is the category most relevant to the content the user is searching for.
  • For example, the category "interface" contains the three tags "3D", "horizontal screen", and "vertical screen", of which "3D" is the most popular because it is searched most often. If "interface" is the category with the most occurrences, the recommendation word output unit 44 outputs the tag "3D" as the recommended search term.
  • More than one search term can also be output, which can be achieved by setting a popularity threshold for tags.
  • The matching unit 42 may first determine whether the received application keyword is a fuzzy keyword. If it is not, the application keyword is used directly as the search term; if it is, the tags matching the received application keyword are obtained.
  • A fuzzy keyword here is a word whose subjective meaning is ambiguous. Whether an application keyword is fuzzy can be determined by assigning it a relevance score. For example, when the user inputs "QQ2012", the user wants a specific piece of software and the search purpose is relatively clear, so there is no need to present recommendation words; a general search engine can search directly with the application keyword "QQ2012" as the search term. "QQ2012" can therefore be given a higher score. If the user enters "Tencent", the user may want to search for some type of software owned by Tencent. The search purpose is then vague, so "Tencent" can be given a lower score and the search proceeds further.
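  • The unit structure of this server embodiment can be sketched as three small classes sharing one tag library object, mirroring the connections described above. Class names follow the unit names in the text, but the interfaces, the sample keyword-to-tag mapping, and the popularity values are assumptions made for illustration.

```python
from collections import Counter


class TagLibrary:
    """Stands in for tag library 41; the contents below are illustrative."""

    def __init__(self):
        self.keyword_tags = {
            "Angry Birds HD Edition": ["3D", "horizontal screen", "gravity sensing"],
        }
        self.tag_categories = {
            "3D": ["interface"],
            "horizontal screen": ["interface"],
            "vertical screen": ["interface"],
            "gravity sensing": ["characteristics"],
        }
        self.popularity = {"3D": 120, "horizontal screen": 45, "vertical screen": 30}


class MatchingUnit:
    """Stands in for matching unit 42: keyword -> matching tags."""

    def __init__(self, library):
        self.library = library

    def match(self, keyword):
        return self.library.keyword_tags.get(keyword, [])


class SummaryUnit:
    """Stands in for summary unit 43: tags -> most frequent category."""

    def __init__(self, library):
        self.library = library

    def top_category(self, tags):
        counts = Counter(c for t in tags for c in self.library.tag_categories.get(t, []))
        return counts.most_common(1)[0][0] if counts else None


class RecommendationWordOutputUnit:
    """Stands in for recommendation word output unit 44: category -> popular tag."""

    def __init__(self, library):
        self.library = library

    def recommend(self, category):
        tags = [t for t, cats in self.library.tag_categories.items() if category in cats]
        return max(tags, key=lambda t: self.library.popularity.get(t, 0), default=None)


library = TagLibrary()
matched = MatchingUnit(library).match("Angry Birds HD Edition")
category = SummaryUnit(library).top_category(matched)
recommended = RecommendationWordOutputUnit(library).recommend(category)
```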
  • The server includes a tag library 41, a matching unit 42, a summary unit 43, a recommendation word output unit 44, and a feature library 45.
  • The tag library 41 is connected to the feature library 45; the tag library 41 and the feature library 45 are each connected to the matching unit 42, the summary unit 43, and the recommendation word output unit 44; the summary unit 43 is connected to the matching unit 42; and the recommendation word output unit 44 is connected to the summary unit 43.
  • Compared with the preceding embodiment, the server of this embodiment further includes the feature library 45.
  • A plurality of approximate tags are stored in the feature library 45, and the approximate tags correspond to the tags in the tag library 41.
  • Each approximate tag is functionally similar to one or more corresponding tags in the tag library; that is, the approximate tag belongs to the same category as its corresponding tag in the tag library.
  • When the matching unit 42 receives the application keyword, it may obtain the tags matching the application keyword from the tag library 41 and/or the approximate tags matching the application keyword from the feature library 45,
  • and then find the categories corresponding to these tags and/or approximate tags. The search capability of the system can thus be improved by adding approximate tags to the feature library 45, which facilitates the expansion of the system.
  • The present invention also provides a search term recommendation system.
  • The search term recommendation system includes a server 81 and at least one client 82, and the client 82 is connected to the server 81 via a network.
  • The client 82 can be a terminal such as a computer, a mobile phone, or a tablet computer, used by the user to input the word or sentence to be searched and to transmit it to the server 81 as an application keyword.
  • The server 81 uses the application keyword sent by the client 82 to obtain a search term that matches the user's potential search intention and feeds it back to the client 82. The client 82 presents the recommended search term to the user, so that the user can search with a clearer purpose.
  • For the functional structure of the server 81 of this embodiment, refer to the description of the server in the embodiments of FIG. 6 and FIG. 7; details are not repeated here.
  • The invention takes application keywords, either input directly by the user or derived from the search results of a general search engine, finds popular recommendation words with the same functional characteristics, and presents them to the user. When the user's subjective search purpose is unclear, the user's potential needs can thus be mined or refined, so that the search results better match the user's intention, which is highly practical.
  • the present invention also provides a storage medium containing computer executable instructions for performing a search term acquisition method when executed by a processor, the method comprising:
  • setting a tag library, wherein the tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords; determining whether a received application keyword is a fuzzy keyword;
  • if the received application keyword is a fuzzy keyword, obtaining a tag that matches the received application keyword according to the received application keyword;
  • the present invention also provides another storage medium containing computer-executable instructions which, when executed by a processor, perform a search term recommendation method for recommending, from a server to a client, search terms that match the user's intention, wherein the server is provided with a tag library, the tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, and the search term recommendation method includes:
  • the client sends the application keyword that the user wants to search for to the server;
  • the server obtains a tag that matches the received application keyword according to the received application keyword;
  • the server obtains a category corresponding to the matched tag according to the matched tag;
  • the server summarizes the obtained categories to find the category that occurs most often;
  • the server finds a tag corresponding to the most frequent category as a recommended search term, and returns the recommended search term to the client;
  • the client presents the received recommended search term to the user.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

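The present invention provides a search term obtaining method, a server, a search term recommendation method and system, and a storage medium. The search term obtaining method includes: setting a tag library that stores a plurality of tags, a plurality of categories, and a plurality of application keywords, where one category corresponds to a plurality of tags, one application keyword corresponds to at least one tag, and one tag corresponds to at least one category; determining whether a received application keyword is a fuzzy keyword; if so, obtaining, according to the received application keyword, the tags that match it; obtaining, according to the matched tags, the categories corresponding to them; summarizing the obtained categories to find the category that occurs most often; and finding the tags corresponding to the most frequent category as recommended search terms. When the user's subjective search purpose is unclear, the invention can mine the user's potential needs or refine them, so that the search results better match the user's intention, which makes the invention highly practical.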

Description

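Search term obtaining method, server, and search term recommendation system

This patent application claims priority to Chinese patent application No. 201210379599.5, filed on October 9, 2012 by Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司) and entitled "Search term obtaining method, server, search term recommendation method and system", which is incorporated herein by reference in its entirety.

Technical Field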
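The present invention relates to network search technology for computers, and in particular to a search term obtaining method, a server, a search term recommendation method and system, and a storage medium.

Background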
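With the rapid development of Web 2.0 technology, the volume of Internet data has grown massively, and providing Internet users with accurate and useful information has become particularly important. The search strategy of a general search engine is to collect as much data as possible, but its level of data processing is relatively low: general search engines such as Baidu and Google usually list a large number of search results according to their similarity to the entered keywords. The prominent problems are that the search results contain too much worthless information, that the useful information is insufficient and unstructured, and that there is no mechanism for personalizing the returned results. The proportion of worthless data in the results of a general search engine is high. Data of no value to the user wastes a considerable share of the data center's storage and computing capacity, which means that a high proportion of the energy consumed by a single search is wasted; it also interferes with the extraction of useful information, so the user is likely to have to search several times.
The vertical search engine is a new search engine service model proposed in response to the problems of general search engines, such as the large amount of information, inaccurate queries, and insufficient depth. It provides information and related services of a certain value for a specific field, a specific group of people, or a specific need. Its characteristics are "specialized, refined, and deep", and it has an industry flavor: compared with the unordered mass of information in a general search engine, a vertical search engine is more focused, specific, and in-depth. However, because of its industry-specific nature, its data volume is limited, and a user who needs to search in different fields has to use different vertical search engines, which is inconvenient.
In addition, because users differ subjectively, they often cannot provide accurate keywords when searching and therefore cannot obtain the search results they want. In the prior art, neither general nor vertical search engines can recommend search terms, and in turn search results, to the user on the basis of a fuzzy keyword provided by the user. They therefore cannot meet the user's potential search needs and have certain limitations.

Summary of the Invention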
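Embodiments of the present invention provide a search term obtaining method, a server, a search term recommendation method and system, and a storage medium, to solve the problems that general search engines have a low data processing capability, that vertical search engines are inconvenient to operate, and that existing search engines cannot intelligently recommend search terms, and in turn search results, to the user.
The present invention provides a search term obtaining method that runs on a server, characterized in that the method includes:
setting a tag library, where the tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
determining whether a received application keyword is a fuzzy keyword;
if the received application keyword is a fuzzy keyword, obtaining, according to the received application keyword, a tag that matches the received application keyword;
obtaining, according to the matched tag, a category corresponding to the matched tag;
summarizing the obtained categories to determine the category that occurs most often; and
determining a tag corresponding to the most frequent category as a recommended search term.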
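The present invention also provides a server, characterized by including:
a tag library, which stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
a matching unit, configured to determine, after receiving an application keyword, whether the received application keyword is a fuzzy keyword and, if it is, to obtain, according to the received application keyword, a tag that matches the received application keyword;
a summary unit, configured to obtain, according to the matched tag obtained by the matching unit, a category corresponding to the matched tag, and to summarize the obtained categories to find the category that occurs most often; and
a recommendation word output unit, configured to determine a tag corresponding to the most frequent category as a recommended search term.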
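The present invention also provides a search term recommendation system, characterized by including a server and at least one client, where the client is configured to send an application keyword to the server and to receive the recommended search term returned by the server and present it to the user, and the server further includes:
a tag library, which stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
a matching unit, configured to receive the application keyword sent by the client, to determine whether the received application keyword is a fuzzy keyword and, if it is, to obtain, according to the received application keyword, a tag that matches the received application keyword;
a summary unit, configured to obtain, according to the matched tag obtained by the matching unit, a category corresponding to the matched tag, and to summarize the obtained categories to find the category that occurs most often; and
a recommendation word output unit, configured to determine a tag corresponding to the most frequent category as a recommended search term.
Compared with the prior art, embodiments of the present invention can, starting from application keywords entered directly by the user or derived from the search results of a general search engine, find popular recommendation words with the same functional characteristics and present them to the user. When the user's subjective search purpose is unclear, the user's potential needs can thus be mined or refined, so that the search results better match the user's intention, which makes the invention highly practical.

Brief Description of the Drawings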
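The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above features and advantages of the invention are easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart of a search term obtaining method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a search process according to an embodiment of the present invention;
FIG. 3 is a flowchart of another search term obtaining method according to an embodiment of the present invention;
FIG. 4 is a flowchart of a search term recommendation method according to an embodiment of the present invention;
FIG. 5 is a flowchart of another search term recommendation method according to an embodiment of the present invention;
FIG. 6 is a structural diagram of a server according to an embodiment of the present invention;
FIG. 7 is a structural diagram of another server according to an embodiment of the present invention;
FIG. 8 is a structural diagram of a search term recommendation system according to an embodiment of the present invention;
FIG. 9 is a diagram of the correspondence among categories, tags, and application keywords according to an embodiment of the present invention.

Detailed Description

The specific implementations, methods, steps, and technical effects of the search term obtaining method, server, and search term recommendation method and system provided by the present invention, together with the corresponding storage medium, are described in detail below with reference to the accompanying drawings and preferred embodiments.
The present invention can discover the user's implicit needs from the entered keyword and output recommended search terms. FIG. 1 is a flowchart of a search term obtaining method according to an embodiment of the present invention. The method runs on a server and includes the following steps S11 to S16: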
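S11: Set up a tag library. The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, where each category contains a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.
Referring to FIG. 9, an application keyword is the content the user wants to search for. The tag library configures corresponding tags for the application keywords that may be entered, and these tags need to cover the various characteristics of the application keyword. For example, if the application keyword is "Angry Birds" (愤怒的小鸟), the tags "cartoon", "puzzle", and "throwing" can be configured for it; if the application keyword is "WeChat" (微信), the tags "walkie-talkie", "chat", "voice", "file transfer", and "notes" can be configured for it. The correspondence between application keywords and tags is configured through data mining and manual verification.
In addition, each tag corresponds to at least one category, and tags are assigned to categories according to their functional characteristics. For example, the tags "alarm clock", "Trojan removal", and "novel reading" correspond to the category "function tags"; the tags "3D", "landscape", and "portrait" correspond to the category "interface"; and the tags "gravity sensing" and "Bluetooth networking" correspond to the category "features".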
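As a rough illustration of the relationships described above (and shown in FIG. 9), the tag library can be pictured as two mappings: application keyword to tags, and tag to categories. The following Python sketch is only one possible in-memory layout. The class name `TagLibrary`, its method names, and the English renderings of the example tags are assumptions made for illustration; the patent does not prescribe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class TagLibrary:
    """Hypothetical in-memory tag library: keyword -> tags -> categories."""

    keyword_tags: dict = field(default_factory=dict)    # application keyword -> list of tags
    tag_categories: dict = field(default_factory=dict)  # tag -> list of categories

    def tags_for(self, keyword):
        """Tags configured for an application keyword (empty if unknown)."""
        return self.keyword_tags.get(keyword, [])

    def categories_for(self, tag):
        """Categories a tag belongs to (a tag belongs to at least one)."""
        return self.tag_categories.get(tag, [])

    def tags_in(self, category):
        """All tags that belong to the given category."""
        return [t for t, cats in self.tag_categories.items() if category in cats]


# Example data taken from the description (names rendered in English).
library = TagLibrary(
    keyword_tags={
        "Angry Birds": ["cartoon", "puzzle", "throwing"],
        "WeChat": ["walkie-talkie", "chat", "voice", "file transfer", "notes"],
    },
    tag_categories={
        "alarm clock": ["function tags"],
        "Trojan removal": ["function tags"],
        "novel reading": ["function tags"],
        "3D": ["interface"],
        "landscape": ["interface"],
        "portrait": ["interface"],
        "gravity sensing": ["features"],
        "Bluetooth networking": ["features"],
    },
)
```

Storing the tag-to-category mapping as lists leaves room for a tag that belongs to more than one category, which the description allows.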
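S12: Determine whether the received application keyword is a fuzzy keyword.
In this embodiment, the application keyword may be entered directly by the user, or may be an output result of a general or vertical search engine. For example, the user may type "Angry Birds" directly as the application keyword. Alternatively, the user may enter "Angry Birds" into a general search engine, which produces a list of search results (commonly called an APP feature list) that may contain "Angry Birds Back to School, Angry Birds Space, Angry Birds HD, ..."; each result in the list is then exported as an application keyword.
A fuzzy keyword here is a word that does not make the user's subjective intent clear. Whether an application keyword is fuzzy can be determined by assigning it a relevance score. For example, when the user enters "QQ2012", the user wants to find one specific piece of software. The search purpose is fairly clear, no recommended words need to be shown, and a general search engine can be used directly with the application keyword "QQ2012" as the search term; a higher score can therefore be set for "QQ2012". If the user enters "Tencent" (腾讯), the user may want to find some type of software owned by Tencent. The search purpose is then fairly vague, so a lower score can be set for "Tencent" and the method proceeds to the next step. To judge whether a relevance score is high or low, a relevance score threshold can be preset in the tag library: when the relevance score of an application keyword is below the threshold, the keyword is determined to be a fuzzy keyword; otherwise, it is determined not to be a fuzzy keyword. Other preset criteria may of course be used to determine whether an application keyword is fuzzy. This way of determining also applies analogously to the embodiments below.
The tag library may store a relevance score for each stored application keyword. After an application keyword is received, the stored application keyword identical to it is looked up in the tag library, and the relevance score of that stored keyword is used as the relevance score of the received keyword to determine whether the received keyword is fuzzy.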
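The threshold test in step S12 can be sketched in a few lines of Python. The score scale, the threshold value, the example scores, and the handling of keywords with no stored score are all assumptions here; the description only says that a score below a preset threshold marks the keyword as fuzzy.

```python
# Assumed scale: scores in [0, 1]. The patent fixes neither a scale nor a threshold value.
RELEVANCE_THRESHOLD = 0.5

# Illustrative scores stored alongside the application keywords in the tag library.
stored_scores = {"QQ2012": 0.9, "Tencent": 0.2}


def is_fuzzy_keyword(keyword, scores=stored_scores, threshold=RELEVANCE_THRESHOLD):
    """Return True when the stored relevance score is below the threshold."""
    score = scores.get(keyword)
    if score is None:
        # Assumption: a keyword with no stored score is treated as not fuzzy,
        # so it is passed through unchanged as the search term.
        return False
    return score < threshold
```

With these sample values, "Tencent" is classified as fuzzy and continues to tag matching, while "QQ2012" is used directly as the search term.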
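S13: If the received application keyword is a fuzzy keyword, obtain the tags that match it according to the received application keyword; otherwise, use the received application keyword directly as the search term.
After the application keyword is received, it is matched against the tag library and the matching tags are obtained from the tag library. Specifically, the stored application keyword identical to the received one is found in the tag library, and the tags matching that stored keyword are taken as the tags matching the received keyword. For example, the three matching tags "cartoon", "puzzle", and "throwing" are obtained for the application keyword "Angry Birds".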
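S14: Obtain, according to the matched tags, the categories corresponding to them.
Every tag has its corresponding category, and tags are assigned to categories according to their functional characteristics. One or more categories can be obtained in this step (if search engine results are used as application keywords, a large number of categories may be obtained).
S15: Summarize the obtained categories and find the category that occurs most often.
In this step, the categories obtained in the previous step are summarized to find the one that occurs most often; this category is the one most relevant to what the user is searching for. The tag-to-category correspondence produced in steps S14 and S15 can be called the attribute distribution of the tags.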
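Steps S14 and S15 amount to a frequency count over the categories of the matched tags. A minimal sketch using Python's `collections.Counter` follows; the function name and the category names "style" and "genre" are made up for illustration, and the tie-breaking behaviour (first category encountered wins) is an assumption, since the description does not say what happens on a tie.

```python
from collections import Counter


def most_frequent_category(matched_tags, tag_categories):
    """Count the categories of the matched tags and return (top category, counts)."""
    counts = Counter(
        category
        for tag in matched_tags
        for category in tag_categories.get(tag, [])
    )
    if not counts:
        return None, counts
    # most_common(1) returns the highest count; ties keep first-encountered order.
    category, _ = counts.most_common(1)[0]
    return category, counts


# Hypothetical tag-to-category data for the "Angry Birds" tags.
tag_categories = {
    "cartoon": ["style"],
    "puzzle": ["genre"],
    "throwing": ["genre"],
}
best, distribution = most_frequent_category(
    ["cartoon", "puzzle", "throwing"], tag_categories
)
```

The returned counter plays the role of what the description calls the attribute distribution of the tags.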
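S16: Find the tags corresponding to the most frequent category as recommended search terms; preferably, find in the tag library the popular tags corresponding to the most frequent category as recommended search terms.
The most frequent category is the one most relevant to what the user is searching for. It may contain several tags, and the popularity of each tag can be set manually or determined from a record of how often it has been searched. For example, the category "interface" contains the three tags "3D", "landscape", and "portrait", of which "3D" is set as the most popular because it is searched often; if "interface" is the most frequent category, this step outputs the tag "3D" as the recommended search term. More than one search term may of course be output in the end, which can be achieved by setting a popularity threshold for tags.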
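The selection in step S16 can be sketched as follows. The popularity numbers are invented for illustration (the description only says popularity may be set manually or derived from search counts), and the function name and the `hot_threshold` parameter are hypothetical.

```python
def recommend_from_category(category, tag_categories, popularity, hot_threshold=None):
    """Return the hottest tag of a category, or all tags at or above a threshold."""
    candidates = [tag for tag, cats in tag_categories.items() if category in cats]
    if not candidates:
        return []
    if hot_threshold is None:
        # Single recommendation: the most popular tag in the category.
        return [max(candidates, key=lambda tag: popularity.get(tag, 0))]
    # Several recommendations: every tag whose popularity reaches the threshold.
    hot = [tag for tag in candidates if popularity.get(tag, 0) >= hot_threshold]
    return sorted(hot, key=lambda tag: popularity.get(tag, 0), reverse=True)


tag_categories = {
    "3D": ["interface"],
    "landscape": ["interface"],
    "portrait": ["interface"],
}
popularity = {"3D": 120, "landscape": 40, "portrait": 35}  # assumed search counts

top_one = recommend_from_category("interface", tag_categories, popularity)
top_many = recommend_from_category(
    "interface", tag_categories, popularity, hot_threshold=40
)
```

With these sample counts, the single recommendation is "3D", and a threshold of 40 yields "3D" followed by "landscape".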
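For ease of understanding, the whole search process is illustrated below with a concrete example. Referring to FIG. 2, suppose the search results of a search engine output the application keyword "WeChat". The tag library is used to find the five tags corresponding to "WeChat": tag 1 is "walkie-talkie", tag 2 is "chat", tag 3 is "voice", tag 4 is "file transfer", and tag 5 is "notepad". Summarizing the attribute categories of these five tags shows that tag 1, tag 2, and tag 3 belong to the same category, attribute 1, namely "Tencent". Among the five tags, the category "Tencent" thus occurs three times and is the most frequent category. The category "Tencent" is then scanned to obtain its most popular tag, "QQ", and the tag "QQ" is finally output to the user as the recommended word. In the same way, every application keyword output in the search engine's results is looked up and recommendations are made, and the recommended words potentially related to the user's search are presented to the user. The invention can therefore flexibly mine the user's potential needs or refine them, so that the search results better match the user's intention.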
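FIG. 3 is a flowchart of another search term obtaining method according to an embodiment of the present invention. The method includes the following steps S31 to S36:
S31: Set up a tag library and a feature library.
The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, where each category contains a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.
The feature library stores a plurality of approximate tags, which correspond to tags in the tag library. Each approximate tag is similar in functional characteristics to one or more corresponding tags in the tag library; that is, an approximate tag belongs to the same category as its corresponding tag from the tag library. The feature library makes the system easier to extend and improve.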
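S32: Determine whether the received application keyword is a fuzzy keyword.
A fuzzy keyword here is a word that does not make the user's subjective intent clear. Whether an application keyword is fuzzy can be determined by assigning it a relevance score. For example, when the user enters "QQ2012", the user wants to find one specific piece of software. The search purpose is fairly clear, no recommended words need to be shown, and a general search engine can be used directly with the application keyword "QQ2012" as the search term; a higher score can therefore be set for "QQ2012". If the user enters "Tencent", the user may want to find some type of software owned by Tencent. The search purpose is then fairly vague, so a lower score can be set for "Tencent" and the method proceeds to the next step.
S33: If the received application keyword is a fuzzy keyword, obtain the tags and/or approximate tags that match it according to the received application keyword; otherwise, use the received application keyword directly as the search term.
S34: Obtain, according to the matched tags and/or approximate tags, the categories corresponding to them.
During keyword matching, approximate tags in the feature library may match the application keyword. Because an approximate tag belongs to the same category as its corresponding tag from the tag library, the corresponding category can be obtained for it as well.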
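One way to picture step S34 is to resolve each approximate tag to its corresponding tags in the tag library and read the category from there. The sketch below assumes the feature library is a mapping from approximate tag to corresponding tags; that layout, the function name, and all sample tags and categories are hypothetical, since the description only states that an approximate tag shares the category of its corresponding tag.

```python
def categories_for_matches(tags, approximate_tags, tag_categories, approx_to_tags):
    """Collect categories for matched tags and matched approximate tags."""
    categories = []
    for tag in tags:
        categories.extend(tag_categories.get(tag, []))
    for approx in approximate_tags:
        # An approximate tag inherits the categories of its corresponding tags.
        for tag in approx_to_tags.get(approx, []):
            categories.extend(tag_categories.get(tag, []))
    return categories


# Hypothetical tag library and feature library contents.
tag_categories = {
    "chat": ["communication"],
    "voice": ["communication"],
    "notes": ["office"],
}
approx_to_tags = {"instant messaging": ["chat"], "memo": ["notes"]}

found = categories_for_matches(
    ["voice"], ["instant messaging", "memo"], tag_categories, approx_to_tags
)
```

Under this layout, adding a new approximate tag to the feature library extends what the system can match without touching the tag-to-category mapping.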
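S35: Summarize the obtained categories and find the category that occurs most often.
Several categories can be obtained in the previous step (a large number if search engine results are used as application keywords). In this step they are summarized to find the category that occurs most often, which is the one most relevant to what the user is searching for.
S36: Find the tags and/or approximate tags corresponding to the most frequent category as recommended search terms; preferably, find the popular tags and/or popular approximate tags corresponding to the most frequent category as recommended search terms.
The most frequent category is the one most relevant to what the user is searching for. It may contain several tags, and the popular tags can be presented to the user as recommended search terms.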
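The present invention also provides a search term recommendation method for recommending, from a server to a client, search terms that match the user's search intention, so as to fully meet the user's search needs. FIG. 4 is a flowchart of a search term recommendation method according to an embodiment of the present invention. The method includes the following steps S41 to S48:
S41: Set up a tag library on the server. The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, where each category contains a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category. Tags are assigned to categories according to their functional characteristics.
S42: The client sends the application keyword that the user wants to search for to the server.
An application keyword is the content the user wants to search for. The tag library configures corresponding tags for the various application keywords that may be entered, and these tags need to cover the various characteristics of the application keyword.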
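S43: The server receives the application keyword sent by the client and determines whether it is a fuzzy keyword.
A fuzzy keyword here is a word that does not make the user's subjective intent clear. As described above, whether an application keyword is fuzzy can be determined by assigning it a relevance score.
S44: If the received application keyword is a fuzzy keyword, the server obtains the tags that match it according to the received application keyword; otherwise, the received application keyword is used directly as the search term.
After receiving the application keyword, the server matches it against the tag library and obtains the matching tags; that is, it finds in the tag library the pre-stored application keyword identical to the received one and takes the tags matching that pre-stored keyword as the tags matching the received keyword.
S45: The server obtains, according to the matched tags, the categories corresponding to them.
Every tag has its corresponding category, and tags are assigned to categories according to their functional characteristics.
S46: The server summarizes the obtained categories and finds the category that occurs most often.
Several categories can be obtained in the previous step. In this step they are summarized to find the category that occurs most often, which is the one most relevant to what the user is searching for.
S47: The server finds the tags corresponding to the most frequent category as recommended search terms and returns them to the client. Preferably, the server finds the popular tags corresponding to the most frequent category as recommended search terms and returns them to the client.
The most frequent category is the one most relevant to what the user is searching for. It may contain several tags, and the popularity of each tag can be set manually or determined from a record of how often it has been searched.
S48: The client presents the received recommended search terms to the user.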
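The division of work between client and server in steps S42 to S48 can be sketched with a small in-process stand-in for the network exchange. Everything named here is hypothetical: the class `RecommendationServer`, the function `client_search`, the relevance scores, the popularity values, and the categories assigned to "file transfer" and "notepad". Treating "WeChat" as a fuzzy keyword is likewise an assumption made so that the sample data from the description can be reused.

```python
from collections import Counter


class RecommendationServer:
    """Hypothetical server holding the tag library data (S41)."""

    def __init__(self, keyword_tags, tag_categories, popularity, scores, threshold):
        self.keyword_tags = keyword_tags
        self.tag_categories = tag_categories
        self.popularity = popularity
        self.scores = scores
        self.threshold = threshold

    def handle(self, keyword):
        # S43: fuzzy check; unknown or high-scoring keywords pass through.
        score = self.scores.get(keyword)
        if score is None or score >= self.threshold:
            return [keyword]
        # S44-S46: match tags, collect their categories, find the most frequent.
        tags = self.keyword_tags.get(keyword, [])
        counts = Counter(c for t in tags for c in self.tag_categories.get(t, []))
        if not counts:
            return [keyword]
        top = counts.most_common(1)[0][0]
        # S47: return the hottest tag of that category.
        candidates = [t for t, cats in self.tag_categories.items() if top in cats]
        return [max(candidates, key=lambda t: self.popularity.get(t, 0))]


def client_search(server, keyword):
    """S42 and S48: send the keyword, then present what the server returns."""
    terms = server.handle(keyword)
    return "Recommended: " + ", ".join(terms)


server = RecommendationServer(
    keyword_tags={
        "WeChat": ["walkie-talkie", "chat", "voice", "file transfer", "notepad"],
    },
    tag_categories={
        "walkie-talkie": ["Tencent"],
        "chat": ["Tencent"],
        "voice": ["Tencent"],
        "QQ": ["Tencent"],
        "file transfer": ["tools"],   # assumed category
        "notepad": ["office"],        # assumed category
    },
    popularity={"QQ": 100, "chat": 60, "voice": 30, "walkie-talkie": 10},
    scores={"WeChat": 0.3, "QQ2012": 0.9},
    threshold=0.5,
)
```

In a real deployment the call to `server.handle` would be a network request; the sketch keeps it as a direct method call so that the control flow of the steps stays visible.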
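FIG. 5 is a flowchart of another search term recommendation method according to an embodiment of the present invention. The method includes steps S51 to S58:
S51: Set up a tag library and a feature library on the server.
The tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, where each category contains a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category. Tags are assigned to categories according to their functional characteristics.
The feature library stores a plurality of approximate tags, which correspond to tags in the tag library. Each approximate tag is similar in functional characteristics to one or more corresponding tags in the tag library; that is, an approximate tag belongs to the same category as its corresponding tag from the tag library. The feature library makes the system easier to extend and improve.
S52: The client sends the application keyword that the user wants to search for to the server.
An application keyword is the content the user wants to search for. The tag library configures corresponding tags for the various application keywords that may be entered, and these tags need to cover the various characteristics of the application keyword.
S53: The server receives the application keyword sent by the client and determines whether it is a fuzzy keyword.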
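A fuzzy keyword here is a word that does not make the user's subjective intent clear. As described above, whether an application keyword is fuzzy can be determined by assigning it a relevance score.
S54: If the received application keyword is a fuzzy keyword, the server obtains the tags and/or approximate tags that match it according to the received application keyword; otherwise, the received application keyword is used directly as the search term.
S55: The server obtains, according to the matched tags and/or approximate tags, the categories corresponding to them.
During keyword matching, approximate tags in the feature library may match the application keyword. Because an approximate tag belongs to the same category as its corresponding tag from the tag library, the corresponding category can be obtained for it as well.
S56: The server summarizes the obtained categories and finds the category that occurs most often.
Several categories can be obtained in the previous step. In this step they are summarized to find the category that occurs most often, which is the one most relevant to what the user is searching for.
S57: The server finds the tags corresponding to the most frequent category as recommended search terms and returns them to the client. Preferably, the server finds the popular tags corresponding to the most frequent category as recommended search terms and returns them to the client.
The most frequent category is the one most relevant to what the user is searching for. It may contain several tags, and the popularity of each tag can be set manually or determined from a record of how often it has been searched.
S58: The client presents the received recommended search terms to the user.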
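The present invention also provides a server. FIG. 6 is a structural diagram of a server according to an embodiment of the present invention. The server includes a tag library 41, a matching unit 42, a summary unit 43, and a recommendation word output unit 44. The tag library 41 is connected to each of the matching unit 42, the summary unit 43, and the recommendation word output unit 44; the summary unit 43 is connected to the matching unit 42; and the recommendation word output unit 44 is connected to the summary unit 43. The tag library 41 stores a plurality of tags, a plurality of categories, and a plurality of application keywords, where each category contains a plurality of tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.
Referring to FIG. 9, an application keyword is the content the user wants to search for. The tag library 41 configures corresponding tags for the various application keywords that may be entered, and the tags need to cover the various characteristics of the application keyword. The correspondence between application keywords and tags is configured through data mining and manual verification. For example, for the application keyword "Angry Birds", the tags "cartoon", "puzzle", and "throwing" can be configured; for the application keyword "WeChat", the tags "walkie-talkie", "chat", "voice", "file transfer", and "notes" can be configured. Each tag corresponds to at least one category, and tags are assigned to categories according to their functional characteristics. For example, the tags "alarm clock", "Trojan removal", and "novel reading" correspond to the category "function tags", and the tags "3D", "landscape", and "portrait" correspond to the category "interface".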
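The server of this embodiment can be used on its own, receiving application keywords entered by the user, or together with a general search engine, in which case the search results output by the general search engine serve as the application keywords input to the server.
In operation, when the matching unit 42 receives an application keyword, it obtains the tags matching that keyword from the tag library 41. Every tag has its corresponding category; the summary unit 43 uses the tag library 41 to find the category corresponding to each tag output by the matching unit 42, summarizes the categories found, and finds the one that occurs most often. Finally, the summary unit 43 outputs the most frequent category to the recommendation word output unit 44, which scans the tag library 41 and finds the tags corresponding to that category as recommended search terms, preferably the popular tags corresponding to that category.
The most frequent category is the one most relevant to what the user is searching for. It may contain several tags, and the popularity of a tag can be set manually or determined from a record of how often it has been searched. For example, the category "interface" contains the three tags "3D", "landscape", and "portrait", of which "3D" is set as the most popular because it is searched often; if "interface" is the most frequent category, the recommendation word output unit 44 outputs the tag "3D" as the recommended search term. More than one search term may of course be output in the end, which can be achieved by setting a popularity threshold for tags.
In particular, when the matching unit 42 receives an application keyword, it can first determine whether the received keyword is a fuzzy keyword. If not, the application keyword is used directly as the search term; if so, the tags matching it are obtained according to the received application keyword. A fuzzy keyword here is a word that does not make the user's subjective intent clear, and whether an application keyword is fuzzy can be determined by assigning it a relevance score. For example, when the user enters "QQ2012", the user wants to find one specific piece of software. The search purpose is fairly clear, no recommended words need to be shown, and a general search engine can be used directly with the application keyword "QQ2012" as the search term; a higher score can therefore be set for "QQ2012". If the user enters "Tencent", the user may want to find some type of software owned by Tencent. The search purpose is then fairly vague, so a lower score can be set for "Tencent" and a further search is performed.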
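FIG. 7 is a structural diagram of another server according to an embodiment of the present invention. The server includes a tag library 41, a matching unit 42, a summary unit 43, a recommendation word output unit 44, and a feature library 45. The tag library 41 is connected to the feature library 45; the tag library 41 and the feature library 45 are each connected to the matching unit 42, the summary unit 43, and the recommendation word output unit 44; the summary unit 43 is connected to the matching unit 42; and the recommendation word output unit 44 is connected to the summary unit 43.
Unlike the embodiment of FIG. 6, the server of this embodiment further includes a feature library 45. The feature library 45 stores a plurality of approximate tags, which correspond to tags in the tag library 41. Each approximate tag is similar in functional characteristics to one or more corresponding tags in the tag library; that is, an approximate tag belongs to the same category as its corresponding tag from the tag library. After the matching unit 42 receives the application keyword, it can obtain the tags matching the keyword from the tag library 41 and/or the approximate tags matching the keyword from the feature library 45, and then find the categories corresponding to these tags and/or approximate tags. The search function of the system can thus be improved by adding approximate tags to the feature library 45, which facilitates expansion of the system.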
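The present invention also provides a search term recommendation system. FIG. 8 is a structural diagram of a search term recommendation system according to an embodiment of the present invention. The system includes a server 81 and at least one client 82, and the client 82 is connected to the server 81 via a network. The client 82 can be a terminal such as a computer, a mobile phone, or a tablet computer; the user enters the word or sentence to be searched on it, and it is sent to the server 81 as an application keyword. The server 81 uses the application keyword sent by the client 82 to obtain recommended search terms that match the user's potential search intention and feeds them back to the client 82, which presents them to the user so that the user can search with a clearer purpose. For the functional structure of the server 81 of this embodiment, refer to the description of the server in the embodiments of FIG. 6 and FIG. 7; details are not repeated here.
Starting from application keywords entered directly by the user or derived from the search results of a general search engine, the present invention can find popular recommendation words with the same functional characteristics and present them to the user. When the user's subjective search purpose is unclear, the user's potential needs can thus be mined or refined, so that the search results better match the user's intention, which makes the invention highly practical.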
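The present invention also provides a storage medium containing computer-executable instructions which, when run by a processor, perform a search term obtaining method, the method including:
setting a tag library, where the tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
determining whether a received application keyword is a fuzzy keyword;
if the received application keyword is a fuzzy keyword, obtaining, according to the received application keyword, a tag that matches the received application keyword;
obtaining, according to the matched tag, a category corresponding to the matched tag;
summarizing the obtained categories to find the category that occurs most often; and
finding a tag corresponding to the most frequent category as a recommended search term.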
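The present invention also provides another storage medium containing computer-executable instructions which, when run by a processor, perform a search term recommendation method for recommending, from a server to a client, search terms that match the user's intention, characterized in that the server is provided with a tag library, the tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords, and the search term recommendation method includes:
the client sends the application keyword that the user wants to search for to the server;
the server receives the application keyword sent by the client and determines whether the received application keyword is a fuzzy keyword;
if the received application keyword is a fuzzy keyword, the server obtains, according to the received application keyword, a tag that matches the received application keyword;
the server obtains, according to the matched tag, a category corresponding to the matched tag;
the server summarizes the obtained categories and finds the category that occurs most often;
the server finds a tag corresponding to the most frequent category as a recommended search term and returns the recommended search term to the client; and
the client presents the received recommended search term to the user.
The above are only preferred embodiments of the present invention and do not limit the invention in any form. Although the invention has been disclosed through the above preferred embodiments, they are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make equivalent embodiments with substitutions or modifications; anything that does not depart from the technical solution of the present invention still falls within its scope.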

Claims

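1. A search term obtaining method, running on a server, characterized in that the method includes:
setting a tag library, where the tag library stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
determining whether a received application keyword is a fuzzy keyword;
if the received application keyword is a fuzzy keyword, obtaining, according to the received application keyword, a tag that matches the received application keyword;
obtaining, according to the matched tag, a category corresponding to the matched tag;
summarizing the obtained categories to find the category that occurs most often; and
determining a tag corresponding to the most frequent category as a recommended search term.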
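2. The search term obtaining method according to claim 1, characterized in that each category contains a plurality of the tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.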
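3. The search term obtaining method according to claim 1 or 2, characterized in that
determining whether the received application keyword is a fuzzy keyword includes: determining whether the relevance score of the received application keyword is below a preset relevance score threshold; if so, determining that the received application keyword is a fuzzy keyword; otherwise, determining that the received application keyword is not a fuzzy keyword.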
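4. The search term obtaining method according to claim 1 or 2, characterized in that the received application keyword is entered by a user or comes from search results output by a search engine.
5. The search term obtaining method according to claim 1 or 2, characterized in that the method further includes:
setting a feature library, where the feature library stores a plurality of approximate tags, and the approximate tags correspond to tags in the tag library;
obtaining, according to the received application keyword, a tag that matches the received application keyword includes: obtaining, according to the received application keyword, a tag and/or an approximate tag that matches the received application keyword;
obtaining, according to the matched tag, a category corresponding to the matched tag includes: obtaining the corresponding category according to the matched tag and/or approximate tag.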
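6. A server, characterized by including:
a tag library, which stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
a matching unit, configured to determine, after receiving an application keyword, whether the received application keyword is a fuzzy keyword and, if it is, to obtain, according to the received application keyword, a tag that matches the received application keyword;
a summary unit, configured to obtain, according to the matched tag obtained by the matching unit, a category corresponding to the matched tag, and to summarize the obtained categories to find the category that occurs most often; and
a recommendation word output unit, configured to determine a tag corresponding to the most frequent category as a recommended search term.
7. The server according to claim 6, characterized in that each category contains a plurality of the tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.
8. The server according to claim 6 or 7, characterized in that
the matching unit is specifically configured to: determine whether the relevance score of the received application keyword is below a preset relevance score threshold; if so, determine that the received application keyword is a fuzzy keyword; otherwise, determine that the received application keyword is not a fuzzy keyword.
9. The server according to claim 6 or 7, characterized in that the received application keyword is entered by a user or comes from search results output by a search engine.
10. The server according to claim 6 or 7, characterized in that the server further includes:
a feature library, which stores a plurality of approximate tags, where the approximate tags correspond to tags in the tag library;
after receiving the application keyword, the matching unit obtains from the tag library a tag matching the received application keyword and/or obtains from the feature library an approximate tag matching the received application keyword, and obtains the corresponding category according to the matched tag and/or approximate tag.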
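11. A search term recommendation system, characterized by including a server and at least one client, where the client is configured to send an application keyword to the server and to receive the recommended search term returned by the server and present it to the user, and the server further includes:
a tag library, which stores a plurality of tags, a plurality of categories, and a plurality of application keywords;
a matching unit, configured to receive the application keyword sent by the client, to determine whether the received application keyword is a fuzzy keyword and, if it is, to obtain, according to the received application keyword, a tag that matches the received application keyword;
a summary unit, configured to obtain, according to the matched tag obtained by the matching unit, a category corresponding to the matched tag, and to summarize the obtained categories to find the category that occurs most often; and
a recommendation word output unit, configured to determine a tag corresponding to the most frequent category as a recommended search term.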
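12. The search term recommendation system according to claim 11, characterized in that each category contains a plurality of the tags, each application keyword corresponds to at least one tag, and each tag belongs to at least one category.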
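13. The search term recommendation system according to claim 11 or 12, characterized in that
the matching unit determining whether the received application keyword is a fuzzy keyword includes: determining whether the relevance score of the received application keyword is below a preset relevance score threshold; if so, determining that the received application keyword is a fuzzy keyword; otherwise, determining that the received application keyword is not a fuzzy keyword.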
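14. The search term recommendation system according to claim 11 or 12, characterized in that the search term recommendation system further includes:
a feature library, which stores a plurality of approximate tags, where the approximate tags correspond to tags in the tag library;
after receiving the application keyword, the matching unit obtains from the tag library a tag matching the received application keyword and/or obtains from the feature library an approximate tag matching the received application keyword, and obtains the corresponding category according to the matched tag and/or approximate tag.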
PCT/CN2013/079173 2012-10-09 2013-07-11 Search term obtaining method, server, and search term recommendation system WO2014056337A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/678,355 US20150213042A1 (en) 2012-10-09 2015-04-03 Search term obtaining method and server, and search term recommendation system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210379599.5 2012-10-09
CN201210379599.5A CN103714088A (zh) 2012-10-09 2012-10-09 Search term obtaining method, server, search term recommendation method and system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/678,355 Continuation US20150213042A1 (en) 2012-10-09 2015-04-03 Search term obtaining method and server, and search term recommendation system

Publications (1)

Publication Number Publication Date
WO2014056337A1 (zh) 2014-04-17

Family

ID=50407073

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/079173 WO2014056337A1 (zh) 2012-10-09 2013-07-11 Search term obtaining method, server, and search term recommendation system

Country Status (3)

Country Link
US (1) US20150213042A1 (zh)
CN (1) CN103714088A (zh)
WO (1) WO2014056337A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104503975A (zh) * 2014-11-20 2015-04-08 百度在线网络技术(北京)有限公司 Method and apparatus for customizing recommendation cards
CN106708886A (zh) * 2015-11-17 2017-05-24 北京国双科技有限公司 Method and apparatus for displaying in-site search terms

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
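CN104063418A (zh) * 2014-03-17 2014-09-24 百度在线网络技术(北京)有限公司 Search recommendation method and apparatus
CN104143001B (zh) * 2014-08-01 2018-03-30 百度在线网络技术(北京)有限公司 Search term recommendation method and apparatus
CN104715066B (zh) * 2015-03-31 2017-04-12 北京奇付通科技有限公司 Search optimization method, apparatus, and system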
US20160371340A1 (en) * 2015-06-19 2016-12-22 Lenovo (Singapore) Pte. Ltd. Modifying search results based on context characteristics
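CN106844406B (zh) * 2015-12-07 2021-03-02 腾讯科技(深圳)有限公司 Retrieval method and retrieval apparatus
CN108781406B (zh) * 2016-03-14 2021-09-07 罗伯特·博世有限公司 Wireless access point for an intercom system and method for routing audio stream data
CN108010523B (zh) * 2016-11-02 2023-05-09 松下电器(美国)知识产权公司 Information processing method and recording medium
CN108205545B (zh) * 2016-12-16 2022-06-10 百度在线网络技术(北京)有限公司 Method and device for providing recommendation information to a user
CN106709040B (zh) * 2016-12-29 2021-02-19 北京奇虎科技有限公司 Application search method and server
CN106909688B (zh) * 2017-03-07 2020-10-16 阿里巴巴(中国)有限公司 Method and apparatus for recommending search terms based on an input search term
CN107291930A (zh) * 2017-06-29 2017-10-24 环球智达科技(北京)有限公司 Method for calculating weight numbers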
EP3667516A1 (en) * 2017-08-31 2020-06-17 Shenzhen Heytap Technology Corp., Ltd. Method for recommending search word, and related device
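CN109801119B (zh) * 2017-11-15 2022-04-15 阿里巴巴集团控股有限公司 Methods and devices for interface display, information provision, and user behavior content information processing
CN108197242A (zh) * 2017-12-29 2018-06-22 北京奇虎科技有限公司 Method, apparatus, and server for pushing recommended search words
CN109086389A (zh) * 2018-07-26 2018-12-25 国信优易数据有限公司 Information query method, push method, apparatus, and electronic device
CN111078989B (zh) * 2018-10-18 2024-03-22 阿里巴巴集团控股有限公司 Application recommendation method, apparatus, and electronic device
CN109918555B (zh) * 2019-02-20 2021-10-15 百度在线网络技术(北京)有限公司 Method, apparatus, device, and medium for providing search suggestions
CN110609956B (zh) * 2019-09-18 2022-10-11 苏州达家迎信息技术有限公司 Information search method, apparatus, medium, and device
CN112307069A (zh) * 2020-11-12 2021-02-02 京东数字科技控股股份有限公司 Data query method, system, device, and storage medium
CN113536127A (zh) * 2021-01-12 2021-10-22 陈漩 Data processing method and cloud server based on big data and artificial intelligence
CN113609380B (zh) * 2021-07-12 2024-03-26 北京达佳互联信息技术有限公司 Tag system update method, search method, apparatus, and electronic device
CN117076783B (zh) * 2023-10-16 2023-12-26 广东省科技基础条件平台中心 Data-analysis-based scientific and technological information recommendation method, apparatus, medium, and device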

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
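CN101593200A (zh) * 2009-06-19 2009-12-02 淮海工学院 Chinese web page classification method based on keyword frequency analysis
CN101751405A (zh) * 2008-12-12 2010-06-23 国际商业机器公司 Method and system for searching documents
CN102236663A (zh) * 2010-04-30 2011-11-09 阿里巴巴集团控股有限公司 Query method, system, and apparatus based on vertical search
CN102591890A (zh) * 2011-01-17 2012-07-18 腾讯科技(深圳)有限公司 Method for displaying search information and search information display apparatus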

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8005919B2 (en) * 2002-11-18 2011-08-23 Aol Inc. Host-based intelligent results related to a character stream
US20080040126A1 (en) * 2006-08-08 2008-02-14 Microsoft Corporation Social Categorization in Electronic Mail
US8554783B2 (en) * 2007-09-17 2013-10-08 Morgan Stanley Computer object tagging

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
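CN101751405A (zh) * 2008-12-12 2010-06-23 国际商业机器公司 Method and system for searching documents
CN101593200A (zh) * 2009-06-19 2009-12-02 淮海工学院 Chinese web page classification method based on keyword frequency analysis
CN102236663A (zh) * 2010-04-30 2011-11-09 阿里巴巴集团控股有限公司 Query method, system, and apparatus based on vertical search
CN102591890A (zh) * 2011-01-17 2012-07-18 腾讯科技(深圳)有限公司 Method for displaying search information and search information display apparatus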

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
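CN104503975A (zh) * 2014-11-20 2015-04-08 百度在线网络技术(北京)有限公司 Method and apparatus for customizing recommendation cards
CN106708886A (zh) * 2015-11-17 2017-05-24 北京国双科技有限公司 Method and apparatus for displaying in-site search terms
CN106708886B (zh) * 2015-11-17 2020-08-11 北京国双科技有限公司 Method and apparatus for displaying in-site search terms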

Also Published As

Publication number Publication date
US20150213042A1 (en) 2015-07-30
CN103714088A (zh) 2014-04-09

Similar Documents

Publication Publication Date Title
WO2014056337A1 (zh) Search term obtaining method, server, and search term recommendation system
US10885076B2 (en) Computerized system and method for search query auto-completion
US20210026910A1 (en) Expert Detection in Social Networks
EP3767492B1 (en) Data information transaction method and system
CN102368262B (zh) 一种提供与查询序列相对应的搜索建议的方法与设备
US9292877B2 (en) Methods and systems for generating concept-based hash tags
US10992612B2 (en) Contact information extraction and identification
WO2018028443A1 (zh) 数据处理方法、设备及系统
US20110314011A1 (en) Automatically generating training data
US20110136542A1 (en) Method and apparatus for suggesting information resources based on context and preferences
WO2018000569A1 (zh) 话题订阅方法、装置和存储介质
US10430718B2 (en) Automatic social media content timeline summarization method and apparatus
US10606853B2 (en) Systems and methods for intelligent prospect identification using online resources and neural network processing to classify organizations based on published materials
US10606910B2 (en) Ranking search results using machine learning based models
US11397737B2 (en) Triggering local extensions based on inferred intent
CN110929125A (zh) 搜索召回方法、装置、设备及其存储介质
US11086941B2 (en) Generating suggestions for extending documents
US11436446B2 (en) Image analysis enhanced related item decision
WO2017136295A1 (en) Adaptive seeded user labeling for identifying targeted content
US9824149B2 (en) Opportunistically solving search use cases
US20150310491A1 (en) Dynamic text ads based on a page knowledge graph
WO2017099979A1 (en) Providing automated hashtag suggestions to categorize communication
JP4651975B2 (ja) 情報検索システム、情報検索装置、情報検索支援装置および情報検索プログラムおよび情報検索支援プログラム
KR20190141876A (ko) 물물 교환 중개 방법
US9183251B1 (en) Showing prominent users for information retrieval requests

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13846100

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19/08/2015)

122 Ep: pct application non-entry in european phase

Ref document number: 13846100

Country of ref document: EP

Kind code of ref document: A1