WO2023151576A1 - Search recommendation method, search recommendation system, computer device and storage medium - Google Patents

Publication number
WO2023151576A1
WO2023151576A1 PCT/CN2023/074947 CN2023074947W WO2023151576A1 WO 2023151576 A1 WO2023151576 A1 WO 2023151576A1 CN 2023074947 W CN2023074947 W CN 2023074947W WO 2023151576 A1 WO2023151576 A1 WO 2023151576A1
Authority
WO
WIPO (PCT)
Prior art keywords
search
user
recommendation
words
index
Prior art date
Application number
PCT/CN2023/074947
Other languages
French (fr)
Chinese (zh)
Inventor
刘杨
Original Assignee
中兴通讯股份有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present disclosure relates to the technical field of search, and in particular to a search recommendation method, a search recommendation system, computer equipment, and a storage medium.
  • a common current search and ranking method works as follows: after the user enters a keyword as a search term, the system's back-end server matches the search term directly against the terms in the knowledge base to obtain multiple candidate articles. A ranking score is then calculated for each article from its "relevance score", and the articles are sorted by score and recommended to the user.
  • the relevance score represents the degree of match between the user's search term and the article, and is calculated by the system server using a specific algorithm.
  • the main purpose of the embodiments of the present disclosure is to provide a search recommendation method, a search recommendation system, a computer device, and a storage medium.
  • An embodiment of the present disclosure provides a search recommendation method, including: obtaining the index words corresponding to a search sentence input by a user; obtaining the user's search recommendation words, where the search recommendation words are used to associate the user's historical search behavior; and obtaining search results according to the index words and the search recommendation words.
  • An embodiment of the present disclosure also provides a search recommendation system, including: an index word acquisition module, configured to acquire the index words corresponding to a search sentence input by a user; a user preference module, configured to acquire the user's search recommendation words, where the search recommendation words are used to associate the user's historical search behavior; and a search module, configured to obtain search results according to the index words and the search recommendation words.
  • An embodiment of the present disclosure also provides a computer device. The computer device includes a processor, a memory, a computer program stored on the memory and executable by the processor, and a data bus for connection and communication between the processor and the memory. When the computer program is executed by the processor, the steps of any search recommendation method provided in the present disclosure are implemented.
  • An embodiment of the present disclosure also provides a storage medium for computer-readable storage.
  • the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of any search recommendation method provided in the present disclosure.
  • FIG. 1 is a schematic flowchart of a search and recommendation method provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of a search and recommendation method in the telecommunications industry according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a dictionary tree constructed in Embodiment 1 of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a search recommendation system provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural block diagram of a computer device provided by an embodiment of the present disclosure.
  • Embodiments of the present disclosure provide a search recommendation method, system, computer device, and storage medium. They address the problem that existing knowledge search cannot be associated with the user's historical behavior and cannot accurately judge the user's search intention, which leaves the search results insufficiently accurate, and they thereby improve the user experience.
  • the search recommendation method can be applied to mobile terminals, and the mobile terminals may include electronic devices such as mobile phones, tablet computers, notebook computers, desktop computers, personal digital assistants, and wearable devices.
  • FIG. 1 is a schematic flowchart of a search and recommendation method provided by an embodiment of the present disclosure.
  • the search recommendation method includes steps S101 to S103.
  • Step S101: acquiring index words corresponding to a search sentence input by a user.
  • the search sentence input by the user is matched against a preset dictionary tree to obtain the index words corresponding to the search sentence.
  • the preset dictionary tree is constructed by classifying the articles in a preset knowledge base and then building the tree from the word segmentation results of the article content and the category to which each article belongs.
  • the business category corresponding to an index word can also be obtained from the category stored in the preset dictionary tree.
  • the professional terms and articles of the telecommunications industry are collected first, and the articles are classified by business; the professional terms and common terms of the telecommunications industry are added to the thesaurus; and the content of each knowledge article is subjected to both coarse-grained and fine-grained word segmentation. The word segmentation results are then deduplicated.
  • coarse-grained word segmentation means that when a sentence contains custom common words of the telecommunications industry, the sentence is segmented according to those custom words.
  • knowledge articles can be segmented with the ANSJ tokenizer.
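  • The dictionary tree lookup in step S101 can be sketched as a character-level trie that stores categories on the node ending each index word. This is a minimal illustration, not the patent's implementation: the class and function names, the greedy longest-match scan, the lowercase normalization, and the category labels are all assumptions, and the sample words are taken from the telecommunications example given later in the description.

```python
from typing import Dict, List, Set, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Categories are stored only on the node that ends an index word.
        self.categories: Set[str] = set()


class DictionaryTree:
    """Hypothetical dictionary tree mapping index words to article categories."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str, category: str) -> None:
        node = self.root
        for ch in word.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.categories.add(category)

    def match(self, sentence: str) -> List[Tuple[str, Set[str]]]:
        """Scan the sentence left to right, keeping the longest word at each position."""
        text = sentence.lower()
        found: List[Tuple[str, Set[str]]] = []
        i = 0
        while i < len(text):
            node, j = self.root, i
            best_end, best_cats = -1, set()
            while j < len(text) and text[j] in node.children:
                node = node.children[text[j]]
                j += 1
                if node.categories:
                    best_end, best_cats = j, set(node.categories)
            if best_end != -1:
                found.append((text[i:best_end], best_cats))
                i = best_end
            else:
                i += 1  # no index word starts here; move on one character
        return found


def build_example_tree() -> DictionaryTree:
    # Deduplicated segmentation results per category (assumed sample data).
    segmented = {
        "basic telecommunications": ["internet", "domestic", "data", "transmission"],
        "value-added telecommunications": ["internet", "data", "center", "data center", "call"],
    }
    tree = DictionaryTree()
    for category, words in segmented.items():
        for word in words:
            tree.insert(word, category)
    return tree
```

  • Calling `build_example_tree().match("Internet data")` yields the index words "internet" and "data", each tagged with both categories. A character-level scan like this one would also match an index word embedded in a longer unknown word, so a real system would likely match on segmented tokens instead.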
  • Step S102: acquiring the user's search recommendation words, which are used to associate the user's historical search behavior.
  • the search recommendation words preset by the user may be obtained from the user database according to the user's ID, and are used for searching together with the index words matched from the search sentence. In this way each search is associated with the user's historical search behavior: the input to the search is expanded, and the search recommendation words bring the search results closer to the user's search intent.
  • the user's historical search and article-reading behavior can be analyzed with a neural network, and the user's search recommendation words can then be updated and maintained according to the analysis results.
  • the network model analyzes the user's information and search history to obtain the user's search preference, and the user's search recommendation words are updated according to that preference.
  • Step S103: obtaining search results according to the index words and the search recommendation words.
  • the articles in the knowledge base are searched according to the index words and the search recommendation words.
  • the required search results are obtained by filtering out the articles in the knowledge base that do not contain the index words and the search recommendation words.
  • the search results can also be sorted so that the results most relevant to the user's search intention are displayed first, letting the user locate the required knowledge articles more quickly.
  • relevance scoring is performed on the search results, and the search results are sorted according to the resulting relevance scores to obtain the recommended results.
  • sorting places the articles that better match the user's search intention higher, which requires scoring the relevance of the search results. Articles with higher scores are considered closer to the user's search intent, so the search results are sorted from the highest score to the lowest, and the sorted search results are the recommended results displayed to the user.
  • the relevance scoring of the search results is performed by setting different weights for index words, synonyms, and search recommendation words.
  • an index word is a search entity word matched from the user's search sentence, so it is given the highest weight; a synonym of an index word is given the second-highest weight; and a search recommendation word represents the user's historical search behavior and may have little connection with the current search, so it is given the third-highest weight.
  • the relevance score may be calculated with a TF-IDF (Term Frequency-Inverse Document Frequency) algorithm.
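  • The three-tier weighting can be illustrated with a small weighted TF-IDF scorer. This is a sketch under stated assumptions: the numeric weights, the raw-count TF, the smoothed IDF, and all names are hypothetical, since the description fixes only the ordering index word > synonym > search recommendation word.

```python
import math
from typing import Dict, List


def idf(term: str, corpus: List[List[str]]) -> float:
    # Smoothed inverse document frequency; the smoothing is an assumption.
    containing = sum(1 for doc in corpus if term in doc)
    return math.log((1 + len(corpus)) / (1 + containing)) + 1.0


def weighted_relevance(doc: List[str], corpus: List[List[str]],
                       term_weights: Dict[str, float]) -> float:
    # TF is taken as the raw count of the term in the document (an assumption).
    return sum(w * doc.count(t) * idf(t, corpus) for t, w in term_weights.items())


# Hypothetical weights that only preserve the ordering given in the text.
INDEX_W, SYNONYM_W, RECOMMEND_W = 3.0, 2.0, 1.0

corpus = [
    ["internet", "domestic", "data", "transmission"],
    ["internet", "data", "center"],
    ["call", "center"],
]
# "internet" and "data" are index words; "domestic" is a search recommendation word.
term_weights = {"internet": INDEX_W, "data": INDEX_W, "domestic": RECOMMEND_W}

scores = [weighted_relevance(doc, corpus, term_weights) for doc in corpus]
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

  • With these sample values the first document ranks highest because it also contains the recommendation word, and the third document scores zero because it contains none of the query terms.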
  • the recommendation result is optimized according to the user's preference classification to obtain an optimized recommendation result. Optimizing the recommendation result according to the preference classification includes: weighting the relevance scores of the search results that belong to the preference classification, and ranking the search results according to the resulting weighted relevance scores.
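  • A minimal sketch of the preference-based optimization, assuming the weighting is a multiplicative boost. The function name, the `boost` factor, and the sample data are hypothetical; the description does not say whether the weighting is multiplicative or additive.

```python
from typing import Dict, List, Set


def rerank_with_preferences(scores: Dict[str, float],
                            category_of: Dict[str, str],
                            preferred: Set[str],
                            boost: float = 1.2) -> List[str]:
    """Boost results in a preferred category, then sort from high to low."""
    adjusted = {
        article: (score * boost if category_of[article] in preferred else score)
        for article, score in scores.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)


example_scores = {"article A": 1.0, "article B": 1.1}
example_categories = {"article A": "basic", "article B": "value-added"}
reranked = rerank_with_preferences(example_scores, example_categories, {"basic"})
```

  • Here "article A" overtakes "article B" because its boosted score (1.2) exceeds the unboosted 1.1.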
  • the user's preference classification can be set by the user in advance according to their own business needs. To maintain it more intelligently, the user's historical behavior and information in various dimensions can also be analyzed with a neural network, so that the preference classification is maintained and updated automatically.
  • collecting user information includes collecting information in various dimensions, such as the user's position information, the customer types served, and working hours. The user's search history data includes the order in which the user browsed articles, the number of times articles were read, the article reading time, whether an article was added to favorites, and so on. The above search history data of the user are used as the neurons of the input layer of the neural network.
  • x_i is the i-th input neuron data, w_i is the weight of the i-th input neuron data, and b_i is the offset of the hidden layer.
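  • The formula these symbols belong to is not present in this text. A standard hidden-layer form consistent with the definitions is given here as an assumption, not as the patent's own equation; the activation function f, the output symbol h, and the input count n are introduced for illustration, and the offset, which the text labels b_i, is written as a single term b:

```latex
h = f\left( \sum_{i=1}^{n} w_i x_i + b \right)
```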
  • the user's preference scores for recently read articles are obtained, and the recently read articles with high preference scores are counted. A preference classification score is then calculated for each category by a self-defined algorithm from the article category, the preference scores, and the number of articles in the category, and the categories with high scores are taken as the user's preference classification.
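  • The self-defined algorithm is not specified in the text. One possible instance, shown purely as an assumption, sums the preference scores of the well-liked articles in each category, so that both the score level and the article count raise the category score. The threshold, `top_k`, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def preferred_categories(read_articles: List[Tuple[str, float]],
                         score_threshold: float = 0.6,
                         top_k: int = 1) -> List[str]:
    """read_articles holds (category, preference score) pairs for recent reads."""
    totals: Dict[str, float] = defaultdict(float)
    for category, score in read_articles:
        if score >= score_threshold:  # keep only articles with a high preference score
            totals[category] += score
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[:top_k]


recent = [("basic", 0.9), ("basic", 0.8), ("value-added", 0.7), ("value-added", 0.3)]
top = preferred_categories(recent)
```

  • For the sample data, "basic" accumulates two high-scoring articles and is returned as the preferred category.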
  • the user's access frequency can also be recorded for the top-ranked articles. If the user accesses an article very rarely or not at all, the article is considered not to match the user's search intention, and points are deducted from it during sorting.
  • category analysis may be performed on the articles that the user rarely accesses to identify the categories the user dislikes, and those categories are used as a basis for deducting points when sorting search results.
  • for example, the access frequency of the first 100 articles recommended to the user is recorded by associating each article's ID in the database with the user ID and counting the number of visits. The number of visits is read when sorting, and when it is very small or 0, points are deducted from the corresponding article.
  • the categories of these low-frequency articles are obtained, and such a category is considered to be a non-preferred category of the user. When search results are sorted, points are deducted from the results that belong to non-preferred categories.
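  • The two deductions described above (per article and per non-preferred category) can be sketched as follows. All thresholds, penalty sizes, and names are hypothetical, and the rule that a category becomes non-preferred once it contains at least `min_low_articles` rarely visited articles is an assumption, since the text does not state the exact condition.

```python
from collections import Counter
from typing import Dict


def apply_access_penalties(scores: Dict[str, float],
                           visits: Dict[str, int],
                           category_of: Dict[str, str],
                           min_visits: int = 1,
                           min_low_articles: int = 1,
                           article_penalty: float = 2.0,
                           category_penalty: float = 1.0) -> Dict[str, float]:
    """visits holds counts only for articles that were already recommended."""
    low = {a for a, n in visits.items() if n < min_visits}
    low_per_category = Counter(category_of[a] for a in low)
    disliked = {c for c, n in low_per_category.items() if n >= min_low_articles}
    adjusted = {}
    for article, score in scores.items():
        if article in low:
            score -= article_penalty   # rarely or never opened by this user
        if category_of[article] in disliked:
            score -= category_penalty  # belongs to a non-preferred category
        adjusted[article] = score
    return adjusted


penalized = apply_access_penalties(
    scores={"a": 5.0, "b": 4.0, "c": 3.0},
    visits={"a": 10, "b": 0},
    category_of={"a": "X", "b": "Y", "c": "Y"},
)
```

  • In this example article "b" loses points twice (never visited, and in a non-preferred category), while "c" loses only the category deduction because it was never recommended before.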
  • the scores of the articles in the search results are sorted from high to low, and the sorted search results are returned to the user as recommended results.
  • the search recommendation method extracts the index words contained in the search sentence using the dictionary tree of the knowledge base, and obtains the user's search recommendation words and/or search preferences for searching, thereby expanding the search sentence to obtain more results that match the user's search intent.
  • the search recommendation words and preference categories that represent the user's search preferences are used together with the search index words to score the relevance of the articles, and the articles are sorted by score so that those that better match the user's search intention are displayed first.
  • the user's search preference is predicted and analyzed through deep learning of user search and browsing data, and the analysis result is used to modify the user's search preference, so that the user's search preference can be intelligently adjusted according to the search history.
  • the search recommendation results that are more in line with the user's intention are obtained, and the user's satisfaction is improved.
  • the technical solutions of the embodiments of the present disclosure improve the search results by associating the user's historical search preferences when searching, and optimize the ranking of the search results, so that the user's search intention can be judged more accurately, and the user's experience is improved.
  • embodiments of the present disclosure also provide two specific embodiments on the search recommendation method, which are as follows.
  • FIG. 2 is a schematic flowchart of a search and recommendation method in the telecommunications industry according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a dictionary tree constructed in Embodiment 1. For simplicity, this embodiment describes search recommendation in one specific scenario; the present disclosure is also applicable to search recommendation in other scenarios.
  • Step 1: Collect knowledge articles in the telecommunications industry, for example "Internet domestic data transmission", "Internet data center", and "call center", and classify them by content: "Internet domestic data transmission" belongs to basic telecommunications services, while "Internet data center" and "call center" belong to value-added telecommunications services.
  • Step 2: Perform word segmentation on the knowledge articles. The word segmentation results for "Internet domestic data transmission" are "Internet", "domestic", "data", and "transmission"; those for "Internet data center" are "Internet", "data", "center", and "data center"; and those for "call center" are "call" and "center".
  • Table 1 shows the results of word segmentation for the above knowledge articles.
  • Step 3: Obtain the user ID (such as Test1) and the search sentence entered by the user, such as "Internet data".
  • Step 4: Match the user's search sentence against the nodes of the dictionary tree to obtain the index words "Internet" and "data"; the matching categories are "basic telecommunications services" and "value-added telecommunications services".
  • Step 5: Use the user ID to obtain the search recommendation words in the user's search preferences stored in the database, such as "domestic".
  • Step 6: Perform a matching search in the different classification indexes according to the categories of the index words and the search recommendation words. The matched articles are "Internet domestic data transmission" and "Internet data center".
  • Step 7: Set different weight values for the index words and the search recommendation words, with the weight of the index words greater than that of the search recommendation words, and use the TF-IDF scoring algorithm to score the matched articles according to those weights.
  • for each document D and a query string Q composed of the keywords W[1]...W[k], the TF-IDF model calculates a value from TF and IDF that indicates the degree of match between the query string Q and the document D:
  • the weighting coefficient H is added, and the final query matching degree is:
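  • The two formulas referred to above are not present in this text. The widely used TF-IDF matching form, and a weighted variant in which each keyword W[i] carries its own coefficient H[i], are given here as assumptions consistent with the surrounding definitions rather than as the patent's exact equations; the per-keyword indexing of H is a guess.

```latex
\mathrm{score}(Q, D) = \sum_{i=1}^{k} \mathrm{TF}(W[i], D) \cdot \mathrm{IDF}(W[i])

\mathrm{score}_{H}(Q, D) = \sum_{i=1}^{k} H[i] \cdot \mathrm{TF}(W[i], D) \cdot \mathrm{IDF}(W[i])
```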
  • the predetermined preference score ratio is shown in Table 4.
  • the database is queried with the user ID to obtain the user's preference categories, such as "basic telecommunications services". Points are then added to the articles in the search results that belong to the preference category, so "Internet domestic data transmission" receives further bonus points.
  • Step 8: Sort the search results by score from high to low to form a recommended result list, and return the list to the user.
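  • The per-category matching search of Step 6 can be sketched with one inverted index per category. The names and data layout are hypothetical. The sketch requires every index word to be present but does not use the search recommendation word as a hard filter; that choice is inferred from the example, where "Internet data center" is matched although it does not contain "domestic".

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# title -> (category, segmented words), following the example articles above
ARTICLES: Dict[str, Tuple[str, List[str]]] = {
    "Internet domestic data transmission":
        ("basic", ["internet", "domestic", "data", "transmission"]),
    "Internet data center":
        ("value-added", ["internet", "data", "center", "data center"]),
    "call center":
        ("value-added", ["call", "center"]),
}


def build_category_index(
    articles: Dict[str, Tuple[str, List[str]]]
) -> Dict[str, Dict[str, Set[str]]]:
    """Build category -> word -> set of article titles."""
    index: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for title, (category, tokens) in articles.items():
        for token in tokens:
            index[category][token].add(title)
    return index


def search(index: Dict[str, Dict[str, Set[str]]],
           categories: Iterable[str],
           index_words: List[str]) -> Set[str]:
    """Return articles that contain every index word, looked up per category."""
    hits: Set[str] = set()
    for category in categories:
        postings = index.get(category, {})
        per_word = [postings.get(word, set()) for word in index_words]
        if per_word:
            hits |= set.intersection(*per_word)
    return hits


category_index = build_category_index(ARTICLES)
matched = search(category_index, ["basic", "value-added"], ["internet", "data"])
```

  • For the query "Internet data" this returns the two articles named in Step 6 and leaves out "call center". The recommendation word "domestic" would then contribute only to the scoring in Step 7.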
  • Embodiment 2 of the present disclosure provides an application of a user preference screening method in a telecommunication customer service system, and the present disclosure is also applicable to preference screening in other application scenarios.
  • Step 1: Collect user information in various dimensions, such as the user's position information (business consultation, fee inquiry, business handling, new business promotion, etc.), the customer types served (family customers, government and enterprise customers, public phones, wireless local calls), and working hours, together with the user's search and browsing data for articles, including browsing order, number of times articles were read, article reading time, and whether an article was saved. Input this multi-dimensional user information and the search and browsing data to the neurons of the input layer.
  • Step 2: Establish a BP (back-propagation) neural network model for each category, using the above data of each dimension as the input of the BP neural network. An activation function in the hidden layer is used to calculate and output the preference score for articles read in the most recent period, so that the user's preference scores for recently read articles are obtained from the BP neural network.
  • Step 3: Count the recently read articles with high preference scores, calculate the user preference classification score from the article category, the preference score, and the number of articles in the category, and take the categories with high scores as the user's preference categories.
  • Step 4: Among the recently read articles with high preference scores, find the coarse-grained segmented words that occur with high frequency in each article, use those words as the user's search recommendation words, and correct and update the search recommendation words in the user database.
  • Step 5: Continue to collect data on the user's reading behavior for the ranked items pushed to the user, including the number of clicks on matched articles, the length of reading, and whether an article was saved, and use it as the input of Step 1.
  • an early position in the browsing order, multiple clicks, a long reading time, and saving an article indicate that the user likes the article; the opposite indicates that the user does not like it.
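  • To make the BP neural network of Step 2 concrete, the following is a minimal single-hidden-layer network trained by back-propagation in plain Python. It is an illustrative sketch only: the layer sizes, sigmoid activation, squared-error loss, learning rate, and the four normalized input features are all assumptions, not details from the disclosure.

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNetwork:
    """One hidden layer, one output in (0, 1) read as a preference score."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x: List[float]) -> Tuple[List[float], float]:
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x: List[float], target: float, lr: float = 0.5) -> float:
        """One gradient-descent update; returns the loss before the update."""
        h, y = self.forward(x)
        # Gradient of 0.5 * (y - target)^2 through the output sigmoid.
        delta_out = (y - target) * y * (1.0 - y)
        # Propagate the error back to the hidden layer using the old output weights.
        delta_hidden = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(self.w2, h)]
        for j, hj in enumerate(h):
            self.w2[j] -= lr * delta_out * hj
        self.b2 -= lr * delta_out
        for j, dj in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj
        return 0.5 * (y - target) ** 2


# Hypothetical normalized features: browsing rank, read count, reading time, saved flag.
features = [0.9, 0.6, 0.8, 1.0]
net = TinyBPNetwork(n_in=4, n_hidden=3)
losses = [net.train_step(features, target=1.0) for _ in range(200)]
```

  • Repeating `train_step` on a liked article drives the predicted preference score toward 1, so the recorded loss shrinks over the 200 updates. A production model would be trained on many users and articles and would encode categorical user information (position, customer type) as additional inputs.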
  • in this way, the user's search recommendation words and preference classification are updated by the BP neural network from the user's historical behavior and user information.
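  • The update of search recommendation words in Step 4 can be sketched as a frequency count over the coarse-grained segmented words of well-liked articles. The cut-offs `top_n` and `min_count`, the function name, and the sample words are hypothetical.

```python
from collections import Counter
from typing import List


def recommend_words(liked_articles: List[List[str]],
                    top_n: int = 3,
                    min_count: int = 2) -> List[str]:
    """Return the most frequent words across articles with high preference scores."""
    counts = Counter(word for words in liked_articles for word in words)
    return [word for word, n in counts.most_common(top_n) if n >= min_count]


liked = [
    ["broadband", "fee"],
    ["broadband", "installation"],
    ["fee", "broadband"],
]
new_recommendation_words = recommend_words(liked)
```

  • For the sample data the result is "broadband" followed by "fee"; "installation" appears only once and is filtered out. These words would then replace or be merged into the search recommendation words stored for the user.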
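  • FIG. 4 is a schematic diagram of a scenario implementing the search recommendation system provided by this embodiment. As shown in FIG. 4, the search recommendation system includes an index word obtaining module 201, a user preference acquisition module 202, and a search module 203.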
  • the index word obtaining module 201 includes a search request obtaining module 2011 and an index word matching module 2012.
  • the search request obtaining module 2011 is used to receive the user's search sentence, and the index word matching module 2012 is used to extract the index words and classifications from the input search sentence.
  • index words and classifications are generated as follows: the search sentence is matched against a preset dictionary tree, and the index words contained in the search sentence, together with the classification information corresponding to those index words, are extracted from the dictionary tree.
  • the user preference acquisition module 202 includes a search recommendation word acquisition module 2021 and a preference classification acquisition module 2022.
  • the search recommendation word acquisition module 2021 is used to obtain the user's search recommendation words from the database according to the user's ID information; the preference classification acquisition module 2022 is used to obtain the user's preference classification from the database according to the user's ID information, and the preference classification serves as one basis for the subsequent ranking of search results.
  • the search module 203 includes a synonym acquisition module 2031 and a matching search module 2032.
  • the synonym acquisition module 2031 is used to obtain the synonyms of the index words extracted from the search sentence; the matching search module 2032 is used to search and match in the knowledge database according to the index words, the synonyms, and the user's search recommendation words to obtain search results.
  • the search recommendation system further includes a ranking module 204, which can be divided into a weight marking module 2041, a relevance scoring module 2042, and a ranking recommendation module 2043.
  • the weight marking module 2041 is used to assign different weights to index words, synonyms, and the user's search recommendation words; the relevance scoring module 2042 is used to score the relevance of the search results according to the index words, synonyms, and search recommendation words and their respective weights, and the resulting scores are used to sort the search results.
  • the ranking recommendation module 2043 is used to sort the search results from the highest score to the lowest according to their relevance scores and the user's preference classification to form the recommended results, and to return the recommended results to the user.
  • the search recommendation system further includes a user preference screening module 205, which includes a user information acquisition module 2051, a browsing history acquisition module 2052, a user preference analysis module 2053, and a user preference correction module 2054.
  • the user information acquisition module 2051 is used to collect the user's basic information; the browsing history acquisition module 2052 is used to obtain the user's article search and reading history (such as the order in which articles were browsed, the number of times they were read, and whether they were added to favorites).
  • the user preference analysis module 2053 is used to predict, with the BP neural network, the user's degree of preference for articles from the user information and article reading history, extract the articles with high preference, combine the number of preferred articles with the degree of preference, and extract the preference classification. Search recommendation words are screened out by predetermined screening principles.
  • the user preference correction module 2054 is used to update the preference classification and search recommendation words in the user database according to the analysis and prediction results of the BP neural network.
  • the search recommendation system further includes a knowledge base classification and indexing module 206, which can be divided into a knowledge base classification module 2061, a knowledge base word segmentation module 2062, and a dictionary tree construction module 2063.
  • the knowledge base classification module 2061 is used to classify the articles in the knowledge base by business; the knowledge base word segmentation module 2062 is used to extract the professional vocabulary and other commonly used custom vocabulary from the knowledge base articles as index words; the dictionary tree construction module 2063 is used to construct the dictionary tree according to the classification of the indexed articles.
  • FIG. 5 is a schematic structural block diagram of a computer device provided by an embodiment of the present disclosure.
  • a computer device 300 includes a processor 301 and a memory 302, and the processor 301 and the memory 302 are connected through a bus 303, such as an I2C (Inter-integrated Circuit) bus.
  • the processor 301 is used to provide computing and control capabilities to support the operation of the entire computer device.
  • the processor 301 can be a central processing unit (Central Processing Unit, CPU), and the processor 301 can also be other general-purpose processors, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC ), Field-Programmable Gate Array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 302 may be a Flash chip, a read-only memory (ROM, Read-Only Memory) disk, an optical disk, a U disk, or a mobile hard disk.
  • FIG. 5 is only a block diagram of a partial structure related to the embodiment of the present disclosure, and does not constitute a limitation on the computer equipment to which the embodiment of the present disclosure is applied.
  • the computer device may include more or fewer components than shown in the figures, or combine certain components, or have a different arrangement of components.
  • the processor is configured to run a computer program stored in the memory, and implement any search and recommendation method provided by the embodiments of the present disclosure when executing the computer program.
  • the processor is configured to run a computer program stored in the memory, and to implement the following steps when executing the computer program: obtaining the index words corresponding to the search sentence input by the user; obtaining the user's search recommendation words, where the search recommendation words are used to associate the user's historical search behavior; and obtaining search results according to the index words and the search recommendation words.
  • when implementing the search recommendation method, the processor is configured to: perform relevance scoring on the search results, and sort the search results according to the resulting relevance scores to obtain recommendation results.
  • when implementing the search recommendation method, the processor is configured to: obtain the user's preference classification, and optimize the recommendation results according to the preference classification to obtain optimized recommendation results, where optimizing the recommendation results according to the preference classification includes weighting the relevance scores of the search results that belong to the preference classification and sorting the search results according to the resulting weighted relevance scores.
  • when implementing the search recommendation method, the processor is configured to: collect user information and search history; analyze the user information and search history with a preset BP neural network model to obtain the user's search preference; and update the user's search recommendation words and preference classification according to the search preference.
  • when obtaining the index words corresponding to the search sentence input by the user, the processor is configured to: perform word segmentation on the search sentence and match it against the preset dictionary tree to obtain the index words corresponding to the search sentence.
  • when implementing the search recommendation method, the processor is configured to: classify the articles in the preset knowledge base, and construct the preset dictionary tree according to the word segmentation results of the article content and the categories to which the articles belong.
  • when obtaining the search results according to the index words and the search recommendation words, the processor is configured to: search a preset relational database for synonyms associated with the index words, and obtain the search results according to the index words, the synonyms, and the search recommendation words.
  • when performing relevance scoring on the search results, the processor is configured to: set different weights for index words, synonyms, and search recommendation words, and score the relevance of the search results accordingly to obtain their relevance scores.
  • an embodiment of the present disclosure also provides a storage medium for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of any search recommendation method provided in the present disclosure.
  • the storage medium may be an internal storage unit of the computer device in the foregoing embodiments, such as a hard disk or memory of the computer device.
  • the storage medium can also be an external storage device of the computer device, such as a plug-in hard disk equipped on the computer device, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash memory card (Flash Card), etc.
  • the functional modules/units in the system, and the device can be implemented as software, firmware, hardware, and an appropriate combination thereof.
  • the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation.
  • Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Abstract

The embodiments of the present disclosure provide a search recommendation method, a search recommendation system, a computer device and a storage medium, and belong to the technical field of search. The method comprises: acquiring an index term corresponding to a search phrase inputted by a user; obtaining a recommended search term for the user, wherein the recommended search term is used for linking to historical search behavior of the user; and according to the index term and the recommended search term, obtaining a search result.

Description

Search recommendation method, search recommendation system, computer device and storage medium
Cross-Reference to Related Applications
This disclosure claims priority to Chinese patent application CN202210118904.9, entitled "Search Recommendation Method, Search Recommendation System, Computer Device and Storage Medium" and filed on February 8, 2022, the entire contents of which are incorporated into this disclosure by reference.
Technical Field
The present disclosure relates to the technical field of search, and in particular to a search recommendation method, a search recommendation system, a computer device, and a storage medium.
Background
With the continuous development of the telecommunications industry, business knowledge is constantly updated. To obtain relevant business information, telecommunications customer service personnel often search through a unified knowledge management system and, influenced by factors such as job position and habit, frequently search for the same type of knowledge. Accurate and efficient retrieval of business information, which recommends to the searcher the target articles that best meet the search expectations, is therefore very important.
The current search ranking method is as follows: after the user enters a keyword as a search term, the system's back-end server directly matches the search term against the retrieval terms in the knowledge base to obtain multiple target articles. A ranking score is then calculated for each article according to its "relevance score", and the articles are sorted by score and recommended to the user. The relevance score represents how well the user's search term fits the article and is calculated by the system server using a specific algorithm. However, when a user's search sentence does not match the search keywords, the user's search intent cannot be accurately determined, and the search results fail to satisfy the user.
Summary
The main purpose of the embodiments of the present disclosure is to provide a search recommendation method, a search recommendation system, a computer device, and a storage medium.
An embodiment of the present disclosure provides a search recommendation method, including: obtaining index words corresponding to a search sentence input by a user; obtaining search recommendation words of the user, the search recommendation words being used to associate the user's historical search behavior; and obtaining search results according to the index words and the search recommendation words.
An embodiment of the present disclosure also provides a search recommendation system, including: an index word acquisition module, configured to obtain index words corresponding to a search sentence input by a user; a user preference module, configured to obtain search recommendation words of the user, the search recommendation words being used to associate the user's historical search behavior; and a search module, configured to obtain search results according to the index words and the search recommendation words.
An embodiment of the present disclosure also provides a computer device. The computer device includes a processor, a memory, a computer program stored on the memory and executable by the processor, and a data bus for connection and communication between the processor and the memory, wherein the computer program, when executed by the processor, implements the steps of any one of the search recommendation methods provided in this specification.
An embodiment of the present disclosure also provides a storage medium for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of any one of the search recommendation methods provided in this specification.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a search recommendation method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of a search recommendation method in the telecommunications industry according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the dictionary tree constructed in Embodiment 1 of the present disclosure;
FIG. 4 is a schematic structural diagram of a search recommendation system provided by an embodiment of the present disclosure;
FIG. 5 is a schematic structural block diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present disclosure without creative effort fall within the protection scope of the present disclosure.
The flowcharts shown in the drawings are only illustrations; they need not include all contents and operations/steps, nor must the steps be performed in the order described. For example, some operations/steps can be decomposed, combined or partly merged, so the actual order of execution may change according to the actual situation.
The terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
Embodiments of the present disclosure provide a search recommendation method, system, computer device, and storage medium that address the problem that existing knowledge search cannot be associated with the user's historical behavior and cannot accurately judge the user's search intent, which results in insufficiently accurate search results; the user experience is thereby improved. The search recommendation method can be applied to a mobile terminal, and the mobile terminal may include electronic devices such as mobile phones, tablet computers, notebook computers, desktop computers, personal digital assistants, and wearable devices.
Some embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. Where no conflict arises, the following embodiments and the features in the embodiments can be combined with each other.
Please refer to FIG. 1, which is a schematic flowchart of a search recommendation method provided by an embodiment of the present disclosure.
As shown in FIG. 1, the search recommendation method includes steps S101 to S103.
Step S101: obtain the index words corresponding to the search sentence input by the user.
In one embodiment, word segmentation matching is performed between the search sentence input by the user and a preset dictionary tree to obtain the index words corresponding to the search sentence. The preset dictionary tree is constructed first: the articles in a preset knowledge base are classified, and the dictionary tree is constructed from the word segmentation results of the article contents and the categories to which the articles belong. In one embodiment, the business category corresponding to an index word can also be obtained from the categories in the preset dictionary tree.
The establishment of a classified knowledge index in a specific telecommunications scenario is taken as an example below; the present disclosure is also applicable to other business scenarios.
In one embodiment, telecommunications terminology and articles are collected first, and the articles are classified by business. Telecommunications terminology and common terms are added to the lexicon, and the content of each knowledge article is segmented at both coarse and fine granularity and de-duplicated to obtain the word segmentation results. Coarse-grained segmentation means that when a sentence contains a custom word commonly used in the telecommunications industry, the sentence is segmented according to that custom word. Different types of knowledge base indexes are built according to the article classification, a dictionary tree is constructed, and the word segmentation results and article classification information are saved in the dictionary tree. The user's search sentence is then matched against the preset dictionary tree to obtain the index words that match the search sentence.
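The dictionary-tree matching described above can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: the class and method names (`TrieNode`, `DictionaryTree`, `insert`, `match`) are hypothetical, forward longest matching is an assumed matching strategy, and the English words stand in for the segmented terms of the telecommunications example.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    categories: set = field(default_factory=set)  # filled only where a word ends
    is_word: bool = False


class DictionaryTree:
    """Character-level dictionary tree storing words and their article categories."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, category):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
        node.categories.add(category)

    def match(self, sentence):
        """Forward longest matching (assumed); returns [(index_word, categories), ...]."""
        results, i = [], 0
        while i < len(sentence):
            node, j, best = self.root, i, None
            while j < len(sentence) and sentence[j] in node.children:
                node = node.children[sentence[j]]
                j += 1
                if node.is_word:
                    best = (j, node)  # remember the longest word seen so far
            if best is None:
                i += 1  # no stored word starts here; skip one character
            else:
                end, hit = best
                results.append((sentence[i:end], set(hit.categories)))
                i = end
        return results


# Segmentation results per category (illustrative data).
segmented = {
    "basic telecom service": ["internet", "domestic", "data", "transmission"],
    "value-added telecom service": ["internet", "data", "center", "data center", "call"],
}
tree = DictionaryTree()
for category, words in segmented.items():
    for word in words:
        tree.insert(word, category)

index_words = tree.match("internet data")
print(index_words)
```

Storing the category set at the node where a word ends lets a single lookup return both the index word and the business categories it belongs to.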
For example, the knowledge articles may be segmented with the ANSJ tokenizer.
Step S102: obtain the user's search recommendation words, which are used to associate the user's historical search behavior.
In one embodiment, the search recommendation words preset for the user can be obtained from the user database according to the user's user ID and used for the search together with the index words matched from the search sentence. Each search can thus be associated with the user's historical search behavior: on the one hand, the input information of the search is expanded; on the other hand, relying more on the user's search recommendation words brings the search results closer to the user's search intent.
In one embodiment, to better predict and analyze the user's search preferences, the user's historical searches and article reading behavior can be analyzed with a neural network, and the user's search recommendation words are updated and maintained according to the analysis results. In one embodiment, the user's information and search and reading data are collected, the user's information and search history are analyzed with a preset BP neural network model to obtain the user's search preferences, and the user's search recommendation words are updated according to the search preferences.
Step S103: obtain search results according to the index words and the search recommendation words.
The articles in the knowledge base are searched according to the index words and the search recommendation words. In one embodiment, the required search results are obtained by filtering out the articles in the knowledge base that do not contain the index words and the search recommendation words.
In one embodiment, to further expand the keywords used for the search, synonyms associated with the index words can also be looked up in a preset relational database, and matching search results are obtained from the preset knowledge base according to the index words, the synonyms and the search recommendation words.
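The term expansion and filtering in step S103 might look like the sketch below. The function names, the in-memory `synonyms` mapping (standing in for the relational database), and the synonym "web" are hypothetical. The sketch also assumes that an article is kept when it contains at least one of the search terms, which the text does not state explicitly.

```python
def expand_terms(index_words, synonyms, recommended_words):
    """Collect index words, their synonyms, and recommendation words without duplicates."""
    terms = []
    for word in index_words:
        for term in [word, *synonyms.get(word, [])]:
            if term not in terms:
                terms.append(term)
    for word in recommended_words:
        if word not in terms:
            terms.append(word)
    return terms


def search(knowledge_base, terms):
    """Keep articles whose segmented words contain at least one term (assumption)."""
    return [title for title, words in knowledge_base.items()
            if any(term in words for term in terms)]


# Illustrative knowledge base: article title -> segmented words.
knowledge_base = {
    "Internet domestic data transmission": ["internet", "domestic", "data", "transmission"],
    "Internet data center": ["internet", "data", "center", "data center"],
    "Call center": ["call", "center"],
}
synonyms = {"internet": ["web"]}  # hypothetical synonym table

terms = expand_terms(["internet", "data"], synonyms, ["domestic"])
results = search(knowledge_base, terms)
print(terms, results)
```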
To further improve the user's satisfaction with the search results, the search results can also be sorted so that the results more relevant to the user's search intent are displayed first and the user can locate the required knowledge articles more quickly.
In one embodiment, relevance scoring is performed on the search results, and the search results are sorted according to the obtained relevance scores to obtain the recommendation results.
The search results need to be sorted so that the articles that better meet the user's search intent are ranked higher. Relevance scoring is therefore performed on the search results; the higher an article's score, the closer it is to the user's search intent. The search results are sorted in descending order of score, and the sorted search results, that is, the recommendation results, are displayed to the user.
In one embodiment, to better reflect the importance of each search word in a given search, the search results are scored for relevance by setting different weights for the index words, the synonyms and the search recommendation words. In one embodiment, the index words are the search entity words matched from the user's search sentence, so they are given the highest weight; the synonyms of the index words are given the second highest weight; and the user's search recommendation words represent the user's historical search behavior and may have little connection with the current search, so they are given the third highest weight. Following the policy that the weight of the index words > the weight of the synonyms > the weight of the search recommendation words, the articles in all search results are scored for relevance according to the three kinds of words and their respective weights, and the articles are sorted according to the relevance scores.
For example, the relevance scores of the articles in the search results can be calculated based on TF-IDF (Term Frequency-Inverse Document Frequency).
In one embodiment, the recommendation results are optimized according to the preference categories to obtain optimized recommendation results. Optimizing the recommendation results according to the preference categories includes: weighting the relevance scores of the search results that belong to a preference category, and sorting the search results according to the weighted relevance scores to obtain the optimized recommendation results.
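The preference-based optimization can be illustrated with a short sketch. The function name, the multiplicative `boost` factor and its value are assumptions made for illustration; the disclosure only states that the scores of results in a preference category are weighted.

```python
def rerank_with_preference(scores, article_category, preferred, boost=1.5):
    """Weight the scores of articles in preferred categories, then sort descending."""
    weighted = {
        article: score * boost if article_category[article] in preferred else score
        for article, score in scores.items()
    }
    return sorted(weighted, key=weighted.get, reverse=True)


# Illustrative relevance scores and categories.
scores = {"article A": 0.30, "article B": 0.25}
article_category = {
    "article A": "basic telecom service",
    "article B": "value-added telecom service",
}

ranking = rerank_with_preference(scores, article_category, {"value-added telecom service"})
print(ranking)
```

With these inputs, article B moves ahead of article A because its weighted score (0.375) exceeds the unweighted 0.30.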
The user's preference categories can be set in advance by the user according to the user's own business needs. To maintain the preference categories more intelligently, the user's historical behavior and information in each dimension can also be analyzed with a neural network, so that the user's preference categories are maintained and updated automatically.
For example, collecting the user's information includes collecting information about the user in each dimension, such as the user's position, the type of customer served, and working hours. The user's search history data includes the order in which the user browses articles, the number of times articles are read, the time spent reading articles, whether an article has been added to favorites, and so on. This data is used as the neurons of the input layer of the neural network.
A BP neural network model is established for each category. The above information in each dimension is used as the input of the BP neural network, and the hidden layer uses an activation function to calculate and output the user's preference score for recently read articles in the most recent period. The calculation formula is:
[formula image not reproduced in the source text]
where x_i is the i-th input neuron data, w_i is the weight of the i-th input neuron data, and b_i is the offset of the hidden layer.
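The formula itself is not present in this text. As a hedged reading only, a common form of a hidden-layer computation that is consistent with the symbols defined above is the activation of a weighted sum plus an offset. Here $f$ (the activation function), $n$ (the number of inputs) and $y$ (the output preference score) are assumed symbols, and the offset is written as a single $b$ although the text writes it as $b_i$:

```latex
y = f\left(\sum_{i=1}^{n} w_i x_i + b\right)
```

The disclosure does not name the activation function; a sigmoid, $f(z) = 1/(1+e^{-z})$, is a frequent choice in BP networks and would keep the score between 0 and 1.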
In this way, the user's preference scores for recently read articles are obtained from the BP neural network. The articles with high preference scores are collected, and a preference score for each category is calculated with a custom algorithm from the category of each article, its preference score, and the number of articles per category; the categories with high scores are taken as the user's preference categories.
The recently read articles with high preference scores can also be analyzed to find the coarse-grained words that occur most frequently in them, and these words are used as the user's search recommendation words to revise the search recommendation words in the user database.
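Deriving search recommendation words from high-preference articles can be sketched as below. The function name, the score threshold and the `top_k` cutoff are hypothetical parameters, and the article words are invented sample data; the disclosure only says that frequent coarse-grained words of high-scoring articles become recommendation words.

```python
from collections import Counter


def update_recommended_words(article_words, preference_scores,
                             score_threshold=0.5, top_k=1):
    """Count coarse-grained words in high-preference articles and keep the most frequent."""
    counts = Counter()
    for article, score in preference_scores.items():
        if score >= score_threshold:
            counts.update(article_words[article])
    return [word for word, _ in counts.most_common(top_k)]


# Illustrative data: coarse-grained words per article and preference scores.
article_words = {
    "a1": ["broadband", "broadband", "billing"],
    "a2": ["broadband", "roaming"],
    "a3": ["roaming", "roaming", "roaming"],
}
preference_scores = {"a1": 0.9, "a2": 0.8, "a3": 0.1}

recommended = update_recommended_words(article_words, preference_scores)
print(recommended)
```

Article a3 is ignored because its preference score is below the threshold, so "roaming" is counted only once and "broadband" is selected.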
Through deep learning on the user's search and reading data, the user's search preferences are predicted and analyzed, and the analysis results are used to revise the user's search preferences, so that the preferences are adjusted automatically according to the search history.
In one embodiment, the user's access frequency can also be recorded for the top-ranked articles in the user's historical searches. If the user rarely or never accesses an article, the article is considered not to match the user's search intent, and its score is reduced during sorting.
In one embodiment, the categories of the articles that the user rarely accesses can also be analyzed to identify the categories the user dislikes, and the disliked categories are used as a basis for reducing scores when the search results are sorted.
For example, the access frequency is recorded for the first 100 articles recommended to the user: the ID of the article in the database is associated with the user ID, and the number of accesses is counted. The access count is retrieved during sorting, and when it is very small or 0, the score of the corresponding article is reduced.
For example, for the articles whose access count is very small or 0, the categories to which these rarely accessed articles belong are obtained. When the number of rarely accessed articles in a category exceeds a certain number, the category is regarded as a non-preferred category of the user, and the scores of the search results that belong to a non-preferred category are reduced when the search results are sorted.
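The two score reductions described above can be sketched together. The function name, the thresholds `low_visits` and `category_limit`, and the fixed `penalty` value are assumptions for illustration; the disclosure does not specify how much a score is reduced.

```python
from collections import Counter


def apply_access_penalty(scores, access_counts, article_category,
                         low_visits=0, category_limit=1, penalty=0.25):
    """Reduce scores of rarely accessed articles and of articles in non-preferred categories."""
    low = {a for a in scores if access_counts.get(a, 0) <= low_visits}
    per_category = Counter(article_category[a] for a in low)
    # A category is non-preferred when its rarely accessed articles exceed the limit.
    disliked = {c for c, n in per_category.items() if n > category_limit}
    adjusted = {}
    for article, score in scores.items():
        if article in low:
            score -= penalty
        if article_category[article] in disliked:
            score -= penalty
        adjusted[article] = score
    return adjusted, disliked


# Illustrative data for one user.
scores = {"a": 2.0, "b": 1.5, "c": 1.0, "d": 0.75}
access_counts = {"a": 5, "b": 0, "c": 0, "d": 1}
article_category = {"a": "X", "b": "Y", "c": "Y", "d": "Y"}

adjusted, disliked = apply_access_penalty(scores, access_counts, article_category)
print(adjusted, disliked)
```

Article d is accessed often enough to avoid the per-article reduction, but it still loses points because it belongs to the non-preferred category Y.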
The articles in the search results are sorted by score from high to low, and the sorted search results are returned to the user as the recommendation results. Because the search recommendation words and preference categories, which represent the user's search preferences, are used together with the index words to score the relevance of the articles and to sort them by score, the articles that better match the user's search intent are displayed first, so the user can find the desired article more quickly.
The search recommendation method provided by the embodiments of the present disclosure extracts the index words contained in the search sentence based on the dictionary tree of the knowledge base and obtains the user's search recommendation words and/or search preferences for the search. The search sentence is thereby expanded, and more results that match the user's search intent are obtained. In addition, the search recommendation words and preference categories, which represent the user's search preferences, are used together with the index words to score the relevance of the articles and to sort them by score, so that the articles that better match the user's search intent are displayed first. In one embodiment, the user's search preferences are predicted and analyzed through deep learning on the user's search and reading data, and the analysis results are used to revise the user's search preferences, so that the preferences are adjusted automatically according to the search history. Search recommendation results that better match the user's intent are obtained, and user satisfaction is improved.
The technical solutions of the embodiments of the present disclosure improve the search results by associating the user's historical search preferences with the search and optimize the ranking of the search results, so that the user's search intent can be judged more accurately and the user experience is improved.
In addition, the present disclosure provides two specific embodiments of the search recommendation method, as follows.
Embodiment 1
Please refer to FIG. 2 and FIG. 3. FIG. 2 is a schematic flowchart of a search recommendation method in the telecommunications industry according to an embodiment of the present disclosure, and FIG. 3 is a schematic diagram of the dictionary tree constructed in Embodiment 1. For simplicity, this embodiment describes search recommendation in one specific scenario; the present disclosure is also applicable to search recommendation in other scenarios.
Step 1: collect knowledge articles of the telecommunications industry, for example "Internet domestic data transmission", "Internet data center" and "Call center", and classify the articles by content: "Internet domestic data transmission" belongs to basic telecommunications services, while "Internet data center" and "Call center" belong to value-added telecommunications services.
Step 2: segment the knowledge articles. The segmentation results for "Internet domestic data transmission" include "Internet", "domestic", "data" and "transmission"; the segmentation results for "Internet data center" include "Internet", "data", "center" and "data center"; the segmentation results for "Call center" include "call" and "center". The segmentation results for these knowledge articles are shown in Table 1.
Table 1

依据上述表格所示的分词结果及分类构建如图3所示的字典树。According to the word segmentation results and classification shown in the above table, a dictionary tree as shown in Figure 3 is constructed.
步骤3,获取用户ID(如Test1),并获取用户输入的搜索语句,例如“互联网数据”。Step 3, obtain the user ID (such as Test1), and obtain the search sentence entered by the user, such as "Internet data".
步骤4,用户的搜索语句经过字典树节点进行匹配,得到索引词为“互联网”、“数据”,和匹配的分类“基础电信业务”、“增值电信业务”。Step 4, the user's search sentence is matched through the nodes of the dictionary tree, and the index words "Internet" and "data" are obtained, and the matching categories are "basic telecommunication services" and "value-added telecommunication services".
Step 5: Use the user ID to retrieve the search recommendation words, for example "domestic", from the user's search preferences stored in the database.
Step 6: Perform a matching search in the indexes of the different categories according to the categories to which the index words and search recommendation words belong. The matched articles include "Internet Domestic Data Transmission" and "Internet Data Center".
Step 7: Set different weights for the index words and the search recommendation words, with the index words weighted more heavily than the search recommendation words, and score the matched articles with the TF-IDF scoring algorithm according to these weights. The term frequency TF of a word W in a document D is the ratio of the number of occurrences of W in D, COUNT(W,D), to the total number of words in D, SIZE(D): TF(W,D)=COUNT(W,D)/SIZE(D).
Table 2 shows the statistics for "Internet Domestic Data Transmission", "Internet Data Center", and "Call Center".
Table 2
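The term frequency defined in Step 7 can be computed as in the following sketch. The function name `term_frequency` is hypothetical, and the token list is simply the segmentation of one article title, used here as a stand-in for a full article body.

```python
def term_frequency(word, doc_tokens):
    """TF(W, D) = COUNT(W, D) / SIZE(D); defined as 0.0 for an empty document."""
    if not doc_tokens:
        return 0.0
    return doc_tokens.count(word) / len(doc_tokens)


# Illustrative document: the segmented title "Internet Domestic Data Transmission".
doc = ["internet", "domestic", "data", "transmission"]
tf_data = term_frequency("data", doc)  # one occurrence among four tokens
tf_call = term_frequency("call", doc)  # the word does not occur in this document
```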
The inverse document frequency IDF of a word W over the whole document collection is the logarithm of the ratio of the total number of documents N to the number of documents in which W appears, DOCS(W,D): IDF=log(N/DOCS(W,D)).
For example, when the total number of documents is 10000, the statistics are as shown in Table 3.
Table 3
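The inverse document frequency can be sketched as follows. The text does not state the base of the logarithm, so base 10 is an assumption here, and the document count of 100 is an illustrative value rather than a figure taken from Table 3.

```python
import math


def inverse_document_frequency(total_docs, docs_with_word):
    """IDF = log(N / DOCS(W, D)), using a base-10 logarithm (an assumption)."""
    if docs_with_word <= 0:
        raise ValueError("the word must appear in at least one document")
    return math.log10(total_docs / docs_with_word)


# Hypothetical counts: 10000 documents in total, 100 of which contain the word.
idf_example = inverse_document_frequency(10000, 100)
# A rarer word (10 documents) receives a larger IDF than a common one (1000).
idf_rare = inverse_document_frequency(10000, 10)
idf_common = inverse_document_frequency(10000, 1000)
```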
From TF and IDF, the TF-IDF model computes, for each document D and a query string Q composed of the keywords W[1]...W[k], a value that represents the degree of match between Q and D:
tf-idf(q,d)=sum{i=1..k|tf-idf(w[i],d)}=sum{i=1..k|tf(w[i],d)*idf(w[i])}
In one embodiment, a weighting coefficient H is added, and the final query matching degree is:
tf-idf(q,d,H)=sum{i=1..k|H*tf-idf(w[i],d)}=sum{i=1..k|H*tf(w[i],d)*idf(w[i])}
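The weighted matching degree can be illustrated with a short sketch. It assumes that the coefficient H takes a separate value for each query word, as Step 7 suggests by weighting index words above search recommendation words, and that the logarithm is base 10. The function name `weighted_tfidf`, the weights, and the document frequencies are all hypothetical values chosen for illustration.

```python
import math
from collections import Counter


def weighted_tfidf(term_weights, doc_tokens, doc_freq, total_docs):
    """Sum H * tf(w, d) * idf(w) over the weighted query words."""
    counts = Counter(doc_tokens)
    size = len(doc_tokens)
    score = 0.0
    for word, weight in term_weights.items():
        docs_with_word = doc_freq.get(word, 0)
        if size == 0 or docs_with_word == 0:
            continue  # the word contributes nothing to this document
        tf = counts[word] / size
        idf = math.log10(total_docs / docs_with_word)
        score += weight * tf * idf
    return score


# Index words weigh 1.0 and the search recommendation word 0.5 (assumed values).
term_weights = {"internet": 1.0, "data": 1.0, "domestic": 0.5}
# Hypothetical document frequencies in a collection of 10000 documents.
doc_freq = {"internet": 1000, "data": 100, "domestic": 10}

doc_a = ["internet", "domestic", "data", "transmission"]
doc_b = ["internet", "data", "center"]
score_a = weighted_tfidf(term_weights, doc_a, doc_freq, 10000)
score_b = weighted_tfidf(term_weights, doc_b, doc_freq, 10000)
```

Under these assumed numbers the article containing the recommendation word "domestic" scores 1.125 against 1.0 for the other article, which shows how a search recommendation word can shift the ranking even with a lower weight.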
For example, the predetermined preference score ratios are shown in Table 4.
Table 4
The scores calculated with tf-idf(q,d,H) are shown in Table 5.
In one embodiment, the database is queried by user ID to obtain the user's preference category, for example "basic telecommunications services". Articles in the search results that belong to this preference category then receive bonus points, so "Internet Domestic Data Transmission" receives an additional bonus.
Table 5
According to the relevance calculation above, "Internet Domestic Data Transmission" scores higher than "Internet Data Center".
Step 8: Sort the search results by score in descending order to form a recommendation result list, and return the list to the user.
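The preference bonus and the final sort in Step 8 can be sketched as follows. The additive bonus of 0.5 and the input scores are assumed values, and `rank_results` is a hypothetical helper rather than a function named in the disclosure.

```python
BASIC = "basic telecommunications services"
VALUE_ADDED = "value-added telecommunications services"


def rank_results(scored_articles, preferred_category, bonus):
    """Add a bonus to articles in the preferred category, then sort descending.

    scored_articles is a list of (title, category, score) tuples.
    """
    adjusted = [
        (title, (score + bonus) if category == preferred_category else score)
        for title, category, score in scored_articles
    ]
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


# Hypothetical relevance scores for the two matched articles.
results = [
    ("Internet Data Center", VALUE_ADDED, 1.0),
    ("Internet Domestic Data Transmission", BASIC, 1.125),
]
recommendation_list = rank_results(results, BASIC, bonus=0.5)
```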
Embodiment 2
Embodiment 2 of the present disclosure applies the user preference screening method to a telecommunications customer service system; the present disclosure is also applicable to preference screening in other application scenarios.
Step 1: Collect information about the user along several dimensions, such as the user's job role (business consultation, fee inquiry, service handling, new service promotion, and so on), the type of customer served (household customers, government and enterprise customers, public telephones, wireless local telephony), and working hours, together with the user's article search and browsing data, including browsing order, number of times an article was read, reading time, and whether the article was bookmarked. Feed this information and the search and browsing data to the neurons of the input layer.
Step 2: Build a BP (back-propagation) neural network model for each category. The information collected above is used as the input of the BP neural network, the hidden layer applies an activation function to compute and output a preference score for articles read in the most recent period, and the user's preference score for recently read articles is obtained according to the BP neural network structure.
Step 3: Collect the recently read articles with high preference scores, compute a preference score for each category with a custom algorithm based on the category each article belongs to, its preference score, and the number of articles in the category, and take the high-scoring categories as the user's preference categories.
Step 4: Collect the recently read articles with high preference scores, identify the coarse-grained segmented words that occur with high frequency in those articles, use these words as the user's search recommendation words, and update the search recommendation words in the user database accordingly.
Step 5: Using the ranked articles pushed to the user, continue to collect data on the user's reading behavior, including the number of clicks on matched articles, reading duration, and whether an article was bookmarked, as input to Step 1. An article that is browsed early, clicked many times, read for a long time, or bookmarked indicates that the user likes it; otherwise, the user does not like it.
Finally, the user's search recommendation words and preference categories are updated by analyzing the user's historical behavior and user information with the BP neural network.
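The preference scoring in Steps 1 and 2 can be illustrated with a very small back-propagation network. This is only a sketch under stated assumptions: one hidden layer, sigmoid activations, squared-error loss, four normalized input features (browsing-order rank, read count, reading time, bookmarked flag), and made-up training samples. The disclosure does not specify the network size, the activation function, or the feature encoding, and the class name `BPNetwork` is hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BPNetwork:
    """One hidden layer, sigmoid activations, a single output in (0, 1)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0])))
                  for ws in self.w_hidden]
        output = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_step(self, x, target, lr=0.5):
        hidden, output = self.forward(x)
        # Error terms propagated backwards from the output to the hidden layer.
        delta_out = (output - target) * output * (1.0 - output)
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        for j, ws in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                ws[i] -= lr * delta_hidden[j] * v
        return 0.5 * (output - target) ** 2


# Made-up samples: [order rank, read count, reading time, bookmarked] -> liked?
samples = [
    ([1.0, 0.9, 0.8, 1.0], 1.0),
    ([0.8, 0.7, 0.9, 1.0], 1.0),
    ([0.1, 0.1, 0.2, 0.0], 0.0),
    ([0.2, 0.0, 0.1, 0.0], 0.0),
]
net = BPNetwork(n_inputs=4, n_hidden=3)
losses = [sum(net.train_step(x, t) for x, t in samples) for _ in range(500)]

liked_score = net.forward([0.9, 0.8, 0.9, 1.0])[1]
disliked_score = net.forward([0.1, 0.1, 0.1, 0.0])[1]
```

The output of the trained network plays the role of the article preference score: articles with strong engagement signals should score closer to 1 than articles the user ignored.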
Referring to FIG. 4, which is a schematic diagram of a scenario implementing the search recommendation system provided by this embodiment, the search recommendation system includes an index word acquisition module 201, a user preference acquisition module 202, and a search module 203.
The index word acquisition module 201 includes a search request acquisition module 2011 and an index word matching module 2012. The search request acquisition module 2011 receives the user's search sentence, and the index word matching module 2012 extracts the index words and categories from the input search sentence. Specifically, the search sentence is matched against a preset dictionary tree, and the index words contained in the search sentence, together with the category information corresponding to those index words, are extracted from the dictionary tree.
The user preference acquisition module 202 includes a search recommendation word acquisition module 2021 and a preference category acquisition module 2022. The search recommendation word acquisition module 2021 retrieves the user's search recommendation words from the database according to the user's ID information; the preference category acquisition module 2022 retrieves the user's preference category from the database according to the user's ID information, and this preference category serves as one of the bases for the subsequent ranking of search results.
The search module 203 includes a synonym acquisition module 2031 and a matching search module 2032. The synonym acquisition module 2031 obtains synonyms of the index words extracted from the search sentence; the matching search module 2032 searches the knowledge database with the index words, their synonyms, and the user's search recommendation words to obtain the search results.
In one embodiment, the search recommendation system further includes a ranking module 204, which is divided into a weight marking module 2041, a relevance scoring module 2042, and a ranking recommendation module 2043. The weight marking module 2041 assigns different weights to the index words, the synonyms, and the user's search recommendation words. The relevance scoring module 2042 scores the relevance of the search results according to the index words, synonyms, and search recommendation words and their respective weights; these scores serve as the basis for ranking the search results. The ranking recommendation module 2043 sorts the search results in descending order of score according to their relevance scores and the user's preference category to form the recommendation results, and returns the recommendation results to the user.
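The cooperation of the synonym acquisition module and the weight marking module can be sketched as a function that expands index words with synonyms and tags every query word with a weight. The synonym table, the synonym "web", and the three weight values are hypothetical; the disclosure only states that the three kinds of words receive different weights and that index words outweigh search recommendation words.

```python
def build_query_terms(index_words, recommended_words, synonym_table,
                      w_index=1.0, w_synonym=0.8, w_recommended=0.5):
    """Return a dict mapping each query word to its weight."""
    weights = {}

    def assign(word, weight):
        # A word that falls into several groups keeps its highest weight.
        weights[word] = max(weights.get(word, 0.0), weight)

    for word in index_words:
        assign(word, w_index)
        for synonym in synonym_table.get(word, []):
            assign(synonym, w_synonym)
    for word in recommended_words:
        assign(word, w_recommended)
    return weights


# A plain dict stands in for the preset relational database of synonyms.
synonym_table = {"internet": ["web"]}
query_terms = build_query_terms(["internet", "data"], ["domestic"], synonym_table)
```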
In one embodiment, the search recommendation system further includes a user preference screening module 205, which includes a user information acquisition module 2051, a browsing history acquisition module 2052, a user preference analysis module 2053, and a user preference correction module 2054. The user information acquisition module 2051 collects the user's basic information. The browsing history acquisition module 2052 obtains the historical data of the user's article searches and article views (such as the order in which articles were browsed, the number of times they were read, and whether they were bookmarked). The user preference analysis module 2053 uses the user's information and article browsing history with the BP neural network to predict the user's degree of preference for each article, extracts the articles with a high degree of preference, aggregates the number of preferred articles and their degrees of preference to derive the preference category, and selects search recommendation words from the preferred articles according to a predetermined screening rule. The user preference correction module 2054 updates the preference category and search recommendation words in the user database according to the analysis and prediction results of the BP neural network.
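The aggregation performed by the user preference analysis module can be illustrated as follows. The disclosure leaves the category scoring algorithm and the screening rule unspecified, so this sketch substitutes simple stand-ins: the category score is the sum of the preference scores of its well-liked articles, and the search recommendation words are the most frequent words in those articles. The threshold, the preference scores, and the name `update_preferences` are assumptions.

```python
from collections import Counter, defaultdict


def update_preferences(articles, threshold=0.7, top_words=2):
    """Derive (preference category, search recommendation words).

    articles is a list of dicts with 'category', 'tokens', and 'preference'.
    """
    liked = [a for a in articles if a["preference"] >= threshold]
    category_scores = defaultdict(float)
    word_counts = Counter()
    for article in liked:
        category_scores[article["category"]] += article["preference"]
        word_counts.update(article["tokens"])
    preferred = (max(category_scores, key=category_scores.get)
                 if category_scores else None)
    recommended = [word for word, _ in word_counts.most_common(top_words)]
    return preferred, recommended


# Hypothetical preference scores, as a trained network might produce them.
articles = [
    {"category": "basic telecommunications services", "preference": 0.9,
     "tokens": ["internet", "domestic", "data", "transmission"]},
    {"category": "value-added telecommunications services", "preference": 0.8,
     "tokens": ["internet", "data", "center"]},
    {"category": "value-added telecommunications services", "preference": 0.2,
     "tokens": ["call", "center"]},
]
preferred_category, recommended_words = update_preferences(articles)
```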
In one embodiment, the search recommendation system further includes a knowledge base classification and indexing module 206, which is divided into a knowledge base classification module 2061, a knowledge base word segmentation module 2062, and a dictionary tree construction module 2063. The knowledge base classification module 2061 classifies the articles in the knowledge base by service type; the knowledge base word segmentation module 2062 extracts technical terms and other commonly used custom words from the knowledge base articles as index words; the dictionary tree construction module 2063 constructs the dictionary tree from the index words and the categories of the articles they belong to.
Please refer to FIG. 5, which is a schematic structural block diagram of a computer device provided by an embodiment of the present disclosure.
As shown in FIG. 5, the computer device 300 includes a processor 301 and a memory 302, which are connected by a bus 303, for example an I2C (Inter-Integrated Circuit) bus.
In one embodiment, the processor 301 provides computing and control capabilities and supports the operation of the entire computer device. The processor 301 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
In one embodiment, the memory 302 may be a flash chip, a read-only memory (ROM), a magnetic disk, an optical disc, a USB flash drive, a portable hard disk, or the like.
Those skilled in the art will understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of the embodiments of the present disclosure, and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
The processor is configured to run a computer program stored in the memory and, when executing the computer program, to implement any search recommendation method provided by the embodiments of the present disclosure.
In one embodiment, the processor is configured to run a computer program stored in the memory and, when executing the computer program, to implement the following steps: obtaining index words corresponding to a search sentence input by a user; obtaining search recommendation words of the user, the search recommendation words being associated with the user's historical search behavior; and obtaining search results according to the index words and the search recommendation words.
In one embodiment, when implementing the search recommendation method, the processor is configured to: score the relevance of the search results, and sort the search results according to the resulting relevance scores to obtain recommendation results.
In one embodiment, when implementing the search recommendation method, the processor is configured to: obtain the user's preference category and optimize the recommendation results according to the preference category to obtain optimized recommendation results. Optimizing the recommendation results according to the preference category includes weighting the relevance scores of the search results that belong to the preference category, and sorting the search results according to the weighted relevance scores to obtain the optimized recommendation results.
In one embodiment, when implementing the search recommendation method, the processor is configured to: collect the user's information and search history, and analyze the user's information and search history with a preset BP neural network model to obtain the user's search preferences; and update the user's search recommendation words and preference category based on the search preferences.
In one embodiment, when obtaining the index words corresponding to the search sentence input by the user, the processor is configured to: perform word segmentation matching between the search sentence input by the user and a preset dictionary tree to obtain the index words corresponding to the search sentence.
In one embodiment, when implementing the search recommendation method, the processor is configured to: classify the articles in a preset knowledge base, and construct the preset dictionary tree from the results of segmenting the content of the articles into words and the categories to which the articles belong.
In one embodiment, when obtaining search results according to the index words and the search recommendation words, the processor is configured to: look up synonyms associated with the index words in a preset relational database; and obtain the search results according to the index words, the synonyms, and the search recommendation words.
In one embodiment, when scoring the relevance of the search results, the processor is configured to: set different weights for the index words, the synonyms, and the search recommendation words, and score the relevance of the search results to obtain the relevance scores of the search results.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the computer device described above may refer to the corresponding process in the foregoing embodiments of the search recommendation method, and is not repeated here.
An embodiment of the present disclosure further provides a storage medium for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of any search recommendation method provided in the description of the embodiments of the present disclosure.
The storage medium may be an internal storage unit of the computer device of the foregoing embodiments, such as a hard disk or memory of the computer device. The storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, may be implemented as software, firmware, hardware, or an appropriate combination thereof. In a hardware embodiment, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The term "and/or" as used in the description of the present disclosure and the appended claims refers to, and includes, any and all possible combinations of one or more of the associated listed items. Herein, the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or system that includes the element.
The serial numbers of the above embodiments of the present disclosure are for description only and do not indicate the relative merits of the embodiments. The above are only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed herein, and these modifications or replacements shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be determined by the protection scope of the claims.

Claims (11)

  1. A search recommendation method, comprising:
    obtaining index words corresponding to a search sentence input by a user;
    obtaining search recommendation words of the user, the search recommendation words being associated with the user's historical search behavior; and
    obtaining search results according to the index words and the search recommendation words.
  2. The search recommendation method according to claim 1, wherein, after obtaining the search results according to the index words and the search recommendation words, the method further comprises:
    scoring the relevance of the search results, and sorting the search results according to the resulting relevance scores to obtain recommendation results.
  3. The search recommendation method according to claim 2, wherein the method further comprises:
    obtaining a preference category of the user, and optimizing the recommendation results according to the preference category to obtain optimized recommendation results;
    wherein optimizing the recommendation results according to the preference category comprises:
    weighting the relevance scores of the search results that belong to the preference category, and sorting the search results according to the weighted relevance scores to obtain the optimized recommendation results.
  4. The search recommendation method according to claim 3, wherein the method further comprises:
    collecting information and search history of the user, and analyzing the information and search history of the user with a preset BP neural network model to obtain search preferences of the user; and
    updating the search recommendation words and the preference category of the user based on the search preferences.
  5. The search recommendation method according to claim 1, wherein obtaining the index words corresponding to the search sentence input by the user comprises:
    performing word segmentation matching between the search sentence input by the user and a preset dictionary tree to obtain the index words corresponding to the search sentence.
  6. The search recommendation method according to claim 5, wherein the method further comprises:
    classifying articles in a preset knowledge base, and constructing the preset dictionary tree from the results of segmenting the content of the articles into words and the categories to which the articles belong.
  7. The search recommendation method according to claim 2, wherein obtaining the search results according to the index words and the search recommendation words further comprises:
    looking up synonyms associated with the index words in a preset relational database; and
    obtaining the search results according to the index words, the synonyms, and the search recommendation words.
  8. The search recommendation method according to claim 7, wherein scoring the relevance of the search results comprises:
    setting different weights for the index words, the synonyms, and the search recommendation words, and scoring the relevance of the search results to obtain the relevance scores of the search results.
  9. A search recommendation system, comprising:
    an index word acquisition module, configured to obtain index words corresponding to a search sentence input by a user;
    a user preference module, configured to obtain search recommendation words of the user, the search recommendation words being associated with the user's historical search behavior; and
    a search module, configured to obtain search results according to the index words and the search recommendation words.
  10. A computer device, comprising a processor, a memory, a computer program stored on the memory and executable by the processor, and a data bus for connection and communication between the processor and the memory, wherein the computer program, when executed by the processor, implements the steps of the search recommendation method according to any one of claims 1 to 8.
  11. A storage medium for computer-readable storage, wherein the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the steps of the search recommendation method according to any one of claims 1 to 8.
PCT/CN2023/074947 2022-02-08 2023-02-08 Search recommendation method, search recommendation system, computer device and storage medium WO2023151576A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210118904.9 2022-02-08
CN202210118904.9A CN116610853A (en) 2022-02-08 2022-02-08 Search recommendation method, search recommendation system, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023151576A1 true WO2023151576A1 (en) 2023-08-17

Family

ID=87563626

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/074947 WO2023151576A1 (en) 2022-02-08 2023-02-08 Search recommendation method, search recommendation system, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN116610853A (en)
WO (1) WO2023151576A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117391824B (en) * 2023-12-11 2024-04-12 深圳须弥云图空间科技有限公司 Method and device for recommending articles based on large language model and search engine

Also Published As

Publication number Publication date
CN116610853A (en) 2023-08-18

Similar Documents

Publication Publication Date Title
US11663254B2 (en) System and engine for seeded clustering of news events
WO2019214245A1 (en) Information pushing method and apparatus, and terminal device and storage medium
JP4920023B2 (en) Inter-object competition index calculation method and system
US20160034514A1 (en) Providing search results based on an identified user interest and relevance matching
WO2021098648A1 (en) Text recommendation method, apparatus and device, and medium
CN109299383B (en) Method and device for generating recommended word, electronic equipment and storage medium
US9720979B2 (en) Method and system of identifying relevant content snippets that include additional information
WO2006108069A2 (en) Searching through content which is accessible through web-based forms
Im et al. Linked tag: image annotation using semantic relationships between image tags
WO2012129149A2 (en) Aggregating search results based on associating data instances with knowledge base entities
WO2012142553A2 (en) Identifying query formulation suggestions for low-match queries
KR20150016973A (en) Generating search results
CN110390094B (en) Method, electronic device and computer program product for classifying documents
WO2018176913A1 (en) Search method and apparatus, and non-temporary computer-readable storage medium
KR20180097120A (en) Method for searching electronic document and apparatus thereof
WO2023151576A1 (en) Search recommendation method, search recommendation system, computer device and storage medium
AU2018313274B2 (en) Diversity evaluation in genealogy search
CN111930949B (en) Search string processing method and device, computer readable medium and electronic equipment
US9336280B2 (en) Method for entity-driven alerts based on disambiguated features
Selvan et al. ASE: Automatic search engine for dynamic information retrieval
TWI483129B (en) Retrieval method and device
CN116610782B (en) Text retrieval method, device, electronic equipment and medium
CN112860940B (en) Music resource retrieval method based on sequential concept space on description logic knowledge base
KR101137491B1 (en) System and Method for Utilizing Personalized Tag Recommendation Model in Web Page Search
Lobo et al. A novel method for analyzing best pages generated by query term synonym combination

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23752356

Country of ref document: EP

Kind code of ref document: A1