CN116204688A - Method for recommending user search terms based on typing search terms - Google Patents

Info

Publication number
CN116204688A
Authority
CN
China
Prior art keywords
target
search
user
difference
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310483388.4A
Other languages
Chinese (zh)
Other versions
CN116204688B (en)
Inventor
李志洁
王鹏
陈拉拉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quantum Digital Technology Co ltd
Original Assignee
Quantum Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quantum Digital Technology Co ltd
Priority to CN202310483388.4A
Publication of CN116204688A
Application granted
Publication of CN116204688B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9035 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of electric digital data processing, and in particular to a method for recommending user search words based on typed search words. The method comprises the following steps: acquiring target typing information corresponding to a user to be recommended, and determining a target category corresponding to the target typing information; obtaining, according to the target category corresponding to the target typing information, a target prediction score corresponding to each candidate search word in a candidate search word set, wherein the candidate search word set comprises the search words, under the target category corresponding to the target typing information, of the user to be recommended and of each similar user in the similar user set corresponding to the user to be recommended; screening a search word set to be recommended from the candidate search word set according to the target prediction scores corresponding to the candidate search words; and recommending the search word set to be recommended to the user to be recommended. By performing data processing on the target typing information, the invention improves the accuracy of search word recommendation for the user, and it is applied to recommending search words to users.

Description

Method for recommending user search terms based on typing search terms
Technical Field
The invention relates to the technical field of electric digital data processing, in particular to a method for recommending user search words based on typing search words.
Background
With the development of science and technology, electronic devices of many kinds have entered people's daily life. To make these devices feel more intelligent, most of them recommend related content according to the user's search words, where a search word is a term that the user inputs when searching for content in a search engine. To improve the user experience, search words are often recommended once the user has input part of a search word; when the recommended search words contain the one the user needs, the user does not have to type the rest. Currently, search words are generally recommended to a user in the following way: the recommended search words are determined based on the user's historical search words.
However, this approach often has the following technical problem:
when the content that the user wants to search for belongs to a type that the user has never searched for before, it is often difficult to recommend search words accurately based on the user's historical search words, so the accuracy of search word recommendation for the user is low.
Disclosure of Invention
The summary of the invention is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. The summary of the invention is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In order to solve the technical problem of low accuracy of search word recommendation for users, the invention provides a method for recommending user search words based on typing the search words.
The invention provides a method for recommending user search words based on typing search words, which comprises the following steps:
acquiring target typing information corresponding to a user to be recommended, and determining a target category corresponding to the target typing information;
obtaining a target prediction score corresponding to each candidate search word in a candidate search word set according to the target category corresponding to the target typing information, wherein the candidate search word set comprises: the search words, under the target category corresponding to the target typing information, of the user to be recommended and of each similar user in the similar user set corresponding to the user to be recommended;
screening a search word set to be recommended from the candidate search word set according to target prediction scores corresponding to the candidate search words;
recommending the search word set to be recommended to the user to be recommended;
determining the set of similar users comprises the steps of:
acquiring a historical search information set corresponding to a user to be recommended and a historical search information set corresponding to each reference user in a reference user set;
classifying search words included in all the obtained historical search information sets to obtain a target category set;
determining target association degrees between every two search words included in all obtained historical search information sets;
determining target evaluation indexes of each target user under each target category according to all the obtained historical search information sets, the target category sets and the target association degrees among the search words to obtain a target evaluation matrix, wherein the target users are users to be recommended or reference users;
and screening a similar user set from the reference user set according to all the obtained historical search information sets and the target evaluation matrix.
Further, the classifying the search terms included in all the obtained historical search information sets to obtain a target category set includes:
inputting each search word included in all the obtained historical search information sets into a pre-trained target classification network to obtain the probability that the search word belongs to each preset category in a preset category set, taking that probability as the category probability of the search word under the preset category, and thereby obtaining a category probability set corresponding to the search word;
For each search word included in all obtained historical search information sets, screening out the maximum category probability from the category probability set corresponding to the search word, taking the maximum category probability as the target probability corresponding to the search word, and determining the preset category corresponding to the target probability corresponding to the search word as the target category corresponding to the search word;
and combining the target categories corresponding to all the search words included in all the obtained historical search information sets into a target category set.
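The category assignment described above can be sketched as follows. This is a minimal illustration that assumes the classification network has already produced a category probability set per search word; the names `assign_target_category`, `build_target_category_set` and the sample probabilities are hypothetical and do not come from the patent.

```python
def assign_target_category(category_probs):
    """Pick the preset category with the maximum category probability.

    category_probs: dict mapping preset category -> category probability.
    Returns (target category, target probability).
    """
    target_category = max(category_probs, key=category_probs.get)
    return target_category, category_probs[target_category]


def build_target_category_set(probs_by_word):
    """Assign a target category to every search word and collect the categories."""
    assignments = {
        word: assign_target_category(probs) for word, probs in probs_by_word.items()
    }
    target_categories = {category for category, _ in assignments.values()}
    return assignments, target_categories


# Hypothetical category probability sets, as a trained classifier might output them.
probs_by_word = {
    "laptop": {"electronics": 0.9, "food": 0.1},
    "noodles": {"electronics": 0.2, "food": 0.8},
}
assignments, target_categories = build_target_category_set(probs_by_word)
```

Each search word keeps both its target category and its target probability, since later steps use the target probability again.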
Further, the determining the target association degree between every two search words included in all the obtained historical search information sets includes:
according to the target probabilities corresponding to the two search words, determining the two search words as a first search word and a second search word respectively;
determining a first difference between the first search word and the second search word according to a first quantity, a second quantity and a third quantity, wherein the first quantity is the quantity of the historical search information which comprises the first search word and does not comprise the second search word in all the historical search information sets, the second quantity is the quantity of the historical search information which comprises the second search word and does not comprise the first search word in all the historical search information sets, the third quantity is the quantity of the historical search information which comprises both the first search word and the second search word in all the historical search information sets, the first quantity and the second quantity are positively correlated with the first difference, and the third quantity is negatively correlated with the first difference;
determining the absolute value of the difference between a second probability and the target probability corresponding to the first search word as a second difference between the first search word and the second search word, wherein the second probability is the category probability of the second search word under the target category corresponding to the first search word;
determining a third difference between the first search word and the second search word according to the first difference and the second difference between the first search word and the second search word, wherein the first difference and the second difference are positively correlated with the third difference;
encoding the first search word and the second search word to obtain first encoded data corresponding to the first search word and second encoded data corresponding to the second search word;
determining an edit distance between the first encoded data and the second encoded data as a fourth difference between the first search term and the second search term;
and determining the target association degree between the first search word and the second search word according to a fourth difference and a third difference between the first search word and the second search word, wherein the fourth difference and the third difference are in negative correlation with the target association degree.
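The text above fixes only the direction of each correlation, not the exact formulas. The sketch below picks one simple formula per step that satisfies those directions. Several points are assumptions made for illustration: the search word with the larger target probability is treated as the first search word, each piece of historical search information is modelled as a set of search words, the encoding is UTF-8, and the combining formulas (`(n1 + n2) / (n1 + n2 + n3 + 1)`, a plain sum, and `1 / (1 + d3 + d4)`) are arbitrary choices.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or bytes)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def association_degree(w1, w2, records, probs_by_word):
    """Target association degree between two search words (illustrative formulas)."""
    # Assumption: the word with the larger target probability is the "first" word.
    if max(probs_by_word[w2].values()) > max(probs_by_word[w1].values()):
        w1, w2 = w2, w1
    cat1 = max(probs_by_word[w1], key=probs_by_word[w1].get)

    n1 = sum(1 for r in records if w1 in r and w2 not in r)  # first quantity
    n2 = sum(1 for r in records if w2 in r and w1 not in r)  # second quantity
    n3 = sum(1 for r in records if w1 in r and w2 in r)      # third quantity

    d1 = (n1 + n2) / (n1 + n2 + n3 + 1)  # grows with n1 and n2, shrinks with n3
    d2 = abs(probs_by_word[w2][cat1] - probs_by_word[w1][cat1])
    d3 = d1 + d2
    d4 = edit_distance(w1.encode("utf-8"), w2.encode("utf-8"))
    return 1.0 / (1.0 + d3 + d4)  # decreases as d3 or d4 grows


# Hypothetical data: each record is the set of search words in one search history entry.
records = [{"laptop", "laptop bag"}, {"laptop"}, {"noodles"}]
probs_by_word = {
    "laptop": {"electronics": 0.9, "food": 0.1},
    "laptop bag": {"electronics": 0.8, "food": 0.2},
    "noodles": {"electronics": 0.2, "food": 0.8},
}
close_pair = association_degree("laptop", "laptop bag", records, probs_by_word)
far_pair = association_degree("laptop", "noodles", records, probs_by_word)
```

With these toy inputs, words that co-occur, share a category and are spelled similarly receive a higher association degree than unrelated words.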
Further, the determining the target evaluation index of each target user under each target category according to the obtained all historical search information sets, the target category sets and the target association degree between the search words comprises the following steps:
determining the proportion of a fourth quantity in a fifth quantity as an initial evaluation index of the target user under the target category, wherein the fourth quantity is the number of search words typed in by the target user under the target category that are included in all the historical search information sets, and the fifth quantity is the number of search words typed in by the target user that are included in all the historical search information sets;
determining a first relevance of the target user under the target category according to target relevance between each search word typed in by the target user and each search word in the target category, which are included by all historical search information sets, wherein the target relevance between each search word typed in by the target user and each search word in the target category is positively correlated with the first relevance;
determining a second association degree corresponding to the target category according to target association degrees between all search words typed in by all target users and all search words in the target category, wherein the target association degrees between all search words typed in by all target users and all search words in the target category are positively correlated with the second association degree;
determining the proportion of the first association degree of the target user under the target category in the second association degree corresponding to the target category as the third association degree of the target user under the target category;
determining a reference evaluation index corresponding to the target user according to the initial evaluation index of the target user under the target category in the target category set, wherein the initial evaluation index of the target user under the target category in the target category set is positively correlated with the reference evaluation index;
and determining a target evaluation index of the target user under the target category according to the reference evaluation index corresponding to the target user and a third association degree of the target user under the target category, wherein the reference evaluation index and the third association degree are positively correlated with the target evaluation index.
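One way to read the steps above is sketched below. The aggregation choices are assumptions, since the text states only correlations: the first and second association degrees are plain sums, the reference evaluation index is the sum of squared initial evaluation indexes over all categories, and the target evaluation index is the product of the reference evaluation index and the third association degree. The function `toy_assoc` is a hypothetical stand-in for the target association degree.

```python
def target_evaluation_index(user, category, user_words, word_category, assoc):
    """Target evaluation index of one user under one category (illustrative)."""
    words = user_words[user]
    categories = set(word_category.values())
    category_words = [w for w, c in word_category.items() if c == category]

    def initial_index(cat):
        # Share of the user's typed search words that fall under `cat`.
        return sum(1 for w in words if word_category[w] == cat) / len(words)

    first = sum(assoc(w, cw) for w in words for cw in category_words)
    second = sum(
        assoc(w, cw)
        for typed in user_words.values()
        for w in typed
        for cw in category_words
    )
    third = first / second if second else 0.0
    # Assumption: the reference index grows with every initial index.
    reference = sum(initial_index(c) ** 2 for c in categories)
    return reference * third


# Hypothetical search histories and category assignments.
user_words = {
    "u0": ["laptop", "mouse"],
    "u1": ["laptop", "noodles"],
    "u2": ["noodles", "rice"],
}
word_category = {
    "laptop": "electronics",
    "mouse": "electronics",
    "noodles": "food",
    "rice": "food",
}


def toy_assoc(a, b):
    # Stand-in association degree: high within a category, low across categories.
    return 1.0 if word_category[a] == word_category[b] else 0.1


matrix = {
    u: {
        c: target_evaluation_index(u, c, user_words, word_category, toy_assoc)
        for c in sorted(set(word_category.values()))
    }
    for u in user_words
}
```

The resulting `matrix` plays the role of the target evaluation matrix, with one row per target user and one column per target category.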
Further, the screening the similar user set from the reference user set according to the obtained all historical search information sets and the target evaluation matrix includes:
determining the semantic fitness of each target user under each target category according to all the obtained historical search information sets to obtain a semantic fitness matrix;
And screening a similar user set from the reference user set according to the target evaluation matrix and the semantic fitness matrix.
Further, the determining the semantic fitness of each target user under each target category according to all the obtained historical search information sets includes:
determining target behavior fitness of the target user under the target category according to target behavior frequency corresponding to search words included in all the historical search information sets;
determining variances of target lengths corresponding to all search words typed by the target user under the target category, which are included in all the historical search information sets, as first semantic differences of the target user under the target category;
determining a second semantic difference corresponding to each search word typed in by the target user under the target category according to a modified word set corresponding to each search word typed in by the target user under the target category, which is included in all the historical search information sets;
determining a third semantic difference of the target user under the target category according to the second semantic differences corresponding to the search words typed in by the target user under the target category, wherein the second semantic differences corresponding to the search words typed in by the target user under the target category are positively correlated with the third semantic differences;
determining the semantic fitness of the target user under the target category according to the target behavior fitness, the first semantic difference and the third semantic difference of the target user under the target category, wherein the target behavior fitness is positively correlated with the semantic fitness, and the first semantic difference and the third semantic difference are negatively correlated with the semantic fitness.
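A compact sketch of the semantic fitness computation follows. It assumes that the target length of a search word is its character count, that the third semantic difference is the sum of the second semantic differences, and that the three quantities are combined as `behavior_fitness / (1 + first + third)`. None of these choices is stated in the text; they merely respect the stated correlations.

```python
from statistics import pvariance


def semantic_fitness(behavior_fitness, words, second_semantic_diffs):
    """Semantic fitness of a user under one category (illustrative formula).

    behavior_fitness: target behavior fitness of the user under the category.
    words: search words the user typed under the category.
    second_semantic_diffs: one second semantic difference per search word.
    """
    # First semantic difference: variance of the search word lengths.
    first_diff = pvariance([len(w) for w in words])
    # Third semantic difference: grows with every second semantic difference.
    third_diff = sum(second_semantic_diffs)
    return behavior_fitness / (1.0 + first_diff + third_diff)


# Hypothetical inputs: a consistent typist versus an inconsistent one.
steady = semantic_fitness(0.8, ["laptop", "tablet"], [0.0, 0.0])
erratic = semantic_fitness(0.8, ["laptop", "laptop bag"], [0.0, 1.0])
```

A user whose search words have similar lengths and are rarely modified ends up with a higher semantic fitness than a user with the same behavior fitness but more erratic typing.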
Further, the determining, according to the target behavior frequency corresponding to the search terms included in all the historical search information sets, the target behavior fitness of the target user under the target category includes:
determining a first behavior difference of the target user under the target category according to target behavior frequency corresponding to each search word typed in by the target user under the target category and included in all the historical search information sets, wherein the target behavior frequency is positively correlated with the first behavior difference;
determining the variance of the target behavior frequency corresponding to all search words typed by the target user under the target category and included in all the historical search information sets as the second behavior difference of the target user under the target category;
Determining a mean value of target behavior frequencies corresponding to all search words in all target categories in the target category set included in all historical search information sets as a reference behavior frequency;
determining the accumulated sum of differences of the target behavior frequency corresponding to each search word typed by the target user under the target category and the reference behavior frequency included in all the historical search information sets as a third behavior difference of the target user under the target category;
and determining the target behavior fitness of the target user under the target category according to the first behavior difference, the second behavior difference and the third behavior difference of the target user under the target category, wherein the first behavior difference, the second behavior difference and the third behavior difference are negatively correlated with the target behavior fitness.
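The three behavior differences can be sketched as below. The text does not define how the target behavior frequency is measured, so the sketch takes it as a given number per search word. The concrete formulas are assumptions: the first behavior difference is the mean frequency, the third uses absolute differences from the reference behavior frequency, and the final combination is `1 / (1 + first + second + third)`.

```python
from statistics import mean, pvariance


def behavior_fitness(user_freqs, all_freqs):
    """Target behavior fitness of a user under one category (illustrative).

    user_freqs: target behavior frequencies of the user's words in the category.
    all_freqs: target behavior frequencies of all words in all categories.
    """
    first = mean(user_freqs)        # grows with each behavior frequency
    second = pvariance(user_freqs)  # spread of the user's frequencies
    reference = mean(all_freqs)     # reference behavior frequency
    third = sum(abs(f - reference) for f in user_freqs)
    return 1.0 / (1.0 + first + second + third)


# Hypothetical frequencies.
all_freqs = [2, 2, 2, 6]
fit_a = behavior_fitness([2, 2], all_freqs)
fit_b = behavior_fitness([2, 6], all_freqs)
```

Because all three differences enter the denominator, any increase in one of them lowers the target behavior fitness, as the text requires.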
Further, the determining, according to the modified word set corresponding to each search word typed in by the target user under the target category and included in all the historical search information sets, the second semantic difference corresponding to each search word typed in by the target user under the target category includes:
determining the difference between the search word and each modified word in the modified word set corresponding to the search word as the target difference between the search word and that modified word, thereby obtaining a target difference set corresponding to the search word;
And determining a second semantic difference corresponding to the search word according to the target difference set corresponding to the search word, wherein each target difference in the target difference set is positively correlated with the second semantic difference.
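As a rough illustration, the target difference between a search word and a modified word can be measured with a string dissimilarity, and the second semantic difference can be the sum of those differences. Both choices are assumptions: the text does not say how the difference is measured, and `difflib.SequenceMatcher` from the Python standard library is used here only as a convenient stand-in.

```python
from difflib import SequenceMatcher


def target_difference(search_word, modified_word):
    """Dissimilarity in [0, 1]; 0 means the two words are identical."""
    return 1.0 - SequenceMatcher(None, search_word, modified_word).ratio()


def second_semantic_difference(search_word, modified_words):
    """Sum of target differences, so each difference raises the result."""
    return sum(target_difference(search_word, m) for m in modified_words)


# Hypothetical modified word sets for the same search word.
small_edit = second_semantic_difference("laptop", ["laptops"])
large_edit = second_semantic_difference("laptop", ["noodles"])
```

A search word that was only lightly modified contributes a small second semantic difference, while a word replaced by something unrelated contributes a large one.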
Further, the screening the similar user set from the reference user set according to the target evaluation matrix and the semantic fitness matrix includes:
for the user to be recommended and each reference user in the reference user set, determining the square of the difference between the target evaluation indexes, included in the target evaluation matrix, of the reference user and the user to be recommended under each target category as a first evaluation difference between the user to be recommended and the reference user under that target category, thereby obtaining a first evaluation difference set between the user to be recommended and the reference user;
determining a second evaluation difference between the user to be recommended and each reference user according to a first evaluation difference set between the user to be recommended and each reference user, wherein the first evaluation difference in the first evaluation difference set is positively correlated with the second evaluation difference;
for the user to be recommended and each reference user in the reference user set, determining the square of the difference between the semantic fitness values, included in the semantic fitness matrix, of the reference user and the user to be recommended under each target category as a first fit difference between the user to be recommended and the reference user under that target category, thereby obtaining a first fit difference set between the user to be recommended and the reference user;
Determining a second fit difference between the user to be recommended and each reference user according to a first fit difference set between the user to be recommended and each reference user, wherein the first fit difference in the first fit difference set and the second fit difference are positively correlated;
determining a measurement distance between the user to be recommended and each reference user according to a second evaluation difference and a second fit difference between the user to be recommended and each reference user, wherein the second evaluation difference and the second fit difference are positively correlated with the measurement distance;
and screening a similar user set from the reference user set according to the measurement distance between the user to be recommended and each reference user in the reference user set.
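The distance-based screening above resembles a nearest-neighbour search over the rows of the two matrices. The sketch below assumes that the second evaluation difference and the second fit difference are sums of the squared per-category differences, that the measurement distance is the square root of their sum, and that the similar user set consists of the `k` nearest reference users. The text leaves all three choices open.

```python
import math


def metric_distance(target_eval, other_eval, target_fit, other_fit):
    """Measurement distance between the user to be recommended and one reference user."""
    eval_diff = sum((other_eval[c] - target_eval[c]) ** 2 for c in target_eval)
    fit_diff = sum((other_fit[c] - target_fit[c]) ** 2 for c in target_fit)
    return math.sqrt(eval_diff + fit_diff)


def screen_similar_users(target, evaluation, fitness, k=2):
    """Return the k reference users closest to `target` (assumed screening rule)."""
    distances = {
        user: metric_distance(
            evaluation[target], evaluation[user], fitness[target], fitness[user]
        )
        for user in evaluation
        if user != target
    }
    return sorted(distances, key=distances.get)[:k]


# Hypothetical target evaluation matrix and semantic fitness matrix.
evaluation = {
    "u0": {"electronics": 0.6, "food": 0.1},
    "u1": {"electronics": 0.5, "food": 0.2},
    "u2": {"electronics": 0.1, "food": 0.7},
}
fitness = {
    "u0": {"electronics": 0.8, "food": 0.3},
    "u1": {"electronics": 0.7, "food": 0.3},
    "u2": {"electronics": 0.2, "food": 0.9},
}
nearest = screen_similar_users("u0", evaluation, fitness, k=1)
```

A distance threshold would be an equally valid screening rule; the fixed `k` is used here only to keep the example short.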
Further, the obtaining, according to the target category corresponding to the target typing information, a target prediction score corresponding to each candidate search term in the candidate search term set includes:
screening a sub-search word group corresponding to each candidate search word from a target search word group, wherein the target search word group comprises: all search words, under the target category corresponding to the target typing information, of the user to be recommended and of each similar user in the similar user set corresponding to the user to be recommended;
for each search word in the sub-search word group corresponding to the candidate search word, determining the semantic fitness, under the target category corresponding to the target typing information, of the target user who typed in the search word as the target fitness corresponding to the search word;
for each search word in the sub-search word group corresponding to the candidate search word, determining a first score corresponding to the search word according to the target fitness and the target probability corresponding to the search word, wherein the target fitness and the target probability are positively correlated with the first score;
and determining target prediction scores corresponding to the candidate search words according to first scores corresponding to the search words in the sub-search word groups corresponding to the candidate search words, wherein the first scores corresponding to the search words in the sub-search word groups are positively correlated with the target prediction scores.
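The scoring steps above can be illustrated as follows. The sketch assumes that the sub-search word group of a candidate consists of every occurrence of that word in the target search word group, one occurrence per typing user, that the first score is the product of target fitness and target probability, and that the target prediction score is the sum of the first scores. The data layout and names are hypothetical.

```python
def prediction_scores(entries, fitness, target_prob):
    """Target prediction score per candidate search word (illustrative).

    entries: (user, search word) pairs typed under the target category.
    fitness: user -> semantic fitness under the target category.
    target_prob: search word -> target probability.
    """
    scores = {}
    for user, word in entries:
        first_score = fitness[user] * target_prob[word]
        scores[word] = scores.get(word, 0.0) + first_score
    return scores


# Hypothetical target search word group for the category "electronics".
entries = [("u0", "laptop"), ("u1", "laptop"), ("u1", "laptop bag")]
fitness = {"u0": 0.8, "u1": 0.5}
target_prob = {"laptop": 0.9, "laptop bag": 0.8}
scores = prediction_scores(entries, fitness, target_prob)
```

A candidate typed by several well-fitting users accumulates a higher score than one typed once by a less consistent user.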
The invention has the following beneficial effects:
With the method for recommending user search words based on typed search words, data processing is performed on the target typing information, which solves the technical problem of low accuracy of search word recommendation for the user and improves that accuracy. First, determining the target category corresponding to the target typing information makes it easier to understand the type of content that the user to be recommended wants, which facilitates accurate recommendation later. Second, the candidate search word set comprises the search words, under the target category corresponding to the target typing information, of the user to be recommended and of each similar user in the similar user set. Compared with screening directly from the historical search words of the user to be recommended, the search words in the candidate search word set therefore better match the type of content that the user wants and are not mixed across multiple types, so the content that the user wants to search for can be screened out more easily.
Third, the search words to be recommended are screened from the candidate search word set. Compared with screening from the historical search words of the user to be recommended alone, the candidate search word set contains not only the search words typed in by the user to be recommended but also those typed in by each similar user in the similar user set, so it is more comprehensive. Even if the content that the user to be recommended wants to search for belongs to a type that this user has never searched for, search words can still be recommended based on what the similar users have typed in for that type. For example, if the target category corresponding to the target typing information is a type that the user to be recommended has never searched, search words can still be recommended from the search words, included in the candidate search word set, of each similar user under that target category. Fourth, obtaining the target prediction score corresponding to each candidate search word in the candidate search word set makes it convenient to subsequently screen out the search word set to be recommended. Finally, recommending the search word set to be recommended to the user to be recommended accomplishes the search word recommendation and improves its accuracy.
In addition, the similar user set is screened from the reference user set based on the historical search information set corresponding to the user to be recommended and the historical search information set corresponding to each reference user in the reference user set, comprehensively considering the target association degrees between search words and the target evaluation matrix. This improves the accuracy of determining the similar user set and therefore the accuracy of search word recommendation for the user to be recommended.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method of recommending user search terms based on typed search terms in accordance with the present invention;
FIG. 2 is a flow chart of steps for determining a set of similar users in accordance with the present invention.
Detailed Description
In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description is given below of the specific implementation, structure, features and effects of the technical solution according to the present invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention provides a method for recommending user search words based on typing search words, which comprises the following steps:
acquiring target typing information corresponding to a user to be recommended, and determining a target category corresponding to the target typing information;
obtaining a target prediction score corresponding to each candidate search word in the candidate search word set according to a target category corresponding to the target typing information;
screening a search word set to be recommended from the candidate search word sets according to target prediction scores corresponding to the candidate search words;
and recommending the search word set to be recommended to the user to be recommended.
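The four steps can be tied together in a small skeleton. This sketch assumes that classification, similar-user screening and scoring are supplied as ready-made inputs, and that screening means keeping the `top_n` candidates with the highest target prediction scores. The function `recommend`, its parameters and the sample data are hypothetical.

```python
def recommend(typed, user, classify, similar_users, history, score, top_n=3):
    """Recommend search words for `user` given the text typed so far.

    classify: callable mapping typed text -> target category.
    similar_users: users already screened as similar to `user`.
    history: user -> list of (search word, target category) pairs.
    score: callable mapping candidate search word -> target prediction score.
    """
    category = classify(typed)
    users = [user] + list(similar_users)
    # Candidate set: words of the user and of similar users under the target category.
    candidates = {
        word
        for u in users
        for word, cat in history.get(u, [])
        if cat == category
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(candidates, key=lambda w: (-score(w), w))
    return ranked[:top_n]


# Hypothetical data: "u0" has never searched within "electronics".
history = {
    "u0": [("noodles", "food")],
    "u1": [("laptop", "electronics"), ("laptop bag", "electronics")],
}
scores = {"laptop": 1.17, "laptop bag": 0.4}
result = recommend(
    "lap", "u0", lambda text: "electronics", ["u1"], history,
    lambda w: scores.get(w, 0.0), top_n=2,
)
```

In this toy run the user to be recommended has no history in the target category, so every recommendation comes from the similar user, which is the situation the method is designed to handle.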
Each step is described in detail below:
Referring to FIG. 1, a flow of some embodiments of the method of recommending user search terms based on typed search terms according to the present invention is shown. The method comprises the following steps:
Step S1, target typing information corresponding to a user to be recommended is obtained, and a target category corresponding to the target typing information is determined.
In some embodiments, target typing information corresponding to a user to be recommended may be obtained, and a target category corresponding to the target typing information may be determined.
The user to be recommended may be a user to whom search words are to be recommended. A search word may be text information that is searched for. The target typing information may also be text information. Text information may be any information composed of text, for example, but not limited to: a word, a sentence, an idiom or a combination of words. The target typing information may be the content that the user to be recommended has already typed for the search. The target category corresponding to the target typing information may be the category to which the target typing information belongs.
It should be noted that, determining the target category corresponding to the target typing information can facilitate understanding of the content type that the user to be recommended wants to know, and can facilitate accurate recommendation.
As an example, this step may include the steps of:
First, target typing information corresponding to the user to be recommended is obtained.
For example, the content that the user to be recommended has already typed in the search box may be acquired as the target typing information.
For example, if the content that the user to be recommended has already typed in the search box is "computer", the target typing information is "computer". If the content that the user to be recommended has already typed in the search box is "mobile phone battery", the target typing information is "mobile phone battery".
Second, the target category corresponding to the target typing information is determined.
For example, the target category corresponding to the target typing information may be determined through a pre-trained target classification network.
The target classification network may be a network for determining the category of text information. The target classification network may be a TextCNN network (Text Convolutional Neural Network, a convolutional neural network for text analysis). The optimizer of the TextCNN network may be Adam.
Optionally, the training process of the target classification network may comprise the following steps:
First, a reference text information set and the category of each piece of reference text information in the reference text information set are obtained.
The reference text information may be text information whose category is known.
Second, a target classification network is constructed.
For example, a TextCNN network may be constructed as the target classification network before training.
Third, the reference text information set is taken as the training set of the target classification network, the category of each piece of reference text information is taken as the training label of the target classification network, and the constructed target classification network is trained to obtain the trained target classification network.
The loss function used in training the target classification network may be a cross-entropy loss function. The output of the target classification network may be the probability that the reference text information belongs to each preset category in a preset category set. A preset category may be a category defined in advance. The preset category set may include the pre-labeled category of each piece of reference text information in the reference text information set. The number of preset categories in the preset category set may be 100.
For example, the preset category set may include: a computer-related category, a mobile-phone-related category, and a pencil-related category. The computer-related category may include information related to computers. The mobile-phone-related category may include information related to mobile phones. The pencil-related category may include information related to pencils. If the reference text information is "computer keyboard", its category may be the computer-related category, and during training of the target classification network, the probabilities that the reference text information belongs to the computer-related category, the mobile-phone-related category and the pencil-related category can be obtained. When the reference text information is input into the trained target classification network, the largest of the resulting probabilities may be the probability that the reference text information belongs to the computer-related category.
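The relationship between the network output, the predicted category and the cross-entropy loss can be illustrated with a minimal sketch. It does not implement a TextCNN; the raw scores below are made-up values standing in for the output of the network's last layer, and the three category names are taken from the example above.

```python
import math

# Preset categories from the example in the text.
CATEGORIES = ["computer", "mobile phone", "pencil"]

def softmax(scores):
    """Turn raw network scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, true_index):
    """Cross-entropy loss for one sample whose true category is known."""
    return -math.log(probabilities[true_index])

# Assumption: made-up raw scores for the reference text "computer keyboard".
probs = softmax([2.0, 0.5, -1.0])
predicted = CATEGORIES[probs.index(max(probs))]
loss = cross_entropy(probs, CATEGORIES.index("computer"))
```

During training, an optimizer such as Adam would adjust the network so that this loss decreases over the reference text information set.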
Step S2, a target prediction score corresponding to each candidate search word in the candidate search word set is obtained according to the target category corresponding to the target typing information.
In some embodiments, a target prediction score corresponding to each candidate search term in the candidate search term set may be obtained according to the target category corresponding to the target typing information.
The candidate search word set may include the search words, under the target category corresponding to the target typing information, of the user to be recommended and of each similar user in the similar user set corresponding to the user to be recommended. The candidate search word set may be the set obtained by removing duplicates from these search words. The similar users in the similar user set may be users having preferences similar to those of the user to be recommended. A search word of a user under a target category may be a search word typed by that user that belongs to the target category. The candidate search word set may be obtained by a crawler.
It should be noted that the larger the target prediction score corresponding to a candidate search word, the more strongly that candidate search word should be recommended. Obtaining the target prediction score corresponding to each candidate search word in the candidate search word set therefore facilitates subsequently screening out the search word set to be recommended from the candidate search word set.
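A minimal sketch of how the candidate search word set could be assembled by merging and de-duplicating search words is given here. The function name and the sample data are hypothetical, and the inputs are assumed to be already restricted to the target category.

```python
def build_candidate_set(own_words, similar_users_words):
    """Merge the search words of the user to be recommended with those of
    each similar user, dropping duplicates and keeping first-seen order."""
    seen = set()
    candidates = []
    for words in [own_words, *similar_users_words]:
        for word in words:
            if word not in seen:
                seen.add(word)
                candidates.append(word)
    return candidates

# Assumption: all words below already belong to the target category.
own = ["mobile phone battery", "mobile phone screen"]
similar = [["mobile phone case", "mobile phone battery"], ["mobile phone screen"]]
candidates = build_candidate_set(own, similar)
```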
Optionally, referring to fig. 2, determining the set of similar users may include the steps of:
Step 201, acquiring a historical search information set corresponding to the user to be recommended and a historical search information set corresponding to each reference user in a reference user set.
In some embodiments, a set of historical search information corresponding to the user to be recommended and a set of historical search information corresponding to each reference user in the set of reference users may be obtained.
The historical search information set corresponding to the user to be recommended may include search word information typed by the user to be recommended at different times. The historical search information set corresponding to a reference user may include search word information typed by the reference user at different times. Each piece of historical search information may include: a search word, the typing time of the search word, the target behavior frequency corresponding to the search word, the target length corresponding to the search word, and the modification word set corresponding to the search word. The search word may be the content that the user needs to search for. For example, the search word may be the content in the search box when the user clicks the search button. The typing time of a search word may be the time at which the search word was typed into the search box. The target behavior frequency corresponding to a search word may be the number of times the search content was modified before the user clicked the search button. For example, it may be equal to the number of modification words in the modification word set corresponding to the search word. The target length corresponding to a search word may be the number of words in the search word. The modification word set corresponding to a search word may include the content obtained each time the user modified the search content before the search word was correctly typed in the search box.
For example, a piece of historical search information may include: "cell phone wallpaper picture", "09:31:26 on April 24, 2023", 4, 6, and {"hand set", "hand", "cell phone wallpaper paint", "cell phone wallpaper"}. Here, "cell phone wallpaper picture" is the search word, "09:31:26 on April 24, 2023" is the typing time of the search word, 4 is the target behavior frequency, 6 is the target length, and {"hand set", "hand", "cell phone wallpaper paint", "cell phone wallpaper"} is the modification word set. The modification words arise as follows. When typing the search word "cell phone wallpaper picture", the user first mistakenly types "hand set"; deleting the erroneous "set" is one modification and gives "hand". The user then continues typing and obtains "cell phone wallpaper paint", which contains the erroneous text "paint"; deleting "paint" is another modification and gives "cell phone wallpaper". No further input errors occur after "cell phone wallpaper", so the resulting modification words are "hand set", "hand", "cell phone wallpaper paint" and "cell phone wallpaper".
It should be noted that, the historical search information set corresponding to the user to be recommended and the historical search information set corresponding to each reference user in the reference user set are obtained, so that the similarity condition between the user to be recommended and the reference user can be conveniently and subsequently judged, and the similar user set can be conveniently screened out from the reference user set.
As an example, crawler technology may be used to obtain the historical search information set corresponding to the user to be recommended and the historical search information set corresponding to each reference user in the reference user set. To avoid abnormal data arising during crawling, the data acquired by the crawler can be cleaned.
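One piece of historical search information could be represented as a small record type. The sketch below is an illustration only: the class and field names are hypothetical, and the values come from the example above. The target behavior frequency is derived from the modification word set, following the statement that the two may be equal.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class HistoricalSearchRecord:
    """One piece of historical search information (field names are hypothetical)."""
    search_word: str
    typed_at: datetime          # typing time of the search word
    target_length: int          # target length as stated in the source example
    modification_words: List[str] = field(default_factory=list)

    @property
    def target_behavior_frequency(self) -> int:
        # Assumption from the text: equals the number of modification words.
        return len(self.modification_words)

record = HistoricalSearchRecord(
    search_word="cell phone wallpaper picture",
    typed_at=datetime(2023, 4, 24, 9, 31, 26),
    target_length=6,
    modification_words=[
        "hand set", "hand", "cell phone wallpaper paint", "cell phone wallpaper",
    ],
)
```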
Step 202, classifying search words included in all obtained historical search information sets to obtain a target category set.
In some embodiments, the search terms included in all the obtained historical search information sets may be categorized to obtain a set of target categories.
The target category set may include the categories of the search words included in all the historical search information sets.
It should be noted that, classifying the search terms included in all the obtained historical search information sets can facilitate subsequent analysis of the situation of each target user under each target category, and can facilitate subsequent screening of similar user sets from the reference user sets. The target user may be a user to be recommended or a reference user.
As an example, this step may include the steps of:
First, each search word included in all the obtained historical search information sets is input into the pre-trained target classification network to obtain the probability that the search word belongs to each preset category in the preset category set; each such probability is taken as the category probability of the search word under that preset category, giving the category probability set corresponding to the search word.
The category probability set corresponding to a search word may include the category probability of the search word under each preset category in the preset category set.
Second, for each search word included in all the obtained historical search information sets, the largest category probability is selected from the category probability set corresponding to the search word and taken as the target probability corresponding to the search word, and the preset category corresponding to that target probability is determined as the target category corresponding to the search word.
The target category corresponding to a search word may be the category to which the search word belongs. The probability that the search word belongs to its target category may be the largest category probability in the category probability set corresponding to the search word.
Third, the target categories corresponding to all the search words included in the obtained historical search information sets are combined into the target category set.
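The three steps above amount to taking, for each search word, the preset category with the largest category probability and then collecting the distinct results. A minimal sketch follows, with made-up category probabilities standing in for the output of the trained classification network.

```python
def target_category(category_probs):
    """Return (target category, target probability) for one search word,
    i.e. the preset category with the largest category probability."""
    category = max(category_probs, key=category_probs.get)
    return category, category_probs[category]

def target_category_set(probs_by_word):
    """Combine the target categories of all search words into a set."""
    return {target_category(p)[0] for p in probs_by_word.values()}

# Assumption: made-up category probability sets for three search words.
probs_by_word = {
    "computer keyboard": {"computer": 0.90, "mobile phone": 0.07, "pencil": 0.03},
    "mobile phone battery": {"computer": 0.10, "mobile phone": 0.85, "pencil": 0.05},
    "computer mouse": {"computer": 0.80, "mobile phone": 0.15, "pencil": 0.05},
}
categories = target_category_set(probs_by_word)
```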
Step 203, determining the target association degree between every two search words included in all the obtained historical search information sets.
In some embodiments, a target degree of association between every two search terms included in all of the resulting sets of historical search information may be determined.
Wherein, the target association degree between two search words can represent the association condition between the two search words.
As an example, this step may include the steps of:
First, the two search words are determined as a first search word and a second search word, respectively, according to the target probabilities corresponding to the two search words.
The search word with the larger target probability may be determined as the first search word, and the search word with the smaller target probability may be determined as the second search word. When the target probabilities corresponding to the two search words are equal, the two search words may be assigned at random as the first search word and the second search word.
Second, a first difference between the first search word and the second search word is determined according to a first number, a second number and a third number.
The first number may be the number of pieces of historical search information, in all the historical search information sets, that include the first search word and do not include the second search word. The second number may be the number of pieces of historical search information, in all the historical search information sets, that include the second search word and do not include the first search word. The third number may be the number of pieces of historical search information, in all the historical search information sets, that include both the first search word and the second search word. Both the first number and the second number may be positively correlated with the first difference. The third number may be negatively correlated with the first difference.
For example, if the first search term is "mobile phone" and the second search term is "battery", the search term "mobile phone screen" may be a search term including the first search term and excluding the second search term, and the history search information where the search term "mobile phone screen" is located may be history search information including the first search term and excluding the second search term. The search term "computer battery" may be a search term including the second search term excluding the first search term, and the history search information in which the search term "computer battery" is located may be history search information including the second search term excluding the first search term. The search term "mobile phone battery" may be a search term including both the first search term and the second search term, and the history search information in which the search term "mobile phone battery" is located may be history search information including both the first search term and the second search term.
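The three counts can be sketched as follows. This is a simplified illustration that assumes "includes" means substring containment in the search word of each piece of historical search information; the function name and sample history are hypothetical.

```python
def co_occurrence_counts(search_words, first, second):
    """Return (first number, second number, third number) for a pair of
    search words, using substring containment as the inclusion test."""
    first_only = second_only = both = 0
    for word in search_words:
        has_first = first in word
        has_second = second in word
        if has_first and has_second:
            both += 1
        elif has_first:
            first_only += 1
        elif has_second:
            second_only += 1
    return first_only, second_only, both

# Sample history built from the example in the text, plus one unrelated entry.
history = ["mobile phone screen", "computer battery",
           "mobile phone battery", "pencil case"]
counts = co_occurrence_counts(history, "mobile phone", "battery")
```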
Third, the absolute value of the difference between a second probability and the target probability corresponding to the first search word is determined as a second difference between the first search word and the second search word.
The second probability may be the category probability of the second search word under the target category corresponding to the first search word.
Fourth, a third difference between the first search word and the second search word is determined based on the first difference and the second difference between the first search word and the second search word.
Both the first difference and the second difference may be positively correlated with the third difference.
Fifth, the first search word and the second search word are encoded to obtain first encoded data corresponding to the first search word and second encoded data corresponding to the second search word.
For example, the first search word may be encoded using the encoding rules of UTF-8 (8-bit Universal Character Set/Unicode Transformation Format, a variable-length character encoding) to obtain the first encoded data, and the second search word may be encoded using the encoding rules of UTF-8 to obtain the second encoded data.
Sixth, the edit distance between the first encoded data and the second encoded data is determined as a fourth difference between the first search word and the second search word.
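The fifth and sixth steps can be sketched as an edit distance computed over UTF-8 bytes. The text does not say which edit distance is meant; the sketch assumes the common Levenshtein distance (insertions, deletions and substitutions each cost 1), and the function names are hypothetical.

```python
def edit_distance(a: bytes, b: bytes) -> int:
    """Levenshtein distance between two byte strings (row-by-row DP)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def fourth_difference(first_word: str, second_word: str) -> int:
    """Edit distance between the UTF-8 encodings of two search words."""
    return edit_distance(first_word.encode("utf-8"), second_word.encode("utf-8"))
```

Because the distance is taken over encoded bytes, one differing Chinese character can contribute up to three units (its UTF-8 encoding is three bytes long), whereas one differing ASCII letter contributes one.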
Seventh, the target association degree between the first search word and the second search word is determined according to the fourth difference and the third difference between the first search word and the second search word.
Both the fourth difference and the third difference may be negatively correlated with the target association degree.
For example, the target association degree between the first search word and the second search word may be determined by a formula over the following quantities:
[Formula image SMS_1 not reproduced]
The first difference between the first search word and the second search word is determined from the first number, the second number and the third number, together with M, the number of pieces of historical search information in all the obtained historical search information sets. The formula takes the maximum and the minimum of the same pair of quantities, so that if the maximum is one of the pair, the minimum is the other. The first number and the second number are both positively correlated with the first difference, and the third number is negatively correlated with the first difference.
The second difference between the first search word and the second search word is the absolute value of the difference between the second probability and the target probability corresponding to the first search word.
The third difference between the first search word and the second search word is obtained from the first difference and the second difference using a power of the natural constant e. The first difference and the second difference are both positively correlated with the third difference.
Two preset factors greater than 0 are mainly used to prevent a denominator from being 0. For example, both preset factors may take 0.01.
The fourth difference between the first search word and the second search word is the edit distance between the first encoded data corresponding to the first search word and the second encoded data corresponding to the second search word.
The third difference and the fourth difference are both negatively correlated with the target association degree between the first search word and the second search word.
The larger the third number, the more often the first search word and the second search word appear together, the more likely they are words in the same category, and the higher the degree of association between them. The larger the first number and the second number, the more often the first search word and the second search word appear separately, the more likely they are not words in the same category, and the lower the degree of association between them. Thus, the larger the first difference, the lower the degree of association between the first search word and the second search word. The larger the second difference, the more likely the first search word and the second search word are not words in the same category, and the lower the degree of association between them. Thus, the larger the third difference, the lower the degree of association between the first search word and the second search word. Because the fourth difference is the edit distance between the first encoded data corresponding to the first search word and the second encoded data corresponding to the second search word, the larger the fourth difference, the greater the difference between the first search word and the second search word, and the lower the degree of association between them. Thus, the target association degree between the first search word and the second search word decreases as the third difference and the fourth difference increase.
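The monotonic relationships described for step 203 can be illustrated with a sketch. The functional form below is an assumption chosen only to satisfy those stated relationships; it is not the formula of the original publication, which also involves a maximum, a minimum and the total count M. All names and default values other than the 0.01 preset factors are hypothetical.

```python
import math

def target_association(n1, n2, n3, p_first, p_second, d4, eps1=0.01, eps2=0.01):
    """Hypothetical association score between two search words.

    n1, n2, n3 : first, second and third numbers (co-occurrence counts)
    p_first    : target probability of the first search word
    p_second   : second probability (second word under the first word's category)
    d4         : fourth difference (edit distance of the encoded words)
    eps1, eps2 : preset factors > 0 that keep denominators away from 0
    """
    d1 = (n1 + n2) / (n3 + eps1)      # assumed form: grows with n1, n2; shrinks with n3
    d2 = abs(p_second - p_first)      # second difference, as stated in the text
    d3 = d1 * math.exp(d2)            # assumed form: grows with d1 and d2
    return 1.0 / (d3 * d4 + eps2)     # assumed form: shrinks with d3 and d4

base = target_association(1, 1, 1, 0.9, 0.8, 3)
```

Under this assumed form, a pair of words that co-occur often, have close category probabilities and similar encodings receives a higher score than a pair that does not.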
Step 204, determining the target evaluation index of each target user under each target category according to all the obtained historical search information sets, the target category set and the target association degrees between search words, to obtain a target evaluation matrix.
In some embodiments, the target evaluation index of each target user under each target category may be determined according to all the obtained historical search information sets, the target category sets and the target association degrees between the search words, so as to obtain a target evaluation matrix.
The target user may be a user to be recommended or a reference user. The target evaluation matrix may include: target evaluation indexes of each target user under each target category.
It should be noted that, the target evaluation index of the target user under the target category may represent the preference score of the target user for the target category, that is, may represent the preference degree of the target user for the target category.
As an example, this step may include the steps of:
First, the ratio of a fourth number to a fifth number is determined as the initial evaluation index of the target user under the target category.
The fourth number may be the number of search words, included in all the historical search information sets, that the target user typed in the target category. The fifth number may be the number of search words, included in all the historical search information sets, typed by the target user.
For example, the initial evaluation index of a target user under a target category in the target category set may be computed as the fourth number divided by the sum of the fifth number and a preset factor:
[Formula image SMS_54 not reproduced]
Here, the fourth number is the number of search words, included in all the historical search information sets, that the target user typed in the target category. The fifth number is the number of search words, included in all the historical search information sets, typed by the target user. The preset factor is greater than 0 and is mainly used to prevent the denominator from being 0; for example, it may take 0.01. In the formula, the target user is identified by the sequence number of the target user, and the target category is identified by the sequence number of the target category in the target category set.
The larger the fourth number, the more search words the target user has typed in the target category, the more interested the target user is likely to be in the content of the target category, and the higher the target user's degree of preference for the target category. Because the fifth number is the number of search words typed by the target user, the larger the initial evaluation index, the more search words the target user has typed in the target category relative to other target categories, the more interested the target user is likely to be in the content of the target category relative to other target categories, and the higher the target user's degree of preference for the target category may be relative to other target categories.
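The initial evaluation index can be sketched as a ratio with a small preset factor in the denominator. The function name and the per-category dictionary layout are hypothetical; the 0.01 default follows the example value in the text.

```python
def initial_evaluation_index(words_by_category, category, eps=0.01):
    """Share of a target user's search words that fall in one target category.

    words_by_category maps each target category to the list of search words
    the user typed in it; eps keeps the denominator away from 0.
    """
    fourth = len(words_by_category.get(category, []))          # words in this category
    fifth = sum(len(ws) for ws in words_by_category.values())  # all words of the user
    return fourth / (fifth + eps)

# Assumption: a made-up user with four search words in two categories.
user_words = {
    "mobile phone": ["mobile phone battery", "mobile phone screen", "mobile phone case"],
    "computer": ["computer keyboard"],
}
phone_index = initial_evaluation_index(user_words, "mobile phone")
```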
Second, a first association degree of the target user under the target category is determined according to the target association degree between each search word typed by the target user and each search word in the target category, all included in the historical search information sets.
The target association degree between each search word typed by the target user and each search word in the target category may be positively correlated with the first association degree.
Third, a second association degree corresponding to the target category is determined according to the target association degree between each search word typed by all target users and each search word in the target category, all included in the historical search information sets.
The target association degree between each search word typed by all target users and each search word in the target category may be positively correlated with the second association degree.
Fourth, the ratio of the first association degree of the target user under the target category to the second association degree corresponding to the target category is determined as a third association degree of the target user under the target category.
Fifth, a reference evaluation index corresponding to the target user is determined according to the initial evaluation indexes of the target user under the target categories in the target category set.
The initial evaluation indexes of the target user under the target categories in the target category set may be positively correlated with the reference evaluation index.
For example, the average of the initial evaluation indexes of the target user under all target categories in the target category set may be determined as the reference evaluation index corresponding to the target user.
As another example, the largest of the initial evaluation indexes of the target user under the target categories in the target category set may be selected as the reference evaluation index corresponding to the target user.
Sixth, the target evaluation index of the target user under the target category is determined according to the reference evaluation index corresponding to the target user and the third association degree of the target user under the target category.
Both the reference evaluation index and the third association degree may be positively correlated with the target evaluation index.
For example, the target evaluation index of a target user under a target category in the target category set may be determined by a formula over the following quantities:
[Formula image SMS_83 not reproduced]
The target association degree between the i-th search word typed by the target user and the j-th search word in the target category, both included in all the historical search information sets. Here i is the sequence number of a search word typed by the target user and runs up to the number of search words typed by the target user; j is the sequence number of a search word in the target category and runs up to the number of search words in the target category.
The first association degree of the target user under the target category, which is positively correlated with these target association degrees.
The accumulated value of the target association degrees between the search words typed by all target users, included in all the historical search information sets, and the search words in the target category, where n is the number of target users. The second association degree corresponding to the target category is positively correlated with this accumulated value.
The third association degree of the target user under the target category, whose denominator contains a preset factor greater than 0 that is mainly used to prevent the denominator from being 0. For example, the preset factor may take 0.01.
The reference evaluation index corresponding to the target user.
A power of the natural constant e, which realizes normalization.
In the formula, the target user is identified by the sequence number of the target user, and the target category is identified by the sequence number of the target category in the target category set.
It should be noted that, because the first association degree is built from the target association degrees between the i-th search word typed by the target user and the j-th search word in the target category, it can characterize the degree of association between the target user and the target category. Likewise, the second association degree can characterize the overall degree of association between all target users and the target category. Thus, the larger the third association degree, the greater the relative degree of association between the target user and the target category, the more of the search words typed by the target user fall in the target category, the more interested the target user is likely to be in the content of the target category, and the higher the target user's degree of preference for the target category. Because a larger reference evaluation index also indicates a higher degree of preference of the target user for the target category, a larger target evaluation index indicates a higher degree of preference of the target user for the target category. In addition, the power of the natural constant e realizes normalization, which can facilitate subsequent processing.
Step 205, screening out similar user sets from the reference user sets according to all the obtained historical search information sets and the target evaluation matrix.
In some embodiments, a set of similar users may be screened from the set of reference users based on all of the historical search information sets and the target evaluation matrix.
It should be noted that screening out the similar user set from the reference user set by jointly considering all the obtained historical search information sets and the target evaluation matrix can improve the accuracy of determining the similar user set, and thus the accuracy of recommending search terms to the user to be recommended.
As an example, this step may include the following steps:
First step: determine the semantic fitness of each target user under each target category according to all the obtained historical search information sets, obtaining a semantic fitness matrix.
The semantic fitness matrix may include the semantic fitness of each target user under each target category.
For example, determining the semantic fitness of each target user under each target category may include the following sub-steps:
A first sub-step: determine the target behavior fitness of the target user under the target category according to the target behavior frequencies corresponding to the search terms included in all the historical search information sets.
For example, determining the target behavior fitness of the target user under the target category may include the following steps:
First, determine a first behavior difference of the target user under the target category according to the target behavior frequency corresponding to each search term typed by the target user under the target category, as included in all the historical search information sets.
The target behavior frequency may be positively correlated with the first behavior difference.
For example, the mean of the target behavior frequencies corresponding to all search terms typed by the target user under the target category, as included in all the historical search information sets, may be determined as the first behavior difference of the target user under the target category.
As another example, the smallest of the target behavior frequencies corresponding to the search terms typed by the target user under the target category, as included in all the historical search information sets, may be determined as the first behavior difference of the target user under the target category.
Next, determine the variance of the target behavior frequencies corresponding to all search terms typed by the target user under the target category, as included in all the historical search information sets, as the second behavior difference of the target user under the target category.
Next, determine the mean of the target behavior frequencies corresponding to all search terms in all target categories of the target category set, as included in all the historical search information sets, as the reference behavior frequency.
Next, determine the accumulated sum of the differences between the target behavior frequency corresponding to each search term typed by the target user under the target category, as included in all the historical search information sets, and the reference behavior frequency as the third behavior difference of the target user under the target category.
Finally, determine the target behavior fitness of the target user under the target category according to the first behavior difference, the second behavior difference and the third behavior difference of the target user under the target category.
The first behavior difference, the second behavior difference and the third behavior difference may all be negatively correlated with the target behavior fitness.
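The three behavior differences described above can be sketched as follows. This is only an illustrative sketch: the function name `behavior_differences` and its arguments are hypothetical, the mean variant is used for the first behavior difference, the population variance is assumed for the second, and the third is taken as a signed sum because the text does not state that absolute values are used.

```python
from statistics import mean, pvariance


def behavior_differences(user_freqs, all_freqs):
    """Compute the three behavior differences for one user and one category.

    user_freqs: target behavior frequencies of the search terms the user
                typed under the target category (hypothetical input layout).
    all_freqs:  target behavior frequencies of all search terms in all
                target categories, used for the reference behavior frequency.
    """
    # First behavior difference: mean of the user's frequencies
    # (the text also allows the minimum as an alternative).
    first = mean(user_freqs)
    # Second behavior difference: variance of the user's frequencies
    # (assumption: population variance).
    second = pvariance(user_freqs)
    # Reference behavior frequency t: mean over all categories.
    reference = mean(all_freqs)
    # Third behavior difference: accumulated sum of (frequency - reference)
    # (assumption: signed differences).
    third = sum(f - reference for f in user_freqs)
    return first, second, third


if __name__ == "__main__":
    print(behavior_differences([3, 5], [3, 5, 1, 3]))
```

How the three values are combined into the target behavior fitness is given by the formula of the embodiment; the sketch stops at the inputs to that formula.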
For example, the formula for the target behavior fitness of the target user under the target category may be:
Figure SMS_145
where
Figure SMS_163
is the target behavior fitness of the
Figure SMS_167
-th target user under the
Figure SMS_171
-th target category in the target category set.
Figure SMS_147
is the first behavior difference of the
Figure SMS_151
-th target user under the
Figure SMS_155
-th target category.
Figure SMS_159
is the second behavior difference of the
Figure SMS_175
-th target user under the
Figure SMS_179
-th target category, that is, the variance of the target behavior frequencies corresponding to all search terms typed by the
Figure SMS_181
-th target user under the
Figure SMS_183
-th target category, as included in all the historical search information sets. t is the reference behavior frequency.
Figure SMS_178
is the target behavior frequency corresponding to the f-th search term typed by the
Figure SMS_180
-th target user under the
Figure SMS_182
-th target category, as included in all the historical search information sets.
Figure SMS_184
is the number of search terms typed by the
Figure SMS_162
-th target user under the
Figure SMS_166
-th target category, as included in all the historical search information sets.
Figure SMS_170
is the third behavior difference of the
Figure SMS_174
-th target user under the
Figure SMS_146
-th target category.
Figure SMS_150,
Figure SMS_154
and
Figure SMS_158
are preset factors greater than 0, used mainly to prevent the denominator from being 0. For example,
Figure SMS_149,
Figure SMS_152
and
Figure SMS_156
may each be set to 0.01.
Figure SMS_160
is the fourth behavior difference of the
Figure SMS_164
-th target user under the
Figure SMS_168
-th target category.
Figure SMS_172
normalizes
Figure SMS_176.
Figure SMS_148,
Figure SMS_153
and
Figure SMS_157
may all be negatively correlated with
Figure SMS_161.
Figure SMS_165
is the sequence number of the target user.
Figure SMS_169
is the sequence number of the target category in the target category set. f is the sequence number of a search term typed by the
Figure SMS_173
-th target user under the
Figure SMS_177
-th target category, as included in all the historical search information sets.
It should be noted that a greater target behavior frequency corresponding to a search term tends to indicate that the target user modified the input more times when typing the search term, that the target user is less familiar with the search term and with the target category to which it belongs, and that the behavior fitness of the target user for that target category is lower. Since the target behavior frequency and
Figure SMS_204
are positively correlated, a larger
Figure SMS_207
tends to indicate that the
Figure SMS_210
-th target user likely has a lower degree of fitness for the
Figure SMS_187
-th target category. A larger
Figure SMS_192
tends to indicate that the target behavior frequencies corresponding to the search terms typed by the
Figure SMS_196
-th target user under the
Figure SMS_200
-th target category are more scattered, that the familiarity of the
Figure SMS_202
-th target user with the
Figure SMS_205
-th target category is less stable, that the behavior habits of the
Figure SMS_208
-th target user for the
Figure SMS_211
-th target category are less stable, and that the fitness of the
Figure SMS_203
-th target user for the
Figure SMS_206
-th target category is likely lower. A larger
Figure SMS_209
tends to indicate that the f-th search term was modified more times. A larger
Figure SMS_212
tends to indicate that the
Figure SMS_188
-th target user modified the search terms typed under the
Figure SMS_191
-th target category more times, that the familiarity of the
Figure SMS_195
-th target user with the
Figure SMS_199
-th target category is lower, and that the behavior fitness of the
Figure SMS_185
-th target user for the
Figure SMS_189
-th target category is likely lower. Therefore, a larger
Figure SMS_193
tends to indicate that the familiarity of the
Figure SMS_197
-th target user with the
Figure SMS_186
-th target category is higher, and that the behavior fitness of the
Figure SMS_190
-th target user for the
Figure SMS_194
-th target category is likely higher. In addition,
Figure SMS_198
normalizes
Figure SMS_201,
which facilitates subsequent processing.
A second sub-step: determine the variance of the target lengths corresponding to all search terms typed by the target user under the target category, as included in all the historical search information sets, as the first semantic difference of the target user under the target category.
A third sub-step: determine a second semantic difference corresponding to each search term typed by the target user under the target category according to the modification word set corresponding to that search term, as included in all the historical search information sets.
For example, determining the second semantic difference corresponding to each search term typed by the target user under the target category may include the following steps:
First, determine the difference between the search term and each modification word in the modification word set corresponding to the search term as the target difference between the search term and that modification word, obtaining a target difference set corresponding to the search term.
The target difference between the search term and a modification word may characterize how much the two differ. The target difference set corresponding to the search term may include the target difference between the search term and each modification word in the corresponding modification word set.
For example, the search term may be encoded with UTF-8 to obtain first data, and the modification word may be encoded with UTF-8 to obtain second data. The edit distance between the first data and the second data may be used as the target difference between the search term and the modification word.
Then, determine the second semantic difference corresponding to the search term according to the target difference set corresponding to the search term.
Each target difference in the target difference set may be positively correlated with the second semantic difference.
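The UTF-8 edit-distance example above can be sketched as follows. The function names are hypothetical, and the edit distance is assumed to be the Levenshtein distance (insertions, deletions and substitutions each cost 1) computed over the encoded bytes; the text itself only says "edit distance".

```python
def utf8_edit_distance(search_term: str, modification_word: str) -> int:
    """Levenshtein distance between the UTF-8 byte sequences of two strings."""
    first_data = search_term.encode("utf-8")
    second_data = modification_word.encode("utf-8")
    # prev[j] holds the distance between the bytes processed so far
    # and the first j bytes of second_data.
    prev = list(range(len(second_data) + 1))
    for i, x in enumerate(first_data, start=1):
        curr = [i]
        for j, y in enumerate(second_data, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete a byte
                            curr[j - 1] + 1,      # insert a byte
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def target_difference_set(search_term, modification_words):
    """Target difference between a search term and each of its modification words."""
    return [utf8_edit_distance(search_term, w) for w in modification_words]


if __name__ == "__main__":
    print(target_difference_set("computer", ["compute", "computers"]))
```

Because the distance is taken over bytes rather than characters, replacing one multi-byte character can contribute more than 1 to the target difference.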
A fourth sub-step: determine a third semantic difference of the target user under the target category according to the second semantic differences corresponding to the search terms typed by the target user under the target category.
The second semantic difference corresponding to each search term typed by the target user under the target category may be positively correlated with the third semantic difference.
A fifth sub-step: determine the semantic fitness of the target user under the target category according to the target behavior fitness, the first semantic difference and the third semantic difference of the target user under the target category.
The target behavior fitness may be positively correlated with the semantic fitness. Both the first semantic difference and the third semantic difference may be negatively correlated with the semantic fitness.
For example, the formula for the semantic fitness of the target user under the target category may be:
Figure SMS_213
where
Figure SMS_230
is the semantic fitness of the
Figure SMS_234
-th target user under the
Figure SMS_238
-th target category in the target category set.
Figure SMS_215
is the target behavior fitness of the
Figure SMS_220
-th target user under the
Figure SMS_224
-th target category.
Figure SMS_226
is the first semantic difference of the
Figure SMS_227
-th target user under the
Figure SMS_231
-th target category, that is, the variance of the target lengths corresponding to all search terms typed by the
Figure SMS_235
-th target user under the
Figure SMS_239
-th target category, as included in all the historical search information sets.
Figure SMS_241
is the third semantic difference of the
Figure SMS_245
-th target user under the
Figure SMS_248
-th target category.
Figure SMS_250
and
Figure SMS_233
are positively correlated.
Figure SMS_237
is a preset factor greater than 0, used mainly to prevent the denominator from being 0. For example,
Figure SMS_242
may be set to 0.01.
Figure SMS_246
is the target difference between the f-th search term typed by the
Figure SMS_217
-th target user under the
Figure SMS_221
-th target category and the b-th modification word in the modification word set corresponding to that search term, as included in all the historical search information sets.
Figure SMS_225
is the number of modification words in the modification word set corresponding to the f-th search term typed by the
Figure SMS_229
-th target user under the
Figure SMS_232
-th target category, as included in all the historical search information sets.
Figure SMS_236
is the number of search terms typed by the
Figure SMS_240
-th target user under the
Figure SMS_244
-th target category, as included in all the historical search information sets.
Figure SMS_243
is the second semantic difference corresponding to the f-th search term typed by the
Figure SMS_247
-th target user under the
Figure SMS_249
-th target category, as included in all the historical search information sets.
Figure SMS_251
and
Figure SMS_214
are positively correlated.
Figure SMS_218
is the sequence number of the target user.
Figure SMS_223
is the sequence number of the target category in the target category set. f is the sequence number of a search term typed by the
Figure SMS_228
-th target user under the
Figure SMS_216
-th target category, as included in all the historical search information sets. b is the sequence number of a modification word in the modification word set corresponding to the f-th search term typed by the
Figure SMS_219
-th target user under the
Figure SMS_222
-th target category.
A larger
Figure SMS_268
tends to indicate a greater difference between the search terms typed by the
Figure SMS_271
-th target user under the
Figure SMS_273
-th target category and the modification words in the corresponding modification word sets, that the
Figure SMS_255
-th target user likely has more modification words under the
Figure SMS_256
-th target category, that the search terms typed by the
Figure SMS_260
-th target user under the
Figure SMS_264
-th target category were modified more times, and that the semantic fitness of the
Figure SMS_253
-th target user for the
Figure SMS_257
-th target category is likely lower. A larger
Figure SMS_261
tends to indicate that the target lengths corresponding to the search terms typed by the
Figure SMS_265
-th target user under the
Figure SMS_259
-th target category are more scattered, that the lengths of the search terms typed by the
Figure SMS_263
-th target user under the
Figure SMS_267
-th target category differ more from one another, and that the semantic fitness of the
Figure SMS_270
-th target user for the
Figure SMS_269
-th target category is likely lower. Since a larger
Figure SMS_272
tends to indicate that the familiarity of the
Figure SMS_274
-th target user with the
Figure SMS_275
-th target category is higher and that the behavior fitness of the
Figure SMS_252
-th target user for the
Figure SMS_258
-th target category is likely higher, a larger
Figure SMS_262
tends to indicate that the semantic fitness of the
Figure SMS_266
-th target user for the
Figure SMS_254
-th target category is likely higher.
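One way to combine these quantities that is consistent with the stated correlations is sketched below. The exact formula of the embodiment is not reproduced here, so the combination is an assumption: the second semantic difference is taken as the mean of a search term's target differences, the third semantic difference as the mean of the second semantic differences, and the semantic fitness as the target behavior fitness divided by the sum of the first and third semantic differences plus a preset factor of 0.01. The names `semantic_fitness` and `EPS` are hypothetical.

```python
from statistics import pvariance

EPS = 0.01  # preset factor > 0 that keeps the denominator away from 0


def semantic_fitness(behavior_fitness, term_lengths, target_difference_sets):
    """Assumed combination; only the correlation directions come from the text.

    behavior_fitness:       target behavior fitness of the user in the category.
    term_lengths:           target lengths of the search terms the user typed.
    target_difference_sets: one list of target differences per search term.
    """
    # First semantic difference: variance of the target lengths.
    first = pvariance(term_lengths)
    # Second semantic difference per search term (assumption: mean).
    second = [sum(d) / len(d) if d else 0.0 for d in target_difference_sets]
    # Third semantic difference (assumption: mean of the second differences).
    third = sum(second) / len(second) if second else 0.0
    # Positively correlated with behavior fitness, negatively with both differences.
    return behavior_fitness / (first + third + EPS)


if __name__ == "__main__":
    print(semantic_fitness(1.0, [4, 4], [[2, 4], [0]]))
```

Under this assumed form, a user whose search terms have stable lengths and few, small modifications receives a higher semantic fitness than a user with the same behavior fitness but heavily modified inputs.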
Second step: screen out a similar user set from the reference user set according to the target evaluation matrix and the semantic fitness matrix.
For example, screening out the similar user set from the reference user set may include the following sub-steps:
A first sub-step: for the user to be recommended and each reference user in the reference user set, determine the square of the difference between the target evaluation indexes of the reference user and the user to be recommended under each target category, as included in the target evaluation matrix, as the first evaluation difference between the user to be recommended and the reference user under that target category, obtaining a first evaluation difference set between the user to be recommended and the reference user.
The first evaluation difference set between the user to be recommended and the reference user may include the first evaluation difference between the user to be recommended and the reference user under each target category.
A second sub-step: determine a second evaluation difference between the user to be recommended and each reference user according to the first evaluation difference set between the user to be recommended and that reference user.
Each first evaluation difference in the first evaluation difference set may be positively correlated with the second evaluation difference.
A third sub-step: for the user to be recommended and each reference user in the reference user set, determine the square of the difference between the semantic fitness of the reference user and that of the user to be recommended under each target category, as included in the semantic fitness matrix, as the first fit difference between the user to be recommended and the reference user under that target category, obtaining a first fit difference set between the user to be recommended and the reference user.
The first fit difference set between the user to be recommended and the reference user may include the first fit difference between the user to be recommended and the reference user under each target category.
A fourth sub-step: determine a second fit difference between the user to be recommended and each reference user according to the first fit difference set between the user to be recommended and that reference user.
Each first fit difference in the first fit difference set may be positively correlated with the second fit difference.
A fifth sub-step: determine a metric distance between the user to be recommended and each reference user according to the second evaluation difference and the second fit difference between the user to be recommended and that reference user.
The second evaluation difference and the second fit difference may each be positively correlated with the metric distance.
For example, the formula for the metric distance between the user to be recommended and a reference user may be:
Figure SMS_276
where
Figure SMS_286
is the metric distance between the user to be recommended and the c-th reference user in the reference user set. G is the number of target categories in the target category set.
Figure SMS_278
is the semantic fitness of the user to be recommended under the
Figure SMS_282
-th target category in the target category set.
Figure SMS_290
is the semantic fitness of the c-th reference user under the
Figure SMS_294
-th target category.
Figure SMS_296
is the target evaluation index of the user to be recommended under the
Figure SMS_299
-th target category.
Figure SMS_287
is the target evaluation index of the c-th reference user under the
Figure SMS_291
-th target category.
Figure SMS_277
is the sequence number of the target category in the target category set. c is the sequence number of the reference user in the reference user set.
Figure SMS_283
is the first evaluation difference between the user to be recommended and the c-th reference user under the
Figure SMS_293
-th target category.
Figure SMS_297
is the second evaluation difference between the user to be recommended and the c-th reference user.
Figure SMS_295
and
Figure SMS_298
are positively correlated.
Figure SMS_280
is the first fit difference between the user to be recommended and the c-th reference user under the
Figure SMS_284
-th target category.
Figure SMS_288
is the second fit difference between the user to be recommended and the c-th reference user.
Figure SMS_292
and
Figure SMS_279
are positively correlated.
Figure SMS_281
and
Figure SMS_285
are both positively correlated with
Figure SMS_289.
Smaller values of
Figure SMS_300
and
Figure SMS_301
tend to indicate that the preferences of the user to be recommended and the c-th reference user under the
Figure SMS_302
-th target category are more similar. Therefore, a smaller
Figure SMS_303
tends to indicate that the overall preferences of the user to be recommended and the c-th reference user are more similar, and that the c-th reference user is more likely to be a similar user of the user to be recommended.
A sixth sub-step: screen out a similar user set from the reference user set according to the metric distance between the user to be recommended and each reference user in the reference user set.
For example, a KNN (K-nearest neighbor) algorithm may be applied to the metric distances between the user to be recommended and each reference user in the reference user set to obtain a neighbor set of the user to be recommended, and this neighbor set may be determined as the similar user set. K in the KNN algorithm may be 20.
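The six sub-steps above can be sketched as follows. The squared per-category differences come from the text; the way they are aggregated is an assumption (each second difference is taken as the sum of the corresponding first differences, and the metric distance as the square root of their total, which gives a Euclidean-style distance). The function names and the dictionary layout of `reference_users` are hypothetical.

```python
import math


def metric_distance(eval_u, eval_c, fit_u, fit_c):
    """Distance between the user to be recommended (u) and one reference user (c).

    Each argument is a list with one value per target category.
    """
    # First evaluation / fit differences: squared per-category differences.
    first_eval = [(a - b) ** 2 for a, b in zip(eval_u, eval_c)]
    first_fit = [(a - b) ** 2 for a, b in zip(fit_u, fit_c)]
    # Assumption: second differences are plain sums of the first differences.
    second_eval = sum(first_eval)
    second_fit = sum(first_fit)
    # Assumption: the distance grows with both second differences via a square root.
    return math.sqrt(second_eval + second_fit)


def similar_users(eval_u, fit_u, reference_users, k=20):
    """Return the ids of the k reference users nearest to the user to be recommended.

    reference_users: dict mapping user id -> (evaluation vector, fitness vector).
    """
    ranked = sorted(
        reference_users,
        key=lambda uid: metric_distance(
            eval_u, reference_users[uid][0], fit_u, reference_users[uid][1]
        ),
    )
    return ranked[:k]


if __name__ == "__main__":
    refs = {
        "a": ([1, 0], [0, 0]),
        "b": ([4, 0], [0, 4]),
        "c": ([1, 1], [0, 0]),
    }
    print(similar_users([1, 0], [0, 0], refs, k=2))
```

Sorting all reference users is sufficient for a sketch; for a large reference user set, a partial selection or a spatial index could replace the full sort.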
It should be noted that, the more comprehensive the acquired data in the reference user set and the historical search information set, the more accurate the screening of the similar user set is.
Optionally, obtaining a target prediction score corresponding to each candidate search term in the candidate search term set according to the target category corresponding to the target typing information may include the following steps:
First step: select the sub-search phrase corresponding to the candidate search term from the target search phrase.
The target search phrase may include all search terms of the user to be recommended and of each similar user in the similar user set corresponding to the user to be recommended under the target category corresponding to the target typing information. The target search phrase may contain identical search terms. The sub-search phrase corresponding to a candidate search term may include the candidate search term itself and the search terms in the target search phrase that are identical to the candidate search term.
Second step: for each search term in the sub-search phrase corresponding to the candidate search term, determine the semantic fitness of the target user who typed the search term, under the target category corresponding to the target typing information, as the target fitness corresponding to the search term.
Third step: for each search term in the sub-search phrase corresponding to the candidate search term, determine a first score corresponding to the search term according to the target fitness and the target probability corresponding to the search term.
The target fitness and the target probability may both be positively correlated with the first score.
Fourth step: determine the target prediction score corresponding to the candidate search term according to the first scores corresponding to the search terms in the sub-search phrase corresponding to the candidate search term.
The first score corresponding to each search term in the sub-search phrase may be positively correlated with the target prediction score.
For example, the formula for the target prediction score corresponding to a candidate search term may be:
Figure SMS_304
where
Figure SMS_306
is the target prediction score corresponding to the h-th candidate search term in the candidate search term set.
Figure SMS_310
is the semantic fitness, under the target category corresponding to the target typing information, of the target user who typed the y-th search term in the sub-search phrase corresponding to the h-th candidate search term, that is, the target fitness corresponding to the y-th search term.
Figure SMS_313
is the target probability corresponding to the y-th search term in the sub-search phrase corresponding to the h-th candidate search term.
Figure SMS_307
is the first score corresponding to the y-th search term in the sub-search phrase corresponding to the h-th candidate search term.
Figure SMS_309
and
Figure SMS_312
are both positively correlated with
Figure SMS_314.
Figure SMS_305
is the number of search terms in the sub-search phrase corresponding to the h-th candidate search term.
Figure SMS_308
and
Figure SMS_311
are positively correlated.
A larger
Figure SMS_315
tends to indicate that classifying the h-th candidate search term into the target category corresponding to the target typing information is more accurate. A larger
Figure SMS_316
tends to indicate that the semantic fitness, under the target category corresponding to the target typing information, of the target user who typed the y-th search term is greater. Therefore, a larger
Figure SMS_317
tends to indicate that the h-th candidate search term is more suitable for recommendation to the user to be recommended.
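The four steps above can be sketched as follows. The text fixes only the correlation directions, so two assumptions are made here: the first score is the product of the target fitness and the target probability, and the target prediction score is the sum of the first scores over the sub-search phrase. The function names and the input layout (a list of `(user, term)` pairs, a per-user fitness dictionary and a per-term probability dictionary) are hypothetical.

```python
def sub_search_phrase(candidate, target_search_phrase):
    """Entries of the target search phrase whose search term equals the candidate."""
    return [(user, term) for user, term in target_search_phrase if term == candidate]


def target_prediction_score(candidate, target_search_phrase, fitness, probability):
    """Score one candidate search term.

    target_search_phrase: list of (user id, search term); duplicates are allowed.
    fitness:     user id -> semantic fitness under the target category.
    probability: search term -> target probability (assumed keyed by term).
    """
    score = 0.0
    for user, term in sub_search_phrase(candidate, target_search_phrase):
        # Assumption: first score = target fitness * target probability.
        first_score = fitness[user] * probability[term]
        # Assumption: prediction score = sum of first scores.
        score += first_score
    return score


if __name__ == "__main__":
    phrase = [
        ("u1", "computer screen"),
        ("u2", "computer screen"),
        ("u2", "computer keyboard"),
    ]
    fitness = {"u1": 0.5, "u2": 1.0}
    probability = {"computer screen": 0.8, "computer keyboard": 0.6}
    print(target_prediction_score("computer screen", phrase, fitness, probability))
```

With a sum, a candidate typed by many well-fitting users scores higher than one typed by a single user, which matches the stated positive correlation between the number of search terms in the sub-search phrase and the score.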
Step S3: screen out a set of search terms to be recommended from the candidate search term set according to the target prediction scores corresponding to the candidate search terms.
In some embodiments, the set of search terms to be recommended may be screened out from the candidate search term set according to the target prediction scores corresponding to the candidate search terms.
Each search term in the set of search terms to be recommended may be a candidate search term selected for recommendation to the user to be recommended.
As an example, a preset number of candidate search terms with the highest target prediction scores may be selected from the candidate search term set as the search terms to be recommended, yielding the set of search terms to be recommended. The preset number may be set in advance; for example, it may be 10.
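Selecting the preset number of highest-scoring candidates can be sketched with the standard library. The function name and the dictionary input are hypothetical; the scores are assumed to have been computed beforehand.

```python
import heapq


def select_terms_to_recommend(prediction_scores, preset_number=10):
    """Return the preset number of candidate search terms with the highest scores.

    prediction_scores: dict mapping candidate search term -> target prediction score.
    The result is ordered from the highest score to the lowest.
    """
    return heapq.nlargest(preset_number, prediction_scores, key=prediction_scores.get)


if __name__ == "__main__":
    scores = {
        "computer screen": 1.2,
        "computer keyboard": 0.6,
        "computer battery": 0.9,
    }
    print(select_terms_to_recommend(scores, preset_number=2))
```

Because the result is already in descending order of score, it can be displayed below the search box without further sorting.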
Optionally, the target evaluation matrix may be used as the user evaluation matrix in a collaborative filtering algorithm, and the similar user set as the user neighbor set in the collaborative filtering algorithm. Based on the target evaluation matrix and the similar user set, a prediction score is calculated for each search term in the corresponding historical search information sets that the user has not typed. The prediction scores are then sorted in descending order according to a Top-N recommendation criterion, where N may be 20; that is, the 20 search terms with the highest prediction scores form a recommendation list, which is recommended to the user to be recommended.
For example, if the target typing information is "computer" and the preset number is 4, the set of search terms to be recommended may be { "computer screen", "XXX brand computer", "computer battery", "computer keyboard" }.
It should be noted that, the more comprehensive the data in the acquired reference user set and the historical search information set is, the more accurate the screening of the similar user set and the search word set to be recommended is.
Step S4: recommend the set of search terms to be recommended to the user to be recommended.
In some embodiments, the set of search terms to be recommended may be recommended to the user to be recommended.
As an example, web page technology may be used to display the search terms in the set of search terms to be recommended, in descending order, below the search box in which the user to be recommended is typing, so that the user to be recommended can conveniently select one of them, thereby recommending search terms to the user to be recommended.
In summary, the target category corresponding to the target typing information is determined first, which makes it possible to learn what type of content the user to be recommended wants and facilitates accurate recommendation later. Next, the candidate search term set includes the search terms of the user to be recommended and of each similar user in the corresponding similar user set under the target category corresponding to the target typing information. Compared with screening from the historical search terms of the user to be recommended, the search terms in the candidate search term set therefore better match the type of content the user wants and are not mixed across multiple types, so the content the user wants to search for can be screened out more easily. Moreover, because the candidate search term set contains not only the search terms typed by the user to be recommended but also those typed by each similar user in the similar user set, it is more comprehensive: even if the user to be recommended has never searched for content of this type, recommendations can still be made based on the search terms typed by the similar users under this type. For example, if the target category corresponding to the target typing information is a type that the user to be recommended has never searched, search terms can still be recommended from the search terms of the similar users under that target category, as included in the candidate search term set.
Then, a target prediction score corresponding to each candidate search term in the candidate search term set is obtained, which facilitates the subsequent screening of the set of search terms to be recommended from the candidate search term set. Finally, the set of search terms to be recommended is recommended to the user to be recommended, which realizes search term recommendation for the user and improves its accuracy. In addition, the similar user set is screened out from the reference user set based on the historical search information set corresponding to the user to be recommended and the historical search information set corresponding to each reference user in the reference user set, comprehensively considering the target relevance between search terms and the target evaluation matrix, which improves the accuracy of determining the similar user set and hence the accuracy of recommending search terms to the user to be recommended.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention and are intended to be included within the scope of the invention.

Claims (10)

1. A method for recommending user search terms based on typed search terms, comprising the steps of:
acquiring target typing information corresponding to a user to be recommended, and determining a target category corresponding to the target typing information;
obtaining a target prediction score corresponding to each candidate search word in a candidate search word set according to a target category corresponding to the target typing information, wherein the candidate search word set comprises: each similar user in the similar user set corresponding to the user to be recommended and search words of the user to be recommended under the target category corresponding to the target typing information;
screening a search word set to be recommended from the candidate search word set according to target prediction scores corresponding to the candidate search words;
recommending the search word set to be recommended to the user to be recommended;
determining the set of similar users comprises the steps of:
acquiring a historical search information set corresponding to a user to be recommended and a historical search information set corresponding to each reference user in a reference user set;
classifying search words included in all the obtained historical search information sets to obtain a target category set;
determining target association degrees between every two search words included in all obtained historical search information sets;
determining target evaluation indexes of each target user under each target category according to all the obtained historical search information sets, the target category sets and the target association degrees among the search words to obtain a target evaluation matrix, wherein the target users are users to be recommended or reference users;
and screening a similar user set from the reference user set according to all the obtained historical search information sets and the target evaluation matrix.
2. The method of claim 1, wherein the classifying the search terms included in all the obtained historical search information sets to obtain the target category set comprises:
inputting each search word included in all obtained historical search information sets into a pre-trained target classification network to obtain the probability that the search word belongs to each preset category in a preset category set, taking that probability as the category probability of the search word under the preset category, and thereby obtaining a category probability set corresponding to the search word;
for each search word included in all obtained historical search information sets, screening out the maximum category probability from the category probability set corresponding to the search word, taking the maximum category probability as the target probability corresponding to the search word, and determining the preset category corresponding to the target probability corresponding to the search word as the target category corresponding to the search word;
and combining the target categories corresponding to all the search words included in all the obtained historical search information sets into a target category set.
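As a non-limiting illustration of the steps of claim 2, the following sketch takes the category probabilities as given (the classification network itself is not modelled) and picks the most probable preset category for each search word. The function names and the sample probabilities are hypothetical.

```python
def assign_target_category(category_probs):
    """Return (target category, target probability) for one search word.

    `category_probs` maps each preset category to the category probability
    produced by the classifier; the classifier is outside this sketch.
    """
    category = max(category_probs, key=category_probs.get)
    return category, category_probs[category]


def build_target_category_set(probs_by_word):
    """Assign a target category to every word and collect the category set."""
    assigned = {
        word: assign_target_category(probs)
        for word, probs in probs_by_word.items()
    }
    target_categories = {category for category, _ in assigned.values()}
    return assigned, target_categories


# Hypothetical classifier output for two search words.
assigned, target_categories = build_target_category_set({
    "iphone case": {"electronics": 0.8, "clothing": 0.2},
    "wool scarf": {"electronics": 0.1, "clothing": 0.9},
})
```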
3. The method of claim 2, wherein determining the target association degree between every two search words included in all obtained historical search information sets comprises:
according to the target probabilities corresponding to the two search words, determining the two search words as a first search word and a second search word respectively;
determining a first difference between the first search word and the second search word according to a first quantity, a second quantity and a third quantity, wherein the first quantity is the quantity of the historical search information which comprises the first search word and does not comprise the second search word in all the historical search information sets, the second quantity is the quantity of the historical search information which comprises the second search word and does not comprise the first search word in all the historical search information sets, the third quantity is the quantity of the historical search information which comprises both the first search word and the second search word in all the historical search information sets, the first quantity and the second quantity are positively correlated with the first difference, and the third quantity is negatively correlated with the first difference;
determining the absolute value of the difference between a second probability and the target probability corresponding to the first search word as a second difference between the first search word and the second search word, wherein the second probability is the category probability of the second search word under the target category corresponding to the first search word;
determining a third difference between the first search word and the second search word according to the first difference and the second difference between the first search word and the second search word, wherein the first difference and the second difference are positively correlated with the third difference;
encoding the first search word and the second search word to obtain first encoded data corresponding to the first search word and second encoded data corresponding to the second search word;
determining an edit distance between the first encoded data and the second encoded data as a fourth difference between the first search term and the second search term;
and determining the target association degree between the first search word and the second search word according to a fourth difference and a third difference between the first search word and the second search word, wherein the fourth difference and the third difference are in negative correlation with the target association degree.
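As a non-limiting illustration of claim 3, the sketch below combines the four differences into an association degree. The claim only fixes the direction of each correlation, so every concrete formula here is an assumption: the first difference is taken as a Jaccard-style ratio, the third difference as a plain sum, the encoding as UTF-8 bytes, and the association degree as a reciprocal. Each piece of historical search information is modelled as a set of search words, which is likewise an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def association_degree(first, second, histories,
                       target_prob_first, prob_second_in_first_category):
    """Association degree between two search words (assumed formulas)."""
    n1 = sum(1 for h in histories if first in h and second not in h)
    n2 = sum(1 for h in histories if second in h and first not in h)
    n3 = sum(1 for h in histories if first in h and second in h)
    total = n1 + n2 + n3
    # Assumed form: grows with n1 and n2, shrinks as n3 grows.
    first_diff = (n1 + n2) / total if total else 1.0
    second_diff = abs(prob_second_in_first_category - target_prob_first)
    # Assumed form: any increasing combination would satisfy the claim.
    third_diff = first_diff + second_diff
    # Assumed encoding: UTF-8 bytes of each search word.
    fourth_diff = edit_distance(first.encode("utf-8"), second.encode("utf-8"))
    # Assumed form: decreasing in both the third and fourth differences.
    return 1.0 / (1.0 + third_diff + fourth_diff)


# Two words that always co-occur versus two words that never do.
together = [{"red shoes", "red boots"}] * 3
apart = [{"red shoes"}, {"red boots"}]
close = association_degree("red shoes", "red boots", together, 0.9, 0.9)
far = association_degree("red shoes", "red boots", apart, 0.9, 0.9)
```

Under these assumed forms, words that co-occur in the same historical search information receive a higher association degree than words that never co-occur, all else being equal.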
4. The method of claim 1, wherein determining the target evaluation index of each target user under each target category based on the obtained set of all historical search information, the target category set, and the target association degree between the search terms comprises:
determining the proportion of a fourth quantity in a fifth quantity as an initial evaluation index of the target user under the target category, wherein the fourth quantity is the number of search words typed by the target user under the target category that are included in all the historical search information sets, and the fifth quantity is the number of all search words typed by the target user that are included in all the historical search information sets;
determining a first association degree of the target user under the target category according to the target association degrees between each search word typed in by the target user and each search word in the target category, which are included in all historical search information sets, wherein the target association degree between each search word typed in by the target user and each search word in the target category is positively correlated with the first association degree;
determining a second association degree corresponding to the target category according to target association degrees between all search words typed in by all target users and all search words in the target category, wherein the target association degrees between all search words typed in by all target users and all search words in the target category are positively correlated with the second association degree;
determining the proportion of the first association degree of the target user under the target category in the second association degree corresponding to the target category as the third association degree of the target user under the target category;
determining a reference evaluation index corresponding to the target user according to the initial evaluation index of the target user under the target category in the target category set, wherein the initial evaluation index of the target user under the target category in the target category set is positively correlated with the reference evaluation index;
and determining a target evaluation index of the target user under the target category according to the reference evaluation index corresponding to the target user and a third association degree of the target user under the target category, wherein the reference evaluation index and the third association degree are positively correlated with the target evaluation index.
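As a non-limiting illustration of claim 4, the sketch below builds a target evaluation matrix. It assumes that the fourth quantity counts the target user's search words in the target category and the fifth quantity counts all of that user's search words. The aggregation formulas (sums for the first and second association degrees, a sum of squares for the reference evaluation index, and a product for the target evaluation index) are assumptions chosen only to respect the stated correlations. The names `user_words`, `word_category` and `assoc` are hypothetical.

```python
from collections import defaultdict


def target_evaluation_matrix(user_words, word_category, assoc):
    """Return {(user, category): target evaluation index}.

    user_words:    user -> list of search words typed by that user
    word_category: search word -> target category
    assoc:         function (word, word) -> target association degree
    """
    categories = sorted(set(word_category.values()))
    words_in = defaultdict(set)
    for word, category in word_category.items():
        words_in[category].add(word)

    initial, first = {}, {}
    for user, words in user_words.items():
        for c in categories:
            in_category = sum(1 for w in words if word_category[w] == c)
            # Initial evaluation index: share of the user's words in c.
            initial[user, c] = in_category / len(words) if words else 0.0
            # First association degree (assumed form: plain sum).
            first[user, c] = sum(assoc(w, v) for w in words for v in words_in[c])

    matrix = {}
    for user in user_words:
        # Reference evaluation index (assumed form: sum of squares).
        reference = sum(initial[user, c] ** 2 for c in categories)
        for c in categories:
            # Second association degree: total over all target users.
            second = sum(first[u, c] for u in user_words)
            third = first[user, c] / second if second else 0.0
            # Target evaluation index (assumed form: product).
            matrix[user, c] = reference * third
    return matrix


# Hypothetical data: two users, two categories, a toy association function.
matrix = target_evaluation_matrix(
    {"u1": ["a", "b"], "u2": ["c"]},
    {"a": "x", "b": "x", "c": "y"},
    lambda p, q: 1.0 if p == q else 0.5,
)
```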
5. The method of claim 2, wherein said screening a set of similar users from said set of reference users based on all of the resulting set of historical search information and said target evaluation matrix comprises:
determining the semantic fitness of each target user under each target category according to all the obtained historical search information sets to obtain a semantic fitness matrix;
and screening a similar user set from the reference user set according to the target evaluation matrix and the semantic fitness matrix.
6. The method of claim 5, wherein determining the semantic fitness of each target user under each target category according to all the obtained historical search information sets comprises:
determining target behavior fitness of the target user under the target category according to target behavior frequency corresponding to search words included in all the historical search information sets;
determining variances of target lengths corresponding to all search words typed by the target user under the target category, which are included in all the historical search information sets, as first semantic differences of the target user under the target category;
determining a second semantic difference corresponding to each search word typed in by the target user under the target category according to a modified word set corresponding to each search word typed in by the target user under the target category, which is included in all the historical search information sets;
determining a third semantic difference of the target user under the target category according to the second semantic differences corresponding to the search words typed in by the target user under the target category, wherein the second semantic differences corresponding to the search words typed in by the target user under the target category are positively correlated with the third semantic differences;
determining the semantic fitness of the target user under the target category according to the target behavior fitness, the first semantic difference and the third semantic difference of the target user under the target category, wherein the target behavior fitness is positively correlated with the semantic fitness, and the first semantic difference and the third semantic difference are negatively correlated with the semantic fitness.
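As a non-limiting illustration of the final step of claim 6, the sketch below combines a given target behavior fitness with the two semantic differences. It assumes that the semantic fitness increases with the behavior fitness and decreases with both semantic differences, that the "target length" of a search word is its character count, and that the third semantic difference is the sum of the second semantic differences. The quotient form is likewise an assumption.

```python
from statistics import pvariance


def semantic_fitness(behavior_fitness, words, second_semantic_diffs):
    """Semantic fitness of one target user under one target category.

    behavior_fitness:      target behavior fitness, computed elsewhere
    words:                 search words the user typed under the category
    second_semantic_diffs: one second semantic difference per search word
    """
    # First semantic difference: variance of word lengths
    # (character count is an assumed definition of "target length").
    first_sem = pvariance([len(w) for w in words]) if words else 0.0
    # Third semantic difference (assumed form: plain sum).
    third_sem = sum(second_semantic_diffs)
    # Assumed form: increasing in behavior fitness, decreasing in both
    # semantic differences.
    return behavior_fitness / (1.0 + first_sem + third_sem)


consistent = semantic_fitness(0.5, ["abc", "xyz"], [0.0, 0.0])
erratic = semantic_fitness(0.5, ["ab", "abcd"], [1.0])
```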
7. The method for recommending user search terms based on typed search terms according to claim 6, wherein the determining the target behavior fitness of the target user under the target category according to the target behavior frequency corresponding to the search terms included in all the historical search information sets comprises:
determining a first behavior difference of the target user under the target category according to target behavior frequency corresponding to each search word typed in by the target user under the target category and included in all the historical search information sets, wherein the target behavior frequency is positively correlated with the first behavior difference;
determining the variance of the target behavior frequency corresponding to all search words typed by the target user under the target category and included in all the historical search information sets as the second behavior difference of the target user under the target category;
determining a mean value of target behavior frequencies corresponding to all search words in all target categories in the target category set included in all historical search information sets as a reference behavior frequency;
determining the accumulated sum of differences of the target behavior frequency corresponding to each search word typed by the target user under the target category and the reference behavior frequency included in all the historical search information sets as a third behavior difference of the target user under the target category;
and determining the target behavior fitness of the target user under the target category according to the first behavior difference, the second behavior difference and the third behavior difference of the target user under the target category, wherein the first behavior difference, the second behavior difference and the third behavior difference are negatively correlated with the target behavior fitness.
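As a non-limiting illustration of claim 7, the sketch below derives a target behavior fitness from target behavior frequencies, which are taken as given numbers. The concrete forms are assumptions: the first behavior difference is a plain sum, the "differences" in the third behavior difference are treated as absolute differences, and the fitness is a reciprocal.

```python
from statistics import pvariance


def behavior_fitness(user_category_freqs, all_freqs):
    """Target behavior fitness of one user under one category.

    user_category_freqs: target behavior frequency of each search word the
                         user typed under the category
    all_freqs:           target behavior frequency of every search word in
                         every target category
    """
    if not user_category_freqs:
        return 0.0  # assumed convention for a user with no words here
    # First behavior difference (assumed form: plain sum).
    first = sum(user_category_freqs)
    # Second behavior difference: variance of the frequencies.
    second = pvariance(user_category_freqs)
    # Reference behavior frequency: mean over all words in all categories.
    reference = sum(all_freqs) / len(all_freqs)
    # Third behavior difference (assumed: sum of absolute differences).
    third = sum(abs(f - reference) for f in user_category_freqs)
    # Assumed form: decreasing in all three differences.
    return 1.0 / (1.0 + first + second + third)


steady = behavior_fitness([2, 2], [2, 2, 2])
uneven = behavior_fitness([1, 3], [2, 2, 2])
```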
8. The method of claim 6, wherein the determining the second semantic difference for each search term entered by the target user under the target category based on the set of modified terms for each search term entered by the target user under the target category included in all sets of historical search information comprises:
determining the difference between the search word and each modification word in the modification word set corresponding to the search word as a target difference between the search word and the modification word, thereby obtaining a target difference set corresponding to the search word;
and determining a second semantic difference corresponding to the search word according to the target difference set corresponding to the search word, wherein each target difference in the target difference set is positively correlated with the second semantic difference.
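As a non-limiting illustration of claim 8, the sketch below measures how far a search word is from each of its modification words and aggregates the results. The claim does not say how the difference is measured here, so using one minus the similarity ratio of Python's standard `difflib.SequenceMatcher` is an assumption, as is aggregating by a plain sum.

```python
from difflib import SequenceMatcher


def target_difference(word, modification_word):
    """Assumed difference measure: 1 minus the difflib similarity ratio."""
    return 1.0 - SequenceMatcher(None, word, modification_word).ratio()


def second_semantic_difference(word, modification_words):
    """Aggregate the target difference set (assumed form: plain sum)."""
    return sum(target_difference(word, m) for m in modification_words)


unchanged = second_semantic_difference("shoes", ["shoes"])
revised = second_semantic_difference("shoes", ["shoes", "boots"])
```

A search word whose modification words are all identical to it yields a second semantic difference of zero under this measure, and each further dissimilar modification word increases it.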
9. The method of claim 5, wherein the selecting a set of similar users from the set of reference users based on the target evaluation matrix and the semantic fitness matrix comprises:
for each reference user in the to-be-recommended user and the reference user set, determining the square of the difference value of the target evaluation indexes of the reference user and the to-be-recommended user in each target category, which is included in the target evaluation matrix, as a first evaluation difference between the to-be-recommended user and the reference user in the target category, and obtaining a first evaluation difference set between the to-be-recommended user and the reference user;
determining a second evaluation difference between the user to be recommended and each reference user according to a first evaluation difference set between the user to be recommended and each reference user, wherein the first evaluation difference in the first evaluation difference set is positively correlated with the second evaluation difference;
for each reference user in the user to be recommended and the reference user set, determining the square of the difference value of the semantic fitness of the reference user and the user to be recommended, which is included in the semantic fitness matrix, under each target category as a first fit difference between the user to be recommended and the reference user under the target category, and obtaining a first fit difference set between the user to be recommended and the reference user;
determining a second fit difference between the user to be recommended and each reference user according to a first fit difference set between the user to be recommended and each reference user, wherein the first fit difference in the first fit difference set and the second fit difference are positively correlated;
determining a measurement distance between the user to be recommended and each reference user according to a second evaluation difference and a second fit difference between the user to be recommended and each reference user, wherein the second evaluation difference and the second fit difference are positively correlated with the measurement distance;
and screening a similar user set from the reference user set according to the measurement distance between the user to be recommended and each reference user in the reference user set.
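As a non-limiting illustration of claim 9, the sketch below computes a measurement distance from the two matrices and keeps the nearest reference users. The squared per-category differences follow the claim; summing them, taking the square root of the total, and keeping the `k` nearest users are assumptions. Matrices are modelled as nested dictionaries, and all names are hypothetical.

```python
import math


def metric_distance(target, reference_user, eval_matrix, fit_matrix, categories):
    """Measurement distance between the target user and one reference user."""
    # Second evaluation difference (assumed form: sum of squared differences).
    eval_diff = sum(
        (eval_matrix[reference_user][c] - eval_matrix[target][c]) ** 2
        for c in categories
    )
    # Second fit difference (assumed form: sum of squared differences).
    fit_diff = sum(
        (fit_matrix[reference_user][c] - fit_matrix[target][c]) ** 2
        for c in categories
    )
    # Assumed form: increasing in both second differences.
    return math.sqrt(eval_diff + fit_diff)


def similar_users(target, reference_users, eval_matrix, fit_matrix, categories, k=2):
    """Keep the k reference users closest to the target (k is an assumption)."""
    ranked = sorted(
        reference_users,
        key=lambda u: metric_distance(target, u, eval_matrix, fit_matrix, categories),
    )
    return ranked[:k]


# Hypothetical matrices: user -> category -> value.
evaluation = {
    "t": {"x": 1.0, "y": 0.0},
    "a": {"x": 0.9, "y": 0.1},
    "b": {"x": 0.0, "y": 1.0},
}
fitness = evaluation  # reused only to keep the example short
nearest = similar_users("t", ["a", "b"], evaluation, fitness, ["x", "y"], k=1)
```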
10. The method for recommending user search terms based on typed search terms according to claim 5, wherein the obtaining a target prediction score corresponding to each candidate search term in the candidate search term set according to the target category corresponding to the target typed information comprises:
screening a sub-search phrase corresponding to the candidate search word from a target search phrase, wherein the target search phrase comprises: all search words, under the target category corresponding to the target typing information, of the user to be recommended and of each similar user in the similar user set corresponding to the user to be recommended;
for each search word in the sub-search phrase corresponding to the candidate search word, determining the semantic fitness of the target user who typed the search word, under the target category corresponding to the target typing information, as the target fitness corresponding to the search word;
for each search word in the sub-search phrase corresponding to the candidate search word, determining a first score corresponding to the search word according to the target fitness and the target probability corresponding to the search word, wherein the target fitness and the target probability are positively correlated with the first score;
and determining the target prediction score corresponding to the candidate search word according to the first scores corresponding to the search words in the sub-search phrase corresponding to the candidate search word, wherein the first scores corresponding to the search words in the sub-search phrase are positively correlated with the target prediction score.
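As a non-limiting illustration of claim 10, the sketch below accumulates a target prediction score per candidate search word. The claim does not state how the sub-search phrase is screened, so the sketch assumes it consists of every occurrence of the same search word typed by the user to be recommended or by a similar user. The first score is assumed to be the product of the typing user's semantic fitness and the word's target probability, and the prediction score is assumed to be the sum of first scores.

```python
from collections import defaultdict


def target_prediction_scores(entries, fitness, target_prob):
    """Return {candidate search word: target prediction score}.

    entries:     (user, search word) pairs under the target category
    fitness:     user -> semantic fitness under the target category
    target_prob: search word -> target probability
    """
    scores = defaultdict(float)
    for user, word in entries:
        # First score (assumed form: product of fitness and probability).
        first_score = fitness[user] * target_prob[word]
        # Prediction score (assumed form: sum over the sub-search phrase).
        scores[word] += first_score
    return dict(scores)


# Hypothetical data for the user to be recommended (u1) and a similar user (u2).
scores = target_prediction_scores(
    [("u1", "red shoes"), ("u2", "red shoes"), ("u2", "red boots")],
    {"u1": 0.5, "u2": 0.2},
    {"red shoes": 0.8, "red boots": 0.9},
)
```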
CN202310483388.4A 2023-05-04 2023-05-04 Method for recommending user search terms based on typing search terms Active CN116204688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310483388.4A CN116204688B (en) 2023-05-04 2023-05-04 Method for recommending user search terms based on typing search terms

Publications (2)

Publication Number Publication Date
CN116204688A true CN116204688A (en) 2023-06-02
CN116204688B CN116204688B (en) 2023-06-30

Family

ID=86517671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310483388.4A Active CN116204688B (en) 2023-05-04 2023-05-04 Method for recommending user search terms based on typing search terms

Country Status (1)

Country Link
CN (1) CN116204688B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160217142A1 (en) * 2013-09-29 2016-07-28 Peking University Founder Group Co., Ltd. Method and system of acquiring semantic information, keyword expansion and keyword search thereof
CN106022869A (en) * 2016-05-12 2016-10-12 北京邮电大学 Consumption object recommending method and consumption object recommending device
CN109635291A (en) * 2018-12-04 2019-04-16 重庆理工大学 A kind of recommended method of fusion score information and item contents based on coorinated training
CN110276009A (en) * 2019-06-20 2019-09-24 北京百度网讯科技有限公司 A kind of recommended method of associational word, device, electronic equipment and storage medium
CN113987159A (en) * 2021-11-11 2022-01-28 北京爱奇艺科技有限公司 Recommendation information determining method and device, electronic equipment and storage medium
CN114329055A (en) * 2021-12-27 2022-04-12 北京达佳互联信息技术有限公司 Search recommendation method and recommendation device, electronic device and storage medium
CN116089567A (en) * 2023-01-04 2023-05-09 浙江极氪智能科技有限公司 Recommendation method, device, equipment and storage medium for search keywords

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117474636A (en) * 2023-12-27 2024-01-30 广州宇中网络科技有限公司 Platform user recommendation method and system based on big data
CN117474636B (en) * 2023-12-27 2024-04-12 广州宇中网络科技有限公司 Platform user recommendation method and system based on big data

Also Published As

Publication number Publication date
CN116204688B (en) 2023-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant