CN116204688A - Method for recommending user search terms based on typing search terms

Info

Publication number: CN116204688A
Application number: CN202310483388.4A
Authority: CN
Inventors: 李志洁; 王鹏; 陈拉拉
Original assignee: Quantum Digital Technology Co ltd
Current assignee: Quantum Digital Technology Co ltd
Priority date: 2023-05-04
Filing date: 2023-05-04
Publication date: 2023-06-02
Anticipated expiration: 2043-05-04
Also published as: CN116204688B

Abstract

The invention relates to the technical field of electric digital data processing, and in particular to a method for recommending search words to a user based on typed search words. The method comprises: acquiring target typing information corresponding to a user to be recommended, and determining a target category corresponding to the target typing information; obtaining, according to the target category corresponding to the target typing information, a target prediction score corresponding to each candidate search word in a candidate search word set, wherein the candidate search word set comprises the search words, under the target category corresponding to the target typing information, of each similar user in the similar user set corresponding to the user to be recommended and of the user to be recommended; screening a search word set to be recommended from the candidate search word set according to the target prediction scores corresponding to the candidate search words; and recommending the search word set to be recommended to the user to be recommended. By performing data processing on the target typing information, the invention improves the accuracy of search word recommendation for the user, and it is applied to recommending search words to users.

Description

Method for recommending user search terms based on typing search terms

Technical Field

The invention relates to the technical field of electric digital data processing, in particular to a method for recommending user search words based on typing search words.

Background

With the development of science and technology, various types of electronic devices have entered people's daily life. To make these devices feel more intelligent, most of them now recommend related content according to the user's search words, where a search word is a term that the user inputs when searching for content in a search engine. To improve the user experience, search words are often recommended as soon as the user has input part of a search word; when the recommended search words contain the one the user needs, the user does not have to type the rest. Currently, search words are generally recommended to a user in the following manner: the recommended search words are determined based on the historical search words of the user.

However, when the above manner is adopted, there are often the following technical problems:

when the content that the user wants to search for belongs to a type that the user has never searched for before, it is often difficult to recommend search words accurately based on the user's historical search words, resulting in low accuracy of search word recommendation for the user.

Disclosure of Invention

The summary of the invention is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. The summary of the invention is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

In order to solve the technical problem of low accuracy of search word recommendation for users, the invention provides a method for recommending user search words based on typing the search words.

The invention provides a method for recommending user search words based on typing search words, which comprises the following steps:

acquiring target typing information corresponding to a user to be recommended, and determining a target category corresponding to the target typing information;

obtaining a target prediction score corresponding to each candidate search word in a candidate search word set according to the target category corresponding to the target typing information, wherein the candidate search word set comprises: the search words, under the target category corresponding to the target typing information, of each similar user in the similar user set corresponding to the user to be recommended and of the user to be recommended;

screening a search word set to be recommended from the candidate search word set according to target prediction scores corresponding to the candidate search words;

recommending the search word set to be recommended to the user to be recommended;

determining the set of similar users comprises the steps of:

acquiring a historical search information set corresponding to a user to be recommended and a historical search information set corresponding to each reference user in a reference user set;

classifying search words included in all the obtained historical search information sets to obtain a target category set;

determining target association degrees between every two search words included in all obtained historical search information sets;

determining target evaluation indexes of each target user under each target category according to all the obtained historical search information sets, the target category sets and the target association degrees among the search words to obtain a target evaluation matrix, wherein the target users are users to be recommended or reference users;

and screening a similar user set from the reference user set according to all the obtained historical search information sets and the target evaluation matrix.

Further, the classifying the search terms included in all the obtained historical search information sets to obtain a target category set includes:

inputting each search word included in all the obtained historical search information sets into a pre-trained target classification network to obtain the probability that the search word belongs to each preset category in a preset category set, taking that probability as the category probability of the search word under the preset category, and thereby obtaining a category probability set corresponding to the search word;

For each search word included in all obtained historical search information sets, screening out the maximum category probability from the category probability set corresponding to the search word, taking the maximum category probability as the target probability corresponding to the search word, and determining the preset category corresponding to the target probability corresponding to the search word as the target category corresponding to the search word;

and combining the target categories corresponding to all the search words included in all the obtained historical search information sets into a target category set.
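As a non-limiting illustration, the selection of the target probability and the target category described above can be sketched as follows. The function name, the dictionary layout and the sample probabilities are assumptions made for the example; in practice the probabilities would come from the trained target classification network.

```python
def target_category(category_probs):
    """Return (target category, target probability) for one search word.

    category_probs maps each preset category to the category probability
    produced by the classification network (hypothetical values below).
    """
    category = max(category_probs, key=category_probs.get)
    return category, category_probs[category]


# Hypothetical category probability sets for two search words.
probabilities = {
    "computer keyboard": {"computer": 0.90, "phone": 0.07, "pencil": 0.03},
    "phone battery": {"computer": 0.10, "phone": 0.85, "pencil": 0.05},
}

assigned = {word: target_category(p) for word, p in probabilities.items()}
# The target category set combines the target categories of all search words.
target_category_set = {category for category, _ in assigned.values()}
```

Only categories that are actually the most probable for at least one search word end up in the target category set, so the set may be smaller than the preset category set.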

Further, the determining the target association degree between every two search words included in all the obtained historical search information sets includes:

according to the target probabilities corresponding to the two search words, determining the two search words as a first search word and a second search word respectively;

determining a first difference between the first search word and the second search word according to a first quantity, a second quantity and a third quantity, wherein the first quantity is the quantity of the historical search information which comprises the first search word and does not comprise the second search word in all the historical search information sets, the second quantity is the quantity of the historical search information which comprises the second search word and does not comprise the first search word in all the historical search information sets, the third quantity is the quantity of the historical search information which comprises both the first search word and the second search word in all the historical search information sets, the first quantity and the second quantity are positively correlated with the first difference, and the third quantity is negatively correlated with the first difference;

determining the absolute value of the difference between a second probability and the target probability corresponding to the first search word as a second difference between the first search word and the second search word, wherein the second probability is the category probability of the second search word under the target category corresponding to the first search word;

determining a third difference between the first search word and the second search word according to the first difference and the second difference between the first search word and the second search word, wherein the first difference and the second difference are positively correlated with the third difference;

encoding the first search word and the second search word to obtain first encoded data corresponding to the first search word and second encoded data corresponding to the second search word;

determining an edit distance between the first encoded data and the second encoded data as a fourth difference between the first search term and the second search term;

and determining the target association degree between the first search word and the second search word according to a fourth difference and a third difference between the first search word and the second search word, wherein the fourth difference and the third difference are in negative correlation with the target association degree.
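As a non-limiting illustration, the four differences and the target association degree can be sketched as follows. The text above only fixes the direction of each correlation, so every concrete formula here is an assumption: a Jaccard-style ratio for the first difference, a plain sum for the third difference, the raw character string as the "encoded data", and the reciprocal form for the association degree. Each historical search information item is modelled as a set of search words.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def association(first, second, records, p_first_target, p_second_in_first_cat):
    """Target association degree between two search words (assumed formulas)."""
    n1 = sum(1 for r in records if first in r and second not in r)
    n2 = sum(1 for r in records if second in r and first not in r)
    n3 = sum(1 for r in records if first in r and second in r)
    total = n1 + n2 + n3
    # First difference: grows with n1 and n2, shrinks with n3.
    d1 = (n1 + n2) / total if total else 1.0
    # Second difference: gap between the two probabilities in the same category.
    d2 = abs(p_second_in_first_cat - p_first_target)
    d3 = d1 + d2                       # third difference (assumed: sum)
    d4 = edit_distance(first, second)  # fourth difference on the raw strings
    return 1.0 / (1.0 + d3 + d4)       # decreases as d3 or d4 grows


records = [{"phone battery", "phone charger"}, {"phone battery"}, {"pencil"}]
near = association("phone battery", "phone charger", records, 0.85, 0.80)
far = association("phone battery", "pencil", records, 0.85, 0.05)
```

With these assumptions, words that co-occur, share a category and are spelled similarly receive a higher association degree than unrelated words.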

Further, the determining the target evaluation index of each target user under each target category according to the obtained all historical search information sets, the target category sets and the target association degree between the search words comprises the following steps:

determining the ratio of a fourth quantity to a fifth quantity as an initial evaluation index of the target user under the target category, wherein the fourth quantity is the number of search words typed by the target user under the target category that are included in all the historical search information sets, and the fifth quantity is the number of search words typed by the target user that are included in all the historical search information sets;

determining a first relevance of the target user under the target category according to target relevance between each search word typed in by the target user and each search word in the target category, which are included by all historical search information sets, wherein the target relevance between each search word typed in by the target user and each search word in the target category is positively correlated with the first relevance;

determining a second association degree corresponding to the target category according to target association degrees between all search words typed in by all target users and all search words in the target category, wherein the target association degrees between all search words typed in by all target users and all search words in the target category are positively correlated with the second association degree;

determining the ratio of the first association degree of the target user under the target category to the second association degree corresponding to the target category as a third association degree of the target user under the target category;

determining a reference evaluation index corresponding to the target user according to the initial evaluation index of the target user under the target category in the target category set, wherein the initial evaluation index of the target user under the target category in the target category set is positively correlated with the reference evaluation index;

and determining a target evaluation index of the target user under the target category according to the reference evaluation index corresponding to the target user and a third association degree of the target user under the target category, wherein the reference evaluation index and the third association degree are positively correlated with the target evaluation index.
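As a non-limiting illustration, the construction of the target evaluation matrix can be sketched as follows. The text only states which quantities are positively correlated, so the concrete choices are assumptions: the initial evaluation index is read as the share of the user's search words that fall in the category, association degrees are aggregated by summation, the reference evaluation index is the sum of squared initial indexes, and the final index is a product. The `assoc` callable stands in for the target association degree and is hypothetical.

```python
def evaluation_matrix(user_words, word_category, categories, assoc):
    """Return {user: {category: target evaluation index}} (assumed formulas)."""
    words_in = {c: [w for w, wc in word_category.items() if wc == c]
                for c in categories}

    def first_association(user, category):
        # Summed association between the user's words and the category's words.
        return sum(assoc(w, v) for w in user_words[user] for v in words_in[category])

    matrix = {user: {} for user in user_words}
    for user, words in user_words.items():
        # Initial evaluation index: share of the user's words in each category.
        initial = {c: sum(1 for w in words if word_category[w] == c) / len(words)
                   for c in categories}
        reference = sum(v ** 2 for v in initial.values())  # assumed aggregation
        for c in categories:
            second = sum(first_association(u, c) for u in user_words)
            third = first_association(user, c) / second if second else 0.0
            matrix[user][c] = reference * third
    return matrix


user_words = {"u0": ["a", "b"], "u1": ["a"]}
word_category = {"a": "x", "b": "y"}
toy_assoc = lambda w, v: 1.0 if w == v else 0.5  # hypothetical association degree
matrix = evaluation_matrix(user_words, word_category, ["x", "y"], toy_assoc)
```

Each row of the resulting matrix describes one target user, and each column one target category.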

Further, the screening the similar user set from the reference user set according to the obtained all historical search information sets and the target evaluation matrix includes:

determining the semantic fitness of each target user under each target category according to all the obtained historical search information sets to obtain a semantic fitness matrix;

and screening a similar user set from the reference user set according to the target evaluation matrix and the semantic fitness matrix.

Further, the determining the semantic fitness of each target user under each target category according to all the obtained historical search information sets includes:

determining target behavior fitness of the target user under the target category according to target behavior frequency corresponding to search words included in all the historical search information sets;

determining variances of target lengths corresponding to all search words typed by the target user under the target category, which are included in all the historical search information sets, as first semantic differences of the target user under the target category;

determining a second semantic difference corresponding to each search word typed in by the target user under the target category according to a modified word set corresponding to each search word typed in by the target user under the target category, which is included in all the historical search information sets;

determining a third semantic difference of the target user under the target category according to the second semantic differences corresponding to the search words typed in by the target user under the target category, wherein the second semantic differences corresponding to the search words typed in by the target user under the target category are positively correlated with the third semantic differences;

determining the semantic fitness of the target user under the target category according to the target behavior fitness, the first semantic difference and the third semantic difference of the target user under the target category, wherein the target behavior fitness is positively correlated with the semantic fitness, and the first semantic difference and the third semantic difference are negatively correlated with the semantic fitness.
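As a non-limiting illustration, the semantic fitness of one target user under one target category can be sketched as follows. Several points are assumptions: the target length of a search word is taken to be its character count, the third semantic difference is the sum of the second semantic differences, and the quantities are combined as a ratio so that the behavior fitness raises the result while both semantic differences lower it.

```python
from statistics import pvariance


def semantic_fitness(behavior_fitness, words, second_semantic_diffs):
    """Semantic fitness of one user under one category (assumed formula).

    words: search words typed by the user under the category.
    second_semantic_diffs: one second semantic difference per search word.
    """
    # First semantic difference: variance of the search word lengths.
    first_diff = pvariance([len(w) for w in words]) if words else 0.0
    # Third semantic difference: grows with every second semantic difference.
    third_diff = sum(second_semantic_diffs)
    return behavior_fitness / (1.0 + first_diff + third_diff)


steady = semantic_fitness(0.8, ["abc", "abd"], [0.0, 0.0])
erratic = semantic_fitness(0.8, ["a", "abcdefghi"], [0.4, 0.6])
```

A user whose search words have stable lengths and are rarely modified keeps a fitness close to the behavior fitness, while an erratic user is scored lower.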

Further, the determining, according to the target behavior frequency corresponding to the search terms included in all the historical search information sets, the target behavior fitness of the target user under the target category includes:

determining a first behavior difference of the target user under the target category according to target behavior frequency corresponding to each search word typed in by the target user under the target category and included in all the historical search information sets, wherein the target behavior frequency is positively correlated with the first behavior difference;

determining the variance of the target behavior frequency corresponding to all search words typed by the target user under the target category and included in all the historical search information sets as the second behavior difference of the target user under the target category;

Determining a mean value of target behavior frequencies corresponding to all search words in all target categories in the target category set included in all historical search information sets as a reference behavior frequency;

determining the accumulated sum of differences of the target behavior frequency corresponding to each search word typed by the target user under the target category and the reference behavior frequency included in all the historical search information sets as a third behavior difference of the target user under the target category;

and determining the target behavior fitness of the target user under the target category according to the first behavior difference, the second behavior difference and the third behavior difference of the target user under the target category, wherein the first behavior difference, the second behavior difference and the third behavior difference are negatively correlated with the target behavior fitness.
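As a non-limiting illustration, the three behavior differences and the resulting target behavior fitness can be sketched as follows. The aggregation choices are assumptions: the first behavior difference is taken as the mean of the user's target behavior frequencies, the third behavior difference accumulates absolute differences from the reference behavior frequency, and the final value uses a reciprocal so that it falls as any of the three differences grows.

```python
from statistics import mean, pvariance


def behavior_fitness(user_freqs, all_freqs):
    """Target behavior fitness of one user under one category (assumed formula).

    user_freqs: target behavior frequencies of the user's words in the category.
    all_freqs:  target behavior frequencies of all words in all categories.
    """
    if not user_freqs:
        return 0.0
    first = mean(user_freqs)           # grows with each frequency (assumed: mean)
    second = pvariance(user_freqs)     # variance of the user's frequencies
    reference = mean(all_freqs)        # reference behavior frequency
    third = sum(abs(f - reference) for f in user_freqs)  # assumed: absolute gaps
    return 1.0 / (1.0 + first + second + third)


stable = behavior_fitness([2, 2], [2, 2, 2])
unstable = behavior_fitness([1, 5], [2, 2, 2])
```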

Further, the determining, according to the modified word set corresponding to each search word typed in by the target user under the target category and included in all the historical search information sets, the second semantic difference corresponding to each search word typed in by the target user under the target category includes:

determining the difference between the search word and each modification word in the modification word set corresponding to the search word as a target difference between the search word and that modification word, thereby obtaining a target difference set corresponding to the search word;

And determining a second semantic difference corresponding to the search word according to the target difference set corresponding to the search word, wherein each target difference in the target difference set is positively correlated with the second semantic difference.
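As a non-limiting illustration, the second semantic difference of a search word can be sketched as follows. The text does not specify how the difference between a search word and a modification word is measured, so the use of `difflib.SequenceMatcher` (one minus its similarity ratio) and the summation over the target difference set are both assumptions.

```python
from difflib import SequenceMatcher


def target_difference(word, modification_word):
    """Assumed difference measure: 0.0 for identical strings, up to 1.0."""
    return 1.0 - SequenceMatcher(None, word, modification_word).ratio()


def second_semantic_difference(word, modification_words):
    """Sum of the target differences, so each one raises the result."""
    return sum(target_difference(word, m) for m in modification_words)


unchanged = second_semantic_difference("laptop", ["laptop"])
changed = second_semantic_difference("laptop", ["laptop", "pencil"])
```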

Further, the screening the similar user set from the reference user set according to the target evaluation matrix and the semantic fitness matrix includes:

for the user to be recommended and each reference user in the reference user set, determining the square of the difference between the target evaluation indexes, included in the target evaluation matrix, of the reference user and of the user to be recommended under each target category as a first evaluation difference between the user to be recommended and the reference user under the target category, thereby obtaining a first evaluation difference set between the user to be recommended and the reference user;

determining a second evaluation difference between the user to be recommended and each reference user according to a first evaluation difference set between the user to be recommended and each reference user, wherein the first evaluation difference in the first evaluation difference set is positively correlated with the second evaluation difference;

for the user to be recommended and each reference user in the reference user set, determining the square of the difference between the semantic fitness values, included in the semantic fitness matrix, of the reference user and of the user to be recommended under each target category as a first fit difference between the user to be recommended and the reference user under the target category, thereby obtaining a first fit difference set between the user to be recommended and the reference user;

Determining a second fit difference between the user to be recommended and each reference user according to a first fit difference set between the user to be recommended and each reference user, wherein the first fit difference in the first fit difference set and the second fit difference are positively correlated;

determining a measurement distance between the user to be recommended and each reference user according to a second evaluation difference and a second fit difference between the user to be recommended and each reference user, wherein the second evaluation difference and the second fit difference are positively correlated with the measurement distance;

and screening a similar user set from the reference user set according to the measurement distance between the user to be recommended and each reference user in the reference user set.
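As a non-limiting illustration, the measurement distance and the screening of similar users can be sketched as follows. The squared differences per target category follow the text; summing them, taking the square root of their total, and keeping the `k` nearest reference users are assumptions, since the text only requires positive correlations and does not fix the screening rule.

```python
from math import sqrt


def metric_distance(eval_a, eval_b, fit_a, fit_b):
    """Measurement distance between two users (assumed Euclidean form)."""
    second_eval = sum((x - y) ** 2 for x, y in zip(eval_a, eval_b))
    second_fit = sum((x - y) ** 2 for x, y in zip(fit_a, fit_b))
    return sqrt(second_eval + second_fit)


def similar_users(target, references, evaluation, fitness, k):
    """Return the k reference users closest to the target user (assumed rule)."""
    distance = {
        u: metric_distance(evaluation[target], evaluation[u],
                           fitness[target], fitness[u])
        for u in references
    }
    return sorted(references, key=distance.get)[:k]


# Hypothetical rows of the target evaluation and semantic fitness matrices.
evaluation = {"t": [0.3, 0.4], "r1": [0.3, 0.5], "r2": [0.9, 0.0]}
fitness = {"t": [0.5, 0.5], "r1": [0.5, 0.4], "r2": [0.1, 0.9]}
nearest = similar_users("t", ["r1", "r2"], evaluation, fitness, k=1)
```

A distance threshold could be used instead of a fixed `k`; either choice keeps the reference users whose evaluation and fitness profiles are closest to those of the user to be recommended.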

Further, the obtaining, according to the target category corresponding to the target typing information, a target prediction score corresponding to each candidate search term in the candidate search term set includes:

screening the sub-search word group corresponding to the candidate search word from a target search word group, wherein the target search word group comprises: all search words, under the target category corresponding to the target typing information, of each similar user in the similar user set corresponding to the user to be recommended and of the user to be recommended;

For each search word in the sub-search word group corresponding to the candidate search word, determining the semantic fitness of a target user typing the search word under the target category corresponding to the target typing information as the target fitness corresponding to the search word;

for each search word in the sub-search word group corresponding to the candidate search word, determining a first score corresponding to the search word according to the target fitness and the target probability corresponding to the search word, wherein the target fitness and the target probability are positively correlated with the first score;

and determining target prediction scores corresponding to the candidate search words according to first scores corresponding to the search words in the sub-search word groups corresponding to the candidate search words, wherein the first scores corresponding to the search words in the sub-search word groups are positively correlated with the target prediction scores.
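As a non-limiting illustration, the target prediction scores can be sketched as follows. The sub-search word group of a candidate is read here as every occurrence of that candidate among the search words typed under the target category, the first score is assumed to be the product of the target fitness and the target probability, and the prediction score is assumed to be the sum of the first scores. All names and sample values are hypothetical.

```python
def prediction_scores(typed, fitness, target_prob):
    """Return {candidate search word: target prediction score}.

    typed:       (user, search word) pairs under the target category.
    fitness:     semantic fitness of each user under the target category.
    target_prob: target probability of each search word.
    """
    scores = {}
    for user, word in typed:
        first_score = fitness[user] * target_prob[word]   # assumed: product
        scores[word] = scores.get(word, 0.0) + first_score  # assumed: sum
    return scores


typed = [("u1", "phone battery"), ("u2", "phone battery"), ("u1", "phone case")]
fitness = {"u1": 0.5, "u2": 0.25}
target_prob = {"phone battery": 0.8, "phone case": 0.6}
scores = prediction_scores(typed, fitness, target_prob)
best = max(scores, key=scores.get)
```

Under these assumptions a candidate scores higher when more users typed it, when those users have a higher semantic fitness, and when it is classified into the target category with more confidence.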

The invention has the following beneficial effects:

According to the method for recommending user search words based on typed search words, data processing is performed on the target typing information, which solves the technical problem of low accuracy of search word recommendation for the user and improves that accuracy. First, determining the target category corresponding to the target typing information helps identify the type of content that the user to be recommended wants to know about, which facilitates accurate recommendation later. Second, the candidate search word set comprises the search words, under the target category corresponding to the target typing information, of each similar user in the similar user set corresponding to the user to be recommended and of the user to be recommended. Compared with screening directly from the historical search words of the user to be recommended, the search words in the candidate search word set therefore better match the type of content that the user wants to know about and do not mix multiple types together, so the content that the user wants to search for can be screened out more easily.
Third, the search words to be recommended are screened from the candidate search word set. Compared with screening only from the historical search words of the user to be recommended, the candidate search word set contains not only the search words typed by the user to be recommended but also the search words typed by each similar user in the similar user set, so it is more comprehensive. Even if the content that the user to be recommended wants to search for belongs to a type that the user has never searched for before, search words can still be recommended based on the search words typed under that type by the similar users in the similar user set. For example, if the target category corresponding to the target typing information is a type that the user to be recommended has not searched, search words can still be recommended from the search words, included in the candidate search word set, of each similar user under that target category. Then, a target prediction score corresponding to each candidate search word in the candidate search word set is obtained, which facilitates the subsequent screening of the search word set to be recommended from the candidate search word set. Finally, the search word set to be recommended is recommended to the user to be recommended, which realizes search word recommendation for that user and improves its accuracy.
In addition, the similar user set is screened from the reference user set based on the historical search information set corresponding to the user to be recommended and the historical search information set corresponding to each reference user in the reference user set, comprehensively considering the target association degrees between search words and the target evaluation matrix. This improves the accuracy of determining the similar user set, and thus the accuracy of search word recommendation for the user to be recommended.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a method of recommending user search terms based on typed search terms in accordance with the present invention;

FIG. 2 is a flow chart of steps for determining a set of similar users in accordance with the present invention.

Detailed Description

In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description is given below of the specific implementation, structure, features and effects of the technical solution according to the present invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

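The invention provides a method for recommending user search words based on typed search words, which comprises the following steps:

acquiring target typing information corresponding to a user to be recommended, and determining a target category corresponding to the target typing information;

obtaining a target prediction score corresponding to each candidate search word in the candidate search word set according to a target category corresponding to the target typing information;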

screening a search word set to be recommended from the candidate search word sets according to target prediction scores corresponding to the candidate search words;

and recommending the search word set to be recommended to the user to be recommended.

Each step is described in detail below:

Referring to FIG. 1, a flow diagram of some embodiments of a method of recommending user search terms based on typed search terms in accordance with the present invention is shown. The method for recommending user search words based on typed search words comprises the following steps:

step S1, target typing information corresponding to a user to be recommended is obtained, and a target category corresponding to the target typing information is determined.

In some embodiments, target typing information corresponding to a user to be recommended may be obtained, and a target category corresponding to the target typing information may be determined.

The user to be recommended may be a user to whom search words are to be recommended. A search word may be text information that is searched for. The target typing information may also be text information. Text information may be any information composed of text, for example, but not limited to: words, sentences, idioms, or combinations of words. The target typing information may be the content that the user to be recommended has already typed for the search. The target category corresponding to the target typing information may be the category to which the target typing information belongs.

It should be noted that, determining the target category corresponding to the target typing information can facilitate understanding of the content type that the user to be recommended wants to know, and can facilitate accurate recommendation.

As an example, this step may include the steps of:

first, target typing information corresponding to a user to be recommended is obtained.

For example, the content that the user to be recommended has already typed in the search box may be acquired as the target typing information.

For example, if the content that the user to be recommended has already typed in the search box is "computer", the target typing information is "computer". If the content that the user to be recommended has already typed in the search box is "mobile phone battery", the target typing information is "mobile phone battery".

And secondly, determining the target category corresponding to the target typing information.

For example, the target category corresponding to the target typing information may be determined through a pre-trained target classification network.

The target classification network may be a network for determining the category of text information. The target classification network may be a TextCNN network (Text Convolutional Neural Network, a convolutional neural network for text classification). The optimizer of the TextCNN network may be Adam.

Optionally, the training process of the target classification network may comprise the following steps:

First, a reference text information set and the category of each reference text information in the reference text information set are obtained.

The reference text information may be text information whose category is known.

And secondly, constructing a target classification network.

For example, a TextCNN network may be constructed as the target classification network before training.

Thirdly, taking the reference text information set as a training set of the target classification network, taking the category of each reference text information as a training label of the target classification network, and training the constructed target classification network to obtain the target classification network after training.

The loss function used in training the target classification network may be a cross-entropy loss function. The output of the target classification network may be the probability that the reference text information belongs to each preset category in a preset category set. A preset category may be a category set in advance. The preset category set may include the pre-labeled category of each reference text information in the reference text information set. The number of preset categories in the preset category set may be 100.

For example, the preset category set may include: a computer-related category, a mobile-phone-related category, and a pencil-related category. The computer-related category may include information related to computers. The mobile-phone-related category may include information related to mobile phones. The pencil-related category may include information related to pencils. If the reference text information is "computer keyboard", its category may be the computer-related category, and during training of the target classification network, the probabilities that the reference text information belongs to the computer-related category, the mobile-phone-related category and the pencil-related category may be obtained. When the reference text information is input into the trained target classification network, the maximum of the obtained probabilities may be the probability that the reference text information belongs to the computer-related category.
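As a non-limiting illustration, the cross-entropy loss for a single reference text can be sketched in plain Python as follows. The raw scores are hypothetical network outputs for the three example categories; a real implementation would rely on the loss provided by a deep learning framework and average it over a batch.

```python
import math


def softmax(logits):
    """Turn raw network scores into category probabilities."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, label_index):
    """Cross-entropy loss of one sample whose true category is label_index."""
    return -math.log(probabilities[label_index])


# Hypothetical scores for "computer keyboard" over
# (computer-related, mobile-phone-related, pencil-related).
probs = softmax([3.0, 0.5, -1.0])
loss_correct = cross_entropy(probs, 0)  # true label: computer-related
loss_wrong = cross_entropy(probs, 2)    # what a pencil-related label would cost
```

The loss is small when the network assigns a high probability to the labelled category, which is the behavior the training process rewards.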

Step S2, obtaining a target prediction score corresponding to each candidate search term in the candidate search term set according to the target category corresponding to the target typing information.

In some embodiments, a target prediction score corresponding to each candidate search term in the candidate search term set may be obtained according to the target category corresponding to the target typing information.

The candidate search term set may include the search terms, under the target category corresponding to the target typing information, of each similar user in the similar user set corresponding to the user to be recommended, together with the search terms of the user to be recommended under that target category. The candidate search term set may be obtained by de-duplicating these search terms. The similar users in the similar user set may be users whose preferences are similar to those of the user to be recommended. A user's search term under a given target category may be a search term typed by that user that belongs to the target category. The candidate search term set may be obtained by a crawler.
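The construction of the candidate search term set can be sketched as follows. This is only an illustration under assumed data shapes: the function name `build_candidate_set`, the `(term, category)` tuples, and the sample terms are hypothetical and do not come from the original text.

```python
def build_candidate_set(user_terms, similar_users_terms, target_category):
    """Collect de-duplicated search terms under one target category.

    user_terms: list of (term, category) pairs for the user to be recommended.
    similar_users_terms: dict mapping a similar user's id to such a list.
    Returns the terms in first-seen order.
    """
    candidates = []
    seen = set()
    sources = [user_terms] + list(similar_users_terms.values())
    for terms in sources:
        for term, category in terms:
            if category == target_category and term not in seen:
                seen.add(term)
                candidates.append(term)
    return candidates

# Hypothetical data: one user to be recommended and two similar users.
own_terms = [("laptop stand", "computer"), ("hb pencil", "pencil")]
similar_terms = {
    "u1": [("laptop stand", "computer"), ("gaming mouse", "computer")],
    "u2": [("phone case", "mobile phone")],
}
candidates = build_candidate_set(own_terms, similar_terms, "computer")
```

Keeping first-seen order is a design choice of the sketch; the text only requires that duplicates be removed.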

It should be noted that the larger the target prediction score corresponding to a candidate search term, the more that candidate search term should be recommended. Obtaining the target prediction score corresponding to each candidate search term in the candidate search term set therefore makes it convenient to subsequently screen out the search term set to be recommended.

Optionally, referring to fig. 2, determining the set of similar users may include the steps of:

Step 201, acquiring a historical search information set corresponding to the user to be recommended and a historical search information set corresponding to each reference user in a reference user set.

In some embodiments, a set of historical search information corresponding to the user to be recommended and a set of historical search information corresponding to each reference user in the set of reference users may be obtained.

The historical search information set corresponding to the user to be recommended may include the search term information typed by the user to be recommended at different times. The historical search information set corresponding to a reference user may include the search term information typed by that reference user at different times. A piece of historical search information may include: a search term, the typing time of the search term, the target behavior frequency corresponding to the search term, the target length corresponding to the search term, and the modification word set corresponding to the search term. The search term is the content that the user needs to search for; for example, it may be the content of the search box when the user clicks the search button. The typing time of a search term may be the time at which the search term was entered into the search box. The target behavior frequency corresponding to a search term may be the number of times the search content was modified before the user clicked the search button; for example, it may be equal to the number of modification words in the modification word set corresponding to the search term. The target length corresponding to a search term may be the number of words in the search term. The modification word set corresponding to a search term may include the content obtained each time the user modified the search content before correctly entering the search term in the search box.

For example, a piece of historical search information may include: "mobile phone wallpaper picture", "09:31:26 on April 24, 2023", 4, 6, {"hand set", "hand", "mobile phone wallpaper coating", "mobile phone wallpaper"}. Here "mobile phone wallpaper picture" is the search term, "09:31:26 on April 24, 2023" is the typing time of the search term, 4 is the target behavior frequency corresponding to the search term, and 6 is the target length corresponding to the search term (presumably counted on the search term as originally typed rather than on its English rendering). The modification word set corresponding to the search term is {"hand set", "hand", "mobile phone wallpaper coating", "mobile phone wallpaper"}, each element being a modification word produced by one modification of the search content. When typing the search term "mobile phone wallpaper picture", the user may first have mistyped "hand set"; deleting "set" is one modification and yields "hand". The user then continued typing and obtained "mobile phone wallpaper coating", which contains the erroneous text "coating"; deleting "coating" is another modification and yields "mobile phone wallpaper". No further input errors occurred after "mobile phone wallpaper", so the modification words obtained are "hand set", "hand", "mobile phone wallpaper coating", and "mobile phone wallpaper".

It should be noted that obtaining the historical search information set corresponding to the user to be recommended and the historical search information set corresponding to each reference user in the reference user set makes it convenient to subsequently judge the similarity between the user to be recommended and each reference user, and thus to screen out the similar user set from the reference user set.
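A piece of historical search information as described above can be represented by a small record type. The following is a minimal sketch; the class name `HistoricalSearchInfo` and its field names are hypothetical, and deriving the target behavior frequency from the size of the modification word set follows the "for example" relationship given in the text rather than a fixed rule.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class HistoricalSearchInfo:
    """One piece of historical search information (field names are illustrative)."""
    search_term: str
    typing_time: datetime
    target_length: int
    modification_words: list = field(default_factory=list)

    @property
    def target_behavior_frequency(self):
        # Example relationship from the text: the number of modifications
        # equals the number of modification words recorded.
        return len(self.modification_words)

# The example record from the description.
record = HistoricalSearchInfo(
    search_term="mobile phone wallpaper picture",
    typing_time=datetime(2023, 4, 24, 9, 31, 26),
    target_length=6,
    modification_words=[
        "hand set",
        "hand",
        "mobile phone wallpaper coating",
        "mobile phone wallpaper",
    ],
)
```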

As an example, crawler technology may be used to obtain the historical search information set corresponding to the user to be recommended and the historical search information set corresponding to each reference user in the reference user set. To avoid abnormal data arising during crawling, the data acquired by the crawler may be cleaned.

Step 202, classifying search words included in all obtained historical search information sets to obtain a target category set.

In some embodiments, the search terms included in all the obtained historical search information sets may be categorized to obtain a set of target categories.

The target category set may include the category of each search term included in all the historical search information sets.

It should be noted that, classifying the search terms included in all the obtained historical search information sets can facilitate subsequent analysis of the situation of each target user under each target category, and can facilitate subsequent screening of similar user sets from the reference user sets. The target user may be a user to be recommended or a reference user.

As an example, this step may include the steps of:

First step, each search term included in all the obtained historical search information sets is input into the pre-trained target classification network to obtain the probability that the search term belongs to each preset category in the preset category set; each such probability is taken as the category probability of the search term under that preset category, yielding the category probability set corresponding to the search term.

The category probability set corresponding to a search term may include the category probability of the search term under each preset category in the preset category set.

Second step, for each search term included in all the obtained historical search information sets, the maximum category probability is screened out from the category probability set corresponding to the search term and taken as the target probability corresponding to the search term, and the preset category corresponding to that target probability is determined as the target category corresponding to the search term.

The target category corresponding to a search term may be the category to which the search term belongs. The probability that the search term belongs to its target category may be the largest category probability in the category probability set corresponding to the search term.

Third step, the target categories corresponding to all the search terms included in all the obtained historical search information sets are combined into the target category set.
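The three steps above amount to taking, for each search term, the preset category with the largest category probability and then collecting the distinct categories. A minimal sketch follows; the function names and the probability values are hypothetical, and the category probabilities are assumed to have already been produced by the trained classification network.

```python
def assign_target_category(category_probs):
    """Return (target category, target probability) for one search term.

    category_probs maps each preset category to the category probability.
    On ties, the category encountered first in the mapping is chosen.
    """
    target_category = max(category_probs, key=category_probs.get)
    return target_category, category_probs[target_category]

def classify_terms(term_probs):
    """Assign a target category to every term and build the target category set."""
    per_term = {term: assign_target_category(p) for term, p in term_probs.items()}
    target_categories = {category for category, _ in per_term.values()}
    return per_term, target_categories

# Hypothetical category probability sets for two search terms.
term_probs = {
    "computer keyboard": {"computer": 0.9, "mobile phone": 0.07, "pencil": 0.03},
    "phone case": {"computer": 0.1, "mobile phone": 0.8, "pencil": 0.1},
}
per_term, target_categories = classify_terms(term_probs)
```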

Step 203, determining the target association degree between every two search words included in all the obtained historical search information sets.

In some embodiments, a target degree of association between every two search terms included in all of the resulting sets of historical search information may be determined.

Wherein, the target association degree between two search words can represent the association condition between the two search words.

As an example, this step may include the steps of:

First step, the two search terms are designated as a first search term and a second search term according to their corresponding target probabilities.

Of the two search terms, the one with the larger target probability may be designated as the first search term and the one with the smaller target probability as the second search term. When the two target probabilities are equal, the first and second search terms may be assigned at random.

Second step, a first difference between the first search term and the second search term is determined according to a first number, a second number, and a third number.

The first number may be the number of pieces of historical search information, in all the historical search information sets, that include the first search term and do not include the second search term. The second number may be the number of pieces of historical search information, in all the historical search information sets, that include the second search term and do not include the first search term. The third number may be the number of pieces of historical search information, in all the historical search information sets, that include both the first search term and the second search term. Both the first number and the second number may be positively correlated with the first difference. The third number may be negatively correlated with the first difference.

For example, if the first search term is "mobile phone" and the second search term is "battery", the search term "mobile phone screen" may be a search term including the first search term and excluding the second search term, and the history search information where the search term "mobile phone screen" is located may be history search information including the first search term and excluding the second search term. The search term "computer battery" may be a search term including the second search term excluding the first search term, and the history search information in which the search term "computer battery" is located may be history search information including the second search term excluding the first search term. The search term "mobile phone battery" may be a search term including both the first search term and the second search term, and the history search information in which the search term "mobile phone battery" is located may be history search information including both the first search term and the second search term.
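The three numbers in this example can be counted with a short sketch. It assumes, following the example, that a piece of historical search information "includes" a search term when that term occurs as a substring of the recorded search term; this reading, the function name `cooccurrence_counts`, and the sample data are assumptions made for illustration.

```python
def cooccurrence_counts(recorded_terms, first, second):
    """Count records containing only `first`, only `second`, or both.

    recorded_terms: the search term of each piece of historical search information.
    Returns (first number, second number, third number).
    """
    n1 = n2 = n3 = 0
    for term in recorded_terms:
        has_first = first in term    # substring containment (assumed reading)
        has_second = second in term
        if has_first and has_second:
            n3 += 1
        elif has_first:
            n1 += 1
        elif has_second:
            n2 += 1
    return n1, n2, n3

recorded_terms = [
    "mobile phone screen",   # first term only
    "computer battery",      # second term only
    "mobile phone battery",  # both terms
    "pencil case",           # neither term
]
counts = cooccurrence_counts(recorded_terms, "mobile phone", "battery")
```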

Third step, the absolute value of the difference between the second probability and the target probability corresponding to the first search term is determined as a second difference between the first search term and the second search term.

The second probability may be the category probability of the second search term under the target category corresponding to the first search term.

Fourth step, a third difference between the first search term and the second search term is determined according to the first difference and the second difference between the first search term and the second search term.

Both the first difference and the second difference may be positively correlated with the third difference.

Fifth step, the first search term and the second search term are encoded to obtain first encoded data corresponding to the first search term and second encoded data corresponding to the second search term.

For example, the first search term may be encoded according to the rules of UTF-8 (8-bit Universal Character Set/Unicode Transformation Format, a variable-length character encoding) to obtain the first encoded data, and the second search term may be encoded in the same way to obtain the second encoded data.

Sixth step, the edit distance between the first encoded data and the second encoded data is determined as a fourth difference between the first search term and the second search term.
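The fifth and sixth steps can be sketched as follows. The text says only "edit distance"; the sketch assumes the common Levenshtein definition (insertions, deletions, and substitutions each cost 1) and applies it to the UTF-8 bytes of the two search terms. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here: byte strings)."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,         # delete x
                current[j - 1] + 1,      # insert y
                previous[j - 1] + cost,  # substitute x with y (or match)
            ))
        previous = current
    return previous[-1]

def fourth_difference(first_term, second_term):
    """Edit distance between the UTF-8 encodings of two search terms."""
    return edit_distance(first_term.encode("utf-8"), second_term.encode("utf-8"))
```

Because the distance is taken over encoded bytes rather than characters, a single differing Chinese character contributes up to three units: `fourth_difference("手机", "手")` is 3, since "机" occupies three bytes in UTF-8.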

Seventh step, the target association degree between the first search term and the second search term is determined according to the fourth difference and the third difference between the first search term and the second search term.

Both the fourth difference and the third difference may be negatively correlated with the target association degree.

For example, the target association degree between the first search term and the second search term may be determined by a formula whose expression is not reproduced in this text and whose quantities are defined as follows.

$R$ is the target association degree between the first search term and the second search term. $D_1$ is the first difference between the first search term and the second search term. $N_1$ is the first number. $N_2$ is the second number. $N_3$ is the third number. $\max(N_1, N_2)$ is the maximum of $N_1$ and $N_2$. $\min(N_1, N_2)$ is the minimum of $N_1$ and $N_2$. If $\max(N_1, N_2)$ is $N_1$, then $\min(N_1, N_2)$ is $N_2$. If $\max(N_1, N_2)$ is $N_2$, then $\min(N_1, N_2)$ is $N_1$. $N_1$ and $N_2$ are both positively correlated with $D_1$. $N_3$ is negatively correlated with $D_1$. $M$ is the number of pieces of historical search information in all the obtained historical search information sets.

$p_1$ is the target probability corresponding to the first search term. $p_2$ is the second probability. $|p_2 - p_1|$ is the absolute value of $p_2 - p_1$. $D_2 = |p_2 - p_1|$ is the second difference between the first search term and the second search term. The exponential term in the formula is the natural constant $e$ raised to a power. $D_3$ is the third difference between the first search term and the second search term. $D_1$ and $D_2$ are both positively correlated with $D_3$.

$\varepsilon_1$ and $\varepsilon_2$ are preset factors greater than 0, mainly used to prevent a denominator from being 0. For example, $\varepsilon_1$ and $\varepsilon_2$ may both take 0.01. $L$ is the edit distance between the first encoded data corresponding to the first search term and the second encoded data corresponding to the second search term, i.e. the fourth difference between the first search term and the second search term. $L$ and $D_3$ are both negatively correlated with $R$.

The larger the third number, the more often the first and second search terms appear together, which tends to indicate that they are more likely to be terms of the same category and that the degree of association between them is higher. The larger the first number and the second number, the more often the first and second search terms appear separately, which tends to indicate that they are more likely not to be terms of the same category and that the degree of association between them is lower. Therefore, the larger the first difference, the lower the degree of association between the first and second search terms tends to be. The larger the second difference, the more likely the first and second search terms are not terms of the same category, and the lower the degree of association between them tends to be. Therefore, the larger the third difference, the lower the degree of association between the first and second search terms tends to be. Because the fourth difference is the edit distance between the first encoded data corresponding to the first search term and the second encoded data corresponding to the second search term, a larger fourth difference tends to indicate a greater difference between the two search terms and a lower degree of association between them. Therefore, the larger the third difference and the fourth difference, the lower the target association degree between the first and second search terms.

Step 204, determining the target evaluation index of each target user under each target category according to all the obtained historical search information sets, the target category set, and the target association degrees between search terms, to obtain a target evaluation matrix.

In some embodiments, the target evaluation index of each target user under each target category may be determined according to all the obtained historical search information sets, the target category sets and the target association degrees between the search words, so as to obtain a target evaluation matrix.

The target user may be a user to be recommended or a reference user. The target evaluation matrix may include: target evaluation indexes of each target user under each target category.

It should be noted that, the target evaluation index of the target user under the target category may represent the preference score of the target user for the target category, that is, may represent the preference degree of the target user for the target category.

As an example, this step may include the steps of:

First step, the ratio of a fourth number to a fifth number is determined as the initial evaluation index of the target user under the target category.

The fourth number may be the number of search terms, included in all the historical search information sets, that the target user typed under the target category. The fifth number may be the number of search terms, included in all the historical search information sets, that the target user typed.

For example, the formula corresponding to the initial evaluation index of the target user under the target category may be:

$$B_{ab} = \frac{N_{ab}}{N_a + \varepsilon}$$

where $B_{ab}$ is the initial evaluation index of the $a$-th target user under the $b$-th target category in the target category set. $N_{ab}$ is the number of search terms, included in all the historical search information sets, that the $a$-th target user typed under the $b$-th target category, i.e. the fourth number. $N_a$ is the number of search terms, included in all the historical search information sets, that the $a$-th target user typed, i.e. the fifth number. $\varepsilon$ is a preset factor that may take 0.01. $a$ is the sequence number of the target user. $b$ is the sequence number of the target category in the target category set.

The larger the fourth number, the more search terms the target user has typed under the target category, which tends to indicate that the target user may be more interested in the content of the target category and has a higher degree of preference for it. Because the fifth number is the total number of search terms typed by the target user, a larger initial evaluation index tends to indicate that the target user has typed more search terms under this target category relative to the other target categories, may be more interested in its content relative to the other target categories, and may have a higher degree of preference for it relative to the other target categories.
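The initial evaluation index is a per-category share of one user's search terms, which a short sketch makes concrete. The function name and sample data are hypothetical, and adding the small factor `epsilon` to the denominator is an assumption based on the stated value 0.01 and on the role such factors play elsewhere in the text.

```python
from collections import Counter

def initial_evaluation_indexes(term_categories, epsilon=0.01):
    """Share of one user's typed search terms that fall in each target category.

    term_categories: the target category of each search term typed by the user.
    epsilon: small positive factor (assumed to guard the denominator).
    """
    counts = Counter(term_categories)   # fourth number, per category
    total = len(term_categories)        # fifth number
    return {category: n / (total + epsilon) for category, n in counts.items()}

# Hypothetical user who typed two computer terms and one mobile phone term.
indexes = initial_evaluation_indexes(["computer", "computer", "mobile phone"])
```

With these inputs the computer category receives roughly twice the index of the mobile phone category, matching the intuition that more typed terms in a category signal a stronger preference.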

Second step, a first association degree of the target user under the target category is determined according to the target association degrees between each search term typed by the target user and each search term in the target category, as included in all the historical search information sets.

The target association degree between each search term typed by the target user and each search term in the target category may be positively correlated with the first association degree.

Third step, a second association degree corresponding to the target category is determined according to the target association degrees between each search term typed by all the target users and each search term in the target category, as included in all the historical search information sets.

The target association degree between each search term typed by all the target users and each search term in the target category may be positively correlated with the second association degree.

Fourth step, the ratio of the first association degree of the target user under the target category to the second association degree corresponding to the target category is determined as a third association degree of the target user under the target category.

Fifth step, a reference evaluation index corresponding to the target user is determined according to the initial evaluation indexes of the target user under the target categories in the target category set.

The initial evaluation index of the target user under each target category in the target category set may be positively correlated with the reference evaluation index.

For example, the mean of the initial evaluation indexes of the target user under all the target categories in the target category set may be determined as the reference evaluation index corresponding to the target user.

As another example, the largest of the initial evaluation indexes of the target user under the target categories in the target category set may be selected as the reference evaluation index corresponding to the target user.

Sixth step, the target evaluation index of the target user under the target category is determined according to the reference evaluation index corresponding to the target user and the third association degree of the target user under the target category.

Both the reference evaluation index and the third association degree may be positively correlated with the target evaluation index.
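The second to fifth steps can be sketched under explicit assumptions. The text requires only positive correlation for the first association degree, so plain summation over all term pairs is an assumed choice (the second association degree is described as an accumulated value); the `epsilon` guard in the denominator, the function names, and the sample values are likewise hypothetical.

```python
def pair_sum(relevance, user_terms, category_terms):
    """Accumulate target association degrees over all (typed term, category term) pairs."""
    return sum(relevance.get((u, c), 0.0) for u in user_terms for c in category_terms)

def third_association(relevance, terms_by_user, category_terms, user, epsilon=0.01):
    """One user's share of the association that all users have with a category."""
    first = pair_sum(relevance, terms_by_user[user], category_terms)
    second = sum(pair_sum(relevance, terms, category_terms)
                 for terms in terms_by_user.values())
    return first / (second + epsilon)  # epsilon guards against a zero denominator

def reference_index(initial_indexes, use_max=False):
    """Mean (default) or maximum of a user's initial evaluation indexes."""
    values = list(initial_indexes.values())
    return max(values) if use_max else sum(values) / len(values)

# Hypothetical association degrees between typed terms and category terms.
relevance = {("a1", "c1"): 0.5, ("a1", "c2"): 0.25, ("b1", "c1"): 0.25}
terms_by_user = {"A": ["a1"], "B": ["b1"]}
category_terms = ["c1", "c2"]
share_a = third_association(relevance, terms_by_user, category_terms, "A")
```

How the reference evaluation index and the third association degree are finally combined into the target evaluation index is not sketched here, since the text states only that both are positively correlated with it.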

For example, the target evaluation index of the target user under the target category may be determined by a formula whose expression is not reproduced in this text and whose quantities are defined as follows.

$E_{ab}$ is the target evaluation index of the $a$-th target user under the $b$-th target category in the target category set. $r_{ai,bj}$ is the target association degree between the $i$-th search term typed by the $a$-th target user and the $j$-th search term in the $b$-th target category, as included in all the historical search information sets. $G_{ab}$ is the first association degree of the $a$-th target user under the $b$-th target category. $N_a$ is the number of search terms, included in all the historical search information sets, typed by the $a$-th target user. $K_b$ is the number of search terms, included in all the historical search information sets, in the $b$-th target category. $r_{ai,bj}$ is positively correlated with $G_{ab}$. $n$ is the number of target users.

$H_b$ is the accumulated value of the target association degrees between each search term typed by all the target users and each search term in the $b$-th target category, as included in all the historical search information sets; it is the second association degree corresponding to the $b$-th target category. $r_{ai,bj}$ is positively correlated with $H_b$. $T_{ab}$ is the third association degree of the $a$-th target user under the $b$-th target category. $\varepsilon$ is a preset factor greater than 0, mainly used to prevent a denominator from being 0. For example, $\varepsilon$ may take 0.01. Consistent with these definitions, the association degrees may be written as:

$$G_{ab} = \sum_{i=1}^{N_a} \sum_{j=1}^{K_b} r_{ai,bj}, \qquad H_b = \sum_{a=1}^{n} G_{ab}, \qquad T_{ab} = \frac{G_{ab}}{H_b + \varepsilon}$$

$\bar{B}_a$ is the reference evaluation index corresponding to the $a$-th target user. The exponential term in the formula is the natural constant $e$ raised to a power and can realize normalization. $a$ is the sequence number of the target user. $b$ is the sequence number of the target category in the target category set. $i$ is the sequence number of a search term, included in all the historical search information sets, typed by the $a$-th target user. $j$ is the sequence number of a search term, included in all the historical search information sets, in the $b$-th target category.

It should be noted that, because the target association degree is taken between each search term typed by the target user and each search term in the target category, the first association degree can characterize the degree of association between the target user and the target category, and the second association degree can characterize the overall degree of association between all the target users and the target category. Therefore, the larger the third association degree, the greater the relative degree of association between the target user and the target category, which tends to indicate that more of the search terms typed by the target user fall in the target category, that the target user may be more interested in the content of the target category, and that the target user's degree of preference for the target category is higher. A larger reference evaluation index likewise tends to indicate a higher degree of preference of the target user for the target category. Therefore, the larger the quantity that combines the reference evaluation index and the third association degree, the higher the target user's degree of preference for the target category tends to be. In addition, the exponential term in the formula normalizes this quantity, which facilitates subsequent processing.

Step 205, screening out similar user sets from the reference user sets according to all the obtained historical search information sets and the target evaluation matrix.

In some embodiments, a set of similar users may be screened from the set of reference users based on all of the historical search information sets and the target evaluation matrix.

It should be noted that, by comprehensively considering all the obtained historical search information sets and the target evaluation matrix, the similar user sets are screened out from the reference user sets, and the accuracy of determining the similar user sets can be improved, so that the accuracy of recommending search words to the user to be recommended can be improved.

As an example, this step may include the steps of:

First step, the semantic fitness of each target user under each target category is determined according to all the obtained historical search information sets, yielding a semantic fitness matrix.

The semantic fitness matrix may include the semantic fitness of each target user under each target category.

For example, determining the semantic fitness of each target user under each target category may include the following sub-steps:

First sub-step, the target behavior fitness of the target user under the target category is determined according to the target behavior frequencies corresponding to the search terms included in all the historical search information sets.

For example, determining the target behavior fitness of a target user under a target category may include the following steps:

First, a first behavior difference of the target user under the target category is determined according to the target behavior frequency corresponding to each search term, included in all the historical search information sets, that the target user typed under the target category.

The target behavior frequency may be positively correlated with the first behavior difference.

For example, the mean of the target behavior frequencies corresponding to all the search terms, included in all the historical search information sets, that the target user typed under the target category may be determined as the first behavior difference of the target user under the target category.

As another example, the smallest of the target behavior frequencies corresponding to the search terms, included in all the historical search information sets, that the target user typed under the target category may be determined as the first behavior difference of the target user under the target category.

Next, the variance of the target behavior frequencies corresponding to all the search terms, included in all the historical search information sets, that the target user typed under the target category is determined as a second behavior difference of the target user under the target category.

Next, the mean of the target behavior frequencies corresponding to all the search terms under all the target categories in the target category set, as included in all the historical search information sets, is determined as a reference behavior frequency.

Next, the accumulated sum of the differences between the reference behavior frequency and the target behavior frequency corresponding to each search term, included in all the historical search information sets, that the target user typed under the target category is determined as a third behavior difference of the target user under the target category.

Finally, the target behavior fitness of the target user under the target category is determined according to the first behavior difference, the second behavior difference, and the third behavior difference of the target user under the target category.

The first behavior difference, the second behavior difference, and the third behavior difference may all be negatively correlated with the target behavior fitness.
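The three behavior differences can be computed with the standard library, as in the following sketch. Several choices here are assumptions not fixed by the text: the mean is used for the first behavior difference (the text also allows the minimum), the population variance is used for the second, and absolute differences are accumulated for the third (the text says only "differences"). The function name and sample frequencies are hypothetical.

```python
from statistics import mean, pvariance

def behavior_differences(user_category_freqs, all_freqs):
    """Return (first, second, third) behavior differences.

    user_category_freqs: target behavior frequencies of the search terms one
        target user typed under one target category.
    all_freqs: target behavior frequencies of all search terms in all categories.
    """
    first = mean(user_category_freqs)        # mean variant of the first difference
    second = pvariance(user_category_freqs)  # population variance (assumed)
    reference = mean(all_freqs)              # reference behavior frequency
    third = sum(abs(f - reference) for f in user_category_freqs)  # absolute value assumed
    return first, second, third

# Hypothetical frequencies: the user modified two terms 2 and 4 times.
first, second, third = behavior_differences([2, 4], [2, 4, 3, 3])
```

Larger values of any of the three quantities would then lower the target behavior fitness, in line with the negative correlation stated above.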

For example, the target behavior fitness of the target user under the target category may be determined by a formula whose expression is not reproduced in this text and whose quantities are defined as follows.

$F_{ab}$ is the target behavior fitness of the $a$-th target user under the $b$-th target category in the target category set. $u_{ab}$ is the first behavior difference of the $a$-th target user under the $b$-th target category. $v_{ab}$ is the second behavior difference of the $a$-th target user under the $b$-th target category, i.e. the variance of the target behavior frequencies corresponding to all the search terms, included in all the historical search information sets, that the $a$-th target user typed under the $b$-th target category. $t$ is the reference behavior frequency. $c_{abf}$ is the target behavior frequency corresponding to the $f$-th search term, included in all the historical search information sets, that the $a$-th target user typed under the $b$-th target category. $N_{ab}$ is the number of search terms, included in all the historical search information sets, that the $a$-th target user typed under the $b$-th target category. $w_{ab}$ is the third behavior difference of the $a$-th target user under the $b$-th target category.

The preset factors appearing in the formula may all take 0.01. $z_{ab}$ is the fourth behavior difference of the $a$-th target user under the $b$-th target category, and the formula realizes the normalization of $z_{ab}$. $u_{ab}$, $v_{ab}$, and $w_{ab}$ may all be negatively correlated with $F_{ab}$. $a$ is the sequence number of the target user. $b$ is the sequence number of the target category in the target category set. $f$ is the sequence number of a search term, included in all the historical search information sets, that the $a$-th target user typed under the $b$-th target category.

It should be noted that the greater the target behavior frequency corresponding to a search word, the more often the target user modified that search word while typing it. This tends to indicate that the target user is less familiar with the search word and with its target category, and hence that the target user's behavior compliance with that target category is lower. Because the target behavior frequency is positively correlated with the first behavior difference, a larger first behavior difference tends to indicate a lower behavior compliance of the target user with the target category. A larger second behavior difference tends to indicate that the target behavior frequencies corresponding to the search words typed by the target user under the target category are more scattered, and therefore that the target user's familiarity with the target category and behavior habits under it are less stable. A larger target behavior frequency for the f-th search word tends to indicate that the f-th search word was modified more often. A larger third behavior difference tends to indicate that the target user modified search words under the target category more often while typing them, and therefore that the target user's familiarity with the target category and behavior compliance with it are lower. Accordingly, a larger target behavior compliance tends to indicate that the target user is more familiar with the target category and complies with it more closely. In addition, the normalization applied in the formula facilitates subsequent processing.

In a second sub-step, the variance of the target lengths corresponding to all the search words typed by the target user under the target category, as included in all the historical search information sets, is determined as the first semantic difference of the target user under the target category.

In a third sub-step, the second semantic difference corresponding to each search word typed by the target user under the target category is determined according to the modification word set corresponding to that search word, as included in all the historical search information sets.

For example, determining the second semantic difference corresponding to each search term typed by each target user under the target category may include the steps of:

First, the difference between the search word and each modification word in the modification word set corresponding to the search word is determined as the target difference between the search word and that modification word, yielding the target difference set corresponding to the search word.

The target difference between the search word and a modification word may characterize how much the two differ. The target difference set corresponding to the search word may include the target difference between the search word and each modification word in the modification word set corresponding to the search word.

For example, the search term may be encoded using the encoding rules of UTF-8 to obtain the first data. The modification word may be encoded using the encoding rule of UTF-8 to obtain the second data. The edit distance between the first data and the second data may be used as a target difference between the search term and the modification term.
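The UTF-8 encoding and edit distance described above can be sketched as follows. The text does not say which edit distance is meant, so the Levenshtein distance (insertions, deletions and substitutions, each costing 1) is assumed here, and the function names are illustrative only.

```python
def edit_distance(first: bytes, second: bytes) -> int:
    """Levenshtein distance between two byte strings (assumed variant of edit distance)."""
    previous = list(range(len(second) + 1))
    for i, x in enumerate(first, start=1):
        current = [i]
        for j, y in enumerate(second, start=1):
            substitution = previous[j - 1] + (0 if x == y else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def target_difference(search_word: str, modification_word: str) -> int:
    """Edit distance between the UTF-8 encodings of a search word and a modification word."""
    first_data = search_word.encode("utf-8")
    second_data = modification_word.encode("utf-8")
    return edit_distance(first_data, second_data)
```

Because the distance is computed on bytes rather than characters, replacing one multi-byte character can contribute more than 1 to the target difference.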

Then, the second semantic difference corresponding to the search word is determined according to the target difference set corresponding to the search word.

Wherein each target difference in the set of target differences may be positively correlated with the second semantic difference.

In a fourth sub-step, the third semantic difference of the target user under the target category is determined according to the second semantic differences corresponding to the search words typed by the target user under the target category.

The second semantic difference corresponding to each search term typed by the target user in the target category may be positively correlated with the third semantic difference.
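One way to satisfy the stated positive correlations is to average at each level. The sketch below does that, but the use of the mean and the handling of empty sets are assumptions made for illustration, not the formula of the method.

```python
def second_semantic_difference(target_differences):
    """Aggregate one search word's target differences (the mean is an assumed choice)."""
    if not target_differences:
        return 0.0  # assumption: a search word with no modification words contributes 0
    return sum(target_differences) / len(target_differences)


def third_semantic_difference(second_differences):
    """Aggregate the second semantic differences of all search words (mean, assumed)."""
    if not second_differences:
        return 0.0
    return sum(second_differences) / len(second_differences)


# Hypothetical target difference sets for three search words typed under one category.
target_difference_sets = [[2, 4], [], [3]]
seconds = [second_semantic_difference(s) for s in target_difference_sets]
third = third_semantic_difference(seconds)
```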

In a fifth sub-step, the semantic fitness of the target user under the target category is determined according to the target behavior compliance, the first semantic difference and the third semantic difference of the target user under the target category.

The target behavior compliance may be positively correlated with the semantic fitness. Both the first semantic difference and the third semantic difference may be negatively correlated with the semantic fitness.

For example, the semantic fitness of the target user under the target category may be determined by a formula (not reproduced here) whose quantities are as follows.

The result of the formula is the semantic fitness of a given target user under a given target category in the target category set.

The target behavior compliance is that of the target user under the target category.

The first semantic difference of the target user under the target category is the variance of the target lengths corresponding to all the search words typed by the target user under the target category, as included in all the historical search information sets.

The third semantic difference is that of the target user under the target category.

The preset constant in the formula may take the value 0.01.

The target difference used in the formula is the one between the f-th search word typed by the target user under the target category and the b-th modification word in the modification word set corresponding to the f-th search word. The counts used are the number of modification words in that modification word set and the number of search words typed by the target user under the target category.

The second semantic difference is the one corresponding to the f-th search word typed by the target user under the target category, and it is positively correlated with the third semantic difference.

The indices are the sequence number of the target user, f, the sequence number of the search word typed by the target user under the target category, and b, the sequence number of the modification word in the modification word set corresponding to the f-th search word.

A larger third semantic difference tends to indicate that the search words typed by the target user under the target category differ more from the modification words in their corresponding modification word sets, that those search words have more modification words, and that they were modified more often, and therefore that the semantic fitness of the target user for the target category is lower. A larger first semantic difference tends to indicate that the target lengths corresponding to the search words typed by the target user under the target category are more scattered, that is, that the lengths of those search words differ more, and therefore that the semantic fitness of the target user for the target category is lower. Because a larger target behavior compliance tends to indicate that the target user complies more closely with the target category, a larger value of the semantic fitness tends to indicate that the target user fits the target category better semantically.
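The stated correlations (positive with the target behavior compliance, negative with the first and third semantic differences) can be illustrated with a simple ratio. The functional form below is an assumption chosen only because it respects those correlations; the constant 0.01 follows the value mentioned in the text, here assumed to keep the denominator away from zero.

```python
def semantic_fitness(behavior_compliance, first_semantic_difference,
                     third_semantic_difference, eps=0.01):
    """Assumed form: compliance divided by the sum of the semantic differences plus eps.

    Inputs are assumed non-negative (a variance and averaged edit distances).
    """
    return behavior_compliance / (first_semantic_difference + third_semantic_difference + eps)
```

With this form, raising the behavior compliance raises the fitness, while raising either semantic difference lowers it, which is all the text requires.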

In the second step, a similar user set is screened out from the reference user set according to the target evaluation matrix and the semantic fitness matrix.

For example, screening the set of similar users from the set of reference users described above may include the sub-steps of:

In a first sub-step, for the user to be recommended and each reference user in the reference user set, the square of the difference between the target evaluation indexes of the reference user and the user to be recommended under each target category, as included in the target evaluation matrix, is determined as the first evaluation difference between the user to be recommended and the reference user under that target category, yielding the first evaluation difference set between the user to be recommended and the reference user.

The first evaluation difference set between the user to be recommended and the reference user may include the first evaluation difference between the user to be recommended and the reference user under each target category.

In a second sub-step, the second evaluation difference between the user to be recommended and each reference user is determined according to the first evaluation difference set between the user to be recommended and that reference user.

Each first evaluation difference in the first evaluation difference set may be positively correlated with the second evaluation difference.

In a third sub-step, for the user to be recommended and each reference user in the reference user set, the square of the difference between the semantic fitness values of the reference user and the user to be recommended under each target category, as included in the semantic fitness matrix, is determined as the first fit difference between the user to be recommended and the reference user under that target category, yielding the first fit difference set between the user to be recommended and the reference user.

The first fit difference set between the user to be recommended and the reference user may include the first fit difference between the user to be recommended and the reference user under each target category.

In a fourth sub-step, the second fit difference between the user to be recommended and each reference user is determined according to the first fit difference set between the user to be recommended and that reference user.

Each first fit difference in the first fit difference set may be positively correlated with the second fit difference.

In a fifth sub-step, the metric distance between the user to be recommended and each reference user is determined according to the second evaluation difference and the second fit difference between the user to be recommended and that reference user.

The second evaluation difference and the second fit difference may both be positively correlated with the metric distance.

For example, the metric distance between the user to be recommended and a reference user may be determined by a formula (not reproduced here) whose quantities are as follows.

The result of the formula is the metric distance between the user to be recommended and the c-th reference user in the reference user set.

G is the number of target categories in the target category set.

The semantic fitness values used are those of the user to be recommended and of the c-th reference user under each target category in the target category set.

The target evaluation indexes used are those of the user to be recommended and of the c-th reference user under each target category.

The first evaluation difference between the user to be recommended and the c-th reference user under a target category is positively correlated with the second evaluation difference between the user to be recommended and the c-th reference user.

The first fit difference between the user to be recommended and the c-th reference user under a target category is positively correlated with the second fit difference between the user to be recommended and the c-th reference user.

The second evaluation difference and the second fit difference are both positively correlated with the metric distance.

The indices are the sequence number of the target category in the target category set and c, the sequence number of the reference user in the reference user set.

The smaller the first evaluation difference and the first fit difference, the more similar the preferences of the user to be recommended and the c-th reference user under the target category tend to be. Therefore, the smaller the metric distance, the more similar the overall preferences of the user to be recommended and the c-th reference user tend to be, and the more likely the c-th reference user is a similar user of the user to be recommended.
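The sub-steps above can be sketched as a Euclidean-style distance over the two matrices. The squared per-category differences follow the text; summing them over the categories and taking the square root of the combined total are assumptions that merely respect the stated positive correlations.

```python
import math


def metric_distance(eval_target, eval_reference, fit_target, fit_reference):
    """Distance between the user to be recommended and one reference user.

    Each argument is a list with one value per target category
    (target evaluation indexes and semantic fitness values respectively).
    """
    if not (len(eval_target) == len(eval_reference) == len(fit_target) == len(fit_reference)):
        raise ValueError("all inputs must cover the same target categories")
    # First evaluation / fit differences: squared difference per category (from the text).
    first_eval = [(p - q) ** 2 for p, q in zip(eval_target, eval_reference)]
    first_fit = [(p - q) ** 2 for p, q in zip(fit_target, fit_reference)]
    # Second differences: plain sums over the categories (assumed aggregation).
    second_eval = sum(first_eval)
    second_fit = sum(first_fit)
    # Combined distance: square root of the total (assumed combination).
    return math.sqrt(second_eval + second_fit)
```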

In a sixth sub-step, the similar user set is screened out from the reference user set according to the metric distance between the user to be recommended and each reference user in the reference user set.

For example, a neighbor set of the user to be recommended may be obtained by using a KNN (K-nearest neighbor) algorithm according to a metric distance between the user to be recommended and each reference user in the reference user set, and the neighbor set of the user to be recommended is determined as a similar user set. Wherein K in the KNN algorithm may be 20.
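Given precomputed metric distances, the neighbor selection reduces to taking the K reference users with the smallest distances. A minimal sketch follows; the dictionary layout and function name are hypothetical.

```python
import heapq


def similar_user_set(distances, k=20):
    """Return the k reference users closest to the user to be recommended.

    distances maps a reference user identifier to its metric distance.
    """
    nearest = heapq.nsmallest(k, distances.items(), key=lambda item: item[1])
    return [user for user, _ in nearest]
```

If fewer than K reference users exist, all of them are returned, ordered from nearest to farthest.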

It should be noted that, the more comprehensive the acquired data in the reference user set and the historical search information set, the more accurate the screening of the similar user set is.

Optionally, according to the target category corresponding to the target typing information, obtaining a target prediction score corresponding to each candidate search term in the candidate search term set may include the following steps:

In the first step, the sub-search phrase corresponding to the candidate search word is selected from the target search phrase.

The target search phrase may include all search words typed, under the target category corresponding to the target typing information, by the user to be recommended and by each similar user in the similar user set corresponding to the user to be recommended. The target search phrase may contain identical search words. The sub-search phrase corresponding to the candidate search word may include the search words in the target search phrase that are identical to the candidate search word.

In the second step, for each search word in the sub-search phrase corresponding to the candidate search word, the semantic fitness, under the target category corresponding to the target typing information, of the target user who typed the search word is determined as the target fitness corresponding to the search word.

In the third step, for each search word in the sub-search phrase corresponding to the candidate search word, the first score corresponding to the search word is determined according to the target fitness and the target probability corresponding to the search word.

The target fitness and the target probability may both be positively correlated with the first score.

In the fourth step, the target prediction score corresponding to the candidate search word is determined according to the first scores corresponding to the search words in the sub-search phrase corresponding to the candidate search word.

The first score corresponding to each search word in the sub-search phrase may be positively correlated with the target prediction score.

For example, the target prediction score corresponding to a candidate search word may be determined by a formula (not reproduced here) whose quantities are as follows.

The result of the formula is the target prediction score corresponding to the h-th candidate search word in the candidate search word set.

The target fitness corresponding to the y-th search word in the sub-search phrase corresponding to the h-th candidate search word is the semantic fitness, under the target category corresponding to the target typing information, of the target user who typed that search word.

The target probability and the first score used are those corresponding to the y-th search word in the sub-search phrase corresponding to the h-th candidate search word. The target fitness and the target probability are both positively correlated with the first score.

The count used is the number of search words in the sub-search phrase corresponding to the h-th candidate search word.

The first score is positively correlated with the target prediction score.

A larger target probability tends to indicate that the h-th candidate search word is more accurately classified into the target category corresponding to the target typing information. A larger target fitness tends to indicate a greater semantic fitness, under the target category corresponding to the target typing information, of the target user who typed the y-th search word. Therefore, a larger target prediction score tends to indicate that the h-th candidate search word is more suitable for recommendation to the user to be recommended.
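The four steps above can be sketched as a single pass over the target search phrase. Taking the first score as the product of target fitness and target probability, and the target prediction score as the sum of first scores over identical search words, are assumptions consistent with the stated positive correlations; the input layout is hypothetical.

```python
from collections import defaultdict


def target_prediction_scores(target_search_phrase, target_probability):
    """Score every distinct search word in the target search phrase.

    target_search_phrase: list of (search_word, target_fitness) pairs, one per
        occurrence, where target_fitness is the semantic fitness of the user who typed it.
    target_probability: mapping from search word to its target probability.
    """
    scores = defaultdict(float)
    for word, fitness in target_search_phrase:
        first_score = fitness * target_probability[word]  # assumed: product
        scores[word] += first_score                       # assumed: sum over the sub-phrase
    return dict(scores)


# Hypothetical example: "computer screen" was typed by two users, "computer battery" by one.
phrase = [("computer screen", 0.5), ("computer battery", 0.9), ("computer screen", 1.0)]
probability = {"computer screen": 0.8, "computer battery": 0.5}
scores = target_prediction_scores(phrase, probability)
```

Under these assumptions, a search word typed by several well-fitting users accumulates a higher score than one typed once.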

Step S3, a search word set to be recommended is screened out from the candidate search word set according to the target prediction scores corresponding to the candidate search words.

In some embodiments, the set of search terms to be recommended may be selected from the set of candidate search terms according to a target prediction score corresponding to the candidate search terms.

Each search word to be recommended in the search word set to be recommended may be a search word that is to be recommended to the user to be recommended.

As an example, a preset number of candidate search words with the highest target prediction scores may be selected from the candidate search word set as the search words to be recommended, yielding the search word set to be recommended. The preset number may be set in advance; for example, it may be 10.
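Selecting the preset number of highest-scoring candidates can be sketched as follows, assuming the target prediction scores are already available as a mapping from candidate search word to score; the function name is hypothetical.

```python
import heapq


def search_words_to_recommend(prediction_scores, preset_number=10):
    """Return the preset_number candidates with the highest target prediction scores."""
    return heapq.nlargest(preset_number, prediction_scores, key=prediction_scores.get)
```

The result is ordered from the highest score to the lowest, which is also a convenient order for display.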

Optionally, the target evaluation matrix may be used as the user evaluation matrix in a collaborative filtering algorithm, and the similar user set may be used as the user neighbor set. Based on the target evaluation matrix and the similar user set, a prediction score is calculated for each search word that the user has not typed in the corresponding historical search information set. The prediction scores are then sorted in descending order under a Top-N recommendation criterion, where N may be 20; that is, the 20 search words with the highest prediction scores form a recommendation list, which is recommended to the user to be recommended.

For example, if the target typing information is "computer" and the preset number is 4, the search word set to be recommended may be { "computer screen", "XXX brand computer", "computer battery", "computer keyboard" }.

It should be noted that, the more comprehensive the data in the acquired reference user set and the historical search information set is, the more accurate the screening of the similar user set and the search word set to be recommended is.

Step S4, the search word set to be recommended is recommended to the user to be recommended.

In some embodiments, the set of search terms to be recommended may be recommended to the user to be recommended.

As an example, web page technology may be used to display the search words to be recommended in the search word set to be recommended below the search box in which the user to be recommended is typing, in descending order of target prediction score, so that the user to be recommended can conveniently select among them, thereby completing the search word recommendation.

In summary, the target category corresponding to the target typing information is determined first, which reveals the type of content the user to be recommended wants to know about and facilitates accurate recommendation later. Next, the candidate search word set includes the search words typed, under the target category corresponding to the target typing information, by the user to be recommended and by each similar user in the corresponding similar user set. Compared with screening from the historical search words of the user to be recommended, the search words in the candidate search word set therefore match the type of content the user wants to know about more closely and do not mix multiple types together, so the content the user wants to search for can be screened out more easily. In addition, the candidate search word set contains not only the search words typed by the user to be recommended but also those typed by each similar user, so it is more comprehensive: even if the user to be recommended has never searched for content of this type, recommendations can still be made based on the search words typed by the similar users under this type.
For example, if the target category corresponding to the target typing information is a type that the user to be recommended has never searched, search words can still be recommended from those typed by each similar user under that target category. Then, the target prediction score corresponding to each candidate search word in the candidate search word set is obtained, which facilitates the subsequent screening of the search word set to be recommended. Finally, the search word set to be recommended is recommended to the user to be recommended, which realizes search word recommendation and improves its accuracy. Furthermore, the similar user set is screened from the reference user set by comprehensively considering the target association degree between search words and the target evaluation matrix, based on the historical search information sets corresponding to the user to be recommended and to each reference user in the reference user set, which improves the accuracy of determining the similar user set and hence the accuracy of search word recommendation for the user to be recommended.

The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention and are intended to be included within the scope of the invention.

Claims

1. A method for recommending user search terms based on typed search terms, comprising the steps of:

determining the set of similar users comprises the steps of:

2. The method of claim 1, wherein the classifying the search terms included in all the obtained historical search information sets to obtain the target category set comprises:

3. The method of claim 2, wherein determining the target relevance between each two search terms included in all the set of historical search information comprises:

4. The method of claim 1, wherein determining the target evaluation index of each target user under each target category based on the obtained set of all historical search information, the target category set, and the target association degree between the search terms comprises:

5. The method of claim 2, wherein said screening a set of similar users from said set of reference users based on all of the resulting set of historical search information and said target evaluation matrix comprises:

6. The method of claim 5, wherein determining semantic agreements for each target user under each target category based on the obtained set of all historical search information comprises:

7. The method for recommending user search terms based on typed search terms according to claim 6, wherein the determining the target behavior compliance of the target user under the target category according to the target behavior frequency corresponding to the search terms included in all the historical search information sets comprises:

8. The method of claim 6, wherein the determining the second semantic difference for each search term entered by the target user under the target category based on the set of modified terms for each search term entered by the target user under the target category included in all sets of historical search information comprises:

9. The method of claim 5, wherein the selecting a set of similar users from the set of reference users based on the target evaluation matrix and the semantic fitness matrix comprises:

10. The method for recommending user search terms based on typed search terms according to claim 5, wherein the obtaining a target prediction score corresponding to each candidate search term in the candidate search term set according to the target category corresponding to the target typed information comprises: