CN116680466A - Article recommendation method and device, storage medium and computer equipment

Info

Publication number
CN116680466A
Authority
CN
China
Prior art keywords
articles
similarity
features
candidate
preset
Prior art date
Legal status
Pending
Application number
CN202210157877.6A
Other languages
Chinese (zh)
Inventor
唐振宇
Current Assignee
TCL Technology Group Co Ltd
Original Assignee
TCL Technology Group Co Ltd
Priority date
Filing date
Publication date
Application filed by TCL Technology Group Co Ltd filed Critical TCL Technology Group Co Ltd
Priority to CN202210157877.6A priority Critical patent/CN116680466A/en
Publication of CN116680466A publication Critical patent/CN116680466A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F 16/73 Querying
    • G06F 16/735 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0631 Item recommendations


Abstract

Embodiments of the present application disclose an article recommendation method and apparatus, a storage medium, and a computer device. A reference article is determined from candidate articles according to historical behavior data, and a recommended article satisfying a preset condition is then determined from the candidate articles according to the reference article and fusion weighting parameters predetermined between every two candidate articles, the preset condition being that the similarity indicated by the fusion weighting parameter between the reference article and the recommended article is greater than a preset threshold. In this way the recall probability of long-tail articles is improved, and the recommendation effect for long-tail articles is improved.

Description

Article recommendation method and device, storage medium and computer equipment
Technical Field
The application relates to the technical field of recommendation, in particular to an article recommendation method, an article recommendation device, a storage medium and computer equipment.
Background
With the development of network technology, various recommendation algorithms have been derived to recommend articles to users. Recommendation algorithms based on users' historical behavior data are common: by analyzing a user's historical behavior data, such an algorithm infers the user's preference for articles and recommends articles the user is likely to favor accordingly.
However, such recommendation algorithms rely on users' interactions with articles. When user interactions are biased towards hot articles, the recommendation results are likewise biased towards hot articles and long-tail articles are neglected, so the recommendation effect for long-tail articles remains poor.
Disclosure of Invention
Embodiments of the present application provide an article recommendation method and apparatus, a storage medium, and a computer device, which can improve the recommendation effect for long-tail articles.
In a first aspect, an embodiment of the present application provides an item recommendation method, including:
determining a reference item from the candidate items according to the historical behavior data;
and determining a recommended article meeting a preset condition from the candidate articles according to the predetermined fusion weighting parameters between every two candidate articles and the reference article, wherein the preset condition is that the similarity indicated by the fusion weighting parameters between the reference article and the recommended article is larger than a preset threshold.
In a second aspect, an embodiment of the present application further provides an article recommendation apparatus, including:
the data acquisition module is used for determining a reference article from candidate articles according to the historical behavior data;
and the recall module is used for determining a recommended article meeting a preset condition from the candidate articles according to the fusion weighting parameter and the reference article which are preset between every two candidate articles, wherein the preset condition is that the similarity indicated by the fusion weighting parameter between the reference article and the recommended article is larger than a preset threshold.
In a third aspect, embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform the item recommendation method as provided by any of the embodiments of the present application.
In a fourth aspect, an embodiment of the present application further provides a computer device, including a processor and a memory, where the memory has a computer program, and the processor is configured to execute the item recommendation method according to any of the embodiments of the present application by calling the computer program.
According to the technical solution provided by the embodiments of the present application, a reference article is determined from candidate articles according to historical behavior data, and a recommended article meeting a preset condition is determined from the candidate articles according to the reference article and the fusion weighting parameters predetermined between every two candidate articles, the preset condition being that the similarity indicated by the fusion weighting parameter between the reference article and the recommended article is greater than a preset threshold. On the one hand, the recommended articles are sent to the user terminal, so that articles the user is likely to favor are pushed to the user. On the other hand, when recommended articles are determined from the candidate articles, the probability of recalling long-tail articles from the candidate articles can be improved based on the fusion weighting parameters between the reference article and the recommended articles, improving the recommendation effect for long-tail articles.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is an application scenario schematic diagram of an item recommendation method according to an embodiment of the present application.
Fig. 2 is a flowchart of an item recommendation method according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of a fusion weighted graph in the article recommendation method according to an embodiment of the present application.
Fig. 4 is another schematic structural diagram of a fusion weighted graph in the article recommendation method according to an embodiment of the present application.
Fig. 5 is a schematic diagram of a neighbor network structure in the article recommendation method according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of an article recommendation device according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by a person skilled in the art without any inventive effort, are intended to be within the scope of the present application based on the embodiments of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Articles are typically recommended by a recommendation system, which generally works in two stages: a recall stage and a ranking stage. The recall stage extracts a small subset of articles that the user may be interested in from the article library to form a candidate set; the ranking stage precisely ranks the articles in the candidate set so that the articles the user is most interested in are placed earlier in the recommendation sequence, and the articles in the candidate set are then recommended to the user in that order.
Articles can be classified into long-tail articles and hot articles. Long-tail articles are articles with low demand, poor sales, low exposure, few user clicks, or other unpopular articles; such articles are poorly displayed, so only a small share of the long-tail articles in the article library are ever surfaced. Hot articles, correspondingly, are the opposite of long-tail articles. Because hot articles dominate the article library while long-tail articles receive little exposure, when the recommendation system recommends articles to a user the recommendation results tend towards hot articles, and long-tail articles cannot obtain a good recommendation effect. Further, since the recommendation results are biased towards hot articles, the recommendation system becomes over-fitted, and when it is used to recommend articles from a new article library the recommendation error is large.
To solve the above problems, embodiments of the present application provide an article recommendation method that can improve the recall probability of long-tail articles and raise their ranking in the recommendation sequence, so that long-tail articles achieve a good recommendation effect.
The execution body of the article recommendation method provided by the embodiments of the present application may be the article recommendation apparatus provided by the embodiments of the present application, or a computer device integrated with the article recommendation apparatus. The article recommendation apparatus may be implemented in hardware or software, and the computer device may be a server. The computer device may be connected to one or more user terminals, such as mobile phones, computers, tablets, or televisions.
It should be noted that the articles mentioned in the embodiments of the present application may be physical articles, such as commodities, daily necessities and office supplies, or virtual articles, such as financial information, news, electronic books, music, images, television programs, virtual applications (apps), virtual commodities, and products in online games. Since the types of physical and virtual articles are numerous, they are not enumerated here; the article recommendation method provided by the embodiments of the present application can be applied to recommending both physical and virtual articles.
For a better understanding of the present application, an application scenario is given here as an example: movie recommendation, where the candidate videos in a video library include hot videos and cold videos (i.e., long-tail articles). Referring to fig. 1, fig. 1 is a schematic diagram of an application scenario of the article recommendation method according to an embodiment of the present application. Before recommending videos, the multi-modal similarity between every two candidate videos in the video library is calculated, and the fusion weighting parameters corresponding to the candidate videos are determined according to the multi-modal similarity. When recommending videos to a user terminal, a reference video is determined according to the user behavior data of the user terminal, recommended videos related to the reference video are then selected from the candidate videos according to the fusion weighting parameters, and the recommended videos are sent to the user terminal so that videos the user prefers are recommended. Furthermore, based on the fusion weighting parameters corresponding to the candidate videos, the recall rate of cold videos can be improved and their proportion among the recommended videos increased, preventing the recommendation results from being biased towards hot videos and thereby providing exposure opportunities for cold videos.
Referring to fig. 2, fig. 2 is a flowchart illustrating an article recommendation method according to an embodiment of the application. The specific flow of the article recommending method provided by the embodiment of the application can be as follows:
101. Determine a reference article from the candidate articles according to historical behavior data.
The computer device stores users' historical behavior data. When recommending articles to a user, the computer device analyzes the user terminal's interaction behavior with articles according to the historical behavior data of the user terminal used by the user, and takes the articles with which there has been interaction as reference articles. For example, articles clicked or browsed by the user terminal may be taken as reference articles; in the movie recommendation scenario above, the video content browsed by the user terminal may be taken as reference videos. It will be appreciated that the number of user terminals used by each user is not uniquely determined, and user terminals may be identified based on the user's ID.
For example, the reference articles may be obtained by screening the articles in the historical behavior data, and there are various screening modes, such as the following:
for example, according to the click rate of the articles in the historical behavior data, taking the articles with higher click rate as reference articles; for another example, only the articles browsed in a certain period of time in the historical behavior data are intercepted and used as reference articles; for another example, according to the browsing time length of the articles in the historical behavior data, taking the articles with higher browsing time length as reference articles; for another example, the item with the higher score may be used as the reference item according to the score of the item by the user terminal in the historical behavior data.
It will be appreciated that the screening modes in the above examples may also be combined to obtain the reference articles. The preset article library contains a large number of candidate articles, and the reference articles are a subset of the candidate articles. Since there are various ways to determine reference articles from the historical behavior data, they are not enumerated here; it suffices that the reference articles can be determined from the articles browsed by the user terminal so as to represent the user's preference.
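A minimal sketch of one such screening is given below. The record layout and field names ("clicks", "duration", "item_id") and the thresholds are illustrative assumptions, not a schema required by the present application.

```python
from typing import Dict, List

def select_reference_items(history: List[Dict], min_clicks: int = 3,
                           min_duration: float = 60.0) -> List[str]:
    """Keep articles whose click count or browsing duration exceeds a threshold."""
    reference = []
    for record in history:
        if record.get("clicks", 0) >= min_clicks or record.get("duration", 0.0) >= min_duration:
            reference.append(record["item_id"])
    return reference

# Example with hypothetical behavior records:
history = [
    {"item_id": "movie_a", "clicks": 5, "duration": 1800.0},
    {"item_id": "movie_b", "clicks": 1, "duration": 12.0},
]
print(select_reference_items(history))  # ['movie_a']
```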
102. According to the fusion weighting parameters predetermined between every two candidate articles and the reference article, determine a recommended article that meets a preset condition from the candidate articles, where the preset condition is that the similarity indicated by the fusion weighting parameter between the reference article and the recommended article is greater than a preset threshold.
The preset article library contains a large number of candidate articles, and the reference article is also contained in the preset article library. The similarity between every two candidate articles is calculated in advance, and the fusion weighting parameters are constructed from the similarity to represent the degree of association between every two candidate articles in the preset article library.
When the similarity between every two candidate articles is calculated, features of the two candidate articles in multiple dimensions are first extracted, similarity is then calculated separately in each single dimension, and the similarities in the multiple dimensions are finally fused. Candidate articles may contain information in different forms representing features in different dimensions, such as text, speech, images, video, and color.
By calculating similarity across multiple dimensions, the degree of association between hot articles and long-tail articles among the candidate articles can be strengthened, thereby improving the probability that long-tail articles are recalled.
For example, the fusion weighting parameters may be represented by an undirected graph, referred to as a fusion weighted graph; refer to fig. 3, which is a schematic structural diagram of the fusion weighted graph in the article recommendation method according to an embodiment of the present application.
The nodes in the fusion weighted graph in fig. 3 represent candidate articles; for example, a, b, c, e, i and j each represent a node at which a candidate article is located. If there is a connecting edge between two candidate articles, the two candidate articles are associated with each other, and the edge carries a weight represented by the similarity between the two candidate articles.
When recommended articles for the reference article are determined from the candidate articles according to the fusion weighted graph, the candidate articles connected by a connecting edge to the node where the reference article is located may be taken as recommended articles, according to that node's position in the fusion weighted graph; or the candidate articles directly or indirectly connected to that node by connecting edges may be taken as recommended articles. That is, the preset threshold may be set to 0, and recommended articles whose similarity to the reference article is greater than the preset threshold are thereby selected from the candidate articles.
Of course, the recommended articles may also be obtained by screening according to similarity. As in the example above, a preliminary screening is performed first: candidate articles directly and/or indirectly connected by connecting edges to the node where the reference article is located in the fusion weighted graph are selected as initial recommended articles. A second screening is then performed: according to the weights in the fusion weighted graph between each initial recommended article and the reference article, several initial recommended articles with the highest similarity are selected as the final recommended articles, where the number of selected articles can be set according to actual requirements and is not limited here.
It will be appreciated that the second screening in this example may be replaced by other screening methods. For example, based on similarity between users, the initial recommended articles with high similarity to the user terminal may be taken as the final recommended articles; or, according to properties of the articles themselves, such as popularity, rating and click count, the initial recommended articles with high popularity, rating or click count may be taken as the final recommended articles. Since there are many screening methods, they are not enumerated here; it suffices that the screening can recall, from the article library, a number of recommended articles of most interest to the user.
After the recommended articles are selected according to the fusion weighting parameters, they may also be sorted in descending order of the fusion weighting parameters. Of course, after recommended articles are selected in other ways, they may likewise be sorted in descending order in the corresponding manner, e.g., by user similarity, or by the articles' own popularity, rating or click count.
The fusion weighting parameters may also be expressed as a matrix, referred to as a fusion weighting matrix. Similar to the fusion weighted graph, the fusion weighting matrix takes candidate articles as coordinate points in the matrix, expresses the degree of association between every two candidate articles according to the neighbor relationship between coordinate points, and sets a weight between every two adjacent coordinate points to express the similarity between the two candidate articles.
As described above, the fusion weighting parameters provided in the embodiments of the present application are not limited to being represented by an undirected graph or a matrix; they may be represented in other ways, which are not enumerated here, as long as the degree of association between every two candidate articles in the article library can be expressed. For a better understanding of the solution provided by the embodiments of the present application, the following embodiments represent the fusion weighting parameters with the fusion weighted graph described above.
In some embodiments, after the recommended articles are recalled from the preset article library, they may be sent to the user terminal so that the user's favorite articles are recommended to the user.
When the recommended articles are sent to the user terminal, they may be sent after being sorted in descending order as mentioned in the above embodiment, so that the recommended articles of most interest to the user are placed first on the user terminal. Alternatively, the recommended articles may be shuffled and sent to the user terminal in random order, providing equal exposure opportunities for the recommended articles.
In particular, the application is not limited by the order of execution of the steps described, as some of the steps may be performed in other orders or concurrently without conflict.
As can be seen from the above, in the article recommendation method provided by the embodiments of the present application, the similarity between every two candidate articles in the preset article library is calculated, and the fusion weighting parameters are constructed from the similarity. When recommending articles to the user terminal, the recommended articles related to the reference article can be recalled from the preset article library according to the fusion weighting parameters and the reference article, and then sent to the user terminal, thereby realizing article recommendation for the user terminal. Moreover, when recommended articles are recalled, the association between hot articles and long-tail articles can be strengthened based on the fusion weighting parameters, thereby improving the probability of recalling long-tail articles.
The method described in the above embodiments is described in further detail below by way of example. In the following embodiments, the fusion weighting parameters may be constructed from one or more of the multi-modal similarity, the first-order neighbor similarity and the second-order neighbor similarity, and these may be combined to achieve different effects in implementation; only some combinations are described below, and they are not to be taken as limiting the embodiments of the present application.
In some embodiments, before determining the recommended item meeting the preset condition from the candidate items according to the predetermined fusion weighting parameter and the reference item between every two candidate items, the method further includes:
acquiring description information of candidate articles, and extracting text features and image features in the description information;
according to the text characteristics, calculating the text similarity between every two candidate items;
calculating the image similarity between every two candidate articles according to the image characteristics;
according to the text similarity and the image similarity, determining multi-mode similarity between every two candidate articles;
and determining fusion weighting parameters according to the multi-mode similarity.
Taking the movie recommendation mentioned in the application scenario as an example, for a movie or television work, the description information may include a synopsis of the work, a promotional poster, image content, transcript content, and the like. When features of multiple dimensions are extracted from the description information, text features may be extracted, such as the textual description in the synopsis of the work, and image features may be extracted, such as the image content of the work.
After the text features and image features are extracted, text similarity and image similarity are calculated for every two movie or television works. For the similarity calculation, the text features and image features can be converted into text vectors and image vectors respectively, and the cosine similarity between every two text vectors is calculated to obtain the text similarity, while the cosine similarity between every two image vectors is calculated to obtain the image similarity.
The calculation formula for the text similarity according to the text vectors is as follows:

sim_text(i, j) = cos(T_i, T_j) = (T_i · T_j) / (‖T_i‖ · ‖T_j‖)

where T_i denotes the text vector of any candidate article (movie or television work) i, T_j denotes the text vector of any candidate article j, and sim_text(i, j) denotes the cosine similarity between T_i and T_j, i.e. the text similarity.
The calculation formula for the image similarity according to the image vectors is as follows:

sim_img(i, j) = cos(V_i, V_j) = (V_i · V_j) / (‖V_i‖ · ‖V_j‖)

where V_i denotes the image vector of candidate article i, V_j denotes the image vector of candidate article j, and sim_img(i, j) denotes the cosine similarity between V_i and V_j, i.e. the image similarity.
After the text similarity and the image similarity between two candidate articles i and j are obtained, the text similarity and the image similarity are fused, which can be expressed by the following formula:

sim_mm(i, j) = sim_text(i, j) + sim_img(i, j)

where sim_mm(i, j) denotes the multi-modal similarity; that is, the multi-modal similarity between the two candidate articles i and j is obtained by summing the text similarity sim_text(i, j) and the image similarity sim_img(i, j).
Illustratively, coefficients may further be assigned to the text similarity and the image similarity according to actual requirements, for example a coefficient a for the text similarity and a coefficient b for the image similarity, in which case the multi-modal similarity can be expressed by the following formula:

sim_mm(i, j) = a · sim_text(i, j) + b · sim_img(i, j)
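A minimal sketch of the multi-modal similarity above, assuming the text and image features have already been encoded as fixed-length vectors; the coefficients a and b are the illustrative weights mentioned in the text.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def multimodal_similarity(text_i, text_j, img_i, img_j,
                          a: float = 1.0, b: float = 1.0) -> float:
    sim_text = cosine(text_i, text_j)   # text similarity
    sim_img = cosine(img_i, img_j)      # image similarity
    return a * sim_text + b * sim_img   # weighted fusion
```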
When the fusion weighted graph is constructed according to the multi-modal similarity, with reference to fig. 3 above, two candidate articles whose multi-modal similarity is greater than 0 can be connected by a connecting edge, with the candidate articles as the nodes of the fusion weighted graph and the multi-modal similarity as the weight of the connecting edge, thereby obtaining the fusion weighted graph.
Illustratively, the fusion weighted graph may also be constructed by the K-Nearest-Neighbor algorithm (KNN, also called the K-nearest-neighbor classification algorithm): for each candidate article, K edges are constructed between it and the K candidate articles nearest to it, and the K edges of the various candidate articles form the fusion weighted graph, where the multi-modal similarity between two candidate articles is the weight of their connecting edge.
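A sketch of the KNN construction just described, assuming a precomputed dense similarity matrix `sim` (items x items, e.g. the multi-modal similarity); the adjacency-dict representation of the graph is an illustrative choice.

```python
import numpy as np

def build_knn_graph(sim: np.ndarray, k: int = 10) -> dict:
    """Connect each candidate article to its K most similar articles; edge weight = similarity."""
    n = sim.shape[0]
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # indices of the k most similar items, excluding the item itself
        neighbors = [j for j in np.argsort(-sim[i]) if j != i][:k]
        for j in neighbors:
            if sim[i, j] > 0:
                graph[i][int(j)] = float(sim[i, j])
                graph[int(j)][i] = float(sim[i, j])  # undirected edge
    return graph
```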
When the fusion weighted graph is constructed in the two ways above, the weight on the connecting edge between two candidate articles is related only to the semantic similarity between the candidate articles and not to their popularity, which prevents the recommendation results of an existing recommendation system from being biased towards hot articles, thereby alleviating the popularity bias and improving the recommendation effect for cold articles.
In some embodiments, determining fusion weighting parameters from multi-modal similarities includes:
acquiring user behavior data of candidate articles, and extracting behavior characteristics in the user behavior data;
calculating the first-order neighbor similarity between every two candidate items according to a collaborative filtering algorithm;
and determining fusion weighting parameters according to the first-order neighbor similarity and the multi-mode similarity.
Taking the movie recommendation mentioned in the application scenario as an example, the user behavior data generated by the user terminal on the movie program may include browsing time, browsing duration, browsing content, collection behavior, comment content, scoring, screenshot, screen recording, forwarding, and the like. When extracting behavioral characteristics from the user behavioral data, one or more behavioral characteristics may be selected from the user behavioral data.
After the behavior features are obtained, the first-order neighbor similarity between every two candidate articles can be calculated from the behavior features through a collaborative filtering algorithm, and the fusion weighted graph can be constructed according to the first-order neighbor similarity. When calculating the similarity between behavior features, the behavior features can be converted into behavior vectors, and the cosine similarity between two behavior vectors is then calculated to obtain the first-order neighbor similarity between the two corresponding candidate articles.
The first-order neighbor similarity is calculated from the behavior vectors using the following quantities: t_ui denotes the time at which a user u clicks candidate article (movie or television work) i; t_uj denotes the time at which the user clicks candidate article j; sign(t_ui − t_uj) is a function of the time interval between t_ui and t_uj; q_u denotes the activity level of the user's behavior; p_i denotes the popularity of candidate article i; p_j denotes the popularity of candidate article j; and sim1(i, j) denotes the resulting first-order neighbor similarity.
In this calculation, in addition to the cosine similarity between the two behavior vectors, a time-interval factor 1/sign(t_ui − t_uj) between the user's click on candidate article (movie or television work) i and the click on candidate article j is considered, and a user-activity penalty 1/log(1 + q_u) and an article-popularity penalty over p_i · p_j are added. The shorter the time interval between clicking candidate article i and clicking candidate article j, the greater the similarity between the two candidate articles i and j. The user-activity penalty 1/log(1 + q_u) penalizes active users and ensures fairness between active and inactive users, where active users are those with a higher number of click behaviors and inactive users are those with a lower number. The popularity penalty over p_i · p_j penalizes popular articles and alleviates the popularity bias between popular and unpopular articles, where popular articles are those with a higher historical click frequency and unpopular articles are those with a lower one.
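A sketch of a first-order neighbor similarity built from the three ingredients named above (a time-interval factor, the user-activity penalty 1/log(1+q_u), and a popularity penalty over p_i·p_j). How they are combined below, a sum over users who clicked both articles divided by the square root of the popularity product, and the exact form of the time factor, are assumptions for illustration only.

```python
import math
from collections import defaultdict

def first_order_similarity(clicks):
    """clicks: list of (user, item, timestamp) tuples; returns {(i, j): similarity}."""
    user_items = defaultdict(dict)   # user -> {item: click time}
    popularity = defaultdict(int)    # item -> click count p_i
    for u, i, t in clicks:
        user_items[u][i] = t
        popularity[i] += 1

    sim = defaultdict(float)
    for u, items in user_items.items():
        q_u = len(items)                              # user activity level
        activity_penalty = 1.0 / math.log(1 + q_u)    # penalize very active users
        for i, t_i in items.items():
            for j, t_j in items.items():
                if i == j:
                    continue
                time_factor = 1.0 / (1.0 + abs(t_i - t_j))  # shorter interval => larger factor (assumed form)
                sim[(i, j)] += activity_penalty * time_factor
    # popularity penalty over p_i * p_j (square-root form is an assumption)
    return {(i, j): s / math.sqrt(popularity[i] * popularity[j])
            for (i, j), s in sim.items()}
```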
Illustratively, after the first-order neighbor similarity between two candidate articles is obtained, the first-order neighbor similarity and the multi-modal similarity of the two candidate articles are fused, which can be expressed by the following formula:

w(i, j) = sim1(i, j) + sim_mm(i, j)

That is, the first-order neighbor similarity sim1(i, j) and the multi-modal similarity sim_mm(i, j) are summed to obtain the weight w(i, j) on the connecting edge between the two candidate articles i and j. Either the first-order neighbor similarity sim1(i, j) or the multi-modal similarity sim_mm(i, j) may be 0; when both are 0, there is no connecting edge between the two candidate articles.
When the fusion weighted graph is constructed, it may be constructed with reference to the content mentioned in the above embodiment, which is not repeated here, where w(i, j) represents the weight on a connecting edge in the fusion weighted graph.
There may also be two kinds of connecting edges in the fusion weighted graph: a common edge represented by the first-order neighbor similarity and a multi-modal vector edge represented by the multi-modal similarity. Fig. 4 is another schematic structural diagram of the fusion weighted graph in the article recommendation method according to an embodiment of the present application, in which solid lines represent multi-modal vector edges and dotted lines represent common edges. When the fusion weighted graph is constructed, two candidate articles whose multi-modal similarity is greater than 0 can be connected by a solid line (multi-modal vector edge), and two candidate articles whose first-order neighbor similarity is greater than 0 can be connected by a dotted line (common edge); the multi-modal vector edge takes the multi-modal similarity as its weight and the common edge takes the first-order neighbor similarity as its weight. It will be appreciated that when there is only semantic similarity between two candidate articles, there is only a multi-modal vector edge between them; when there is only behavioral similarity, there is only a common edge; and when there are both semantic and behavioral similarity, there are two connecting edges between them, one a multi-modal vector edge and the other a common edge.
In some embodiments, determining fusion weighting parameters from multi-modal similarities includes:
acquiring user behavior data of candidate articles, and extracting behavior characteristics in the user behavior data;
calculating the first-order neighbor similarity and the second-order neighbor similarity between every two candidate items according to a collaborative filtering algorithm;
and determining fusion weighting parameters according to the first-order neighbor similarity, the second-order neighbor similarity and the multi-modal similarity.
In this embodiment, the second-order neighbor similarity may additionally be incorporated into the fusion weighted graph as part of the weight of a connecting edge.
Specifically, the second-order neighbor similarity between every two candidate articles can be calculated by a collaborative filtering algorithm. The second-order neighbor similarity between two candidate articles refers to the similarity between their neighbor network structures, where the candidate articles within a neighbor network structure tend to be similar to each other and a candidate article k is shared between the two neighbor network structures.
Referring to fig. 5, fig. 5 is a schematic diagram of the neighbor network structure in the article recommendation method according to an embodiment of the present application. The two open circles in fig. 5 represent candidate article i and candidate article j respectively, and the other candidate articles are represented by black dots. The network structure (neighbor network structure) formed by candidate article i and the other candidate articles with which it has a first-order neighbor similarity is circled by a solid line and denoted N(i); the network structure formed by candidate article j and the other candidate articles with which it has a first-order neighbor similarity is circled by a dotted line and denoted N(j); and the shared candidate article k is circled by an oval. The second-order similarity between candidate article i and candidate article j is jointly determined by the first-order neighbor similarities within the two structures: in the second-order neighbor similarity calculation, the shared candidate article k combines the first-order neighbor similarity sim1(i, k) it has with candidate article i and the first-order neighbor similarity sim1(j, k) it has with candidate article j.
The calculation formula of the second-order neighbor similarity between candidate article i and candidate article j is as follows:

sim2(i, j) = (1/|K|) · Σ_{k ∈ K} sim1(i, k) · sim1(j, k)

where K denotes the set of similar neighbor nodes shared by candidate article i and candidate article j (i.e. the shared candidate articles), |K| denotes the number of shared candidate articles, sim1(i, k) denotes the first-order neighbor similarity within the network structure of candidate article i, sim1(j, k) denotes the first-order neighbor similarity within the network structure of candidate article j, and sim2(i, j) denotes the second-order neighbor similarity between candidate article i and candidate article j.
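A minimal sketch of the second-order neighbor similarity: it is computed from the first-order similarities that articles i and j each have with their shared neighbors. Averaging over the shared neighbors follows the description above ("K represents the number of shared candidate items") but should be read as an assumption about the exact normalization.

```python
def second_order_similarity(first_order: dict, i, j) -> float:
    """first_order: {item: {neighbor: first-order similarity}}."""
    shared = set(first_order.get(i, {})) & set(first_order.get(j, {}))  # shared candidate articles K
    if not shared:
        return 0.0
    total = sum(first_order[i][k] * first_order[j][k] for k in shared)
    return total / len(shared)
```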
Illustratively, after the second-order neighbor similarity between two candidate articles is obtained, the second-order neighbor similarity, the first-order neighbor similarity and the multi-modal similarity of the two candidate articles are fused, which can be expressed by the following formula:

w(i, j) = sim1(i, j) + sim_mm(i, j) + sim2(i, j)

That is, the first-order neighbor similarity sim1(i, j), the multi-modal similarity sim_mm(i, j) and the second-order neighbor similarity sim2(i, j) are summed to obtain the weight w(i, j) on the connecting edge between the two candidate articles i and j. Any one or any two of the three similarities may be 0; when all three are 0, there is no connecting edge between the two candidate articles.
When the fusion weighted graph is constructed, it may be constructed with reference to the content mentioned in the above embodiment, which is not repeated here, where w(i, j) represents the weight on a connecting edge in the fusion weighted graph.
Illustratively, there may also be two kinds of connecting edges in the fusion weighted graph: a common edge jointly represented by the first-order neighbor similarity and the second-order neighbor similarity, and a multi-modal vector edge represented by the multi-modal similarity. On the basis of the common edge mentioned in the above embodiment, this embodiment takes the sum of the first-order neighbor similarity and the second-order neighbor similarity as the weight of the common edge, which is not described in detail here.
By calculating the second-order neighbor similarity between two candidate articles, the embodiments of the present application can alleviate the problem of sparse distribution among the nodes where candidate articles are located in the fusion weighted graph and explore hidden associations between two non-adjacent nodes, which helps to strengthen the association between candidate articles in the preset article library and thereby improve recommendation accuracy.
In some embodiments, when the reference article is used as the basis for recalling recommended articles, candidate articles may be selected from the preset article library corresponding to the fusion weighted graph by combining the fusion weighted graph mentioned in the above embodiments with a multi-hop random walk algorithm.
The reference article is taken as the starting point of the walk path in the multi-hop random walk algorithm, and by randomly walking several times among the nodes (candidate articles) in the fusion weighted graph, the visited nodes (candidate articles) can be taken as recommended articles.
Specifically, each candidate article in the fusion weighted graph is regarded as a node, and each node traversed during the walk is given a score expressed as the number of times the node is visited divided by the total number of steps of the random walk. On reaching any node, whether to continue walking or to stop is decided according to a set probability; when the walk stops, it restarts from the starting point of the walk path. If it is decided to continue, one node is randomly selected, with uniform probability, from the nodes that the current node points to, as the next node of the walk. After a number of random walks, the probability that each node is visited converges to a fixed value.
After the probability that each node is visited is obtained, the visit probability is used as the weight with which the candidate article represented by that node is selected, so that candidate articles with higher visit probabilities are selected as recommended articles from among the visited candidate articles. The visit probability can also be used as the basis for ordering the recommended articles, i.e., the recommended articles are sorted in descending order of visit probability.
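A sketch of the multi-hop random walk just described: starting from the node of a reference article, at each step the walk either restarts from the start node according to a set probability (named `restart_prob` here for illustration) or moves to a uniformly chosen neighbor, and each node is scored as visits / total steps.

```python
import random

def random_walk_scores(graph: dict, start, steps: int = 10000,
                       restart_prob: float = 0.15) -> dict:
    """graph: {node: {neighbor: weight}}; returns {node: visit probability}."""
    visits = {node: 0 for node in graph}
    current = start
    for _ in range(steps):
        visits[current] += 1
        if random.random() < restart_prob or not graph[current]:
            current = start                                    # restart from the path start point
        else:
            current = random.choice(list(graph[current]))      # uniform move to a neighbor
    total = sum(visits.values())
    return {node: count / total for node, count in visits.items()}
```

The returned visit probabilities can then be used both to select the recommended articles and to sort them in descending order, as described above.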
In some embodiments, after the recommended articles associated with the reference article are recalled from the candidate articles by the random walk algorithm, the recommended articles are further re-ranked by a preset neural network model to raise the ranking of long-tail articles among the recommended articles. The method further comprises the following steps:
inputting the recommended articles into a preset neural network model to obtain a recommendation sequence of the recommended articles, wherein the preset neural network model is obtained by training with attribute alignment of the probability distributions of sample articles of different categories;
and sending the recommended articles to the user terminal according to the recommended sequence.
After the recommended articles are obtained, they are input into the preset neural network model so that the model ranks them and produces a recommendation sequence. The computer device then sends the recommended articles to the user terminal according to the recommendation sequence, and the recommended articles are displayed on the user terminal in that order, which can improve the click-through rate of the recommended articles ranked at the front.
The preset neural network model may be DIEN (Deep Interest Evolution Network) or DeepFM (Factorization-Machine based Neural Network), both of which can extract low-order combination features as well as high-order combination features.
When the preset neural network model is trained, attribute alignment is performed on the probability distributions of sample articles of different categories. Specifically, the sample articles are classified, the article features of the sample articles are extracted, the probability distributions of the article features of the different categories in a feature space are calculated, and the model parameters are adjusted to align the two probability distributions, thereby achieving attribute alignment between sample articles of different categories and balancing the proportions of the different categories of sample articles in the ranking.
Taking the hot articles and long-tail articles mentioned in the above embodiments as an example, the sample articles may be divided into hot articles and long-tail articles. By calculating the probability distribution of the hot articles in the feature space and the probability distribution of the long-tail articles in the feature space and then aligning the two probability distributions, the distribution difference between hot articles and long-tail articles can be reduced. When the distribution difference is reduced by adjusting the parameters of the preset neural network model and the recommended articles are then ranked by the model, the ranking of long-tail articles among the recommended articles can be raised.
The article recommendation method provided by the embodiments of the present application further includes a method for training the preset neural network model, as follows:
in some embodiments, the training method for the preset neural network model includes:
acquiring a sample set, and extracting statistical characteristics of sample articles in the sample set;
dividing the statistical features into a first type of features and a second type of features according to the popularity of the articles;
calculating a first probability distribution of the first type of features in a preset feature space, and calculating a second probability distribution of the second type of features in the preset feature space;
calculating a first loss value according to the first probability distribution and the second probability distribution;
and adjusting parameters of a preset neural network model according to the first loss value until the model converges.
The sample articles may be classified into hot articles and long-tail articles according to article popularity or in other ways, where the hot articles are referred to as the first type of samples and the long-tail articles as the second type of samples. The statistical features of the hot articles, i.e. the first type of features, are then extracted from the first type of samples, and the statistical features of the long-tail articles, i.e. the second type of features, are extracted from the second type of samples.
Types of statistical features include, but are not limited to: integer, floating-point, categorical, multi-valued and other numerical features.
In this embodiment, the difference between the first probability distribution and the second probability distribution is reduced by means of transfer learning. Specifically, the feature space in which the first type of features lies is regarded as the source domain, and the feature space in which the second type of features lies is regarded as the target domain; the first type of features in the source domain come with a large amount of labeled data, while the second type of features in the target domain lack labeled data. During training of the preset neural network model, reducing the difference between the first probability distribution and the second probability distribution allows the labeled data of the source domain to be transferred to the target domain, completing the transfer learning from the source domain to the target domain, thereby improving the generalization ability of the preset neural network model on the target domain and the ranking effect for long-tail articles.
When the distribution difference between the first probability distribution and the second probability distribution is reduced, the first type of features of the source domain and the second type of features of the target domain can be aligned in the preset feature space. The alignment is performed by calculating a first loss value from the first probability distribution of the first type of features in the preset feature space and the second probability distribution of the second type of features in the preset feature space, and then adjusting the model parameters according to the first loss value so that the statistical features of the source domain and the target domain become similar (i.e. aligned) in the preset feature space. The output of the preset neural network model is thereby adapted to the target domain, achieving the purpose of transferring the labeled data of the source domain to the target domain and improving the accuracy with which the source-domain labeled data labels the second type of features in the target domain.
When the parameters of the preset neural network model are adjusted according to the first loss value, the parameters can be adjusted through a gradient descent algorithm so as to achieve the minimization of model loss.
In some embodiments, calculating a first probability distribution of the first class of features in the preset feature space and calculating a second probability distribution of the second class of features in the preset feature space includes:
calculating a first similarity between the first type of features and a second similarity between the second type of features according to the statistical features;
constructing a first covariance matrix according to the first similarity, and constructing a second covariance matrix according to the second similarity;
calculating a first probability distribution of the first covariance matrix in a preset feature space, and calculating a second probability distribution of the second covariance matrix in the preset feature space.
In this embodiment, the statistical features are divided into multiple dimensions, and similarity calculation is performed on the statistical features of each dimension to obtain the first similarity between the first type of features of each dimension and the second similarity between the second type of features of each dimension. The first similarity is used as the weight of the first type of features and the second similarity as the weight of the second type of features; a first covariance matrix is then constructed from the first similarities of the first type of features over the multiple dimensions, and a second covariance matrix from the second similarities of the second type of features over the multiple dimensions, so that the two covariance matrices have consistent feature vectors. The first probability distribution of the first covariance matrix in the preset feature space and the second probability distribution of the second covariance matrix in the preset feature space are then calculated, facilitating analysis of the distribution difference between the two probability distributions.
Illustratively, the multiple dimensions of the statistical features may include floating-point numerical features, categorical numerical features, text features and image features. Taking the application scenario provided in the above embodiments as an example for the floating-point and categorical numerical features: the change over time in the number of times a movie or television work is clicked can be represented by a continuous floating-point value, and movie or television works can also be represented by numerical features according to their categories, for example a numerical feature of 1 for animation, 2 for war, 3 for family drama, and so on.
The similarity between the floating-point numerical features in the first type of features is calculated as a first weight, the similarity between the categorical numerical features as a second weight, the similarity between the text features as a third weight, and the similarity between the image features as a fourth weight; the first covariance matrix is then constructed from (floating-point numerical features, first weight), (categorical numerical features, second weight), (text features, third weight) and (image features, fourth weight). Similarly, the weight of each dimension in the second type of features is calculated from the statistical features of the multiple dimensions, and the second covariance matrix is constructed. It will be appreciated that the similarity calculation between text features and image features may refer to the content mentioned in the above embodiments and is not repeated here.
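A sketch of one way to build a covariance matrix from per-dimension features and their similarity weights; the exact construction is not spelled out in the text, so scaling each feature column by its weight before taking the covariance is an assumption for illustration.

```python
import numpy as np

def weighted_covariance(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """features: (n_samples, n_dims); weights: (n_dims,) per-dimension similarity weights."""
    scaled = features * weights                              # scale each dimension by its weight
    centered = scaled - scaled.mean(axis=0, keepdims=True)   # remove the mean of each dimension
    return centered.T @ centered / max(len(features) - 1, 1)

# Hypothetical usage:
# first_cov  = weighted_covariance(hot_item_features, hot_item_weights)
# second_cov = weighted_covariance(long_tail_item_features, long_tail_weights)
```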
The preset feature space may be a reproducing kernel Hilbert space or a linear transformation matrix. In the case of a reproducing kernel Hilbert space, the distribution difference between the first probability distribution and the second probability distribution can be calculated as the first loss value, where the distribution difference can be represented by a multi-kernel distribution distance as follows:
selecting a plurality of kernel functions containing optimization parameters as the total kernel function by which the source domain and the target domain are mapped into the reproducing kernel Hilbert space, constructing a distribution distance function between the first probability distribution and the second probability distribution based on the total kernel function, and determining the optimization parameters based on an unbiased estimate, so as to obtain the multi-kernel distribution distance between the first probability distribution and the second probability distribution.
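A common realisation of such a multi-kernel distribution distance is the multi-kernel maximum mean discrepancy (MK-MMD). The sketch below uses a bank of Gaussian kernels and the standard unbiased estimator; the bandwidth values are assumptions, and the per-kernel weights are fixed to 1 here rather than optimised as described above.

```python
import numpy as np

def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between the rows of a and the rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)

def mk_mmd(source, target, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Unbiased multi-kernel MMD^2 between two sample sets (source and target).
    The total kernel is a sum of Gaussian kernels with assumed bandwidths."""
    def total_kernel(a, b):
        d = pairwise_sq_dists(a, b)
        return sum(np.exp(-d / (2.0 * bw ** 2)) for bw in bandwidths)

    m, n = len(source), len(target)
    k_ss = total_kernel(source, source)
    k_tt = total_kernel(target, target)
    k_st = total_kernel(source, target)
    # Unbiased within-set terms exclude the diagonal entries.
    term_ss = (k_ss.sum() - np.trace(k_ss)) / (m * (m - 1))
    term_tt = (k_tt.sum() - np.trace(k_tt)) / (n * (n - 1))
    return term_ss + term_tt - 2.0 * k_st.mean()

# e.g. first_loss = mk_mmd(first_type_samples, second_type_samples)
```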
When the preset feature space is a linear transformation matrix, the first covariance matrix and the second covariance matrix are aligned by solving for the parameters of the linear transformation matrix, and these parameters can be used as the first loss value.
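One assumed way to realise this linear alignment is a CORAL-style closed form, in which a transform A is computed so that A^T Cs A approximately equals Ct and the remaining Frobenius residual is taken as the loss; the sketch below follows that assumption and is not the patent's own solver.

```python
import numpy as np

def sym_power(mat, power, eps=1e-6):
    """Matrix power of a symmetric positive semi-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, eps, None)      # guard against zero eigenvalues
    return vecs @ np.diag(vals ** power) @ vecs.T

def linear_alignment(cov_source, cov_target):
    """Closed-form transform A with A^T Cs A close to Ct; the Frobenius
    residual can serve as the first loss value (assumed formulation)."""
    a = sym_power(cov_source, -0.5) @ sym_power(cov_target, 0.5)
    residual = np.linalg.norm(a.T @ cov_source @ a - cov_target, ord="fro")
    return a, residual
```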
In some embodiments, adjusting parameters of a predetermined neural network model according to the first loss value until the model converges, includes:
extracting article features of the sample articles;
dividing the article features into a third type of features and a fourth type of features according to the user interest level;
calculating a second loss value according to the cluster distribution of the third type of features, and calculating a third loss value according to the cluster distribution of the fourth type of features;
and adjusting parameters of a preset neural network model according to the first loss value, the second loss value and the third loss value until the model converges.
In this embodiment, the article features of the sample articles are extracted, and implicit features may further be extracted from the article features. The article features or the implicit features are then divided into a third type of features and a fourth type of features according to the user's level of interest in the sample articles. The level of interest may be determined from the user's browsing behavior on the articles; for example, features of articles with a high browsing volume are taken as the third type of features, and features of articles with a low browsing volume as the fourth type of features.
For example, the article features or implicit features may also be divided into the third or fourth type of features by cluster analysis, so that third-type features with the same feedback (the user is interested in the sample article) are aggregated, and fourth-type features with different feedback (the user is not interested in the sample article) are aggregated. The number of third-type features and the number of fourth-type features are the same.
After the article features are divided, clustering among the article features may be achieved by adjusting the distances between them. In this embodiment, the distance between third-type features with the same feedback is reduced so that they cluster together, which raises the similarity among the third-type features; the distance between fourth-type features with different feedback is enlarged so that they do not cluster, which lowers the similarity among the fourth-type features. This helps improve the similarity between users, makes it easier to rank the recommended articles according to that similarity, and improves the ranking accuracy.
Therefore, when adjusting the model parameters, the distance between the third-type features can be reduced by adjusting the parameters, that is, the second loss value is reduced; and the distance between the fourth-type features can be enlarged by adjusting the parameters, that is, the third loss value is increased.
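A minimal sketch of such cluster-based loss terms, assuming each is the mean pairwise Euclidean distance within its group; the feature matrices below are hypothetical, and the way the two terms are traded off during training is likewise an assumption.

```python
import numpy as np

def mean_pairwise_distance(feats):
    """Mean Euclidean distance over all distinct pairs of rows in feats."""
    diffs = feats[:, None, :] - feats[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(-1))
    n = len(feats)
    return dists.sum() / (n * (n - 1))     # zero diagonal contributes nothing

# Hypothetical item features split by user interest level.
rng = np.random.default_rng(1)
third_type = rng.normal(size=(32, 16))     # high-interest (same feedback)
fourth_type = rng.normal(size=(32, 16))    # low-interest (different feedback)

second_loss = mean_pairwise_distance(third_type)    # to be reduced (pull together)
third_loss = mean_pairwise_distance(fourth_type)    # to be increased (push apart)
```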
Naturally, when the model parameters are adjusted by the second loss value and the third loss value, the model parameters may also be adjusted together with the first loss value obtained in the above embodiment until the model converges.
In some embodiments, adjusting parameters of a predetermined neural network model according to the first loss value until the model converges, includes:
extracting user features and article features from the sample articles;
inputting the article features and the user features into a preset neural network model to obtain a predicted value of the user's interest in the sample articles;
calculating a fourth loss value according to the actual value and the predicted value of the user's interest in the sample articles;
and adjusting parameters of a preset neural network model according to the first loss value and the fourth loss value until the model converges.
In this embodiment, implicit features of the sample articles may be extracted from the article features, and implicit features may likewise be extracted from the user features. Either the article features and the user features, or their implicit features, are then input into the preset neural network model to obtain a predicted value of the user's interest in the sample articles.
After the predicted value is obtained, a fourth loss value is obtained from the difference between the predicted value and the actual value, and the model parameters are further adjusted according to the fourth loss value until the model converges.
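As a sketch of such a prediction loss, the example below uses a binary cross-entropy between the predicted interest score and a binary actual label; the choice of loss and the sample values are assumptions, since the text only specifies a difference between the predicted and actual values (a squared error would fit equally well).

```python
import numpy as np

def fourth_loss(actual, predicted, eps=1e-7):
    """Fourth loss from the gap between actual and predicted interest,
    shown here as binary cross-entropy (an assumed choice)."""
    p = np.clip(predicted, eps, 1.0 - eps)
    return -np.mean(actual * np.log(p) + (1 - actual) * np.log(1 - p))

# Hypothetical interest labels and model predictions for five sample articles.
actual = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
predicted = np.array([0.8, 0.3, 0.6, 0.9, 0.2])
loss4 = fourth_loss(actual, predicted)
```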
Naturally, when the model parameters are adjusted by the fourth loss value, the model parameters may also be adjusted together with the first loss value obtained in the above embodiment until the model converges.
In some embodiments, the model parameters may also be adjusted together according to the first, second, third, and fourth loss values obtained in the above embodiments until the model converges.
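How the four loss terms are combined is not spelled out; one plausible sketch is a weighted sum in which the third loss enters with a negative sign so that minimising the total drives it upward. The weights are placeholders.

```python
def total_loss(loss1, loss2, loss3, loss4, w=(1.0, 0.5, 0.1, 1.0)):
    """Assumed weighted combination of the four loss values; minimising this
    total reduces the first, second and fourth losses while increasing the third."""
    return w[0] * loss1 + w[1] * loss2 - w[2] * loss3 + w[3] * loss4
```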
As can be seen from the above, in the recall stage the item recommendation method provided by the embodiments of the present application improves the semantic similarity between popular items and long-tail items by constructing the fusion weighting parameter, thereby raising the probability that long-tail items are recalled from the candidate items and improving the recommendation effect for long-tail items. In the ranking stage, the preset neural network model is trained by aligning the attributes of sample items of different categories over their probability distributions, which balances the proportion of the different categories in the ranking; ranking the recommended items with the trained model therefore balances the ranking of popular and long-tail items, raises the position of long-tail items in the ranking, and makes it easier to recommend long-tail items to users.
An article recommendation device 200 is also provided in one embodiment. Referring to fig. 6, fig. 6 is a schematic structural diagram of the article recommendation device 200 according to an embodiment of the present application. The article recommendation device 200 is applied to a computer device and includes:
a data acquisition module 201, configured to determine a reference item from candidate items according to the historical behavior data;
the recall module 202 is configured to determine, from the candidate items, a recommended item meeting a preset condition according to the reference item and a fusion weighting parameter predetermined between every two candidate items, where the preset condition is that the similarity indicated by the fusion weighting parameter between the reference item and the recommended item is greater than a preset threshold.
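As an illustration of the recall condition handled by this module, the sketch below keeps the candidates whose fused similarity with the reference item exceeds a preset threshold; the data layout (a dict keyed by item pairs) and the threshold value are assumptions.

```python
def recall_items(fusion_scores, reference_id, threshold=0.7):
    """fusion_scores maps an (item_a, item_b) pair to its fusion weighting
    parameter. Returns the candidates whose fused similarity with the
    reference item exceeds the preset threshold (0.7 is an assumed value)."""
    recommended = []
    for (a, b), score in fusion_scores.items():
        if reference_id in (a, b) and score > threshold:
            recommended.append(b if a == reference_id else a)
    return recommended

# Hypothetical usage:
scores = {("movie_1", "movie_2"): 0.82, ("movie_1", "movie_3"): 0.41}
print(recall_items(scores, "movie_1"))   # ['movie_2']
```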
In some embodiments, recall module 202 is further to:
acquiring description information of candidate articles, and extracting text features and image features in the description information;
according to the text features, calculating the text similarity between every two candidate articles;
calculating the image similarity between every two candidate articles according to the image characteristics;
according to the text similarity and the image similarity, determining multi-mode similarity between every two candidate articles;
and determining fusion weighting parameters according to the multi-mode similarity.
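A minimal sketch of the multi-modal similarity step described above, assuming cosine similarity for both the text and image features and a simple weighted blend; the embedding vectors and the blending weight alpha are assumptions.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def multimodal_similarity(text_a, text_b, image_a, image_b, alpha=0.5):
    """Multi-modal similarity between two candidate articles as a weighted
    blend of their text and image similarities (alpha is an assumed weight)."""
    return alpha * cosine(text_a, text_b) + (1 - alpha) * cosine(image_a, image_b)

# Hypothetical text and image embeddings for two candidate articles.
t1, t2 = np.array([0.2, 0.9, 0.1]), np.array([0.3, 0.8, 0.2])
i1, i2 = np.array([0.5, 0.5]), np.array([0.4, 0.6])
sim = multimodal_similarity(t1, t2, i1, i2)
```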
In some embodiments, recall module 202 is further to:
acquiring user behavior data of candidate articles, and extracting behavior characteristics in the user behavior data;
calculating the first-order neighbor similarity and the second-order neighbor similarity between every two candidate articles according to a collaborative filtering algorithm;
and determining fusion weighting parameters according to the first-order neighbor similarity, the second-order neighbor similarity and the multi-modal similarity.
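The fusion weighting parameter itself could then blend the first-order and second-order neighbor similarities from collaborative filtering with the multi-modal similarity; the linear form and the weights below are assumptions, not values disclosed here.

```python
def fusion_weighting(first_order_sim, second_order_sim, multimodal_sim,
                     weights=(0.4, 0.2, 0.4)):
    """Assumed linear fusion of the three similarities into a single
    weighting parameter for a pair of candidate articles."""
    w1, w2, w3 = weights
    return w1 * first_order_sim + w2 * second_order_sim + w3 * multimodal_sim

# e.g. fusion_weighting(0.6, 0.3, 0.8) -> 0.62
```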
In some embodiments, the item recommendation device 200 further comprises: a recommendation module 203, wherein the recommendation module 203 is configured to:
inputting the recommended articles into a preset neural network model to obtain a recommended sequence of the recommended articles, wherein the preset neural network model is obtained by performing attribute alignment training on sample articles of different categories on probability distribution;
and sending the recommended articles to the user terminal according to the recommended sequence.
In some embodiments, the item recommendation device 200 further includes a ranking module 204, the ranking module 204 configured to:
acquiring a sample set, and extracting statistical characteristics of sample articles in the sample set;
dividing the statistical features into a first type of features and a second type of features according to the popularity of the articles;
calculating a first probability distribution of the first type of features in a preset feature space, and calculating a second probability distribution of the second type of features in the preset feature space;
calculating a first loss value according to the first probability distribution and the second probability distribution;
and adjusting parameters of a preset neural network model according to the first loss value until the model converges.
In some embodiments, the ranking module 204 is further configured to:
calculating a first similarity between the first type of features and a second similarity between the second type of features according to the statistical features;
constructing a first covariance matrix according to the first similarity, and constructing a second covariance matrix according to the second similarity;
calculating a first probability distribution of the first covariance matrix in a preset feature space, and calculating a second probability distribution of the second covariance matrix in the preset feature space.
In some embodiments, the ranking module 204 is further configured to:
extracting article features of the sample articles;
dividing the article features into a third type of features and a fourth type of features according to the user interest level;
calculating a second loss value according to the cluster distribution of the third type of features, and calculating a third loss value according to the cluster distribution of the fourth type of features;
and adjusting parameters of a preset neural network model according to the first loss value, the second loss value and the third loss value until the model converges.
In some embodiments, the ranking module 204 is further configured to:
extracting user features and article features from the sample articles;
inputting the article features and the user features into a preset neural network model to obtain a predicted value of the user's interest in the sample articles;
calculating a fourth loss value according to the actual value and the predicted value of the user's interest in the sample articles;
and adjusting parameters of a preset neural network model according to the first loss value and the fourth loss value until the model converges.
It should be noted that the article recommendation device 200 provided in the embodiments of the present application is based on the same concept as the article recommendation method in the above embodiments; any method provided in the method embodiments may be implemented by the article recommendation device 200, and its detailed implementation is described in the method embodiments and is not repeated here.
As can be seen from the above, in the recall stage the article recommendation device provided by the embodiments of the present application improves the semantic similarity between popular articles and long-tail articles by constructing the fusion weighting parameters, thereby raising the probability that long-tail articles are recalled from the candidate articles and improving the recommendation effect for long-tail articles. In the ranking stage, the preset neural network model is trained by aligning the attributes of sample articles of different categories over their probability distributions, which balances the proportion of the different categories in the ranking; ranking the recommended articles with the trained model therefore balances the ranking of popular and long-tail articles, raises the position of long-tail articles in the ranking, and makes it easier to recommend long-tail articles to users.
The embodiment of the present application further provides a computer device 300, which may be a server, as shown in fig. 7; fig. 7 is a schematic structural diagram of the computer device 300 according to an embodiment of the present application. The computer device 300 includes a processor 301 with one or more processing cores, a memory 302 with one or more computer-readable storage media, and a computer program stored on the memory 302 and executable on the processor. The processor 301 is electrically connected to the memory 302. Those skilled in the art will appreciate that the structure of the computer device 300 shown in the figure does not constitute a limitation of the computer device 300, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The processor 301 is the control center of the computer device 300. It connects the various parts of the computer device 300 through various interfaces and lines, and performs the various functions of the computer device 300 and processes data by running or loading the software programs and/or modules stored in the memory 302 and invoking the data stored in the memory 302, thereby monitoring the computer device 300 as a whole.
In the embodiments of the present application, the processor 301 in the computer device 300 loads instructions corresponding to the processes of one or more application programs into the memory 302, and executes the application programs stored in the memory 302, so as to implement the following functions:
determining a reference item from the candidate items according to the historical behavior data;
determining, from the candidate items, a recommended item meeting a preset condition according to the reference item and the fusion weighting parameters predetermined between every two candidate items, wherein the preset condition is that the similarity indicated by the fusion weighting parameter between the reference item and the recommended item is greater than a preset threshold.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
As can be seen from the above, in the recall stage the computer device provided in this embodiment improves the semantic similarity between popular items and long-tail items by constructing the fusion weighting parameter, thereby raising the probability that long-tail items are recalled from the candidate items and improving the recommendation effect for long-tail items. In the ranking stage, the preset neural network model is trained by aligning the attributes of sample items of different categories over their probability distributions, which balances the proportion of the different categories in the ranking; ranking the recommended items with the trained model therefore balances the ranking of popular and long-tail items, raises the position of long-tail items in the ranking, and makes it easier to recommend long-tail items to users.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods in the above embodiments may be completed by instructions, or by instructions controlling associated hardware; the instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a computer readable storage medium having stored therein a plurality of computer programs that can be loaded by a processor to perform the steps of any of the item recommendation methods provided by the embodiments of the present application. For example, the computer program may perform the steps of:
determining a reference item from the candidate items according to the historical behavior data;
determining, from the candidate items, a recommended item meeting a preset condition according to the reference item and the fusion weighting parameters predetermined between every two candidate items, wherein the preset condition is that the similarity indicated by the fusion weighting parameter between the reference item and the recommended item is greater than a preset threshold.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
The storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like. Since the computer program stored in the storage medium can execute the steps of any item recommendation method provided by the embodiments of the present application, it can achieve the beneficial effects of any such method; see the foregoing embodiments for details, which are not repeated here.
The item recommendation method, device, storage medium and computer device provided by the embodiments of the present application have been described in detail above; specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, those skilled in the art may make changes to the specific embodiments and the scope of application according to the ideas of the present application. In summary, the content of this description should not be construed as limiting the present application.

Claims (11)

1. An item recommendation method, comprising:
determining a reference item from the candidate items according to the historical behavior data;
and determining a recommended item meeting a preset condition from the candidate items according to the predetermined fusion weighting parameter between every two candidate items and the reference item, wherein the preset condition is that the similarity indicated by the fusion weighting parameter between the reference item and the recommended item is greater than a preset threshold.
2. The method according to claim 1, wherein before determining a recommended item meeting a preset condition from the candidate items according to the predetermined fusion weighting parameter between each two candidate items and the reference item, the method further comprises:
acquiring description information of the candidate items, and extracting text features and image features from the description information;
according to the text characteristics, calculating the text similarity between every two candidate items;
calculating the image similarity between every two candidate items according to the image characteristics;
determining multi-modal similarity between every two candidate items according to the text similarity and the image similarity;
and determining the fusion weighting parameter according to the multi-mode similarity.
3. The method of claim 2, wherein said determining said fusion weighting parameter based on said multi-modal similarity comprises:
acquiring user behavior data of the candidate items, and extracting behavior features from the user behavior data;
calculating the first-order neighbor similarity and the second-order neighbor similarity between every two candidate items according to a collaborative filtering algorithm;
and determining the fusion weighting parameter according to the first-order neighbor similarity, the second-order neighbor similarity and the multi-modal similarity.
4. A method according to any one of claims 1 to 3, wherein after determining a recommended item meeting a preset condition from the candidate items according to a predetermined fusion weighting parameter between each two candidate items and the reference item, the method further comprises:
inputting the recommended articles into a preset neural network model to obtain a recommended sequence of the recommended articles, wherein the preset neural network model is obtained by performing attribute alignment training on sample articles of different categories on probability distribution;
and sending the recommended articles to the user terminal according to the recommended sequence.
5. The method according to claim 4, wherein the method further comprises:
acquiring a sample set, and extracting statistical characteristics of sample articles in the sample set;
dividing the statistical features into a first type of features and a second type of features according to the popularity of the articles;
calculating a first probability distribution of the first type of features in a preset feature space, and calculating a second probability distribution of the second type of features in the preset feature space;
calculating a first loss value according to the first probability distribution and the second probability distribution;
and adjusting parameters of the preset neural network model according to the first loss value until the model converges.
6. The method of claim 5, wherein the computing a first probability distribution of the first type of feature in a preset feature space and the computing a second probability distribution of the second type of feature in the preset feature space comprises:
calculating a first similarity between the first type of features and a second similarity between the second type of features according to the statistical features;
constructing a first covariance matrix according to the first similarity, and constructing a second covariance matrix according to the second similarity;
calculating a first probability distribution of the first covariance matrix in the preset feature space, and calculating a second probability distribution of the second covariance matrix in the preset feature space.
7. The method of claim 5, wherein adjusting parameters of the predetermined neural network model according to the first loss value until the model converges comprises:
extracting article features of the sample articles;
classifying the article features into a third type of features and a fourth type of features according to the user interest level;
calculating a second loss value according to the cluster distribution of the third type of features, and calculating a third loss value according to the cluster distribution of the fourth type of features;
and adjusting parameters of the preset neural network model according to the first loss value, the second loss value and the third loss value until the model converges.
8. The method of claim 5, wherein adjusting parameters of the predetermined neural network model according to the first loss value until the model converges comprises:
extracting user features and article features from the sample articles;
inputting the article features and the user features into the preset neural network model to obtain a predicted value of the user's interest in the sample articles;
calculating a fourth loss value according to the actual value and the predicted value of the user's interest in the sample articles;
and adjusting parameters of the preset neural network model according to the first loss value and the fourth loss value until the model converges.
9. An article recommendation device, comprising:
the data acquisition module is used for determining a reference article from candidate articles according to the historical behavior data;
and the recall module is used for determining, from the candidate articles, a recommended article meeting a preset condition according to the reference article and a fusion weighting parameter predetermined between every two candidate articles, wherein the preset condition is that the similarity indicated by the fusion weighting parameter between the reference article and the recommended article is greater than a preset threshold.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when run on a computer, causes the computer to perform the item recommendation method of any one of claims 1 to 8.
11. A computer device comprising a processor and a memory, the memory storing a computer program, wherein the processor is configured to perform the item recommendation method of any one of claims 1 to 8 by invoking the computer program.
CN202210157877.6A 2022-02-21 2022-02-21 Article recommendation method and device, storage medium and computer equipment Pending CN116680466A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210157877.6A CN116680466A (en) 2022-02-21 2022-02-21 Article recommendation method and device, storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210157877.6A CN116680466A (en) 2022-02-21 2022-02-21 Article recommendation method and device, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN116680466A true CN116680466A (en) 2023-09-01

Family

ID=87782369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210157877.6A Pending CN116680466A (en) 2022-02-21 2022-02-21 Article recommendation method and device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN116680466A (en)

Similar Documents

Publication Publication Date Title
US12020267B2 (en) Method, apparatus, storage medium, and device for generating user profile
WO2020238502A1 (en) Article recommendation method and apparatus, electronic device and storage medium
CN114202061A (en) Article recommendation method, electronic device and medium based on generation of confrontation network model and deep reinforcement learning
CN111177538A (en) Unsupervised weight calculation-based user interest tag construction method
KR102376652B1 (en) Method and system for analazing real-time of product data and updating product information using ai
CN112380453A (en) Article recommendation method and device, storage medium and equipment
CN115878841A (en) Short video recommendation method and system based on improved bald eagle search algorithm
JP2017201535A (en) Determination device, learning device, determination method, and determination program
CN111209469A (en) Personalized recommendation method and device, computer equipment and storage medium
CN112749330A (en) Information pushing method and device, computer equipment and storage medium
Liu et al. Product optimization design based on online review and orthogonal experiment under the background of big data
Wei et al. Online education recommendation model based on user behavior data analysis
CN116823410B (en) Data processing method, object processing method, recommending method and computing device
CN113449200B (en) Article recommendation method and device and computer storage medium
CN114817692A (en) Method, device and equipment for determining recommended object and computer storage medium
Yu et al. A graph attention network under probabilistic linguistic environment based on Bi-LSTM applied to film classification
CN112258285A (en) Content recommendation method and device, equipment and storage medium
KR102429778B1 (en) Curation system for recommending contents with user orientation
Jia et al. Dynamic group recommendation algorithm based on member activity level
CN116680466A (en) Article recommendation method and device, storage medium and computer equipment
CN112905885A (en) Method, apparatus, device, medium, and program product for recommending resources to a user
CN111651643A (en) Processing method of candidate content and related equipment
CN112948589B (en) Text classification method, text classification device and computer-readable storage medium
CN117786234B (en) Multimode resource recommendation method based on two-stage comparison learning
Li The Adoption Factors of Mobile Games in the Wireless Environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication