CN114817724A - Evaluation method and device for recommendation algorithm and storage medium

Info

Publication number: CN114817724A
Application number: CN202210458220.3A
Authority: CN
Inventors: 姜文君; 郭治焱; 李肯立; 李克勤
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2022-04-27
Filing date: 2022-04-27
Publication date: 2022-07-29
Anticipated expiration: 2042-04-27

Abstract

The application provides an evaluation method and related equipment for a recommendation algorithm, which can evaluate recommendation results of a surprise-oriented recommendation algorithm and more accurately measure the quality of a recommendation list. The method comprises the following steps: determining N recommendation lists and historical behavior records corresponding to a target user, wherein N is an integer greater than or equal to 1; determining the curiosity degree of each aspect in the aspect set of the target user according to the historical behavior record; calculating the relevance of the target user and each item in a target recommendation list in each aspect in the aspect set, wherein the target recommendation list is any one recommendation list in the N recommendation lists; determining the profit of the target user for each item in the target recommendation list according to the curiosity degree and the correlation; and determining the overall profit score of the target recommendation list according to the profit of each item.

Description

Evaluation method and device for recommendation algorithm and storage medium

[ technical field ]

The present application relates to the field of recommendation, and in particular, to a method and an apparatus for evaluating a recommendation algorithm, and a storage medium.

[ background of the invention ]

Recommendation systems play an important role in the era of information explosion: they help users find interesting information among massive amounts of information, and they have been widely applied in many fields, such as news recommendation, e-commerce recommendation and short video recommendation. The output of a recommendation algorithm is usually a list whose elements are sorted according to a certain standard; the order reflects how interested the recommendation algorithm considers the user to be in each item.

At present, the evaluation indexes for the recommendation lists produced by recommendation algorithms mainly include Accuracy, Recall, Precision, and Normalized Discounted Cumulative Gain (nDCG), among others. Such evaluation indexes consider recommendation results of higher accuracy to have higher quality.

However, a high accuracy rate cannot fully reflect the quality of the recommendation result. On the contrary, it makes the recommended content increasingly homogeneous and predictable, which degrades the user experience, so that the effect of the recommendation algorithm ultimately cannot be reasonably evaluated.

[ summary of the invention ]

The application provides an evaluation method, an evaluation device and a storage medium for a recommendation algorithm, which can evaluate the recommendation results of a surprise-oriented recommendation algorithm and more accurately measure the profit of a recommendation list.

A first aspect of the present application provides an evaluation method for a recommendation algorithm, including:

determining N recommendation lists and historical behavior records corresponding to a target user, wherein N is an integer greater than or equal to 1, and the N recommendation lists correspond to the recommendation algorithm;

determining the curiosity degree of each aspect in the aspect set of the target user according to the historical behavior record;

calculating the relevance of the target user and each item in a target recommendation list in each aspect in the aspect set, wherein the target recommendation list is any one recommendation list in the N recommendation lists;

determining the profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance;

and determining the overall profit score of the target recommendation list according to the profit of each item.

A second aspect of the present application provides an evaluation apparatus for a recommendation algorithm, including:

a determining unit, configured to determine N recommendation lists and historical behavior records corresponding to a target user, where N is an integer greater than or equal to 1, and the N recommendation lists correspond to the recommendation algorithm;

the curiosity degree determining unit is used for determining the curiosity degree of each aspect in the aspect set of the target user according to the historical behavior record;

an aspect correlation calculation unit, configured to calculate a correlation between the target user and each item in a target recommendation list in each aspect of the aspect set, where the target recommendation list is any one recommendation list in the N recommendation lists;

a profit determining unit, configured to determine a profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance;

and the score determining unit is used for determining the overall profit score of the target recommendation list according to the profit of each item.

A third aspect of embodiments of the present application provides a computer device, which includes at least one processor, a memory and a transceiver that are connected to one another, wherein the memory is configured to store program codes, and the processor is configured to call the program codes in the memory to execute the steps of the evaluation method for a recommendation algorithm according to the first aspect.

A fourth aspect of the embodiments of the present application provides a computer storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the steps of the evaluation method for recommendation algorithm described in any of the above aspects.

Compared with the related art, the embodiments provided by the application evaluate the surprise and diversity of top-K recommendation at a finer granularity, based on the aspect-level relationship between the user and the items. The benefit that a recommendation list brings to the user is quantified from the angles of surprise and diversity. The evaluation takes into account the strength with which the user explores different aspects and, from the perspective of the user's interaction history, the influence that items interacted with at different times have on the user's curiosity. The recommendation results of a surprise-oriented recommendation algorithm can thus be evaluated, and the overall profit score of the recommendation list can be measured more accurately.

[ description of the drawings ]

Fig. 1 is a schematic flowchart of an evaluation method for a recommendation algorithm according to an embodiment of the present application;

fig. 2 is a schematic virtual structure diagram of an evaluation apparatus for a recommendation algorithm according to an embodiment of the present application;

fig. 3 is a schematic diagram of a hardware structure of a server according to an embodiment of the present application.

[ detailed description ]

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments.

The recommendation system plays an important role in the era of information explosion, can help users to find information which is interesting to the users from massive information, and has been widely applied to many fields, such as news recommendation, e-commerce recommendation, short video recommendation and the like. The existing evaluation indexes of numerous recommendation algorithms can be mainly divided into the following categories:

accuracy-oriented indexes such as Accuracy, Precision, Recall and Hit, which focus on evaluating the accuracy of the recommendation result and mainly consider whether the recommendation list contains items related to the user. However, they are insensitive to the ordering of the items in the list, and because they over-emphasize accuracy, the recommended content tends to become homogeneous and the user is trapped in an 'information cocoon';

ranking-oriented indexes, which consider not only whether the recommendation list includes items clicked by the user but also the order in which the items related to the user appear in the list. Although these indexes take both accuracy and item ordering into account, they still use accuracy to represent user satisfaction, so algorithms with high scores also suffer from homogeneous recommended content and the 'information cocoon';

indexes such as Diversity, Novelty, Relevance, Unexpectedness and continuity, which, from the viewpoint of the relationships among the items in the recommendation list and between the items and the user's preferences, expect the recommendation list to meet certain requirements so as to improve user satisfaction. For example, Diversity requires the items in the recommendation list to be as diverse as possible, and Relevance requires the items in the recommendation list to be as related to the user's interests as possible. Although these indexes reflect the user's satisfaction with the recommendation list more directly, their definitions are not clear, and they cannot measure a particular user's satisfaction with the recommendation result in a personalized manner.

Most of the above evaluation methods consider only the accuracy of the recommendation result or favor a single facet of its utility, so they cannot comprehensively reflect the user's satisfaction with the recommendation result and are difficult to adapt to the evaluation requirements of different recommendation scenarios.

In view of the above, the present application evaluates the utility of the items in a recommendation list to a user in terms of surprise. After a recommendation algorithm generates a recommendation list for a certain user, the curiosity of the user about each aspect is calculated; the relevance of the items in the recommendation list to the user is calculated aspect by aspect and item by item; finally, the curiosity and the relevance are combined to calculate the profit of each item, and the profits are summed with a position-based discount to obtain the user's score for the recommendation list. The present application can therefore be used to evaluate the recommendation results of a surprise-oriented recommendation algorithm, to measure the quality of a recommendation list more accurately, and to evaluate, compare and select a suitable recommendation algorithm more comprehensively.

The following describes an evaluation method for a recommendation algorithm from the perspective of an evaluation device for a recommendation algorithm, which may be a server or a service unit in a server, and is not particularly limited.

Referring to fig. 1, fig. 1 is a schematic flow chart of an evaluation method for a recommendation algorithm according to an embodiment of the present application, including:

101. Determining N recommendation lists and historical behavior records corresponding to the target user.

In this embodiment, the evaluation device for the recommendation algorithm may determine N recommendation lists and historical behavior records corresponding to the target user, where N is an integer greater than or equal to 1. The historical behavior record is the sequence of items with which the target user has interacted, arranged by interaction time; the sequence is denoted by $S_u$,

wherein

is the item with which the target user has most recently interacted. The N recommendation lists correspond to the recommendation algorithms to be evaluated.

The manner in which the N recommendation lists corresponding to the target user are generated and the manner in which the historical behavior record corresponding to the target user is determined are not particularly limited here, as long as both can be determined. In addition, the N recommendation lists may be generated by N different recommendation algorithms respectively, in which case the N different recommendation algorithms are the algorithms to be evaluated.
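The inputs of step 101 can be pictured with a minimal sketch. The names `build_history` and `serial_numbers`, the item identifiers, and the oldest-first ordering are illustrative assumptions and do not come from the application; the sketch also assumes that each item appears at most once in the history.

```python
from typing import Dict, List, Tuple


def build_history(interactions: List[Tuple[str, float]]) -> List[str]:
    """Return S_u: item ids ordered by interaction time (oldest first, by assumption)."""
    return [item for item, _ in sorted(interactions, key=lambda pair: pair[1])]


def serial_numbers(history: List[str]) -> Dict[str, int]:
    """Map each item to its 1-based serial number t_{u,i} in S_u."""
    return {item: position for position, item in enumerate(history, start=1)}


# Hypothetical (item, timestamp) pairs for one target user.
interactions = [("book_b", 20.0), ("book_a", 10.0), ("book_c", 30.0)]
S_u = build_history(interactions)
t_u = serial_numbers(S_u)

# N recommendation lists, one per algorithm to be evaluated (names are made up).
recommendation_lists = {
    "algorithm_1": ["book_d", "book_e"],
    "algorithm_2": ["book_e", "book_f"],
}
```

With this representation the last element of `S_u` is the most recently interacted item, and `len(recommendation_lists)` plays the role of N.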

102. Determining the curiosity degree of the target user about each aspect in the aspect set according to the historical behavior record.

In this embodiment, after the evaluation device for the recommendation algorithm determines the historical behavior record corresponding to the target user, the curiosity degree of the target user about each aspect in the aspect set may be determined according to the historical behavior record. The surprise of a recommendation is related to the following factors:

1. the match between the item and the user, for example whether the item matches the user's long-term interest preferences or short-term needs;

2. the relationships between the items in the recommendation list, such as the ordering of the items in the list and the differences between the items;

3. the relationship between the item and the user's historical behavior, such as the time elapsed since the item last appeared and the time elapsed since an item of the same kind last appeared.

In this application, surprise is defined as having two parts: curiosity and relevance. Items that surprise the user should therefore be unexpected to the user but still relevant to the user, e.g., it is inferred from the behavior of user u that he would be interested in history and politics, and $S_u$ is also related to these aspects.

In general, the user's desire to explore the various aspects is constantly changing. During exploration, the user's curiosity about (i.e., strength of exploration of) a certain aspect $a_\phi$ of the aspect set A is mainly influenced by two factors:

1. the number of items involving aspect $a_\phi$ that were consumed over a period of time. Considering the quantity alone is not sufficient, however; the accumulated relevance between the items and aspect $a_\phi$ is also important. When the cumulative relevance of the items consumed by user u to aspect $a_\phi$ is high, the user's curiosity about aspect $a_\phi$ is lower; otherwise it is higher;

2. the cumulative score of user u on aspect $a_\phi$. A higher cumulative score of user u on aspect $a_\phi$ indicates that user u is highly satisfied with aspect $a_\phi$, i.e., the user's interest in aspect $a_\phi$ is approaching saturation, so the user's curiosity about aspect $a_\phi$ may not be high.

In one embodiment, the evaluation device for the recommendation algorithm determining, according to the historical behavior record, the curiosity degree of the target user about each aspect in the aspect set comprises:

determining the temporal weight of each item in the historical behavior record and the target user's score for each item on each aspect;

and determining the curiosity degree of the target user about each aspect in the aspect set according to the temporal weight of each item in the historical behavior record and the target user's score for each item on each aspect.

Specifically, the evaluation device for the recommendation algorithm may determine the curiosity degree of the target user about each aspect in the aspect set by the following formula:

where u is the target user, $\tau_{u,\phi}$ is the curiosity degree of target user u about aspect $a_\phi$ in the aspect set, $S_u$ is the historical behavior record of target user u, m is the number of aspects included in the aspect set,

$t_{u,i} \in [1 \ldots |S_u|]$,

is the result of reordering $t_{u,i}$ according to the preset rule,

is the score of target user u for item i on aspect $a_\phi$, determined by the following formulas

and

wherein $\varepsilon$ is a parameter designated by the target user, and $\alpha \in [0,1]$,

represents the temporal weight of the item in the user's interaction sequence; its magnitude represents how strongly the item influences the user's curiosity. $\alpha$ controls the distribution of the temporal weights: the smaller $\alpha$ is, the greater the influence of recently interacted items on the user; the larger $\alpha$ is, the greater the influence of items interacted with earlier.

Note that $\tau_{u,\phi}$ denotes the curiosity of u about $a_\phi$. Because curiosity is subjective, it is calculated in a relatively objective manner: according to the user's historical behavior, the more items involving an aspect the user has consumed, the less likely that aspect is to surprise the user. Thus, the target user's curiosity degree for each aspect in the aspect set is calculated by the following formula:

wherein

is the result of reordering $t_{u,i}$ according to the preset rule, $t_{u,i} \in [1 \ldots |S_u|]$ denotes the serial number of item i in $S_u$,

and can be calculated by the following formula:

wherein $\alpha \in [0,1]$ is a preset parameter that controls the degree to which items at different positions in the interaction order of $S_u$ influence the user's curiosity;

can be regarded as the temporal weight of item i in $S_u$; the item at

has the smallest temporal weight, 1, and contributes least to the curiosity of user u. Physical meaning of the parameter $\alpha$: it balances relevance and surprise, and balances the short-term needs and long-term preferences of the user. In recommendation, the items with which the user has interacted recently reflect the user's current needs, while interactions over a longer period reflect the user's long-term, stable preferences and those with which the user has interacted less. By adjusting $\alpha$, either the items interacted with earlier or the more recent items can be given higher weight, and, combined with the user's scores for the items, the curiosity of the user about the different aspects can be reflected. Because the data are sparse, the user may not have scored every aspect, so the aspect scores of the items need to be smoothed:

wherein $\varepsilon$ is a parameter designated by the target user; when the target user has not scored a certain aspect, this value is used as the score. The curiosity degree calculation formula after smoothing is therefore as follows:
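The smoothing rule described above (use $\varepsilon$ whenever an aspect has no score) can be illustrated with a minimal sketch. Representing a missing score as `None` and the function name `smooth_score` are assumptions made for illustration only; the curiosity formula itself is not reproduced here.

```python
from typing import Optional


def smooth_score(raw_score: Optional[float], epsilon: float) -> float:
    """Return the smoothed aspect score.

    raw_score is the user's score r_{u,i,phi} for an item on one aspect,
    or None when the user did not score that aspect (assumed encoding).
    epsilon is the fallback value used for unscored aspects.
    """
    return epsilon if raw_score is None else raw_score


# A hypothetical item scored on "history" but not on "politics".
item_scores = {"history": 4.0, "politics": None}
smoothed = {aspect: smooth_score(score, epsilon=0.1) for aspect, score in item_scores.items()}
```

Keeping the fallback in one small function makes it easy to swap in a different smoothing strategy later without touching the rest of the evaluation.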

103. Calculating the relevance of the target user and each item in the target recommendation list in each aspect in the aspect set.

In this embodiment, the evaluation device for the recommendation algorithm may calculate the relevance of the target user and each item in the target recommendation list in each aspect of the aspect set, where the target recommendation list is any one recommendation list in the N recommendation lists. Specifically, the evaluation device for the recommendation algorithm may first determine the probability that each item satisfies the target user's interest in each aspect, and the probability that the target user is still interested in each aspect after browsing the items ranked ahead of that item in the target recommendation list; it then determines the relevance of the target user and each item in the target recommendation list in each aspect according to these two probabilities. The following is a detailed description:

the evaluation device for the recommendation algorithm may determine the relevance of the target user and each item in the target recommendation list in each aspect by the following formula:

wherein relevance(u, i, φ) is the relevance of target user u and item i in aspect $a_\phi$, $P(a_\phi \mid u, i)$ is the probability that item i satisfies target user u's interest in aspect $a_\phi$,

is the probability that target user u is still interested in aspect $a_\phi$ after browsing the items ranked ahead of item i in the target recommendation list, $I^r$ is the target recommendation list including K recommended items, and $P(a_\phi \mid u, i)$ and

are determined by the following formulas

wherein

$r_{max} = \max\{r_{u,i,\phi} \mid i \in S_u\}$, and $P(a_\phi \mid u)$ is the preference of target user u for aspect $a_\phi$, determined by the following formula:

A is the aspect set, $S_u$ is the historical behavior record of the target user, $r_{u,i,\phi}$ is the score of the target user u for item i on aspect $a_\phi$,

is obtained by smoothing $r_{u,i,\phi}$.

It should be noted that the curiosity degree measures the user's curiosity about, and strength of exploration of, the various aspects. When a recommendation list is evaluated, the surprise and profit that the user gains from the items in the list depend not only on curiosity but also on the relevance of the items themselves to the aspects, and the position of each item in the list is also critical. Consider an example: after entering keywords into a search engine, users typically expect the relevant documents to be ranked first, and the contents of different documents to overlap as little as possible. From this it can be determined that:

1. the degree to which the user matches the content of an item depends not only on the content of the item but also on the subjectivity of the user, i.e., different people match the same item to different degrees;

2. the items in the recommendation list are redundant in content; if the items in front of an item have already met the user's needs in a certain aspect, the value of that item to the user is reduced.

In the present application, it is assumed that the aspects of an item are independent, that is, the relevance of the user to the item is jointly determined by m independent aspects, namely:

wherein relevance(u, i, φ) is the relevance of target user u and item i in aspect $a_\phi$. The relevance of the user to the item depends on how well the user and the item match and on the ordering of the item; in the present application, the aspect-level relevance of the user to the item is defined as follows:

wherein $P(a_\phi \mid u, i)$ is the probability that item i satisfies target user u's interest in aspect $a_\phi$, i.e., the relevance of target user u and item i in aspect $a_\phi$,

is the probability that target user u is still interested in aspect $a_\phi$ after browsing the items ranked ahead of item i in the target recommendation list, i.e., target user u's remaining interest in aspect $a_\phi$ after viewing the items arranged in front of item i in the target recommendation list. How much aspect $a_\phi$ can still satisfy the target user depends not only on whether the item is related to $a_\phi$ but also on how many of the items ranked ahead of item i have already satisfied target user u's interest in aspect $a_\phi$;

measures the contribution of item i to satisfying target user u in aspect $a_\phi$ after redundancy is removed, i.e., the aspects in which the user has not yet been satisfied are satisfied preferentially, which requires the items of the target recommendation list $I^r$ to have as high an aspect coverage as possible. Wherein,

is defined as follows:

wherein $P(a_\phi \mid u)$ is the preference of target user u for aspect $a_\phi$. It differs from the curiosity degree $\tau_{u,\phi}$ of step 102 in that $P(a_\phi \mid u)$ emphasizes the relationship between target user u and aspect $a_\phi$, i.e., the preference of target user u, whereas $\tau_{u,\phi}$ emphasizes the distribution of target user u's curiosity given the state $S_u$. The preference of target user u for the different aspects can be accurately estimated from the historical behavior record of target user u:

wherein A is the aspect set, $S_u$ is the historical behavior record of the target user, $r_{u,i,\phi}$ is the score of target user u for item i on aspect $a_\phi$,

is obtained by smoothing $r_{u,i,\phi}$.
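The preference formula itself is not reproduced in the text above. As a rough sketch only, the following assumes one plausible form consistent with the listed symbols: the smoothed scores of each aspect are summed over $S_u$ and normalized over the aspect set A, so that the preferences sum to 1. The normalization, the function name `aspect_preference` and the `None` encoding of missing scores are assumptions, not the formula of the application.

```python
from typing import Dict, List, Optional

# item id -> aspect -> raw score, None meaning "not scored" (assumed encoding)
Ratings = Dict[str, Dict[str, Optional[float]]]


def aspect_preference(ratings: Ratings, aspects: List[str], epsilon: float) -> Dict[str, float]:
    """Estimate P(a_phi | u) as the normalized sum of smoothed scores (assumed form)."""
    totals = {aspect: 0.0 for aspect in aspects}
    for item_scores in ratings.values():
        for aspect in aspects:
            raw = item_scores.get(aspect)
            # Smoothing: fall back to epsilon when the aspect was not scored.
            totals[aspect] += epsilon if raw is None else raw
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No evidence at all: fall back to a uniform preference.
        return {aspect: 1.0 / len(aspects) for aspect in aspects}
    return {aspect: totals[aspect] / grand_total for aspect in aspects}


history_ratings = {
    "i1": {"history": 4.0, "politics": None},
    "i2": {"history": 2.0, "politics": 2.0},
}
preference = aspect_preference(history_ratings, ["history", "politics"], epsilon=0.0)
```

In this toy history the user has given far more score mass to "history", so the estimated preference leans toward that aspect.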

$P(a_\phi \mid u, i)$ is personalized, since different users experience the same item differently; it is defined as follows:

if aspect $a_\phi$ is unrelated to item i, then $P(a_\phi \mid u, i)$ is 0;

if item i exhibits aspect $a_\phi$ but is not in the historical behavior record corresponding to the target user, then $P(a_\phi \mid u, i)$ is given by the function $f(u, i, a_\phi)$. Different strategies may be applied to estimate $P(a_\phi \mid u, i)$, e.g., generating it according to $S_u$ or specifying a small constant; in practice, the function $f(u, i, a_\phi)$ may be replaced by existing methods, such as click-through rate estimation;

for an item i with which target user u has interacted but which the user has not scored on aspect $a_\phi$, $g(u, i, a_\phi)$ can be specified using different strategies, for example using the average or minimum of target user u's scores on the other aspects of item i, using the average or minimum of target user u's scores on aspect $a_\phi$ of other items, or specifying a small constant;

for an aspect that target user u has scored, the score is sufficient to indicate target user u's preference for item i on aspect $a_\phi$, and the function $h(u, i, a_\phi)$ is defined as:

wherein $r_{max} = \max\{r_{u,i,\phi} \mid i \in S_u\}$ denotes the maximum value among target user u's scores on the aspect;
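The four cases above can be sketched as a single function. The strategies `f` and `g` are passed in as callables because the text leaves them open. The definition of $h$ is not reproduced above, so the ratio `score / r_max` used here is an assumption suggested by the definition of $r_{max}$; the function name and the data layout are likewise illustrative.

```python
from typing import Callable, Dict, Optional, Set

Ratings = Dict[str, Dict[str, Optional[float]]]  # item -> aspect -> score or None


def p_item_satisfies_aspect(
    item: str,
    aspect: str,
    item_aspects: Dict[str, Set[str]],
    ratings: Ratings,
    f: Callable[[str, str], float],
    g: Callable[[str, str], float],
) -> float:
    """Piecewise estimate of P(a_phi | u, i) for one target user."""
    # Case 1: the aspect is unrelated to the item.
    if aspect not in item_aspects.get(item, set()):
        return 0.0
    # Case 2: the item is not in the user's history; defer to strategy f.
    if item not in ratings:
        return f(item, aspect)
    score = ratings[item].get(aspect)
    # Case 3: the item was interacted with but not scored on this aspect.
    if score is None:
        return g(item, aspect)
    # Case 4: scored aspect; h is ASSUMED to be score / r_max.
    scored = [s.get(aspect) for s in ratings.values() if s.get(aspect) is not None]
    r_max = max(scored)
    return score / r_max if r_max > 0 else 0.0


item_aspects = {"i1": {"history"}, "i2": {"history", "politics"}, "i3": {"politics"}}
ratings = {"i1": {"history": 4.0}, "i2": {"history": 2.0, "politics": None}}
small_f = lambda item, aspect: 0.05  # hypothetical constant for unseen items
small_g = lambda item, aspect: 0.1   # hypothetical constant for unscored aspects
```

Passing `f` and `g` as parameters keeps the sketch neutral about which estimation strategy is used, in line with the alternatives listed in the text.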

It should be noted that the curiosity degree of the target user about each aspect in the aspect set is determined in step 102, and the relevance of the target user and each item in the target recommendation list in each aspect is calculated in step 103. There is no restriction on the execution order of these two steps: step 102 may be executed first, step 103 may be executed first, or the two steps may be executed simultaneously, which is not specifically limited.
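The redundancy-aware relevance of step 103 can be illustrated for a single aspect. The sketch assumes that the relevance is the product of the two probabilities named in the text, and that the user's remaining interest starts at $P(a_\phi \mid u)$ and shrinks by a factor $1 - P(a_\phi \mid u, j)$ for every item j ranked ahead. This is a common intent-aware form and is an assumption here, since the formulas of the application are not reproduced in the text.

```python
from typing import Dict, List


def aspect_relevance(
    rec_list: List[str],
    aspect_pref: float,
    p_item: Dict[str, float],
) -> Dict[str, float]:
    """Relevance of each listed item in ONE aspect, discounted for redundancy.

    rec_list    : target recommendation list, best-ranked item first
    aspect_pref : P(a_phi | u), the user's preference for the aspect
    p_item      : item -> P(a_phi | u, i)
    """
    relevance: Dict[str, float] = {}
    remaining_interest = aspect_pref  # interest left before any item is browsed
    for item in rec_list:
        p = p_item.get(item, 0.0)
        relevance[item] = p * remaining_interest
        # ASSUMED update: items already seen consume part of the interest.
        remaining_interest *= 1.0 - p
    return relevance


example = aspect_relevance(["a", "b"], aspect_pref=0.5, p_item={"a": 0.5, "b": 0.5})
```

Items `a` and `b` match the aspect equally well, but `b` receives a lower relevance because `a`, ranked ahead of it, has already satisfied part of the user's interest in that aspect.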

104. Determining the profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance.

In this embodiment, after determining the curiosity degree of the target user about each aspect in the aspect set and the relevance of the target user and each item in the target recommendation list in each aspect in the aspect set, the evaluation device for the recommendation algorithm may determine the profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance. Specifically, the profit of the target user for each item in the target recommendation list may be determined by the following formula:

wherein gain(u, i) is the profit of target user u for each item i in the target recommendation list, $\tau_{u,\phi}$ is the curiosity degree of target user u about each aspect $a_\phi$ in the aspect set, and relevance(u, i, φ) is the relevance of target user u and item i in aspect $a_\phi$. Here, $1 - \tau_{u,\phi} \cdot \mathrm{relevance}(u, i, \phi)$ represents that item i brings target user u no surprise in aspect $a_\phi$, and the product of this term over all m aspects,

$$\prod_{\phi=1}^{m}\bigl(1 - \tau_{u,\phi} \cdot \mathrm{relevance}(u, i, \phi)\bigr),$$

indicates that item i does not surprise target user u at all, i.e., brings no surprise in any aspect.
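Read together, the two statements above suggest that the profit of an item is the complement of the "no surprise in any aspect" probability. The sketch below follows that reading; it is an interpretation of the surrounding text under the stated independence assumption, and the function name and dictionary layout are illustrative.

```python
from typing import Dict


def item_gain(curiosity: Dict[str, float], relevance: Dict[str, float]) -> float:
    """Profit of one item for one user, combining all aspects.

    curiosity : aspect -> tau_{u,phi}
    relevance : aspect -> relevance(u, i, phi) for the item being scored
    """
    not_surprised = 1.0
    for aspect, tau in curiosity.items():
        # Probability of no surprise in this aspect.
        not_surprised *= 1.0 - tau * relevance.get(aspect, 0.0)
    # Surprised in at least one aspect.
    return 1.0 - not_surprised


gain_example = item_gain(
    curiosity={"history": 0.5, "politics": 0.5},
    relevance={"history": 1.0, "politics": 0.0},
)
```

An item that is irrelevant in every aspect, or that touches only aspects about which the user has no curiosity, gets a profit of 0 under this reading.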

105. Determining the overall profit score of the target recommendation list according to the profit of each item.

In this embodiment, after determining the profit of each item, the evaluation device for the recommendation algorithm may calculate the unnormalized score of the target recommendation list and the ideal profit of the target recommendation list according to the profit of each item, and then determine the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit. Specifically, the evaluation device for the recommendation algorithm may determine the unnormalized score of the target recommendation list by the following formula:

wherein ser-DCG is the unnormalized score of the target recommendation list, K is the number of items in the target recommendation list, and gain(u, i) is the profit of target user u for each item i in the target recommendation list;

determining the ideal profit of the target recommendation list by the following formula:

wherein ser-IDCG is the ideal profit of the target recommendation list, and rank(i) is the rank of item i when the items in the target recommendation list are ordered by profit value;

calculating an overall profit score for the target recommendation list based on the unnormalized scores and the ideal profits by:

wherein ser-nDCG is the overall profit score of the target recommendation list, namely the normalized score.
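The three formulas of step 105 are not reproduced in the text above. As a minimal sketch, the following assumes the conventional nDCG construction: a logarithmic position discount of 1/log2(k + 1), an ideal profit obtained by re-ranking the same profits in descending order, and a normalized score equal to the ratio of the two. The discount and the descending order are assumptions; only the quantities ser-DCG, ser-IDCG and ser-nDCG come from the text.

```python
import math
from typing import List


def ser_dcg(gains: List[float]) -> float:
    """Unnormalized score: position-discounted sum of item profits (assumed log2 discount)."""
    return sum(gain / math.log2(k + 1) for k, gain in enumerate(gains, start=1))


def ser_idcg(gains: List[float]) -> float:
    """Ideal profit: the same profits re-ranked from highest to lowest (assumed)."""
    return ser_dcg(sorted(gains, reverse=True))


def ser_ndcg(gains: List[float]) -> float:
    """Overall profit score: unnormalized score divided by ideal profit."""
    ideal = ser_idcg(gains)
    return ser_dcg(gains) / ideal if ideal > 0 else 0.0


# Profits of the items of two hypothetical lists, in list order.
well_ordered = ser_ndcg([0.5, 0.2])
badly_ordered = ser_ndcg([0.2, 0.5])
```

A list that already places its most profitable items first reaches a normalized score of 1, while the same items in a worse order score lower, which is what allows the N recommendation lists to be compared.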

In summary, the embodiments provided by the application evaluate the surprise and diversity of top-K recommendation at a finer granularity, based on the aspect-level relationship between the user and the items. The benefit that a recommendation list brings to the user is quantified from the angles of surprise and diversity. The evaluation takes into account the strength with which the user explores different aspects and, from the perspective of the user's interaction history, the influence that items interacted with at different times have on the user's curiosity. The recommendation results of a surprise-oriented recommendation algorithm can thus be evaluated, and the satisfaction of the user with the recommendation list can be measured more accurately.

The present application is explained above from the point of view of an evaluation method for a recommendation algorithm and below from the point of view of an evaluation device for a recommendation algorithm.

Referring to fig. 2, fig. 2 is a schematic view of a virtual structure of an evaluation apparatus for a recommendation algorithm according to an embodiment of the present application, where the evaluation apparatus 200 for a recommendation algorithm includes:

a determining unit 201, configured to determine N recommendation lists and historical behavior records corresponding to a target user, where N is an integer greater than or equal to 1, and the recommendation algorithm corresponds to the N recommendation lists;

the curiosity degree determining unit 202 is used for determining the curiosity degree of each aspect in the aspect set of the target user according to the historical behavior record;

an aspect correlation calculation unit 203, configured to calculate a correlation between the target user and each item in a target recommendation list in each aspect of the aspect set, where the target recommendation list is any one recommendation list in the N recommendation lists;

a profit determining unit 204, configured to determine a profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance;

a score determining unit 205, configured to determine an overall profit score of the target recommendation list according to the profit of each item.

In one possible design, the curiosity level determining unit 202 is specifically configured to:

In one possible design, the aspect correlation calculation unit 203 is specifically configured to:

determine the probability that each item satisfies the interest of the target user in each aspect, and the probability that the target user has interest in each aspect after browsing the items in the target recommendation list whose priority is higher than that of the item;

and determine the relevance of the target user and each item in the target recommendation list in each aspect according to the probability that the item satisfies the interest of the target user in that aspect and the probability that the target user has interest in that aspect after browsing the items in the target recommendation list whose priority is higher than that of the item.
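One possible reading of this design is a cascade-style model, sketched below in Python. This is an assumption for illustration, not the application's stated formula: the probability that the user still has interest in an aspect when reaching an item is taken as the product of (1 - p) over the items ranked ahead of it, and the relevance is that probability multiplied by the item's own satisfaction probability. The function name `aspect_relevance` and the input layout (one dictionary per item, in rank order) are hypothetical.

```python
from typing import Dict, List


def aspect_relevance(p_satisfy: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Per-item, per-aspect relevance under an assumed cascade model.

    p_satisfy[k][a] is the probability that the item at rank k satisfies
    the user's interest in aspect a. An aspect absent from an item's
    dictionary is treated as having probability 0 for that item.
    """
    still_interested: Dict[str, float] = {}  # aspect -> P(interest remains)
    relevance: List[Dict[str, float]] = []
    for probs in p_satisfy:  # walk the list from highest priority down
        item_rel: Dict[str, float] = {}
        for aspect, p in probs.items():
            remain = still_interested.get(aspect, 1.0)
            item_rel[aspect] = p * remain
            # Interest in this aspect survives only if this item did not satisfy it.
            still_interested[aspect] = remain * (1.0 - p)
        relevance.append(item_rel)
    return relevance
```

Under this assumption, a second item covering the same aspect as the first contributes less relevance than it would in first position, which penalizes lists that repeat one aspect.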

In one possible design, the profit determining unit 204 is specifically configured to:

determining the profit of the target user for each item in the target recommendation list by the following formula:

wherein gain(u, i) is the profit of the target user u for each item i in the target recommendation list, τ_{u,φ} is the curiosity degree of the target user u for each aspect a_φ in the aspect set, and the remaining term is the relevance of the target user u and item i in aspect a_φ.
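The formula itself is not reproduced in this text. As a hedged illustration only, the Python sketch below assumes that the profit is the curiosity-weighted sum of the per-aspect relevances, which is consistent with the symbol definitions above but is not confirmed by them. The function name `gain` mirrors the symbol in the text; the dictionary-based inputs are hypothetical.

```python
from typing import Dict


def gain(curiosity: Dict[str, float], relevance: Dict[str, float]) -> float:
    """Assumed form: sum over aspects of curiosity degree times relevance.

    curiosity[a] is the target user's curiosity degree for aspect a;
    relevance[a] is the relevance of the user and one item in aspect a.
    Aspects missing from `relevance` contribute nothing.
    """
    return sum(tau * relevance.get(aspect, 0.0)
               for aspect, tau in curiosity.items())
```

With this form, an item that is relevant only in aspects the user is not curious about yields a low profit, while the same relevance in a high-curiosity aspect yields a high one.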

In one possible design, the score determining unit 205 is specifically configured to:

calculate the unnormalized score of the target recommendation list according to the profit of each item;

calculate the ideal profit of the target recommendation list according to the profit of each item;

and determine the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit.

In one possible design, the calculating, by the score determining unit 205, the unnormalized score of the target recommendation list according to the profit of each item includes:

determining an unnormalized score for the target recommendation list by:

wherein ser-DCG is the unnormalized score of the target recommendation list, K is the number of items included in the target recommendation list, and gain(u, i) is the profit of the target user u for each item i in the target recommendation list;

the calculating, by the score determining unit 205, the ideal profit of the target recommendation list according to the profit of each item includes:

determining the ideal profit of the target recommendation list by:

wherein ser-IDCG is the ideal profit of the target recommendation list, and rank(i) is the rank of each item i when the items in the target recommendation list are sorted by profit value;

the step of determining, by the score determining unit 205, the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit includes:

determining an overall profit score for the target recommendation list by:

wherein ser-nDCG is the overall profit score of the target recommendation list.
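The three formulas are not reproduced in this text. The Python sketch below shows how such scores are conventionally computed, assuming the standard DCG logarithmic discount log2(k + 1) at position k and an ideal ordering obtained by sorting the profits in descending order. Both are assumptions borrowed from conventional nDCG, not confirmed by the application, and the function names are hypothetical.

```python
import math
from typing import Sequence


def ser_dcg(gains: Sequence[float]) -> float:
    """Unnormalized score: position-discounted sum of per-item profits.

    gains[k-1] is gain(u, i) for the item at position k of the list.
    The log2(k + 1) discount is an assumption, not taken from the text.
    """
    return sum(g / math.log2(k + 1) for k, g in enumerate(gains, start=1))


def ser_idcg(gains: Sequence[float]) -> float:
    """Ideal profit: the same sum with items re-ranked by descending profit."""
    return ser_dcg(sorted(gains, reverse=True))


def ser_ndcg(gains: Sequence[float]) -> float:
    """Overall profit score: unnormalized score divided by the ideal profit."""
    ideal = ser_idcg(gains)
    # Guard against an empty list or a list with no positive profit.
    return ser_dcg(gains) / ideal if ideal > 0 else 0.0
```

Under these assumptions the score lies in [0, 1] for non-negative profits and equals 1 exactly when the list is already ordered by descending profit, so scores of the N recommendation lists can be compared directly.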

Fig. 3 is a schematic structural diagram of a server according to the present application. As shown in fig. 3, the server 300 of this embodiment includes at least one processor 301, at least one network interface 304 or other user interface 303, a memory 305, and at least one communication bus 302. The server 300 optionally includes the user interface 303, which may comprise a display, a keyboard, or a pointing device. The memory 305 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory. The memory 305 stores execution instructions; when the server 300 runs, the processor 301 communicates with the memory 305 and calls the instructions stored in the memory 305 to execute the above evaluation method for a recommendation algorithm. The server 300 further includes an operating system 306, which contains various programs for implementing basic services and for handling hardware-dependent tasks.

The server provided in the embodiment of the present application may execute the technical solution of the above-described evaluation method for a recommendation algorithm, and the implementation principle and the technical effect are similar, which are not described herein again.

The embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a computer, implements the method flow related to the evaluation device for a recommendation algorithm in any of the above method embodiments. Correspondingly, the computer may be the above-mentioned evaluation device for a recommendation algorithm.

The embodiment of the present application further provides a computer program, or a computer program product comprising a computer program, which, when executed on a computer, causes the computer to implement the method flow related to the evaluation device for a recommendation algorithm in any of the above method embodiments. Correspondingly, the computer may be the above-mentioned evaluation device for a recommendation algorithm.

In the above-described embodiment corresponding to fig. 1, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiment may be implemented in whole or in part in the form of a computer program product.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and these modifications or substitutions do not depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1. An evaluation method for a recommendation algorithm, comprising:

2. The method of claim 1, wherein the determining the curiosity degree of the target user for each aspect in the aspect set according to the historical behavior record comprises:

3. The method of claim 1, wherein the calculating the relevance of the target user to each item in the target recommendation list for each aspect in the set of aspects comprises:

4. The method of claim 1, wherein the determining the profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance comprises:

5. The method of any one of claims 1 to 4, wherein the determining an overall profit score of the target recommendation list according to the profit of each item comprises:

6. The method of claim 5, wherein the calculating an unnormalized score of the target recommendation list according to the profit of each item comprises:

determining the unnormalized score of the target recommendation list by:

the calculating the ideal profit of the target recommendation list according to the profit of each item comprises:

determining the ideal profit of the target recommendation list by:

the determining an overall profit score of the target recommendation list according to the unnormalized score and the ideal profit comprises:

determining an overall profit score for the target recommendation list by:

wherein ser-nDCG is the overall profit score of the target recommendation list.

7. An evaluation apparatus for a recommendation algorithm, comprising:

8. The apparatus according to claim 7, wherein the score determining unit is specifically configured to:

9. A computer device, comprising:

at least one processor, a memory, and a transceiver that are connected to one another, wherein the memory is configured to store program code and the processor is configured to call the program code in the memory to perform the steps of the evaluation method for a recommendation algorithm of any one of claims 1 to 6.

10. A computer storage medium, comprising:

instructions which, when run on a computer, cause the computer to perform the steps of the evaluation method for a recommendation algorithm of any one of claims 1 to 6.