CN114817724A - Evaluation method and device for recommendation algorithm and storage medium - Google Patents

Info

Publication number
CN114817724A
Authority
CN
China
Prior art keywords
target
item
recommendation list
determining
profit
Legal status
Granted
Application number
CN202210458220.3A
Other languages
Chinese (zh)
Other versions
CN114817724B (en)
Inventor
姜文君
郭治焱
李肯立
李克勤
Current Assignee
Hunan University
Original Assignee
Hunan University
Application filed by Hunan University
Priority to CN202210458220.3A
Publication of CN114817724A
Application granted
Publication of CN114817724B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an evaluation method and related equipment for a recommendation algorithm, which can evaluate recommendation results of a surprise-oriented recommendation algorithm and more accurately measure the quality of a recommendation list. The method comprises the following steps: determining N recommendation lists and historical behavior records corresponding to a target user, wherein N is an integer greater than or equal to 1; determining the curiosity degree of each aspect in the aspect set of the target user according to the historical behavior record; calculating the relevance of the target user and each item in a target recommendation list in each aspect in the aspect set, wherein the target recommendation list is any one recommendation list in the N recommendation lists; determining the profit of the target user for each item in the target recommendation list according to the curiosity degree and the correlation; and determining the overall profit score of the target recommendation list according to the profit of each item.

Description

Evaluation method and device for recommendation algorithm and storage medium
[ technical field ]
The present application relates to the field of recommendation, and in particular to an evaluation method and device for a recommendation algorithm, and a storage medium.
[ background of the invention ]
The recommendation system plays an important role in the information explosion era, can help users to find interesting information from massive information, and has been widely applied in many fields, such as news recommendation, e-commerce recommendation, short video recommendation and the like. The output of the recommendation algorithm is usually a list, and the elements in the list are sorted according to a certain standard, which indicates the degree of interest of the user in the item considered by the recommendation algorithm.
At present, the evaluation indexes for the recommendation list of a recommendation algorithm mainly include Accuracy, Recall, Precision, and Normalized Discounted Cumulative Gain (nDCG), among others. Such evaluation indexes consider recommendation results of higher accuracy to be of higher quality.
However, high accuracy alone cannot fully reflect the quality of a recommendation result. On the contrary, it gradually makes the recommended content homogeneous and predictable, which degrades the user experience, so that the effect of the recommendation algorithm cannot be reasonably evaluated.
[ summary of the invention ]
The application provides an evaluation method, an evaluation device and a storage medium for a recommendation algorithm, which can evaluate the recommendation results of a surprise-oriented recommendation algorithm and more accurately measure the profit of a recommendation list.
A first aspect of the application provides an evaluation method for a recommendation algorithm, comprising:
determining N recommendation lists and historical behavior records corresponding to a target user, wherein N is an integer greater than or equal to 1, and the N recommendation lists correspond to the recommendation algorithm;
determining the curiosity degree of each aspect in the aspect set of the target user according to the historical behavior record;
calculating the relevance of the target user and each item in a target recommendation list in each aspect in the aspect set, wherein the target recommendation list is any one recommendation list in the N recommendation lists;
determining the profit of the target user for each item in the target recommendation list according to the curiosity degree and the correlation;
and determining the overall profit score of the target recommendation list according to the profit of each item.
A second aspect of the present application provides an evaluation apparatus for a recommendation algorithm, including:
a determining unit, configured to determine N recommendation lists and historical behavior records corresponding to a target user, where N is an integer greater than or equal to 1, and the N recommendation lists correspond to the recommendation algorithm;
a curiosity degree determining unit, configured to determine the curiosity degree of the target user for each aspect in the aspect set according to the historical behavior record;
an aspect correlation calculation unit, configured to calculate a correlation between the target user and each item in a target recommendation list in each aspect of the aspect set, where the target recommendation list is any one recommendation list in the N recommendation lists;
a profit determining unit, configured to determine a profit of the target user for each item in the target recommendation list according to the curiosity degree and the correlation;
and a score determining unit, configured to determine the overall profit score of the target recommendation list according to the profit of each item.
A third aspect of embodiments of the present application provides a computer device, which includes at least one connected processor, a memory and a transceiver, wherein the memory is configured to store program codes, and the processor is configured to call the program codes in the memory to execute the steps of the evaluation method for recommendation algorithm according to the first aspect.
A fourth aspect of the embodiments of the present application provides a computer storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the steps of the evaluation method for recommendation algorithm described in any of the above aspects.
Compared with the related art, the embodiments provided by the application evaluate the surprise and diversity of top-K recommendation at a finer granularity, based on the aspect-level relationship between users and items. The benefit that a recommendation list brings to the user is quantified in terms of surprise and diversity. The evaluation takes into account how strongly the user wants to explore different aspects and, from the user's interaction history, how items interacted with at different times influence the user's curiosity. The recommendation results of a surprise-oriented recommendation algorithm can thus be evaluated, and the overall profit score of the recommendation list measured more accurately.
[ description of the drawings ]
Fig. 1 is a schematic flowchart of an evaluation method for a recommendation algorithm according to an embodiment of the present application;
Fig. 2 is a schematic virtual structure diagram of an evaluation apparatus for a recommendation algorithm according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a hardware structure of a server according to an embodiment of the present application.
[ detailed description ]
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments.
The recommendation system plays an important role in the era of information explosion, can help users to find information which is interesting to the users from massive information, and has been widely applied to many fields, such as news recommendation, e-commerce recommendation, short video recommendation and the like. The existing evaluation indexes of numerous recommendation algorithms can be mainly divided into the following categories:
accuracy indexes, such as Accuracy, Precision, Recall and Hit, focus on evaluating the accuracy of the recommendation result and mainly consider whether the recommendation list contains items relevant to the user. However, they are insensitive to the ordering of items in the list, and because they focus excessively on accuracy, the recommended content tends to become homogeneous and the user is trapped in an "information cocoon";
ranking indexes consider not only whether the recommendation list includes items clicked by the user, but also the order of the user-relevant items in the recommendation list.
Although these indexes take both accuracy and item ordering into account, they still use accuracy to represent user satisfaction, so algorithms with high scores also suffer from homogeneous recommended content and the "information cocoon";
indexes such as Diversity, Novelty, resume, Unexpectedness and continuity start from the relationships among the items in the recommendation list and between the items and the user's preferences, and expect the recommendation list to meet certain requirements in order to improve user satisfaction. For example, Diversity requires that the items in the recommendation list be as diverse as possible, and resume requires that the items in the recommendation list be related to the user's interests as much as possible. Although these indexes reflect the user's satisfaction with the recommendation list more directly, the definition of each index is not clear, and they cannot measure a particular user's satisfaction with the recommendation result in a personalized way.
Most of the above evaluation methods consider only the accuracy of the recommendation result or favor a single facet of its utility, so they cannot comprehensively reflect the user's satisfaction with the recommendation result and are difficult to adapt to the evaluation requirements of different recommendation scenarios.
In view of the above, the present application uses surprise to evaluate the utility of the items in a recommendation list to a user. After a recommendation algorithm generates a recommendation list for a user, the user's curiosity about each aspect is calculated; the relevance of each item in the recommendation list to the user is calculated aspect by aspect; finally, curiosity and relevance are combined to calculate the profit of each item, and the profits are summed with a position-based discount to obtain the user's score for the recommendation list. The application can therefore be used to evaluate the recommendation results of a surprise-oriented recommendation algorithm, to measure the quality of a recommendation list more accurately, and to evaluate, compare and select a suitable recommendation algorithm more comprehensively.
The following describes an evaluation method for a recommendation algorithm from the perspective of an evaluation device for a recommendation algorithm, which may be a server or a service unit in a server, and is not particularly limited.
Referring to fig. 1, fig. 1 is a schematic flow chart of an evaluation method for a recommendation algorithm according to an embodiment of the present application, including:
101. Determine N recommendation lists and the historical behavior record corresponding to the target user.
In this embodiment, the evaluation device for the recommendation algorithm may determine N recommendation lists and the historical behavior record corresponding to the target user, where N is an integer greater than or equal to 1. The historical behavior record is the sequence of items that the target user has interacted with, arranged by interaction time, and is denoted S_u:
Figure BDA0003619504240000041
wherein
Figure BDA0003619504240000042
The N recommendation lists correspond to recommendation algorithms to be evaluated for the objects which have been interacted by the target user recently.
The manner in which the N recommendation lists are generated and the manner in which the historical behavior record of the target user is determined are not specifically limited here, as long as both can be determined. In addition, the N recommendation lists may each be generated by a different recommendation algorithm, in which case the N different recommendation algorithms are the algorithms to be evaluated.
102. Determine the target user's curiosity degree for each aspect in the aspect set according to the historical behavior record.
In this embodiment, after the evaluation device for the recommendation algorithm determines the historical behavior record corresponding to the target user, it may determine the target user's curiosity degree for each aspect in the aspect set according to that record. The surprise of a recommendation is related to the following factors:
1. the relationship between the item and the user, for example whether the item matches the user's long-term interest preferences or short-term needs;
2. the relationships between items in the recommendation list, such as the ordering of the items in the list and the differences between them;
3. the relationship between the item and the user's historical behavior, such as the time since the item last appeared, the time since an item of the same kind last appeared, and the like.
In this application, surprise is defined as having two parts: curiosity and relevance. An item that surprises the user should therefore be unexpected to the user yet relevant to the user; for example, it is inferred from the behavior of user u that the user would be interested in history and politics, and S_u is also related to these aspects.
In general, the user's desire to explore the various aspects changes constantly. During exploration, the user's curiosity about, or strength of exploration of, an aspect a_φ of the aspect set A is mainly influenced by two factors:
1. the number of items involving aspect a_φ consumed over a period of time. Considering only the number is not sufficient, however; the accumulated relevance between the items and aspect a_φ is also important. When the accumulated relevance to aspect a_φ of the items consumed by user u is high, the user's curiosity degree for aspect a_φ is lower; otherwise it is higher;
2. the cumulative score of user u on aspect a_φ. When the cumulative score of user u on aspect a_φ is higher, user u is highly satisfied with aspect a_φ, i.e. the user's exploration of aspect a_φ is close to saturation, so the user's curiosity about aspect a_φ may not be high.
In one embodiment, the determining, by the evaluation device for the recommendation algorithm, of the target user's curiosity degree for each aspect in the aspect set according to the historical behavior record comprises:
determining the temporal weight of each item in the historical behavior record and the target user's score for each item on each aspect;
and determining the target user's curiosity degree for each aspect in the aspect set according to the temporal weight of each item in the historical behavior record and the target user's score for each item on each aspect.
Specifically, the evaluation device for the recommendation algorithm may determine how curious the target user is about each aspect in the set of aspects by the following formula:
Figure BDA0003619504240000051
where u is the target user, τ_{u,φ} is the curiosity degree of target user u for aspect a_φ in the aspect set, S_u is the historical behavior record of target user u, m is the number of aspects in the aspect set,
Figure BDA00036195042400000610
t_{u,i} ∈ [1...|S_u|],
Figure BDA0003619504240000061
is the result of reordering t_{u,i} according to the preset rule,
Figure BDA0003619504240000062
is the score given by target user u to item i on aspect a_φ, and the following formulas determine
Figure BDA0003619504240000063
and
Figure BDA0003619504240000064
Figure BDA0003619504240000065
Figure BDA0003619504240000066
where ε is a parameter designated by the target user, α ∈ [0,1], and
Figure BDA0003619504240000067
represents the temporal weight of the item in the user's interaction sequence; its magnitude represents how strongly the item influences the user's curiosity. α controls the distribution of the temporal weights: the smaller α is, the greater the influence of recently interacted items on the user; the larger α is, the greater the influence of items interacted with earlier.
Note that τ_{u,φ} denotes the curiosity of u about a_φ. Because curiosity is subjective, it is calculated in a relatively objective manner from the user's historical behavior: the more items involving an aspect the user has consumed, the less likely that aspect is to surprise the user. The target user's curiosity degree for each aspect in the aspect set is therefore calculated by the following formula:
Figure BDA0003619504240000068
where
Figure BDA0003619504240000069
is the result of reordering t_{u,i} according to a preset rule, t_{u,i} ∈ [1...|S_u|] denotes the sequence number of item i in S_u, and
Figure BDA0003619504240000071
can be calculated by the following formula:
Figure BDA0003619504240000072
where α ∈ [0,1] is a preset parameter that controls how strongly items at different positions in the interaction order of S_u influence the user's curiosity.
Figure BDA0003619504240000073
can be regarded as the temporal weight of item i in S_u; the item at
Figure BDA0003619504240000074
has the smallest temporal weight, 1, and contributes least, in terms of time, to the curiosity of user u. The physical meaning of the parameter α is to balance relevance and surprise, and to balance the user's short-term needs and long-term preferences. In recommendation, the items that the user has interacted with recently reflect the user's immediate needs, while the interactions over a longer period reflect the user's long-term, stable preferences, as well as those with which the user has interacted less. By adjusting α, either the items interacted with earlier or the more recent items can be given higher weight, and combining these weights with the user's scores for the items reflects the user's curiosity about different aspects. Because the data are sparse, the user may not have scored every aspect, so the aspect scores of the items need to be smoothed:
Figure BDA0003619504240000075
where ε is a parameter designated by the target user and is used as the score when the target user has not scored a given aspect. The smoothed curiosity degree formula is therefore:
Figure BDA0003619504240000076
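As an illustration of the smoothing step only, the following Python sketch fills in ε for aspects that the user has not scored. The data layout (one dictionary of aspect scores per item in S_u) and the names `smooth_scores`, `history`, `aspects` and `eps` are assumptions made for this sketch; the curiosity formula itself is given in the figures and is not reproduced here.

```python
from typing import Dict, List, Optional


def smooth_scores(
    history: List[Dict[str, Optional[float]]],
    aspects: List[str],
    eps: float,
) -> List[List[float]]:
    """Return one row of aspect scores per item, using eps for missing scores."""
    smoothed = []
    for item_scores in history:  # items in interaction order (S_u)
        row = []
        for aspect in aspects:
            score = item_scores.get(aspect)
            # A missing key or an explicit None both mean "not scored".
            row.append(eps if score is None else score)
        smoothed.append(row)
    return smoothed


# Hypothetical history: two items, two aspects, some scores missing.
example_history = [{"plot": 4.0}, {"plot": None, "music": 2.0}]
print(smooth_scores(example_history, ["plot", "music"], eps=0.5))
```

Keeping the smoothed scores as a dense item-by-aspect table means that every later sum over aspects or items can run without special-casing missing values.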
103. Calculate the relevance of the target user to each item in the target recommendation list on each aspect in the aspect set.
In this embodiment, the evaluation device for the recommendation algorithm may calculate the relevance of the target user to each item in the target recommendation list on each aspect of the aspect set, where the target recommendation list is any one of the N recommendation lists. Specifically, the evaluation device may first determine the probability that each item satisfies the target user's interest in each aspect, and the probability that the target user is still interested in each aspect after browsing the items ranked ahead of that item in the target recommendation list; it then determines the relevance of the target user to each item on each aspect from these two probabilities. The following is a detailed description:
the evaluation means for the recommendation algorithm may determine the relevance of the target user to each item in the target recommendation list in each aspect by the following formula:
Figure BDA0003619504240000081
where relevance(u, i, φ) is the relevance of target user u to item i on aspect a_φ, P(a_φ | u, i) is the probability that item i satisfies target user u's interest in aspect a_φ,
Figure BDA0003619504240000082
is the probability that target user u is still interested in aspect a_φ after browsing the items ranked ahead of item i in the target recommendation list, I_r is the target recommendation list containing K recommended items, and the following formulas determine P(a_φ | u, i) and
Figure BDA0003619504240000083
Figure BDA0003619504240000084
Figure BDA0003619504240000085
where
Figure BDA0003619504240000086
r_max = max{r_{u,i,φ} | i ∈ S_u}, and P(a_φ | u) is the preference of target user u for aspect a_φ, determined by the following formula:
Figure BDA0003619504240000091
where A is the aspect set, S_u is the historical behavior record of the target user, r_{u,i,φ} is the score given by target user u to item i on aspect a_φ, and
Figure BDA0003619504240000092
is obtained by smoothing r_{u,i,φ}.
It should be noted that the curiosity degree measures how curious the user is about each aspect and how strongly the user wants to explore it. When a recommendation list is evaluated, the surprise and profit that the user obtains from the items in the list depend not only on curiosity but also on the relevance of the items themselves to the aspects, and the position of the items in the list is also critical. Consider an example: after entering keywords into a search engine, users typically expect relevant documents to be ranked first, and the contents of different documents to overlap as little as possible. From this it can be determined that:
1. the degree of match between a user and the content of an item depends not only on the content of the item but also on the subjectivity of the user, i.e. different people match the same item to different degrees;
2. the items in the recommendation list are redundant in content: if the items ranked ahead of an item have already satisfied the user's need on a certain aspect, the value of that item to the user is reduced.
In the present application, the aspects of an item are assumed to be independent, that is, the relevance of the user to the item is determined jointly by m independent aspects:
Figure BDA0003619504240000093
where relevance(u, i, φ) is the relevance of target user u to item i on aspect a_φ. The relevance of the user to the item depends on how well the user and the item match and on the ordering of the items; in the application, the per-aspect relevance of the user to the item is defined as follows:
Figure BDA0003619504240000094
where P(a_φ | u, i) is the probability that item i satisfies target user u's interest in aspect a_φ, i.e. the relevance of target user u to item i on aspect a_φ,
Figure BDA0003619504240000101
is the probability that target user u is still interested in aspect a_φ after browsing the items ranked ahead of item i in the target recommendation list, i.e. after viewing the items
Figure BDA0003619504240000102
that are ranked ahead of item i, the probability that target user u remains interested in aspect a_φ. How much aspect a_φ can still satisfy the target user depends not only on whether the item is related to a_φ, but also on how many of the items in
Figure BDA0003619504240000103
have already satisfied target user u's interest in aspect a_φ;
Figure BDA0003619504240000104
measures the contribution of item i to satisfying target user u on aspect a_φ after redundancy is removed. That is, the aspects on which the user has not yet been satisfied are satisfied first, which requires the items in the target recommendation list I_r to have aspect coverage that is as high as possible. Here,
Figure BDA0003619504240000105
is defined as follows:
Figure BDA0003619504240000106
where P(a_φ | u) is the preference of target user u for aspect a_φ. It differs from the curiosity degree τ_{u,φ} of step 102 as follows: P(a_φ | u) emphasizes the relationship between target user u and aspect a_φ, i.e. the preference of target user u, whereas τ_{u,φ} emphasizes the distribution of target user u's curiosity given the state S_u. The preference of target user u for different aspects can be estimated from the historical behavior record of target user u:
Figure BDA0003619504240000107
where A is the aspect set, S_u is the historical behavior record of the target user, r_{u,i,φ} is the score given by target user u to item i on aspect a_φ, and
Figure BDA0003619504240000111
is obtained by smoothing r_{u,i,φ}.
P(a_φ | u, i) is personalized: different users experience the same item differently. It is defined as follows:
Figure BDA0003619504240000112
if aspect a_φ is unrelated to item i, then P(a_φ | u, i) is 0;
if item i exhibits aspect a_φ but is not in the historical behavior record corresponding to the target user, P(a_φ | u, i) is given by the function f(u, i, a_φ). Different strategies may be applied to estimate P(a_φ | u, i), for example generating it according to S_u or specifying a small constant; in practice, the function f(u, i, a_φ) may be replaced by an existing method, such as click-through rate estimation;
for an item i that target user u has interacted with but has not scored on aspect a_φ, g(u, i, a_φ) can be specified using different strategies, for example the average or minimum of target user u's scores on the other aspects of item i, the average or minimum of target user u's scores on aspect a_φ of other items, or a small specified constant;
for an item i that target user u has scored on aspect a_φ, the score is sufficient to show target user u's preference for item i on aspect a_φ, and the function h(u, i, a_φ) is defined as:
Figure BDA0003619504240000121
wherein r is max ={r u,i,φ |i∈S u Denotes the maximum value in the target user u's score in terms;
It should be noted that the curiosity degree of the target user for each aspect in the aspect set is determined in step 102, and the relevance of the target user to each item in the target recommendation list on each aspect is calculated in step 103. There is no required order of execution between these two steps: step 102 may be executed first, step 103 may be executed first, or the two steps may be executed simultaneously, which is not specifically limited.
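The redundancy removal described for step 103 can be illustrated with a short Python sketch. It assumes, as in intent-aware diversity metrics, that the user's remaining interest in an aspect is the preference P(a_φ | u) multiplied by (1 - P(a_φ | u, j)) for every item j ranked ahead of item i. This product form is an assumption of the sketch, not a formula taken from the figures, and the function and parameter names are hypothetical.

```python
from typing import List


def aspect_relevance(p_items: List[float], p_aspect: float) -> List[float]:
    """Per-item relevance on one aspect, discounted by earlier items.

    p_items  -- P(a | u, i) for the items of the list, in ranked order
    p_aspect -- P(a | u), the user's preference for the aspect
    """
    remaining = p_aspect  # interest not yet satisfied by earlier items
    relevance = []
    for p in p_items:
        relevance.append(p * remaining)
        # ASSUMPTION: each item removes a share p of the remaining interest.
        remaining *= 1.0 - p
    return relevance


# Two items that each satisfy the aspect with probability 0.5.
print(aspect_relevance([0.5, 0.5], p_aspect=1.0))
```

Under this assumption the second of two identical items contributes less than the first, which is the behavior the text asks for when earlier items have already satisfied the user on an aspect.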
104. Determine the profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance.
In this embodiment, after determining the curiosity degree of the target user for each aspect in the aspect set and the relevance of the target user to each item in the target recommendation list on each aspect in the aspect set, the evaluation device for the recommendation algorithm may determine the profit of the target user for each item in the target recommendation list according to the curiosity degree and the relevance. Specifically, the profit may be determined by the following formula:
Figure BDA0003619504240000122
where gain(u, i) is the profit of the target user for item i in the target recommendation list, τ_{u,φ} is the curiosity degree of target user u for aspect a_φ in the aspect set, and relevance(u, i, φ) is the relevance of target user u to item i on aspect a_φ. Here 1 - (τ_{u,φ} · relevance(u, i, φ)) represents that item i does not surprise target user u on aspect a_φ, and
Figure BDA0003619504240000131
represents that item i does not surprise target user u at all, i.e. on none of the aspects.
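Based on the description of the terms in step 104, the profit of an item can be read as gain(u, i) = 1 - Π_φ (1 - τ_{u,φ} · relevance(u, i, φ)), with the product taken over the m aspects. The Python sketch below follows that reading; the function name `item_gain` and the list-based inputs are assumptions made for illustration.

```python
import math
from typing import Sequence


def item_gain(curiosity: Sequence[float], relevance: Sequence[float]) -> float:
    """Profit of one item: probability that at least one aspect surprises.

    curiosity -- tau for each aspect, assumed to lie in [0, 1]
    relevance -- relevance(u, i, phi) for the same aspects, in the same order
    """
    if len(curiosity) != len(relevance):
        raise ValueError("curiosity and relevance must cover the same aspects")
    # Probability that the item surprises the user on none of the aspects.
    no_surprise = math.prod(1.0 - t * r for t, r in zip(curiosity, relevance))
    return 1.0 - no_surprise


# Two aspects: the item is fully relevant to the first and unrelated to the second.
print(item_gain([0.5, 0.5], [1.0, 0.0]))
```

Because the aspects are treated as independent, an item that is relevant to several aspects the user is curious about earns a higher profit than one that matches a single aspect.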
105. Determine the overall profit score of the target recommendation list according to the profit of each item.
In this embodiment, after determining the profit of each item, the evaluation device for the recommendation algorithm may calculate the unnormalized score of the target recommendation list and the ideal profit of the target recommendation list according to the profit of each item, and then determine the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit. Specifically, the evaluation device for the recommendation algorithm may determine the unnormalized score of the target recommendation list by the following formula:
Figure BDA0003619504240000132
where ser-DCG is the unnormalized score of the target recommendation list, K is the number of items in the target recommendation list, and gain(u, i) is the profit of the target user u for each item i in the target recommendation list;
determining the ideal benefit of the target recommendation list by the following formula:
Figure BDA0003619504240000133
where ser-IDCG is the ideal profit of the target recommendation list, and rank(i) is the rank of each item i in the target recommendation list when the items are ordered by profit value;
and calculating the overall profit score of the target recommendation list from the unnormalized score and the ideal profit by the following formula:
Figure BDA0003619504240000134
where ser-nDCG is the overall profit score of the target recommendation list, that is, the normalized score.
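The three formula images are not reproduced in this text. The names ser-DCG, ser-IDCG and ser-nDCG suggest the conventional discounted cumulative gain construction, so the sketch below assumes the usual log2(position + 1) discount, assumes that rank(i) orders items by descending profit, and adds a zero-division guard for lists whose ideal profit is 0. All three choices, and the function names, are assumptions for illustration rather than the formulas of the application.

```python
from math import log2
from typing import Sequence


def ser_dcg(gains: Sequence[float]) -> float:
    """Unnormalized score: gains are listed in recommendation order.

    ASSUMPTION: position p (1-based) is discounted by log2(p + 1).
    """
    return sum(g / log2(pos + 1) for pos, g in enumerate(gains, start=1))


def ser_idcg(gains: Sequence[float]) -> float:
    """Ideal profit: the same gains re-ranked by descending profit value."""
    return ser_dcg(sorted(gains, reverse=True))


def ser_ndcg(gains: Sequence[float]) -> float:
    """Overall profit score: unnormalized score divided by ideal profit."""
    ideal = ser_idcg(gains)
    # Guard added for this sketch only: an all-zero list scores 0.
    return ser_dcg(gains) / ideal if ideal > 0 else 0.0
```

Under these assumptions a list that is already ordered by descending profit scores 1.0, and any other ordering of the same items scores at most 1.0, which is what makes scores comparable across the N recommendation lists.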
In summary, the embodiment provided by the application evaluates the surprise degree and diversity of top-K recommendation at a finer granularity, based on the aspect-level relationship between users and items. It quantifies the profit that a recommendation list brings to the user from the perspective of surprise degree and diversity. The evaluation process takes into account how intensely the user explores different aspects and, from the perspective of the user's historical interactions, how items interacted with at different times influence the user's curiosity. In this way, the recommendation results of a surprise-oriented recommendation algorithm are evaluated, and the user's satisfaction with the recommendation list is measured more accurately.
The present application is explained above from the point of view of an evaluation method for a recommendation algorithm and below from the point of view of an evaluation device for a recommendation algorithm.
Referring to fig. 2, fig. 2 is a schematic view of a virtual structure of an evaluation apparatus for a recommendation algorithm according to an embodiment of the present application, where the evaluation apparatus 200 for a recommendation algorithm includes:
a determining unit 201, configured to determine N recommendation lists and historical behavior records corresponding to a target user, where N is an integer greater than or equal to 1, and the recommendation algorithm corresponds to the N recommendation lists;
a curiosity degree determining unit 202, configured to determine the curiosity degree of the target user for each aspect in the aspect set according to the historical behavior record;
an aspect correlation calculation unit 203, configured to calculate a correlation between the target user and each item in a target recommendation list in each aspect of the aspect set, where the target recommendation list is any one recommendation list in the N recommendation lists;
a profit determining unit 204, configured to determine a profit of the target user for each item in the target recommendation list according to the curiosity degree and the correlation;
a score determining unit 205, configured to determine an overall profit score of the target recommendation list according to the profit of each item.
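The division into units 201 to 205 can be pictured as a small pipeline in which each unit hands its output to the next. The sketch below is one hypothetical way to wire such units together; the class name `EvaluationDevice`, the field names, and the callable signatures are all assumptions for illustration, and the logic of each unit is deliberately left to the caller.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class EvaluationDevice:
    """Hypothetical wiring of units 201-205 as injected callables."""

    # 201: user -> (N recommendation lists, historical behavior record)
    determining_unit: Callable[[Any], Tuple[List[Sequence[Any]], Any]]
    # 202: history -> curiosity degree per aspect
    curiosity_unit: Callable[[Any], Sequence[float]]
    # 203: (user, list) -> per-item, per-aspect relevance
    relevance_unit: Callable[[Any, Sequence[Any]], List[Sequence[float]]]
    # 204: (curiosity, relevance of one item) -> profit of that item
    profit_unit: Callable[[Sequence[float], Sequence[float]], float]
    # 205: profits of one list -> overall profit score
    score_unit: Callable[[Sequence[float]], float]

    def evaluate(self, user: Any) -> List[float]:
        """Return one overall profit score per recommendation list."""
        rec_lists, history = self.determining_unit(user)
        curiosity = self.curiosity_unit(history)
        scores = []
        for rec_list in rec_lists:
            relevance = self.relevance_unit(user, rec_list)
            profits = [self.profit_unit(curiosity, rel) for rel in relevance]
            scores.append(self.score_unit(profits))
        return scores
```

Keeping the units as separate callables mirrors the description, in which each unit can be refined independently in the "possible designs" that follow.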
In one possible design, the curiosity level determining unit 202 is specifically configured to:
determining the temporal weight of each item in the historical behavior record and the aspect-level rating of each item by the target user;
and determining the curiosity degree of the target user for each aspect in the aspect set according to the temporal weight of each item in the historical behavior record and the aspect-level rating of each item by the target user.
In one possible design, the aspect correlation calculation unit 203 is specifically configured to:
determining the probability that each item satisfies the interest of the target user in each aspect, and the probability that the target user is interested in each aspect after browsing the items in the target recommendation list whose priority is higher than that of the item;
and determining the relevance between the target user and each item in the target recommendation list in each aspect according to the probability that each item satisfies the interest of the target user in each aspect and the probability that the target user is interested in each aspect after browsing the items in the target recommendation list whose priority is higher than that of the item.
In one possible design, the benefit determining unit 204 is specifically configured to:
determining the profit of the target user for each item in the target recommendation list by the following formula:
Figure BDA0003619504240000151
where gain(u, i) is the profit of the target user u for each item i in the target recommendation list, τ_{u,φ} is the curiosity degree of the target user u for each aspect a_φ in the aspect set, and Relevance(u, i, φ) is the relevance between the target user u and item i in aspect a_φ.
In one possible design, the score determining unit 205 is specifically configured to:
calculating the unnormalized score of the target recommendation list according to the profit of each item;
calculating the ideal profit of the target recommendation list according to the profit of each item;
and determining the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit.
In one possible design, the calculating, by the score determining unit 205, the unnormalized score of the target recommendation list according to the revenue of each item includes:
determining an unnormalized score for the target recommendation list by:
Figure BDA0003619504240000152
where ser-DCG is the unnormalized score of the target recommendation list, K is the number of items included in the target recommendation list, and gain(u, i) is the profit of the target user u for each item i in the target recommendation list;
the calculating, by the score determining unit 205, of the ideal profit of the target recommendation list according to the profit of each item includes:
determining an ideal benefit of the target recommendation list by:
Figure BDA0003619504240000161
where ser-IDCG is the ideal profit of the target recommendation list, and rank(i) is the rank of each item i in the target recommendation list when the items are ordered by profit value;
the determining, by the score determining unit 205, of the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit includes:
determining an overall profit score for the target recommendation list by:
Figure BDA0003619504240000162
where ser-nDCG is the overall profit score of the target recommendation list.
Fig. 3 is a schematic structural diagram of a server according to the present application. As shown in fig. 3, the server 300 of this embodiment includes at least one processor 301, at least one network interface 304 or other user interface 303, a memory 305, and at least one communication bus 302. The server 300 optionally includes the user interface 303, which may comprise a display, a keyboard, or a pointing device. The memory 305 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory. The memory 305 stores execution instructions; when the server 300 runs, the processor 301 communicates with the memory 305 and calls the instructions stored in the memory 305 to execute the above evaluation method for the recommendation algorithm. The server 300 further includes an operating system 306, which contains various programs for implementing various basic services and for handling hardware-dependent tasks.
The server provided in the embodiment of the present application may execute the technical solution of the above-described evaluation method for a recommendation algorithm, and the implementation principle and the technical effect are similar, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a computer, implements the method flow related to the evaluation device for recommendation algorithm in any of the above method embodiments. Correspondingly, the computer may be the above-mentioned evaluation device for the recommendation algorithm.
An embodiment of the present application further provides a computer program, or a computer program product comprising a computer program, which, when executed on a computer, causes the computer to implement the method flow related to the evaluation device for the recommendation algorithm in any of the above method embodiments. Correspondingly, the computer may be the above-mentioned evaluation device for the recommendation algorithm.
In the above-described embodiment corresponding to fig. 1, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and these modifications or substitutions do not depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. An evaluation method for a recommendation algorithm, comprising:
determining N recommendation lists and historical behavior records corresponding to a target user, wherein N is an integer greater than or equal to 1, and the N recommendation lists correspond to the recommendation algorithm;
determining the curiosity degree of each aspect in the aspect set of the target user according to the historical behavior record;
calculating the relevance of the target user and each item in a target recommendation list in each aspect in the aspect set, wherein the target recommendation list is any one recommendation list in the N recommendation lists;
determining the profit of the target user for each item in the target recommendation list according to the curiosity degree and the correlation;
and determining the overall profit score of the target recommendation list according to the profit of each item.
2. The method of claim 1, wherein the determining the curiosity degree of the target user for each aspect in the aspect set according to the historical behavior record comprises:
determining the temporal weight of each item in the historical behavior record and the aspect-level rating of each item by the target user;
and determining the curiosity degree of the target user for each aspect in the aspect set according to the temporal weight of each item in the historical behavior record and the aspect-level rating of each item by the target user.
3. The method of claim 1, wherein the calculating the relevance of the target user to each item in the target recommendation list for each aspect in the set of aspects comprises:
determining the probability that each item satisfies the interest of the target user in each aspect, and the probability that the target user is interested in each aspect after browsing the items in the target recommendation list whose priority is higher than that of the item;
and determining the relevance between the target user and each item in the target recommendation list in each aspect according to the probability that each item satisfies the interest of the target user in each aspect and the probability that the target user is interested in each aspect after browsing the items in the target recommendation list whose priority is higher than that of the item.
4. The method of claim 1, wherein the determining the benefit of the target user for each item in the target recommendation list based on the curiosity level and the relevance comprises:
determining the profit of the target user for each item in the target recommendation list by the following formula:
Figure FDA0003619504230000021
where gain(u, i) is the profit of the target user u for each item i in the target recommendation list, τ_{u,φ} is the curiosity degree of the target user u for each aspect a_φ in the aspect set, and Relevance(u, i, φ) is the relevance between the target user u and item i in aspect a_φ.
5. The method of any one of claims 1 to 4, wherein said determining an overall profit score for the target recommendation list from the profit for the each item comprises:
calculating the unnormalized score of the target recommendation list according to the profit of each item;
calculating the ideal profit of the target recommendation list according to the profit of each item;
and determining the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit.
6. The method of claim 5, wherein said calculating an unnormalized score for the target recommendation list based on the revenue for each item comprises:
determining an unnormalized score for the target recommendation list by:
Figure FDA0003619504230000022
wherein ser-DCG is the unnormalized score of the target recommendation list, K is the number of items included in the target recommendation list, and gain(u, i) is the profit of the target user u for each item i in the target recommendation list;
the calculating the ideal profit of the target recommendation list according to the profit of each item comprises:
determining an ideal benefit of the target recommendation list by:
Figure FDA0003619504230000031
wherein ser-IDCG is the ideal profit of the target recommendation list, and rank(i) is the rank of each item i in the target recommendation list when the items are ordered by profit value;
the determining the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit comprises:
determining an overall profit score for the target recommendation list by:
Figure FDA0003619504230000032
wherein ser-nDCG is the overall profit score of the target recommendation list.
7. An evaluation apparatus for a recommendation algorithm, comprising:
a determining unit, configured to determine N recommendation lists and historical behavior records corresponding to a target user, wherein N is an integer greater than or equal to 1, and the N recommendation lists correspond to the recommendation algorithm;
a curiosity degree determining unit, configured to determine the curiosity degree of the target user for each aspect in the aspect set according to the historical behavior record;
an aspect correlation calculation unit, configured to calculate a correlation between the target user and each item in a target recommendation list in each aspect of the aspect set, where the target recommendation list is any one recommendation list in the N recommendation lists;
a profit determining unit, configured to determine a profit of the target user for each item in the target recommendation list according to the curiosity degree and the correlation;
and a score determining unit, configured to determine the overall profit score of the target recommendation list according to the profit of each item.
8. The apparatus according to claim 7, wherein the score determining unit is specifically configured to:
calculating the unnormalized score of the target recommendation list according to the profit of each item;
calculating the ideal profit of the target recommendation list according to the profit of each item;
and determining the overall profit score of the target recommendation list according to the unnormalized score and the ideal profit.
9. A computer device, comprising:
at least one connected processor, memory and transceiver, wherein the memory is for storing program code and the processor is for calling the program code in the memory to perform the steps of the evaluation method for recommendation algorithm of any of claims 1 to 6.
10. A computer storage medium, comprising:
instructions which, when run on a computer, cause the computer to perform the steps of the evaluation method for recommendation algorithms of any of claims 1 to 6.
CN202210458220.3A 2022-04-27 Evaluation method and device for recommendation algorithm and storage medium Active CN114817724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210458220.3A CN114817724B (en) 2022-04-27 Evaluation method and device for recommendation algorithm and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210458220.3A CN114817724B (en) 2022-04-27 Evaluation method and device for recommendation algorithm and storage medium

Publications (2)

Publication Number Publication Date
CN114817724A (en) 2022-07-29
CN114817724B (en) 2024-06-25

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140104626A (en) * 2013-02-20 2014-08-29 에스케이플래닛 주식회사 System and method for contents recommendation, and apparatus applied to the same
US20200320646A1 (en) * 2018-04-26 2020-10-08 Tencent Technology (Shenzhen) Company Limited Interest recommendation method, computer device, and storage medium
CN111782943A (en) * 2020-06-24 2020-10-16 中国平安财产保险股份有限公司 Information recommendation method, device, equipment and medium based on historical data record
CN111915409A (en) * 2020-08-11 2020-11-10 深圳墨世科技有限公司 Article recommendation method, device and equipment based on article and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张东;蔡国永;夏彬彬;: "一种提高推荐多样性的概率选择模型", 计算机科学, no. 02, 15 February 2016 (2016-02-15) *

Similar Documents

Publication Publication Date Title
Jannach et al. Adaptation and evaluation of recommendations for short-term shopping goals
US9075882B1 (en) Recommending content items
US8972370B2 (en) Repetitive fusion search method for search system
US8301623B2 (en) Probabilistic recommendation system
US7457768B2 (en) Methods and apparatus for predicting and selectively collecting preferences based on personality diagnosis
KR101097632B1 (en) Dynamic bid pricing for sponsored search
US20010013009A1 (en) System and method for computer-based marketing
Brodén et al. Ensemble recommendations via thompson sampling: an experimental study within e-commerce
US20120185481A1 (en) Method and Apparatus for Executing a Recommendation
US20080010258A1 (en) Collaborative filtering-based recommendations
US20090006368A1 (en) Automatic Video Recommendation
US20110145226A1 (en) Product similarity measure
US20120016772A1 (en) Value Maximizing Recommendation Systems
KR20100086676A (en) Method and apparatus of predicting preference rating for contents, and method and apparatus for selecting sample contents
US11599548B2 (en) Utilize high performing trained machine learning models for information retrieval in a web store
US9594809B2 (en) System and method for compiling search results using information regarding length of time users spend interacting with individual search results
Zolaktaf et al. A generic top-n recommendation framework for trading-off accuracy, novelty, and coverage
CN106447419B (en) Visitor identification based on feature selection
CA2646748A1 (en) A method and system for measuring an impact of various categories of media owners on a corporate brand
CN115423571A (en) Commodity recommendation method and system based on e-commerce platform
Shani et al. Tutorial on application-oriented evaluation of recommendation systems
KR20130022322A (en) Item based recommendation engiine recommending highly associated item
CN114817724B (en) Evaluation method and device for recommendation algorithm and storage medium
CN114817724A (en) Evaluation method and device for recommendation algorithm and storage medium
WO2009057103A2 (en) Content-based recommendations across domains

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant