CN111428127A

CN111428127A - Personalized event recommendation method and system integrating topic matching and two-way preference

Info

Publication number: CN111428127A
Application number: CN202010069262.9A
Authority: CN
Inventors: 钱忠胜; 杨家秀; 朱懿敏
Original assignee: Jiangxi University of Finance and Economics
Current assignee: Jiangxi University of Finance and Economics
Priority date: 2020-01-21
Filing date: 2020-01-21
Publication date: 2020-07-17
Anticipated expiration: 2040-01-21
Also published as: CN111428127B

Abstract

The invention discloses a personalized event recommendation method and system integrating topic matching and two-way preference. First, the topic information of events and of the historical events in which users participated is extracted with the document topic generation model LDA, and the topic matching degree between users and events is calculated. Second, for event-based social network recommendation, preference models of users and of events are constructed from the two-way perspective of users and events, yielding user preference scores and event preference scores respectively, so that preference relations are mined more completely from both the user side and the event side. Finally, the user-event pair topic matching degree and the two-way preference of users and events are combined by linear weighting to obtain the final comprehensive score of each user-event pair, and the ranked TOP-K user-event pairs are taken as the recommendation results.

Description

Personalized event recommendation method and system integrating topic matching and two-way preference

Technical Field

The invention relates to the technical field of information recommendation, in particular to a personalized event recommendation method and system integrating topic matching and two-way preference.

Background

With the rapid development of Internet and computer technologies, traditional social networks have in recent years evolved in several new directions, and a number of novel, specialized social networks have formed. One is the Location-Based Social Network (LBSN), in which social relationships are formed mainly from the geographic check-in information of users. Another is the Event-Based Social Network (EBSN), a complex heterogeneous social network that combines online and offline interaction. Unlike the friend relationships established among acquaintances in a traditional social network, users in an event-based social network establish interpersonal relationships through social activities: they join online interest groups and offline collective social activities according to their interests or what they have in common.

As event-based social networks have developed rapidly, more and more users choose to participate in social activities through them. On an event-based social network platform, users can join various online groups, and an organizer or a user in a group can initiate and participate in any offline social activity, such as a party, a hike, a sports activity or a concert, and share information with other users.

An event-based social network provides users with a social service that combines online and offline activity, helping them initiate and plan personalized event participation. Users form online group relations through shared interests, and offline gathering events are initiated online. An event-based social network has broader social attributes than a location-based social network, and existing work shows that, in a recommendation system, it has better recommendation characteristics than a traditional social network.

Most current event-based social network recommendation extracts feature preferences mainly from the one-way perspective of the user. Although the social influence of event hosts may be considered, the potential attractiveness of the events themselves is not sufficiently taken into account. In addition, where topic factors are used, the event topic is treated only as one of several recommendation factors, while the user's topic factors and their degree of matching with the event topic are rarely considered.

Disclosure of Invention

In view of the above, it is necessary to provide a personalized event recommendation method and system integrating topic matching and two-way preference, which combines several main types of context information to calculate user preference and potential event preference, and finally fuses the topic matching degree with the user-event two-way preference.

A personalized event recommendation method fusing topic matching and two-way preference comprises the following steps:

step one, extracting topic information of events by using the document topic generation model LDA, obtaining user topic information from the historical event records in which the user participated, calculating the topics of a new event and of the user's historical events, and calculating a topic matching degree score of the user-event pair by adopting cosine similarity;

step two, respectively constructing a user preference model and an event preference model, and respectively calculating a user preference score and an event preference score;

step three, learning the weight parameters of the user preference score and the event preference score by using the Bayesian personalized ranking algorithm BPR to obtain a user-event two-way preference score, linearly weighting and combining the topic matching degree score and the two-way preference score to obtain the final recommendation score of the user-event pair, and recommending the top K ranked events to the user.

Further, the document topic generation model LDA in step one has a three-layer Bayesian network structure comprising documents, topics and words, wherein the document-topic and topic-word distributions are both multinomial; each document selects a topic with a certain probability and selects a word from that topic with a certain probability; the topics in any document follow a Dirichlet distribution, and relationships between texts are discovered through these distributions.

Further, in step one, calculating the topics of the new event and of the user's historical events and calculating the topic matching degree score of the user-event pair by using cosine similarity specifically comprises:

step 1-1, forming a document set D from the description contents of all events, removing stop words, inputting the document set D into the document topic generation model LDA, and obtaining the topic distribution of each event;

stop words and punctuation marks are removed from all event contents, and the document contents remaining after these noise words are removed are taken as the set D of all documents; the set D is input into the LDA topic model, and the joint distribution p(ω, z | α) of the topics and words that generate a document d_i is given by formula (1):

the two unknown parameters in the model, namely the event topic distribution and the topic-word distribution, are then estimated using the Gibbs sampling method;

step 1-2, calculating topic distribution similarity between the historical event and the new event of the target user according to a JS divergence algorithm;

the topic distribution of every event has been generated according to formula (1); given events e_dp and e_dq with topic distributions p and q respectively, the JS divergence D_js between the two is first calculated by the JS divergence method, as shown in formula (2):

$$D_{js}(p \parallel q) = \frac{1}{2} D_{KL}\left(p \parallel \frac{p+q}{2}\right) + \frac{1}{2} D_{KL}\left(q \parallel \frac{p+q}{2}\right) \tag{2}$$

wherein D_js ∈ [0,1], and D_KL denotes the KL divergence, which describes the difference between two probability distributions p and q and is calculated as shown in formula (3):

$$D_{KL}(p \parallel q) = \sum_{i} p_i \log_2 \frac{p_i}{q_i} \tag{3}$$

combining formula (2) and formula (3) gives the topic similarity S_topic of events e_dp and e_dq, as shown in formula (4):

$$S_{topic}(e_{dp}, e_{dq}) = 1 - D_{js}(p \parallel q) \tag{4}$$

wherein the topic similarity S_topic of the events takes values in [0,1], and the closer the value is to 1, the higher the event similarity;

step 1-3, averaging the similarities between all historical events of the target user and the new event to obtain the topic matching degree score of the user and the new event;

with E_u denoting the set of historical events of the target user u and |E_u| their number, the average of the similarities between the new event e and all historical events of the target user is taken as the topic matching degree score of the user and the new event, as shown in formula (5):

$$\bar{S}_{topic}(u, e) = \frac{1}{|E_u|} \sum_{e_i \in E_u} S_{topic}(e_i, e) \tag{5}$$

according to the constructed topic matching model, this score is finally used to measure the topic matching relationship between the target user and the new event.
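Steps 1-2 and 1-3 can be illustrated with a minimal Python sketch. It assumes that each event's topic distribution is already available as a list of probabilities, that the KL divergence uses a base-2 logarithm (so that the JS divergence lies in [0,1]), and that formula (4) takes the form 1 minus the JS divergence; the function names are hypothetical and not taken from the patent.

```python
import math

def kl_divergence(p, q):
    """KL divergence D_KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Symmetric JS divergence built from two KL terms against the midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def topic_similarity(p, q):
    """Assumed form of formula (4): similarity = 1 - JS divergence."""
    return 1.0 - js_divergence(p, q)

def topic_match_score(history_dists, new_dist):
    """Average similarity between a new event and the user's historical events."""
    if not history_dists:
        return 0.0  # assumption: a user with no history gets a neutral score of 0
    sims = [topic_similarity(h, new_dist) for h in history_dists]
    return sum(sims) / len(sims)

if __name__ == "__main__":
    history = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
    print(topic_match_score(history, [0.65, 0.25, 0.10]))
```

Because the midpoint distribution is non-zero wherever either input is non-zero, the KL terms never divide by zero even when the two topic distributions have disjoint support.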

Further, constructing the user preference model in step two builds the single-factor preferences of the user from three aspects, namely geographic location, social relationship and time factor, and specifically comprises:

step 2-1-1, constructing a geographic position preference model:

the geographic location preference model calculates the probability that the target user will participate in an event held at a given location; the kernel density estimation (KDE) method is adopted to model the two-dimensional geographic distribution of the events in which the user has participated, and the normalized event participation probability represents the user's degree of preference for a geographic location. The longitude and latitude coordinates of an event's geographic location are denoted by (Lx, Ly), the set of locations where user u has historically participated in events is denoted by L(u), and the KDE function for user u is as shown in formula (6):

wherein l_i = (Lx_i, Ly_i)^T is the two-dimensional vector of longitude and latitude coordinates of an event location, m_l(u, l_i) denotes the frequency with which user u attended events held at geographic location l_i, σ denotes the size of the neighborhood window (bandwidth), N denotes the number of location samples, and K(·) denotes a Gaussian kernel, defined as shown in formula (7):

combining formulas (6) and (7) defines the probability that user u will attend an event to be held at location l, as shown in formula (8):

normalizing this probability gives the preference score S_G(u, l) of the user for the geographic location, as shown in formula (9):

wherein the denominator is the maximum event participation probability of the target user;
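The geographic preference model can be sketched in Python as follows. Formulas (6) to (9) are not reproduced in the text, so the sketch assumes a standard frequency-weighted two-dimensional Gaussian KDE with the density scaled by 1/(N·σ²), and it normalizes by the maximum density over a supplied list of candidate locations; both choices, and all function names, are assumptions rather than the patent's exact definitions.

```python
import math

def gaussian_kernel_2d(dx, dy):
    """Standard 2-D Gaussian kernel evaluated at the scaled offset (dx, dy)."""
    return math.exp(-0.5 * (dx * dx + dy * dy)) / (2 * math.pi)

def kde_density(visits, loc, sigma):
    """Frequency-weighted KDE at loc.

    visits maps (lon, lat) -> attendance frequency m_l(u, l_i);
    sigma is the bandwidth; N is taken as the number of distinct locations.
    """
    n = len(visits)
    total = 0.0
    for (lx, ly), freq in visits.items():
        total += freq * gaussian_kernel_2d((loc[0] - lx) / sigma,
                                           (loc[1] - ly) / sigma)
    return total / (n * sigma ** 2)

def geo_preference(visits, candidates, sigma=0.01):
    """Score each candidate location by its density divided by the maximum density."""
    dens = {c: kde_density(visits, c, sigma) for c in candidates}
    peak = max(dens.values())
    return {c: (d / peak if peak > 0 else 0.0) for c, d in dens.items()}

if __name__ == "__main__":
    history = {(0.0, 0.0): 3, (1.0, 1.0): 1}
    print(geo_preference(history, [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)], sigma=0.5))
```

The bandwidth σ controls how quickly preference decays with distance from previously visited locations; a suitable value depends on the coordinate units and is left as a tunable parameter here.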

step 2-1-2, constructing a social relationship preference model:

in the user social relationship network, a user can join one or more interest groups online and choose to participate in events published by different groups; the social relationship preference of the user is judged through the user's online same-group relationships, which mainly comprise two kinds of interaction;

first, the relevance of a user to a group is defined through the interaction between the user and all the groups the user belongs to, and between the user and the events created in those groups; with G(u) denoting the set of groups to which the events attended by user u belong, the relevance of the user to a group can be expressed as shown in formula (10):

wherein m_p(u, g) denotes the set of events in group g in which user u has participated;

second, the relevance among users within a group is defined by the similarity between the target user and the other members of the group; the similarity s(u, g) between the target user and the users in the group is calculated as shown in formula (11):

wherein sim(u_i, u_j) denotes the similarity between users u_i and u_j in the same group, as shown in formula (12);

s(u, g) is then normalized, as shown in formula (13):

combining the above two interactions, users belonging to the same groups tend to participate in events created by other users within those groups; combining the user-to-group relevance with the relevance among users within the group yields the social preference score S_I(u, g) of user u with respect to online group g, as shown in formula (14):

α ∈ [0,1] is a weight control parameter; in the social relationship network, the association between the target user and the group is taken to be as important as the association among users within the group, and the value of α is set to 0.5 through experimental verification;

step 2-1-3, constructing a time factor preference model:

the time factor of an event is an important preference factor to be considered when calculating user preference; a new event e that the user may choose to attend is represented as a 7 × 24-dimensional event time vector, in which the component for a given hour of a given day of the week is set to 1 if the new event takes place in that time period and to 0 otherwise; in the time preference model, the user is represented by a user time vector built from the historical event records of the user's participation, as shown in formula (15):

wherein E_u denotes the set of historical events in which the target user participated; the cosine similarity s(u, e) between the user time vector and the new event time vector is then calculated, as shown in formula (16):

for a new event e, the similarity s(u_i, e) of each user u_i ∈ U can be found from formula (16), and normalizing the similarity gives the time preference score S_T(u_i, e) of the user for the event, as shown in formula (17):
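The time factor model can be illustrated with a short Python sketch. It assumes that the user time vector of formula (15) is the average of the time vectors of the user's historical events, which is one plausible reading but is not confirmed by the text, and it stops at the cosine similarity of formula (16) without applying the normalization of formula (17). Function names and the (weekday, hour) slot encoding are hypothetical.

```python
import math

SLOTS = 7 * 24  # one component per hour of the week

def event_time_vector(slots):
    """Build a 7 x 24 indicator vector from (weekday, hour) pairs.

    weekday is 0-6 and hour is 0-23; occupied slots are set to 1.
    """
    vec = [0.0] * SLOTS
    for day, hour in slots:
        vec[day * 24 + hour] = 1.0
    return vec

def user_time_vector(history):
    """Assumed form of formula (15): mean of the historical event time vectors."""
    vec = [0.0] * SLOTS
    for slots in history:
        for i, v in enumerate(event_time_vector(slots)):
            vec[i] += v
    n = len(history)
    return [v / n for v in vec] if n else vec

def cosine(a, b):
    """Cosine similarity; returns 0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

if __name__ == "__main__":
    history = [[(5, 19)], [(5, 19), (5, 20)], [(2, 9)]]
    user_vec = user_time_vector(history)
    print(cosine(user_vec, event_time_vector([(5, 19)])))
```

A user who mostly attends Saturday-evening events therefore scores a new Saturday-evening event higher than one held at an hour the user has never attended.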

Further, calculating the user preference score in step two specifically comprises:

for the geographic location preference model, the geographic location preference score is represented by the predicted probability that the user participates in an event held at the location; for the social relationship preference model, the social preference score of the target user is calculated from two aspects, namely the relationship between the target user and the group and the relevance between the target user and the users within the group; for the time factor preference model, a unified vector representation at the two granularities of day and hour is constructed, and the similarity of the user-event pair computed on this representation serves as the time preference score of the target user; the three single-factor preferences are combined into a user preference perception model and linearly combined to obtain the total preference score S_user of user u for event e, as shown in formula (18):

wherein S_G, S_I and S_T respectively denote the user's preference scores on the three single factors of geographic location, social relationship and time.

Further, constructing the event preference model in step two builds the single-factor preferences of the event from two aspects, namely event location popularity and event host influence, and specifically comprises:

step 2-2-1, constructing an event position popularity preference model:

the popularity of a geographic location is calculated from the visiting frequency of user u and of the users in the online group g that user u has joined;

first, the popularity p(l_e, u) of the event's geographic location l_e with respect to user u is defined as shown in formula (19):

$$p(l_e, u) = \frac{m_l(u, l_e)}{\max_{l \in L(u)} m_l(u, l)} \tag{19}$$

wherein the numerator m_l(u, l_e) is the frequency with which user u attended events held at geographic location l_e, and the denominator is the maximum frequency among the locations historically visited by user u; likewise, the popularity p(l_e, g) of geographic location l_e with respect to the group g of user u is defined as shown in formula (20):

wherein the numerator denotes the frequency with which the users in group g participated in activities at location l_e, and the denominator is the maximum frequency among the locations historically visited by the group members, which gives the popularity of geographic location l_e with respect to the users in group g; combining p(l_e, u) and p(l_e, g), the total popularity P(l_e, u, g) of the hosting location of the event to be recommended to target user u is defined as shown in formula (21):

P(l_e,u,g)＝αp(l_e,u)+(1-α)p(l_e,g) (21)
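The popularity computation leading to formula (21) can be sketched in Python as below. The user-level ratio follows the numerator and denominator described for formula (19). For formula (20), which is not reproduced in the text, the sketch assumes that member frequencies are summed per location and divided by the largest such sum. The default α = 0.5 and all names are illustrative assumptions.

```python
def user_popularity(user_visits, loc):
    """Formula (19) as described: frequency at loc over the user's maximum frequency."""
    if not user_visits:
        return 0.0
    return user_visits.get(loc, 0) / max(user_visits.values())

def group_popularity(group_visits, loc):
    """Assumed reading of formula (20): summed member frequency at loc
    divided by the largest summed frequency over all locations."""
    totals = {}
    for visits in group_visits:
        for place, freq in visits.items():
            totals[place] = totals.get(place, 0) + freq
    if not totals:
        return 0.0
    return totals.get(loc, 0) / max(totals.values())

def location_popularity(user_visits, group_visits, loc, alpha=0.5):
    """Formula (21): alpha * p(l_e, u) + (1 - alpha) * p(l_e, g)."""
    return (alpha * user_popularity(user_visits, loc)
            + (1 - alpha) * group_popularity(group_visits, loc))

if __name__ == "__main__":
    me = {"park": 4, "cafe": 2}
    group = [{"park": 1, "cafe": 3}, {"cafe": 2}]
    print(location_popularity(me, group, "cafe"))
```

In this toy example the cafe is only half as frequent as the park for the user but is the group's most visited venue, so the combined score for the cafe ends up higher than for the park.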

step 2-2-2, constructing an event host influence preference model:

first, the influence of the event host on the target user: the implicit preference of an event is expressed through the credibility or influence of its host; the influence I(e, u) of event e on user u is defined as shown in formula (22):

wherein m_h(u, u_h) denotes the set of events held by host u_h in which user u participated, and E_h is the set of all events held by host u_h;

second, the influence of the event host within the group: for the online group in which the target user is located, this is expressed by the proportion of the host's events in which the group's users participated, and is denoted I(e, g), as shown in formula (23):

wherein U_g denotes the set of users in group g, m_h(u_i, u_h) denotes the set of events held by host u_h in which user u_i participated, and E_h(g) denotes the set of events held by u_h in group g; from the influence of the event host on the target user and on the users in the group, the comprehensive influence score I(e, u, g) of the event host is calculated as shown in formula (24):

I(e,u,g)＝αI(e,u)+(1-α)I(e,g) (24)
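A possible reading of the host influence model is sketched below in Python. Formulas (22) and (23) are not reproduced in the text, so the sketch assumes that I(e, u) is the share of the host's events that the user attended and that I(e, g) is the average, over group members, of the share of the host's in-group events each member attended. These forms, the default α = 0.5, and the function names are assumptions.

```python
def host_influence_on_user(user_attended, host_events):
    """Assumed formula (22): fraction of the host's events the user attended."""
    if not host_events:
        return 0.0
    return len(user_attended & host_events) / len(host_events)

def host_influence_in_group(members_attended, host_group_events):
    """Assumed formula (23): mean fraction of the host's in-group events
    attended by each member of the group."""
    if not members_attended or not host_group_events:
        return 0.0
    shares = [len(attended & host_group_events) / len(host_group_events)
              for attended in members_attended]
    return sum(shares) / len(shares)

def host_influence(i_user, i_group, alpha=0.5):
    """Formula (24): alpha * I(e, u) + (1 - alpha) * I(e, g)."""
    return alpha * i_user + (1 - alpha) * i_group

if __name__ == "__main__":
    i_u = host_influence_on_user({"e1", "e2", "x"}, {"e1", "e2", "e3", "e4"})
    i_g = host_influence_in_group([{"e1"}, {"e1", "e2"}], {"e1", "e2"})
    print(host_influence(i_u, i_g))
```

Events are represented as sets of identifiers so that set intersection gives the events of a given host that a user actually attended.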

Further, calculating the event preference score in step two specifically comprises:

for a new event that has not yet taken place, the preference of the event is represented by its event location popularity and event host influence; the constructed event location popularity P(l_e, u, g) and event host influence I(e, u, g) are linearly combined to calculate the preference score S_events of event e for user u, as shown in formula (25):

Further, obtaining the user-event two-way preference score in step three and linearly weighting and combining the topic matching degree score and the two-way preference score to obtain the final recommendation score of the user-event pair specifically comprises:

step 3-1, solving two-way preference for user-event pairs:

let the weights of the user preference score and the event preference score be θ₁ and θ₂ respectively; weighting and fusing the two gives the user-event two-way preference score S_u,e = θ₁S_user + θ₂S_events; the problem of computing the two-way preference score is thus converted into solving for the weight vector of the two preference scores, and implicit feedback is selected as training data to learn the weight vector;

the learning algorithm BPR, based on Bayesian maximum likelihood estimation, is selected to learn the weights by learning to rank: the correct ranking order of user-event pairs is learned from the users' implicit feedback on events, so that events the user has attended are ranked ahead of new events or other events; first, the posterior probability p(θ | R) to be maximized is defined, as shown in formula (26):

p(θ|R)∝p(R|θ)p(θ) (26)

where θ denotes the weight vector and R denotes the set of all user-event pairs; p(R | θ) is defined as shown in formula (27):

wherein R_u denotes the user-event pairs of user u, and p(e_i > e_j | θ) denotes the probability that, for user u, event e_i is ranked ahead of event e_j, as shown in formula (28):

p(e_i>e_j|θ)＝σ(s(u,e_i)-s(u,e_j)) (28)

wherein s(u, e) is the two-way preference score S_u,e;

to make optimization more convenient, θ is assumed to follow a normal distribution with mean 0, and the final optimization objective ln p(θ | R) is obtained by expansion, as shown in formula (29):

wherein λ denotes the regularization coefficient; the optimal weight parameter vector is obtained by maximizing the objective function over the implicit interaction feedback data of users and events; the optimization problem is solved with the stochastic gradient descent algorithm SGD: in each iteration a user-event pair of the target user is drawn at random from the training set to update the weight vector θ, as shown in formula (30):

where α is the learning rate and s_ij = s(u, e_i) - s(u, e_j); through this learning process the weight vector θ is obtained automatically from the training set of user and event preference scores and the hyperparameters α and λ, which gives the two-way preference score S_u,e;
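The weight-learning step can be illustrated with a small Python sketch of BPR-style stochastic updates. Formulas (29) and (30) are not reproduced in the text, so the sketch assumes the usual BPR objective, the sum of ln σ(s_ij) minus an L2 penalty (λ/2)·‖θ‖², and performs gradient ascent on it. The initial weights, hyperparameter values and function names are illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_bpr_weights(pairs, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Learn theta = [theta_1, theta_2] for S_u,e = theta_1*S_user + theta_2*S_events.

    pairs is a list of (x_i, x_j), where x_i = (S_user, S_events) of an event
    the user attended and x_j = (S_user, S_events) of an event ranked below it.
    """
    rng = random.Random(seed)
    theta = [0.5, 0.5]  # assumed starting point
    for _ in range(epochs * len(pairs)):
        x_i, x_j = rng.choice(pairs)            # draw one training pair at random
        diff = [a - b for a, b in zip(x_i, x_j)]  # d s_ij / d theta
        s_ij = sum(t * d for t, d in zip(theta, diff))
        g = 1.0 - sigmoid(s_ij)                 # derivative of ln sigmoid(s_ij)
        theta = [t + lr * (g * d - reg * t) for t, d in zip(theta, diff)]
    return theta

def two_way_score(theta, s_user, s_events):
    """Two-way preference score S_u,e for one user-event pair."""
    return theta[0] * s_user + theta[1] * s_events

if __name__ == "__main__":
    data = [((0.9, 0.2), (0.1, 0.3)),
            ((0.8, 0.5), (0.2, 0.5)),
            ((0.7, 0.1), (0.3, 0.4))]
    print(train_bpr_weights(data))
```

In the toy data the user preference score separates attended from non-attended events while the event preference score does not, so the learned θ₁ ends up larger than θ₂.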

step 3-2, solving the final recommendation score of the user-event pair by combining topic matching and two-way preference:

first, event topics are extracted through the LDA topic model and the topic matching degree score of the user and the event is obtained; second, preference models of users and of events are constructed from the user and event context information in the EBSN, and the user-event two-way preference score is obtained through the BPR learning algorithm; finally, the topic matching degree score and the user-event two-way preference score S_u,e are combined by linear weighted summation to obtain the final recommendation score S_Rec of the user-event pair, as shown in formula (31):

where γ is a weighting parameter, typically set manually from experience; here its optimal setting is determined experimentally.
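The final fusion and TOP-K selection can be sketched in Python as follows. Formula (31) is not reproduced in the text, so the sketch assumes the convex combination γ·(topic matching score) + (1 − γ)·S_u,e; which term carries γ, the default γ = 0.5, and the function name are assumptions.

```python
def recommend_top_k(candidates, gamma=0.5, k=3):
    """Rank candidate events for one user and return the k best.

    candidates maps event id -> (topic_match_score, two_way_preference_score).
    The combined score is gamma * topic + (1 - gamma) * preference (assumed form).
    """
    scored = {event: gamma * topic + (1 - gamma) * pref
              for event, (topic, pref) in candidates.items()}
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    events = {"a": (0.9, 0.1), "b": (0.5, 0.8), "c": (0.2, 0.3)}
    print(recommend_top_k(events, gamma=0.5, k=2))
```

Varying γ between 0 and 1 shifts the ranking from pure two-way preference to pure topic matching, which is how an experimental sweep over γ could be run.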

A system for implementing personalized event recommendation fusing topic matching and two-way preference, used for implementing the personalized event recommendation method fusing topic matching and two-way preference described in any one of the above, the system comprising:

the document topic generation module, used for extracting the topics of the user's historical events and of new events, calculating the topic distribution and word distribution of the events, expressing the topic matching degree through the topic similarity between the user's historical events and a new event, and fusing the topic matching degree into the recommendation model as one of the key recommendation factors;

the user preference construction module, used for building the single-factor preferences of the user from the three aspects of geographic location, social relationship and time factor, and weighting and fusing the three single-factor preferences to obtain the overall preference of the user;

the event preference construction module, used for expressing the preference of an event through the social influence of the event host within the group and the popularity of the event's hosting location within the group;

the user-event two-way preference scoring module, used for solving the weight parameters of the user preference score and the event preference score with a learning-to-rank algorithm to obtain the user-event two-way preference score;

the user-event pair final recommendation scoring module, used for linearly weighting and combining the topic matching degree score and the two-way preference score to obtain the final recommendation score of the user-event pair.

Further, the user preference module comprises a geographic location preference module, a social relationship preference module, and a time factor preference module, and the event preference module comprises an event location popularity preference module and an event host influence preference module, wherein:

the geographic location preference module is used for expressing a geographic location preference score by predicting the probability of a user participating in an event held at a certain geographic location;

the social relationship preference module is used for calculating the social preference score of the target user from the relationship between the target user and the group and the relevance between the target user and the users in the group;

the time factor preference module is used for constructing unified vector representation of two granularities of date and hour, and calculating the similarity of a user-event pair as a time preference score of a target user;

the event location popularity preference module is used for taking into account, when recommending new events, the places that matter to interested users, referred to as the popularity of a geographic location within a user group; considering the popularity of the event's geographic location allows the attractiveness of the event to users to be calculated more accurately;

the event host influence preference module is used for calculating the influence of the event host on the target user and the influence of the event host within the group where the target user is located, so as to improve recommendation accuracy.

In the personalized event recommendation method and system fusing topic matching and two-way preference, first, the topic information of events is extracted with the document topic generation model LDA, user topic information is obtained from the historical event records in which the user participated, and the topic matching degree between the user and the event is calculated as an important recommendation factor in the recommendation model, since topic factors represent feature preferences well. Second, for event-based social network recommendation, preference models of users and of events are built from the two-way perspective of users and events, the user preference score and the event preference score are obtained respectively, and the preference relationship between users and events is mined more completely from both sides. Finally, the user-event pair topic matching degree and the two-way preference of users and events are combined by linear weighting to obtain the final comprehensive score of each user-event pair, and the top K (TOP-K) ranked user-event pairs are taken as the recommendation results. Extensive experiments on a real Meetup data set, with comparison against other event recommendation algorithms, show that the algorithm performs better than traditional recommendation schemes and can predict well and recommend events in a personalized manner.

Drawings

Fig. 1 is a block diagram of an overall recommendation fusion framework of a personalized event recommendation method and system fusing topic matching and two-way preference according to an embodiment of the present invention.

Fig. 2 is a block diagram of the document topic generation model LDA of the personalized event recommendation method and system fusing topic matching and two-way preference according to the embodiment of the present invention.

Detailed Description

In this embodiment, a personalized event recommendation method combining topic matching and two-way preference is taken as an example, and the following describes the present invention in detail with reference to specific embodiments and accompanying drawings.

Referring to fig. 1 and fig. 2, a personalized event recommendation method and system combining topic matching and two-way preference according to an embodiment of the present invention are shown.

This section explains the technical details of the personalized event recommendation system fusing topic matching and two-way preference. The system mainly uses an LDA topic model to calculate the topics of new events and of the user's historical events, uses cosine similarity to calculate the topic matching degree of user-event pairs, and constructs a user preference model and an event preference model respectively, wherein the user preference model calculates the comprehensive preference score of a user from the three aspects of time, geography and social relations.

1. Recommendation framework fusing LDA topic matching and user-event two-way preference

Based on the current existing work, an event recommendation scheme combining user-event pair topic matching and user-event pair two-way preference is provided based on geographic position information, time information, social relations and other related user event context information in the EBSN. In the scheme, the influence of the theme matching degree, the user preference and the event preference on event recommendation is respectively considered, and the factors are fused to effectively recommend the interest events to the user. The general framework of the recommendation model is shown in fig. 1, and the specific recommendation process is as follows:

1) According to the description documents of events in the EBSN, the topics of the new event and of the target user's historical events are calculated with the LDA topic model; the user's topics are represented by the topics of the user's historical events, and the semantic similarity between the topic distributions of the event and the user is then calculated to obtain the user-event topic matching degree score.

2) The user preference score and the event preference score are calculated. User preference is scored separately on geographic location, social relationship and time, and these scores are linearly fused; event preference is represented by the popularity of the event's hosting location and the social influence of the event host, and these scores are linearly fused to obtain the event preference score. Note that, when calculating the popularity of the geographic location and the influence of the event host, only the groups where the target user is located and the users in those groups are considered, and associations with other users and groups are ignored entirely, so as to improve recommendation performance and reduce computational complexity.

3) The above calculations yield the user-event topic matching degree score, the preference score of the user for the event and the preference score of the event for the user. The weights of the user preference score and the event preference score are learned with the Bayesian personalized ranking algorithm, and the two scores are fused according to these weights to obtain the two-way preference score; the topic matching degree score and the two-way preference score are then linearly combined to obtain the final recommendation score of the user-event pair, and the TOP-K events with the highest scores are recommended to the user.

2. Topic matching model based on LDA

In an event-based social network there is an obvious semantic similarity between users and events: users often choose to participate in a certain type of event that interests them, and such events generally have similar attributes and topics. Applying event topics in recommendation therefore captures the preferences of users and events better. User topics are represented by the topics of the historical events in which the user participated, the topic distribution and word distribution of new events are calculated, the topic matching degree is expressed by the topic similarity between the user's historical events and the new event, and the topic matching degree is fused into the recommendation model as one of the key recommendation factors.

The core idea of LDA is that each document selects a topic with a certain probability and selects a word from that topic with a certain probability; the topics in any document are considered to follow a Dirichlet distribution, and relationships between texts can be discovered through these distributions. LDA is a three-layer generative Bayesian network comprising documents, topics and words, in which the document-topic and topic-word distributions are both multinomial. The LDA topic model generation process is shown in FIG. 2.

Given the document set D = {d₁, d₂, …, d_m}, in FIG. 2 v denotes the topic distribution of document d_i and a companion parameter denotes the word distribution of a topic; α and β are the empirically given hyper-parameters of the prior distribution of topics and the prior distribution of words, respectively; k is the number of topics specified in advance for the document set; N_m denotes the number of words in document d_i; and M is the number of documents in the document set. For each word in document d_i, LDA determines the topic distribution v of the document from the prior knowledge α, draws a topic z from the topic distribution v, determines the word distribution of the current topic from the prior knowledge β, and then draws a word w from the word distribution corresponding to topic z; repeating this process N_m times generates document d_i. In this process, the topic distribution of document d_i can be solved for using the Gibbs sampling method.

Topic similarity between users and events is calculated with LDA, which converts text content into semantic features. The LDA topic model is used to calculate a topic distribution for each event. Event content consists mainly of the title and the description document, together with information such as the time and the venue, and event topics can be extracted from this content. The topic distribution of event document d_i is determined as shown in equation (1); the two unknown parameters of the model, namely the event topic distribution and the topic-word distribution, are then estimated by Gibbs sampling.

After the topic distribution and word distribution of the event documents are obtained through the LDA process, the similarity between events is calculated from their topic distributions using JS divergence (Jensen-Shannon divergence). JS divergence is a variant of KL divergence (Kullback-Leibler divergence); it is symmetric, which resolves the asymmetry of KL divergence, and it measures the similarity of two probability distributions better. The topic distributions of all events are generated according to formula (1). Given events e_dp and e_dq with their respective topic distributions, the JS divergence between the two distributions is calculated as shown in formula (2).

Here D_JS ∈ [0, 1], and D_KL denotes the KL divergence, which describes the difference between two probability distributions p and q and is calculated as shown in equation (3).

Combining formula (2) and formula (3) gives the topic similarity S_topic of events e_dp and e_dq, as shown in formula (4).

The topic similarity S_topic of two events lies in [0, 1], and a value closer to 1 indicates more similar events. As noted above, the topic similarity between a new event and a user's historical events is taken as the topic similarity between the user and the event. Because a user has usually participated in many events, there are multiple topic similarities with a new event. With E_u denoting the historical events of the target user, the average of all these similarities is taken as the topic matching degree score of the user and the new event, as shown in formula (5).

Algorithm 1 describes the process of calculating the topic matching degree of user-event pairs with the LDA topic model. Its notation covers the word distribution of a topic and the topic distribution of a document; Dir() denotes the Dirichlet distribution, Mult() the multinomial distribution, and Poiss() the Poisson distribution.

First, all event descriptions are assembled into a document set, stop words are removed, and the set is used as input to the LDA model to obtain the topic distribution of each event (lines 2 to 11). Next, the similarity between the topic distributions of the target user's historical events and the new event is calculated with JS divergence (lines 12 to 14). Finally, the similarities over all of the target user's historical events are averaged to obtain the topic matching degree score of the user and the new event (lines 15 to 16).
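The similarity and averaging steps can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the topic distributions are assumed to be already available as probability lists (for example from an LDA model), base-2 logarithms are used so that the JS divergence stays within [0, 1], and formula (4) is assumed to take the form S_topic = 1 - D_JS, which is consistent with values near 1 meaning high similarity.

```python
import math


def kl_divergence(p, q):
    """KL divergence D_KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Symmetric JS divergence built from two KL terms against the mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def topic_similarity(p, q):
    """Assumed form of formula (4): similarity = 1 - JS divergence."""
    return 1.0 - js_divergence(p, q)


def topic_matching_score(history_topics, new_event_topics):
    """Average similarity between a new event and the user's historical events."""
    if not history_topics:
        return 0.0  # no history: no topic evidence for this user
    sims = [topic_similarity(h, new_event_topics) for h in history_topics]
    return sum(sims) / len(sims)
```

Calling `topic_matching_score` with the topic distributions of a user's past events and those of a candidate event yields a single score in [0, 1] for that user-event pair.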

3. User-based preference model

User preferences are generally learned from the user's contextual information, and the learned features are taken to express those preferences. Single-factor preferences of the user are constructed from three aspects, namely geographic location, social relationship, and time, and the three single-factor preferences are weighted and fused to obtain the user's overall preference.

3.1 geographic location preferences

The geographic location preference model calculates the probability that the target user will participate in an event held at a given location. Kernel density estimation (KDE) is used to model the two-dimensional geographic distribution of the events the user has participated in, and the normalized event participation probability represents the user's preference for a geographic location. The longitude and latitude coordinates of an event location are denoted (Lx, Ly), the set of places where the user has historically participated in events is denoted L(u), and the KDE function for user u is as shown in equation (6).

Here l_i = (Lx_i, Ly_i)^T is the two-dimensional vector of longitude and latitude coordinates of an event location, m_l(u, l_i) is the frequency with which user u has attended events held at location l_i, σ is the size of the neighborhood window (bandwidth), N is the number of location samples, and K(·) is the Gaussian kernel function, defined as shown in equation (7).

Combining equations (6) and (7) defines the probability that user u will attend an event held at location l, as shown in equation (8).
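As a rough illustration of the geographic preference step, the sketch below implements a frequency-weighted two-dimensional Gaussian KDE. Several points are assumptions rather than the source's exact formulas: the kernel is the standard bivariate Gaussian, N is taken as the total attendance count, the density is scaled by 1/(N·σ²), and normalization is done over a list of candidate venues. All function and parameter names are hypothetical.

```python
import math


def gaussian_kernel(dx, dy):
    """Standard 2-D Gaussian kernel (assumed form of the kernel K)."""
    return math.exp(-0.5 * (dx * dx + dy * dy)) / (2 * math.pi)


def kde_density(location, visits, sigma):
    """Frequency-weighted KDE at `location`.

    visits maps (lon, lat) -> how often the user attended events there.
    """
    n = sum(visits.values())
    if n == 0:
        return 0.0
    total = 0.0
    for (lx, ly), freq in visits.items():
        dx = (location[0] - lx) / sigma
        dy = (location[1] - ly) / sigma
        total += freq * gaussian_kernel(dx, dy)
    return total / (n * sigma ** 2)


def geo_preference(candidates, visits, sigma):
    """Normalize densities over the candidate venues so that they sum to 1."""
    dens = {c: kde_density(c, visits, sigma) for c in candidates}
    z = sum(dens.values())
    return {c: (d / z if z > 0 else 0.0) for c, d in dens.items()}
```

The bandwidth `sigma` controls how far a past venue's influence spreads; a candidate venue close to frequently visited places receives a higher normalized score.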

3.2 social relationship preferences

In a user social relationship network, a user typically joins one or more interest groups online and may choose to participate in events posted by different groups. Among these groups, users usually take part in the groups that interest them most, and members of the same group generally share the same interests. The social relationship preference of a user can therefore be derived from the user's online same-group relationships, which mainly comprise two kinds of interaction.

1) Relevance of users to groups, that is, the interaction between the user and all groups the user belongs to, and between the user and the events created within those groups. G(u) denotes the set of groups to which the events attended by user u belong, and the relevance of the user to a group can be expressed as shown in formula (10).

Here m_p(u, g) denotes the set of events in group g that user u has attended.

2) Intra-group user relevance. The relevance of the users within a group is defined by the similarity of friends in the group where the target user is located; the similarity s(u, g) between the target user and the users in the group is calculated as shown in formula (11).

Here sim(u_i, u_j) denotes the similarity between users u_i and u_j in the same group, as shown in formula (12).

Finally, s(u, g) is normalized as shown in equation (13).

Taking these two interactions together, users belonging to the same or similar groups tend to attend events created within those groups. The relevance of the user to the group and the relevance among users within the group are combined to derive the social preference score S_I(u, g) of user u with respect to online group g, as shown in formula (14).

Here α ∈ [0, 1] is a weight control parameter. In the social relationship network, the preference association between the target user and the group is considered as important as the association among the users within the group, and α is set to 0.5, a value verified experimentally.

3.3 time preference

The time of an event is another important factor to consider when calculating user preferences. Different users prefer different times: some prefer to attend events in the evening and others in the morning, and preferences may also differ between weekdays and weekends. In reality time is periodic, chiefly with periods of 7 days per week and 24 hours per day, which gives user time preferences at two levels of granularity: the day of the week and the hour of the day on which a user chooses to attend events. The user's time preference is expressed by combining the user's choices at both levels of granularity.

If a user chooses to attend an event in a certain time period on a certain day of the week, this indicates an implicit time preference, and the user may choose to attend events in the same time period again. To represent this implicit preference uniformly and intuitively, a new event e that the user may choose to attend is represented as a 7×24-dimensional event time vector: when the new event takes place in a given time period of the week, the corresponding vector component is set to 1, and otherwise it is set to 0. In the time preference model, a user can likewise be represented as a user time vector built from the historical events the user has attended, as shown in equation (15).

Here E_u denotes the set of historical events the target user has participated in. The cosine similarity s(u, e) between the user time vector and the new event time vector is then calculated, as shown in formula (16).

For a new event e and each user u_i ∈ U, the similarity s(u_i, e) is obtained from equation (16); normalizing this similarity gives the time preference score S_T(u_i, e) of the user for the event, as shown in formula (17).
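The time vectors and their cosine similarity can be sketched as below. This is an illustration under stated assumptions: each event is assumed to occupy a single hour slot given by its start time, and the user time vector is assumed to be the average of the user's historical event vectors (the cosine similarity is unaffected by whether a sum or an average is used). The final normalization of formula (17) is not reproduced, and the function names are hypothetical.

```python
import math
from datetime import datetime

SLOTS = 7 * 24  # one component per (day of week, hour of day)


def slot_index(start):
    """Map a start time to its position in the 7x24 weekly grid."""
    return start.weekday() * 24 + start.hour


def event_time_vector(start):
    """One-hot vector: 1 in the slot where the event takes place, else 0."""
    vec = [0.0] * SLOTS
    vec[slot_index(start)] = 1.0
    return vec


def user_time_vector(history):
    """Average of the time vectors of the user's historical events."""
    vec = [0.0] * SLOTS
    for start in history:
        vec[slot_index(start)] += 1.0
    n = len(history)
    return [v / n for v in vec] if n else vec


def cosine_similarity(a, b):
    """Cosine similarity; returns 0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0
```

A user who mostly attends events on Monday evenings gets a high similarity to a new Monday evening event and a similarity of 0 to an event in a slot the user has never used.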

3.4 user fusion preference Scoring

Based on the three single-factor preference models above, the user's preference scores for geographic location, social relationship, and time are calculated separately. For geographic location, the preference score is the predicted probability that the user attends an event held at the location. For social relationship, the social preference score of the target user is calculated from the relationship between the target user and the group and from the relevance between the target user and the users in the group. For time, a unified vector representation at the day and hour granularities is constructed, and the similarity of a user-event pair computed from it serves as the time preference score of the target user. The three single-factor preferences together form a user preference perception model, and they are linearly combined to obtain the total preference score S_user of user u for event e, as shown in equation (18).

Here S_G, S_I, and S_T denote the user's preference scores for geographic location, social relationship, and time, respectively. Algorithm 2 describes the calculation of the user preference score.

Algorithm 2 presents the process of solving the user's comprehensive preference score from the user's preferences for geographic location, social relationship, and time. The probability that the user will participate in an event held at a specific location is predicted with the kernel density estimation algorithm, and the normalized probability expresses the user's geographic preference (line 3). The social preference is calculated from the user's relevance to the online group and to the members of the group according to equations (10) and (13) (lines 5 to 11). The new event and the user's historical events are represented as time vectors, and their cosine similarity expresses the user's time preference (line 4). Finally, the three preference values are linearly combined to obtain the total preference score of the user (lines 13 to 14).

4. Event-based preference model

The preferences of an event are learned from information about the event host and the event itself. Compared with a user, an event lacks active personalized context information, and a new event has no history or personalized tags. The preference of an event is therefore expressed in terms of the social influence of the event host within the group and the popularity of the event's geographic location within the group.

4.1 event location popularity

The geographic location where an event is held is one consideration in whether a user chooses to attend it. An online group that users join is generally a group of users with the same interests, and many of them may choose to participate in the same event. For new event recommendation, the venue of the new event can therefore serve as an important selection criterion for interested users; this relationship is called the popularity of the geographic location in the user group. Including the popularity of an event's geographic location in the event preference model allows the attractiveness of the event to a user to be calculated more accurately. The popularity of a geographic location is calculated from the frequency with which user u, and the users in the online group g that u has joined, have visited the place.

First, the popularity p(l_e, u) of an event's geographic location l_e with respect to user u is defined, as shown in formula (19).

Here the numerator m_l(u, l_e) is the frequency with which user u has attended events held at geographic location l_e, and the denominator is the maximum frequency among the locations historically visited by user u. Likewise, the popularity p(l_e, g) of geographic location l_e with respect to the group g of user u is defined, as shown in formula (20).

Here the numerator is the frequency with which the users in group g have participated in events at location l_e, and the denominator is the maximum frequency among the locations historically visited by the group members; this gives the popularity of geographic location l_e among the users in group g. Combining p(l_e, u) and p(l_e, g), the total popularity of the venue of the event to be recommended to target user u is defined as P(l_e, u, g), as shown in formula (21).

P(l_e,u,g)＝αp(l_e,u)+(1-α)p(l_e,g) (21)
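The popularity calculation can be sketched as follows. The per-user ratio follows the description of formula (19); for formula (20), the sketch assumes that the members' visit counts are summed per location before taking the same ratio, which is one plausible reading of the text. Function names and the dictionary-based data layout are hypothetical.

```python
from collections import Counter


def user_popularity(location, user_visits):
    """p(l_e, u): visits to l_e divided by the user's most-visited count."""
    peak = max(user_visits.values(), default=0)
    if peak == 0:
        return 0.0
    return user_visits.get(location, 0) / peak


def group_popularity(location, member_visits):
    """p(l_e, g): same ratio on visit counts summed over group members
    (an assumed reading of formula (20))."""
    total = Counter()
    for visits in member_visits:
        total.update(visits)  # adds the counts of each member
    return user_popularity(location, total)


def total_popularity(location, user_visits, member_visits, alpha):
    """Formula (21): P = alpha * p(l_e, u) + (1 - alpha) * p(l_e, g)."""
    return (alpha * user_popularity(location, user_visits)
            + (1 - alpha) * group_popularity(location, member_visits))
```

Here `user_visits` maps a venue identifier to the number of events the user attended there, and `member_visits` is a list of such mappings, one per group member.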

4.2 event host influence

In the event social network, the initiator of each event is also an ordinary user of the network. If an activity initiated by a host is well received, users who attended it are likely to choose to attend again when the host initiates other new activities. Although the event to be recommended is an entirely new event that has not yet occurred for any user, its host may be an active host of that type of event and may have hosted multiple events before, which provides additional auxiliary information for addressing the cold start problem in event recommendation. The influence of the event host on the users in the group is therefore an important feature of the event preference, and the method improves recommendation accuracy according to the influence of the event host on the group of the target user. This influence can be considered from the following two aspects.

1) The influence of the event host on the target user. The event social network has no information on users' ratings of events, so the influence of the host and the event cannot be represented directly, and an event cannot be rated once its life cycle has ended. First, the influence I(e, u) of the event on user u is defined, as shown in equation (22).

Here m_h(u, u_h) denotes the set of events held by host u_h that user u has participated in, and E_h is the set of all events held by host u_h.

2) The influence of the event host in the group. For the online group where the target user is located, the influence of the event in the group can similarly be expressed by the proportion of events that the users in the group have participated in; the influence on the users in the group is denoted I(e, g), as shown in equation (23).

Here U_g denotes the set of users in group g, m_h(u_i, u_h) denotes the set of events held by host u_h that user u_i has participated in, and E_h(g) denotes the set of events held by u_h in group g. Combining the influence of the event host on the target user and on the users in the group gives the comprehensive influence score I(e, u, g) of the event host, as shown in formula (24).

I(e,u,g)＝αI(e,u)+(1-α)I(e,g) (24)
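A small sketch of the host influence calculation follows. The user-level ratio matches the description of formula (22); the group-level score is assumed here to be the mean attendance ratio over group members, since the exact form of formula (23) is not recoverable from the text. Event identifiers and function names are hypothetical.

```python
def user_influence(attended_by_user, host_events):
    """I(e, u): share of the host's events that the user attended."""
    if not host_events:
        return 0.0
    return len(attended_by_user & host_events) / len(host_events)


def group_influence(attended_by_member, host_events_in_group):
    """I(e, g): mean attendance ratio over group members (assumed form)."""
    if not attended_by_member or not host_events_in_group:
        return 0.0
    ratios = [
        len(attended & host_events_in_group) / len(host_events_in_group)
        for attended in attended_by_member.values()
    ]
    return sum(ratios) / len(ratios)


def host_influence(i_user, i_group, alpha):
    """Formula (24): I = alpha * I(e, u) + (1 - alpha) * I(e, g)."""
    return alpha * i_user + (1 - alpha) * i_group
```

All event collections are Python sets of event identifiers, and `attended_by_member` maps each group member to the set of events that member attended.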

4.3 event potential preference Scoring

For new events that have not yet occurred, the method uses two key factors that attract users to attend: geographic location and host influence. The preference of an event is expressed by calculating the geographic location popularity of the new event and the social influence of its host. To reduce computational complexity and avoid interference from weakly related data, the popularity of the event's geographic location and the social influence of the host are restricted to the group where the target user is located; the correlations of all other users or groups are assumed to be zero, with no impact on event preference. The event geographic location popularity P(l_e, u, g) and the host influence I(e, u, g) constructed above are linearly combined to obtain the preference score S_events of event e for user u, as shown in formula (25).

Algorithm 3 details the process of calculating an event potential preference score by event location popularity and host influence.

Algorithm 3 presents the process of solving the event potential preference score from event geographic location popularity and host influence. For the group where the target user is located, the popularity of the event's geographic location with respect to the user and to the group is calculated according to formula (19) and formula (20), and the two are combined to give the total popularity of the event's geographic location (lines 3 to 8). Similarly, the influence of the event host on the user and on the group is obtained from equations (22) and (23) (lines 9 to 13), and the two are combined to give the influence of the event host. Finally, the location popularity and the host influence are linearly combined to obtain the potential preference score of the event (line 17).

5. Recommendation algorithm fusing topic matching and user event two-way preference

The topic distributions of users and events are solved with the LDA topic model, and the topic matching degree of user-event pairs is calculated from these distributions. A feature preference scoring model is constructed for users and for events, and the user preference score and the event preference score are solved separately.

1) Two-way preference scoring of user-event pairs. Suppose the preference score weights of the user and the event are θ₁ and θ₂; weighting and fusing the two scores gives the user-event two-way preference score S_u,e = θ₁S_user + θ₂S_events. The key problem of two-way preference scoring is then to find the weight vector of the two preference scores, and implicit feedback is chosen as the training data for learning it. Unlike explicit feedback, in which a user rates an item, implicit feedback in an event social network can only be expressed through the interaction between the user and the event: if the user attended the event the feedback is 1, and otherwise it is 0. Obviously, the user's feedback is 0 for all new events.

The Bayesian Personalized Ranking (BPR) learning algorithm, which is based on Bayesian maximum a posteriori estimation, is selected to learn the weights by learning to rank. The correct ranking of user-event pairs is learned from the user's implicit feedback on events, so that the events the user has participated in are ranked ahead of new events or other events. First, the posterior probability p(θ | R) to be maximized is defined, as shown in equation (26).

p(θ|R)∝p(R|θ)p(θ) (26)

Here θ denotes the weight vector, R denotes the set of all user-event pairs, and p(R | θ) is defined as shown in equation (27).

Here R_u denotes the user-event pairs of user u, and p(e_i > e_j | θ) is the probability that, for user u, event e_i is ranked ahead of event e_j, as shown in equation (28).

p(e_i>e_j|θ)＝σ(s(u,e_i)-s(u,e_j)) (28)

Here s(u, e) is the two-way preference score S_u,e.

To make optimization more convenient, θ is assumed to follow a normal distribution with mean 0, and the final optimization objective ln p(θ | R) is derived by expansion, as shown in equation (29).

Here λ denotes the regularization coefficient. The objective function is maximized using the implicit interaction feedback data of users and events to obtain the optimal weight parameter vector. The optimization problem is solved with the stochastic gradient descent (SGD) algorithm: in each iteration, a user-event pair of the target user is drawn at random from the training set to update the weight vector θ, and the update is as shown in formula (30).

Here α is the learning rate and s_ij = s(u, e_i) - s(u, e_j). Through this learning process, the weight vector θ is obtained automatically from the training set of user and event preference scores and the hyperparameters α and λ, which in turn gives the two-way preference score S_u,e.
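The weight learning can be illustrated with the sketch below, which applies the standard BPR stochastic update to a two-component weight vector. It is a sketch under assumptions, not the source's exact formula (30): σ in formula (28) is taken to be the logistic sigmoid, the score is the linear form s(u, e) = θ₁S_user + θ₂S_events, and each training sample pairs the score features of an attended event with those of a non-attended event for the same user. The function name, default hyperparameters, and initial weights are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic function, assumed to be the sigma of formula (28)."""
    return 1.0 / (1.0 + math.exp(-x))


def bpr_learn_weights(pairs, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Learn theta = (theta_1, theta_2) with BPR-style stochastic updates.

    pairs: list of (x_pos, x_neg); each x is (S_user, S_events) for an
    attended event (x_pos) and a non-attended event (x_neg) of one user.
    """
    theta = [0.5, 0.5]
    if not pairs:
        return theta
    rng = random.Random(seed)
    for _ in range(epochs * len(pairs)):
        x_pos, x_neg = rng.choice(pairs)       # random training sample
        diff = [a - b for a, b in zip(x_pos, x_neg)]
        s_ij = sum(t * d for t, d in zip(theta, diff))
        g = 1.0 - sigmoid(s_ij)                # gradient factor of ln sigma(s_ij)
        # ascent step on ln sigma(s_ij) minus the L2 penalty on theta
        theta = [t + lr * (g * d - reg * t) for t, d in zip(theta, diff)]
    return theta
```

The update is a gradient ascent step, because the objective is maximized; the regularization term pulls the weights toward the zero-mean prior assumed for θ.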

2) Fusing topic matching with the preferences of users and events. Following the discussion above, first, event topics are extracted with the LDA topic model and the topic matching degree score of the user and the event is obtained. Second, preference models of users and events are constructed from the context information of users and events in the EBSN, and the user-event two-way preference score is obtained with the BPR learning algorithm. Finally, the topic matching degree score and the user-event two-way preference score S_u,e are linearly weighted and summed to obtain the final user-event pair recommendation score S_Rec, as shown in formula (31).

Here γ is a weighting parameter that is typically set manually from experience; its optimal setting is determined experimentally. Algorithm 4 describes the process of fusing topic matching and two-way preference to solve the final recommendation score of a user-event pair.

Algorithm 4 presents the process of fusing the topic matching degree score with the user-event two-way preference score. First, the Bayesian personalized ranking algorithm performs learning to rank on the training set generated from the user preference score set and the event preference score set to obtain the optimal weight vector θ, and the two-way preference score of each user-event pair of the target user is calculated from θ (lines 2 to 10). Second, the topic matching degree score and the two-way preference score of the user-event pair are linearly combined to obtain the final recommendation score (lines 11 to 13), and the TOP-K events are recommended to the user according to this score.
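The final fusion and ranking step might look like the sketch below. It assumes formula (31) has the convex form S_Rec = γ·S_topic + (1 - γ)·S_u,e, which the text suggests but does not state explicitly; the function name and the dictionary layout of the candidate events are hypothetical.

```python
import heapq


def recommend_top_k(candidates, theta, gamma, k):
    """Rank candidate events for one user and return the top k.

    candidates: event_id -> (S_topic, S_user, S_events)
    theta: learned weights (theta_1, theta_2) of the two-way preference
    gamma: weight of the topic matching score (assumed convex combination)
    """
    scored = {}
    for event_id, (s_topic, s_user, s_events) in candidates.items():
        s_ue = theta[0] * s_user + theta[1] * s_events   # two-way preference
        scored[event_id] = gamma * s_topic + (1 - gamma) * s_ue
    # nlargest returns (event_id, score) tuples sorted by descending score
    return heapq.nlargest(k, scored.items(), key=lambda kv: kv[1])
```

For example, `recommend_top_k(candidates, theta, gamma=0.5, k=10)` returns up to ten `(event_id, score)` pairs for the target user, highest score first.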

This completes the personalized event recommendation scheme that combines topic matching with the two-way preference of users and events; its details are set out in the sections above.

In the personalized event recommendation method and system fusing topic matching and two-way preference, first, the topic information of events is extracted with the document topic generation model LDA, the user's topic information is obtained from the record of historical events the user has participated in, and the topic matching degree of the user and the event is calculated as an important recommendation factor in the recommendation model, since the topic factor represents feature preference well. Second, for event-based social network recommendation, preference models of users and events are built from the two-way perspective of users and events, the user preference score and the event preference score are obtained separately, and the preference relationship between users and events is mined more completely from both angles. Finally, the user-event pair matching degree and the two-way preference of users and events are combined by linear weighting to obtain the final comprehensive score of each user-event pair, and the ranked TOP-K user-event pairs are taken as the recommendation result. Extensive experiments on a real Meetup data set show that, compared with other traditional event recommendation algorithms, the method performs better and can predict users' personalized preferences.

It should be noted that the above-mentioned embodiments are only preferred embodiments of the present invention, and are not intended to limit the present invention, and those skilled in the art can make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A personalized event recommendation method fusing topic matching and two-way preference is characterized by comprising the following steps:

2. The method for recommending personalized events fusing topic matching and two-way preference according to claim 1, wherein the document topic generation model LDA in step one has a three-layer Bayesian network structure comprising documents, topics, and words, wherein the document-topic and topic-word relationships obey multinomial distributions, each document selects a topic with a certain probability and selects a word from this topic with a certain probability, the topics in any document follow a Dirichlet distribution, and relationships between texts are discovered through this distribution.

3. The personalized event recommendation method combining topic matching and two-way preference according to claim 2, wherein the step one of calculating the topics of the new event and the historical events of the user and calculating the topic matching degree score of the user-event pair by using cosine similarity comprises the following specific steps:

And subject term distribution v;

given events e_dp and e_dq with their respective topic distributions, the JS divergence between the two distributions is as shown in formula (2):

wherein D_JS ∈ [0, 1], and D_KL denotes the KL divergence, which describes the difference between two probability distributions p and q and is calculated as shown in equation (3):

according to the structureA topic matching model is built, finally

4. The method for recommending personalized events fusing topic matching and two-way preferences according to claim 3, wherein the constructing of the user preference model in the second step constructs the single-factor preferences of the user from three aspects of geographic location, social relationship and time factor, which specifically comprises:

step 2-1-1, constructing a geographic position preference model:

the geographic location preference model calculates the probability that a target user will participate in an event held at a given location; a kernel density estimation (KDE) method is adopted to model the two-dimensional geographic distribution of the events participated in by the user, and the normalized event participation probability represents the user's preference for the geographic location; the longitude and latitude coordinates of an event location are denoted (Lx, Ly), the set of places where the user has historically participated in events is denoted L(u), and the KDE function for user u is as shown in formula (6):

wherein l_i = (Lx_i, Ly_i)^T is the two-dimensional vector of longitude and latitude coordinates of an event location, m_l(u, l_i) is the frequency with which user u has attended events held at geographic location l_i, σ is the size of the neighborhood window (bandwidth), N is the number of location samples, and K(·) is a Gaussian kernel defined as shown in equation (7):

step 2-1-2, constructing a social relationship preference model:

Can be expressed as shown in formula (10):

wherein sim(u_i, u_j) denotes the similarity between users u_i and u_j in the same group, as shown in formula (12);

normalizing s(u, g) as shown in formula (13):

taking the two interactions described above together, users belonging to the same group tend to attend events created by other users within those groups, and the relevance of the user to the group and the relevance among users within the group are combined to derive the social preference score S_I(u, g) of user u with respect to online group g, as shown in formula (14):

step 2-1-3, constructing a time factor preference model:

As shown in equation (15):

for a new event e and each user u_i ∈ U, the similarity s(u_i, e) is obtained from equation (16), and normalizing this similarity gives the time preference score S_T(u_i, e) of the user for the event, as shown in formula (17):

5. The method for recommending personalized events fusing topic matching and two-way preference according to claim 4, wherein the calculating the user preference score in the second step specifically comprises:

for the geographic location preference model, representing the geographic location preference score by the predicted probability that the user attends an event hosted at the location; for the social relationship preference model, calculating the social preference score of the target user from two aspects, namely the relationship between the target user and the group and the relevance between the target user and the users in the group; for the time factor preference model, constructing a unified vector representation at the two granularities of day and hour, and calculating the similarity of a user-event pair from this representation as the time preference score of the target user; combining the three single-factor preferences to form a user preference perception model, and linearly combining the three single-factor preferences to obtain the total preference score S_user of user u for event e, as shown in formula (18):

6. The personalized event recommendation method integrating topic matching and two-way preference as claimed in claim 5, wherein the constructing of the event preference model in the second step constructs the single-factor preference of the event from two aspects of event location popularity and event host influence respectively, and specifically comprises:

step 2-2-1, constructing an event position popularity preference model:

first, the popularity p(l_e, u) of an event's geographic location l_e with respect to user u is defined, as shown in formula (19):

wherein the numerator m_l(u, l_e) is the frequency with which user u has attended events held at geographic location l_e, and the denominator is the maximum frequency among the locations historically visited by user u; likewise, the popularity p(l_e, g) of geographic location l_e with respect to the group g of user u is defined, as shown in formula (20):

P(l_e，u，g)＝αp(l_e，u)+(1-α)p(l_e，g) (21)

step 2-2-2, constructing an event host influence preference model:

wherein m_h(u, u_h) denotes the set of events held by host u_h that user u has participated in, and E_h is the set of all events held by host u_h;

wherein U_g denotes the set of users in group g, m_h(u_i, u_h) denotes the set of events held by host u_h that user u_i has participated in, and E_h(g) denotes the set of events held by u_h in group g; a comprehensive influence score I(e, u, g) of the event host is calculated from the influence of the event host on the target user and on the users in the group, as shown in formula (24):

I(e，u，g)＝αI(e，u)+(1-α)I(e，g) (24)
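
A hedged sketch of the host influence computation follows. Formulas (22) and (23) are not reproduced in this text, so two assumptions are made and should be read as such: I(e, u) is taken to be the share of the host's events E_h that user u attended, and I(e, g) is taken to be the mean, over the users in U_g, of the share of E_h(g) each user attended. Only the combination step corresponds directly to formula (24). All function names and data are hypothetical.

```python
def host_influence_on_user(attended_by_user, hosted_events):
    """I(e, u), assumed form: |m_h(u, u_h)| / |E_h|."""
    hosted = set(hosted_events)
    if not hosted:
        return 0.0
    return len(set(attended_by_user) & hosted) / len(hosted)


def host_influence_on_group(attended_by_member, hosted_in_group):
    """I(e, g), assumed form: mean over u_i in U_g of |m_h(u_i, u_h)| / |E_h(g)|."""
    hosted = set(hosted_in_group)
    if not hosted or not attended_by_member:
        return 0.0
    shares = [
        len(set(events) & hosted) / len(hosted)
        for events in attended_by_member.values()
    ]
    return sum(shares) / len(shares)


def combined_host_influence(i_user, i_group, alpha=0.5):
    """Formula (24): I(e, u, g) = alpha * I(e, u) + (1 - alpha) * I(e, g)."""
    return alpha * i_user + (1 - alpha) * i_group


# Hypothetical data: the host held four events, two of them inside the group.
i_user = host_influence_on_user(["e1"], ["e1", "e2", "e3", "e4"])
i_group = host_influence_on_group(
    {"u1": ["e1", "e2"], "u2": ["e1"], "u3": []},
    ["e1", "e2"],
)
influence = combined_host_influence(i_user, i_group, alpha=0.5)
```

Here the user-level influence is 1/4, the group-level influence is (1 + 0.5 + 0) / 3 = 0.5, and the combined score is 0.375.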

7. The personalized event recommendation method integrating topic matching and two-way preference according to claim 6, wherein the calculating of the event preference score in the second step specifically comprises:

8. The personalized event recommendation method integrating topic matching and two-way preference according to claim 7, wherein the step three of obtaining the user-event two-way preference score and linearly weighting and combining the topic matching degree score and the two-way preference score to obtain the final recommendation score of the user-event pair comprises the following specific steps:

step 3-1, solving two-way preference for user-event pairs:

suppose the preference score weights of the user and the event are θ₁ and θ₂; the two scores are weighted and fused to obtain the user-event two-way preference score S_u，e＝θ₁S_user+θ₂S_events; the two-way preference scoring problem is thereby converted into solving for the weight vector of the two preference scores, and implicit feedback is selected as training data to learn the weight vector;

p(θ|R)∝p(R|θ)p(θ) (26)

wherein R_u represents the user-event pairs of user u, and p(e_i＞e_j|θ) represents the probability that, for user u, event e_i is ranked ahead of e_j, as shown in formula (28):

p(e_i＞e_j|θ)＝σ(s(u，e_i)-s(u，e_j)) (28)

wherein s(u, e) is the two-way preference score S_u，e，

where α is the learning rate and s_ij＝s(u，e_i)-s(u，e_j); through the learning process, the weight vector θ can be obtained automatically from the user-event preference score training set and the hyperparameters α and λ, thereby yielding the two-way preference score S_u，e;
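
The weight learning step can be illustrated with a pairwise ranking sketch. The update rule of the method is not reproduced in this text, so the code below assumes a standard stochastic gradient ascent on the log-posterior implied by formulas (26) and (28), with an L2 penalty weighted by λ standing in for the prior p(θ). The initial weights, the epoch count, the training pairs, and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def score(theta, features):
    """s(u, e) = theta_1 * S_user + theta_2 * S_events."""
    return theta[0] * features[0] + theta[1] * features[1]


def learn_theta(pairs, alpha=0.1, lam=0.01, epochs=200, seed=0):
    """Learn theta from implicit-feedback pairs.

    pairs: list of (x_i, x_j), where x = (S_user, S_events); e_i is an event
    the user attended (should rank ahead), e_j an event the user did not attend.
    """
    rng = random.Random(seed)
    theta = [0.5, 0.5]  # assumed starting point
    for _ in range(epochs):
        for x_i, x_j in rng.sample(pairs, len(pairs)):  # shuffled pass
            s_ij = score(theta, x_i) - score(theta, x_j)
            grad_scale = 1.0 - sigmoid(s_ij)  # derivative of ln sigma(s_ij)
            for k in range(2):
                # Ascent step on ln sigma(s_ij) minus the L2 penalty (assumed form).
                theta[k] += alpha * (grad_scale * (x_i[k] - x_j[k]) - lam * theta[k])
    return theta


# Hypothetical training pairs: (attended event scores, non-attended event scores).
pairs = [
    ((0.9, 0.2), (0.3, 0.4)),
    ((0.8, 0.5), (0.2, 0.6)),
    ((0.7, 0.3), (0.4, 0.3)),
]
theta = learn_theta(pairs)
```

In this toy data the attended events differ from the others mainly in S_user, so the learned θ₁ ends up larger than θ₂ and every attended event scores above its paired non-attended event.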

the topic matching degree score and the user-event two-way preference score S_u，e are linearly weighted and summed to obtain the final user-event pair recommendation score S_Rec, as shown in formula (31):
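
Formula (31) is not reproduced in this text. One hedged reading of the linear weighted summation, followed by the TOP-K selection described in the abstract, is sketched below. The weight β, its complement form (1 - β), the function name, and the candidate data are assumptions made for illustration.

```python
def recommend_top_k(candidates, beta=0.5, k=2):
    """Rank user-event pairs by an assumed S_Rec = beta * match + (1 - beta) * S_u,e.

    candidates: list of (user, event, topic_match_score, two_way_preference_score).
    Returns the k highest-scoring (user, event, S_Rec) triples.
    """
    scored = [
        (user, event, beta * match + (1 - beta) * pref)
        for user, event, match, pref in candidates
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]


# Hypothetical candidate pairs with precomputed scores.
candidates = [
    ("u1", "e1", 0.8, 0.4),
    ("u1", "e2", 0.2, 0.9),
    ("u2", "e1", 0.6, 0.7),
]
top = recommend_top_k(candidates, beta=0.5, k=2)
```

With β = 0.5 the three pairs score roughly 0.60, 0.55, and 0.65, so the two recommended pairs are (u2, e1) followed by (u1, e1).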

9. An implementation system for personalized event recommendation fusing topic matching and two-way preference, which is used for implementing the personalized event recommendation method fusing topic matching and two-way preference according to any one of claims 1-8, and is characterized in that the implementation system comprises:

10. The system of claim 9, wherein the user preference module comprises a geo-location preference module, a social relationship preference module, and a time factor preference module, and the event preference module comprises an event location popularity preference module and an event host influence preference module, wherein: