CN111428127A - Personalized event recommendation method and system integrating topic matching and two-way preference

Publication number
CN111428127A
Authority
CN
China
Prior art keywords
event
user
preference
events
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010069262.9A
Other languages
Chinese (zh)
Other versions
CN111428127B (en)
Inventor
钱忠胜
杨家秀
朱懿敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi University of Finance and Economics
Original Assignee
Jiangxi University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi University of Finance and Economics filed Critical Jiangxi University of Finance and Economics
Priority to CN202010069262.9A priority Critical patent/CN111428127B/en
Publication of CN111428127A publication Critical patent/CN111428127A/en
Application granted granted Critical
Publication of CN111428127B publication Critical patent/CN111428127B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a personalized event recommendation method and system integrating topic matching and two-way preference. First, the document topic generation model LDA is used to extract topic information from events and from the historical events in which users have participated, and the topic matching degree between users and events is calculated. Second, for event-based social network recommendation, preference models of users and of events are constructed from the two-way perspective of users and events, yielding user preference scores and event preference scores respectively, so that preference relations are mined more completely from both the user side and the event side. Finally, the user-event topic matching degree and the two-way preference of users and events are combined by linear weighting to obtain the final comprehensive score of each user-event pair, and the ranked top-K user-event pairs are taken as the recommendation results.

Description

Personalized event recommendation method and system integrating topic matching and two-way preference
Technical Field
The invention relates to the technical field of information recommendation, in particular to a personalized event recommendation method and system integrating topic matching and two-way preference.
Background
With the rapid development of Internet and computer technologies, traditional social networks have in recent years evolved in different innovative directions, forming a number of new, specialized types of social network. One is the Location-Based Social Network (LBSN), which forms social relationships mainly from users' geographic check-in information. Another is the Event-Based Social Network (EBSN), a complex heterogeneous social network that combines online and offline interaction. Unlike the friend relationships established among acquaintances in a traditional social network, users in an event-based social network establish interpersonal relationships through social activities: they join online interest groups and offline collective social activities according to their interests or what they have in common.
In the course of the rapid development of event-based social networks, more and more users choose to take part in social activities on them. On an event-based social network platform, users can join various online groups, and an organizer or any user in a group can initiate and participate in offline social activities, such as parties, hikes, sports activities and concerts, and share information with other users.
An event-based social network provides users with a social service that links online to offline, helping them initiate and plan personalized event participation. Users form online group relations through shared interests, and face-to-face meet-up events are initiated offline. An event-based social network has broader social attributes than a location-based social network, and existing work shows that, in recommendation systems, event-based social networks have better recommendation characteristics than traditional social networks.
Most current event-based social network recommendation methods extract feature preferences from the user's one-way perspective only. Although the social influence of event hosts may be considered, the potential attractiveness of the events themselves is not sufficiently taken into account. In addition, where topic factors are used, the event topic is treated as just one recommendation factor, while the user's topic factors and the degree to which they match the event topic are rarely considered.
Disclosure of Invention
In view of the above, it is necessary to provide a personalized event recommendation method and system integrating topic matching and two-way preference, which combines several main types of context information to calculate user preferences and the potential preferences of events, and finally fuses the topic matching degree with the two-way user-event preference.
A personalized event recommendation method fusing topic matching and two-way preference comprises the following steps:
Step one: extract the topic information of events using the document topic generation model LDA, obtain user topic information from the historical record of events in which the user has participated, calculate the topics of new events and of the user's historical events, and calculate the topic matching degree score of each user-event pair using cosine similarity;
Step two: construct a user preference model and an event preference model, and calculate a user preference score and an event preference score respectively;
Step three: learn the weight parameters of the user preference score and the event preference score using the Bayesian personalized ranking algorithm BPR to obtain the two-way user-event preference score, combine the topic matching degree score and the two-way preference score by linear weighting to obtain the final recommendation score of each user-event pair, and recommend the top K ranked events to the user.
Further, the document topic generation model LDA in step one has a three-layer Bayesian network structure comprising documents, topics and words, in which the document-topic and topic-word distributions are multinomial. Each document selects a topic with a certain probability and selects a word from that topic with a certain probability; the topic proportions of each document follow a Dirichlet distribution, and relationships between texts are discovered through these distributions.
Further, in step one, calculating the topics of the new event and of the user's historical events and calculating the topic matching degree score of the user-event pair using cosine similarity specifically includes:
Step 1-1: form a document set D from the description text of all events, remove stop words, and input D into the document topic generation model LDA to obtain the topic distribution of each event.
Stop words and punctuation marks are removed from all event content, and the documents left after removing these noise words are treated as the set D of all documents. D is input into the LDA topic model, and the joint distribution p(ω, z | α) of the topics and words of a generated document d_i is given by formula (1):
Figure RE-GDA0002533677390000021
two unknown parameters in the model were then estimated using the Gibbs sampling method: event topic distribution
Figure RE-GDA0002533677390000022
And topic word distribution v;
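As an illustration of step 1-1, the sketch below runs a collapsed Gibbs sampler for LDA over tokenized event descriptions. It is a minimal stand-in, not the patented implementation: the function name `lda_gibbs`, the hyperparameter defaults `alpha` and `beta`, and the iteration count are all assumptions chosen for the example.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (doc_topic, topic_word, vocab): per-document topic distributions,
    per-topic word distributions, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics

    n_dk = [[0] * K for _ in docs]      # topic counts per document
    n_kw = [[0] * V for _ in range(K)]  # word counts per topic
    n_k = [0] * K                       # total tokens per topic
    z = []                              # topic assignment of every token

    # Random initial topic assignment for each token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(K)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Full conditional over topics given all other assignments.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed point estimates of the two unknown distributions.
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(K)
    ]
    return doc_topic, topic_word, vocab
```

Each row of `doc_topic` is the topic distribution of one event description and can be passed directly to the similarity calculation of step 1-2.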
step 1-2, calculating topic distribution similarity between the historical event and the new event of the target user according to a JS divergence algorithm;
Topic distributions for all events have been generated according to formula (1). Given events e_dp and e_dq with topic distributions ϑ_dp and ϑ_dq respectively, the JS divergence D_JS(ϑ_dp ‖ ϑ_dq) between the two is first calculated by the JS divergence method, as shown in formula (2):
$$D_{JS}(\vartheta_{dp} \,\|\, \vartheta_{dq}) = \tfrac{1}{2} D_{KL}(\vartheta_{dp} \,\|\, M) + \tfrac{1}{2} D_{KL}(\vartheta_{dq} \,\|\, M), \qquad M = \tfrac{1}{2}(\vartheta_{dp} + \vartheta_{dq}) \tag{2}$$
where D_JS ∈ [0,1] and D_KL denotes the KL divergence, which describes the difference between two probability distributions p and q; its calculation is shown in formula (3):
$$D_{KL}(p \,\|\, q) = \sum_{i} p_i \log_2 \frac{p_i}{q_i} \tag{3}$$
Combining formula (2) and formula (3), the topic similarity S_topic of events e_dp and e_dq is obtained, as shown in formula (4):
$$S_{topic}(e_{dp}, e_{dq}) = 1 - D_{JS}(\vartheta_{dp} \,\|\, \vartheta_{dq}) \tag{4}$$
where the topic similarity S_topic takes values in [0,1]; the closer the value is to 1, the higher the event similarity;
step 1-3, averaging the similarity of all historical events of a target user to obtain a theme matching degree score of the user and a new event;
With |E_u| denoting the number of historical events of the target user u, the average of the similarities between all of the user's historical events and the new event e is taken as the topic matching degree score of the user and the new event, as shown in formula (5):
$$\bar{S}_{topic}(u, e) = \frac{1}{|E_u|} \sum_{e_h \in E_u} S_{topic}(e_h, e) \tag{5}$$
Based on the constructed topic matching model, \bar{S}_{topic}(u, e) is finally used to measure the topic matching relationship between the target user and the new event.
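Steps 1-2 and 1-3 can be sketched as follows. The sketch assumes that the topic similarity of formula (4) is one minus the JS divergence computed with base-2 logarithms, which is consistent with the stated range [0,1] but is not spelled out in the text; the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """KL divergence D_KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """JS divergence between two topic distributions, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def topic_similarity(p, q):
    """Assumed form of formula (4): similarity = 1 - JS divergence."""
    return 1.0 - js_divergence(p, q)


def topic_match_score(history, new_event):
    """Average similarity between a user's historical events and a new event."""
    if not history:
        return 0.0
    return sum(topic_similarity(h, new_event) for h in history) / len(history)
```

For example, `topic_match_score([[0.7, 0.3], [0.6, 0.4]], [0.65, 0.35])` scores a new event against two past events of the same user; identical distributions give a similarity of 1 and distributions with disjoint support give 0.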
Further, constructing the user preference model in step two builds the single-factor preferences of the user from three aspects, geographic location, social relationship and time, and specifically includes:
Step 2-1-1: construct a geographic location preference model:
The geographic location preference model calculates the probability that the target user will attend an event held at a given location. The kernel density estimation (KDE) method is used to model the two-dimensional geographic distribution of the events the user has attended, and the normalized event participation probability represents the user's degree of preference for the geographic location. The longitude and latitude coordinates of an event's geographic location are denoted (Lx, Ly), the set of locations where the user has historically attended events is denoted L(u), and the KDE function for user u
Figure RE-GDA0002533677390000034
As shown in formula (6):
Figure RE-GDA0002533677390000035
where l_i = (Lx_i, Ly_i)^T is the two-dimensional vector of the longitude and latitude coordinates of an event location, m_l(u, l_i) is the frequency with which user u attended events hosted at geographic location l_i, σ is the size of the neighborhood window (bandwidth), N is the number of location samples, and K(·) is a Gaussian kernel, defined as shown in formula (7):
Figure RE-GDA0002533677390000036
combining equations (6) and (7) may define the probability that user u will attend an event that will be held at location l, as shown in equation (8):
Figure RE-GDA0002533677390000037
The probability is normalized to obtain the user's preference score for the geographic location, S_G(u, l), as shown in formula (9):
Figure RE-GDA0002533677390000038
The denominator represents the maximum event participation probability of the target user;
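The geographic preference of step 2-1-1 can be illustrated with a frequency-weighted Gaussian KDE. Since formulas (6) to (9) are not reproduced in the text, the sketch makes several assumptions: the density is normalized by N·σ² with N taken as the total number of visits, longitude and latitude are treated as planar coordinates, and the maximum in the denominator is taken over a supplied list of candidate locations. All function names are illustrative.

```python
import math


def gaussian_kernel_2d(dx, dy):
    """Standard two-dimensional Gaussian kernel K(x) = exp(-x.x / 2) / (2*pi)."""
    return math.exp(-0.5 * (dx * dx + dy * dy)) / (2 * math.pi)


def attendance_density(location, visits, bandwidth):
    """Frequency-weighted KDE at `location`.

    visits: dict mapping (lon, lat) -> number of events the user attended there.
    """
    n = sum(visits.values())  # assumed meaning of N: total number of visits
    if n == 0:
        return 0.0
    total = 0.0
    for (x, y), freq in visits.items():
        dx = (location[0] - x) / bandwidth
        dy = (location[1] - y) / bandwidth
        total += freq * gaussian_kernel_2d(dx, dy)
    return total / (n * bandwidth ** 2)


def geo_preference(location, visits, candidates, bandwidth=0.05):
    """Density at `location` divided by the largest density among candidates."""
    own = attendance_density(location, visits, bandwidth)
    peak = max([own] + [attendance_density(c, visits, bandwidth) for c in candidates])
    return own / peak if peak > 0 else 0.0
```

With this normalization the most attractive candidate location for a user always scores 1, and locations far from anything the user has visited score close to 0.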
step 2-1-2, constructing a social relationship preference model:
In the user social relationship network, a user can join one or more interest groups online and choose to participate in events published by different groups. The user's social relationship preference is judged from the user's online same-group relationships, which mainly comprise two interactive relationships.
First, the relevance of a user to a group is defined by the interaction between the user and all the groups the user belongs to and between the user and the events created in those groups. G(u) denotes the set of groups to which the events attended by user u belong, so the relevance of the user to a group
Figure RE-GDA0002533677390000041
Can be expressed as shown in formula (10):
Figure RE-GDA0002533677390000042
where m_p(u, g) denotes the set of events in group g in which user u has participated.
Second, the relevance of users within a group is defined by the similarity between the target user and the other members of the group the target user belongs to. The similarity s(u, g) between the target user and the users in the group is calculated as shown in formula (11):
Figure RE-GDA0002533677390000043
where sim(u_i, u_j) denotes the similarity between users u_i and u_j in the same group, as shown in formula (12):
Figure RE-GDA0002533677390000044
s(u, g) is then normalized to
Figure RE-GDA0002533677390000045
As shown in formula (13):
Figure RE-GDA0002533677390000046
Combining the two interactions above, users belonging to the same groups tend to participate in events created by other users within those groups. Combining the user-to-group relevance with the relevance among users within the group yields the social preference score S_I(u, g) of user u with respect to online group g, as shown in formula (14):
Figure RE-GDA0002533677390000047
where α ∈ [0,1] is a weight control parameter. In the social relationship network, the preference association between the target user and the group is treated as being as important as the association among users within the group, and α is set to 0.5, a value verified experimentally.
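The sketch below shows one way the two interactions could be combined. Formulas (10) to (14) are not reproduced in the text, so every concrete form here is an assumption made for illustration: group relevance is taken as the share of the user's attended events that belong to the group, user similarity is taken as the Jaccard overlap of attended events averaged over the other members (which already lies in [0,1], so no further normalization is applied), and α weights the group relevance term.

```python
def group_relevance(user_events_by_group, group):
    """Share of the user's attended events created in `group` (assumed form)."""
    total = sum(len(events) for events in user_events_by_group.values())
    if total == 0:
        return 0.0
    return len(user_events_by_group.get(group, ())) / total


def jaccard(a, b):
    """Jaccard overlap of two sets of attended events (assumed similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def member_similarity(user, members, attended):
    """Average similarity between `user` and the other members of a group."""
    others = [m for m in members if m != user]
    if not others:
        return 0.0
    return sum(jaccard(attended[user], attended[m]) for m in others) / len(others)


def social_preference(relevance, similarity, alpha=0.5):
    """Linear combination of the two interactions, with alpha = 0.5 by default."""
    return alpha * relevance + (1 - alpha) * similarity
```

With `alpha=0.5`, as stated in the text, both interactions contribute equally to the social preference score.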
step 2-1-3, constructing a time factor preference model:
The time factor of an event is an important preference factor to consider when calculating user preference. A new event e that the user may choose to attend is represented as a 7 × 24-dimensional event time vector \vec{t}_e: when the new event takes place in a particular time slot of the week, the vector component for that slot is set to 1, and otherwise to 0. In the time preference model, the user is represented as a user time vector \vec{t}_u based on the historical record of events the user has attended, as shown in formula (15):
Figure RE-GDA00025336773900000410
where E_u denotes the set of historical events attended by the target user. The cosine similarity s(u, e) between the user time vector and the new event time vector is then calculated, as shown in formula (16):
$$s(u, e) = \frac{\vec{t}_u \cdot \vec{t}_e}{\|\vec{t}_u\| \, \|\vec{t}_e\|} \tag{16}$$
For a new event e, the similarity s(u_i, e) of each user u_i ∈ U can be obtained from formula (16); normalizing this similarity gives the user's time preference score S_T(u_i, e) for the event, as shown in formula (17):
Figure RE-GDA0002533677390000051
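The time preference of step 2-1-3 can be sketched with 7 × 24 = 168 weekly slots. The sketch assumes that each event occupies a single (weekday, hour) slot and that the user time vector is the mean of the vectors of the user's past events; because cosine similarity is scale-invariant, using the sum instead would give the same result. The final normalization of formula (17) is not reproduced in the text and is left out.

```python
import math

SLOTS = 7 * 24  # one component per (weekday, hour) slot of the week


def event_time_vector(weekday, hour):
    """One-hot vector for an event starting on `weekday` (0-6) at `hour` (0-23)."""
    vec = [0.0] * SLOTS
    vec[weekday * 24 + hour] = 1.0
    return vec


def user_time_vector(history):
    """Mean of the time vectors of the user's past events (assumed form).

    history: list of (weekday, hour) pairs.
    """
    vec = [0.0] * SLOTS
    for weekday, hour in history:
        vec[weekday * 24 + hour] += 1.0
    n = len(history)
    return [v / n for v in vec] if n else vec


def cosine(a, b):
    """Cosine similarity; returns 0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def time_similarity(history, weekday, hour):
    """Similarity s(u, e) between a user's history and a new event's time slot."""
    return cosine(user_time_vector(history), event_time_vector(weekday, hour))
```

A user who usually attends events on Saturday evenings gets a high similarity for a new Saturday-evening event and 0 for a slot the user has never attended.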
Further, calculating the user preference score in step two specifically includes:
For the geographic location preference model, the geographic location preference score is represented by the predicted probability that the user attends an event hosted at the location. For the social relationship preference model, the social preference score of the target user is calculated from two aspects: the relationship between the target user and the group, and the relevance between the target user and the users within the group. For the time factor preference model, a unified vector representation over two granularities, day of the week and hour, is constructed, and the similarity of the user-event pair computed on it serves as the time preference score of the target user. The three single-factor preferences are combined into a user preference perception model, and combining them linearly gives the overall preference score S_user of user u for event e, as shown in formula (18):
Figure RE-GDA0002533677390000052
where S_G, S_I and S_T denote the user's preference scores on the three single factors of geographic location, social relationship and time, respectively.
Further, constructing the event preference model in step two builds the single-factor preferences of the event from two aspects, event location popularity and event host influence, and specifically includes:
Step 2-2-1: construct an event location popularity preference model:
The popularity of a geographic location is calculated from the visiting frequency of user u and of the users in the online group g that user u has joined.
First, the popularity p(l_e, u) of the event's geographic location l_e with respect to user u is defined as shown in formula (19):
$$p(l_e, u) = \frac{m_l(u, l_e)}{\max_{l \in L(u)} m_l(u, l)} \tag{19}$$
where the numerator m_l(u, l_e) is the frequency with which user u attended events held at geographic location l_e, and the denominator is the maximum frequency over the locations historically visited by user u. Likewise, the popularity p(l_e, g) of geographic location l_e with respect to the group g of user u is defined as shown in formula (20):
Figure RE-GDA0002533677390000054
where the numerator is the frequency with which the users in group g attended events at location l_e, and the denominator is the maximum frequency over the locations historically visited by the group members; this gives the popularity of geographic location l_e among the users in group g. Combining p(l_e, u) and p(l_e, g), the total popularity of the hosting location of the event to be recommended to target user u is defined as P(l_e, u, g), as shown in formula (21):
P(l_e, u, g) = α p(l_e, u) + (1 − α) p(l_e, g) (21)
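The location popularity of step 2-2-1 can be sketched as below. The per-user ratio follows the description of formula (19); the group-level ratio is an assumption, since formula (20) is not reproduced in the text: member frequencies are summed per location and divided by the largest such sum. Function names and the dictionary layout are illustrative.

```python
def user_location_popularity(user_visits, location):
    """Visits to `location` divided by the user's most-visited location count.

    user_visits: dict mapping location -> number of attended events there.
    """
    peak = max(user_visits.values(), default=0)
    return user_visits.get(location, 0) / peak if peak else 0.0


def group_location_popularity(member_visits, location):
    """Group-level popularity (assumed form): summed member frequencies,
    divided by the largest summed frequency over all locations.

    member_visits: list of per-member dicts mapping location -> frequency.
    """
    totals = {}
    for visits in member_visits:
        for loc, freq in visits.items():
            totals[loc] = totals.get(loc, 0) + freq
    peak = max(totals.values(), default=0)
    return totals.get(location, 0) / peak if peak else 0.0


def location_popularity(user_visits, member_visits, location, alpha=0.5):
    """Formula (21): alpha * p(l_e, u) + (1 - alpha) * p(l_e, g)."""
    return (alpha * user_location_popularity(user_visits, location)
            + (1 - alpha) * group_location_popularity(member_visits, location))
```

A venue the target user rarely visits can still score well if it is the favorite venue of the user's group, which is the effect the combined popularity is meant to capture.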
step 2-2-2, constructing an event host influence preference model:
First, the influence of the event host on the target user: the implicit preference of the event is expressed through the credibility or influence of its host. The influence I(e, u) of event e on user u is defined as shown in formula (22):
$$I(e, u) = \frac{|m_h(u, u_h)|}{|E_h|} \tag{22}$$
where m_h(u, u_h) denotes the set of events held by host u_h in which user u has participated, and E_h is the set of all events held by host u_h.
Second, the influence of the event host within the group: for the online group to which the target user belongs, it is expressed by the participation frequency ratio of the users in the group. This influence among the users in the group is denoted I(e, g), as shown in formula (23):
Figure RE-GDA0002533677390000062
where U_g denotes the set of users in group g, m_h(u_i, u_h) denotes the set of events held by host u_h in which user u_i has participated, and E_h(g) denotes the set of events held by u_h in group g. From the host's influence on the target user and on the users in the group, the comprehensive influence score I(e, u, g) of the event host is calculated, as shown in formula (24):
I(e,u,g)=αI(e,u)+(1-α)I(e,g) (24)
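The host influence of step 2-2-2 can be sketched with plain set operations. The user-level ratio follows the definitions of m_h and E_h given in the text; the group-level form is an assumption, since formula (23) is not reproduced: it averages, over the group members, the share of the host's in-group events that each member attended.

```python
def host_influence_on_user(user_attended, host_events):
    """Share of the host's events that the target user has attended."""
    if not host_events:
        return 0.0
    return len(user_attended & host_events) / len(host_events)


def host_influence_in_group(members_attended, host_group_events):
    """Average share of the host's in-group events attended per member
    (assumed form of the group-level influence).

    members_attended: list of sets of event ids, one set per group member.
    """
    if not members_attended or not host_group_events:
        return 0.0
    shares = [len(attended & host_group_events) / len(host_group_events)
              for attended in members_attended]
    return sum(shares) / len(shares)


def host_influence(user_attended, host_events, members_attended,
                   host_group_events, alpha=0.5):
    """Formula (24): alpha * I(e, u) + (1 - alpha) * I(e, g)."""
    return (alpha * host_influence_on_user(user_attended, host_events)
            + (1 - alpha) * host_influence_in_group(members_attended, host_group_events))
```

Both ratios lie in [0,1], so the combined score can be mixed directly with the location popularity when forming the event preference score.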
Further, calculating the event preference score in step two specifically includes:
For a new event that has not yet taken place, the preference of the event is represented by calculating the popularity of its location and the influence of its host. The constructed event location popularity P(l_e, u, g) and event host influence I(e, u, g) are combined linearly to calculate the preference score S_events of event e for user u, as shown in formula (25):
Figure RE-GDA0002533677390000063
Further, obtaining the two-way user-event preference score in step three, and combining the topic matching degree score and the two-way preference score by linear weighting to obtain the final recommendation score of the user-event pair, specifically includes:
Step 3-1: solve the two-way preference for user-event pairs:
Suppose the preference score weights of the user and the event are θ_1 and θ_2; weighting and fusing the two gives the two-way user-event preference score S_{u,e} = θ_1 S_user + θ_2 S_events. The problem of two-way preference scoring is thus converted into solving for the weight vector of the two preference scores, and implicit feedback is selected as training data to learn the weight vector.
The learning algorithm BPR, based on Bayesian maximum likelihood estimation, is selected to perform ranking learning on the weights. It learns the correct ranking of user-event pairs from the user's implicit feedback on events, so that events the user has attended are ranked ahead of new or other events. First, the maximum posterior probability p(θ | R) is defined, as shown in formula (26):
p(θ|R)∝p(R|θ)p(θ) (26)
where θ denotes the weight vector, R denotes the set of all user-event pairs, and p(R | θ) is defined as shown in formula (27):
$$p(R \mid \theta) = \prod_{u} \prod_{(e_i, e_j) \in R_u} p(e_i > e_j \mid \theta) \tag{27}$$
where R_u denotes the user-event pairs of user u, and p(e_i > e_j | θ) denotes the probability that, for user u, event e_i is ranked ahead of e_j, as shown in formula (28):
$$p(e_i > e_j \mid \theta) = \sigma\big(s(u, e_i) - s(u, e_j)\big) \tag{28}$$
where s(u, e) is the two-way preference score S_{u,e}.
Figure RE-GDA0002533677390000071
To make optimization more convenient, θ is assumed to follow a normal distribution with mean 0, and expanding the posterior gives the final optimization objective ln p(θ | R), as shown in formula (29):
Figure RE-GDA0002533677390000072
where λ denotes the regularization coefficient. The optimal weight parameter vector is obtained by maximizing the optimization objective on the implicit interaction feedback data of user events. The optimization problem is solved with the stochastic gradient descent algorithm SGD: in each iteration, a user-event pair of the target user is drawn at random from the training set to update the weight vector θ, as shown in formula (30):
Figure RE-GDA0002533677390000073
where α is the learning rate and s_ij = s(u, e_i) − s(u, e_j). Through this learning process, the weight vector θ is obtained automatically from the training set of user-event preference scores and the hyperparameters α and λ, which gives the two-way preference score S_{u,e}.
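The weight learning of step 3-1 can be sketched as a small stochastic gradient loop. Formulas (29) and (30) are not reproduced in the text, so the update used here is the usual BPR gradient step, θ ← θ + α((1 − σ(s_ij))(x_i − x_j) − λθ), where x = (S_user, S_events) is the score pair of an event; the exact regularization constant, the sampling scheme and all function names are assumptions.

```python
import math
import random


def sigmoid(x):
    """Logistic function used in the pairwise ranking probability."""
    return 1.0 / (1.0 + math.exp(-x))


def learn_bpr_weights(pairs, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Learn theta = [theta_1, theta_2] from pairwise implicit feedback.

    pairs: list of (pos, neg) tuples, where pos = (S_user, S_events) of an
    event the user attended and neg = (S_user, S_events) of one they did not.
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(epochs * len(pairs)):
        pos, neg = rng.choice(pairs)
        diff = [p - n for p, n in zip(pos, neg)]
        s_ij = sum(t * d for t, d in zip(theta, diff))  # s(u,e_i) - s(u,e_j)
        coeff = 1.0 - sigmoid(s_ij)                     # gradient of ln sigma(s_ij)
        theta = [t + lr * (coeff * d - reg * t) for t, d in zip(theta, diff)]
    return theta


def two_way_score(theta, s_user, s_events):
    """Two-way preference score S_{u,e} = theta_1 * S_user + theta_2 * S_events."""
    return theta[0] * s_user + theta[1] * s_events
```

After training, `two_way_score(theta, s_user, s_events)` can be evaluated for every candidate event of a user; attended events in the training pairs should then score higher than the events they were paired against.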
Step 3-2, solving a final recommendation score of the user-event pair by combining topic matching and two-way preference:
First, event topics are extracted with the LDA topic model and the topic matching degree score of users and events is obtained. Second, preference models of users and of events are constructed from user-event context information in the EBSN, and the two-way user-event preference score is obtained with the BPR learning algorithm. Finally, the topic matching degree score
Figure RE-GDA0002533677390000074
and the two-way user-event preference score S_{u,e} are combined by linear weighted summation to obtain the final user-event pair recommendation score S_Rec, as shown in formula (31):
Figure RE-GDA0002533677390000075
where γ is a weighting parameter; it is usually set manually from experience, and its optimal setting is determined experimentally.
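The final fusion and top-K selection of step 3-2 can be sketched as follows. Formula (31) is not reproduced in the text, so the sketch assumes that γ weights the topic matching score and (1 − γ) weights the two-way preference score; the function names and the tie-breaking by event identifier are also assumptions.

```python
def recommendation_score(topic_match, two_way, gamma=0.5):
    """Assumed form of formula (31): gamma * topic match + (1 - gamma) * two-way."""
    return gamma * topic_match + (1 - gamma) * two_way


def top_k_events(candidates, k, gamma=0.5):
    """Return the ids of the k highest-scoring candidate events for one user.

    candidates: dict mapping event id -> (topic_match_score, two_way_score).
    Ties are broken by event id so that the ordering is deterministic.
    """
    scored = {
        event: recommendation_score(topic, two_way, gamma)
        for event, (topic, two_way) in candidates.items()
    }
    return sorted(scored, key=lambda event: (-scored[event], event))[:k]
```

Setting `gamma=1.0` ranks events by topic match alone and `gamma=0.0` by two-way preference alone, which gives a simple way to inspect how much each component contributes.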
An implementation system for personalized event recommendation integrating topic matching and two-way preference is also provided, which is used to implement the personalized event recommendation method integrating topic matching and two-way preference described in any of the above, the implementation system comprising:
the document theme generation module is used for extracting themes of the user historical events and the new events, calculating theme distribution and word distribution of the events, expressing theme matching degree according to theme similarity between the user historical events and the new events, and fusing the theme matching degree into a recommendation model as one of recommended key factors to recommend the events;
the user preference building module is used for building the single-factor preference of the user from the three aspects of geographic position, social relation and time factor, and weighting and fusing the three single-factor preferences to obtain the overall preference of the user;
the event preference building module is used for expressing the preference of an event through the social influence of the event host within the group and the popularity of the event's hosting location within the group;
the user event bidirectional preference scoring module is used for solving the weight parameters of the user preference score and the event preference score by using a sequencing learning algorithm to obtain a user event bidirectional preference score;
and the final recommendation scoring module of the user-event pair is used for linearly weighting and combining the theme matching degree score and the two-way preference score to obtain the final recommendation degree score of the user-event pair.
Further, the user preference module comprises a geographic location preference module, a social relationship preference module, and a time factor preference module, and the event preference module comprises an event location popularity preference module and an event host influence preference module, wherein:
the geographic location preference module is used for expressing a geographic location preference score by predicting the probability of a user participating in an event held at a certain geographic location;
the social relationship preference module is used for calculating the social preference score of the target user from the relationship between the target user and the group and the relevance between the target user and the users in the group;
the time factor preference module is used for constructing unified vector representation of two granularities of date and hour, and calculating the similarity of a user-event pair as a time preference score of a target user;
the event location popularity preference module is used for calculating the popularity of the event's geographic location within the user group: the hosting location is an important factor when an interested user chooses a newly recommended event, and taking its popularity into account allows the attractiveness of the event to the user to be calculated more accurately;
the event host influence preference module is used for improving the recommendation accuracy according to the influence of the event host on the group where the target user is located, and calculating the influence of the event host on the target user and the influence of the event host in the group.
In the personalized event recommendation method and system integrating topic matching and two-way preference, first, the document topic generation model LDA is used to extract the topic information of events, user topic information is obtained from the historical record of events in which the user has participated, and the topic matching degree between user and event is calculated and used as an important recommendation factor in the recommendation model, since topic factors represent feature preferences well. Second, for event-based social network recommendation, preference models of users and of events are built from the two-way perspective of users and events, and the user preference score and the event preference score are obtained respectively, so that the preference relationship between users and events is mined more completely from both angles. Finally, the user-event topic matching degree and the two-way preference of users and events are combined by linear weighting to obtain the final comprehensive score of each user-event pair, and the top K (TOP-K) ranked user-event pairs are taken as the recommendation results. Extensive experiments on a real Meetup data set, with comparisons against other event recommendation algorithms, show that the algorithm performs better than traditional recommendation schemes and predicts well.
Drawings
Fig. 1 is a block diagram of an overall recommendation fusion framework of a personalized event recommendation method and system fusing topic matching and two-way preference according to an embodiment of the present invention.
Fig. 2 is a block diagram of the document topic generation model LDA of the personalized event recommendation method and system integrating topic matching and two-way preference according to the embodiment of the present invention.
Detailed Description
In this embodiment, a personalized event recommendation method combining topic matching and two-way preference is taken as an example, and the following describes the present invention in detail with reference to specific embodiments and accompanying drawings.
Referring to fig. 1 and fig. 2, a personalized event recommendation method and system combining topic matching and two-way preference according to an embodiment of the present invention are shown.
The software specifically explains the technical details related to the Personalized event recommendation system for fusing topic matching and two-way preference of the software, and the software mainly adopts an L DA topic model to calculate the topics of new events and user historical events, adopts cosine similarity to calculate the topic matching degree of user-event pairs, and respectively constructs a user preference model and an event preference model, wherein the user preference model calculates the comprehensive preference scores of users from three aspects of time, geography and social relations.
1. Recommendation framework fusing LDA topic matching and user-event two-way preference
Based on the current existing work, an event recommendation scheme combining user-event pair topic matching and user-event pair two-way preference is provided based on geographic position information, time information, social relations and other related user event context information in the EBSN. In the scheme, the influence of the theme matching degree, the user preference and the event preference on event recommendation is respectively considered, and the factors are fused to effectively recommend the interest events to the user. The general framework of the recommendation model is shown in fig. 1, and the specific recommendation process is as follows:
1) According to the description documents of events in the EBSN, the topics of new events and of the target user's historical events are calculated with the LDA topic model. The user's topics are represented by the topics of the user's historical events, and the semantic similarity between the topic distributions of the event and of the user is then calculated to obtain the user-event topic matching degree score.
2) The user preference score and the event preference score are calculated. The user preference is scored separately on geographic location, social relationship and time, and these scores are linearly fused. The event preference is represented by the popularity of the event's hosting location and the social influence of the event host, and the event preference score is obtained by linearly fusing the two. Note that when the popularity of the geographic location and the influence of the host are calculated, only the group where the target user is located and the users in that group are considered; associations with other users and groups are ignored, in order to improve recommendation performance and reduce computational complexity.
3) The calculations above yield the user-event topic matching degree score, the preference score of the user for the event, and the preference score of the event for the user. The weights of the user preference score and the event preference score are learned with the Bayesian personalized ranking (BPR) algorithm, and the two scores are fused according to these weights to obtain the two-way preference score. The topic matching degree score and the two-way preference score are then linearly combined to obtain the final user-event pair recommendation score, and the TOP-K events with the highest scores are recommended to the user.
2. Topic matching model based on LDA
In an event-based social network there is obvious semantic similarity between users and events: users often choose to participate in a certain type of event that interests them, and such events generally have similar attributes and topics. Applying event topics in recommendation captures the preferences of users and events better. User topics are represented by the topics of the historical events the user has participated in; the topic distribution and word distribution of new events are calculated; the topic matching degree is represented by the topic similarity between the user's historical events and the new event; and the topic matching degree is fused into the recommendation model as one of its key factors.
The core idea of LDA is that each document selects a topic with a certain probability and then selects a word from that topic with a certain probability. The topics in any document are considered to follow a Dirichlet distribution, and relationships between texts can be discovered through these distributions. LDA is a three-layer generative Bayesian network consisting of documents, topics and words, in which both the document-topic and the topic-word distributions are multinomial. The generation process of the LDA topic model is shown in Fig. 2.
Given a document set $D = \{d_1, d_2, \ldots, d_M\}$, $\vartheta$ and $\varphi$ in Fig. 2 respectively represent the topic distribution of document $d_i$ and the word distribution of a topic; $\alpha$ and $\beta$ are the empirically given hyperparameters of the prior distribution of topics and the prior distribution of words, respectively; $k$ is the number of topics specified in advance for the document set; $N_m$ is the number of words in document $d_i$; and $M$ is the number of documents in the document set. For each word in document $d_i$, LDA first determines the topic distribution $\vartheta$ of the document from the prior $\alpha$ and extracts a topic $z$ from $\vartheta$. It then determines the word distribution $\varphi$ of the current topic from the prior $\beta$ and extracts a word $w$ from the word distribution corresponding to topic $z$. Repeating this process $N_m$ times generates the document $d_i$. In this process, the topic distribution of document $d_i$ can be solved with the Gibbs sampling method.
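The generative process described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the symbol names `theta` (document-topic distribution) and `phi` (topic-word distributions), the toy vocabulary, and the hyperparameter values are hypothetical, and the sketch covers document generation only, not the Gibbs sampling inference.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one sample from Dir(alphas) by normalizing Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(n_words, vocab, phi, alpha, rng):
    """Generate one document of n_words words from an LDA-style model.

    phi   : list of k topic-word distributions (each sums to 1)
    alpha : symmetric Dirichlet prior on the document-topic distribution
    """
    k = len(phi)
    # Document-topic distribution drawn from the Dirichlet prior.
    theta = sample_dirichlet([alpha] * k, rng)
    words = []
    for _ in range(n_words):
        z = rng.choices(range(k), weights=theta)[0]   # pick a topic
        w = rng.choices(vocab, weights=phi[z])[0]     # pick a word from it
        words.append(w)
    return theta, words

rng = random.Random(0)
vocab = ["hiking", "trail", "python", "code", "wine", "tasting"]  # toy data
k = 3
# Topic-word distributions, one Dirichlet(beta) draw per topic (beta = 0.5).
phi = [sample_dirichlet([0.5] * len(vocab), rng) for _ in range(k)]
theta, doc = generate_document(8, vocab, phi, alpha=0.5, rng=rng)
```

In practice the direction is reversed: the words of each event description are observed, and the distributions are estimated from them.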
Topic similarity between users and events is calculated on the basis of LDA, which converts text content into semantic features. A topic distribution is calculated for each event with the LDA topic model. The event content mainly comprises the title and the description document, together with information such as time and venue, and event topics can be extracted from this content. The event document $d_i$ is modeled as shown in equation (1), and the two unknown parameters of the model, the event topic distribution $\vartheta$ and the topic-word distribution $\varphi$, are then estimated with the Gibbs sampling method.
Figure RE-GDA0002533677390000112
After the topic distribution and word distribution of the event documents are obtained through the LDA process, the similarity between events is calculated from their topic distributions with the JS divergence (Jensen-Shannon divergence). JS divergence is a variant of KL divergence (Kullback-Leibler divergence); it is symmetric, which resolves the asymmetry of KL divergence, and it measures the similarity of two probability distributions better. The topic distributions of all events are generated according to equation (1). Given events $e_{d_p}$ and $e_{d_q}$ with topic distributions $\vartheta_{d_p}$ and $\vartheta_{d_q}$ respectively, the JS divergence $D_{js}(\vartheta_{d_p} \| \vartheta_{d_q})$ between the two is calculated first, as shown in equation (2).
$$D_{js}(\vartheta_{d_p} \| \vartheta_{d_q}) = \frac{1}{2} D_{KL}\left(\vartheta_{d_p} \,\Big\|\, \frac{\vartheta_{d_p} + \vartheta_{d_q}}{2}\right) + \frac{1}{2} D_{KL}\left(\vartheta_{d_q} \,\Big\|\, \frac{\vartheta_{d_p} + \vartheta_{d_q}}{2}\right) \quad (2)$$
where $D_{js} \in [0,1]$ and $D_{KL}$ denotes the KL divergence, which describes the difference between two probability distributions $p$ and $q$; its calculation is shown in equation (3).
$$D_{KL}(p \| q) = \sum_{i} p_i \log_2 \frac{p_i}{q_i} \quad (3)$$
Combining equations (2) and (3), the topic similarity $S_{topic}$ of events $e_{d_p}$ and $e_{d_q}$ is obtained as shown in equation (4).
$$S_{topic}(e_{d_p}, e_{d_q}) = 1 - D_{js}(\vartheta_{d_p} \| \vartheta_{d_q}) \quad (4)$$
The topic similarity $S_{topic}$ takes values in $[0,1]$; the closer the value is to 1, the more similar the events are. As mentioned above, the topic similarity between a new event and the user's historical events is taken as the topic similarity between the user and the event. Because a user has usually participated in many events, there are multiple topic similarities with a new event. With $E_u$ denoting the number of historical events of the target user, the average $\bar{S}_{topic}$ of all these similarities is taken as the topic matching degree score of the user and the new event, as shown in equation (5).
$$\bar{S}_{topic}(u, e) = \frac{1}{E_u} \sum_{j=1}^{E_u} S_{topic}(e, e_j) \quad (5)$$
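The topic matching computation can be sketched in a few lines of Python. This is a hedged illustration, not the patented implementation: it assumes base-2 logarithms (so that the JS divergence stays in [0, 1]), assumes the similarity is one minus the JS divergence, and uses hypothetical function names and toy topic distributions.

```python
import math

def kl_divergence(p, q):
    """KL divergence D_KL(p || q) with base-2 logs; terms with p_i = 0 are 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Symmetric Jensen-Shannon divergence between two distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def topic_similarity(p, q):
    """Assumed form of the event-event similarity: 1 - JS divergence."""
    return 1.0 - js_divergence(p, q)

def topic_match_score(history, new_event):
    """Average similarity between a new event and each historical event."""
    sims = [topic_similarity(h, new_event) for h in history]
    return sum(sims) / len(sims)

# Toy topic distributions over two topics (hypothetical data).
history = [[1.0, 0.0], [0.0, 1.0]]
score = topic_match_score(history, [1.0, 0.0])
```

With these toy inputs one historical event matches the new event exactly and the other shares no topic with it, so the averaged score lies halfway between the two extremes.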
Algorithm 1 describes the process of calculating the topic matching degree of user-event pairs through the LDA topic model, where $\varphi$ denotes the topic-word distribution, $\vartheta$ denotes the document-topic distribution, Dir() denotes the Dirichlet distribution, Mult() denotes the multinomial distribution, and Poiss() denotes the Poisson distribution.
Figure RE-GDA0002533677390000121
First, all event descriptions are assembled into a document set, stop words are removed, and the document set is used as the input of the LDA model to obtain the topic distribution of each event (lines 2 to 11). Then the topic distribution similarity between each historical event of the target user and the new event is calculated with the JS divergence (lines 12 to 14). Finally, the similarities over all historical events of the target user are averaged to obtain the topic matching degree score of the user and the new event (lines 15 to 16).
3. User-based preference model
For user preferences, feature learning is generally performed on the user's context information, and the learned features are expressed as user preferences. Here the user's single-factor preferences are constructed from three aspects (geographic factors, social relations and time factors), and the three single-factor preferences are weighted and fused to obtain the user's overall preference.
3.1 geographic location preferences
The geographic location preference model calculates the probability that a target user will participate in an event held at a given location. The KDE (Kernel Density Estimation) method is used to model the two-dimensional geographic distribution of the events in which a user has participated, and the normalized event participation probability represents the user's preference for a geographic location. The longitude and latitude coordinates of an event's geographic location are denoted by $(Lx, Ly)$, and the set of places where user $u$ has historically participated in events is denoted by $L(u)$. The KDE function with respect to user $u$ is shown in equation (6).
Figure RE-GDA0002533677390000132
where $l_i = (Lx_i, Ly_i)^T$ is the two-dimensional vector of the longitude and latitude coordinates of an event location, $m_l(u, l_i)$ is the frequency with which user $u$ attended events hosted at geographic location $l_i$, $\sigma$ is the size of the neighborhood window (bandwidth), $N$ is the number of location samples, and $K(\cdot)$ is the Gaussian kernel function, defined as shown in equation (7).
Figure RE-GDA0002533677390000133
Combining equations (6) and (7), the probability that user $u$ will attend an event held at location $l$ can be defined as shown in equation (8).
Figure RE-GDA0002533677390000134
Normalizing this probability gives the preference score $S_G(u, l)$ of the user for the geographic location, as shown in equation (9).
Figure RE-GDA0002533677390000135
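A frequency-weighted Gaussian kernel density estimate of this kind might be sketched as follows. The exact forms of equations (6) to (9) are not reproduced here, so several choices are assumptions made only for illustration: the standard two-dimensional Gaussian kernel, taking $N$ as the total number of recorded visits, and normalizing by the highest density among the candidate locations. All function names and coordinates are hypothetical.

```python
import math

def gaussian_kernel(dx, dy):
    """Standard 2-D Gaussian kernel: exp(-(dx^2 + dy^2) / 2) / (2 * pi)."""
    return math.exp(-(dx * dx + dy * dy) / 2.0) / (2.0 * math.pi)

def kde_density(target, visits, sigma):
    """Frequency-weighted KDE at `target`.

    visits : dict mapping (lx, ly) -> number of events attended there
    sigma  : bandwidth (size of the neighborhood window)
    """
    n = sum(visits.values())  # assumption: N counts all recorded visits
    total = 0.0
    for (lx, ly), freq in visits.items():
        total += freq * gaussian_kernel((target[0] - lx) / sigma,
                                        (target[1] - ly) / sigma)
    return total / (n * sigma ** 2)

def geo_preference(candidates, visits, sigma):
    """Scale densities to [0, 1] by the best candidate (assumed normalization)."""
    dens = {loc: kde_density(loc, visits, sigma) for loc in candidates}
    top = max(dens.values())
    return {loc: (d / top if top > 0 else 0.0) for loc, d in dens.items()}

# Toy history: three events at one venue, one event at a nearby venue.
visits = {(0.0, 0.0): 3, (0.1, 0.0): 1}
scores = geo_preference([(0.0, 0.0), (5.0, 5.0)], visits, sigma=0.5)
```

A candidate venue close to places the user has visited often receives a score near 1, while a distant venue receives a score near 0.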
3.2 social relationship preferences
In a user social relationship network, a user typically joins at least one or more interest groups online and may choose to participate in event activities posted by different groups. In these group relationships, users usually select a preference group with the most interest to themselves to participate in, and members in the same group generally have the same interest, so that social relationship preferences of users can be considered through online same-group relationships of users, and two kinds of interaction relationships are mainly included.
1) Relevance of users to groups, i.e. the interaction between the user and all groups the user belongs to, and between the user and the events created within those groups. $G(u)$ denotes the set of groups to which the events attended by user $u$ belong, and the relevance of the user to a group can be expressed as shown in equation (10).
Figure RE-GDA0002533677390000137
where $m_p(u, g)$ denotes the set of events in group $g$ that user $u$ has attended.
2) Intra-group user relevance. The relevance of the users in the group is defined by the similarity of friends in the group where the target user is located, and the similarity s (u, g) between the target user and the users in the group is calculated, as shown in formula (11).
Figure RE-GDA0002533677390000138
where $sim(u_i, u_j)$ denotes the similarity between users $u_i$ and $u_j$ in the same group, as shown in equation (12).
Figure RE-GDA0002533677390000141
Finally, $s(u, g)$ is normalized, as shown in equation (13).
Figure RE-GDA0002533677390000143
Combining these two kinds of interaction: users belonging to the same or similar groups tend to attend the events created within those groups. The relevance of the user to the group and the relevance among users within the group are combined to derive the social preference score $S_I(u, g)$ of user $u$ with respect to online group $g$, as shown in equation (14).
Figure RE-GDA0002533677390000144
$\alpha \in [0,1]$ is a weight control parameter. In the social relationship network, the preference association between the target user and the group is considered as important as the association among the users within the group, and $\alpha$ is set to 0.5 after experimental verification.
3.3 time preference
The time factor of an event is another important preference factor that needs to be considered when calculating user preferences. Different users may have different preferences in selecting to attend an event, some users may prefer to attend an event in the evening, while others may prefer to attend an event in the morning, or may prefer different time points on a weekday or weekend. In reality time is periodic, mainly in periods of 7 days per week and 24 hours per day, creating user time preferences at two different levels of granularity for a user to choose to engage in an activity on a certain day of the week and on certain hours of the day. We express the user's temporal preferences by combining the user selections at two levels of granularity.
If a user chose a time period on a certain day of the week to attend an event, this may indicate an implicit time preference, and the user may choose to attend an event again in the same time period. To represent this implicit preference uniformly and intuitively, a new event $e$ that a user may choose to attend is represented as a $7 \times 24$-dimensional event time vector. When the new event takes place in a certain time period of the week, the vector component for that period is set to 1, and otherwise to 0. In the time preference model, a user can thus be represented as a user time vector based on the historical records of events the user attended, as shown in equation (15).
Figure RE-GDA0002533677390000147
where $E_u$ denotes the set of historical events attended by the target user. The cosine similarity $s(u, e)$ between the user time vector and the new event time vector is then calculated, as shown in equation (16).
Figure RE-GDA0002533677390000148
For a new event $e$ and each user $u_i \in U$, the similarity $s(u_i, e)$ is obtained from equation (16); normalizing it gives the time preference score $S_T(u_i, e)$ of the user for the event, as shown in equation (17).
Figure RE-GDA0002533677390000151
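The time preference computation can be sketched in Python as below. This is an illustration under assumptions: the user time vector is taken to be the average of the one-hot vectors of the user's historical events (the exact form of equation (15) is not reproduced here), the final normalization step of equation (17) is omitted, and the function names and the day/hour encoding (Monday = 0) are hypothetical.

```python
import math

def time_vector(day, hour):
    """One-hot 7*24 vector for an event; day in 0..6 (Monday = 0), hour in 0..23."""
    v = [0.0] * (7 * 24)
    v[day * 24 + hour] = 1.0
    return v

def user_time_vector(history):
    """Assumed user profile: average of the time vectors of past events."""
    vecs = [time_vector(d, h) for d, h in history]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

# Toy history: two Saturday 19:00 events and one Wednesday 09:00 event.
history = [(5, 19), (5, 19), (2, 9)]
profile = user_time_vector(history)
sim_saturday_evening = cosine(profile, time_vector(5, 19))
sim_monday_morning = cosine(profile, time_vector(0, 8))
```

A new event in a slot the user has favored before yields a high similarity, and an event in a slot the user has never used yields zero.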
3.4 user fusion preference Scoring
Based on the single-factor preference models built from the three aspects above, the user's preference scores for geographic location, social relationship and time are calculated respectively. For geographic location, the preference score is represented by the predicted probability that the user attends an event hosted at the location. For social relationship, the social preference score of the target user is calculated from the relationship between the target user and the group and the relevance between the target user and the users in the group. For time, a unified vector representation at day and hour granularity is constructed, and the similarity of the user-event pair computed on it is taken as the time preference score of the target user. The three single-factor preferences together form a user preference perception model, and they are linearly combined to obtain the total preference score $S_{user}$ of user $u$ for event $e$, as shown in equation (18).
Figure RE-GDA0002533677390000152
where $S_G$, $S_I$ and $S_T$ respectively denote the user's preference scores on the three factors of geographic location, social relationship and time. Algorithm 2 describes the calculation of the user preference score.
Figure RE-GDA0002533677390000153
Algorithm 2 presents the process of solving the user's comprehensive preference score from the user's preferences on three factors: geographic location, social relationship and time. The probability that the user attends an event held at a specific location is predicted with the kernel density estimation algorithm and, after normalization, represents the user's geographic preference (line 3). The social preference is calculated from the user's social relevance to the online group and to the members of the group according to equations (10) and (13) (lines 5 to 11). The new event and the user's historical events are represented as time vectors, and their cosine similarity represents the user's time preference (line 4). Finally, the three preference values are linearly combined to obtain the user's total preference score (lines 13 to 14).
4. Event-based preference model
For the preferences of events, learning from event host and event ontology information is considered. Because an event lacks active personalized context information compared to the user, it does not have history, personalized tags, etc. for new events, and thus the preference of the event is expressed in terms of the social impact of the event host in the group, and the popularity of the event host's geographic location in the group.
4.1 event location popularity
The geographic location at which an event is hosted is one consideration in whether a user chooses to attend it. An online group that users join is generally made up of users with the same interests, and many of them may choose to attend the same events. For a new event, its venue can therefore serve as an important basis for selection by interested users; this relationship is called the popularity of the geographic location within the user group. Considering the popularity of an event's geographic location in the event preference model allows the attractiveness of the event to a user to be calculated more accurately. The popularity of a geographic location is calculated from the frequency with which user $u$ and the users of the online group $g$ that $u$ has joined have visited the place.
First, the popularity $p(l_e, u)$ of an event's geographic location $l_e$ with respect to user $u$ is defined, as shown in equation (19).
Figure RE-GDA0002533677390000161
where the numerator $m_l(u, l_e)$ is the frequency with which user $u$ has attended events held at geographic location $l_e$, and the denominator is the maximum frequency among the locations historically visited by user $u$. Likewise, the popularity $p(l_e, g)$ of geographic location $l_e$ with respect to the group $g$ of user $u$ can be defined, as shown in equation (20).
Figure RE-GDA0002533677390000162
where the numerator is the frequency with which the users in group $g$ have attended events at location $l_e$, and the denominator is the maximum frequency among the locations historically visited by the group members; this gives the popularity of geographic location $l_e$ among the users of group $g$. Combining $p(l_e, u)$ and $p(l_e, g)$, the total popularity of the hosting location of the event to be recommended to target user $u$ is defined as $P(l_e, u, g)$, as shown in equation (21).
$$P(l_e, u, g) = \alpha\, p(l_e, u) + (1 - \alpha)\, p(l_e, g) \quad (21)$$
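The location popularity terms and their combination in equation (21) can be sketched as follows. The user-level ratio follows the verbal description of equation (19); the group-level term assumes that the visit counts of all group members are summed per location before the ratio is taken, which is one plausible reading of equation (20). The function names, the toy data and the choice of $\alpha = 0.5$ are hypothetical.

```python
def popularity_user(loc, user_visits):
    """Visits by the user to `loc`, divided by the user's most-visited count."""
    if not user_visits:
        return 0.0
    return user_visits.get(loc, 0) / max(user_visits.values())

def popularity_group(loc, group_visits):
    """Same ratio over the summed visit counts of all group members (assumed)."""
    totals = {}
    for visits in group_visits:
        for place, count in visits.items():
            totals[place] = totals.get(place, 0) + count
    if not totals:
        return 0.0
    return totals.get(loc, 0) / max(totals.values())

def location_popularity(loc, user_visits, group_visits, alpha):
    """Equation (21): weighted sum of user-level and group-level popularity."""
    return (alpha * popularity_user(loc, user_visits)
            + (1 - alpha) * popularity_group(loc, group_visits))

# Toy data: visit counts per venue for the target user and two group members.
user_visits = {"park": 2, "cafe": 4}
group_visits = [{"park": 3}, {"cafe": 1, "park": 1}]
total_popularity = location_popularity("park", user_visits, group_visits, alpha=0.5)
```

Here the park is only moderately popular with the target user but is the group's most visited venue, so the combined popularity falls between the two values.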
4.2 event host influence
In an event-based social network, the initiator of each event is also an ordinary user of the network. If an event initiated by a host is well received, users who attended it are likely to attend again when that host initiates new events. Although the event to be recommended is an entirely new event that has not yet taken place for any user, its host may be an active host of that type of event who has previously hosted many events, which provides additional auxiliary information for addressing the cold start problem in event recommendation. The influence of the event host on the users of the group is therefore an important feature of event preference, and the present scheme uses the influence of the event host on the group where the target user is located to improve recommendation accuracy. This influence can be considered from the following two aspects.
1) The degree of influence of the event host on the target user. The event social network does not have scoring information of the event by the user, the influence of the host and the event cannot be visually represented, and the event cannot be scored again when the life cycle of the event is ended. First, the influence I (e, u) of the event on the user u is defined, as shown in equation (22).
Figure RE-GDA0002533677390000171
where $m_h(u, u_h)$ denotes the set of events held by host $u_h$ that user $u$ has attended, and $E_h$ is the set of all events held by host $u_h$.
2) The influence of the event host in the team. For the online group where the target user is located, the influence of the event in the group can be similarly expressed by the frequency proportion of the users participating in the group, and the influence of the users in the group is expressed by I (e, g), as shown in equation (23).
Figure RE-GDA0002533677390000172
where $U_g$ denotes the set of users in group $g$, $m_h(u_i, u_h)$ denotes the set of events held by host $u_h$ that user $u_i$ has attended, and $E_h(g)$ denotes the set of events held by $u_h$ in group $g$. Combining the influence of the event host on the target user and on the users in the group gives the comprehensive influence score $I(e, u, g)$ of the event host, as shown in equation (24).
$$I(e, u, g) = \alpha\, I(e, u) + (1 - \alpha)\, I(e, g) \quad (24)$$
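A minimal Python sketch of the host influence score in equation (24) follows. The two component formulas are assumptions inferred from the symbol definitions rather than reproductions of equations (22) and (23): the user-level influence is taken as the fraction of the host's events that the user attended, and the group-level influence as the average, over group members, of the fraction of the host's in-group events each member attended. Names and data are hypothetical.

```python
def influence_on_user(attended, host_events):
    """Assumed form: share of the host's events that the user attended."""
    if not host_events:
        return 0.0
    return len(attended & host_events) / len(host_events)

def influence_in_group(member_attendance, host_events_in_group):
    """Assumed form: mean attendance share over all members of the group."""
    if not member_attendance or not host_events_in_group:
        return 0.0
    ratios = [len(events & host_events_in_group) / len(host_events_in_group)
              for events in member_attendance.values()]
    return sum(ratios) / len(ratios)

def host_influence(attended, host_events, member_attendance,
                   host_events_in_group, alpha):
    """Equation (24): weighted sum of user-level and group-level influence."""
    return (alpha * influence_on_user(attended, host_events)
            + (1 - alpha) * influence_in_group(member_attendance,
                                               host_events_in_group))

# Toy data: the host held four events, two of them inside the target group.
host_events = {"e1", "e2", "e3", "e4"}
host_events_in_group = {"e1", "e2"}
attended = {"e1", "e2", "x9"}                    # events the target user attended
members = {"a": {"e1"}, "b": {"e1", "e2"}}       # attendance of group members
influence = host_influence(attended, host_events, members,
                           host_events_in_group, alpha=0.5)
```

Because only set sizes are used, the sketch needs no rating data, which matches the implicit-feedback setting described above.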
4.3 event potential preference Scoring
For a new event that has not yet taken place, the two key factors taken to attract users to attend are its geographic location and the influence of its host. The preference of an event is expressed through the geographic location popularity of the new event and the social influence of its host. To reduce computational complexity and avoid interference from weakly related data, the popularity of the event's geographic location and the social influence of the host are restricted to the group where the target user is located; the correlations with all other users and groups are assumed to be zero and to have no effect on event preference. The event geographic location popularity $P(l_e, u, g)$ and the host influence $I(e, u, g)$ constructed above are linearly combined to obtain the preference score $S_{events}$ of event $e$ for user $u$, as shown in equation (25).
Figure RE-GDA0002533677390000173
Algorithm 3 details the process of calculating an event potential preference score by event location popularity and host influence.
Figure RE-GDA0002533677390000181
Algorithm 3 presents the process of solving the event potential preference score from the event's geographic location popularity and host influence. For the group where the target user is located, the popularity of the event's geographic location with respect to the user and to the group is calculated according to equations (19) and (20), and the two are combined into the total popularity of the event's geographic location (lines 3 to 8). Similarly, the influence of the event host on the user and on the group is obtained from equations (22) and (23) (lines 9 to 13), and the two are combined into the influence of the event host. Finally, the location popularity and the host influence are linearly combined to obtain the potential preference score of the event (line 17).
5. Recommendation algorithm fusing topic matching and user event two-way preference
The topic distributions of users and events are solved with the LDA topic model, and the topic matching degree of user-event pairs is calculated from them. Feature preference scoring models are constructed for users and for events, and the user preference score and the event preference score are solved respectively.
1) Two-way preference scoring of user-event pairs. Suppose the preference scoring weights of the user and the event are $\theta_1$ and $\theta_2$; weighting and fusing the two scores gives the user-event two-way preference score $S_{u,e} = \theta_1 S_{user} + \theta_2 S_{events}$. The key problem of two-way preference scoring is then to find the weight vector of the two preference scores, which is learned using implicit feedback as training data. Unlike explicit feedback, in which a user rates an item, implicit feedback in an event-based social network can only be expressed through the interaction between user and event: if the user attended the event the feedback is 1, otherwise it is 0. Obviously, the user's feedback is 0 for all new events.
A learning algorithm BPR based on Bayesian maximum likelihood estimation is selected to carry out sequencing learning on the weights, and the correct sequencing sequence of the user-event pairs is learned according to implicit feedback data of the user on the events, so that the events participated by the user are arranged in front of new events or other events. First, a maximum posterior probability p (θ | R) is defined as shown in equation (26).
$$p(\theta \mid R) \propto p(R \mid \theta)\, p(\theta) \quad (26)$$
Where θ represents a weight vector, R represents a set of all user-event pairs, and p (R | θ) is defined as shown in equation (27).
Figure RE-GDA0002533677390000191
where $R_u$ denotes the user-event pairs of user $u$, and $p(e_i > e_j \mid \theta)$ denotes the probability that, for user $u$, event $e_i$ is ranked ahead of $e_j$, as shown in equation (28).
$$p(e_i > e_j \mid \theta) = \sigma\big(s(u, e_i) - s(u, e_j)\big) \quad (28)$$
where $s(u, e)$ is the two-way preference score $S_{u,e}$.
Figure RE-GDA0002533677390000192
To make the optimization more convenient, $\theta$ is assumed to follow a normal distribution with mean 0, and the final optimization objective $\ln p(\theta \mid R)$ is derived by expansion, as shown in equation (29).
Figure RE-GDA0002533677390000193
Where λ represents the regular term coefficient. And (4) maximizing an optimization objective function through implicit interaction feedback data of the user event to obtain an optimal weight parameter vector. The optimization problem is solved by adopting a Stochastic Gradient Descent (SGD) algorithm, and a user-event pair of a target user is randomly extracted from a training set in an iterative process to update a weight vector theta, wherein the updating process is shown as a formula (30).
Figure RE-GDA0002533677390000194
where $\alpha$ is the learning rate and $s_{ij} = s(u, e_i) - s(u, e_j)$. Through this learning process, the weight vector $\theta$ is obtained automatically from the training set of user and event preference scores and the hyperparameters $\alpha$ and $\lambda$, which yields the two-way preference score $S_{u,e}$.
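The weight learning step can be sketched with a small stochastic-gradient loop in Python. This follows the standard BPR update (gradient ascent on the log-sigmoid of the score difference with an L2 penalty), which is assumed here to correspond to the update described above; the function names, the training pairs and the hyperparameter values are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score(theta, feats):
    """Two-way preference score: theta_1 * S_user + theta_2 * S_events."""
    return theta[0] * feats[0] + theta[1] * feats[1]

def train_bpr(pairs, lr=0.1, lam=0.01, steps=500, seed=0):
    """Learn theta from (attended, not attended) feature pairs.

    pairs : list of (pos_feats, neg_feats); each feats = (S_user, S_events)
    lr    : learning rate; lam : regularization coefficient
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(steps):
        pos, neg = rng.choice(pairs)             # sample one training pair
        s_ij = score(theta, pos) - score(theta, neg)
        g = 1.0 - sigmoid(s_ij)                  # derivative of ln(sigmoid(s_ij))
        for k in range(2):
            theta[k] += lr * (g * (pos[k] - neg[k]) - lam * theta[k])
    return theta

# Toy pairs: the attended event has the clearly higher user preference score.
pairs = [((0.9, 0.5), (0.2, 0.5)), ((0.8, 0.3), (0.1, 0.4))]
theta = train_bpr(pairs)
```

After training, the learned weights rank each attended event ahead of the event it was paired with, which is the ordering the implicit feedback encodes.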
2) Combining the above discussion of topic matching and of the preference calculation for users and events: first, event topics are extracted with the LDA topic model and the topic matching degree score of the user and the event is obtained; second, preference models of the user and the event are constructed from the user and event context information in the EBSN, and the user-event two-way preference score is obtained with the BPR learning algorithm; finally, the topic matching degree score and the user-event two-way preference score $S_{u,e}$ are combined by linear weighted summation to obtain the final user-event pair recommendation score $S_{Rec}$, as shown in equation (31).
Figure RE-GDA0002533677390000202
where $\gamma$ is a weighting parameter that is typically set manually from experience; its optimal setting is determined experimentally. Algorithm 4 describes the process of fusing topic matching and two-way preference to solve the final recommendation score of a user-event pair.
Figure RE-GDA0002533677390000203
Algorithm 4 presents the process that finally fuses the topic matching score and the user-event two-way preference score. First, ranking learning is performed with the Bayesian personalized ranking algorithm on the training set generated from the user preference score set and the event preference score set to obtain the optimal weight vector $\theta$, and the two-way preference score of each user-event pair of the target user is calculated from $\theta$ (lines 2 to 10). Second, the topic matching degree score and the two-way preference score of the user-event pair are linearly combined to obtain the final recommendation score (lines 11 to 13), and the TOP-K events are recommended to the user according to this score.
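The final fusion and TOP-K selection can be sketched as follows. The weighting form, with $\gamma$ on the topic matching score and $1 - \gamma$ on the two-way preference score, is an assumption made for illustration, since only a linear weighted combination is stated; the function name, event identifiers and score values are hypothetical.

```python
import heapq

def recommend_top_k(candidates, gamma, k):
    """Rank candidate events by the fused recommendation score.

    candidates : dict mapping event id -> (topic_match_score, two_way_score)
    gamma      : weight of the topic matching score (assumed convex weighting)
    """
    fused = {event: gamma * topic + (1 - gamma) * two_way
             for event, (topic, two_way) in candidates.items()}
    # nlargest returns (event, score) pairs sorted from best to worst.
    return heapq.nlargest(k, fused.items(), key=lambda item: item[1])

# Toy candidates for one target user.
candidates = {"e1": (0.9, 0.2), "e2": (0.4, 0.8), "e3": (0.1, 0.1)}
top_two = recommend_top_k(candidates, gamma=0.5, k=2)
```

Using `heapq.nlargest` avoids sorting the full candidate list when K is small relative to the number of candidate events.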
The sections above have presented in detail a personalized event recommendation scheme that combines topic matching with user-event two-way preference.
An implementation system for personalized event recommendation fusing topic matching and two-way preference is also provided, which is used for implementing the personalized event recommendation method fusing topic matching and two-way preference as described in any one of the above, the implementation system comprising:
the document theme generation module is used for extracting themes of the user historical events and the new events, calculating theme distribution and word distribution of the events, expressing theme matching degree according to theme similarity between the user historical events and the new events, and fusing the theme matching degree into a recommendation model as one of recommended key factors to recommend the events;
the user preference building module is used for building the single-factor preference of the user from the three aspects of geographic position, social relation and time factor, and weighting and fusing the three single-factor preferences to obtain the overall preference of the user;
the event preference building module is used for expressing the preference of an event by the social influence of the event host in the group and the popularity of the event's hosting location in the group;
the user event bidirectional preference scoring module is used for solving the weight parameters of the user preference score and the event preference score by using a sequencing learning algorithm to obtain a user event bidirectional preference score;
and the final recommendation scoring module of the user-event pair is used for linearly weighting and combining the theme matching degree score and the two-way preference score to obtain the final recommendation degree score of the user-event pair.
Further, the user preference module comprises a geographic location preference module, a social relationship preference module, and a time factor preference module, and the event preference module comprises an event location popularity preference module and an event host influence preference module, wherein:
the geographic location preference module is used for expressing a geographic location preference score by predicting the probability of a user participating in an event held at a certain geographic location;
the social relationship preference module is used for calculating the social preference score of the target user from the relationship between the target user and the group and the relevance between the target user and the users in the group;
the time factor preference module is used for constructing unified vector representation of two granularities of date and hour, and calculating the similarity of a user-event pair as a time preference score of a target user;
the event location popularity preference module is used for selecting important places for interested users when recommending new events, is called the popularity of the geographic location in a user group, and can calculate the attractiveness of the event to the users more accurately by considering the popularity of the event geographic location;
the event host influence preference module is used for improving the recommendation accuracy according to the influence of the event host on the group where the target user is located, and calculating the influence of the event host on the target user and the influence of the event host in the group.
According to the personalized event recommendation method and system fusing topic matching and two-way preference, first, the topic information of events is extracted with the document topic generation model LDA, user topic information is obtained from the historical records of events the user has participated in, and the topic matching degree between the user and an event is calculated as an important recommendation factor in the recommendation model, since topic factors better represent characteristic preferences. Second, for event-based social network recommendation, preference models of the user and of the event are built from the two-way perspective of users and events, and the user preference score and the event preference score are obtained respectively, so that the preference relationship between users and events is mined more completely from both angles. Finally, the topic matching degree of the user-event pair and the two-way preference of the user and the event are combined by linear weighting to obtain the final comprehensive score of the user-event pair, and the top-K ranked user-event pairs are taken as the recommendation result. Extensive experiments on a real Meetup data set indicate that, compared with other event recommendation algorithms, the method performs better than traditional recommendation algorithms and predicts users' personalized preferences better.
It should be noted that the above-mentioned embodiments are only preferred embodiments of the present invention, and are not intended to limit the present invention, and those skilled in the art can make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A personalized event recommendation method fusing topic matching and two-way preference is characterized by comprising the following steps:
step one, extracting topic information of an event by using a document topic generation model LDA, obtaining user topic information according to the historical record of events participated in by a user, calculating the topics of a new event and of the user's historical events, and calculating a topic matching degree score of a user-event pair by adopting cosine similarity;
step two, respectively constructing a user preference model and an event preference model, and respectively calculating a user preference score and an event preference score;
and step three, learning the weight parameters of the user preference score and the event preference score by using the Bayesian personalized ranking algorithm BPR to obtain a user-event two-way preference score, linearly weighting and combining the topic matching degree score and the two-way preference score to obtain the final recommendation score of each user-event pair, and recommending the top K ranked events to the user.
2. The method for recommending personalized events with fused topic matching and two-way preferences according to claim 1, wherein said document topic generation model LDA in step one has a three-layer Bayesian network structure comprising documents, topics and words, wherein the document-topic and topic-word distributions are multinomial distributions, each document selects a topic with a certain probability and selects a word from this topic with a certain probability, the topics in any document follow a Dirichlet distribution, and relationships between texts are discovered through these distributions.
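As a non-limiting illustration, the generative process described in this claim can be sketched in Python with NumPy. The prior `alpha`, the topic-word matrix `topic_word`, and the function name `generate_document` are hypothetical values and names chosen only for this sketch; they are not taken from the patent.

```python
import numpy as np

def generate_document(n_words, alpha, topic_word, rng):
    """Sample one document following the LDA generative process."""
    # Topic proportions of the document are drawn from a Dirichlet prior.
    theta = rng.dirichlet(alpha)
    n_topics, vocab_size = topic_word.shape
    topics, words = [], []
    for _ in range(n_words):
        # Choose a topic from the document's multinomial topic distribution.
        z = int(rng.choice(n_topics, p=theta))
        # Choose a word from the multinomial word distribution of that topic.
        w = int(rng.choice(vocab_size, p=topic_word[z]))
        topics.append(z)
        words.append(w)
    return theta, topics, words

rng = np.random.default_rng(0)
alpha = np.array([0.5, 0.5, 0.5])                # hypothetical prior, 3 topics
topic_word = rng.dirichlet(np.ones(6), size=3)   # hypothetical 3 topics x 6 words
theta, topics, words = generate_document(10, alpha, topic_word, rng)
```

In practice the inverse problem is solved: the topic distributions are inferred from observed event descriptions, for which the claims name Gibbs sampling.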
3. The personalized event recommendation method combining topic matching and two-way preference according to claim 2, wherein the step one of calculating the topics of the new event and the historical events of the user and calculating the topic matching degree score of the user-event pair by using cosine similarity comprises the following specific steps:
step 1-1, forming a document set D from all event description contents, removing stop words, inputting the document set D into the document topic generation model LDA, and obtaining the topic distribution of each event;
removing stop words and punctuation marks from all event contents, taking the denoised document contents as the set D of all documents, and inputting the set D into the LDA topic model, wherein the joint distribution $p(\omega, z \mid \alpha)$ of the topics and words of a generated document $D_i$ is given by formula (1):
Figure FDA0002376881330000011
two unknown parameters in the model are then estimated using the Gibbs sampling method: the event topic distribution
Figure FDA0002376881330000012
and the topic-word distribution v;
step 1-2, calculating topic distribution similarity between the historical event and the new event of the target user according to a JS divergence algorithm;
a topic distribution for all events has been generated according to formula (1):
Figure FDA0002376881330000013
given events $e_{dp}$ and $e_{dq}$, which respectively have the topic distributions
Figure FDA0002376881330000014
the JS divergence between the two is first calculated by the JS divergence method,
Figure FDA0002376881330000015
as shown in formula (2):
Figure FDA0002376881330000016
wherein $D_{JS} \in [0,1]$, and $D_{KL}$ denotes the KL divergence, which describes the difference between two probability distributions p and q; its calculation formula is shown in formula (3):
Figure FDA0002376881330000021
combining formula (2) and formula (3) gives the topic similarity $S_{topic}$ of events $e_{dp}$ and $e_{dq}$, as shown in formula (4):
Figure FDA0002376881330000022
wherein the topic similarity $S_{topic}$ of the events takes values in $[0,1]$; the closer the value is to 1, the higher the event similarity;
step 1-3, averaging the similarity of all historical events of a target user to obtain a theme matching degree score of the user and a new event;
with $E_u$ representing the number of historical events of the target user, taking the average of all the similarities of the target user,
Figure FDA0002376881330000023
as the topic matching degree score of the user and the new event, as shown in formula (5):
Figure FDA0002376881330000024
according to the topic matching model thus constructed, the score
Figure FDA0002376881330000025
is finally used to measure the topic matching relationship between the target user and the new event.
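As a non-limiting illustration, steps 1-2 and 1-3 can be sketched in Python with NumPy. The KL and JS divergences below are the standard definitions with base-2 logarithms, which keeps the JS divergence in [0, 1]. Taking the similarity as one minus the JS divergence is an assumption of this sketch, since formula (4) is only available as an image; the function names are likewise hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence D_KL(p || q) in bits; terms with p = 0 contribute 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def js_divergence(p, q):
    """Jensen-Shannon divergence, symmetric and bounded by [0, 1]."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def topic_similarity(p, q):
    # ASSUMPTION: similarity = 1 - JS divergence (formula (4) is not shown).
    return 1.0 - js_divergence(p, q)

def topic_match_score(history_topics, new_event_topics):
    """Average similarity between a new event and all historical events."""
    sims = [topic_similarity(h, new_event_topics) for h in history_topics]
    return float(np.mean(sims))

history = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]   # hypothetical topic distributions
score = topic_match_score(history, [0.65, 0.25, 0.10])
```

Identical distributions give a similarity of 1 and distributions with disjoint support give 0, which matches the stated range of the topic similarity.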
4. The method for recommending personalized events fusing topic matching and two-way preferences according to claim 3, wherein the constructing of the user preference model in the second step constructs the single-factor preferences of the user from three aspects of geographic location, social relationship and time factor, which specifically comprises:
step 2-1-1, constructing a geographic position preference model:
the geographic location preference model calculates the probability that a target user will participate in an event held at a given location; a Kernel Density Estimation (KDE) method is adopted to model the two-dimensional geographic distribution of the events participated in by the user, and the normalized event participation probability is used to represent the user's degree of preference for the geographic location; the longitude and latitude coordinates of the geographic location of an event are represented by $(Lx, Ly)$, the set of places of the user's historical events is represented by $L(u)$, and the KDE function for user u,
Figure FDA0002376881330000026
is as shown in formula (6):
Figure FDA0002376881330000027
wherein $l_i = (Lx_i, Ly_i)^T$ is the two-dimensional vector of the longitude and latitude coordinates of an event location, $m_l(u, l_i)$ denotes the frequency with which user u attended events held at geographic location $l_i$, $\sigma$ denotes the size of the neighborhood window (bandwidth), N denotes the number of location samples, and $K(\cdot)$ denotes a Gaussian kernel function, defined as shown in formula (7):
Figure FDA0002376881330000028
combining equations (6) and (7) may define the probability that user u will attend an event that will be held at location l, as shown in equation (8):
Figure FDA0002376881330000029
normalizing the probability gives the user's geographic location preference score $S_G(u, l)$, as shown in formula (9):
Figure FDA0002376881330000031
wherein the denominator represents the maximum event participation probability of the target user;
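As a non-limiting illustration, step 2-1-1 can be sketched in Python with NumPy. Formulas (6) to (9) are only available as images, so the frequency-weighted density, the choice of N as the total visit count, and the normalization over a set of candidate locations are all assumptions of this sketch; the Gaussian kernel is the standard two-dimensional one, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(x):
    """Standard 2-D Gaussian kernel evaluated row-wise on x."""
    return np.exp(-0.5 * np.sum(x ** 2, axis=-1)) / (2.0 * np.pi)

def kde_density(loc, visited, freq, sigma):
    """Frequency-weighted kernel density of one location for one user.

    visited: (n, 2) coordinates of places the user attended events at
    freq:    (n,)   attendance counts m_l(u, l_i) for those places
    sigma:   bandwidth of the kernel
    """
    visited = np.asarray(visited, dtype=float)
    freq = np.asarray(freq, dtype=float)
    diffs = (np.asarray(loc, dtype=float) - visited) / sigma
    n = freq.sum()  # ASSUMPTION: N is the total number of visit samples
    return float(np.sum(freq * gaussian_kernel(diffs)) / (n * sigma ** 2))

def geo_preference(candidates, visited, freq, sigma=1.0):
    """Densities of candidate locations divided by the maximum density."""
    dens = np.array([kde_density(c, visited, freq, sigma) for c in candidates])
    return dens / dens.max()

visited = [(0.0, 0.0), (1.0, 1.0)]   # hypothetical coordinates
freq = [3, 1]                        # hypothetical attendance counts
scores = geo_preference([(0.0, 0.0), (5.0, 5.0)], visited, freq)
```

A candidate close to frequently visited places receives a score near 1, while a distant candidate receives a score near 0.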
step 2-1-2, constructing a social relationship preference model:
in a user social relationship network, a user can join one or more interest groups online and choose to participate in events published by different groups, and the social relationship preference of the user is judged through the user's online same-group relationships, wherein the same-group relationship mainly comprises two interactive relationships;
first, the relevance of a user to a group is defined as the interaction between the user and all groups the user belongs to and between the user and the events created in those groups; $G(u)$ denotes the set of groups to which the events attended by user u belong, so the relevance of the user to the group,
Figure FDA0002376881330000032
can be expressed as shown in formula (10):
Figure FDA0002376881330000033
wherein $m_p(u, g)$ denotes the set of events in group g in which user u has participated;
second, the relevance of users in the group is defined by the similarity of friends in the group where the target user is located, and the similarity s (u, g) between the target user and the users in the group is calculated, as shown in formula (11):
Figure FDA0002376881330000034
wherein $sim(u_i, u_j)$ denotes the similarity between user $u_i$ and user $u_j$ in the same group, as shown in formula (12):
Figure FDA0002376881330000035
$s(u, g)$ is then normalized to
Figure FDA0002376881330000036
as shown in formula (13):
Figure FDA0002376881330000037
combining the two interactions described above, users belonging to the same group tend to attend events created by other users within those groups; the relevance of the user to the group and the relevance among users within the group are combined to derive the social preference score $S_I(u, g)$ of user u with respect to online group g, as shown in formula (14):
Figure FDA0002376881330000038
wherein $\alpha \in [0,1]$ is a weight control parameter; in the social relationship network, the preference association between the target user and the group is treated as equally important as the association among the users in the group, and the value of $\alpha$ is set to 0.5 through experimental verification;
step 2-1-3, constructing a time factor preference model:
the time factor of the event is an important preference factor that needs to be considered when calculating the preference of the user; a new event e that the user can choose to attend is represented as a 7 x 24 dimensional event time vector
Figure FDA0002376881330000039
in which, when the new event occurs in a specific time period of the week, the vector component of that time period is set to 1, and otherwise to 0; in the time preference model, the user is represented as a user time vector based on the historical records of events the user participated in,
Figure FDA0002376881330000046
as shown in formula (15):
Figure FDA0002376881330000041
wherein $E_u$ denotes the set of historical events participated in by the target user; the cosine similarity $s(u, e)$ between the user time vector and the new event time vector is then calculated, as shown in formula (16):
Figure FDA0002376881330000042
for a new event e and a user $u_i \in U$, the similarity $s(u_i, e)$ is obtained from formula (16), and normalizing this similarity gives the time preference score $S_T(u_i, e)$ of the user for the event, as shown in formula (17):
Figure FDA0002376881330000043
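As a non-limiting illustration, step 2-1-3 can be sketched in Python with NumPy. The 7 x 24 one-hot event vector and the cosine similarity follow the text; averaging the historical event vectors to form the user vector and dividing by the maximum similarity over all users are assumptions of this sketch, since formulas (15) and (17) are only available as images. The function names are hypothetical.

```python
import numpy as np

def event_time_vector(weekday, hour):
    """7 x 24 one-hot vector: 1 in the weekly slot where the event occurs."""
    vec = np.zeros(7 * 24)
    vec[weekday * 24 + hour] = 1.0
    return vec

def user_time_vector(history):
    # ASSUMPTION: the user vector is the mean of the historical event vectors.
    return np.mean([event_time_vector(d, h) for d, h in history], axis=0)

def cosine_similarity(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def time_preference(user_histories, new_event):
    """Similarity of each user to the new event, scaled by the maximum."""
    e_vec = event_time_vector(*new_event)
    sims = np.array([cosine_similarity(user_time_vector(h), e_vec)
                     for h in user_histories])
    # ASSUMPTION: normalization divides by the largest similarity over users.
    return sims / sims.max() if sims.max() > 0 else sims

# Hypothetical histories as (weekday, hour) pairs, weekday 0 = Monday.
histories = [[(5, 19), (5, 19), (2, 12)], [(0, 9)]]
scores = time_preference(histories, (5, 19))
```

The first user, who mostly attends Saturday evening events, receives the highest score for a new Saturday evening event; the second user has no overlap and receives 0.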
5. The method for recommending personalized events fusing topic matching and two-way preference according to claim 4, wherein the calculating the user preference score in the second step specifically comprises:
for the geographic location preference model, representing the geographic location preference score by predicting the probability of the user participating in an event held at the location; for the social relationship preference model, calculating the social preference score of the target user from two aspects, namely the relationship between the target user and the group and the relevance between the target user and the users in the group; for the time factor preference model, constructing a unified vector representation of the two granularities of date and hour, and on this basis calculating the similarity of the user-event pair as the time preference score of the target user; combining the three single-factor preferences to form a user preference perception model, and linearly combining them to obtain the total preference score $S_{user}$ of user u for event e, as shown in formula (18):
Figure FDA0002376881330000044
wherein $S_G$, $S_I$ and $S_T$ respectively represent the preference scores of the user on the three single factors, namely geographic location, social relationship and time factor.
6. The personalized event recommendation method integrating topic matching and two-way preference as claimed in claim 5, wherein the constructing of the event preference model in the second step constructs the single-factor preference of the event from two aspects of event location popularity and event host influence respectively, and specifically comprises:
step 2-2-1, constructing an event position popularity preference model:
calculating the popularity of the geographic position according to the visiting frequency of the user u and the users in the online group g which the user u joins;
first, the popularity $p(l_e, u)$ of an event's geographic location $l_e$ with respect to user u is defined as shown in formula (19):
Figure FDA0002376881330000045
wherein the numerator $m_l(u, l_e)$ is the frequency with which user u attended events held at geographic location $l_e$, and the denominator is the maximum visit frequency over the locations historically visited by user u; likewise, the popularity $p(l_e, g)$ of geographic location $l_e$ with respect to the group g of user u is defined as shown in formula (20):
Figure FDA0002376881330000051
wherein the numerator represents the frequency with which the users in group g participated in events at location $l_e$, and the denominator is the maximum visit frequency over the locations historically visited by the group members, which gives the popularity of geographic location $l_e$ with respect to the users in group g; combining $p(l_e, u)$ and $p(l_e, g)$, the total popularity of the hosting location of the event to be recommended to target user u is defined as $P(l_e, u, g)$, as shown in formula (21):
$$P(l_e, u, g) = \alpha\, p(l_e, u) + (1 - \alpha)\, p(l_e, g) \tag{21}$$
step 2-2-2, constructing an event host influence preference model:
first, regarding the influence of the event host on the target user, the implicit preference of the event is expressed through the credibility or influence of the host; the influence $I(e, u)$ of event e on user u is defined as shown in formula (22):
Figure FDA0002376881330000052
wherein $m_h(u, u_h)$ denotes the set of events held by host $u_h$ in which user u participated, and $E_h$ is the set of all events held by host $u_h$;
second, regarding the influence of the event host within the group, for the online group where the target user is located, the influence is expressed by the proportion of events attended by the users of the group, and the influence on the users in the group is denoted $I(e, g)$, as shown in formula (23):
Figure FDA0002376881330000053
wherein $U_g$ denotes the set of users in group g, $m_h(u_i, u_h)$ denotes the set of events held by host $u_h$ in which user $u_i$ participated, and $E_h(g)$ denotes the set of events held by $u_h$ in group g; the comprehensive influence score $I(e, u, g)$ of the event host is calculated from the influence of the event host on the target user and on the users in the group, as shown in formula (24):
$$I(e, u, g) = \alpha\, I(e, u) + (1 - \alpha)\, I(e, g) \tag{24}$$
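As a non-limiting illustration, the quantities of this claim can be sketched in Python. The popularity ratio follows the verbal description of formulas (19) and (20), the host influence on a user follows the description of formula (22), and the blend follows formulas (21) and (24) as written. The default `alpha=0.5`, the function names, and the sample data are hypothetical; the group influence of formula (23) is not reproduced because its exact form is only available as an image.

```python
from collections import Counter

def location_popularity(visit_counts, location):
    """Visits to `location` divided by the largest visit count of any location."""
    if not visit_counts:
        return 0.0
    return visit_counts.get(location, 0) / max(visit_counts.values())

def host_influence_on_user(attended_events, hosted_events):
    """Share of the host's events that the user has attended."""
    hosted = set(hosted_events)
    if not hosted:
        return 0.0
    return len(set(attended_events) & hosted) / len(hosted)

def blend(user_term, group_term, alpha=0.5):
    """Linear combination used in formulas (21) and (24)."""
    return alpha * user_term + (1.0 - alpha) * group_term

# Hypothetical visit counts for the user and for the other group members.
user_visits = Counter({"cafe": 3, "park": 1})
group_visits = user_visits + Counter({"park": 4, "hall": 2})

p_user = location_popularity(user_visits, "park")
p_group = location_popularity(group_visits, "park")
total_popularity = blend(p_user, p_group)

influence_user = host_influence_on_user({"e1", "e3"}, ["e1", "e2", "e3", "e4"])
```

With these sample counts the park is only moderately popular with the user but is the most visited place in the group, so the blended popularity lies between the two values.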
7. The personalized event recommendation method combining topic matching and two-way preference according to claim 6, wherein the calculating of the event preference score in the second step specifically comprises:
for new events that have not yet occurred, representing the preference of the event by calculating the event location popularity and the event host influence of the new event; linearly combining the constructed event location popularity $P(l_e, u, g)$ and the event host influence $I(e, u, g)$ to calculate the preference score $S_{events}$ of event e for user u, as shown in formula (25):
Figure FDA0002376881330000054
8. The personalized event recommendation method integrating topic matching and two-way preference according to claim 7, wherein the step three of obtaining the user event two-way preference score and linearly weighting and combining the topic matching degree score and the two-way preference score to obtain the final recommendation score of the user-event pair comprises the following specific steps:
step 3-1, solving two-way preference for user-event pairs:
suppose the preference score weights of the user and the event are $\theta_1$ and $\theta_2$; weighting and fusing the two gives the user-event two-way preference score $S_{u,e} = \theta_1 S_{user} + \theta_2 S_{events}$; the problem of two-way preference scoring is thereby converted into solving for the weight vector of the two preference scores, and implicit feedback is selected as training data to learn the weight vector;
the learning algorithm BPR, which is based on Bayesian maximum likelihood estimation, is selected to learn the ranking of the weights, and the correct ranking order of the user-event pairs is learned from the implicit feedback data of the user on events, so that the events the user participated in are ranked ahead of new events or other events; first, the maximum posterior probability $p(\theta \mid R)$ is defined, as shown in formula (26):
$$p(\theta \mid R) \propto p(R \mid \theta)\, p(\theta) \tag{26}$$
where $\theta$ represents the weight vector, R represents the set of all user-event pairs, and $p(R \mid \theta)$ is defined as shown in formula (27):
Figure FDA0002376881330000061
wherein $R_u$ denotes the user-event pairs of user u, and $p(e_i > e_j \mid \theta)$ denotes the probability that, for user u, event $e_i$ is ranked ahead of $e_j$, as shown in formula (28):
$$p(e_i > e_j \mid \theta) = \sigma\big(s(u, e_i) - s(u, e_j)\big) \tag{28}$$
wherein $s(u, e)$ is the two-way preference score $S_{u,e}$,
Figure FDA0002376881330000062
to make the optimization more convenient, $\theta$ is assumed to follow a normal distribution with mean 0, and the final optimization objective function $\ln p(\theta \mid R)$ is derived by expansion, as shown in formula (29):
Figure FDA0002376881330000063
wherein $\lambda$ denotes the regularization coefficient, and the optimal weight parameter vector is obtained by maximizing the optimization objective function over the implicit user-event interaction feedback data; the optimization problem is solved with the stochastic gradient descent algorithm SGD, in which a user-event pair of the target user is randomly drawn from the training set in each iteration to update the weight vector $\theta$, the update being as shown in formula (30):
Figure FDA0002376881330000064
where $\alpha$ is the learning rate and $s_{ij} = s(u, e_i) - s(u, e_j)$; through this learning process, the weight vector $\theta$ is obtained automatically from the training set of user and event preference scores and the hyperparameters $\alpha$ and $\lambda$, which in turn gives the two-way preference score $S_{u,e}$;
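As a non-limiting illustration, step 3-1 can be sketched in Python with NumPy. The sketch uses the standard BPR stochastic gradient step for a linear score $s(u, e) = \theta_1 S_{user} + \theta_2 S_{events}$, namely $\theta \leftarrow \theta + \alpha\,[(1 - \sigma(s_{ij}))(x_i - x_j) - \lambda\theta]$; whether this matches formula (30) term by term cannot be confirmed because that formula is only available as an image. The function name `bpr_fit`, the hyperparameter values and the sample scores are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_fit(pos, neg, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Learn theta so that attended events outrank non-attended ones.

    pos[k], neg[k]: feature rows [S_user, S_events] of an attended event
    and of a non-attended event for the same user.
    """
    rng = np.random.default_rng(seed)
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    theta = np.zeros(pos.shape[1])
    for _ in range(epochs):
        # Visit the training pairs in random order (stochastic updates).
        for k in rng.permutation(len(pos)):
            diff = pos[k] - neg[k]
            s_ij = theta @ diff          # s(u, e_i) - s(u, e_j)
            # Gradient ascent on ln sigma(s_ij) with L2 regularization.
            theta += lr * ((1.0 - sigmoid(s_ij)) * diff - reg * theta)
    return theta

pos = [[0.9, 0.8], [0.7, 0.9], [0.8, 0.6]]   # hypothetical attended events
neg = [[0.2, 0.3], [0.1, 0.4], [0.3, 0.1]]   # hypothetical other events
theta = bpr_fit(pos, neg)
two_way_scores = np.asarray(pos) @ theta     # S_{u,e} of the attended events
```

After training, every attended event in the sample receives a higher two-way preference score than its paired non-attended event.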
step 3-2, solving the final recommendation score of the user-event pair by combining topic matching and two-way preference:
first, event topics are extracted through the LDA topic model and the topic matching degree scores of users and events are obtained; second, preference models of the user and of the event are constructed from the user-event context information in the EBSN, and the user-event two-way preference score is obtained through the BPR learning algorithm; finally, the topic matching degree score
Figure FDA0002376881330000066
and the user-event two-way preference score $S_{u,e}$ are linearly weighted and summed to obtain the final user-event pair recommendation score $S_{Rec}$, as shown in formula (31):
Figure FDA0002376881330000065
where $\gamma$ is a weighting parameter that is typically set manually from experience; its optimal value is determined experimentally.
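As a non-limiting illustration, step 3-2 and the final top-K selection can be sketched in Python. Formula (31) is only available as an image, so the form $\gamma\, S_{topic} + (1 - \gamma)\, S_{u,e}$ used here, with $\gamma$ weighting the topic score, is an assumption of this sketch; the function name, the default `gamma` and the sample scores are hypothetical.

```python
def recommend_top_k(topic_scores, preference_scores, gamma=0.5, k=3):
    """Rank candidate events for one user by the combined score."""
    # ASSUMPTION: gamma weights the topic score, (1 - gamma) the preference score.
    final = {
        event: gamma * topic_scores[event] + (1.0 - gamma) * preference_scores[event]
        for event in topic_scores
    }
    # Sort events by descending final score and keep the first k.
    return sorted(final, key=final.get, reverse=True)[:k]

topic_scores = {"a": 0.9, "b": 0.2, "c": 0.5}        # hypothetical values
preference_scores = {"a": 0.4, "b": 0.8, "c": 0.7}   # hypothetical values
top_events = recommend_top_k(topic_scores, preference_scores, gamma=0.5, k=2)
```

Setting `gamma` closer to 1 makes the ranking depend mainly on topic matching, while values closer to 0 favor the two-way preference score.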
9. A system for personalized event recommendation fusing topic matching and two-way preference, which is used for implementing the personalized event recommendation method fusing topic matching and two-way preference according to any one of claims 1-8, characterized in that the system comprises:
the document topic generation module, used for extracting the topics of the user's historical events and of new events, calculating the topic distribution and word distribution of the events, expressing the topic matching degree by the topic similarity between the user's historical events and a new event, and fusing the topic matching degree into the recommendation model as one of the key recommendation factors;
the user preference construction module, used for constructing the single-factor preferences of the user from three aspects, namely geographic location, social relationship and time factor, and weighting and fusing the three single-factor preferences to obtain the overall preference of the user;
the event preference construction module, used for expressing the preference of the event by the social influence of the event host in the group and the popularity of the event's hosting location in the group;
the user-event two-way preference scoring module, used for solving the weight parameters of the user preference score and the event preference score with a learning-to-rank algorithm to obtain the user-event two-way preference score;
and the final user-event pair recommendation scoring module, used for linearly weighting and combining the topic matching degree score and the two-way preference score to obtain the final recommendation score of the user-event pair.
10. The system of claim 9, wherein the user preference module comprises a geographic location preference module, a social relationship preference module, and a time factor preference module, and the event preference module comprises an event location popularity preference module and an event host influence preference module, wherein:
the geographic location preference module is used for expressing a geographic location preference score by predicting the probability of a user participating in an event held at a certain geographic location;
the social relationship preference module is used for calculating the social preference score of the target user from two aspects, namely the relationship between the target user and the group and the relevance between the target user and the users in the group;
the time factor preference module is used for constructing a unified vector representation of the two granularities of date and hour, and calculating the similarity of a user-event pair as the time preference score of the target user;
the event location popularity preference module is used for taking into account, when recommending new events, how important a venue is to interested users, which is referred to as the popularity of the geographic location within a user group; considering the popularity of the event's geographic location allows the attractiveness of the event to users to be calculated more accurately;
the event host influence preference module is used for calculating the influence of the event host on the target user and the influence of the event host within the group where the target user is located, so as to improve the recommendation accuracy.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010069262.9A CN111428127B (en) 2020-01-21 2020-01-21 Personalized event recommendation method and system integrating theme matching and bidirectional preference

Publications (2)

Publication Number Publication Date
CN111428127A true CN111428127A (en) 2020-07-17
CN111428127B CN111428127B (en) 2023-08-11

Family

ID=71551509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010069262.9A Active CN111428127B (en) 2020-01-21 2020-01-21 Personalized event recommendation method and system integrating theme matching and bidirectional preference

Country Status (1)

Country Link
CN (1) CN111428127B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090012841A1 (en) * 2007-01-05 2009-01-08 Yahoo! Inc. Event communication platform for mobile device users
US20110213655A1 (en) * 2009-01-24 2011-09-01 Kontera Technologies, Inc. Hybrid contextual advertising and related content analysis and display techniques
CN102282556A (en) * 2008-11-25 2011-12-14 谷歌公司 Providing digital content based on expected user behavior
US20150262069A1 (en) * 2014-03-11 2015-09-17 Delvv, Inc. Automatic topic and interest based content recommendation system for mobile devices
CN106250513A (en) * 2016-08-02 2016-12-21 西南石油大学 A kind of event personalization sorting technique based on event modeling and system
CN107657034A (en) * 2017-09-28 2018-02-02 武汉大学 A kind of event social networks proposed algorithm of social information enhancing
CN107967257A (en) * 2017-11-20 2018-04-27 哈尔滨工业大学 A kind of tandem type composition generation method
CN108763362A (en) * 2018-05-17 2018-11-06 浙江工业大学 Method is recommended to the partial model Weighted Fusion Top-N films of selection based on random anchor point
CN110083531A (en) * 2019-04-12 2019-08-02 江西财经大学 It improves the shared multi-goal path coverage test method of individual information and realizes system
US20190340245A1 (en) * 2016-12-01 2019-11-07 Spotify Ab System and method for semantic analysis of song lyrics in a media content environment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
钱忠胜;缪淮扣;: "面向用户会话的Web应用测试用例生成及其优化", no. 06 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100483A (en) * 2020-08-07 2020-12-18 西安工程大学 Association rule recommendation method fusing user interest weight
CN112100483B (en) * 2020-08-07 2023-09-19 西安工程大学 Association rule recommendation method fusing user interest weights
CN113032662A (en) * 2021-03-31 2021-06-25 龙关玲 Block chain big data recommendation method and system based on artificial intelligence and cloud platform
CN113032662B (en) * 2021-03-31 2021-11-26 艾普深瞳(北京)智能科技有限公司 Block chain big data recommendation method and system based on artificial intelligence and cloud platform
CN113706325A (en) * 2021-07-30 2021-11-26 西安交通大学 Planning method and system for event-oriented social network
CN114564652A (en) * 2022-04-29 2022-05-31 江西财经大学 Personalized gift recommendation method and system based on user intention and two-way preference
CN114564652B (en) * 2022-04-29 2022-09-27 江西财经大学 Personalized gift recommendation method and system based on user intention and two-way preference
CN116089712A (en) * 2022-12-29 2023-05-09 无锡东方健康科技有限公司 Hot conference recommending method and system based on data mining and analysis
CN116089712B (en) * 2022-12-29 2024-03-29 无锡东方健康科技有限公司 Hot conference recommending method and system based on data mining and analysis

Also Published As

Publication number Publication date
CN111428127B (en) 2023-08-11

Similar Documents

Publication Publication Date Title
US11659050B2 (en) Discovering signature of electronic social networks
CN111428127A (en) Personalized event recommendation method and system integrating topic matching and two-way preference
Guo et al. Combining geographical and social influences with deep learning for personalized point-of-interest recommendation
Christensen et al. Social group recommendation in the tourism domain
US9483580B2 (en) Estimation of closeness of topics based on graph analytics
US10592535B2 (en) Data flow based feature vector clustering
US10785181B2 (en) Sharing content to multiple public and private targets in a social network
CN110889434B (en) Social network activity feature extraction method based on activity
US11205128B2 (en) Inferred profiles on online social networking systems using network graphs
CN107657034A (en) A kind of event social networks proposed algorithm of social information enhancing
US11245649B2 (en) Personalized low latency communication
Jia et al. Collaborative restricted Boltzmann machine for social event recommendation
Gu et al. Context aware matrix factorization for event recommendation in event-based social networks
Smiljanić et al. A theoretical model for the associative nature of conference participation
CN110222273B (en) Business point promotion method and system in social network based on geographic community
CN110008411A (en) It is a kind of to be registered the deep learning point of interest recommended method of sparse matrix based on user
CN111125507B (en) Group activity recommendation method and device, server and computer storage medium
CN111143700B (en) Activity recommendation method, activity recommendation device, server and computer storage medium
CN116049549A (en) Activity recommendation method based on multi-granularity feature fusion
CN114417166A (en) Continuous interest point recommendation method based on behavior sequence and dynamic social influence
CN112784177A (en) Spatial distance adaptive next interest point recommendation method
Ngamsa-Ard et al. Point-of-interest (POI) recommender systems for social groups in location based social networks (LBSNs): Proposition of an improved model
CN113032685B (en) Object pushing method, device, equipment and storage medium based on social relationship
Krishna et al. Simplifying Sparse Expert Recommendation by Revisiting Graph Diffusion
Liu et al. PCRM: Increasing POI Recommendation Accuracy in Location-Based Social Networks.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant