CN107784124B

CN107784124B - LBSN (location based service) hyper-network link prediction method based on space-time relationship

Info

Publication number: CN107784124B
Application number: CN201711182961.9A
Authority: CN
Inventors: 胡敏; 陈元会; 黄宏程
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Chongqing University of Post and Telecommunications
Priority date: 2017-11-23
Filing date: 2017-11-23
Publication date: 2021-08-24
Anticipated expiration: 2037-11-23
Also published as: CN107784124A

Abstract

The invention relates to an LBSN (location-based social network) hyper-network link prediction method based on spatio-temporal relationships, belonging to the field of data mining. The method comprises the following steps: S1: acquiring a data source; S2: constructing a hyper-network model; S3: defining and quantifying the hyper-network edge weights; S4: based on the model, first constructing multiple types of weighted super-edge structures, mining different semantic relationships among users through the different structures, and finally training the model parameters by gradient descent so as to predict the link relationships in the network. Through the weighted super-edge structures, the method can effectively mine multiple kinds of association relationships among nodes, address the sparsity problem in the network, improve the noise immunity and stability of the model, and greatly improve the prediction accuracy.

Description

LBSN (location based service) hyper-network link prediction method based on space-time relationship

Technical Field

The invention belongs to the field of data mining and relates to an LBSN (location-based social network) hyper-network link prediction method based on spatio-temporal relationships.

Background

With the continuous development of computer information technology and the rapid popularization of the internet, online social platforms have become an indispensable part of daily life. Through them, people build their own friend networks and communicate and interact with friends in real time. In recent years in particular, the emergence of the Location-Based Social Network (LBSN) has allowed some location services to win a large number of users in a short time and achieve great success. In an LBSN, a user can check in at locations he or she has visited and share those check-ins with friends. Because check-in behavior truly reflects the user's physical activity, it ties the online virtual world closely to the offline real world and brings new opportunities and challenges to social network link prediction.

At present, link prediction methods fall mainly into two groups: similarity-based methods and learning-based methods. The main idea of a similarity-based method is to calculate a similarity score for every pair of nodes that are not yet linked; the higher the score, the more likely a link will form between them. Representative methods include the common neighbors index (CN), the preferential attachment index (PA), the Adamic/Adar index (AA), the Jaccard coefficient, Katz, Rooted PageRank, and so on. A learning-based method converts the link prediction problem into a binary classification problem: it identifies the key features that influence link formation, builds feature vectors from them, and trains an efficient model to achieve accurate prediction.
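The similarity-based indices named above can be illustrated with a minimal Python sketch. The toy graph and all function names are hypothetical and chosen only for illustration; the formulas are the standard textbook definitions of CN, Jaccard, AA and PA, not anything specific to this invention.

```python
import math
from itertools import combinations

# Hypothetical toy friendship graph as adjacency sets (not from the source).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}

def common_neighbors(g, u, v):
    """CN: number of neighbors shared by u and v."""
    return len(g[u] & g[v])

def jaccard(g, u, v):
    """Jaccard: shared neighbors divided by the union of both neighborhoods."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0

def adamic_adar(g, u, v):
    """AA: shared neighbors weighted by 1 / log(degree).

    A shared neighbor is adjacent to both u and v, so its degree is >= 2
    and the logarithm is strictly positive.
    """
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v])

def preferential_attachment(g, u, v):
    """PA: product of the two node degrees."""
    return len(g[u]) * len(g[v])

def rank_unlinked_pairs(g, score):
    """Rank all currently unlinked node pairs by a similarity score, best first."""
    pairs = [(u, v) for u, v in combinations(sorted(g), 2) if v not in g[u]]
    return sorted(pairs, key=lambda p: score(g, *p), reverse=True)
```

Calling `rank_unlinked_pairs(graph, common_neighbors)` orders the candidate pairs so that the pair with the most shared friends comes first; any of the other score functions can be passed in the same way.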

A heterogeneous social network is a network that contains multiple types of nodes and edges. A location-based social network is a heterogeneous social network: it has user nodes, location nodes, category nodes, user-location edges, user-user edges, and so on. At present, most link prediction research focuses on homogeneous networks, that is, networks with only one type of node (user nodes) and one type of edge (user-user edges). Research on heterogeneous networks is relatively scarce, and most link prediction methods designed for homogeneous networks no longer apply.

For link prediction in heterogeneous networks, some researchers work from the perspective of time information, considering the time at which links are established or the spatio-temporal correlation between two users; this research shows that taking the time factor into account can effectively improve prediction accuracy. Other research is based on meta-paths. A meta-path is a path connecting nodes of different types that carries certain semantic information; for example, user-comedy-user indicates that both users like comedy. The correlation between nodes is then calculated by random walks and similar techniques over the different semantic paths. This approach alleviates the data sparsity problem; however, most such research targets only unweighted networks, and for weighted networks the weight characteristics of the network still need to be considered specifically.

An existing method predicts links by constructing a weighted super-triangle structure on a weighted hyper-network model. However, the existing weighted super-triangle structure can only capture the influence of neighboring nodes on link formation; integrating other, richer weighted super-edge structures can alleviate data sparsity and further improve prediction accuracy. In addition, existing hyper-network-based methods cannot make use of time information, so their accuracy still has considerable room for improvement.

Disclosure of Invention

In view of this, the present invention provides an LBSN hyper-network link prediction method based on spatio-temporal relationships. To address the heterogeneity of heterogeneous social networks and the spatio-temporal correlation between users, it proposes a four-layer "spatio-temporal / user / location / category" weighted hyper-network model that effectively incorporates time information into the hyper-network model. The edge weights of the weighted hyper-network are corrected by considering the implicit behavior among users, the latent association between locations and user preference, which improves the interpretability of the model. Finally, super edges and super-edge structures are defined on the corrected weighted hyper-network model, and the association relationships between users are mined on that basis.

In order to achieve the purpose, the invention provides the following technical scheme:

An LBSN hyper-network link prediction method based on spatio-temporal relationships comprises the following steps:

S1: acquiring a data source; data with high accuracy and reliability are acquired from an existing large social network platform; the acquired data comprise the friend relationships among users, the users' reviews and ratings of locations, the review times, the longitude and latitude of the locations, and the categories of the locations;

S2: constructing a hyper-network model; this comprises constructing a spatio-temporal subnet, a social subnet, a location subnet and a category subnet, wherein the spatio-temporal subnet is constructed from the times at which users check in at locations and is used for mining the spatio-temporal similarity between users;

S3: defining and quantifying the hyper-network edge weights; the edge weights in the hyper-network model are defined in four different ways: by user influence, by implicit association, by user preference and by node degree information;

S4: through steps S1-S3, a weighted hyper-network model is constructed, which is divided into a spatio-temporal layer, a user layer, a location layer and a category layer; based on the model, multiple types of weighted super-edge structures are first constructed, different semantic relationships among users are mined through the different structures, and finally the model parameters are trained by gradient descent, so that the link relationships in the network are predicted.

Further, the step S2 specifically includes:

extracting the users' friend relationship list, the users' check-in relationship list and the category information of the locations from the original data;

S21: extracting spatio-temporal nodes from the users' check-in times; if two or more users visit the same location within a certain time period, that location is defined as a spatio-temporal node; a spatio-temporal node reflects the interest preference of users for a specific location at a specific time;

S22: constructing the four-layer spatio-temporal / user / location / category hyper-network model, which is divided into a spatio-temporal subnet, a social subnet, a location subnet and a category subnet; the association among the four subnets is that users visit points of interest of certain categories according to their own interests and check in at, review and rate them, and that users sharing a particular interest at a specific time are associated through the same spatio-temporal node; this completes the construction of the four-layer subnets of the location-based social network.

Further, the step S3 specifically includes:

S31: quantifying the user-user weight through user influence; in a location-based social network, each user has a different influence; user influence is divided into the individual influence of a user and the influence between users, which are measured through the following network and the following behavior respectively;

defining the following behavior: if user v checks in at a place where friend u has checked in, user v is considered to exhibit a following behavior toward user u, and a directed edge from v to u is generated accordingly;

defining the following network G_f＝(V_f,E_f): wherein G_f denotes the directed network formed by following behaviors, V_f denotes the users in the following network, and E_f denotes the directed edges generated by following behaviors;

S311: individual influence of the user I_u: used to measure the influence a user exerts on other users in the network through his or her own behavior; the influence of users in different time periods is considered by dividing time into slices, and the following behavior of users in each time slice forms a corresponding following network; S time slices are divided, with t_s denoting the s-th time slice; the user's final individual influence is contributed by the individual influence in each time slice, and the farther a time slice is from the current time, the more its individual influence is attenuated;

considering that isolated nodes exist in the network, the individual influence of users is solved with the LeaderRank algorithm, whose iterative formula is as follows:

wherein N_u denotes the neighbor nodes of user u, and the remaining symbol denotes the out-degree of user v; in the steady state, LeaderRank distributes the score of the ground node evenly to all other nodes, and the final score of a node is expressed as:

I_u＝I_u(t_d)+I_g(t_d)/N

wherein I_g(t_d) is the score of the ground node in the steady state, and N is the total number of users;

since the influence of the user decreases with time, the decay function is defined as:

W_u(t_i)＝exp(-ln2×(t_c-t_i)/t_m)

wherein t_c denotes the current time, t_i denotes the i-th time slice, and t_m denotes the half-life of the decay in influence;

the total individual influence I_u of user u at the current moment is:

wherein I_u(t_i) denotes the individual influence of user u in time slice t_i;

S312: influence between users: the inter-user influence I_i(u, v) measures the influence of user u on user v; following behavior is regarded as interaction between users and is used to measure the influence between them;

two metrics are proposed, the following location ratio I_p and the following check-in ratio I_c:

wherein M(v, u) denotes the number of check-in locations at which user v follows user u, Position_u denotes the total number of check-in locations of user u, K(v, u) denotes the total number of check-ins in which user v follows user u, and Checkin_u denotes the total number of check-ins of user u;

the user influence I (u, v) is:

based on the user influence, the user-user edge weight is quantified; for a node pair u and v, if the influence of u on v is high, the corresponding edge weight should be high, and the edge weight between users is quantified as:

wherein w(u, w) ∈ S denotes the neighbor nodes of user u in the social subnet S, and I(u, v) denotes the influence between the user and a neighbor node in the social subnet;

S32: defining and quantifying the location-location edge weight and the category-category edge weight through implicit association;

the edge weight between locations and the edge weight between categories are defined as:

wherein geodist(p, p′) denotes the distance between locations p and p′, Max|W_p| is the maximum number of times any two locations are associated, w(p, p′) is the number of times locations p and p′ are associated by users, and the remaining symbol is the association time threshold;

wherein |P(c, c′)| denotes the number of locations that belong to both category c and category c′, and Max|P_c| denotes the maximum number of locations that belong to both category c and some other category;

S33: defining and quantifying the user-location edge weight through user preference; in a location-based social network, a user's rating of a location directly reflects how much the user prefers that location; locations the user prefers more are given higher weights, and the user-location edge weight is corrected through an exponential function:

wherein r(u, p) is the rating given by user u to location p;

S34: the remaining edge weights are defined and quantified by node out-degree.

Further, the step S4 specifically includes:

S41: defining the super edge and the super-edge weight;

three types of super edges are defined:

class I super edge SE_I: a super edge containing only one type of node; it is a special kind of super edge in a hyper-network;

class II super edge SE_II: an edge formed by a node pair between two adjacent subnet layers, characterized by containing exactly two kinds of heterogeneous nodes;

class III super edge SE_III: an edge formed across three adjacent subnet layers, characterized by containing exactly three kinds of heterogeneous nodes;

the super-edge weight is the weight of each super edge and is calculated from the edge weights contained in that super edge;

S42: hyperlink prediction: based on the three defined types of super edges, weighted super-edge structures are proposed, and the link prediction problem between users is solved through them; implicit semantic relationships among nodes are mined by constructing multiple types of super-edge structures;

S421: weighted super-triangle structure, comprising the single weighted super-triangle structure, the double weighted super-triangle structure and the large weighted super-triangle structure;

single weighted super-triangle structure: the similarity between user nodes is calculated through a single weighted super-triangle formed by a spatio-temporal node and user nodes, expressing that two users like to be active at the same time and at the same location; all defined super-edge structures are closed-loop structures and are directional;

double weighted super-triangle structure: a structure comprising two consecutive weighted super-triangle structures;

large weighted super-triangle structure: a triangle structure consisting of two class III super edges;

S422: weighted super-rectangle structure: expresses that user nodes like to be active at two related spatio-temporal nodes; its weight is the product of the corresponding super-edge weights:

S423: weighted super-hybrid structure, comprising the weighted super-hybrid I structure and the weighted super-hybrid II structure, defined respectively as:

weighted super-hybrid I structure: formed by adding a class I super edge to a single triangle structure;

weighted super-hybrid II structure: formed by adding a class I super edge to a rectangle structure;

the deeper the hierarchy considered, the longer the association chain and the richer the super-edge structure;

the weighted super-edge structures comprise the weighted super-triangle structure, the weighted super-rectangle structure and the weighted super-hybrid structure; different structures influence link prediction to different degrees, so the similarity is expressed as:

S(u,v)＝θ₁W_S1(u,v)+θ₂W_S2(u,v)+......+θ₁₉W_S19(u,v)

wherein θ_i is the weight of the i-th weighted super-edge structure and is obtained by training with gradient descent; the parameter update process is:

wherein θ_i-old denotes the weight before an iteration of training, θ_i-new denotes the weight after the iteration, λ denotes the learning step size, and y indicates whether a link exists between the users; when the change in every parameter is less than a certain threshold, the parameter update has converged and the optimal parameter set θ⁺ is obtained; finally, the optimal parameter set θ⁺ is used to predict the link relationships among users: when y is 1 the link between the users is considered to exist, otherwise it is considered not to exist, and the definition formula is as follows:
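As a minimal sketch of the training step described above, the Python code below fits the structure weights θ of the linear score S(u,v)＝Σθ_i·W_Si(u,v) by batch gradient descent and stops when every parameter changes by less than a threshold. The squared-error loss, the decision threshold of 0.5, and the names `train_theta` and `predict_link` are assumptions made for illustration; the source does not state the loss or the decision rule, and it uses 19 structure weights rather than the two toy features shown in the usage note.

```python
def train_theta(features, labels, step=0.1, tol=1e-6, max_epochs=10000):
    """Fit theta for S = sum_i theta_i * W_Si by batch gradient descent.

    features: list of rows, one row of structure weights W_Si(u, v) per user pair.
    labels:   list of 0/1 values y (link exists or not).
    ASSUMPTION: mean squared-error loss; the source does not specify the loss.
    """
    n = len(features[0])
    theta = [0.0] * n
    for _ in range(max_epochs):
        grad = [0.0] * n
        for x, y in zip(features, labels):
            err = sum(t * xi for t, xi in zip(theta, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
        new_theta = [t - step * g / len(features) for t, g in zip(theta, grad)]
        # Convergence test from the text: every parameter changes by less than tol.
        converged = max(abs(a - b) for a, b in zip(new_theta, theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta

def predict_link(theta, x, threshold=0.5):
    """Return 1 if the weighted score reaches the (assumed) threshold, else 0."""
    score = sum(t * xi for t, xi in zip(theta, x))
    return 1 if score >= threshold else 0
```

For example, with the hypothetical rows `[[1.0, 0.0], [0.9, 0.1], [0.1, 0.8], [0.0, 1.0]]` and labels `[1, 1, 0, 0]`, the first structure receives a much larger weight than the second, and `predict_link` reproduces the training labels.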

The beneficial effects of the invention are as follows: through the weighted super-edge structures, the method can effectively mine multiple kinds of association relationships among nodes, address the sparsity problem in the network, improve the noise immunity and stability of the model, and greatly improve the prediction accuracy.

Drawings

In order to make the object, technical scheme and beneficial effect of the invention more clear, the invention provides the following drawings for explanation:

FIG. 1 is a general flow diagram of the present invention;

FIG. 2 is a hyper-network model based on spatio-temporal relationships in LBSN;

FIG. 3 is a two-layer hyper-network model.

FIG. 4 is a three-tier hyper-network model.

Detailed Description

Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

As shown in FIG. 1 and FIG. 2, the LBSN hyper-network link prediction method based on spatio-temporal relationships comprises four modules: data acquisition, hyper-network model construction, network edge weight definition and quantification, and link prediction.

The implementation of the present invention is described in detail below.

S1: acquiring a data source. The data used are Yelp data, an open data set published by the Yelp website. The acquired data mainly comprise the friend relationships among users, the users' reviews and ratings of businesses, the longitude and latitude of the businesses, the categories they belong to, and so on.

S2: constructing the hyper-network model. In a location-based social network, link formation is influenced by many factors, such as time, location and social factors. The invention integrates these factors by constructing a hyper-network model and applies the model to link prediction. The model is divided into four layers: a spatio-temporal layer, a user layer, a location layer and a category layer. The specific scheme is as follows:

S21: constructing spatio-temporal nodes. If two or more users visit the same location within a particular time period, that location is defined as a spatio-temporal node. A spatio-temporal node therefore reflects the interest preference of users for a particular location at a particular time, which reflects the similarity between users better than the mere fact that two users have visited the same location.
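The extraction of spatio-temporal nodes in S21 can be sketched as follows. This is a minimal illustration, assuming that "a particular time period" means a fixed, non-overlapping window of `window_hours` hours and that check-ins are given as `(user, location, unix_timestamp)` tuples; both the windowing scheme and the function name `spatiotemporal_nodes` are hypothetical, since the source does not say how time periods are delimited.

```python
from collections import defaultdict

def spatiotemporal_nodes(checkins, window_hours=24):
    """Return {(location, window_index): set_of_users} for shared visits.

    checkins: iterable of (user, location, unix_timestamp_seconds).
    ASSUMPTION: time periods are fixed, non-overlapping windows.
    """
    window = window_hours * 3600
    visitors = defaultdict(set)
    for user, location, ts in checkins:
        visitors[(location, int(ts // window))].add(user)
    # A (location, window) pair becomes a spatio-temporal node only if
    # two or more distinct users visited it.
    return {node: users for node, users in visitors.items() if len(users) >= 2}
```

Each key of the returned dictionary corresponds to one spatio-temporal node, and the associated user set gives the user nodes it connects in the spatio-temporal subnet.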

S22: constructing the four-layer spatio-temporal / user / location / category hyper-network model. It is mainly divided into a spatio-temporal subnet, a social subnet, a location subnet and a category subnet. The association among the four subnets can be summarized as follows: users visit points of interest of certain categories according to their own interests and check in at, review and rate them, and users who share a particular interest at a specific time are associated through the same spatio-temporal node. This completes the construction of the four-layer subnets of the location-based social network; the edge weights of the hyper-network are defined and quantified next.

S3: defining and quantifying the hyper-network edge weights. The four-layer hyper-network model contains four types of nodes and ten types of edges (four within subnets and six between subnets), so the invention defines the edge weights in the network by four different methods: based on user influence, based on implicit association between locations, based on user preference, and based on node out-degree. The specific steps are as follows:

S31: quantifying the user-user weight through user influence. In a location-based social network, each user has a different influence. If a friend has very little influence on us, we are unlikely to take certain actions or come into contact with other people through that friend. Defining and quantifying the user-user edge weight through user influence is therefore one feasible way of improving model interpretability. The invention divides user influence into the individual influence of a user and the influence between users, and measures them through the following network and the following behavior respectively.

Defining the following behavior. If user v checks in at a place where friend u has checked in, user v is considered to exhibit a following behavior toward user u, and a directed edge from v to u is generated accordingly.

Defining the following network G_f＝(V_f,E_f), wherein G_f denotes the directed network formed by following behaviors, V_f denotes the users in the following network, and E_f denotes the directed edges generated by following behaviors.

S311: individual influence of the user. The individual influence I_u measures the influence a user exerts on other users in the network through his or her own behavior; it is a measure taken from a global perspective. Individual influence changes dynamically over time: some users may be active at first, so that their check-ins generate many following edges and a large influence, after which their activity declines and their influence gradually falls to a stable value. To measure user influence accurately, the time factor should therefore be considered.

The invention considers the influence of users in different time periods by dividing time into slices, and the following behavior of users in each time slice forms a corresponding following network.

S time slices are divided; the user's final individual influence is contributed by the individual influence in each time slice, and the farther a time slice is from the current time, the more its individual influence is attenuated.

Considering that isolated nodes exist in the network, the method uses the LeaderRank algorithm to solve for the individual influence of users. By introducing a ground node, LeaderRank solves the problem in PageRank that isolated nodes make the ranking result non-unique; it converges quickly, is robust to noise, and is well suited to this method. The iterative formula of the algorithm is as follows:

wherein N_u denotes the neighbor nodes of user u, and the remaining symbol denotes the out-degree of user v. In the steady state, LeaderRank distributes the score of the ground node evenly to all other nodes, so the final score of a node can be expressed as:

I_u＝I_u(t_d)+I_g(t_d)/N (2)

wherein I_g(t_d) is the score of the ground node in the steady state, and N is the total number of users.
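The LeaderRank step can be sketched in Python as below. The code follows the standard published LeaderRank procedure (a ground node linked in both directions to every user, every user starting with score 1 and the ground node with score 0, each node passing its score in equal shares along its out-links) and then applies equation (2). The function name `leader_rank`, the edge-list input format and the tolerance are assumptions for illustration; the iteration settles only if the following network contains at least one following edge.

```python
def leader_rank(follow_edges, users, tol=1e-9, max_iter=1000):
    """Score users in a following network with the standard LeaderRank scheme.

    follow_edges: iterable of (follower, leader) pairs, i.e. directed edges v -> u.
    users:        list of all user ids (isolated users included).
    """
    ground = object()  # sentinel for the ground node
    out_links = {u: {ground} for u in users}   # every user links to the ground node
    out_links[ground] = set(users)             # the ground node links to every user
    for follower, leader in follow_edges:
        out_links[follower].add(leader)

    score = {u: 1.0 for u in users}
    score[ground] = 0.0
    for _ in range(max_iter):
        nxt = dict.fromkeys(score, 0.0)
        for v, targets in out_links.items():
            share = score[v] / len(targets)    # score divided by out-degree of v
            for u in targets:
                nxt[u] += share
        delta = max(abs(nxt[n] - score[n]) for n in score)
        score = nxt
        if delta < tol:
            break

    # Equation (2): spread the steady-state ground score evenly over all users.
    bonus = score[ground] / len(users)
    return {u: score[u] + bonus for u in users}
```

With the hypothetical edges `[("b", "a"), ("c", "a")]` over users `["a", "b", "c"]`, user `a`, who is followed by both others, receives the highest score, and the scores of all users sum to the number of users.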

Since the influence of a user decreases with time, the decay function is defined as:

W_u(t_i)＝exp(-ln2×(t_c-t_i)/t_m) (3)

wherein t_c denotes the current time, t_i denotes the i-th time slice, and t_m denotes the half-life of the decay in influence.

The total individual influence I_u of user u at the current moment is:

wherein I_u(t_i) denotes the individual influence of user u in time slice t_i.
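The decay function of equation (3) and its use across time slices can be illustrated with a short Python sketch. `decay_weight` implements equation (3) directly. `total_influence` combines the per-slice scores as a decay-weighted sum, which is an assumption: the text states only that the final influence is contributed by every slice with older slices attenuated more. Both function names are hypothetical.

```python
import math

def decay_weight(t_c, t_i, half_life):
    """Equation (3): W_u(t_i) = exp(-ln2 * (t_c - t_i) / t_m)."""
    return math.exp(-math.log(2) * (t_c - t_i) / half_life)

def total_influence(slice_scores, t_c, half_life):
    """Combine per-slice individual influence into one value.

    slice_scores: {t_i: I_u(t_i)} for each time slice.
    ASSUMPTION: the combination is a decay-weighted sum over slices.
    """
    return sum(decay_weight(t_c, t_i, half_life) * s
               for t_i, s in slice_scores.items())
```

A slice exactly one half-life old contributes half of its score: with `t_c = 10`, `half_life = 5` and scores `{10: 2.0, 5: 2.0}`, the current slice contributes 2.0 and the older slice 1.0.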

S312: influence between users. The inter-user influence I_i(u, v) measures the influence of user u on user v; it is a measure taken from a local perspective. Generally, the more two users interact, the greater their influence on each other. The invention regards following behavior as interaction between users and measures the influence between users accordingly.

The invention proposes two metrics, the following location ratio I_p and the following check-in ratio I_c:

wherein M(v, u) denotes the number of check-in locations at which user v follows user u, Position_u denotes the total number of check-in locations of user u, K(v, u) denotes the total number of check-ins in which user v follows user u, and Checkin_u denotes the total number of check-ins of user u.
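The two metrics can be sketched in Python as below, reading them as the ratios M(v, u)/Position_u and K(v, u)/Checkin_u implied by their names and by the quantities defined above. Two points are assumptions: that a check-in by v counts as following only when it happens after u's first visit to that location, and the input format of `(location, timestamp)` lists. The function name `follow_ratios` is hypothetical, and the sketch does not attempt to combine the two ratios into I(u, v).

```python
def follow_ratios(checkins_u, checkins_v):
    """Return (I_p, I_c) describing how strongly user v follows user u.

    checkins_u, checkins_v: lists of (location, timestamp).
    ASSUMPTION: v follows u at a location if v checks in there after
    u's first check-in at that location.
    """
    first_visit_u = {}
    for loc, ts in checkins_u:
        first_visit_u[loc] = min(ts, first_visit_u.get(loc, ts))

    followed = [(loc, ts) for loc, ts in checkins_v
                if loc in first_visit_u and ts > first_visit_u[loc]]

    m = len({loc for loc, _ in followed})  # M(v, u): distinct followed locations
    k = len(followed)                      # K(v, u): followed check-ins
    i_p = m / len(first_visit_u) if first_visit_u else 0.0  # divide by Position_u
    i_c = k / len(checkins_u) if checkins_u else 0.0         # divide by Checkin_u
    return i_p, i_c
```

In this reading, I_p captures how many of u's places v has picked up, while I_c also reflects how often v returns to them.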

From the above analysis, the user influence I (u, v) is:

Based on the user influence, the user-user edge weight can be quantified. For a node pair u and v, if the influence of u on v is high, the corresponding edge weight should be high, so the edge weight between users is quantified as follows:

wherein w(u, w) ∈ S denotes the neighbor nodes of user u in the social subnet S, and I(u, v) denotes the magnitude of the influence between the user and a neighbor node in the social subnet.

S32: defining and quantifying the location-location edge weight and the category-category edge weight through implicit association. If a user visits two locations consecutively within a certain time threshold, there is an implicit relationship between the two locations. Similarly, if two categories appear together at multiple locations, there is an implicit relationship between the two categories. For example, data statistics show that the categories Festivals and Arts & Entertainment often appear together in the category attributes of many locations, which implies a certain correlation between the two categories. Based on these considerations, the edge weight between locations and the edge weight between categories are defined by the following formulas.

wherein geodist(p, p′) denotes the distance between locations p and p′, Max|W_p| is the maximum number of times any two locations are associated, w(p, p′) is the number of times locations p and p′ are associated by users, and the association time threshold can be adjusted according to network characteristics and experimental performance.

wherein |P(c, c′)| denotes the number of locations that belong to both category c and category c′, and Max|P_c| denotes the maximum number of locations that belong to both category c and some other category.
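The category-category weight can be sketched in Python from the two quantities just defined. The sketch assumes the weight is the ratio |P(c, c′)| / Max|P_c|, which is the natural reading of the definitions but is not confirmed by a formula in the text; the function name `category_weights` and the input format are hypothetical. The location-location weight is not sketched because its formula cannot be recovered from the definitions alone.

```python
from collections import Counter
from itertools import permutations

def category_weights(location_categories):
    """Return {(c, c2): weight} for every pair of co-occurring categories.

    location_categories: {location: set of category names}.
    ASSUMPTION: weight(c, c2) = |P(c, c2)| / max over c3 of |P(c, c3)|.
    """
    co_count = Counter()
    for cats in location_categories.values():
        for c, c2 in permutations(sorted(cats), 2):
            co_count[(c, c2)] += 1            # |P(c, c2)|

    max_co = Counter()
    for (c, _), n in co_count.items():
        max_co[c] = max(max_co[c], n)         # Max|P_c|

    return {pair: n / max_co[pair[0]] for pair, n in co_count.items()}
```

Because the normalizer depends on the first category, the weight is directional under this reading: a category that co-occurs with many others can weigh a given partner lower than that partner weighs it.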

S33: defining and quantifying the user-location edge weight through user preference. In a location-based social network, a user's rating of a location directly reflects how much the user prefers that location. For example, suppose user u₁ has rated three locations p₁, p₂ and p₃ with ratings of 5, 3 and 1 respectively. If the ratings are not taken into account, each user-location edge is assigned a weight of 1/3, which is inaccurate: a rating of 1 for p₃ indicates that u₁ is dissatisfied with that place, so the edge weight of u₁-p₁ should be increased and the edge weight of u₁-p₃ decreased. As this example shows, locations the user prefers more should be given higher weights; in the present invention, the user-location edge weight is corrected through an exponential function:

wherein r(u, p) is the rating given by user u to location p.
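The rating example can be made concrete with a small Python sketch. One plausible reading of "corrected through an exponential function" is a softmax-style normalization, w(u, p) = exp(r(u, p)) / Σ exp(r(u, p′)); this specific form is an assumption and is not stated in the text, and the function name `user_location_weights` is hypothetical.

```python
import math

def user_location_weights(ratings):
    """Turn one user's ratings into edge weights that sum to 1.

    ratings: {location: rating r(u, p)} for a single user u.
    ASSUMPTION: softmax-style exponential weighting.
    """
    exps = {p: math.exp(r) for p, r in ratings.items()}
    total = sum(exps.values())
    return {p: e / total for p, e in exps.items()}
```

For the ratings 5, 3 and 1 from the example, this moves the weights away from the uniform 1/3: the edge to p₁ takes most of the weight and the edge to p₃ almost none, which matches the direction of the correction described above.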

S34: the remaining edge weights are defined and quantized by node out-degree.

S4: LBSN hyper-network link prediction method based on space-time relationship. Through the process of S1-S3, a weighted hyper-network model is constructed, and a weighted hyper-edge structure is constructed for link prediction based on the model.

S41: defining the super edge and the super-edge weight. The LBSN hyper-network model contains multiple types of super edges; for example, an edge formed between a user node and a location node is a super edge, and an edge formed between a user node and a spatio-temporal node is also a super edge. Because different super edges contain different numbers of heterogeneous node types, three types of super edges are defined.

Class I super edge SE_I. A class I super edge is a super edge containing only one type of node; it is a special kind of super edge in a hyper-network. For example, a super edge formed by two user nodes is a class I super edge. A class I super edge indicates the association between nodes within the same subnet layer; in the social subnet, for instance, it is the friend relationship between users.

Class II super edge SE_II. A class II super edge is an edge formed by a node pair between two adjacent subnet layers and is characterized by containing exactly two kinds of heterogeneous nodes. For example, a super edge formed between a user node and a location node, or between a user node and a spatio-temporal node, is a class II super edge.

Class III super edge SE_III. A class III super edge is an edge formed across three adjacent subnet layers and is characterized by containing exactly three kinds of heterogeneous nodes. For example, a super edge formed by a user node, a location node and a category node is a class III super edge.

FIG. 3 shows two adjacent subnet layers, in which (T₁-T₂) forms a class I super edge, denoted SE_I(T₁-T₂); (U₁-T₁) forms a class II super edge, denoted SE_II(U₁-T₁); and (U₃-T₁) also forms a class II super edge, denoted SE_II(U₃-T₁). FIG. 4 shows three adjacent subnet layers, in which (U₁-P₁-C₁) forms a class III super edge, denoted SE_III(U₁-P₁-C₁), and (U₃-P₃-C₁) forms a class III super edge, denoted SE_III(U₃-P₃-C₁).

Super-edge weight. The super-edge weight is the weight of each super edge and can be calculated from the edge weights contained in that super edge. For example, the super-edge weight of the class II super edge SE_II(U₁-T₁) in FIG. 3 is:

The super-edge weight of a class III super edge in FIG. 4 is:

S41: and (4) hyperlink prediction. Based on the defined three types of hyper-edges, a weighted hyper-edge structure is provided, and the hyperlink prediction problem between users is solved through the weighted hyper-edge structure. In the conventional method, the association degree between nodes is mainly calculated by weighting a hyper-triangle structure, and the main idea is to associate two hyper-edges through co-occurrence nodes between different hyper-edges, so that the hyper-triangle structure is obtained and used for measuring the similarity between the nodes. The method is suitable for the heterogeneous network, can simply and efficiently capture the additional association between two nodes, and improves the prediction accuracy while relieving the data sparsity problem. However, the super network can describe not only the association between the homogeneous nodes but also the association between the heterogeneous nodes, so that the deeper the considered network hierarchy is, the longer the association chain is, the more the fine-grained implicit association between the nodes can be reflected. The invention excavates the implicit semantic relation between nodes by constructing various types of super-edge structures.

S411: Weighted hyper-triangle structures. These comprise the single weighted hyper-triangle structure, the double weighted hyper-triangle structure and the large weighted hyper-triangle structure, defined as follows:

Single weighted hyper-triangle structure. In FIG. 3, the space-time node T₁ and the user nodes U₁ and U₃ form a single weighted hyper-triangle structure, which can be used to calculate the similarity between U₁ and U₃. Its semantic information is that the two users like to be active at the same time and in the same place. The more single weighted hyper-triangle structures contain both U₁ and U₃, and the larger their weights, the greater the similarity between the two users and the more likely a link is to form. The structure comprises two class-II super-edges, SE_II(U₁-T₁) and SE_II(T₁-U₃), and its weight, W_S3(U₁ΔU₃), is the product of the weights of these two super-edges.

It should be emphasized that the super-edge structures defined by the invention are all closed-loop structures and are directional. Therefore W_S3(U₁ΔU₃) ≠ W_S3(U₃ΔU₁); the same applies below.
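The single weighted hyper-triangle weight and its directionality can be illustrated with a minimal Python sketch. The dictionary of directed class-II super-edge weights, its values and the function name are hypothetical, and summing over all shared space-time nodes is an assumption made here for illustration; the invention itself derives each super-edge weight from the edge weights it contains.

```python
# Hypothetical directed class-II super-edge weights, keyed by (source, target).
# The numbers are invented for illustration only.
SE2_WEIGHT = {
    ("U1", "T1"): 0.5,
    ("T1", "U3"): 0.4,
    ("U3", "T1"): 0.8,
    ("T1", "U1"): 0.5,
}


def single_triangle_weight(u, v, spacetime_nodes, weights):
    """Weight of the single weighted hyper-triangles leading from u to v.

    Each triangle u -> t -> v contributes the product of its two class-II
    super-edge weights. Summing over every shared space-time node t is an
    assumption of this sketch.
    """
    total = 0.0
    for t in spacetime_nodes:
        if (u, t) in weights and (t, v) in weights:
            total += weights[(u, t)] * weights[(t, v)]
    return total


forward = single_triangle_weight("U1", "U3", ["T1"], SE2_WEIGHT)
backward = single_triangle_weight("U3", "U1", ["T1"], SE2_WEIGHT)
# The two directions differ because the super-edge weights are directed.
print(forward, backward)
```

Keying the weights by ordered pairs is one simple way to keep the structure directional: reversing the traversal picks up a different pair of super-edge weights.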

Double weighted hyper-triangle structure. A double weighted hyper-triangle structure consists of two consecutive weighted hyper-triangle structures. For example, in FIG. 3, SE_II(U₁-T₁) and SE_II(T₁-U₂) form one weighted hyper-triangle structure, and SE_II(U₂-T₂) and SE_II(T₂-U₃) form another; the two can be combined into a double weighted hyper-triangle structure that measures the similarity between U₁ and U₃. Its semantic information is that users U₁ and U₃ each like to be active at the same time and place as user U₂. The weight of the double weighted hyper-triangle structure is the product of the weights of the two corresponding single weighted hyper-triangle structures:

W_S6(U₁ΔΔU₃)＝W_S3(U₁ΔU₂)·W_S3(U₂ΔU₃) (14)
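Equation (14) can be checked with a short Python sketch. The super-edge weights below are hypothetical values, and the function names `w_s3` and `w_s6` are chosen here only to mirror the notation of the equation.

```python
# Hypothetical directed class-II super-edge weights along
# U1 -> T1 -> U2 -> T2 -> U3 (values invented for illustration).
SE2_WEIGHT = {
    ("U1", "T1"): 0.5,
    ("T1", "U2"): 0.5,
    ("U2", "T2"): 0.5,
    ("T2", "U3"): 0.25,
}


def w_s3(u, t, v, weights):
    """Single weighted hyper-triangle u -> t -> v: product of its two super-edge weights."""
    return weights[(u, t)] * weights[(t, v)]


def w_s6(u, t_a, mid, t_b, v, weights):
    """Double weighted hyper-triangle, equation (14): W_S3(u, mid) * W_S3(mid, v)."""
    return w_s3(u, t_a, mid, weights) * w_s3(mid, t_b, v, weights)


double_weight = w_s6("U1", "T1", "U2", "T2", "U3", SE2_WEIGHT)
print(double_weight)
```

When every super-edge weight is at most 1, the longer chain yields a smaller product than either of its triangles, so an indirect association through U₂ contributes less than a direct one.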

Large weighted hyper-triangle structure. The large weighted hyper-triangle structure is a triangle structure consisting of two class-III super-edges. For example, in FIG. 4, the super-edge SE_III(U_i-P_j-C_k) and a second class-III super-edge that reaches the same category node C_k from a different user form a large weighted hyper-triangle structure, whose semantic information is that the two users prefer the same category. Its weight is the product of the weights of the two class-III super-edges.

S412: Weighted hyper-rectangle structure. In FIG. 3, the super-edges SE_II(U₁-T₁), SE_I(T₁-T₂) and SE_II(T₂-U₃) form a weighted hyper-rectangle structure. Because it contains both U₁ and U₃, it can be used to measure the similarity between them. Its semantic information is that users U₁ and U₃ like to be active at two related space-time nodes, and its weight is the product of the weights of the corresponding super-edges.

S413: Weighted super-hybrid structures. These comprise the weighted super-hybrid I structure and the weighted super-hybrid II structure, defined as follows:

Weighted super-hybrid I structure: a hybrid I structure is formed by adding one class-I super-edge to a single weighted hyper-triangle structure. For example, in FIG. 1, the structure formed by the super-edges SE_II(U₁-T₁), SE_II(T₁-U₂) and SE_I(U₂-U₃) is a hybrid I structure. Its semantic information is that U₂, a friend of U₃, likes to be active at the same place and time as U₁. Its weight is the product of the weight of the corresponding single weighted hyper-triangle structure and the weight of the class-I super-edge.

Weighted super-hybrid II structure: a hybrid II structure is formed by adding one class-I super-edge to a weighted hyper-rectangle structure. For example, in FIG. 1, the structure formed by the super-edges SE_II(U₁-T₁), SE_I(T₁-T₂), SE_II(U₂-T₂) and SE_I(U₂-U₃) is a hybrid II structure. Its weight is the product of the weight of the corresponding weighted hyper-rectangle structure and the weight of the class-I super-edge.
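Since the hyper-rectangle and both hybrid structures are described as products of the super-edge weights along a chain, one generic helper can cover them. The following Python sketch assumes a hypothetical table of directed super-edge weights with invented values; `structure_weight` is an illustrative name, not one used by the invention.

```python
from math import prod

# Hypothetical directed super-edge weights keyed by (source, target).
# T-T and U-U pairs stand for class-I super-edges, U-T pairs for class-II.
WEIGHT = {
    ("U1", "T1"): 0.5,
    ("T1", "U2"): 0.5,
    ("T1", "T2"): 0.5,
    ("T2", "U3"): 0.25,
    ("T2", "U2"): 0.5,
    ("U2", "U3"): 0.5,
}


def structure_weight(path, weights):
    """Product of the super-edge weights along consecutive nodes of `path`."""
    return prod(weights[(a, b)] for a, b in zip(path, path[1:]))


# Hyper-rectangle: SE_II(U1-T1), SE_I(T1-T2), SE_II(T2-U3)
rectangle = structure_weight(["U1", "T1", "T2", "U3"], WEIGHT)
# Hybrid I: single hyper-triangle U1-T1-U2 plus class-I super-edge U2-U3
hybrid_1 = structure_weight(["U1", "T1", "U2", "U3"], WEIGHT)
# Hybrid II: hyper-rectangle U1-T1-T2-U2 plus class-I super-edge U2-U3
hybrid_2 = structure_weight(["U1", "T1", "T2", "U2", "U3"], WEIGHT)
print(rectangle, hybrid_1, hybrid_2)
```

Describing each structure by its node path keeps the direction of traversal explicit, which matters because the structures are directional.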

It can be seen that the deeper the hierarchy, the longer the association chain and the richer the super-edge structures. The invention lists 19 effective weighted super-edge structures, as shown in Table 1.

The above analysis shows that different weighted super-edge structures carry different semantic information. For example, the S2 structure embodies the idea of location entropy: if two users have both checked in at a place that many people visit, it is hard to infer a friend relationship between them, because the co-occurrence may be a coincidence; but if two users often check in at a place that few people visit, some relationship probably exists between them. The popularity of a location therefore affects link prediction, and the S2 structure captures this effect. The S3 structure mines a user's short-term interests, understood here as interests that appear only during a certain period of time, such as going to the cinema at 7 p.m. every Friday. Such an interest occurs only in a specific time period, yet it reflects the user's individual character well.
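The location entropy idea can be made concrete with a small Python sketch. The text gives no formula for location entropy, so the sketch assumes the commonly used Shannon entropy of the share of check-ins that each visitor contributes at a place; the check-in lists are hypothetical.

```python
import math
from collections import Counter


def location_entropy(visitors):
    """Shannon entropy of the visitor distribution at one location.

    `visitors` lists the user of every check-in recorded at that location.
    """
    counts = Counter(visitors)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical check-in logs: a busy square versus a small cafe.
busy_square = ["U1", "U2", "U3", "U4", "U5", "U6", "U7", "U8"]
small_cafe = ["U1", "U3", "U1", "U3"]

print(location_entropy(busy_square), location_entropy(small_cafe))
```

Under this assumed definition the busy square has the higher entropy, so a co-check-in there is weaker evidence of a relationship between U₁ and U₃ than a co-check-in at the cafe.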

Since different structures influence link prediction to different degrees, the similarity between two users u and v is expressed as:

S(u,v)＝θ₁W_S1(u,v)+θ₂W_S2(u,v)+......+θ₁₉W_S19(u,v) (19)

where θ_i is the weight of the i-th weighted super-edge structure and is obtained by training with the gradient descent method. The parameter update process is as follows:

where λ is the learning step size and y indicates whether a link exists between the users. When the change in every parameter falls below a given threshold, the parameter updates have converged and the optimal parameter set θ⁺ is obtained. Finally, θ⁺ is used to predict the link relationships between users: when y is 1, a link between the users is considered to exist; otherwise it is considered not to exist. The defining formula is as follows:
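The scoring of equation (19), the iterative update and the final decision can be sketched in Python as follows. The update and decision formulas are not reproduced in this text, so the sketch assumes a logistic (sigmoid) model trained with the standard log-loss gradient; the constant bias feature, the 0.5 decision threshold, the toy data and all function names are likewise assumptions and not the definitions of the invention. Two structure weights plus the bias feature are used instead of 19 to keep the example short.

```python
import math


def similarity(theta, weights):
    """Equation (19): weighted sum of the structure weights of one user pair."""
    return sum(t * w for t, w in zip(theta, weights))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, step=0.5, tol=1e-6, max_iter=10000):
    """Gradient descent on the log-loss (an assumed objective).

    Stops when every parameter changes by less than `tol`, mirroring the
    threshold criterion in the text, or after `max_iter` iterations.
    """
    n = len(samples[0])
    theta = [0.0] * n
    for _ in range(max_iter):
        grad = [0.0] * n
        for x, y in zip(samples, labels):
            err = y - sigmoid(similarity(theta, x))
            for i in range(n):
                grad[i] += err * x[i]
        delta = [step * g / len(samples) for g in grad]
        theta = [t + d for t, d in zip(theta, delta)]
        if max(abs(d) for d in delta) < tol:
            break
    return theta


def predict(theta, x):
    """Return 1 (link exists) when the assumed link probability reaches 0.5."""
    return 1 if sigmoid(similarity(theta, x)) >= 0.5 else 0


# Toy data: [bias feature, W_S1(u,v), W_S2(u,v)] per user pair (hypothetical values).
samples = [[1.0, 0.9, 0.8], [1.0, 0.8, 0.6], [1.0, 0.1, 0.2], [1.0, 0.2, 0.1]]
labels = [1, 1, 0, 0]

theta = train(samples, labels)
print(theta, [predict(theta, x) for x in samples])
```

The `step` argument plays the role of the learning step size λ and `labels` the role of y; any other differentiable objective could be substituted without changing the convergence test.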

The invention integrates the time factor into the hyper-network model by introducing space-time nodes, then constructs a four-layer weighted hyper-network model based on user influence, hidden association relationships, user preference and node degree information, which improves the interpretability of the model. Finally, it mines the semantic relationships between users through multiple weighted super-edge structures, which alleviates the data sparsity problem and improves prediction accuracy. It should be emphasized that the invention is an effective method for link prediction in weighted networks.

Finally, it is noted that the above-mentioned preferred embodiments illustrate rather than limit the invention, and that, although the invention has been described in detail with reference to the above-mentioned preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention as defined by the appended claims.

Claims

1. An LBSN hyper-network link prediction method based on a space-time relationship, characterized in that the method comprises the following steps:

S4: through steps S1-S3, a weighted hyper-network model is constructed; based on the model, multiple types of weighted super-edge structures are first constructed, different semantic relationships among users are mined through the different structures, and finally the model parameters are trained by a gradient descent method and the link relationships in the network are predicted; the constructed weighted hyper-network model is divided into a space-time layer, a user layer, a position layer and a category layer.

2. The LBSN hyper-network link prediction method based on the spatiotemporal relationship as claimed in claim 1, wherein: the step S2 specifically includes:

3. The LBSN hyper-network link prediction method based on the spatiotemporal relationship as claimed in claim 1, wherein: the step S3 specifically includes:

S311: individual influence of the user I_u: used to measure the influence that the user's own behavior exerts on other users in the network; by dividing time into slices, the method takes into account different time periods