CN111738447A - Mobile social network user relationship inference method based on spatio-temporal relationship learning - Google Patents
- Publication number
- CN111738447A (application CN202010572405.8A)
- Authority
- CN
- China
- Prior art keywords
- user
- users
- graph
- social
- relationship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/041—Abduction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention provides a method for inferring user relationships in a mobile social network based on spatio-temporal relationship learning, which considers both the mobility and the sociability of individuals. Because social network structure is effective for predicting social connections, the model first constructs a preliminary social graph based on user mobility, then extracts the social network structure features of user pairs from that preliminary graph, and finally infers friendships by integrating the mobility and sociability features. Once trained, the model transfers well to different scenarios for predicting friendships between users. Experiments on two real-world data sets show that our method consistently outperforms existing methods. Furthermore, our model remains effective for relationships with little check-in data and no meeting events.
Description
Technical Field
The invention relates to a method for inferring user relationships in a mobile social network based on spatio-temporal relationship learning, which makes use of users' mobility information.
Background
In recent years, with the popularization of mobile social networking applications such as Facebook, Twitter, and Weibo, users can promptly publish the places of interest they are visiting (an internet-famous restaurant, a tourist attraction, etc.) to share with friends. Although this form of socializing makes it much easier to make friends, it also risks revealing users' social relationships. Users of mobile social networks are becoming aware of this; for example, a large-scale study of Facebook users shows that the proportion of users hiding their friend lists rose from 17.2% in 2010 to 56.2% in 2011. However, few users know that their friends can be inferred from check-in records with spatio-temporal relationships, accurately revealing the hidden social relationships between users.
Existing social relationship inference methods based on spatio-temporal relationships fall into two main categories. The first category consists of heuristic methods based on feature selection: effective features such as the number of meetings, the number of shared locations, and the popularity of the shared locations are observed and selected as the criteria for judging whether users have a friendship. These methods, however, impose many assumptions and restrictions on the required movement data, which greatly reduces their applicability. For example, almost all existing effective methods can only be used if two individuals share a location. The second category consists of methods based on feature learning: the movement features of users are vectorized by machine learning, and the similarity between vectors is used as the criterion for inferring whether a user pair has a friendship. However, methods of this type model each individual separately and lose the time information in the mobility data, so they cannot directly produce a more accurate relationship inference.
Disclosure of Invention
The invention aims to provide a model capable of accurately deducing the social relationship of a user by utilizing the movement information of the user.
In order to achieve the above object, the technical solution of the present invention is to provide a method for inferring a relationship of a mobile social network user based on spatiotemporal relationship learning, which is characterized by comprising the following steps:
Step 101: all points of interest (POIs) checked in by all users in the mobility data are divided into an I × J grid by longitude and latitude, and time is divided into M time segments, yielding an I × J × M spatio-temporal matrix STD. In the time dimension, time is divided into equal-length segments of length τ; in the space dimension, space is first divided uniformly into equal-sized grid cells, and each cell is then recursively divided into four equal cells until the number of POIs in each cell is smaller than a threshold σ;
Step 102: the trajectories of each user pair $(u_a, u_b)$ are projected into the spatio-temporal matrix STD, so that each check-in of a user falls into a particular cell. For each cell, the following are calculated: the number $n_a$ of POIs visited by user $u_a$ in this time period; the number $n_b$ of POIs visited by user $u_b$; and the number $n_{a,b}$ of POIs visited by both user $u_a$ and user $u_b$. This yields the spatio-temporal matrix $O^{(a,b)}$ of the user pair $(u_a, u_b)$, in which the triple $(n_a, n_b, n_{a,b})$ at position $(i, j, m)$ represents the statistics of the movement information of users $u_a$ and $u_b$ in the location cell of the i-th row and j-th column during the m-th time period;
Step 103: the spatio-temporal matrix $O^{(a,b)}$ of each user pair $(u_a, u_b)$ is encoded into a low-dimensional vector, and the low-dimensional vector is used to calculate the probability that user $u_a$ and user $u_b$ have a friendship, yielding an initial social relationship graph $G = (U, E)$, where U is the set of vertices in the graph and represents all users with movement information, and E is the set of edges, each representing a friendship between two users; the size of the spatio-temporal matrix $O^{(a,b)}$ is adjusted by the parameters σ and τ;
Step 202: find all paths of length 2 between $u_a$ and $u_b$ in the initial social relationship graph G and record all the paths found; then delete from G all vertices and edges that occur in the recorded paths, except for user $u_a$ and user $u_b$ themselves;
Step 203: increase the path length step by step and repeat step 202 until the path length exceeds k;
Step 4: a classifier performs 0/1 classification on the comprehensive feature vector of each user pair, where 1 means friends and 0 means not friends, yielding the latest predicted social graph;
Step 5: the structural features of the user pairs are updated using the latest social graph and the prediction is repeated until the prediction result no longer changes, which gives the final predicted social network graph.
Preferably, in step 103, the spatio-temporal matrix $O^{(a,b)}$ is input to an autoencoder whose encoder, with R hidden layers, encodes it into a d-dimensional vector; from this vector a reconstructed spatio-temporal matrix is obtained that approximates the encoder input $O^{(a,b)}$. The optimization goal of the training process is as follows:
In the formula, the loss is that of the autoencoder network within the hybrid network; minimizing it makes the reconstructed spatio-temporal matrix after decoding as close as possible to the spatio-temporal matrix $O^{(a,b)}$ before encoding;
The autoencoder is trained with supervision so that the encoding is both reconstructive and discriminative; that is, a classification network is added to the autoencoder to supervise its encoding process. The loss function of this process is as follows:
In the formula, the prediction result is the output of the classification network; y represents the sample label; N represents the number of training samples, i.e. the number of user pairs in the training data set.
To obtain a more discriminative vector representation, the following constraint is placed on the integrated hybrid network:
In the formula, the loss is the overall loss function of the entire hybrid network;
Once training is complete, the encoder is taken out of the autoencoder network and used to encode the spatio-temporal relationship matrix of any user pair in the user set and to perform preliminary relationship inference.
The invention provides a novel social relationship inference model that considers both the mobility and the sociability of individuals. Because social network structure is effective for predicting social connections, the model first constructs a preliminary social graph based on user mobility, then extracts the social network structure features of user pairs from that preliminary graph, and finally infers friendships by integrating the mobility and sociability features. Once trained, the model transfers well to different scenarios for predicting friendships between users. Experiments on two real-world data sets show that our method consistently outperforms existing methods. Furthermore, our model remains effective for relationships with little check-in data and no meeting events.
Drawings
FIG. 1 is a diagram mapping friendships to users' mobility relationships;
FIG. 2 is a system diagram of the social relationship inference model;
FIG. 3 is a schematic diagram of the interactive behavior feature extraction process;
FIG. 4 is a schematic diagram of the k-reachable subgraph encoding process;
FIG. 5(a) is a graph of the effect of spatial granularity on friendship inference accuracy in the Brightkite dataset;
FIG. 5(b) is a graph of the effect of spatial granularity on friendship inference accuracy in the Gowalla dataset;
FIG. 6(a) is a graph of the effect of time granularity on friendship inference accuracy in the Brightkite dataset;
FIG. 6(b) is a graph of the effect of time granularity on friendship inference accuracy in the Gowalla dataset;
FIG. 7(a) is a graph of the effect of feature vector dimension on friendship inference accuracy in the Brightkite dataset;
FIG. 7(b) is a graph of the effect of feature vector dimension on friendship inference accuracy in the Gowalla dataset;
FIG. 8(a) is a graph of the effect of the number of iterations on friendship inference accuracy in the Brightkite dataset;
FIG. 8(b) is a graph of the effect of the number of iterations on friendship inference accuracy in the Gowalla dataset;
FIG. 9(a) is a graph of the effect of the number of user check-ins on friendship inference accuracy in the Brightkite dataset;
FIG. 9(b) is a graph of the effect of the number of user check-ins on friendship inference accuracy in the Gowalla dataset;
FIG. 10(a) is a graph of the effect of the number of locations shared by a user pair on friendship inference accuracy in the Brightkite dataset;
FIG. 10(b) is a graph of the effect of the number of locations shared by a user pair on friendship inference accuracy in the Gowalla dataset.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The invention infers whether users have a friendship by using the similarity of the movement trajectories between users and the propagation of relationships through the social network; FIG. 1 is an abstraction of the real problem. The inference model is divided into two phases: 1. Constructing an initial social network graph: the initial graph is built on the observation that the movement trajectories of friends are generally more similar. 2. Obtaining hidden social relationships: based on the phenomenon that the network topologies of friends are more similar, hidden friendships between users with similar preferences but without similar movement trajectories are mined.
FIG. 2 shows a specific implementation of the spatio-temporal relationship learning-based mobile social network user relationship inference method provided by the present invention, which mainly includes the following two stages:
Stage one: building the preliminary social graph
In this stage, the interactive behavior features between user pairs are extracted and used to infer whether two users have a friendship. Generally, interactive behavior features can be characterized by statistical attributes of the interactions between a pair of users, such as the number of meetings or the number of co-visited locations. However, friends may have no co-visit or meeting events at all, and not all co-visits are equally important for inferring friendship. Therefore, we propose a more comprehensive approach to learning complex interactive behavior features. We embed both the joint behavior and the individual behavior of the two users into the interactive behavior feature. In addition, we take into account that different places and time periods differ in predictive importance, and design a hybrid model that produces the compressed interactive behavior feature vector and the initial classification result at the same time, where the spatial extent of each location cell and the length of each time interval are parameters of the model. This ensures that the compressed interactive behavior features retain as much of the users' original behavior as possible while remaining strongly indicative in the friendship inference task. The first stage comprises the following steps:
First, all points of interest (POIs) checked in by all users in the mobility data are divided into an I × J grid by longitude and latitude, and time is divided into M time segments, yielding an I × J × M spatio-temporal matrix (STD). The time dimension is divided into time slices of equal length τ. The space dimension is first divided uniformly into equal-sized grid cells; because the density of POIs varies widely across a geographic region, each cell is then recursively divided into four equal cells until the number of POIs in each cell is less than a threshold σ. The trajectories of each user pair $(u_a, u_b)$ are then projected into the STD, so that each check-in of a user falls into a specific cell. For each cell, three indicators are calculated: the number $n_a$ of POIs visited by user $u_a$ in the time period, the number $n_b$ of POIs visited by user $u_b$, and the number $n_{a,b}$ of POIs visited by both user $u_a$ and user $u_b$. This yields the spatio-temporal matrix $O^{(a,b)}$ of the user pair $(u_a, u_b)$, in which the triple $(n_a, n_b, n_{a,b})$ at position $(i, j, m)$ represents the statistics of the movement information of users $u_a$ and $u_b$ in the location cell of the i-th row and j-th column during the m-th time period.
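The per-cell triples described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses a uniform grid only (the recursive four-way split is omitted), counts distinct POIs per cell, stores the matrix sparsely as a dictionary, and the names `cell_index`, `pair_matrix`, `lat0`, `lon0`, `cell_deg` and `t0` are all hypothetical.

```python
from collections import defaultdict


def cell_index(lat, lon, t, lat0, lon0, cell_deg, t0, tau):
    """Map one check-in to (row i, column j, time slice m) on a uniform grid."""
    i = int((lat - lat0) // cell_deg)
    j = int((lon - lon0) // cell_deg)
    m = int((t - t0) // tau)
    return i, j, m


def pair_matrix(checkins_a, checkins_b, **grid):
    """Return {(i, j, m): (n_a, n_b, n_ab)} for one user pair.

    Each check-in is a tuple (poi_id, lat, lon, timestamp). Cells that
    neither user visited are simply absent (they would hold (0, 0, 0)).
    """
    pois_a = defaultdict(set)
    pois_b = defaultdict(set)
    for poi, lat, lon, t in checkins_a:
        pois_a[cell_index(lat, lon, t, **grid)].add(poi)
    for poi, lat, lon, t in checkins_b:
        pois_b[cell_index(lat, lon, t, **grid)].add(poi)

    matrix = {}
    for cell in set(pois_a) | set(pois_b):
        a = pois_a.get(cell, set())
        b = pois_b.get(cell, set())
        # n_a, n_b: distinct POIs each user visited; n_ab: POIs both visited.
        matrix[cell] = (len(a), len(b), len(a & b))
    return matrix


# Toy example: 1-degree cells, time slices of 7 time units.
grid = dict(lat0=0.0, lon0=0.0, cell_deg=1.0, t0=0, tau=7)
user_a = [("p1", 0.5, 0.5, 1), ("p2", 0.6, 0.4, 2), ("p3", 1.5, 0.5, 8)]
user_b = [("p1", 0.5, 0.5, 3), ("p4", 0.2, 0.2, 20)]
result = pair_matrix(user_a, user_b, **grid)
```

In this toy example the cell (0, 0, 0) holds the triple (2, 1, 1): user a visited two POIs there, user b one, and one POI was visited by both.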
This step encodes the spatio-temporal matrix $O^{(a,b)}$ of each user pair $(u_a, u_b)$ into a low-dimensional vector and uses that vector to calculate the probability that user $u_a$ and user $u_b$ have a friendship; the size of the spatio-temporal matrix can be adjusted through the parameters σ and τ.
We input the spatio-temporal matrix into an autoencoder; the encoder, with R hidden layers, encodes it into a d-dimensional vector, namely the output of the encoder's R-th layer. The decoder, also with R hidden layers, outputs a reconstructed spatio-temporal matrix from the d-dimensional vector, which should approximate the encoder input $O^{(a,b)}$. The optimization goal of the training process is as follows:
In the formula, the loss is that of the autoencoder network within the hybrid network; minimizing it makes the reconstructed spatio-temporal matrix after decoding as close as possible to the spatio-temporal matrix $O^{(a,b)}$ before encoding. U represents all users in the training sample.
The training of an autoencoder is unsupervised, which means that it only learns the structure of the input and performs no feature selection or weighting, so the resulting d-dimensional feature vector merely retains the users' original movement information and cannot guarantee usefulness in the friendship inference task. To avoid this problem, we use supervised training so that the encoding is both reconstructive and discriminative; that is, we add a classification network to the autoencoder (its output represents the probability that the user pair $(u_a, u_b)$ has a friendship) to supervise the encoding process. The loss function of this process is as follows:
In the formula, the prediction result is the output of the classification network; y represents the sample label; N represents the number of training samples, i.e. the number of user pairs in the training data set.
To obtain a more discriminative vector representation, we place the following constraint on the integrated hybrid network:
In the formula, the loss is the composite loss function of the hybrid network.
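The three formulas referred to in this stage were images in the original publication and are not present in this text. As a reading aid only, one set of forms consistent with the surrounding definitions would be a squared reconstruction error, a binary cross-entropy over the N training pairs, and a weighted sum of the two. Every symbol below ($\mathcal{L}_{ae}$, $\mathcal{L}_{c}$, $\mathcal{L}$, $\hat{O}^{(a,b)}$, $\hat{y}_n$, and the weight $\alpha$) is an assumption and may differ from the patent's own notation and exact form.

```latex
% Assumed reconstruction loss of the autoencoder over all training user pairs
\mathcal{L}_{ae} = \sum_{u_a, u_b \in U} \left\lVert \hat{O}^{(a,b)} - O^{(a,b)} \right\rVert_2^2

% Assumed classification loss (binary cross-entropy) over N labelled pairs
\mathcal{L}_{c} = -\frac{1}{N} \sum_{n=1}^{N} \left[ y_n \log \hat{y}_n + (1 - y_n) \log\left(1 - \hat{y}_n\right) \right]

% Assumed composite loss of the hybrid network, with a hypothetical weight \alpha
\mathcal{L} = \mathcal{L}_{ae} + \alpha \, \mathcal{L}_{c}
```

Minimizing the first term alone gives a purely unsupervised encoding; the second term is what pushes the d-dimensional vector to be useful for friendship classification.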
Once training is complete, the encoder is taken out of the autoencoder network and used to encode the spatio-temporal relationship matrix of any user pair in the user set and to perform preliminary relationship inference.
Stage two: obtaining hidden social relationships
In the second stage, we consider the features of the indirect links between user pairs, i.e. the features of the network structure. Since the mobility data contain no social topology graph, we extract structural features from the first-stage prediction (a semi-true social graph). Because simple heuristic structural features are not universally applicable, a new k-reachable subgraph is proposed to describe the network structure between target users, and this subgraph is encoded to obtain a structural feature vector. Finally, the encoded structure vector is combined with the corresponding interactive behavior features to improve inference accuracy. In addition, an iterative process is designed that refines the structural feature vector and the final social relationship graph at the same time.
The second stage comprises the following steps:
Katz is a common method for measuring network proximity between users; it takes a global view of a user pair in the social graph, i.e. it considers all possible paths between the pair. However, we consider this unreasonable, because longer paths are not only computationally expensive but also harm relationship inference. Therefore, we describe the network structure between users from a local view of network proximity, i.e. by extracting a k-reachable subgraph for each user pair. For a given initial social relationship graph $G = (U, E)$, the k-reachable subgraph of a user pair $(u_a, u_b)$ is extracted by the following steps:
Step 302: find all paths of length 2 between $u_a$ and $u_b$ in the initial social relationship graph G constructed in the first stage and record all the paths found; then delete from G all vertices and edges that occur in the recorded paths (except user $u_a$ and user $u_b$ themselves).
Step 303: increase the path length step by step and repeat step 302 until the path length exceeds k.
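The extraction steps above can be sketched as a short Python routine. This is an illustrative sketch under assumptions: the graph is an undirected edge list, paths are simple (no repeated vertex), the search starts at length 2 as in the step shown in the text, and the names `simple_paths` and `k_reachable` are hypothetical.

```python
def simple_paths(adj, src, dst, length):
    """All simple paths from src to dst that use exactly `length` edges."""
    found = []

    def walk(path):
        node = path[-1]
        if len(path) - 1 == length:
            if node == dst:
                found.append(tuple(path))
            return
        for nxt in sorted(adj.get(node, ())):
            # dst may only be entered on the final step of the path.
            is_final_step = len(path) == length
            if nxt not in path and (nxt != dst or is_final_step):
                walk(path + [nxt])

    walk([src])
    return found


def k_reachable(edges, ua, ub, k):
    """Return {length: [paths]} for lengths 2..k, peeling used vertices."""
    adj = {}
    for x, y in edges:
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

    paths_by_length = {}
    for length in range(2, k + 1):
        paths = simple_paths(adj, ua, ub, length)
        paths_by_length[length] = paths
        # Delete every vertex on the found paths except the two endpoints;
        # removing a vertex also removes its incident edges.
        used = {v for p in paths for v in p} - {ua, ub}
        for v in used:
            for nbr in adj.pop(v, set()):
                if nbr in adj:
                    adj[nbr].discard(v)
    return paths_by_length


edges = [("a", "x"), ("x", "b"), ("a", "y"), ("y", "b"),
         ("a", "p"), ("p", "q"), ("q", "b"), ("x", "q")]
subgraph = k_reachable(edges, "a", "b", 3)
```

In the toy graph, the vertices x and y are removed after the length-2 pass, so the length-3 pass finds only a-p-q-b and not a-x-q-b. That is the practical effect of the deletion step: each intermediate vertex contributes to the shortest path length on which it appears.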
Based on the initial social relationship graph G constructed in the first stage, we encode the k-reachable subgraph of each user pair $(u_a, u_b)$. Paths of the same length are encoded by accumulation, and the encodings for different path lengths are concatenated, which vectorizes the k-reachable subgraph of the user pair. FIG. 4 illustrates the k-reachable subgraph encoding process in detail.
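The text does not spell out the accumulation rule, so the following is only one simple reading of it: accumulate a per-path score over all paths of the same length, then concatenate the results for lengths 2 to k. The function name `encode_subgraph` and the default score of 1 per path (which reduces the encoding to path counts) are assumptions made for illustration.

```python
def encode_subgraph(paths_by_length, k, score=lambda path: 1.0):
    """Concatenate one accumulated value per path length 2..k.

    `paths_by_length` maps a path length to the list of paths of that
    length; `score` is a hypothetical per-path weight (default: 1 per path).
    """
    return [
        sum((score(p) for p in paths_by_length.get(length, [])), 0.0)
        for length in range(2, k + 1)
    ]


paths = {
    2: [("a", "x", "b"), ("a", "y", "b")],
    3: [("a", "p", "q", "b")],
}
vector = encode_subgraph(paths, k=4)
```

With the default score the vector for this example is `[2.0, 1.0, 0.0]`: two paths of length 2, one of length 3, none of length 4. A score based on the predicted edge probabilities along each path would be a natural alternative, but that is equally an assumption.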
A classifier then performs 0/1 classification (1 means friends, 0 means not friends) on the comprehensive feature vector of each user pair (the interactive behavior features obtained in the first stage together with the local structure feature vector obtained in the second stage), yielding the latest predicted social graph.
The structural features of the user pairs are then updated using the latest social graph and the prediction is repeated until the prediction result no longer changes, which gives the final predicted social network graph.
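The refine-until-stable loop described above can be sketched as follows. This is a generic fixed-point loop, not the patent's code: `predict` is a hypothetical callback standing in for "update structural features, then classify", and `tol` and `max_rounds` are illustrative parameters (`tol=0.0` means the graph must stop changing entirely).

```python
def refine(initial_edges, predict, max_rounds=20, tol=0.0):
    """Re-predict the social graph until it stops changing.

    Returns (final_edges, rounds_used). The loop stops when the fraction of
    edges that changed between two consecutive rounds is at most `tol`.
    """
    edges = set(initial_edges)
    for rounds in range(1, max_rounds + 1):
        new_edges = set(predict(edges))
        changed = len(edges ^ new_edges)      # edges added or removed
        previous_size = max(len(edges), 1)    # avoid division by zero
        edges = new_edges
        if changed / previous_size <= tol:
            return edges, rounds
    return edges, max_rounds


def toy_predict(edges):
    """Toy rule: if a-b is predicted, also predict b-c."""
    return edges | {("b", "c")} if ("a", "b") in edges else edges


final_edges, rounds_used = refine({("a", "b")}, toy_predict)
```

In the toy run the first round adds one edge and the second round changes nothing, so the loop stops after two rounds.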
The verification process of the invention is as follows:
The experiments use the publicly available real-world data of Brightkite and Gowalla. Both are location-based social networks in which a user can publish his or her own time and location information and share it with friends. Each data set consists of the users' check-in data and their social relationship graph; Table 1 shows the statistics of the experimental data sets.
TABLE 1 data set statistics
The experiment was compared with four popular inference methods based on distance traveled, number of shared locations, random walks and graph embedding.
Relationship inference based on the number of shared locations. This method uses the number of common places visited by a pair of users as the feature for inferring whether a friendship exists between them. The experiment uses an SVM classifier for supervised learning.
Relationship inference based on movement distance. This method calculates each user's center of activity from all of that user's movement information and uses the Euclidean distance between the centers of each user pair as the feature for inferring whether the users have a friendship. The experiment uses an SVM classifier for supervised learning.
Relationship inference based on random walks. This method maps the users' mobility data into a user-location bipartite graph, obtains the neighbors of each user by walking the bipartite graph according to a certain rule, compresses the neighbor information into vectors, and finally infers friendships by comparing the similarity of the feature vectors of each user pair.
Relationship inference based on graph embedding. This method constructs a meeting graph from the users' movement information, in which two users are connected by an edge when they co-occur (appear at the same location within a certain time interval), obtains a certain number of neighbors for each user by walking the meeting graph, encodes these neighbor nodes into vectors, and finally infers friendships by comparing the similarity of the feature vectors of each user pair.
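The two heuristic baseline features above are simple enough to sketch directly. The sketch assumes that locations are identified by POI ids for the shared-location count, and that latitude and longitude are treated as planar coordinates for the Euclidean distance; the function names are hypothetical and the SVM step is not shown.

```python
import math


def shared_location_count(pois_a, pois_b):
    """Number of distinct locations visited by both users."""
    return len(set(pois_a) & set(pois_b))


def center(points):
    """Mean (lat, lon) of a user's check-in coordinates."""
    lats, lons = zip(*points)
    return sum(lats) / len(lats), sum(lons) / len(lons)


def center_distance(points_a, points_b):
    """Euclidean distance between the two users' activity centers."""
    lat_a, lon_a = center(points_a)
    lat_b, lon_b = center(points_b)
    return math.hypot(lat_a - lat_b, lon_a - lon_b)


shared = shared_location_count(["p1", "p2", "p2"], ["p2", "p3"])
distance = center_distance([(0.0, 0.0), (2.0, 0.0)], [(1.0, 4.0)])
```

Here `shared` is 1 and `distance` is 4.0. Each value would be the single input feature of the corresponding baseline classifier.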
The experiments use f1-score as the evaluation index. This index is insensitive to class distribution and can accurately reflect the quality of the inference result even when the classes are imbalanced. It is defined as:
f1-score = (2 × precision × recall) ÷ (precision + recall)
Here precision is the number of true positives divided by the number of samples predicted as positive, and recall (also called the true positive rate) is the number of true positives divided by the number of samples labelled as positive. The value of f1-score lies between 0 and 1, and a larger value indicates more accurate inference.
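As a worked example of the definition above, the following sketch computes f1-score from 0/1 labels and predictions. Returning 0.0 when there are no true positives is a convention chosen here to avoid division by zero, not something stated in the text.

```python
def f1_score(y_true, y_pred):
    """f1-score for binary labels, where 1 is the positive (friend) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# 2 true positives, 1 false positive, 1 false negative:
# precision = 2/3, recall = 2/3, so f1-score = 2/3.
score = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```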
In our experiments, we adopt a fully connected autoencoder network in which each layer has half as many nodes as the previous layer, and the size of the last layer is set by the user. The decoder uses the same network structure as the encoder, in reverse. The number of layers is adjusted according to the size of the spatio-temporal matrix input to the encoder. We use a simple KNN as the classifier in stage 1 and an SVM as the binary classifier in stage 2, with RBF as the kernel function of the support vector machine. The learning rate of all networks is set to 0.005, and iteration terminates when the edges of the social network graph change by less than 1% between two consecutive runs.
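The layer-width rule described above (halve at every layer, end at a user-chosen size, mirror for the decoder) can be sketched as below. The function names and the exact stopping rule (stop halving once the next width would not exceed d) are assumptions; the text only states the halving pattern.

```python
def encoder_sizes(input_dim, d):
    """Layer widths from the input down to the d-dimensional code."""
    sizes = [input_dim]
    width = input_dim // 2
    while width > d:
        sizes.append(width)
        width //= 2
    sizes.append(d)  # final layer width is chosen by the user
    return sizes


def decoder_sizes(input_dim, d):
    """The decoder mirrors the encoder in reverse order."""
    return encoder_sizes(input_dim, d)[::-1]


enc = encoder_sizes(1024, 64)
dec = decoder_sizes(1024, 64)
```

For a flattened input of 1024 values and d = 64 this gives encoder widths 1024, 512, 256, 128, 64, which also shows why the number of layers depends on the size of the input matrix.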
1) Influence of spatial granularity σ
We tested the accuracy of the inference results for five values of σ: 500, 750, 1000, 1250 and 1500. As shown in Fig. 5(a) and (b), in the Brightkite dataset (Fig. 5(a)) the f1-score increases by 4.56% as σ increases from 500 to 1000, after which accuracy gradually decreases as σ increases. The Gowalla dataset (Fig. 5(b)) shows a similar trend, with f1-score reaching its maximum at σ = 750. POIs in the Gowalla dataset are more dispersed than those in the Brightkite dataset, which means that a grid cell in Gowalla covers a larger geographical area than one in Brightkite, so the optimal σ for the Gowalla dataset is smaller than for the Brightkite dataset.
2) Influence of time granularity tau
We tested the accuracy of the inference results for five values of τ: 1 day, 7 days, 14 days, 30 days and 60 days. As shown in Fig. 6(a) and (b), the Brightkite dataset (Fig. 6(a)) reaches its maximum f1-score when τ is 7 days. Similar results were obtained on the Gowalla dataset (Fig. 6(b)). This is because human activities tend to be periodic, typically on a weekly basis.
3) Influence of the coding dimension d
We tested the accuracy of the inference results for five values of d: 16, 32, 64, 128 and 256. As shown in Fig. 7(a) and (b), accuracy on the Brightkite dataset (Fig. 7(a)) and the Gowalla dataset (Fig. 7(b)) follows the same trend: the higher the dimension of the spatio-temporal relationship vector, the more information it contains, which gives better attack performance; however, a high-dimensional spatio-temporal relationship vector also introduces excessive noise, which reduces the accuracy of the attack.
4) Influence of the number of iterations
Based on the above exploration of the parameters, we ran experiments on the two datasets using the best value of each parameter.
Fig. 8(a) and (b) show the effect of the number of iterations on the inference accuracy for the Brightkite dataset (Fig. 8(a)) and the Gowalla dataset (Fig. 8(b)). The inference accuracy keeps improving as the number of iterations increases; the termination condition is satisfied after 4 iterations on Gowalla and 5 iterations on Brightkite.
5) Influence of the number of shared locations
Fig. 9(a) and (b) show the influence of the number of locations shared by users on the accuracy of the inference results. As the number of shared locations increases, the accuracy of the inference results also increases. The experimental results also show that, on both the Brightkite dataset (Fig. 9(a)) and the Gowalla dataset (Fig. 9(b)), the inference model achieves an F1-score about 10% higher than the feature-extraction-based method (relationship inference based on movement distance).
6) Influence of the number of user check-ins
Fig. 10(a) and (b) show the influence of the number of user check-ins on the accuracy of the inference results. The experimental results on the Brightkite dataset (Fig. 10(a)) and the Gowalla dataset (Fig. 10(b)) show that our method is reasonably robust to users with different numbers of check-ins. Clearly, the more check-ins a user has, the more accurately the user's behavior pattern is modeled and the more accurate the user relationship inference.
Claims (2)
1. A mobile social network user relationship inference method based on spatio-temporal relationship learning, characterized by comprising the following steps:
step 1, extracting the interactive behavior features of each user pair and using these features to infer whether the two users are friends, comprising the following steps:
step 101, dividing all points of interest (POIs) at which users have checked in, as recorded in the mobile data, into I × J grids according to longitude and latitude, and simultaneously dividing time into M time segments, thereby constructing an I × J × M space-time matrix STD, wherein in the time dimension, time is divided into equal-length segments of length τ, and in the space dimension, space is first divided uniformly into grids of equal size and each grid is then recursively divided into four equal grids until the number of POIs in each grid is smaller than a threshold σ;
step 102, projecting the trajectory of each user pair $(u_a, u_b)$ into the space-time matrix STD, so that each check-in of a user is projected into a particular cell, and calculating the following for each cell: the points of interest $n_a$ visited by user $u_a$ in the time segment; the points of interest $n_b$ visited by user $u_b$; and the points of interest $n_{a,b}$ visited by both user $u_a$ and user $u_b$, thereby obtaining the space-time matrix $O^{(a,b)}$ of the user pair $(u_a, u_b)$, in which the triple at position $(i, j, m)$ represents the movement statistics of users $u_a$ and $u_b$ in the location grid of the i-th row and j-th column during the m-th time segment;
step 103, encoding the space-time matrix $O^{(a,b)}$ of each user pair $(u_a, u_b)$ into a low-dimensional vector, and using the low-dimensional vector to infer whether user $u_a$ and user $u_b$ are friends, thereby obtaining an initial social relationship graph G = (U, E), wherein U is the set of vertices in the graph and represents all users with movement information, E is the set of edges, each representing a friendship between two users, and the space-time matrix $O^{(a,b)}$ is adjusted by the parameters σ and τ;
step 2, extracting a k-reachable subgraph for each user pair to describe the network structure between the users, wherein, for a given initial social relationship graph G = (U, E), the k-reachable subgraph of a user pair $(u_a, u_b)$ is extracted by the following steps:
step 202, finding all paths of length 2 between $u_a$ and $u_b$ in the initial social relationship graph G and recording all found paths, and then deleting from the social relationship graph G all vertices and edges that occur in these paths, except for user $u_a$ and user $u_b$ themselves;
step 203, increasing the path length step by step, and repeating the step 202 until the path length exceeds k;
step 3, encoding, according to the initial social relationship graph G, the k-reachable subgraph of each user pair $(u_a, u_b)$, wherein paths of the same length are encoded based on an accumulation principle and the encoding results of paths of different lengths are concatenated, thereby vectorizing the k-reachable subgraph of the user pair and obtaining a comprehensive feature vector of the user pair;
step 4, performing 0/1 classification with a classifier according to the comprehensive feature vector of the user pair, where 1 denotes friends and 0 denotes non-friends, thereby obtaining the latest predicted social graph;
step 5, updating the structural features of the user pairs using the latest social graph and predicting again, until the prediction result no longer changes, thereby obtaining the final predicted social network graph.
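The path extraction of steps 202 and 203 and the encoding of step 3 might be sketched as follows. This is a minimal sketch under stated assumptions: the accumulation principle is approximated by simply counting the paths of each length, path lengths start at 2 because that is the first length named in the steps, and the function names are hypothetical. The actual per-path encoding is not reproduced here.

```python
def paths_of_length(adj, src, dst, length):
    """All simple paths from src to dst with exactly `length` edges."""
    found = []

    def dfs(node, path):
        if len(path) - 1 == length:
            if node == dst:
                found.append(list(path))
            return
        for nxt in adj.get(node, ()):
            # dst may only appear as the final vertex of a path.
            if nxt not in path and (nxt != dst or len(path) == length):
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(src, [src])
    return found


def k_reachable_features(edges, ua, ub, k):
    """Count paths of length 2..k between ua and ub, deleting the
    intermediate vertices of found paths before moving to the next length."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    features = []
    for length in range(2, k + 1):
        paths = paths_of_length(adj, ua, ub, length)
        features.append(len(paths))  # assumed stand-in for accumulation
        used = {n for path in paths for n in path} - {ua, ub}
        for node in used:
            for neighbor in adj.pop(node, set()):
                if neighbor in adj:
                    adj[neighbor].discard(node)
    return features
```

Because vertices on shorter paths are deleted first, a longer path that reuses them is not counted again, so each entry of the resulting vector describes connections not already explained by shorter paths.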
2. The mobile social network user relationship inference method based on spatio-temporal relationship learning as claimed in claim 1, characterized in that in step 103, the space-time matrix $O^{(a,b)}$ is input into an autoencoder with R hidden layers, which encodes it into a d-dimensional vector and decodes it to obtain a reconstructed space-time matrix that is made as similar as possible to the encoder input $O^{(a,b)}$, the optimization objective of the training process being as follows:
where the loss term denotes the loss function of the autoencoder network in the hybrid network, that is, the decoded, reconstructed space-time matrix is made as similar as possible to the space-time matrix $O^{(a,b)}$ before encoding, and U represents all users in the training samples;
the autoencoder is trained in a supervised manner so that the encoding is both reconstructive and discriminative, namely, a classification network is added to the autoencoder to supervise its encoding process, and the loss function of this process is as follows:
where the predicted value is the output of the classification network; y represents the sample label; and N represents the number of training samples, i.e., the number of user pairs in the training dataset.
To obtain a more discriminative vector representation, the following constraint is placed on the integrated hybrid network:
where the loss term denotes the composite loss function of the hybrid network.
Once training is complete, the encoder is taken out of the autoencoder network and used to encode the space-time matrix of any user pair in the user set and to perform preliminary relationship inference.
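For orientation only, one common way to combine a reconstruction loss with a classification loss is sketched below. The exact formulas of the method are not reproduced in this text, so the mean squared error, the binary cross-entropy, the weighted sum, and the weight `alpha` are all assumptions of this sketch rather than the claimed loss functions.

```python
import math


def reconstruction_loss(original, reconstructed):
    """Mean squared error between a flattened space-time matrix and its
    reconstruction (assumed form of the autoencoder loss)."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def classification_loss(labels, predictions, eps=1e-12):
    """Binary cross-entropy averaged over N user pairs (assumed form of
    the classification-network loss); labels are 0 or 1."""
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def hybrid_loss(recon, clf, alpha=1.0):
    """Composite loss as a weighted sum; `alpha` is a hypothetical weight."""
    return recon + alpha * clf
```

Minimizing such a sum pushes the code vector to retain enough information to rebuild the input while also separating friend pairs from non-friend pairs, which is the stated purpose of adding the classification network.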
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010572405.8A CN111738447B (en) | 2020-06-22 | 2020-06-22 | Mobile social network user relationship inference method based on spatio-temporal relationship learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111738447A (en) | 2020-10-02 |
CN111738447B (en) | 2022-07-29 |
Family
ID=72650234
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202010572405.8A | Active | CN111738447B (en) | 2020-06-22 | 2020-06-22 | Mobile social network user relationship inference method based on spatio-temporal relationship learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111738447B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN111738447B (en) | 2022-07-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||