CN111738447A - Mobile social network user relationship inference method based on spatio-temporal relationship learning - Google Patents

Publication number
CN111738447A
Authority
CN
China
Prior art keywords
user
users
graph
social
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010572405.8A
Other languages
Chinese (zh)
Other versions
CN111738447B (en)
Inventor
陶玉婷
常姗
朱弘恣
王佳程
杜坷坷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donghua University
Original Assignee
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donghua University
Priority to CN202010572405.8A
Publication of CN111738447A
Application granted
Publication of CN111738447B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/041 Abduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for inferring user relationships in a mobile social network based on spatio-temporal relationship learning, which considers both the mobility and the sociability of individuals. Because social network structure is effective for predicting social ties, the model first constructs a preliminary social graph based on user mobility, then extracts the social network structure features of user pairs from this preliminary graph, and finally infers friendships by combining the mobility and sociability features. Once trained, the model transfers well to different scenarios for predicting friendships between users. Experiments on two real-world data sets show that the method consistently outperforms existing methods. Furthermore, the model remains effective for relationships with little check-in data and no meeting events.

Description

Mobile social network user relationship inference method based on spatio-temporal relationship learning
Technical Field
The invention relates to a method for inferring user relationships in a mobile social network based on spatio-temporal relationship learning, which makes use of users' mobility information.
Background
In recent years, with the popularization of mobile social networking applications such as Facebook, Twitter, and Weibo, users can promptly publish the points of interest they are visiting (popular restaurants, tourist attractions, etc.) to share with friends. Although this form of socializing makes it much easier to make friends, it also risks revealing users' social relationships. Users of mobile social networks are becoming aware of this; for example, a large-scale study of Facebook users showed that the proportion of users hiding their friend lists rose from 17.2% in 2010 to 56.2% in 2011. However, few users know that their friends can be inferred from check-in records through spatio-temporal relationships, accurately revealing the hidden social relationships between users.
Existing methods for inferring social relationships from spatio-temporal data fall into two main categories. The first is heuristic methods based on feature selection, which observe and select effective features, such as the number of meetings, the number of shared locations, and the popularity of the shared locations, as criteria for judging whether users are friends. These methods, however, impose many assumptions and restrictions on the required mobility data, which greatly reduces their applicability. For example, almost all existing effective methods can be used only if two individuals share a location. The second is methods based on feature learning, which vectorize users' mobility features through machine learning and use the similarity between vectors as the criterion for inferring whether a user pair are friends. However, this type of method models each individual separately and loses the temporal information in the mobility data, so it cannot directly yield an accurate relationship inference.
Disclosure of Invention
The invention aims to provide a model that can accurately infer users' social relationships from their mobility information.
To achieve this object, the technical solution of the present invention provides a method for inferring user relationships in a mobile social network based on spatio-temporal relationship learning, characterized by comprising the following steps:
Step 1, extracting the interactive behavior features between user pairs and using these features to infer whether two users are friends, comprising the following steps:
Step 101, dividing all points of interest (POIs) at which users have checked in in the mobility data into an I × J grid by longitude and latitude, dividing time into M time segments, and constructing an I × J × M space-time matrix STD, wherein in the time dimension, time is divided into equal-length segments of length τ; in the space dimension, the space is first divided uniformly into equal-sized grid cells, and each cell is then recursively divided into four equal cells until the number of POIs in each cell is smaller than a threshold σ;
Step 102, projecting the trajectories of each user pair $(u_a, u_b)$ into the space-time matrix STD, so that each check-in of a user falls into a particular cell; for each cell, the following are calculated: the number $n_a$ of POIs visited by user $u_a$ in this time period; the number $n_b$ of POIs visited by user $u_b$; and the number $n_{a,b}$ of POIs visited by both user $u_a$ and user $u_b$; thereby obtaining the space-time matrix $O^{(a,b)}$ of the user pair $(u_a, u_b)$:
$$O^{(a,b)} = \left[ \left( n_a^{i,j,m},\ n_b^{i,j,m},\ n_{a,b}^{i,j,m} \right) \right]_{I \times J \times M}$$
where the triple $\left( n_a^{i,j,m}, n_b^{i,j,m}, n_{a,b}^{i,j,m} \right)$ represents the movement statistics of users $u_a$ and $u_b$ in the location cell at row i and column j during the m-th time period;
Step 103, encoding the space-time matrix $O^{(a,b)}$ of each user pair $(u_a, u_b)$ into a low-dimensional vector, and using the low-dimensional vector to calculate the probability that user $u_a$ and user $u_b$ are friends, thereby obtaining an initial social relationship graph G = (U, E), where U is the set of vertices in the graph and represents all users with movement information, and E is the set of edges, each representing a friendship between two users; the size of the space-time matrix $O^{(a,b)}$ is adjusted by the parameters σ and τ;
Step 2, extracting a k-reachable subgraph for each user pair to describe the network structure between the users; for a given initial social relationship graph G = (U, E), the k-reachable subgraph $G_k^{(a,b)}$ of a user pair $(u_a, u_b)$ is extracted by the following steps:
Step 201, setting the path length to 2 and initializing $G_k^{(a,b)}$ to an empty graph;
Step 202, finding all paths of length 2 between $u_a$ and $u_b$ in the initial social relationship graph G, recording all found paths in $G_k^{(a,b)}$, and then deleting from the social relationship graph G all vertices and edges that appear in $G_k^{(a,b)}$, except user $u_a$ and user $u_b$ themselves;
Step 203, increasing the path length step by step and repeating step 202 until the path length exceeds k;
Step 3, encoding the k-reachable subgraph $G_k^{(a,b)}$ extracted from the initial social relationship graph G for each user pair $(u_a, u_b)$: paths of the same length are encoded by accumulation, and the encodings of paths of different lengths are concatenated, thereby vectorizing the k-reachable subgraph of the user pair and obtaining the comprehensive feature vector of the user pair;
Step 4, performing 0/1 classification with a classifier according to the comprehensive feature vector of the user pair, where 1 denotes friends and 0 denotes non-friends, thereby obtaining the latest predicted social graph;
Step 5, updating the structural features of the user pairs using the latest predicted social graph and predicting again, repeating until the prediction result no longer changes, thereby obtaining the final predicted social network graph.
Preferably, in step 103, the space-time matrix $O^{(a,b)}$ is input into an autoencoder whose encoder has R hidden layers and encodes the matrix into a d-dimensional vector; the decoder outputs a reconstructed space-time matrix $\hat{O}^{(a,b)}$ that approximates the encoder input $O^{(a,b)}$. The optimization objective of the training process is to minimize the loss function of the autoencoder network within the hybrid network [equation image not reproduced in this text], i.e. to make the decoded, reconstructed space-time matrix $\hat{O}^{(a,b)}$ as close as possible to the space-time matrix $O^{(a,b)}$ before encoding;
The autoencoder is trained with supervision so that the encoding is both reconstructive and discriminative, i.e. a classification network is added to the autoencoder to supervise its encoding process. The loss function of this process [equation image not reproduced in this text] is defined over the following quantities: $\hat{y}$ represents the prediction result, i.e. the output of the classification network; y represents the sample label; N represents the number of training samples, i.e. the number of user pairs in the training data set.
To obtain a more discriminative vector representation, the integrated hybrid network is constrained by the overall loss function of the entire hybrid network [equation image not reproduced in this text];
Once training is complete, the encoder is taken from the autoencoder network and used to encode the spatio-temporal relationship matrix of any user pair in the user set and to perform preliminary relationship inference.
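The three loss formulas of the hybrid network survive in this text only as image references. As a non-limiting illustration, one common instantiation that is consistent with the surrounding definitions is given below. The squared-error form, the binary cross-entropy form, the symbols $\mathcal{L}_{ae}$, $\mathcal{L}_{c}$, $\mathcal{L}$ and the weight $\alpha$ are all assumptions and are not taken from the source.

```latex
% Assumed reconstruction loss of the autoencoder (squared error over user pairs)
\mathcal{L}_{ae} = \sum_{u_a, u_b \in U} \left\lVert O^{(a,b)} - \hat{O}^{(a,b)} \right\rVert_2^2

% Assumed classification loss (binary cross-entropy over N training pairs)
\mathcal{L}_{c} = -\frac{1}{N} \sum_{n=1}^{N} \left[ y_n \log \hat{y}_n + (1 - y_n) \log\left(1 - \hat{y}_n\right) \right]

% Assumed overall loss of the hybrid network (weighted sum, weight \alpha hypothetical)
\mathcal{L} = \mathcal{L}_{ae} + \alpha \, \mathcal{L}_{c}
```

Under this reading, the reconstruction term keeps the code faithful to the original movement statistics, while the classification term pushes the code towards features that separate friends from non-friends.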
The invention provides a novel social relationship inference model that considers both the mobility and the sociability of individuals. Because social network structure is effective for predicting social ties, the model first constructs a preliminary social graph based on user mobility, then extracts the social network structure features of user pairs from this preliminary graph, and finally infers friendships by combining the mobility and sociability features. Once trained, the model transfers well to different scenarios for predicting friendships between users. Experiments on two real-world data sets show that the method consistently outperforms existing methods. Furthermore, the model remains effective for relationships with little check-in data and no meeting events.
Drawings
FIG. 1 is a diagram of mapping friendships to a user's mobile relationship;
FIG. 2 is a system diagram of a social relationship inference model;
FIG. 3 is a schematic diagram of an interactive behavior feature extraction process;
FIG. 4 is a schematic diagram of a k-reachable subgraph encoding process;
FIG. 5(a) is a graph of the effect of spatial granularity on friendship inference accuracy in the Brightkite data set;
FIG. 5(b) is a graph of the effect of spatial granularity on friendship inference accuracy in the Gowalla data set;
FIG. 6(a) is a graph of the effect of time granularity on friendship inference accuracy in the Brightkite data set;
FIG. 6(b) is a graph of the effect of time granularity on friendship inference accuracy in the Gowalla data set;
FIG. 7(a) is a graph of the effect of the feature vector dimension on friendship inference accuracy in the Brightkite data set;
FIG. 7(b) is a graph of the effect of the feature vector dimension on friendship inference accuracy in the Gowalla data set;
FIG. 8(a) is a graph of the effect of the number of iterations on friendship inference accuracy in the Brightkite data set;
FIG. 8(b) is a graph of the effect of the number of iterations on friendship inference accuracy in the Gowalla data set;
FIG. 9(a) is a graph of the effect of the number of user check-ins on friendship inference accuracy in the Brightkite data set;
FIG. 9(b) is a graph of the effect of the number of user check-ins on friendship inference accuracy in the Gowalla data set;
FIG. 10(a) is a graph of the effect of the number of locations shared by a user pair on friendship inference accuracy in the Brightkite data set;
FIG. 10(b) is a graph of the effect of the number of locations shared by a user pair on friendship inference accuracy in the Gowalla data set.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The invention infers whether users are friends by exploiting the similarity of users' movement trajectories and the propagation of relationships through the social network; FIG. 1 is an abstraction of the real-world problem. The inference model is divided into two phases: (1) constructing an initial social network graph, based on the observation that the movement trajectories of friends are generally more similar; (2) obtaining hidden social relationships, i.e. mining hidden friendships between users who have similar preferences but dissimilar movement trajectories, based on the phenomenon that the network topologies of friends are more similar.
FIG. 2 shows a specific implementation of the spatio-temporal relationship learning-based mobile social network user relationship inference method provided by the present invention, which mainly comprises the following two stages:
The first stage: building a preliminary social graph
In this stage, the interactive behavior features between user pairs are extracted and used to infer whether two users are friends. Generally, interactive behavior can be characterized by statistical attributes of the interactions between a user pair, such as the number of meetings and the number of co-visited places. However, friends may have no co-visit or meeting events at all, and not all co-visits are equally important for inferring friendship. We therefore propose a more comprehensive approach to learning complex interactive behavior features: both the joint behavior and the individual behavior of the two users are embedded into the interactive behavior feature. In addition, because different places and time periods differ in predictive importance, we design a hybrid model that generates the compressed interactive behavior feature vector and the initial classification result at the same time, where the spatial extent of each location cell and the length of each time interval are parameters of the model. This ensures that the compressed interactive behavior features retain as much of the users' original mobility characteristics as possible while remaining indicative for the friendship inference task. The first stage comprises the following steps:
Step 1, constructing the space-time matrix
First, all points of interest (POIs) at which users have checked in in the mobility data are divided into an I × J grid by longitude and latitude, and time is divided into M time segments, thereby constructing an I × J × M space-time matrix (STD). In the time dimension, time is divided into slices of equal length τ. In the space dimension, the space is first divided uniformly into equal-sized cells; because the density of POIs differs greatly across geographic regions, each cell is then recursively divided into four equal cells until the number of POIs in each cell is less than a threshold σ. The trajectories of each user pair $(u_a, u_b)$ can then be projected into the STD, so that each check-in of a user falls into a specific cell. For each cell, three indicators are calculated: the number $n_a$ of POIs visited by user $u_a$ in the time period, the number $n_b$ of POIs visited by user $u_b$, and the number $n_{a,b}$ of POIs visited by both user $u_a$ and user $u_b$. This yields the space-time matrix $O^{(a,b)}$ of the user pair $(u_a, u_b)$:
$$O^{(a,b)} = \left[ \left( n_a^{i,j,m},\ n_b^{i,j,m},\ n_{a,b}^{i,j,m} \right) \right]_{I \times J \times M}$$
where the triple $\left( n_a^{i,j,m}, n_b^{i,j,m}, n_{a,b}^{i,j,m} \right)$ represents the movement statistics of users $u_a$ and $u_b$ in the location cell at row i and column j during the m-th time period.
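The construction of the pair matrix can be illustrated with a minimal, non-limiting Python sketch. It assumes a uniform I × J grid only (the recursive four-way subdivision is omitted), counts distinct POIs per cell, and uses hypothetical names (`cell_index`, `pair_matrix`, the check-in tuple layout) that do not come from the source.

```python
from collections import defaultdict


def cell_index(lat, lon, t, bounds, I, J, t0, tau):
    """Map one check-in to (i, j, m) on a uniform I x J grid with time slices of length tau."""
    lat_min, lat_max, lon_min, lon_max = bounds
    i = min(int((lat - lat_min) / (lat_max - lat_min) * I), I - 1)
    j = min(int((lon - lon_min) / (lon_max - lon_min) * J), J - 1)
    m = int((t - t0) // tau)
    return i, j, m


def pair_matrix(checkins_a, checkins_b, bounds, I, J, M, t0, tau):
    """Return O[i][j][m] = (n_a, n_b, n_ab) for one user pair.

    Each check-in is a tuple (poi_id, lat, lon, timestamp) (assumed layout).
    """
    def pois_per_cell(checkins):
        cells = defaultdict(set)
        for poi, lat, lon, t in checkins:
            i, j, m = cell_index(lat, lon, t, bounds, I, J, t0, tau)
            if 0 <= m < M:  # ignore check-ins outside the M time slices
                cells[(i, j, m)].add(poi)
        return cells

    pa, pb = pois_per_cell(checkins_a), pois_per_cell(checkins_b)
    O = [[[(0, 0, 0) for _ in range(M)] for _ in range(J)] for _ in range(I)]
    for key in set(pa) | set(pb):
        i, j, m = key
        a, b = pa.get(key, set()), pb.get(key, set())
        # (POIs visited by a, POIs visited by b, POIs visited by both)
        O[i][j][m] = (len(a), len(b), len(a & b))
    return O
```

Flattening `O` into a single list of 3 × I × J × M numbers gives a vector that could be fed to an encoder.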
Step 2, encoding the space-time matrix
This step encodes the space-time matrix $O^{(a,b)}$ of each user pair $(u_a, u_b)$ into a low-dimensional vector and uses this vector to calculate the probability that user $u_a$ and user $u_b$ are friends; the size of the space-time matrix can be adjusted through the parameters σ and τ.
We input the space-time matrix into an autoencoder. The encoder, which has R hidden layers, encodes it into a d-dimensional vector, i.e. the output of the R-th layer of the encoder. The decoder, which also has R hidden layers, outputs a reconstructed space-time matrix $\hat{O}^{(a,b)}$ from the d-dimensional vector so that it approximates the encoder input $O^{(a,b)}$. The optimization objective of the training process is to minimize the loss function of the autoencoder network within the hybrid network [equation image not reproduced in this text], i.e. to make the decoded, reconstructed space-time matrix $\hat{O}^{(a,b)}$ as similar as possible to the space-time matrix $O^{(a,b)}$ before encoding, where U represents all users in the training samples.
The training of a plain autoencoder is unsupervised, which means that it only learns the structure of the input and performs no feature selection or weighting, so the resulting d-dimensional feature vector merely retains the user's original movement information and is not guaranteed to be useful for the friendship inference task. To avoid this problem, we use supervised training so that the encoding is both reconstructive and discriminative, i.e. we add a classification network to the autoencoder to supervise its encoding process; the output $\hat{y}$ of the classification network represents the probability that the user pair $(u_a, u_b)$ are friends. The loss function of this process [equation image not reproduced in this text] is defined over the following quantities: $\hat{y}$ represents the prediction result, i.e. the output of the classification network; y represents the sample label; N represents the number of training samples, i.e. the number of user pairs in the training data set.
To obtain a more discriminative vector representation, we constrain the integrated hybrid network with a composite loss function of the hybrid network [equation image not reproduced in this text].
Once training is complete, the encoder is taken from the autoencoder network and used to encode the spatio-temporal relationship matrix of any user pair in the user set and to perform preliminary relationship inference.
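How a reconstruction term and a classification term might be combined during training can be sketched as follows. This is only an illustration under stated assumptions: mean squared error for reconstruction, binary cross-entropy for classification, and a hypothetical weight `alpha`; the source gives these formulas only as images, so none of these choices is confirmed by it.

```python
import math


def reconstruction_loss(O, O_hat):
    """Mean squared error between a flattened matrix and its reconstruction (assumed form)."""
    return sum((x - y) ** 2 for x, y in zip(O, O_hat)) / len(O)


def classification_loss(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy averaged over the N user pairs (assumed form)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


def hybrid_loss(batch_O, batch_O_hat, y_true, y_pred, alpha=1.0):
    """Reconstruction loss plus alpha times classification loss (alpha is hypothetical)."""
    rec = sum(
        reconstruction_loss(o, o_hat) for o, o_hat in zip(batch_O, batch_O_hat)
    ) / len(batch_O)
    return rec + alpha * classification_loss(y_true, y_pred)
```

In practice such a loss would be minimized with an automatic-differentiation framework; the plain-Python version here only makes the arithmetic explicit.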
The second stage: obtaining hidden social relationships
In the second stage, we consider the features of the indirect links between user pairs, i.e. the features of the network structure. Since the mobility data contains no social topology graph, we extract structural features from the prediction result of the first stage (a semi-true social graph). Because simple heuristic structural features are not universally applicable, we propose a new k-reachable subgraph to describe the network structure between the target users and encode this subgraph to obtain a structural feature vector. Finally, the encoded structural vector is combined with the corresponding interactive behavior features to improve the accuracy of inference. In addition, an iterative process is designed that refines the structural feature vectors and the final social relationship graph at the same time.
The second stage comprises the following steps:
Step 3, extracting the local graph structure, i.e. the k-reachable subgraph
Katz is a common measure of network proximity between users; it takes a global view of a user pair in the social graph, i.e. it considers all possible paths between the pair. However, we consider this unreasonable, because longer paths are not only computationally expensive but also harm relationship inference. We therefore describe the network structure between users from a local view of network proximity, i.e. by extracting a k-reachable subgraph for each user pair. For a given initial social relationship graph G = (U, E), the k-reachable subgraph $G_k^{(a,b)}$ of a user pair $(u_a, u_b)$ is extracted as follows:
Step 301, set the path length to 2 and initialize $G_k^{(a,b)}$ to an empty graph;
Step 302, in the initial social relationship graph G constructed in the first stage, find all paths of length 2 between $u_a$ and $u_b$ and record all found paths in $G_k^{(a,b)}$; then delete from the social relationship graph G all vertices and edges that appear in $G_k^{(a,b)}$ (except user $u_a$ and user $u_b$ themselves);
Step 303, increase the path length step by step and repeat step 302 until the path length exceeds k.
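Steps 301 to 303 can be illustrated with the following non-limiting Python sketch. It assumes an undirected graph stored as a dictionary of neighbour sets, treats paths as simple paths, and interprets the deletion as removing the intermediate vertices of the found paths together with their incident edges; the function name `k_reachable_subgraph` is hypothetical.

```python
def k_reachable_subgraph(adj, a, b, k):
    """Return the edge set of the k-reachable subgraph of the pair (a, b).

    adj maps each user to the set of its neighbours in an undirected graph.
    Edges are returned as frozensets {u, v}.
    """
    g = {u: set(vs) for u, vs in adj.items()}  # working copy; adj is not modified
    sub_edges = set()

    def simple_paths(length):
        """All simple paths from a to b with exactly `length` edges in g."""
        found = []

        def dfs(node, path):
            if len(path) - 1 == length:
                if node == b:
                    found.append(path)
                return
            for nxt in g.get(node, ()):
                # b may only appear as the final vertex of a path
                if nxt not in path and (nxt != b or len(path) == length):
                    dfs(nxt, path + [nxt])

        dfs(a, [a])
        return found

    for length in range(2, k + 1):
        inner = set()
        for path in simple_paths(length):
            sub_edges.update(frozenset(e) for e in zip(path, path[1:]))
            inner.update(path[1:-1])
        # Delete the vertices just added (except a and b) together with their edges.
        for v in inner:
            for n in g.pop(v, set()):
                if n in g:
                    g[n].discard(v)
    return sub_edges
```

Because vertices used by shorter paths are removed before longer paths are searched, each intermediate user contributes to the subgraph only at the shortest path length on which it occurs.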
Step 4, iteratively updating the predicted social graph
Based on the initial social relationship graph G constructed in the first stage, we encode the k-reachable subgraph $G_k^{(a,b)}$ of each user pair $(u_a, u_b)$. Paths of the same length are encoded by accumulation, and the encodings of paths of different lengths are concatenated, thereby vectorizing the k-reachable subgraph of the user pair. FIG. 4 illustrates the k-reachable subgraph encoding process in detail.
A classifier then performs 0/1 classification (1 denotes friends, 0 denotes non-friends) according to the comprehensive feature vector of the user pair (the interactive behavior feature obtained in the first stage and the local structural feature vector obtained in the second stage), yielding the latest predicted social graph.
The structural features of the user pairs are then updated using the latest predicted social graph and the prediction is performed again, repeating until the prediction result no longer changes; the result is the final predicted social network graph.
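The "accumulate, then concatenate" principle can be sketched in Python as follows. The details are assumptions, since the text refers to FIG. 4 for them: each edge is assumed to carry a d-dimensional feature vector supplied by a hypothetical function `edge_vec`, a path is encoded by concatenating its edge vectors, encodings of equal-length paths are summed element-wise, and the sums for lengths 2 to k are concatenated.

```python
def encode_subgraph(paths_by_length, edge_vec, k, d):
    """Encode a k-reachable subgraph as one fixed-length vector.

    paths_by_length: dict mapping a path length l to a list of paths
        (each path is a list of users with l edges).
    edge_vec: function (u, v) -> list of d floats (hypothetical edge feature).
    Returns a list of d * (2 + 3 + ... + k) floats.
    """
    out = []
    for length in range(2, k + 1):
        acc = [0.0] * (length * d)  # accumulator for all paths of this length
        for path in paths_by_length.get(length, []):
            flat = [x for u, v in zip(path, path[1:]) for x in edge_vec(u, v)]
            acc = [s + x for s, x in zip(acc, flat)]
        out.extend(acc)  # concatenate the per-length encodings
    return out
```

Summing over paths of the same length keeps the output size independent of how many paths exist, which is what allows the result to be fed to a fixed-input classifier.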
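The iterative refinement can be sketched as a simple fixed-point loop. In this non-limiting Python illustration, `predict_edges` is a hypothetical callable standing in for "re-extract structural features from the current graph and re-classify every user pair", and the relative-change tolerance `tol` and the cap `max_iter` are assumptions.

```python
def refine_social_graph(initial_edges, predict_edges, tol=0.01, max_iter=20):
    """Repeat prediction until the edge set (almost) stops changing.

    initial_edges: iterable of edges, e.g. frozenset({u, v}).
    predict_edges: function taking the current edge set and returning a new one.
    Returns (final_edges, number_of_iterations_run).
    """
    current = set(initial_edges)
    for iteration in range(1, max_iter + 1):
        new = set(predict_edges(current))
        changed = len(current ^ new)  # edges added or removed in this round
        base = max(len(current), 1)
        current = new
        if changed / base < tol:
            return current, iteration
    return current, max_iter
```

With `tol=0.01` the loop stops once fewer than 1% of the edges change between two consecutive rounds.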
The verification process of the invention is as follows:
The experiments use the public real-world data of Brightkite and Gowalla. Both are location-based social networks in which a user can publish his or her own time and location information and share it with friends. Each data set consists of the users' check-in data and a social relationship graph; Table 1 shows the statistics of the experimental data sets.
Table 1. Data set statistics [table image not reproduced in this text]
The method was compared experimentally with four popular inference methods, based respectively on movement distance, number of shared locations, random walks, and graph embedding.
Relationship inference based on the number of shared locations. This method uses the number of places visited by both users of a pair as the feature for inferring whether the pair are friends. In the experiment, supervised learning was performed with an SVM classifier.
Relationship inference based on movement distance. This method calculates the center of activity of each user from all of that user's movement information and uses the Euclidean distance between the centers of each user pair as the feature for inferring whether the pair are friends. In the experiment, supervised learning was performed with an SVM classifier.
Relationship inference based on random walks. This method maps the users' mobility data into a user-location bipartite graph, collects the neighbors of each user by walking in the bipartite graph according to a certain rule, compresses the neighbor information into a vector, and finally infers friendships by comparing the similarity of the feature vectors of a user pair.
Relationship inference based on graph embedding. This method constructs a meeting graph from the users' mobility information, in which two users are connected by an edge when they co-occur (appear at the same location within a certain time interval); it performs walks on the meeting graph to collect a certain number of neighbors for each user, encodes these neighbor nodes into a vector, and finally infers friendships by comparing the similarity of the feature vectors of a user pair.
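The two heuristic baseline features can be computed as in the following illustrative Python sketch. The function names and the use of planar (latitude, longitude) coordinates for the Euclidean distance are assumptions; the resulting scalars would then be passed to a classifier such as an SVM.

```python
import math


def shared_location_count(pois_a, pois_b):
    """Number of distinct places visited by both users."""
    return len(set(pois_a) & set(pois_b))


def centroid(coords):
    """Center of activity: mean of all (lat, lon) check-in coordinates."""
    lats, lons = zip(*coords)
    return sum(lats) / len(lats), sum(lons) / len(lons)


def centroid_distance(coords_a, coords_b):
    """Euclidean distance between the two users' centers of activity."""
    return math.dist(centroid(coords_a), centroid(coords_b))
```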
The experiments use the f1-score as the evaluation metric. This metric is insensitive to class distribution and can accurately characterize the accuracy of the inference result even when the classes are imbalanced. It is defined as:
$$\text{f1-score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$$
where precision is the number of true positives divided by the number of samples predicted as positive, and recall (also called the true positive rate) is the number of true positives divided by the number of samples labeled as positive. The f1-score lies between 0 and 1; the larger the value, the more accurate the inference.
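For concreteness, the metric can be computed from 0/1 labels as in this small Python sketch; returning 0.0 when there are no true positives is a convention assumed here to avoid division by zero.

```python
def f1_score(y_true, y_pred):
    """f1-score for binary labels, where 1 = friends and 0 = not friends."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # assumed convention when precision or recall is undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```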
In our experiments, we adopt a fully connected autoencoder network. In the encoder, each layer has half as many nodes as the previous layer, and the size of the last layer is set by the user. The decoder uses the mirror image of the encoder's network structure. The number of layers is adjusted according to the size of the space-time matrix input to the encoder. We use a simple KNN as the classifier in stage 1 and an SVM, with RBF as its kernel function, as the binary classifier in stage 2. The learning rate of all networks is set to 0.005, and the iteration terminates when the edges of the social network graph change by less than 1% between two consecutive runs.
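The halving rule for layer widths can be made explicit with a short Python sketch. The function names are hypothetical, and integer halving with the user-chosen dimension d as the final layer is one possible reading of the description.

```python
def encoder_layer_sizes(input_dim, d):
    """Hidden-layer widths of the encoder: halve until reaching the code size d."""
    sizes = []
    width = input_dim // 2
    while width > d:
        sizes.append(width)
        width //= 2
    sizes.append(d)  # the last layer is set by the user
    return sizes


def decoder_layer_sizes(input_dim, d):
    """Mirror of the encoder, ending at the original input dimension."""
    enc = encoder_layer_sizes(input_dim, d)
    return enc[-2::-1] + [input_dim]
```

For example, an input of 1024 values with d = 64 gives encoder widths 512, 256, 128, 64 and decoder widths 128, 256, 512, 1024, so a larger space-time matrix automatically yields a deeper network.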
1) Influence of the spatial granularity σ
We tested the accuracy of the inference results for five values of σ: 500, 750, 1000, 1250 and 1500. As shown in FIG. 5(a) and (b), on the Brightkite data set (FIG. 5(a)) the f1-score increases by 4.56% as σ increases from 500 to 1000, and the accuracy then gradually decreases as σ increases further. The Gowalla data set (FIG. 5(b)) shows a similar trend, with the f1-score reaching its maximum at σ = 750. Because POIs in the Gowalla data set are more dispersed than those in the Brightkite data set, a grid cell in Gowalla covers a larger geographic area than one in Brightkite, so the optimal σ for the Gowalla data set is smaller than that for the Brightkite data set.
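The role of σ can be illustrated with a minimal Python sketch of the recursive four-way subdivision. As simplifying assumptions, it starts from a single bounding box rather than from a uniform grid, and it adds a hypothetical `max_depth` guard so that many POIs at identical coordinates cannot cause unbounded recursion.

```python
def split_cells(pois, bounds, sigma, max_depth=12):
    """Recursively split a cell into four until it holds fewer than sigma POIs.

    pois: list of (lat, lon); bounds: (lat_min, lat_max, lon_min, lon_max).
    Returns the list of leaf-cell bounds.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    inside = [
        (la, lo) for la, lo in pois
        if lat_min <= la < lat_max and lon_min <= lo < lon_max
    ]
    if len(inside) < sigma or max_depth == 0:
        return [bounds]
    lat_mid = (lat_min + lat_max) / 2
    lon_mid = (lon_min + lon_max) / 2
    quadrants = (
        (lat_min, lat_mid, lon_min, lon_mid),
        (lat_min, lat_mid, lon_mid, lon_max),
        (lat_mid, lat_max, lon_min, lon_mid),
        (lat_mid, lat_max, lon_mid, lon_max),
    )
    leaves = []
    for quadrant in quadrants:
        leaves.extend(split_cells(inside, quadrant, sigma, max_depth - 1))
    return leaves
```

A smaller σ produces more and finer cells in dense areas, while sparse areas stay coarse, which is why the best σ depends on how dispersed the POIs of a data set are.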
2) Influence of time granularity τ
We tested the accuracy of the inference results for five values of τ: 1 day, 7 days, 14 days, 30 days and 60 days. As shown in fig. 6(a) and (b), the Brightkite dataset (fig. 6(a)) reaches its maximum f1-score when τ is 7 days. Similar results were observed on the Gowalla dataset (fig. 6(b)). This is because human activities tend to be periodic, typically on a weekly basis.
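The effect of τ can be illustrated with a sketch of how check-ins might be bucketed into time segments of length τ and reduced to the per-cell statistics (n_a, n_b, n_{a,b}) described in step 102 of the claims. The check-in tuple layout, the function name and the choice to count distinct POIs are assumptions for this example; timestamps are taken as non-negative day offsets from the start of the data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical check-in record: (user id, day offset, (grid row, grid column), POI id)
CheckIn = Tuple[str, float, Tuple[int, int], str]
Key = Tuple[int, int, int]  # (grid row i, grid column j, time segment m)


def pair_statistics(checkins: Iterable[CheckIn], user_a: str, user_b: str,
                    tau_days: float) -> Dict[Key, Tuple[int, int, int]]:
    """Map each (i, j, m) entry to the triple (n_a, n_b, n_ab) for one user pair."""
    visited: Dict[Key, Tuple[Set[str], Set[str]]] = defaultdict(lambda: (set(), set()))
    for user, day, (row, col), poi in checkins:
        if user != user_a and user != user_b:
            continue  # only the two users of the pair contribute
        segment = int(day // tau_days)  # index of the tau-length time segment
        pois_a, pois_b = visited[(row, col, segment)]
        (pois_a if user == user_a else pois_b).add(poi)
    return {key: (len(a), len(b), len(a & b)) for key, (a, b) in visited.items()}
```

With τ = 7 days, two visits to the same POI in the same week fall into the same segment and count as a common visit, whereas with τ = 1 day they would only do so if they happened on the same day.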
3) Influence of the coding dimension d
We tested the accuracy of the inference results for five values of d: 16, 32, 64, 128 and 256. As shown in fig. 7(a), (b), the accuracy on the Brightkite dataset (fig. 7(a)) and the Gowalla dataset (fig. 7(b)) follows the same trend. The higher the dimensionality of the spatio-temporal relationship vector, the more information it contains, which improves attack performance; however, a high-dimensional spatio-temporal relationship vector also introduces excessive noise, which reduces the accuracy of the attack.
4) Influence of the number of iterations
Based on the above exploration of the parameters, we ran experiments on the two datasets using the best value of each parameter.
Fig. 8(a), (b) show the effect of the number of iterations on inference accuracy for the Brightkite dataset (fig. 8(a)) and the Gowalla dataset (fig. 8(b)). The inference accuracy keeps improving as the number of iterations increases; the termination condition is satisfied after 4 iterations on Gowalla and 5 iterations on Brightkite.
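The iteration loop and its termination condition (edges of the social graph changing by less than 1% between two consecutive runs) can be sketched as follows. How the change is measured is not spelled out in the text, so this example assumes it is the number of added or removed edges divided by the number of edges in the previous graph. The function names, the `predict_edges` callback and the `max_rounds` safety limit are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def normalise(edges: Iterable[Iterable[str]]) -> Set[Edge]:
    """Represent each undirected edge as a frozenset so (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


def edge_change_ratio(previous: Iterable[Iterable[str]], current: Iterable[Iterable[str]]) -> float:
    """Assumed measure: added plus removed edges, relative to the previous edge count."""
    prev, curr = normalise(previous), normalise(current)
    if not prev and not curr:
        return 0.0
    return len(prev ^ curr) / max(len(prev), 1)


def refine_until_stable(initial_edges: Iterable[Iterable[str]],
                        predict_edges: Callable[[Set[Edge]], Iterable[Iterable[str]]],
                        threshold: float = 0.01,
                        max_rounds: int = 20) -> Tuple[Set[Edge], int]:
    """Re-predict the graph until it changes by less than the threshold."""
    edges = normalise(initial_edges)
    for round_index in range(1, max_rounds + 1):
        new_edges = normalise(predict_edges(edges))
        if edge_change_ratio(edges, new_edges) < threshold:
            return new_edges, round_index
        edges = new_edges
    return edges, max_rounds
```

Here `predict_edges` stands in for one full round of structural feature extraction and classification on the current graph; the returned round count corresponds to the number of iterations reported in fig. 8.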
5) Influence of the number of shared locations
Fig. 9(a) and (b) show the influence of the number of locations shared by users on the accuracy of the inference results. As the number of shared locations increases, the accuracy of the inference results also increases. The experimental results also show that, on both the Brightkite dataset (fig. 9(a)) and the Gowalla dataset (fig. 9(b)), the f1-score of our inference model is about 10% higher than that of the feature-extraction-based method (relationship inference based on moving distance).
6) Influence of the number of user check-ins
Fig. 10(a) and (b) show the influence of the number of user check-ins on the accuracy of the inference results. The experimental results on the Brightkite dataset (fig. 10(a)) and the Gowalla dataset (fig. 10(b)) show that our method is reasonably robust to users with different numbers of check-ins. As expected, the more check-ins a user has, the more accurately the user's behavior pattern is modeled and the more accurate the user relationship inference.

Claims (2)

1. A mobile social network user relationship inference method based on spatio-temporal relationship learning, characterized by comprising the following steps:
step 1, extracting the interaction behavior features between user pairs and using these features to infer whether two users are friends, comprising the following steps:
step 101, dividing all points of interest (POIs) at which users have checked in, according to the mobile data of all users, into I × J grids by longitude and latitude, and at the same time dividing time into M time segments, to construct an I × J × M space-time matrix STD, wherein, in the time dimension, time is divided into equal-length segments of length τ, and, in the space dimension, space is first divided uniformly into grids of equal size and each grid is then recursively divided into four equal grids until the number of POIs in each grid is smaller than a threshold σ;
step 102, projecting the trajectory of each user pair (u_a, u_b) into the space-time matrix STD, wherein each check-in of a user is projected into a particular cell, and for each cell the following are calculated for the corresponding time segment: the number of points of interest n_a visited by user u_a; the number of points of interest n_b visited by user u_b; and the number of points of interest n_{a,b} visited by both user u_a and user u_b, thereby obtaining the space-time matrix of the user pair (u_a, u_b)
Figure FDA0002550117590000011
Figure FDA0002550117590000012
where the triple
Figure FDA0002550117590000013
represents the movement statistics of users u_a and u_b in the location grid at row i, column j during the m-th time segment;
step 103, encoding the space-time matrix O^(a,b) of each user pair (u_a, u_b) into a low-dimensional vector, and using the low-dimensional vector to infer the relationship between user u_a and user u_b, thereby obtaining an initial social relationship graph G = (U, E), where U is the set of vertices in the graph and represents all users with movement information, and E is the set of edges, each representing a friendship between two users, wherein the space-time matrix O^(a,b) is adjusted by the parameters σ and τ;
step 2, extracting a k-reachable subgraph for each user pair to describe the network structure between the users, wherein, for a given initial social relationship graph G = (U, E), the k-reachable subgraph of the user pair (u_a, u_b)
Figure FDA0002550117590000014
is extracted by the following steps:
step 201, setting the path length to 2 and initializing
Figure FDA0002550117590000015
as an empty graph;
step 202, finding all paths of length 2 between u_a and u_b in the initial social relationship graph G, expressing all found paths as
Figure FDA0002550117590000021
and then deleting from the social relationship graph G all vertices and edges that appear in both G and
Figure FDA0002550117590000022
except for user u_a and user u_b themselves;
step 203, increasing the path length step by step and repeating step 202 until the path length exceeds k;
step 3, according to the initial social relationship graph G, encoding the k-reachable subgraph of each user pair (u_a, u_b)
Figure FDA0002550117590000023
namely, encoding paths of the same length based on an accumulation principle and concatenating the encoding results of paths of different lengths, thereby vectorizing the k-reachable subgraph of the user pair and obtaining the comprehensive feature vector of the user pair;
step 4, performing 0/1 classification with a classifier according to the comprehensive feature vector of the user pair, where 1 denotes friends and 0 denotes non-friends, thereby obtaining the latest predicted social graph;
step 5, updating the structural features of the user pairs using the latest social graph and predicting again, until the prediction result no longer changes, thereby obtaining the final predicted social network graph.
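The subgraph extraction in steps 201 to 203 can be sketched as follows. This is a minimal illustration and not the claimed implementation: the graph is assumed to be an undirected adjacency dictionary, paths are assumed to be simple (no repeated vertices), path length is counted in edges, and the result is returned as a dictionary from path length to the paths found at that length. All function and type names are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # undirected adjacency: vertex -> set of neighbours


def paths_of_length(graph: Graph, source: str, target: str, length: int) -> List[List[str]]:
    """Find all simple paths with exactly `length` edges from source to target."""
    found: List[List[str]] = []

    def extend(path: List[str]) -> None:
        node = path[-1]
        if len(path) - 1 == length:
            if node == target:
                found.append(list(path))
            return
        for neighbour in sorted(graph.get(node, ())):
            if neighbour in path:
                continue  # keep the path simple
            if neighbour == target and len(path) != length:
                continue  # the target may only appear as the final vertex
            path.append(neighbour)
            extend(path)
            path.pop()

    extend([source])
    return found


def k_reachable_subgraph(graph: Graph, u_a: str, u_b: str, k: int) -> Dict[int, List[List[str]]]:
    """Collect paths of length 2..k, deleting used intermediate vertices after each length."""
    working = {node: set(neighbours) for node, neighbours in graph.items()}  # leave input intact
    paths_by_length: Dict[int, List[List[str]]] = {}
    for length in range(2, k + 1):
        paths = paths_of_length(working, u_a, u_b, length)
        paths_by_length[length] = paths
        # Delete every intermediate vertex (and its edges), but never u_a or u_b.
        used = {node for path in paths for node in path[1:-1]}
        for node in used:
            for neighbour in working.pop(node, set()):
                if neighbour in working:
                    working[neighbour].discard(node)
    return paths_by_length
```

Grouping the result by path length matches step 3, where paths of the same length are encoded together and the encodings for different lengths are concatenated. Deleting used vertices after each length means a longer path cannot reuse an intermediate user that already lies on a shorter path.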
2. The mobile social network user relationship inference method based on spatio-temporal relationship learning according to claim 1, characterized in that, in step 103, the space-time matrix O^(a,b) is input into an autoencoder with R hidden layers, which encodes it into a d-dimensional vector and decodes it to obtain a reconstructed space-time matrix
Figure FDA0002550117590000024
that is as close as possible to the encoder input O^(a,b); the optimization objective of the training process is as follows:
Figure FDA0002550117590000025
where
Figure FDA0002550117590000026
represents the loss function of the autoencoder network in the hybrid network, which makes the reconstructed space-time matrix after decoding
Figure FDA0002550117590000027
as similar as possible to the space-time matrix O^(a,b) before encoding, and U represents all users in the training samples;
the autoencoder is trained in a supervised manner so that the encoding is both reconstructive and discriminative, that is, a classification network is added to the autoencoder to supervise its encoding process, and the loss function of this process
Figure FDA0002550117590000028
is:
Figure FDA0002550117590000029
where
Figure FDA00025501175900000210
represents the prediction result, that is, the output of the classification network; y represents the sample label; and N represents the number of training samples, that is, the number of user pairs in the training data set.
To obtain a more discriminative vector representation, the following constraint is placed on the integrated hybrid network:
Figure FDA0002550117590000031
where
Figure FDA0002550117590000032
represents the composite loss function of the hybrid network.
Once training is complete, the encoder is taken from the autoencoder network and used to encode the spatio-temporal relationship matrix of any user pair in the user set and to perform preliminary relationship inference.
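The loss functions of claim 2 are given only as figures that are not reproduced in this text, so the following sketch should be read as one plausible instance of a hybrid reconstruction-plus-classification objective, not as the claimed formulas. It assumes a mean squared error for the reconstruction term, a binary cross-entropy averaged over the N training pairs for the classification term, and a weighted sum (with a hypothetical `weight` parameter) as the composite loss.

```python
import math
from typing import Sequence


def reconstruction_loss(original: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Assumed form: mean squared error between a flattened matrix and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def classification_loss(labels: Sequence[int], predictions: Sequence[float],
                        eps: float = 1e-12) -> float:
    """Assumed form: binary cross-entropy averaged over the N training pairs."""
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def hybrid_loss(rec: float, cls: float, weight: float = 1.0) -> float:
    """Assumed composite: reconstruction loss plus weighted classification loss."""
    return rec + weight * cls
```

Minimizing such a combined objective pushes the encoder toward codes that both preserve the space-time matrix and separate friend pairs from non-friend pairs, which is the stated purpose of supervising the autoencoder with a classification network.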
CN202010572405.8A 2020-06-22 2020-06-22 Mobile social network user relationship inference method based on spatio-temporal relationship learning Active CN111738447B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010572405.8A CN111738447B (en) 2020-06-22 2020-06-22 Mobile social network user relationship inference method based on spatio-temporal relationship learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010572405.8A CN111738447B (en) 2020-06-22 2020-06-22 Mobile social network user relationship inference method based on spatio-temporal relationship learning

Publications (2)

Publication Number Publication Date
CN111738447A true CN111738447A (en) 2020-10-02
CN111738447B CN111738447B (en) 2022-07-29

Family

ID=72650234

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010572405.8A Active CN111738447B (en) 2020-06-22 2020-06-22 Mobile social network user relationship inference method based on spatio-temporal relationship learning

Country Status (1)

Country Link
CN (1) CN111738447B (en)

Also Published As

Publication number Publication date
CN111738447B (en) 2022-07-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant