CN108650614B

CN108650614B - Mobile user position prediction method and device for automatically deducing social relationship

Info

Publication number: CN108650614B
Application number: CN201810222692.2A
Authority: CN
Inventors: 李翔; 张澍民; 李聪
Original assignee: Fudan University
Current assignee: Fudan University
Priority date: 2018-03-19
Filing date: 2018-03-19
Publication date: 2020-07-28
Anticipated expiration: 2038-03-19
Also published as: CN108650614A

Abstract

The invention belongs to the technical field of mobile behavior prediction, and particularly relates to a mobile user position prediction method and device that automatically infer social relationships. The method comprises the following steps: acquiring individual behavior records of users from a user mobile behavior log database; inferring the social relationship type between users from these records; establishing a user social relationship network with users as nodes and the social relationship type between two users as edges; constructing a discrete movement trajectory sequence for each user in time order from the individual behavior records; generating social relationship subgraphs using the Jaccard coefficient, constructing a zero model (null model), comparing the statistical index values of each social relationship subgraph in the real network and under the zero model, and determining the social relationship motifs of the user group; verifying the social relationship motif of each individual user; and establishing a Markov predictor, an acquaintance predictor, a familiar stranger predictor, and an output regulator to predict the future location of the user. The invention can improve the accuracy of position prediction and protect the personal privacy of users.

Description

Mobile user position prediction method and device for automatically deducing social relationship

Technical Field

The invention belongs to the technical field of mobile behavior prediction, and particularly relates to a mobile user position prediction method and device for automatically deducing social relations.

Background

In recent years, the generation and collection of large-scale human trajectory data has spurred a large body of innovative academic studies characterizing human movement patterns. Empirical analysis and modeling of human movement play an important role in practical applications across many fields, such as location prediction, disease prevention and control, trip planning, data sharing, and disaster response. As one of the most important of these applications, predicting a user's future position has become a research hotspot in both academia and industry.

Currently, methods related to user location prediction can be mainly divided into two categories: one is a method based on the historical movement track of the user, and the other is a method combining the social relationship of the user.

Researchers in earlier years designed a series of prediction methods based on users' historical movement trajectories, the most representative of which are the Markov-chain-based methods. Works such as "Evaluating Next-Cell Predictors with Extensive Wi-Fi Mobility Data", published in IEEE Transactions on Mobile Computing by Song et al. in 2006, and "Approaching the Limit of Predictability in Human Mobility", published in Scientific Reports by Lu et al. in 2013, introduced prediction methods based on Markov chains of different orders.

With the rapid development of social networks, wireless communications, mobile computing, and related fields in recent years, researchers have gradually brought users' social relationships on social networks into the design of location prediction methods. For example, "NextCell: Predicting Location Using Social Interplay from Cell Phone Traces", published in IEEE Transactions on Computers by Zhang et al. in 2015, and "Location Prediction: A Temporal-Spatial Bayesian Model", published in ACM Transactions on Intelligent Systems and Technology by Jia et al. in 2016, use the movement trajectories of a user's social relationships, such as friends, to predict the user's future location.

However, current prediction methods that use social relationships mainly consider acquaintance relationships, such as friends, collected from link relationships on social websites, mobile phone call and short message records, mobile phone address books, and the like. On the one hand, the development of social networks exposes more and more user information, and the disclosure of private information caused by exposing sensitive information in open social networks has gradually drawn attention to privacy protection, which makes it increasingly difficult for researchers to obtain data on users' social relationships. On the other hand, in addition to the close social relationships that we can feel intuitively, a special social relationship exists in daily life, namely "Familiar Strangers". Familiar strangers are people who encounter one another repeatedly but do not know each other, for example people who regularly meet in public buildings or on the road in the course of daily life. Existing prediction methods do not take this familiar stranger relationship into account.

Disclosure of Invention

The invention aims to provide a mobile user position prediction method and a mobile user position prediction device for automatically deducing social relations, which aim to solve the problems that the social relation data of a user needs to be collected and the social relations of the user are not considered sufficiently in the existing position prediction method, thereby protecting the personal privacy of the user and improving the accuracy of the position prediction method.

The invention provides a mobile user position prediction method for automatically deducing social relations, which comprises the following specific steps:

(1) acquiring individual behavior records of a user, namely acquiring the individual behavior records of the user from a user mobile behavior log database, wherein each data record comprises: user ID, access start time, access duration, access location ID;

(2) inferring the type of social relationship between users, namely inferring the type of social relationship between two users from their individual behavior records, wherein the types of social relationship comprise familiar strangers FS (familiar stranger), acquaintances F & IR (friend and in-role, including friends and colleagues), and strangers S (stranger);

(3) establishing a user social relationship network, namely establishing the network with users as nodes and the type of social relationship between two users as edges (stranger relationships are not considered);

(4) establishing a user movement track sequence, namely constructing a discrete movement track sequence of the user according to a time sequence by using the user individual behavior record;

(5) detecting the social relationship motifs of the user group, namely generating social relationship subgraphs using the Jaccard coefficient; generating randomized user social relationship networks by a degree-preserving edge-rewiring method to construct a zero model; and comparing the statistical index z value of each social relationship subgraph in the real network and under the zero model to determine the social relationship motifs of the user group;

(6) verifying the social relationship motif of an individual user, namely generating the social relationship subgraph of the user using the Jaccard coefficient, and checking whether the user's individual social relationship subgraph is one of the social relationship motifs; if so, the verification passes, otherwise it does not pass;

(7) establishing a position predictor, namely establishing a Markov predictor, an acquaintance predictor, a familiar stranger predictor, and an output regulator; if the individual passes the social relationship motif verification, the future position of the user is predicted using only the Markov predictor, the familiar stranger predictor, and the output regulator; if the individual does not pass the social relationship motif verification, the future position of the user is predicted using the Markov predictor, the acquaintance predictor, the familiar stranger predictor, and the output regulator.

In step (1), the individual behavior records of the user are obtained from the user mobile behavior log database, and each record includes: user ID, time, place, dwell time.

In step (2) of the present invention, the social relationship type between two users is inferred from the individual behavior records of the users by the method disclosed in the Chinese patent "a method and apparatus for classifying the offline social relationship based on the movement behavior of the users" (patent No. 201611264316.7), which includes:

obtaining a user set U and a place set L according to the user behavior records, wherein each data record comprises a user ID, an access starting time, an access duration and an access place ID;

determining a user behavior period T and a discretization time step ΔT from the time data in the user behavior records, wherein the user behavior period T divides the whole time axis of the log data into N periods;

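for each period n, a behavior matrix $S_n(u)$ of the user u is constructed, wherein u is the u-th user in the user set U, n is the n-th period of the N periods, t is the t-th discretized time step within a period, and l is the l-th place in the place set L. Each element $s_n^{t,l}(u)$ of the behavior matrix $S_n(u)$ is 0 or 1, taking the value 1 when user u has a behavior record at place l during time step t of period n.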
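As an illustration only, the construction of the per-period behavior matrices might be sketched as follows. The `Record` layout, the use of seconds as the time unit, and the function name `behavior_matrices` are assumptions made for this sketch and do not appear in the source.

```python
from collections import namedtuple

# Hypothetical record layout: user ID, start time (s), duration (s), place ID.
Record = namedtuple("Record", "user start duration place")

def behavior_matrices(records, user, period, step, places):
    """Return {n: matrix} where matrix[t][l] is 1 if `user` has a record
    overlapping time step t of period n at place index l, else 0."""
    steps_per_period = period // step
    place_index = {p: i for i, p in enumerate(places)}
    matrices = {}
    for r in records:
        if r.user != user:
            continue
        first = r.start // step
        # A zero-length record still marks the step in which it starts.
        last = (r.start + max(r.duration, 1) - 1) // step
        for g in range(first, last + 1):
            n, t = divmod(g, steps_per_period)
            m = matrices.setdefault(
                n, [[0] * len(places) for _ in range(steps_per_period)]
            )
            m[t][place_index[r.place]] = 1
    return matrices
```

Periods in which the user has no record are simply absent from the returned dictionary, which is a design choice of this sketch rather than something the text prescribes.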

Spatio-temporal co-occurrence means that user u and user v have behavior records that overlap in time at the same place l. A spatio-temporal co-occurrence represents one "interaction event" between user u and user v in real life. Define $E_n$ as the set of all interaction events in the n-th period; if user u and user v have a spatio-temporal co-occurrence at place l and time step t in the n-th period, then the interaction event $e_n = (u, v, t, l) \in E_n$.
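A minimal sketch of extracting the interaction events of one period from binary behavior matrices is given below. The input layout (a dictionary from user ID to a T x L list of lists) and the name `interaction_events` are assumptions for illustration.

```python
from itertools import combinations

def interaction_events(matrices):
    """matrices: {user: T x L matrix of 0/1 values} for a single period.
    Returns the set of events (u, v, t, l), with u < v, in which both
    users are present in the same spatio-temporal grid (t, l)."""
    events = set()
    for u, v in combinations(sorted(matrices), 2):
        for t, (row_u, row_v) in enumerate(zip(matrices[u], matrices[v])):
            for l, (a, b) in enumerate(zip(row_u, row_v)):
                if a and b:
                    events.add((u, v, t, l))
    return events
```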

For each pair of users (u, v) having at least one interaction event, an interaction matrix $M_{u,v}$ is constructed, wherein u is the u-th user in the user set U, v is the v-th user in the user set U, and l is the l-th place in the place set L. Each element of the interaction matrix $M_{u,v}$ is a doublet whose first component represents the interaction weight and whose second component represents the degree of interaction support; the two components are calculated by formulas (1) and (2), respectively.

The regularity $d_r(u,v)$ of the user spatio-temporal interaction matrix is calculated by formula (3), and the spatio-temporal entropy $d_e(u,v)$ of the user spatio-temporal interaction matrix is calculated by formula (4).

A null hypothesis is constructed: the individual behavior of a user is not influenced by others, and the individual behavior of a user has no periodic bias. According to this null hypothesis, a zero model (i.e., a null model) of the user individual behaviors and of the spatio-temporal interaction matrix between users is established, namely a random user behavior matrix and a random spatio-temporal interaction matrix in each period.

The individual activity is calculated from the user behavior matrix. User activity represents the probability that a user accesses a spatio-temporal grid during a period. A user-spatio-temporal grid bipartite graph is established from the user behavior matrix; it comprises a set of nodes representing the users, a set of nodes representing the spatio-temporal grids (t, l), and edges between users and the spatio-temporal grids in which they have a behavior record. When an element of the user behavior matrix equals 1, an edge exists between user u and the corresponding spatio-temporal grid (t, l).

The user-spatio-temporal grid bipartite graph is randomized by a degree-preserving edge-swapping method to obtain a random user-spatio-temporal grid bipartite graph. This method keeps the degree of every node, the number of nodes, and the number of edges unchanged.

A zero model of the user individual behavior matrix and the user spatio-temporal interaction matrix in each period is then reconstructed from the individual activity and the random user-spatio-temporal grid bipartite graph, comprising: the random user behavior matrix, the random spatio-temporal interaction matrix, the random regularity, and the random spatio-temporal entropy.

The probability distributions of the spatio-temporal entropy and the regularity in the zero model are counted, and zero thresholds for the spatio-temporal entropy and the regularity are determined from a preset probability $p_0$, comprising:

presetting the probability $p_0$, wherein $p_0$ is much less than 1;

determining the zero threshold $e_0$ of the spatio-temporal entropy and the zero threshold $r_0$ of the regularity from the probability distributions of the spatio-temporal entropy and the regularity in the zero model, wherein $e_0$ and $r_0$ are each chosen from the corresponding zero-model distribution according to the preset probability $p_0$.

By comparing the real user interaction matrix with its zero thresholds in the two dimensions of spatio-temporal entropy and regularity, the offline social relationship between two users (familiar strangers FS (familiar stranger), acquaintances F & IR (friend and in-role, including friends and colleagues), or strangers S (stranger)) is determined, including:

if the spatio-temporal entropy of the user interaction matrix is smaller than the spatio-temporal entropy zero threshold and the regularity is larger than the regularity zero threshold, the offline social relationship between the users is determined to be a familiar stranger relationship FS. If the spatio-temporal entropy of the user interaction matrix is greater than the spatio-temporal entropy zero threshold, the offline social relationship between the users is determined to be an acquaintance relationship F & IR; within this relationship, if the regularity is greater than the regularity zero threshold, the relationship is determined to be an in-role relationship IR, such as colleagues or classmates, and if the regularity is less than the regularity zero threshold, the relationship is determined to be a friend relationship F.
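The decision rule above can be summarized in a short sketch. The function name is hypothetical, and two points are assumptions not stated in the text: the remaining case (low entropy, low regularity) is labeled as stranger S, and values exactly equal to a threshold are treated as not exceeding it.

```python
def classify_relationship(entropy, regularity, e0, r0):
    """Map (spatio-temporal entropy, regularity) to a relationship label
    using the zero thresholds e0 and r0."""
    if entropy > e0:
        # Acquaintance relationship F & IR, split by regularity.
        return "IR" if regularity > r0 else "F"
    if regularity > r0:
        return "FS"  # familiar stranger
    return "S"  # assumption: the case not covered by the text is a stranger
```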

In step (3), establishing the user social relationship network with users as nodes and the type of social relationship between two users as edges (stranger relationships are not considered) comprises:

each user u serves as a node, and the type of social relationship between two users determines the edge e: if the relationship between two users is an acquaintance or familiar stranger relationship, an edge is considered to exist between them, and the type of the edge corresponds to the type of the social relationship. A user social relationship network $G = (U, \mathcal{E})$ is thereby constructed, wherein U is the user set and $\mathcal{E}$ is the edge set.
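As a minimal sketch, the typed network can be held as an adjacency dictionary. The input format (a dictionary from user pairs to relationship labels) and the name `build_network` are assumptions for illustration.

```python
def build_network(pair_types):
    """pair_types: {(u, v): label} with label in {"FS", "F&IR", "S"}.
    Returns {user: {neighbor: label}}; stranger pairs produce no edge."""
    adjacency = {}
    for (u, v), rel in pair_types.items():
        if rel == "S":
            continue  # stranger relationships are not considered
        adjacency.setdefault(u, {})[v] = rel
        adjacency.setdefault(v, {})[u] = rel
    return adjacency
```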

In step (4) of the present invention, the constructing a discrete movement trajectory sequence of a user according to a time sequence by using the user individual behavior record includes:

determining the discretization time step ΔT from the time data in the user behavior records; if a user has several records within one discretization time step, the place with the longest visit duration or the largest number of visits is selected as the place for that time step, thereby constructing the discrete movement trajectory sequence of the user.
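One way to realize the "longest visit duration" variant is sketched below. The record layout `(start, duration, place)` in seconds and the function name are assumptions; ties are broken by first appearance, which the text does not specify.

```python
from collections import defaultdict

def discrete_trajectory(records, step):
    """records: iterable of (start, duration, place) for one user.
    Returns {time step index: place with the longest dwell time in it}."""
    dwell = defaultdict(lambda: defaultdict(int))
    for start, duration, place in records:
        end = start + duration
        t = start // step
        while t * step < end:
            # Credit only the part of the visit that falls inside step t.
            lo = max(start, t * step)
            hi = min(end, (t + 1) * step)
            dwell[t][place] += hi - lo
            t += 1
    return {t: max(p, key=p.get) for t, p in sorted(dwell.items())}
```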

In step (5), generating a social relationship subgraph using the Jaccard coefficient includes:

defining $\Gamma(u)$ and $\Gamma(v)$ as the acquaintance relationship neighbor sets of users u and v, respectively, and calculating the Jaccard coefficient J between each user and each of that user's social relationships by the following formula (5):

$$J(u,v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|} \tag{5}$$

defining the n-order social relationship subgraph of a user as the relationship types of the n individuals most important to predicting that user (i.e., those with the most similar movement trajectories).

The social relationships of each user are sorted by Jaccard coefficient in descending order, and the first n individuals are taken to obtain the n-order social relationship subgraph of each user.
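The ranking step can be illustrated with a short sketch. Representing the n-order subgraph as a tuple of relationship labels, keying `rel_type` by sorted user pairs, and breaking ties by user ID are all assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard coefficient of two collections, 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def n_order_subgraph(user, neighbors, rel_type, n):
    """neighbors: {user: set of neighbors}; rel_type: {(u, v): label} keyed
    by the sorted pair. Returns the labels of the top-n relationships."""
    ranked = sorted(
        neighbors[user],
        key=lambda v: (-jaccard(neighbors[user], neighbors[v]), v),
    )
    return tuple(rel_type[tuple(sorted((user, v)))] for v in ranked[:n])
```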

In step (5), generating a randomized user social relationship network by the degree-preserving edge-rewiring method and constructing the zero model comprises:

given an actual social relationship network $G = (U, \mathcal{E})$, for an edge $e_{st}$ in the social relationship edge set $\mathcal{E}$, another edge of the same social relationship type is selected at random; for example, for an edge (u, v, FS), another edge (u', v', FS) is randomly selected. With a given probability, the two edges (u, v, FS) and (u', v', FS) are replaced with (u, v', FS) and (u', v, FS); otherwise, they are replaced with (u, u', FS) and (v, v', FS). If the rewiring would generate a self-loop or a multi-edge, that rewiring operation is abandoned. The above process is repeated until all edges have been rewired, yielding a random social relationship network.

100 such random social relationship networks are generated as the zero model.
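The rewiring procedure might be sketched as follows. The selection probability `p=0.5`, the fixed number of attempts (instead of "until all edges are rewired"), and the function name `rewire` are assumptions of this sketch, since the source does not give these details.

```python
import random

def rewire(edges, n_attempts, p=0.5, seed=0):
    """edges: list of (u, v, rel). Repeatedly pick two edges of the same
    type and swap endpoints, rejecting self-loops and multi-edges."""
    rng = random.Random(seed)
    edges = list(edges)
    present = {frozenset((u, v)) for u, v, _ in edges}
    for _ in range(n_attempts):
        i = rng.randrange(len(edges))
        u, v, rel = edges[i]
        same = [k for k, e in enumerate(edges) if k != i and e[2] == rel]
        if not same:
            continue
        j = rng.choice(same)
        u2, v2, _ = edges[j]
        new = [(u, v2), (u2, v)] if rng.random() < p else [(u, u2), (v, v2)]
        keys = [frozenset(e) for e in new]
        if (any(len(k) < 2 for k in keys) or keys[0] == keys[1]
                or any(k in present for k in keys)):
            continue  # self-loop or multi-edge: abandon this attempt
        present -= {frozenset((u, v)), frozenset((u2, v2))}
        present |= set(keys)
        edges[i], edges[j] = (*new[0], rel), (*new[1], rel)
    return edges
```

Calling such a function 100 times with different seeds would give an ensemble of randomized networks in the spirit of the zero model described above.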

In step (5), comparing the statistical index z value of each social relationship subgraph in the real network and under the zero model to determine the social relationship motifs of the user group comprises:

for a subgraph type m, C(m) denotes the number of occurrences of subgraph m in the real network and $C_{rand}(m)$ denotes the number of occurrences of subgraph m in the random networks corresponding to the real network; μ(·) and σ(·) denote the mean and standard deviation operations, respectively. The z value describing the importance of a subgraph is defined as:

$$z(m) = \frac{C(m) - \mu(C_{rand}(m))}{\sigma(C_{rand}(m))} \tag{6}$$

The z value of each type of social relationship subgraph is computed from the real network and the random networks generated by the zero model; if the z value of a subgraph is significantly larger than 0, that subgraph is determined to be a social relationship motif of the user group.
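A small sketch of the z value computation follows. The significance cutoff `threshold=2.0`, the use of the population standard deviation, and the function names are assumptions; the text only requires the z value to be "significantly larger than 0".

```python
from statistics import mean, pstdev

def z_value(real_count, random_counts):
    """z value of a subgraph type; None when the random counts have
    zero spread and the ratio is undefined."""
    sigma = pstdev(random_counts)
    if sigma == 0:
        return None
    return (real_count - mean(random_counts)) / sigma

def detect_motifs(real, randomized, threshold=2.0):
    """real: {subgraph type: count}; randomized: list of such dicts,
    one per random network. Returns the types whose z exceeds threshold."""
    motifs = []
    for m, count in real.items():
        z = z_value(count, [r.get(m, 0) for r in randomized])
        if z is not None and z > threshold:
            motifs.append(m)
    return motifs
```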

In step (6), the social relationship subgraph of the user is generated using the Jaccard coefficient in the same way as in step (5): the social relationships of each user are sorted by Jaccard coefficient in descending order, and the first n individuals are taken to obtain the n-order social relationship subgraph of each user.

In step (6), checking whether the user's individual social relationship subgraph is a social relationship motif includes:

comparing the social relationship subgraph generated for the user with the social relationship motifs obtained in step (5); if the subgraph is one of those motifs, the verification passes; otherwise, it does not pass.

In step (7) of the present invention, the establishing a markov predictor, an acquaintance predictor, a familiar stranger predictor, and an output regulator includes:

in the Markov predictor, a random variable $X_t$ represents the location of an individual at discrete time step t. All states $\{x_1, x_2, \ldots, x_{t+1}\}$ can be observed from the actual data; each state $x_t \in \{1, 2, \ldots, L\}$ is a location number, and L is the total number of different locations. The user's movement trajectory is modeled with a first-order Markov chain, i.e., the user's next location depends only on the most recently visited location, which can be expressed as:

$$P_M(X_{t+1} = x_{t+1} \mid X_t = x_t, X_{t-1} = x_{t-1}, \ldots, X_1 = x_1) = P_M(X_{t+1} = x_{t+1} \mid X_t = x_t) \tag{7}$$

Given the location of user u at discrete time step t-1 and the first-order Markov transition matrix extracted from the historical location data, the Markov location access probability vector $P_M$ of user u at discrete time step t can be obtained.
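A first-order Markov predictor of this kind can be sketched by counting transitions in the discrete trajectory. The function names and the uniform fallback for a location with no observed outgoing transitions are assumptions of this sketch.

```python
from collections import Counter, defaultdict

def transition_counts(trajectory):
    """Count first-order transitions a -> b in a sequence of places."""
    counts = defaultdict(Counter)
    for a, b in zip(trajectory, trajectory[1:]):
        counts[a][b] += 1
    return counts

def markov_vector(trajectory, current, places):
    """Probability of each place at the next step, given `current`."""
    row = transition_counts(trajectory)[current]
    total = sum(row.values())
    if total == 0:
        # Assumption: fall back to a uniform vector for unseen states.
        return [1 / len(places)] * len(places)
    return [row[p] / total for p in places]
```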

in the acquaintance predictor, the acquaintance relationship F & IR drives the user to move, within a short time, toward the locations of his or her acquaintances.

Suppose the position of user u at discrete time step t+1 is to be predicted, and that an acquaintance v of user u is at location l at discrete time step t and remains at location l from time step t to t+1. The conditional probability that user u visits location l at discrete time step t+1 is then expressed in terms of three quantities: the event that user u visits location l at discrete time step t+1, the probability that user u and user v meet at location l at discrete time step t+1, and the probability that user v remains at location l from time step t to t+1.

Given the acquaintance relationship set $S_{F\&IR} = \{v_1, v_2, \ldots, v_K\}$ of user u, with $w_i$ denoting the normalized interaction frequency between user u and user $v_i$, the probability that user u visits location l at discrete time step t+1 is obtained by weighting over the acquaintances.

Having obtained the probability of user u visiting each place, the location access probability vector $P_{F\&IR}$ of user u is obtained, to which normalization is applied to ensure that its entries sum to 1.

In familiar stranger predictors, a user's place of access may be replicated by their familiar stranger relationships due to the significant periodicity of the user's interactions with their familiar strangers.

As described in step (2), in each period n the behavior matrix of user u can be constructed as $S_n(u)$, wherein u is the u-th user in the user set U, n is the n-th period of the N periods, and l denotes the l-th place in the place set L; each element of the behavior matrix $S_n(u)$ is 0 or 1. The cumulative behavior matrix of user u can be represented as

$$S(u) = \sum_{n=1}^{N} S_n(u)$$

The element of the cumulative behavior matrix of user v at time step t and location l represents the cumulative number of times user v visits location l at time step t, and the probability that user v visits location l at time step t is derived from this count.

Given the familiar stranger relationship set $S_{FS} = \{v_1, v_2, \ldots, v_K\}$ of user u, with $w_i$ denoting the normalized interaction frequency between user u and user $v_i$, the probability that user u visits location l at discrete time step t is obtained by weighting over the familiar strangers.

Having obtained the probability of user u visiting each place at time step t, the location access probability vector $P_{FS}$ is obtained, to which normalization is applied to ensure that its entries sum to 1.
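Because the exact formulas are not reproduced here, the following is only one plausible reading of the familiar stranger predictor: each familiar stranger's visit probability at time step t is taken as the relative frequency in that row of the cumulative behavior matrix, and the rows are mixed with the normalized interaction frequencies. All names and the relative-frequency estimate are assumptions.

```python
def visit_probabilities(cumulative, t):
    """Relative visit frequency per place at time step t, from a
    cumulative T x L count matrix (assumed estimator)."""
    row = cumulative[t]
    total = sum(row)
    return [c / total for c in row] if total else [0.0] * len(row)

def familiar_stranger_vector(cumulatives, weights, t):
    """cumulatives: one cumulative matrix per familiar stranger;
    weights: normalized interaction frequencies w_i."""
    mixed = [0.0] * len(cumulatives[0][t])
    for cum, w in zip(cumulatives, weights):
        for l, p in enumerate(visit_probabilities(cum, t)):
            mixed[l] += w * p
    total = sum(mixed)
    # Normalize so that the entries of the vector sum to 1.
    return [x / total for x in mixed] if total else mixed
```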

In the output regulator, a multiple linear regression model is used to weight and fuse the outputs of the Markov predictor, the acquaintance predictor, and the familiar stranger predictor. Let α, β, and γ be the weight parameters, with α + β + γ = 1; the final output location access probability vector $P_{aggr}$ can then be expressed as:

$$P_{aggr} = \alpha P_M + \beta P_{F\&IR} + \gamma P_{FS} \tag{13}$$

With $P_{real}$ denoting the actual location visit probability vector of the user, the weight parameters are obtained by minimizing the loss function J.

For user u, if the user passes the social relationship motif verification of step (6), only the Markov predictor, the familiar stranger predictor, and the output regulator are used to obtain the final prediction output, i.e., the parameter β is set to 0; otherwise, the Markov predictor, the acquaintance predictor, the familiar stranger predictor, and the output regulator are all used to obtain the final prediction output, i.e., the parameter β is not necessarily 0.
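The weighted fusion and the β = 0 switch can be illustrated with the sketch below. It replaces the regression fit with a plain grid search over the simplex α + β + γ = 1 and assumes a squared-error loss, since the loss function J is not reproduced here; the function names and the `grid` resolution are likewise assumptions.

```python
def fuse(p_m, p_fir, p_fs, alpha, beta, gamma):
    """Weighted fusion of the three predictor outputs."""
    return [alpha * a + beta * b + gamma * c
            for a, b, c in zip(p_m, p_fir, p_fs)]

def fit_weights(samples, use_acquaintance=True, grid=20):
    """samples: list of (p_m, p_fir, p_fs, p_real) vectors.
    Grid search for (alpha, beta, gamma); beta is forced to 0 when
    use_acquaintance is False (user passed motif verification)."""
    best, best_loss = None, float("inf")
    for i in range(grid + 1):
        for j in range(grid + 1 - i):
            if j and not use_acquaintance:
                continue
            w = (i / grid, j / grid, (grid - i - j) / grid)
            loss = sum(
                (x - y) ** 2
                for p_m, p_fir, p_fs, p_real in samples
                for x, y in zip(fuse(p_m, p_fir, p_fs, *w), p_real)
            )
            if loss < best_loss:
                best, best_loss = w, loss
    return best
```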

In another aspect, the present invention provides a mobile user location prediction device for automatically inferring social relationships, comprising:

(1) A user individual behavior record acquisition module, configured to acquire user individual behavior records from the user mobile behavior log database to obtain a user set U and a place set L.

(2) An inter-user social relationship type inference module, configured to establish a spatio-temporal interaction matrix between users from the individual behavior records, extract the spatio-temporal entropy and the regularity, establish a zero model, determine zero thresholds from a preset probability $p_0$, and infer the social relationship between users.

(3) A user social relationship network establishing module, configured to establish a user social relationship network with users as nodes and the type of social relationship between two users as edges (stranger relationships are not considered).

(4) A user movement trajectory sequence establishing module, configured to construct a discrete movement trajectory sequence of each user in time order from the user individual behavior records.

(5) A user group social relationship motif detection module, configured to generate social relationship subgraphs using the Jaccard coefficient; generate randomized user social relationship networks by a degree-preserving edge-rewiring method to construct a zero model; and compare the statistical index z value of each social relationship subgraph in the real network and under the zero model to determine the social relationship motifs of the user group.

(6) A user individual social relationship motif verification module, configured to generate the social relationship subgraph of a user using the Jaccard coefficient, and check whether the user's individual social relationship subgraph is one of the social relationship motifs; if so, the verification passes, otherwise it does not pass.

(7) A position predictor establishing module, configured to establish a Markov predictor, an acquaintance predictor, a familiar stranger predictor, and an output regulator; if the individual passes the social relationship motif verification, the future position of the user is predicted using only the Markov predictor, the familiar stranger predictor, and the output regulator; if the individual does not pass the social relationship motif verification, the future position of the user is predicted using the Markov predictor, the acquaintance predictor, the familiar stranger predictor, and the output regulator.

The above seven modules specifically perform the operations of the seven steps of the prediction method of the present invention. Wherein:

the user individual social relationship motif verification module comprises:

the social relation subgraph generation submodule is used for generating a social relation subgraph of the user by using the Jacard coefficient;

and the social relation motif verification submodule is used for comparing whether the user individual social relation subgraph is the social relation motif or not, if so, the verification is passed, and otherwise, the verification is not passed.

The location predictor building module comprises:

the Markov predictor establishing submodule is used for predicting the future position of the user by utilizing the historical position information of the user;

the acquaintance predictor establishing submodule is used for predicting the future position of the user by utilizing the acquaintance relationship of the user;

a familiar stranger predictor building submodule for predicting a future position of the user by using a familiar stranger relationship of the user;

and the output regulator establishing submodule is used for fusing the outputs of the Markov predictor establishing submodule, the acquaintance predictor establishing submodule and the familiar stranger predictor establishing submodule to obtain the final prediction output.

The technical scheme provided by the invention has the following advantages:

The mobile user position prediction method and device for automatically inferring social relationships provided by the invention introduce the familiar stranger relationship into position prediction for the first time, on top of the traditional prediction of user position from historical trajectory information and acquaintance relationships, thereby improving the accuracy of position prediction and providing a new approach to designing position prediction methods. By defining and mining the social relationship motifs of users, the invention treats different types of users heterogeneously, improving overall prediction performance while reducing computational cost. The invention relies on different types of social relationships inferred directly from users' offline behavior data, so no separate collection of social relationship data is needed; this reduces the raw data required for prediction, protects the personal privacy of users, and reduces the difficulty of data acquisition. The method is suited to scenarios with high geographic resolution, such as Wi-Fi, in contrast to traditional methods, which mostly rely on scenarios with lower geographic resolution, such as base stations and POIs, and it can therefore better predict fine-grained geographic positions.

Drawings

Fig. 1 is a schematic flowchart of a method for predicting a location of a mobile user for automatically inferring a social relationship according to an embodiment of the present invention.

Fig. 2 is a sample diagram of a user movement behavior log according to an embodiment of the present invention.

Fig. 3 is a schematic diagram illustrating determination of a social relationship type of a user according to an embodiment of the present invention.

Fig. 4 is a schematic diagram of generating a social relationship network according to an embodiment of the present invention.

Fig. 5 is a schematic diagram of a user social relationship subgraph provided in an embodiment of the present invention.

Fig. 6 is a schematic structural diagram of a mobile user location prediction apparatus for automatically inferring social relationships according to an embodiment of the present invention.

Fig. 7 is a schematic structural diagram of a module for inferring a social relationship type between users according to an embodiment of the present invention.

FIG. 8 is a schematic diagram of a structure of a user spatio-temporal interaction matrix building module according to an embodiment of the present invention.

Fig. 9 is a schematic structural diagram of a zero model building and zero threshold selecting module according to an embodiment of the present invention.

Fig. 10 is a schematic structural diagram of a user group social relationship motif detection module according to an embodiment of the present invention.

Fig. 11 is a schematic structural diagram of a user individual social relationship motif verification module according to an embodiment of the present invention.

Fig. 12 is a schematic structural diagram of a location predictor establishing module according to an embodiment of the present invention.

Detailed Description

To make the purpose, technical scheme and advantages of the present application clearer, embodiments of the present invention are explained in detail below with reference to the accompanying drawings, taking the wireless network login behavior log data of a university in China as an example.

FIG. 1 is a block diagram illustrating a process of a method for predicting a location of a mobile user for automatically inferring social relationships according to the present invention, as shown in FIG. 1, comprising:

Step 100, obtaining an individual behavior record of the user from a user movement behavior log database, wherein the individual behavior record comprises a user ID, access time, duration and a place ID.

Taking the wireless network login behavior log data collected by a university in China as an example, the campus wireless network login logs are collected and stored by the school's information office and record the login behavior of all users of the campus wireless network; the format of the original data is shown in Fig. 2. Each data record includes a user ID, an access start time, an access duration, and an access location ID. In this data set, all the distinct wireless hotspots (APs) constitute the set of places. Because the coverage of a wireless hotspot is small, a user's device usually connects automatically to the nearest hotspot, so the hotspot in use switches automatically as the user moves from one place to another. Each login record thus characterizes the time and place at which a user accessed the wireless network, and a user's series of login records characterizes that user's movement behavior.

In this embodiment, step 100 obtains user individual behavior records from a user mobile behavior log database to obtain a user set U and a place set L, where each data record includes a user ID, an access start time, an access duration, and an access place ID. The original data format is shown in Fig. 2; each record is a quadruple (u, t_a, t, l), where u is the user number in the user set U, t_a is the access start time, t is the access duration, and l is the place (wireless hotspot) number in the place set L.

Step 101, inferring the social relationship type between two users, namely familiar stranger, acquaintance or stranger, by using the user individual behavior records, as detailed in the Chinese patent "A method and apparatus for offline social relationship classification based on user movement behavior" (patent No. 201611264316.7), specifically comprising the following steps:

(1) Establishing a spatio-temporal interaction matrix between users by using the user individual behavior records.

First, a user behavior period T and a discretization time step ΔT are determined from the time data in the user behavior records. The period T is found by computing, over the whole user set, the probability distribution of the time intervals after which users return to the same place; intervals with markedly prominent probability can be regarded as the user behavior period. Typically, the human behavior period is 1 day or 7 days; in this example T is 7 days. T divides the entire time axis of the log data into N periods. In addition, to fully mine the spatio-temporal patterns of user movement behavior for subsequent analysis, the continuous time axis needs to be discretized: the time step ΔT simplifies the representation of user movement by dividing continuous time into segments of length ΔT. ΔT is chosen according to the specific data; it should generally remove some noise in the data while still showing changes in user behavior. In this example ΔT is 3 hours.

For each period n, a behavior matrix $S_n(u)$ of user u is constructed, where n denotes the n-th of the N periods above; $t \in \{1, 2, \ldots, T/\Delta T\}$ denotes the t-th time step in the n-th period, ΔT being the time step length above, which divides a period into $T/\Delta T$ time steps; and l denotes the l-th place in the place set L. The number of rows of $S_n(u)$ is $T/\Delta T$ (the number of time steps in a period), and the number of columns is $|L|$, the total number of places in L. Each element of $S_n(u)$ is 1 or 0: the element in row t and column l is 1 when user u has a behavior record at place l in time step t of the n-th period, and 0 otherwise.

It should be noted that the user behavior matrix of one period is equivalent to dividing the time and space of that period into $(T/\Delta T) \times |L|$ spatio-temporal grids, each representable by a tuple (t, l); an element equal to 1 indicates that the user accessed the spatio-temporal grid (t, l) during this period.
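The construction of the per-period behavior matrices can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_behavior_matrices`, the encoding of times as integer minutes since the start of the log, and the choice to mark every time step that a record overlaps are all assumptions. The matrices are stored sparsely as sets of (t, l) grids, and time steps are indexed from 0.

```python
from collections import defaultdict

def build_behavior_matrices(records, period_min=7 * 24 * 60, step_min=3 * 60):
    """Return {(user, n): set of (t, place)}, a sparse form of S_n(u).

    records: iterable of (user, start_minute, duration_minutes, place),
    with start_minute counted from the start of the log (assumed encoding).
    """
    steps_per_period = period_min // step_min  # T / delta-T, 56 by default
    matrices = defaultdict(set)
    for user, start, duration, place in records:
        first = start // step_min
        # Last step touched by the half-open interval [start, start + duration).
        last = (start + max(duration, 1) - 1) // step_min
        for step in range(first, last + 1):
            n, t = divmod(step, steps_per_period)
            matrices[(user, n)].add((t, place))  # element (t, place) is 1
    return dict(matrices)
```

A grid that is absent from a set corresponds to a matrix element equal to 0.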

A spatio-temporal interaction matrix is then established between every two users according to spatio-temporal co-occurrence. Spatio-temporal co-occurrence means that user u and user v have behavior records that overlap in time at the same place l; it represents one "interaction event" between user u and user v in real life. Define $E_n$ as the set of all interaction events in the n-th period: if user u and user v co-occur in the n-th period at place l and time step t, then the interaction event $e_n = (u, v, t, l) \in E_n$.

The spatio-temporal interaction matrix of users u and v is denoted $M_{u,v}$, where u and v are the u-th and v-th users in the user set U, $t \in \{1, 2, \ldots, T/\Delta T\}$ denotes the t-th time step in a period, and l denotes the l-th place in the place set L. The number of rows of $M_{u,v}$ is $T/\Delta T$ (the number of time steps in a period) and the number of columns is $|L|$, the total number of places in L; $M_{u,v}$ is thus equivalent to dividing the time and space of one period into $(T/\Delta T) \times |L|$ spatio-temporal grids, each representable by (t, l). Each element of $M_{u,v}$ is a two-tuple consisting of an interaction weight and an interaction support. The interaction weight is the number of periods in which users u and v have an interaction event in the spatio-temporal grid (t, l); the interaction support expresses the probability that users u and v have an interaction event in the grid (t, l). The two quantities can be calculated by:

[formulas not reproduced in the source text]

The interaction weight reflects the degree of preference for the spatio-temporal grid (t, l) when the two users (u, v) have interaction events, and the interaction support represents the probability that an interaction event occurs in the grid (t, l) when u and v are independent of each other. The stronger the periodicity of the users' behavior in the grid (t, l), the larger the interaction support.
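As an illustration of the interaction weight only, the sketch below counts, for every user pair, the number of periods in which both users occupy the same spatio-temporal grid. It assumes that sharing a grid (t, l) within a period counts as a spatio-temporal co-occurrence, and that behavior matrices are given as a hypothetical dictionary `{(user, n): set of (t, l)}`; the interaction support is not computed because its formula is not available in the text.

```python
from collections import Counter
from itertools import combinations

def interaction_weights(matrices):
    """matrices: {(user, n): set of (t, l)}.

    Returns {(u, v): Counter mapping (t, l) to the number of periods in
    which u and v both visited that grid}, with u < v in sort order.
    """
    by_period = {}
    for (user, n), grids in matrices.items():
        by_period.setdefault(n, {})[user] = grids
    weights = {}
    for users in by_period.values():
        for u, v in combinations(sorted(users), 2):
            shared = users[u] & users[v]  # grids where (u, v, t, l) is in E_n
            if shared:
                weights.setdefault((u, v), Counter()).update(shared)
    return weights
```

Pairs that never share a grid do not appear in the result, which matches building matrices only for users with at least one interaction event.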

(2) For each pair of users' spatio-temporal interaction matrix, two interaction characteristics are extracted: spatio-temporal entropy and regularity. The space-time entropy is used for measuring social similarity between two users, and the regularity is used for measuring the periodicity degree of the interaction events between the two users.

The regularity $d_r(u, v)$ of the user spatio-temporal interaction matrix is calculated by:

[formula not reproduced in the source text]

The spatio-temporal entropy $d_e(u, v)$ of the user spatio-temporal interaction matrix is calculated by:

[formula not reproduced in the source text]

(3) Establishing a zero model and determining zero thresholds through a preset probability p.

To distinguish different social relationships by the two interaction characteristics, spatio-temporal entropy and regularity, a zero model of the inter-user spatio-temporal interaction matrix under the null hypothesis needs to be established, and the distributions of spatio-temporal entropy and regularity under the zero model obtained.

By randomizing individual user behavior, a zero model of individual user behavior and of the inter-user spatio-temporal interaction matrix is established, comprising a random user behavior matrix and a random spatio-temporal interaction matrix for each period; the spatio-temporal entropy and regularity random thresholds are then determined from the zero model and the preset probability. The specific steps are as follows:

(a) Calculating individual activity from the user behavior matrices and establishing a user-spatio-temporal grid bipartite graph. The user activity represents the probability that a user accesses a spatio-temporal grid within a period. The bipartite graph $G_{US}$ comprises nodes representing each user in the user set, nodes representing each spatio-temporal grid (t, l), and connecting edges between users and the spatio-temporal grids for which they have behavior records.

In calculating the individual activity, L(u) is defined as the set of all places visited by user u; combined with the user behavior matrices described above, the user activity act(u) can be calculated by the following formula:

[formula not reproduced in the source text]

When building the user-spatio-temporal grid bipartite graph, the user behavior matrix of each period is traversed: if the element at (t, l) is 1, user u has a connecting edge to the spatio-temporal grid (t, l).

(b) Randomizing the user-spatio-temporal grid bipartite graph $G_{US}$ by degree-preserving edge swapping to obtain a random user-spatio-temporal grid bipartite graph. The method keeps the degree of each node, the number of nodes and the number of connecting edges unchanged.

The method randomly selects two connecting edges (u, (t1, l1)) and (v, (t2, l2)) in the bipartite graph and swaps their endpoints, obtaining the new connecting edges (u, (t2, l2)) and (v, (t1, l1)), which are added to the bipartite graph while the original two edges are deleted. After a sufficient number of edge swaps, the randomization is complete. The randomized bipartite graph has the same number of nodes, the same number of connecting edges and the same node degrees as the original graph; that is, each user node is connected to the same number of spatio-temporal grid nodes as in the original graph, and each spatio-temporal grid node is connected to the same number of user nodes as in the original graph. This ensures that originally active users remain active and that spatio-temporal grids originally visited by many users are still visited by many.

In this step, the randomization of the user-spatio-temporal grid bipartite graph is independent for each user, so the spatio-temporal grids connected to a user after randomization are not influenced by social relationships, satisfying the first null hypothesis.
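The degree-preserving edge swap described in (b) can be sketched as below. The function name `degree_preserving_swap`, the fixed random seed, the attempt limit, and the rule of skipping swaps that would create a duplicate edge are assumptions added for the illustration; the text itself only specifies the swap and the invariants.

```python
import random

def degree_preserving_swap(edges, n_swaps, seed=0):
    """Randomize a bipartite graph given as a list of unique (user, grid) edges."""
    rng = random.Random(seed)
    edges = list(edges)
    if len(edges) < 2:
        return edges
    present = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (u, g1), (v, g2) = edges[i], edges[j]
        new1, new2 = (u, g2), (v, g1)
        # Skip swaps that change nothing or would create a duplicate edge.
        if u == v or g1 == g2 or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges
```

Because each swap only exchanges the grid endpoints of two edges, every user keeps its number of grids and every grid keeps its number of users.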

(c) Based on the individual activity in (a) and the random user-spatio-temporal grid bipartite graph in (b), reconstructing a randomized model of the user behavior matrices and the inter-user spatio-temporal interaction matrices in each period, comprising: the random user behavior matrix, the random spatio-temporal interaction matrix, the random regularity and the random spatio-temporal entropy.

To establish the random user behavior matrix, for each period n, if a connecting edge (u, (t, l)) exists in the random user-spatio-temporal grid bipartite graph, the element of the random user behavior matrix of user u at (t, l) is set to 1 with probability act(u); otherwise it is set to 0. This step makes the probability of a user being connected to each connectable spatio-temporal grid the same in every period, with no periodic spatio-temporal bias, satisfying the second null hypothesis.

To establish the random spatio-temporal interaction matrix, a random interaction event set is first established for each period: for the random user behavior matrices of users u and v in period n, if the elements of both matrices at (t, l) are 1, then the random interaction event (u, v, t, l) belongs to the random interaction event set of period n. Correspondingly, according to the definition of the interaction matrix in (1), a random interaction matrix can be obtained, each element of which is a two-tuple calculated from the elements of the random user behavior matrices:

[formulas not reproduced in the source text]

According to the interaction characteristic calculation method in (2), the random regularity and the random spatio-temporal entropy can be calculated from the elements of the random interaction matrix:

[formulas not reproduced in the source text]

(d) Presetting a probability p₀, where p₀ is much less than 1, and determining the spatio-temporal entropy zero threshold e₀ and the regularity zero threshold r₀ from the probability distributions of the random spatio-temporal entropy and the random regularity under the zero model: e₀ is chosen so that the probability that the random spatio-temporal entropy exceeds e₀ is p₀, and r₀ so that the probability that the random regularity exceeds r₀ is p₀.

In general p₀ is less than 0.001 to ensure sufficient confidence. A sufficiently small p₀ means that, under complete randomness, the regularity or spatio-temporal entropy of a user interaction matrix is unlikely to exceed the corresponding zero threshold; in a real scenario, an interaction characteristic greater than the zero threshold therefore indicates some non-random social relationship between the users.
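One way to read the zero threshold is as an upper quantile of the values observed under the zero model. The sketch below is an assumption-laden illustration: the name `zero_threshold` is hypothetical, and an empirical sample of random regularity or random spatio-temporal entropy values stands in for the probability distribution.

```python
def zero_threshold(random_values, p0=0.001):
    """Return an observed value exceeded by at most a fraction p0 of the samples."""
    ordered = sorted(random_values)
    n = len(ordered)
    for k, value in enumerate(ordered):
        # At most n - k - 1 samples are strictly greater than ordered[k].
        if (n - k - 1) / n <= p0:
            return value
    raise ValueError("random_values must not be empty")
```

With p₀ below 0.001, at least a thousand random samples are needed before the threshold falls below the largest observed value.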

(4) Determining the social relationship between two users as familiar strangers FS (familiar stranger), acquaintances F&IR (friend and in-role), or strangers S (stranger) by comparing the user interaction matrix with its zero thresholds in the two dimensions of spatio-temporal entropy and regularity.

In this embodiment, a schematic diagram of the determination of the social relationship types of two users is shown in fig. 3.

If the spatio-temporal entropy $d_e(u, v)$ of the user interaction matrix is less than the spatio-temporal entropy zero threshold e₀ and the regularity $d_r(u, v)$ is less than the regularity zero threshold r₀, the social relationship between the users is determined to be a stranger relationship; if the spatio-temporal entropy is less than e₀ and the regularity is greater than r₀, a familiar stranger relationship; if the spatio-temporal entropy is greater than e₀ and the regularity is less than r₀, a friend relationship; and if the spatio-temporal entropy is greater than e₀ and the regularity is greater than r₀, an in-role relationship such as colleagues or classmates. The friend and in-role relationships together form the acquaintance relationship F&IR.
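The four-quadrant rule above can be written as a small decision function. The function name and the string labels are illustrative, and treating a value exactly equal to its threshold as "not greater" is an assumption, since the text does not cover ties.

```python
def classify_relationship(entropy, regularity, e0, r0):
    """Map (d_e, d_r) and the zero thresholds to 'S', 'FS', 'F' or 'IR'."""
    high_entropy = entropy > e0
    high_regularity = regularity > r0
    if not high_entropy:
        return "FS" if high_regularity else "S"
    # High entropy: friend (low regularity) or in-role (high regularity).
    return "IR" if high_regularity else "F"

def is_acquaintance(label):
    """Friend and in-role relationships together form the F&IR class."""
    return label in ("F", "IR")
```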

Step 102, constructing a user social relationship network by taking users as nodes and the social relationship types between two users as connecting edges (stranger relationships are not considered).

Each user u in the user set U is taken as a node, and the social relationship type between two users obtained in step 101 is taken as a connecting edge e. Because only one interaction has occurred between a pair of users in a stranger relationship, which gives no substantial help for predicting either party's position, the stranger relationship S is not considered here. If the social relationship type between two users is the acquaintance relationship F&IR or the familiar stranger relationship FS, a social relationship connecting edge is considered to exist between them, and the type of the edge corresponds to the type of their social relationship. A user social relationship network G = (U, E) is thereby constructed, where U is the user set and E is the connecting edge set. An example of the generation of a social relationship network is shown in Fig. 4. For the wireless network login behavior log data collected by a university in China, the generated social relationship network has 10,146 nodes and 5,182,743 connecting edges.

Step 103, constructing a discrete movement trajectory sequence of each user in chronological order by using the user individual behavior records. The discretization time step ΔT can be determined from the user behavior data as described in step 101; in this example ΔT is 3 hours. If a user has several records within one discretization time step, the place with the longest visit duration or the largest number of visits is selected as the place for that time step. For example, if user u visits places l₁, l₂ and l₃ for 53 minutes, 28 minutes and 1 hour respectively within one time step, the place chosen for that time step when constructing the user's discrete movement trajectory sequence is l₃. A discrete movement trajectory sequence is thereby obtained for each user. In the present embodiment, the time span of the data set is 84 days.
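A minimal sketch of this discretization, using the longest total visit duration as the selection rule, is given below. The function name, the use of integer minutes, and the simplification that each record is attributed entirely to the time step containing its start are assumptions.

```python
from collections import defaultdict

def discrete_trajectory(records, step_minutes=180):
    """records: (start_minute, duration_minutes, place) tuples for one user.

    Returns {time_step: place with the longest total dwell in that step}.
    """
    dwell = defaultdict(lambda: defaultdict(int))
    for start, duration, place in records:
        dwell[start // step_minutes][place] += duration
    return {
        step: max(places, key=places.get)
        for step, places in sorted(dwell.items())
    }
```

For the example in the text, visits of 53, 28 and 60 minutes within one time step yield the place visited for 60 minutes.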

Step 104, generating social relationship subgraphs by using the Jaccard coefficient; generating randomized user social relationship networks by random edge rewiring and constructing a zero model; and comparing the statistical index z value of each social relationship subgraph under the real network and the zero model to determine the social relationship motifs of the user group.

In this embodiment, the method specifically includes the following steps:

(1) Generating social relationship subgraphs by using the Jaccard coefficient. First, the n-order social relationship subgraph of a user is defined by the social relationship types of the n individuals most important to the user whose position is to be predicted (i.e. those with the most similar movement trajectories); examples are shown in Fig. 5. Fig. 5 shows four different forms of 3-order social relationship subgraph: the central node of each subgraph is the user whose position is to be predicted, the 3 connected nodes are the 3 users most important to that user, and the type of each connecting edge corresponds to the social relationship type of the user pair. The Jaccard coefficient is a representative index of social proximity between users in social network research. It is clearly positively correlated with the trajectory similarity of a user pair, and is cheaper to compute than the trajectory similarity itself. The n most important social relationship individuals of a user are therefore mined with the Jaccard coefficient. Defining Γ(u) and Γ(v) as the sets of acquaintance (F&IR) neighbors of users u and v respectively, the Jaccard coefficient J of each user with each of the user's social relationships is calculated as follows:

$J(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$
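The selection of the n most important social relationship individuals can be sketched with the standard Jaccard coefficient over acquaintance neighbor sets. The dictionary layouts, the function names and the alphabetical tie-break are assumptions made for this illustration.

```python
def jaccard(a, b):
    """Standard Jaccard coefficient of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def top_n_subgraph(user, acquaintances, relations, n=3):
    """Return the n (neighbor, edge_type) pairs with the largest coefficient.

    acquaintances: {user: set of F&IR neighbors}
    relations: {user: {neighbor: "FS" or "F&IR"}}
    """
    mine = acquaintances.get(user, set())
    ranked = sorted(
        relations.get(user, {}).items(),
        key=lambda item: (-jaccard(mine, acquaintances.get(item[0], set())), item[0]),
    )
    return ranked[:n]
```

The multiset of edge types in the returned list identifies which form of n-order subgraph the user has.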

(2) Generating randomized user social relationship networks by degree-preserving edge rewiring and constructing a zero model. Given an actual social relationship network G = (U, E), the generation of the zero model can be expressed as:

(a) for a connecting edge in the social relationship edge set E, randomly select another connecting edge of the same social relationship type; e.g. for a connecting edge (u, v, FS), another connecting edge (u', v', FS) is randomly selected;

(b) with a given probability, replace the two connecting edges (u, v, FS) and (u', v', FS) with (u, v', FS) and (u', v, FS); otherwise replace them with (u, u', FS) and (v, v', FS);

(c) if the rewiring in step (b) would produce a self-loop or a multi-edge, the rewiring operation is abandoned. Repeat the above process until all connecting edges have been rewired, obtaining a random social relationship network;

(d) generate 100 random social relationship networks in this way as the zero model.

(3) Comparing the statistical index z value of each social relationship subgraph under the real network and the zero model, and determining the social relationship motifs of the user group. In general, subgraphs whose counts differ significantly between the real network and the random networks are considered motifs, and in motif research the z value is a representative index of whether that difference in counts is significant. For a subgraph type m, C(m) denotes the number of occurrences of subgraph m in the real network.
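The z value formula itself is not available in this text. The sketch below assumes the z-score commonly used in network motif analysis, namely the real count minus the mean count over the random networks, divided by the standard deviation of the random counts; the function name and the use of the population standard deviation are likewise assumptions.

```python
from statistics import mean, pstdev

def motif_z_score(real_count, random_counts):
    """z = (C_real - mean(C_random)) / std(C_random), an assumed definition."""
    sigma = pstdev(random_counts)
    if sigma == 0:
        raise ValueError("random counts have zero variance")
    return (real_count - mean(random_counts)) / sigma
```

Here `random_counts` would hold the count of subgraph m in each of the random social relationship networks of the zero model.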

In this embodiment, taking the 3rd-order social relationship subgraphs as an example (see Fig. 5), the z values obtained for the 4 different types of social relationship subgraph are shown in the following table:

3rd-order subgraph	①	②	③	④
z value	0.93	-13.51	7.28	39.15

Since the z values of the 3rd-order social relationship subgraphs ③ and ④ are significantly greater than 0, in this embodiment subgraphs ③ and ④ are the social relationship motifs of the user population. Both are dominated by familiar stranger relationships (most or all of the 3 connecting edges of these 3rd-order subgraphs are familiar stranger relationships).

Step 105, generating the social relationship subgraph of the user by using the Jaccard coefficient, and checking whether the user's individual social relationship subgraph is one of the social relationship motifs; if so, the verification passes, otherwise it fails.

First, as described in step 104, the social relationships of the user are sorted by Jaccard coefficient in descending order and the first n individuals are taken to obtain the n-order social relationship subgraph of each user. In this embodiment n is 3, i.e. the 3rd-order social relationship subgraph of the user is generated.

The 3rd-order social relationship subgraph of the user is then compared with the user group social relationship motifs determined in step 104 (in this embodiment, the 3rd-order social relationship subgraphs ③ and ④). If the user's subgraph is one of these motifs, the verification passes; otherwise it fails.

Step 106, respectively establishing a Markov predictor, an acquaintance predictor, a familiar stranger predictor and an output regulator; if the individual passes the social relationship motif verification, the future position of the user is predicted using only the Markov predictor, the familiar stranger predictor and the output regulator, and if the individual does not pass the social relationship motif verification, the future position of the user is predicted using the Markov predictor, the acquaintance predictor, the familiar stranger predictor and the output regulator. The method specifically comprises the following steps:

(1) Establishing a Markov predictor. The Markov predictor predicts a future position from the user's historical position information. The user's movement trajectory sequence is modeled as a first-order Markov chain, that is, the user's next visited location depends only on the previously visited location. Let the random variable $X_t$ denote the location of an individual at discrete time step t. All states $\{x_1, x_2, \ldots, x_{t+1}\}$ can be observed from the actual data; each state $x_t \in \{1, 2, \ldots, L\}$ is a location number, where L is the total number of different locations. The first-order Markov property can be expressed as:

$P_M(X_{t+1} = x_{t+1} \mid X_t = x_t, X_{t-1} = x_{t-1}, \ldots, X_1 = x_1) = P_M(X_{t+1} = x_{t+1} \mid X_t = x_t)$

Given the location of user u at discrete time step t-1 and the first-order Markov transition matrix extracted from the historical location data, the Markov location access probability vector $P_M$ of user u at discrete time step t can be obtained.
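A first-order Markov predictor of this kind can be sketched by counting transitions in the discrete trajectory and normalizing the row of the current location. The function names and the dictionary-based sparse transition matrix are illustrative choices.

```python
from collections import Counter, defaultdict

def fit_transitions(trajectory):
    """Count first-order transitions in a sequence of location numbers."""
    counts = defaultdict(Counter)
    for current, nxt in zip(trajectory, trajectory[1:]):
        counts[current][nxt] += 1
    return counts

def markov_vector(counts, current):
    """Location access probabilities for the next step given the current location."""
    row = counts.get(current, {})
    total = sum(row.values())
    return {place: c / total for place, c in row.items()} if total else {}
```

An empty result means that the current location was never left in the training data, so the Markov predictor alone has nothing to offer for that step.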

(2) Establishing an acquaintance predictor. The acquaintance predictor is based on the observation that users show a significant tendency to move, within a short time, towards where their acquaintances currently are. For example, if a friend of a user is eating at a canteen, the user is likely to move to the friend's current location in order to eat together.

The probability that user u visits location i at discrete time step t+1 is the weighted sum of the influence exerted on the user by each of the user's acquaintance relationships:

[formula not reproduced in the source text]

After the probability of user u visiting each place is obtained, the location access probability vector $P_{F\&IR}$ of user u at that discrete time step is available. In this embodiment, since the elements of the probability vector do not necessarily sum to 1, a normalization is applied for convenience of later calculation so that they do.

(3) Establishing a familiar stranger predictor. Interactions (i.e. geographic encounters) between a user and the user's familiar strangers are markedly periodic: they occur frequently and periodically at certain fixed places. Part of the user's movement trajectory can therefore be reconstructed from the user's group of familiar strangers.

As described in step 101, in each period n the behavior matrix of user u can be constructed, each element of which is 0 or 1. The cumulative behavior matrix of user u may be represented as:

[formula not reproduced in the source text]

The probability of user u visiting each place at time step t is then obtained, giving the location access probability vector $P_{FS}$:

[formula not reproduced in the source text]

(4) An output regulator is established. Three location access probability vectors of the Markov predictor, the acquaintance predictor and the familiar stranger predictor are obtained according to the (1), (2) and (3) parts of the step 106, and the three location access probability vectors need to be fused in order to obtain the final unique location access probability vector output.

The outputs of the Markov predictor, the acquaintance predictor and the familiar stranger predictor are weighted and fused using a multiple linear regression model. Let α, β and γ be weight parameters with α + β + γ = 1; the final output location access probability vector $P_{aggr}$ can then be expressed as:

$P_{aggr} = \alpha P_M + \beta P_{F\&IR} + \gamma P_{FS}$
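The weighted fusion performed by the output regulator can be sketched as below, with each predictor output held as a hypothetical dictionary from place to probability. The weights are passed in directly; fitting them by regression on a training set is not shown, and the function names are assumptions.

```python
def fuse(p_markov, p_acq, p_fs, alpha, beta, gamma):
    """Weighted sum of three location access probability vectors."""
    places = set(p_markov) | set(p_acq) | set(p_fs)
    return {
        place: alpha * p_markov.get(place, 0.0)
        + beta * p_acq.get(place, 0.0)
        + gamma * p_fs.get(place, 0.0)
        for place in places
    }

def predict(fused):
    """Return the place with the highest fused probability."""
    return max(fused, key=fused.get)
```

If each input vector sums to 1 and the three weights sum to 1, the fused vector also sums to 1.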

In this embodiment, the original data is divided in chronological order into a 50% training set and a 50% test set; the training set is used to obtain the weight parameters α, β and γ, and the test set is used to test the performance of the prediction method. The prediction accuracy of user u is the fraction of that user's predictions that are correct, where $\eta_i = 1$ if the i-th prediction for user u is correct, i.e. user u does visit the predicted location, and $\eta_i = 0$ otherwise. This example predicts a 1-week period for each user.

In this example, 3 baseline methods were designed for comparative evaluation, which are:

(a) baseline method 1: first order Markov chain prediction method

(b) Baseline method 2: prediction method combining first-order Markov chain and acquaintance relationship

(c) Baseline method 3: prediction method combining first order Markov chains with familiar stranger relationships

The prediction results of the mobile user position prediction method for automatically inferring social relationships provided by the embodiment and the 3 baseline methods are compared as follows:

Compared with the 3 baseline methods, the mobile user position prediction method for automatically inferring social relationships provided by the invention achieves the highest prediction accuracy and the best effect.

To facilitate a better implementation of the above-described aspects of embodiments of the present invention, the following also provides relevant means for implementing the above-described aspects.

Referring to Fig. 6, an apparatus 600 for predicting a location of a mobile user for automatically inferring social relationships according to an embodiment of the present invention may include: a user individual behavior record acquisition module 601, an inter-user social relationship type inference module 602, a user social relationship network establishment module 603, a user movement track sequence establishment module 604, a user group social relationship motif detection module 605, a user individual social relationship motif verification module 606 and a position predictor establishment module 607.

A user individual behavior record obtaining module 601, configured to obtain user individual behavior records from a user mobile behavior log database to obtain a user set U and a place set L, where each user behavior record includes a user ID, a start time, a duration, and a place;

the inter-user social relationship type inference module 602 is configured to establish a spatio-temporal interaction matrix between users by using the user individual behavior record, extract spatio-temporal entropy and regularity, establish a zero model, determine a zero threshold by using a preset probability p, and infer the type of the inter-user social relationship;

a user social relationship network establishing module 603, configured to establish a user social relationship network by using a user as a node and using the type of social relationship between two users as a continuous edge (without considering a stranger relationship);

a user movement track sequence establishing module 604, configured to utilize the user individual behavior record to establish a discrete movement track sequence of the user according to a time sequence;

a user group social relationship motif detection module 605, configured to generate social relationship subgraphs by using the Jaccard coefficient; generate randomized user social relationship networks by random edge rewiring and construct a zero model; and compare the statistical index z value of each social relationship subgraph under the real network and the zero model to determine the social relationship motifs of the user group;

a user individual social relationship motif verification module 606, configured to generate the social relationship subgraph of the user by using the Jaccard coefficient, and check whether the user's individual social relationship subgraph is one of the social relationship motifs; if so, the verification passes, otherwise it fails;

a location predictor establishing module 607, configured to respectively establish a Markov predictor, an acquaintance predictor, a familiar stranger predictor and an output regulator; if the individual passes the social relationship motif verification, the future position of the user is predicted using only the Markov predictor, the familiar stranger predictor and the output regulator, and if the individual does not pass the social relationship motif verification, the future position of the user is predicted using the Markov predictor, the acquaintance predictor, the familiar stranger predictor and the output regulator.

In an embodiment of the present invention, referring to Fig. 7, the inter-user social relationship type inference module 602 includes:

the inter-user space-time interaction matrix establishing and interaction characteristic extracting submodule 6021 is used for establishing an inter-user space-time interaction matrix by utilizing the user individual behavior record and extracting space-time entropy and regularity;

a zero model establishing and interaction characteristic zero threshold selecting submodule 6022 for establishing a zero model and determining a zero threshold through a preset probability p;

an offline social relationship type determination submodule 6023, configured to determine the offline social relationship between two users by comparing the spatio-temporal entropy and the regularity of the users' real interaction matrix with their respective zero thresholds.
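
The comparison performed by submodule 6023 can be sketched as a small decision function. The sketch below follows the rule stated in the claims (low entropy with high regularity gives a familiar stranger; high entropy gives an acquaintance, split by regularity into IR and F). Treating every remaining case as a stranger, and the handling of values exactly equal to a threshold, are assumptions of this example; the name `classify_relationship` is hypothetical.

```python
def classify_relationship(entropy, regularity, e0, r0):
    """Return 'FS', 'IR', 'F' or 'S' for one pair of users.

    entropy, regularity: features of the real spatio-temporal interaction matrix.
    e0, r0: zero thresholds obtained from the zero model.
    """
    if entropy > e0:
        # Acquaintance relationship F&IR, split by regularity.
        return "IR" if regularity > r0 else "F"
    if regularity > r0:
        # Low entropy but regular interaction: familiar stranger.
        return "FS"
    # Assumption: anything else is treated as a stranger.
    return "S"
```

Only pairs classified as FS, IR or F would then contribute connecting edges to the user social relationship network.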

In an embodiment of the present invention, referring to Fig. 8, the inter-user spatio-temporal interaction matrix establishing and interaction characteristic extracting submodule 6021 includes:

a user interaction event establishing submodule 60211, configured to determine all interaction events among users according to the spatio-temporal co-occurrence, and establish an interaction event set;

a spatio-temporal interaction matrix establishing submodule 60212, configured to establish a spatio-temporal interaction matrix between any two users having at least one interaction event, where each matrix element is a two-tuple that jointly describes the weight and the probability of the interaction;

an interaction characteristic extraction submodule 60213, configured to extract interaction characteristics, including the spatio-temporal entropy and the regularity, from the inter-user spatio-temporal interaction matrix.
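
The detection of interaction events by spatio-temporal co-occurrence (submodule 60211) can be illustrated as follows. This is a minimal sketch that only enumerates events and counts, per spatio-temporal grid cell, how often a pair of users co-occurred; it does not reproduce the weight and probability formulas of the invention. The data layout (one dictionary per period mapping each user to the set of visited cells `(t, l)`) and the function names are assumptions of this example.

```python
from collections import Counter
from itertools import combinations


def interaction_events(visits_by_period):
    """List all co-occurrence events as (n, u, v, t, l) with u < v.

    visits_by_period: list indexed by period n; each item maps a user to the
    set of spatio-temporal cells (t, l) the user occupied in that period.
    """
    events = []
    for n, visits in enumerate(visits_by_period):
        for u, v in combinations(sorted(visits), 2):
            # Two users interact when they occupy the same cell in the same period.
            for t, l in sorted(visits[u] & visits[v]):
                events.append((n, u, v, t, l))
    return events


def cooccurrence_counts(events, u, v):
    """Count, per cell (t, l), the periods in which u and v co-occurred (u < v)."""
    return Counter((t, l) for _, a, b, t, l in events if (a, b) == (u, v))
```

The per-cell counts over all periods are the raw material from which a spatio-temporal interaction matrix for the pair can be built.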

In an embodiment of the present invention, referring to Fig. 9, the zero model establishing and interaction characteristic zero threshold selecting submodule 6022 includes:

the user individual behavior randomization submodule 60221 is configured to randomize the user behavior to obtain a random user behavior matrix;

a random spatio-temporal interaction matrix establishing submodule 60222, configured to derive a random spatio-temporal interaction matrix between users from the random user behavior matrix;

an interaction characteristic zero threshold extraction submodule 60223, configured to extract the spatio-temporal entropy and the regularity under the zero model, compute their probability distributions, and determine a spatio-temporal entropy zero threshold and a regularity zero threshold according to a preset probability p₀.
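
The randomization of user behavior in submodule 60221 can be pictured as edge swapping on the bipartite graph between users and spatio-temporal grid cells. The following sketch is one common way to perform degree-preserving swaps and is offered only as an illustration; the function name `randomize_bipartite`, the number of swap attempts and the seeding are assumptions of this example.

```python
import random


def randomize_bipartite(edges, num_swaps, seed=0):
    """Degree-preserving randomization of a user/cell bipartite graph.

    edges: list of distinct (user, cell) pairs. Each attempt picks two edges
    (u, c) and (v, d) and rewires them to (u, d) and (v, c), unless that would
    create a duplicate edge. Every user and every cell keeps its degree.
    """
    edges = list(edges)
    if len(edges) < 2:
        return edges
    rng = random.Random(seed)
    present = set(edges)
    for _ in range(num_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (u, c), (v, d) = edges[i], edges[j]
        if u == v or c == d:
            continue  # swapping would change nothing
        if (u, d) in present or (v, c) in present:
            continue  # swapping would create a multiple edge
        present -= {(u, c), (v, d)}
        present |= {(u, d), (v, c)}
        edges[i], edges[j] = (u, d), (v, c)
    return edges
```

Recomputing the interaction features on many such randomized graphs gives the distributions from which the zero thresholds are read off.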

In an embodiment of the present invention, referring to Fig. 10, the user group social relationship motif detection module 605 includes:

a user social relationship subgraph generation submodule 6051, configured to generate social relationship subgraphs by using the Jaccard coefficient;

a zero model establishing submodule 6052 for generating a randomized user social relationship network by a random broken edge reconnection method to establish a zero model;

the user social relationship motif determining submodule 6053 is used for comparing the magnitude relation of the statistical index z value of each social relationship subgraph under the real network and the zero model to determine a user group social relationship motif;
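
The z value used by submodule 6053 is not spelled out in this passage. A minimal sketch, assuming the z-score commonly used in network motif detection (the count of a subgraph in the real network minus its mean count in the randomized networks, divided by the standard deviation of those counts), is given below. The function name `motif_z_score` and the use of the population standard deviation are assumptions of this example.

```python
import math
from statistics import mean, pstdev


def motif_z_score(real_count, random_counts):
    """z-score of a subgraph count against counts from the zero model networks."""
    mu = mean(random_counts)
    sigma = pstdev(random_counts)
    if sigma == 0:
        # Degenerate zero model: no spread among the random networks.
        if real_count == mu:
            return 0.0
        return math.copysign(math.inf, real_count - mu)
    return (real_count - mu) / sigma
```

A subgraph type whose z value is significantly larger than 0 would then be reported as a social relationship motif of the user group.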

In an embodiment of the present invention, referring to Fig. 11, the user individual social relationship motif verification module 606 includes:

a social relationship subgraph generation submodule 6061, configured to generate a social relationship subgraph of the user by using the Jaccard coefficient;

a social relationship motif verification submodule 6062, configured to compare whether the social relationship subgraph of the user individual is the social relationship motif, if so, the verification is passed, and otherwise, the verification is not passed;
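
The two submodules above can be sketched together. The example ranks all social relations of a user by the Jaccard coefficient of the acquaintance neighbor sets, keeps the top n, and checks the resulting relation types against a set of motifs. Representing a subgraph as a sorted tuple of relation types, breaking ties by user name, and the names `n_order_subgraph` and `passes_motif_verification` are assumptions of this example.

```python
ACQUAINTANCE_TYPES = {"F", "IR"}


def acquaintance_neighbors(relations, u):
    """Set of users linked to u by an acquaintance relationship (F or IR)."""
    return {v for v, kind in relations.get(u, {}).items() if kind in ACQUAINTANCE_TYPES}


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def n_order_subgraph(relations, u, n):
    """Relation types of the n relations of u with the highest Jaccard coefficient."""
    mine = acquaintance_neighbors(relations, u)
    ranked = sorted(
        relations.get(u, {}),
        key=lambda v: (-jaccard(mine, acquaintance_neighbors(relations, v)), str(v)),
    )
    return tuple(sorted(relations[u][v] for v in ranked[:n]))


def passes_motif_verification(relations, u, n, motifs):
    """True if the n-order subgraph of u is one of the group motifs."""
    return n_order_subgraph(relations, u, n) in motifs
```

Here `relations` maps each user to a dictionary of related users and relation types (`"F"`, `"IR"` or `"FS"`), and `motifs` is the set of subgraphs found at the group level.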

In an embodiment of the present invention, referring to Fig. 12, the location predictor establishing module 607 includes:

a Markov predictor establishing submodule 6071, configured to predict the future location of the user by using the user's historical location information;

an acquaintance predictor establishing submodule 6072 for predicting a future position of the user using an acquaintance relationship of the user;

a familiar stranger predictor building sub-module 6073 for predicting a user's future location using the user's familiar stranger relationships;

an output regulator establishing submodule 6074, configured to fuse the outputs of submodule 6071, submodule 6072 and submodule 6073 to obtain the final prediction output.
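
A minimal sketch of submodules 6071 and 6074 is given below: a first-order Markov transition matrix estimated from a discrete movement track sequence, and a weighted fusion of three location probability vectors with weights that sum to 1. Places are numbered from 0 here for convenience. The function names, the uniform fallback for places with no observed departure, and the example weights are assumptions of this example; fitting the weights is not shown.

```python
def transition_matrix(trajectory, num_places):
    """Row-normalized first-order transition counts from a place sequence."""
    counts = [[0] * num_places for _ in range(num_places)]
    for prev, nxt in zip(trajectory, trajectory[1:]):
        counts[prev][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Assumption: no observed departure from this place, use a uniform row.
            matrix.append([1.0 / num_places] * num_places)
    return matrix


def markov_vector(matrix, current_place):
    """Location access probability vector for the next step."""
    return matrix[current_place]


def fuse(p_m, p_fir, p_fs, alpha, beta, gamma):
    """Weighted fusion alpha*P_M + beta*P_F&IR + gamma*P_FS with weights summing to 1."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [alpha * a + beta * b + gamma * c for a, b, c in zip(p_m, p_fir, p_fs)]
```

For a user who passes the motif verification, the acquaintance term can simply be given weight 0 so that only the Markov and familiar stranger outputs are fused.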

As can be seen from the foregoing embodiments, the present invention first obtains individual behavior records of users from a user mobile behavior log database and infers the type of social relationship between each pair of users; constructs a user social relationship network; constructs a discrete movement track sequence for each user; generates social relationship subgraphs by using the Jaccard coefficient, generates randomized user social relationship networks by random edge-breaking and reconnection to construct a zero model, and mines the social relationship motif of the user group through the z value; checks whether the individual social relationship subgraph of the user to be predicted is the social relationship motif, the verification being passed if so and not passed otherwise; and establishes a Markov predictor, an acquaintance predictor, a familiar stranger predictor and an output regulator. If the individual passes the social relationship motif verification, only the Markov predictor, the familiar stranger predictor and the output regulator are needed; if the individual does not pass, the Markov predictor, the acquaintance predictor, the familiar stranger predictor and the output regulator are all needed. On the basis of the traditional prediction of user position from the user's historical track information and acquaintance relationships, the mobile user position prediction method and device for automatically deducing social relationships provided by the invention introduce the familiar stranger relationship into position prediction for the first time, thereby improving the accuracy of position prediction and providing a new idea for the design of position prediction methods.
By defining and mining the social relationship motif of users, the invention processes different types of users heterogeneously, improving the overall prediction performance while reducing the computational cost. The invention relies on the different types of social relationships inferred directly from users' offline behavior data and does not require separate collection of user social relationship data; this reduces the raw data needed for prediction, protects the personal privacy of users, and lowers the difficulty of data acquisition. The method provided by the invention is suited to scenarios with higher geographic resolution, such as Wi-Fi, in contrast to traditional methods that mostly rely on scenarios with lower geographic resolution, such as base stations and POIs, and can therefore better predict fine-grained geographic positions.

Those skilled in the art will appreciate that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk of a computer, and includes instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims

1. A mobile user position prediction method for automatically deducing social relations is characterized by comprising the following specific steps:

(1) acquiring individual behavior records of a user, namely acquiring the individual behavior records of the user from a user mobile behavior log database, wherein each data record comprises a user ID, an access starting time, an access duration and an access place ID;

(2) deducing the type of the social relationship between users, namely inferring the type of the social relationship between two users by using the individual behavior records of the users, wherein the social relationship types comprise familiar strangers FS, acquaintances F&IR and strangers S, and the acquaintances F&IR comprise friends F and in-role relationships IR such as colleagues or classmates;

(3) establishing a user social relationship network, namely establishing the user social relationship network by taking users as nodes and taking the social relationship types between the two users as connecting edges except stranger relationships;

(4) establishing a user movement track sequence, namely constructing a discrete movement track sequence of the user according to the time sequence by using the individual behavior records of the user;

(5) detecting the social relationship motif of the user group, namely generating social relationship subgraphs by using the Jaccard coefficient; generating a randomized user social relationship network by a random edge-breaking and reconnection method, and constructing a zero model; comparing the statistical index z value of each social relationship subgraph under the real network and the zero model, and determining the social relationship motif of the user group;

(6) verifying the social relationship motif of the individual user, namely generating a social relationship subgraph of the user by using the Jaccard coefficient; checking whether the individual social relationship subgraph of the user is the social relationship motif, the verification being passed if so and not passed otherwise;

(7) establishing a position predictor, namely respectively establishing a Markov predictor, an acquaintance predictor, a familiar stranger predictor and an output regulator; if the individual passes the social relationship motif verification, predicting the future position of the user by using only the Markov predictor, the familiar stranger predictor and the output regulator, and if the individual does not pass the social relationship motif verification, predicting the future position of the user by using the Markov predictor, the acquaintance predictor, the familiar stranger predictor and the output regulator;

in the step (2), the social relationship type between the two users is inferred by using the individual behavior records of the users, and the specific flow is as follows:

for each period n, a behavior matrix of the user u is constructed, each element of which is 0 or 1;

a spatio-temporal co-occurrence denotes behavior records of the user u and the user v that overlap in time at the same place l; a spatio-temporal co-occurrence represents one "interaction event" between the user u and the user v in real life; E_n is defined as the set of all interaction events in the n-th period; if the user u and the user v have a spatio-temporal co-occurrence at place l and time step t in the n-th period, the interaction event e_n = (u, v, t, l) ∈ E_n;

l represents the i-th place in the place set L; each element of the interaction matrix $M_{u,v}$ is a two-tuple, whose first component represents the interaction weight and whose second component represents the interaction support, the two components being calculated by the following formulas (1) and (2):

calculating the regularity $d_r(u, v)$ of the user spatio-temporal interaction matrix by the following formula (3):

calculating the spatio-temporal entropy $d_e(u, v)$ of the user spatio-temporal interaction matrix by the following formula (4):

constructing a null hypothesis: the individual behavior of a user is not influenced by others, and the individual behavior of a user has no periodic bias; establishing, according to the null hypothesis, a zero model of the user individual behavior and of the inter-user spatio-temporal interaction matrix, namely a random user behavior matrix and a random spatio-temporal interaction matrix in each period;

calculating individual activity according to the user behavior matrix, the user activity representing the probability of a user accessing a spatio-temporal grid cell within a period; establishing a user-spatio-temporal grid bipartite graph according to the user behavior matrix; the user-spatio-temporal grid bipartite graph comprises: nodes representing each user in the user set, nodes representing each spatio-temporal grid cell (t, l), and connecting edges between users and the spatio-temporal grid cells for which behavior records exist; when the corresponding element of the user behavior matrix equals 1, a connecting edge exists between the user u and the spatio-temporal grid cell (t, l);

randomizing the user-spatio-temporal grid bipartite graph by a degree-preserving edge swapping method to obtain a random user-spatio-temporal grid bipartite graph, in which the degree of each node, the number of nodes and the number of connecting edges all remain unchanged;

obtaining the random spatio-temporal interaction matrix, the random regularity and the random spatio-temporal entropy under the zero model;

presetting a probability p₀, wherein p₀ is much less than 1;

determining a zero threshold e₀ of the spatio-temporal entropy and a zero threshold r₀ of the regularity according to the probability distributions of the spatio-temporal entropy and the regularity under the zero model, wherein the spatio-temporal entropy zero threshold e₀ and the regularity zero threshold r₀ each satisfy a condition determined by the preset probability p₀;

determining the offline social relationship between two users, namely familiar stranger FS, acquaintance F&IR or stranger S, by comparing the spatio-temporal entropy and the regularity of the real user interaction matrix with their respective zero thresholds, including:

if the spatio-temporal entropy of the user interaction matrix is smaller than the spatio-temporal entropy zero threshold and the regularity is larger than the regularity zero threshold, determining that the offline social relationship between the users is a familiar stranger FS; if the spatio-temporal entropy of the user interaction matrix is larger than the spatio-temporal entropy zero threshold, determining that the offline social relationship between the users is an acquaintance relationship F&IR, wherein if the regularity is larger than the regularity zero threshold, the relationship is the in-role relationship IR (colleague or classmate) within the acquaintance relationship, and if the regularity is smaller than the regularity zero threshold, the relationship is the friend relationship F within the acquaintance relationship;

the step (3) of constructing the user social relationship network comprises the following steps:

each user in U serves as a node, and the type of social relationship between two users serves as a connecting edge e; if the type of social relationship between two users is an acquaintance relationship or a familiar stranger relationship, a connecting edge is considered to exist between the two users, and the type of the connecting edge corresponds to the type of the social relationship, so that a user social relationship network $G = (U, \mathcal{E})$ is constructed, wherein U is the user set and $\mathcal{E}$ is the connecting edge set;

in the step (4), the construction of the discrete movement track sequence of the user according to the time sequence by using the individual behavior record of the user comprises:

determining the discretization time step ΔT according to the time data in the user behavior records; for the behavior records of a user, if a plurality of records fall within one discretization time step, selecting the place with the longest visit duration or the largest number of visits as the place of that time step, so as to construct the discrete movement track sequence of the user;

generating a social relationship subgraph by using the Jaccard coefficient in the step (5) comprises the following steps:

defining Γ(u) and Γ(v) as the acquaintance relationship neighbor sets of users u and v, respectively, and calculating the Jaccard coefficient J between each user and each of its social relations as follows:

$$J(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$$

defining the n-order social relationship subgraph of a user as the relationship types of the top n most important social relationship individuals of the user to be predicted;

sorting the social relationships of each user in descending order of the Jaccard coefficient, and taking the first n individuals to obtain the n-order social relationship subgraph of each user;

generating a randomized user social relationship network by the degree-preserving edge-breaking and reconnection method in the step (5) and constructing a zero model comprises the following steps:

given an actual social relationship network $G = (U, \mathcal{E})$, for one connecting edge $e_{st}$ in the social relationship connecting edge set, randomly selecting another connecting edge with the same social relationship type; i.e. for the connecting edge (u, v, FS), another connecting edge (u', v', FS) is randomly selected; with a given probability, the two connecting edges (u, v, FS) and (u', v', FS) are replaced with (u, v', FS) and (u', v, FS), and otherwise they are replaced with (u, u', FS) and (v, v', FS); if the reconnection would produce a self-loop or a multiple edge, the reconnection operation is abandoned; the above process is repeated until all connecting edges have been reconnected, yielding a random social relationship network;

generating 100 such random social relationship networks as the zero model;

in the step (5), the step of comparing the magnitude relation of the statistical index z value of each social relation subgraph under the real network and the zero model to determine the social relation motif of the user group comprises the following steps:

computing the z value of each type of social relationship subgraph from its statistics under the real network and under the random networks generated by the zero model, and if the z value of a subgraph is significantly larger than 0, determining the subgraph to be a social relationship motif of the user group;

in the step (6), the social relationship subgraph of the user is generated by using the Jaccard coefficient in the same manner as in the step (5): the social relationships of each user are sorted in descending order of the Jaccard coefficient, and the first n individuals are taken to obtain the n-order social relationship subgraph of each user;

comparing whether the user individual social relationship subgraph is the social relationship motif, if so, passing the verification, otherwise, failing to pass the verification, and the method comprises the following steps:

comparing the social relation subgraph generated by the user with the social relation motif obtained in the step (5), if the social relation subgraph is the social relation motif in the step (5), passing the verification, otherwise, not passing the verification;

the establishing of the Markov predictor, the acquaintance predictor, the familiar stranger predictor and the output regulator in the step (7) comprises the following steps:

in the Markov predictor, a random variable $X_t$ represents the location of an individual at discrete time step t; all possible states $\{x_1, x_2, \ldots, x_{t+1}\}$ of the random variable can be observed from the actual data, each state $x_t \in \{1, 2, \ldots, L\}$ is a location number, and L is the total number of distinct locations; the movement trajectory of the user is then modeled with a first-order Markov chain, i.e. the next location of the user depends only on the previously visited location, expressed as:

$$P_M(X_{t+1} = x_{t+1} \mid X_t = x_t, X_{t-1} = x_{t-1}, \ldots, X_1 = x_1) = P_M(X_{t+1} = x_{t+1} \mid X_t = x_t)$$

given the location of the user u at the discrete time step t-1 and a first-order Markov transition matrix extracted from historical location data, obtaining a Markov location access probability vector of the user u at the discrete time step t:

in the acquaintance predictor, a user is driven by the acquaintance relationship F&IR to move, within a short time, toward the positions of his or her acquaintances;

considering the event that user u visits location i at discrete time step t+1, and the probability that user v remains at location i from time step t to t+1, the conditional probability that user u visits location i at discrete time step t+1 is expressed as:

given the acquaintance relationship set $S_{F\&IR} = \{v_1, v_2, \ldots, v_K\}$ of user u, $w_i$ represents the normalized interaction frequency between user u and user $v_i$:

then the probability that user u visits location i at discrete time step t +1 is expressed as:

after the probability of user u visiting each place at time step t is obtained, the location access probability vector of user u at discrete time step t is formed and normalized, i.e. it is ensured that its elements sum to 1;

in the familiar stranger predictor, because the interactions between a user and his or her familiar strangers are markedly periodic, the places visited by the user can be reproduced from the user's familiar stranger relationships;

in each period n, a behavior matrix of user u is constructed, each element of which is 0 or 1; the cumulative behavior matrix of user u is obtained by accumulating the behavior matrices over the periods; with the cumulative number of times user v visits location i at time step t taken from the cumulative behavior matrix of user v, the probability that user v visits location i at time step t is expressed as:

given the familiar stranger relationship set $S_{FS} = \{v_1, v_2, \ldots, v_K\}$ of user u, where K is any positive integer, $w_i$ represents the normalized interaction frequency between user u and user $v_i$:

then the probability that user u visits location i at discrete time step t is expressed as:

after the probability of user u visiting each place at time step t is obtained, the resulting location access probability vector is normalized so that its elements sum to 1;

in the output regulator, the outputs of the Markov predictor, the acquaintance predictor and the familiar stranger predictor are weighted and fused by using a multiple linear regression model; with α, β and γ set as weight parameters satisfying α + β + γ = 1, the final output place visit probability vector $P_{aggr}$ is expressed as:

$$P_{aggr} = \alpha P_M + \beta P_{F\&IR} + \gamma P_{FS}$$

with $P_{real}$ representing the actual place visit probability vector of the user, the weight parameters are obtained by minimizing the loss function J:

2. A mobile user location prediction device for automatically inferring social relationships based on the prediction method of claim 1, comprising:

(1) the user individual behavior record acquisition module is used for acquiring user individual behavior records from the user mobile behavior log database to obtain a user set U and a place set L, wherein each user behavior record comprises a user ID, a start time, a duration and a place;

(2) the inter-user social relationship type inference module is used for establishing a spatio-temporal interaction matrix between users by using the individual behavior records of the users, extracting the spatio-temporal entropy and the regularity, establishing a zero model, determining zero thresholds by means of a preset probability p₀, and inferring the social relationships between users;

(3) the user social relationship network establishing module is used for establishing a user social relationship network by taking the user as a node and taking the type of the social relationship between the two users as a connecting edge except the stranger relationship;

(4) the user movement track sequence establishing module is used for establishing a discrete movement track sequence of the user according to the time sequence by utilizing the user individual behavior record;

(5) the user group social relationship motif detection module is used for generating social relationship subgraphs by using the Jaccard coefficient; generating a randomized user social relationship network by a random edge-breaking and reconnection method, and constructing a zero model; comparing the statistical index z value of each social relationship subgraph under the real network and the zero model, and determining the social relationship motif of the user group;

(6) the user individual social relationship motif verification module is used for generating a social relationship subgraph of the user by using the Jaccard coefficient; checking whether the individual social relationship subgraph of the user is the social relationship motif, the verification being passed if so and not passed otherwise;

(7) the position predictor establishing module is used for respectively establishing a Markov predictor, an acquaintance predictor, a familiar stranger predictor and an output regulator; if the individual passes the social relationship motif verification, predicting the future position of the user by using only the Markov predictor, the familiar stranger predictor and the output regulator, and if the individual does not pass the social relationship motif verification, predicting the future position of the user by using the Markov predictor, the acquaintance predictor, the familiar stranger predictor and the output regulator;

these 7 modules correspond to the operational content of the 7 steps of the prediction method.

3. The prediction device of claim 2, wherein the user group social relationship motif detection module comprises:

the user social relationship subgraph generation submodule is used for generating social relationship subgraphs by using the Jaccard coefficient;

the zero model establishing submodule is used for generating a randomized user social relationship network by a random broken edge reconnection method and establishing a zero model;

and the user social relation motif determining submodule is used for comparing the magnitude relation of the statistical index z value of each social relation subgraph under the real network and the zero model and determining the user group social relation motif.

4. The prediction device of claim 2, wherein the user individual social relationship motif verification module comprises:

a social relationship subgraph generation submodule, used for generating a social relationship subgraph of the user by using the Jaccard coefficient;

a social relationship motif verification submodule, used for checking whether the individual social relationship subgraph of the user is the social relationship motif, the verification being passed if so and not passed otherwise.

5. The prediction apparatus of claim 2, wherein the location predictor establishing module comprises:

the Markov predictor establishing submodule predicts the future position of the user according to the historical position information of the user;

the acquaintance predictor establishing submodule predicts the future position of the user by using the acquaintance relationship of the user;

a familiar stranger predictor building submodule for predicting the future position of the user according to the familiar stranger relationship of the user;