CN114118250B - Cross-platform user identity recognition method based on activity similarity - Google Patents


Info

Publication number
CN114118250B
CN114118250B CN202111389814.5A CN202111389814A CN114118250B CN 114118250 B CN114118250 B CN 114118250B CN 202111389814 A CN202111389814 A CN 202111389814A CN 114118250 B CN114118250 B CN 114118250B
Authority
CN
China
Prior art keywords
user
activity
interest
similarity
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111389814.5A
Other languages
Chinese (zh)
Other versions
CN114118250A (en)
Inventor
李勇军
黄丽蓉
颜兆洁
张银银
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202111389814.5A
Publication of CN114118250A
Application granted
Publication of CN114118250B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The invention discloses a cross-platform user identity recognition method based on activity similarity. First, the time and semantic information in an activity track are combined to extract the user's activity pattern. Second, similarity scores between user activity patterns are calculated; to distinguish the importance of different point-of-interest types, each type is assigned a weight using the concept of inverse document frequency. Third, a point-of-interest embedding layer, analogous to word embedding in natural language processing, generates an embedded representation for each point of interest, and a vector representation of each user's activity pattern is then generated from the activity pattern and the point-of-interest embeddings. Finally, user activity similarity is calculated from these representations, and the most similar users are taken to have the same natural-person identity. Because the invention calculates user similarity in the semantic space and embeds users' activity habits into a low-dimensional space, it can efficiently find the best-matching user for any given user.

Description

Cross-platform user identity recognition method based on activity similarity
Technical Field
The invention relates to trajectory-based methods for linking the identities of users across platforms, and in particular to a method based on activity similarity.
Background
Cross-platform user linking plays a vital role in numerous applications, such as user interest recommendation and location prediction. Many studies address this topic using user attributes and social networks, but user attributes and social-network characteristics are inconsistent across service platforms. Moreover, some sensitive information cannot be obtained or used for identity analysis because of privacy concerns. Unlike platform-specific information, a user's spatiotemporal track can provide stable and consistent identity information. With the popularity of Global Positioning System tracking technology, a mobile device records its spatiotemporal trajectory regardless of which service the user accesses. User identities can therefore be linked by analyzing the spatiotemporal locations of user activity.
Existing methods that link user identities through spatiotemporal tracks mainly identify users by matching them on statistical features. By introducing different weight allocation policies, they use the frequency of user visits, the locations visited, and the popularity of "encounter" events to calculate the similarity between users. However, these methods cannot capture the dynamic patterns and semantic information of a track.
In the conventional track-based identity linking problem, the longitude and latitude of GPS positioning are generally used to represent track points, but such points carry no semantic information about the track. Much work has modeled human movement, but most of it models spatiotemporal regularities, and human movement is typically modeled as a random process around a fixed point. The biggest drawback of this modeling approach is that activity information is ignored: what a person is doing at a particular place and time is the purpose of the movement, and it is implicit in the track. In recent years, some studies have added semantic information about track points to spatiotemporal tracks to improve the accuracy of user identity linking. However, these methods use semantic information only as auxiliary information and cannot fuse spatiotemporal information with semantic information. Unlike the prior art, the present invention provides a new method that combines the time and semantic information of a user's track and learns the user's hidden activity habits.
Disclosure of Invention
To fully mine the semantic information in a user's track, the invention provides a system based on representation learning that converts a point-of-interest track into a distinctive activity habit while preserving time and semantic information. First, the time and semantic information in the activity track are combined to extract the user's activity pattern. Second, a similarity score between user activity patterns is calculated; to distinguish the importance of different point-of-interest types, the invention assigns them different weights using the concept of inverse document frequency. Third, analogous to word embedding in natural language processing, a point-of-interest embedding layer is introduced to generate an embedded representation for each point of interest. A vector representation of the user's activity pattern is then generated from the activity pattern and the point-of-interest embeddings. Finally, user activity similarity is calculated from these representations, and the most similar users are taken to have the same natural-person identity.
(1) Extracting activity patterns of a user
The invention expresses the point-of-interest track of a user as $T(u) = \{p_1, p_2, \ldots, p_t\}$, where $p_t$ is the point-of-interest type of the address visited by the user at time point $t$ and $u$ denotes the user. Because user activity patterns are strongly periodic and predictable, it is necessary to analyze the user's daily activities, so the invention divides the user's point-of-interest track into sub-tracks $T_{sub}(u)$, each one day long. To better analyze the user's daily activity habits, the method divides the day into $m$ time partitions and counts the points of interest that the user frequently visits in each partition. Each entry of these statistics (whose symbol is not preserved in this text) indicates that user $u_i$ visited point of interest $p_t$ in the $j$-th time period $n_t$ times. The user's daily activity pattern, denoted $L$, is defined from these per-partition statistics [equation not preserved in the source text].
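The extraction step can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name `extract_activity_pattern`, the `(datetime, poi_type)` input format, and the default `top_k=3` are assumptions made for the example, and visit counts are pooled over all days for brevity.

```python
from collections import Counter
from datetime import datetime


def extract_activity_pattern(visits, m=48, top_k=3):
    """Count the most frequent POI types in each of m daily time partitions.

    visits: iterable of (datetime, poi_type) pairs (hypothetical input format).
    Returns a list of m lists of (poi_type, count) pairs, most frequent first.
    """
    counters = [Counter() for _ in range(m)]
    for timestamp, poi_type in visits:
        minute_of_day = timestamp.hour * 60 + timestamp.minute
        # Integer arithmetic maps minute 0..1439 onto partition 0..m-1.
        slot = minute_of_day * m // (24 * 60)
        counters[slot][poi_type] += 1
    return [counter.most_common(top_k) for counter in counters]


visits = [
    (datetime(2008, 1, 1, 9, 10), "restaurant"),
    (datetime(2008, 1, 2, 9, 20), "restaurant"),
    (datetime(2008, 1, 2, 9, 5), "bank"),
    (datetime(2008, 1, 1, 23, 59), "residence"),
]
pattern = extract_activity_pattern(visits, m=48, top_k=2)
```

With 48 partitions each slot covers 30 minutes, so the three morning visits above all fall into partition 18.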
(2) Analyzing and calculating similarity scores of user activity patterns
The invention introduces a new index to measure the similarity of activity patterns between users in the original space. The intuition behind the similarity score is that similar users tend to appear in similar types of places at similar times. The invention therefore calculates the number of co-occurrences of points of interest within a given time period. For a user $u_A$ on platform A and a user $u_B$ on platform B, each with the per-partition point-of-interest statistics of step (1), the invention defines the temporal activity similarity $S(u_A, u_B)$ as follows: [equation not preserved in the source text]
where the per-partition term denotes the frequent point-of-interest statistics of user $u_A$ in the $j$-th time period. User linking can thus be performed by computing the semantic similarity between users. For user $u_A$, the most similar user $u_i'$ on platform B is the one with the maximum temporal activity similarity score, $u_i' = \arg\max_{u_B \in B} S(u_A, u_B)$, and $u_A$ and $u_i'$, which share the most similar activity pattern, are linked together.
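Since the exact similarity formula is not preserved in this text, the following Python sketch only illustrates the stated intuition (shared point-of-interest types in the same time partition) and the argmax linking rule. Scoring co-occurrence as the smaller of the two visit counts is an assumption of this example, as are the names `temporal_similarity` and `link_user` and the optional `weights` argument.

```python
def temporal_similarity(pattern_a, pattern_b, weights=None):
    """Sum co-occurrences of POI types over aligned time partitions.

    Each pattern is a list of partitions; each partition is a list of
    (poi_type, count) pairs. Co-occurrence is scored as min(count_a, count_b),
    which is an illustrative choice, not the formula from the patent.
    """
    score = 0.0
    for slot_a, slot_b in zip(pattern_a, pattern_b):
        counts_b = dict(slot_b)
        for poi_type, n_a in slot_a:
            if poi_type in counts_b:
                w = 1.0 if weights is None else weights.get(poi_type, 1.0)
                score += w * min(n_a, counts_b[poi_type])
    return score


def link_user(pattern_a, candidates):
    """Return the id of the platform-B user with the highest similarity."""
    return max(candidates, key=lambda uid: temporal_similarity(pattern_a, candidates[uid]))


a = [[("bank", 2)], [("restaurant", 1)]]
b1 = [[("bank", 1)], [("restaurant", 3)]]
b2 = [[("restaurant", 5)], [("bank", 5)]]
best = link_user(a, {"b1": b1, "b2": b2})
```

Here `b2` visits the same place types as `a` but in different partitions, so it scores zero and `b1` is linked.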
To improve $S(u_A, u_B)$, the invention refines the similarity function by introducing the idea of TF-IDF (term frequency-inverse document frequency) to distinguish the importance of different points of interest. TF-IDF is a common weighting technique in information retrieval and data mining that reflects how important a word is to a document within a corpus. Inspired by TF-IDF, the method calculates the term frequency $\mathrm{tf}(p_t, t)$ and the inverse document frequency $\mathrm{idf}(p_t, T)$ of each point of interest: [equations not preserved in the source text]
The term frequency is based on the raw statistics of a point of interest in a track, for example the number of occurrences of point of interest $p_t$ in track $t$.
In the inverse document frequency, $N = |T|$ is the number of tracks in the dataset, and $|\{t \in T : p_t \in t\}|$ is the number of tracks that contain point of interest $p_t$. The TF-IDF weight of a point of interest is then calculated as follows:
$$\mathrm{tfidf}(p_t, t, T) = \mathrm{tf}(p_t, t) \cdot \mathrm{idf}(p_t, T) \qquad (7)$$
The TF-IDF value is used as the weight of each point of interest, and the invention designs an improved temporal activity similarity score $S^{\#}(u_A, u_B)$, whose co-occurrence function is defined as follows: [equation not preserved in the source text]
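The weighting in formula (7) can be sketched in Python as below. The defining formulas for tf and idf are missing from this text, so the example assumes the standard forms (count normalized by track length, and the logarithm of $N$ over the number of tracks containing the point of interest); the function name `poi_tfidf` and the list-of-lists input are likewise hypothetical.

```python
import math
from collections import Counter


def poi_tfidf(tracks):
    """Compute a TF-IDF weight for every POI type in every track.

    tracks: list of tracks, each a list of POI types (hypothetical format).
    Assumes tf = count / track length and idf = log(N / tracks containing POI).
    Returns one {poi_type: weight} dict per track.
    """
    n_tracks = len(tracks)
    tracks_containing = Counter()
    for track in tracks:
        tracks_containing.update(set(track))  # count each track once per POI
    weights = []
    for track in tracks:
        counts = Counter(track)
        weights.append({
            poi: (count / len(track)) * math.log(n_tracks / tracks_containing[poi])
            for poi, count in counts.items()
        })
    return weights


tracks = [["bank", "restaurant", "restaurant"], ["restaurant", "park"]]
tfidf = poi_tfidf(tracks)
```

A point-of-interest type that appears in every track, such as `restaurant` here, receives weight zero, which is how the weighting suppresses ubiquitous place types.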
(3) Representation learning of track points of interest
Although the user's daily temporal activity record $L$ provides statistics of the user's activity pattern, this statistical feature is still insufficient for analysis. First, it cannot capture how different points of interest relate to one another; for example, the difference between Bank of Beijing and China Construction Bank is significantly smaller than the difference between Bank of Beijing and a Beijing restaurant. Second, the similarity of user activity patterns calculated above is feature-based and cannot be used to further link user identities. The invention therefore proposes a representation-learning method that learns an embedded representation of the user's activity pattern, from which user activity similarity can be calculated easily with classical distance functions.
The distribution of a user's points of interest in a track closely resembles the word-frequency distribution in natural language, so word-embedding methods from natural language processing can be applied to the problem of embedding points of interest. Inspired by the word2vec model, the invention designs a POI2vec model to learn low-dimensional embeddings of points of interest.
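Specifically, as in the continuous bag-of-words model, the target point of interest $p_t$ can be predicted from its context points of interest, that is, by maximizing the probability function $P(p_t \mid \mathrm{Context}(p_t))$. The conditional probability $P(p_t \mid \mathrm{Context}(p_t))$ is defined by a normalized exponential (softmax) function:
$$P(p_t \mid \mathrm{Context}(p_t)) = \frac{\exp\left(v(p_t) \cdot v_{Context}\right)}{\sum_{p \in V} \exp\left(v(p) \cdot v_{Context}\right)}$$
where $V$ is the set of all points of interest in the dataset, $v(p_t) \in \mathbb{R}^d$ (where $d$ is the dimension of the low-dimensional space) is the embedded representation of point of interest $p_t$, and $v_{Context}$ is the sum of the vectors of the context points of interest $\mathrm{Context}(p_t)$. Finally, the training goal of POI2vec is to maximize the average log-probability over all points of interest: [equation not preserved in the source text]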
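The conditional probability and training objective described above can be illustrated with a small pure-Python sketch. It assumes a single embedding table shared by targets and contexts and an average log-probability objective, as in word2vec's continuous bag-of-words model; the function names and the toy embeddings are hypothetical, and no training loop is shown.

```python
import math


def context_probability(target, context, embeddings):
    """Softmax probability of `target` given the summed context embedding."""
    dim = len(next(iter(embeddings.values())))
    v_context = [sum(embeddings[p][i] for p in context) for i in range(dim)]
    scores = {p: sum(a * b for a, b in zip(vec, v_context))
              for p, vec in embeddings.items()}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {p: math.exp(s - max_score) for p, s in scores.items()}
    return exp_scores[target] / sum(exp_scores.values())


def average_log_probability(track, window, embeddings):
    """Average log P(p_t | Context(p_t)) over a track (assumed objective)."""
    total = 0.0
    for i, target in enumerate(track):
        context = track[max(0, i - window):i] + track[i + 1:i + 1 + window]
        total += math.log(context_probability(target, context, embeddings))
    return total / len(track)


embeddings = {"bank": [1.0, 0.0], "atm": [0.9, 0.1], "park": [-1.0, 0.0]}
probs = {p: context_probability(p, ["atm"], embeddings) for p in embeddings}
objective = average_log_probability(["bank", "atm", "park"], 1, embeddings)
```

Training would adjust the vectors to increase `objective`; in practice an existing word2vec implementation could be applied to tracks treated as sentences.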
(4) Representation learning of trace activity patterns
Based on the point-of-interest embeddings obtained in the previous step, the user's temporal activity embedding can be obtained. In the user's temporal activity statistics $L$, the invention counts the $k$ (top-$k$) points of interest that the user visits most frequently in each time partition of the day. From the temporal activity statistics $L$ and the point-of-interest embeddings $v(p_t)$ obtained in the previous step, the user's activity habit is expressed as a vector $v \in \mathbb{R}^{m \times dim}$, where $m$ is the number of time partitions and $dim$ is the dimension of the point-of-interest embedding. If the user has POI records in a time period, the embedding of that period is represented by the embeddings of its frequent POIs, and the user's embedding for the period is computed from the number of occurrences and the tf-idf weight of each POI as follows: [equation not preserved in the source text]
where concat denotes the concatenation of vectors, $p_{jl}$ is the $l$-th frequent point of interest of the user in the $j$-th time partition, and $n_{jl}$ is its visit frequency. As in the definition of the temporal activity similarity score, the invention incorporates the TF-IDF weights into the representation of the user's temporal activity pattern.
If the user has no point-of-interest record within a time partition, the invention proposes three strategies to replace the missing value: 1) replace it with a zero vector; 2) replace it with the most frequent point of interest in the other time partitions; 3) replace it with a weighted average of the points of interest in all other time partitions.
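The concatenation step and the three missing-value strategies can be sketched as follows. The per-partition formula is not preserved in this text, so the example assumes a weighted average with weight equal to visit count times TF-IDF weight; the function name `embed_activity_pattern`, the strategy labels, and the default weight of 1.0 are all hypothetical.

```python
from collections import Counter


def embed_activity_pattern(pattern, embeddings, dim, weights=None, strategy="mean"):
    """Concatenate one dim-sized vector per time partition.

    pattern: list of partitions, each a list of (poi_type, count) pairs.
    strategy: how to fill empty partitions: "zero", "most_frequent" or "mean".
    """
    weights = weights or {}

    def weighted_average(items):
        total = sum(n * weights.get(p, 1.0) for p, n in items)
        if total == 0:
            return [0.0] * dim
        return [sum(n * weights.get(p, 1.0) * embeddings[p][i] for p, n in items) / total
                for i in range(dim)]

    observed = [item for slot in pattern for item in slot]
    if strategy == "zero" or not observed:
        fill = [0.0] * dim                      # strategy 1: zero vector
    elif strategy == "most_frequent":
        totals = Counter()
        for poi, n in observed:
            totals[poi] += n
        fill = list(embeddings[totals.most_common(1)[0][0]])  # strategy 2
    elif strategy == "mean":
        fill = weighted_average(observed)       # strategy 3: weighted average
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    vector = []
    for slot in pattern:
        vector.extend(weighted_average(slot) if slot else fill)
    return vector


embeddings = {"bank": [1.0, 0.0], "park": [0.0, 1.0]}
pattern = [[("bank", 3), ("park", 1)], []]
v_zero = embed_activity_pattern(pattern, embeddings, 2, strategy="zero")
v_freq = embed_activity_pattern(pattern, embeddings, 2, strategy="most_frequent")
v_mean = embed_activity_pattern(pattern, embeddings, 2, strategy="mean")
```

The result has length `m * dim`, one block per time partition, so two users' vectors are aligned partition by partition.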
(5) User identity linking
Through the above steps, an embedded representation of each user's temporal activity is obtained. Cosine similarity is commonly used to measure the similarity between two vectors, and the invention defines the similarity between the activity habits of two users as follows:
$$D(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert}$$
where $v_1$ and $v_2$ are the representations of the two users' activity habits. Thus, given a user on one platform, the method can find the user with the most similar activity habits on another platform in the dataset and link the two users, that is, treat them as having the same user identity.
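The linking step amounts to a nearest-neighbour search under cosine similarity, which can be sketched as below. The function names and the dictionary of candidate vectors are hypothetical, and returning 0.0 for an all-zero vector is a convention chosen for this example.

```python
import math


def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # convention for all-zero vectors (assumption)
    return dot / (norm1 * norm2)


def most_similar_user(query_vector, candidate_vectors):
    """Return the id of the candidate whose embedding is closest to the query."""
    return max(candidate_vectors,
               key=lambda uid: cosine_similarity(query_vector, candidate_vectors[uid]))


match = most_similar_user([1.0, 0.2], {"b1": [0.9, 0.1], "b2": [-0.5, 1.0]})
```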
Because the method extracts each user's distinctive activity pattern by combining time and semantic information, it can calculate user similarity in the semantic space. In addition, the method adopts a representation-learning model that embeds users' activity habits into a low-dimensional space, so that the best-matching user can be found efficiently for any user.
Drawings
FIG. 1 is a flow chart of a cross-platform user identity recognition method based on activity similarity.
Detailed Description
To illustrate an embodiment of the method, we take the GeoLife dataset as an example and describe the processing steps. GeoLife is a GPS dataset collected by Microsoft Research Asia that records the activity trajectories of 182 users over a period of more than five years (April 2007 to August 2012). To obtain fine-grained activity information, we use the geocoder API provided by Amap (Gaode Map) to obtain detailed point-of-interest information from latitude and longitude.
As shown in FIG. 1, the algorithm for identifying the same cross-platform user by activity similarity is as follows:
Input: the trajectory data of a user $u_j$ on platform A and the set of user trajectory data on platform B
Output: the user $u_j'$ on platform B whose activity habits are most similar to those of $u_j$
Step 1: preprocess the user's activity track and extract the user's activity pattern;
Step 2: calculate the users' activity similarity in the original space according to formulas (1)-(8);
Step 3: obtain the embedded representation $v(p)$ of each point of interest from the POI2vec model;
Step 4: generate the embedded representation $v$ of the user's activity pattern from $v(p)$ and the missing-value replacement strategy;
Step 5: calculate the activity similarity $D(v_i, v_j)$ between users, and link the identities of the users with the highest similarity.
The invention uses acc@K and mean rank as the metrics of model performance. acc@K is defined as follows:
$$\mathrm{acc@}K = \frac{\#\,\text{correctly identified users@}K}{\#\,\text{users}}$$
where # correctly identified users@K denotes the number of users whose true counterpart is correctly predicted among the first K candidates, and # users denotes the total number of unidentified users. Mean rank is computed by ranking all candidate users by similarity, with higher similarity ranked first, and recording, for every user, the rank of the candidate that has the same identity. The average of these ranks is reported here. A lower mean rank indicates a better method.
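Both metrics follow directly from the rank of each user's true counterpart, as the short Python sketch below shows. The helper names and the score dictionary are hypothetical, and ties in similarity are broken arbitrarily by the sort.

```python
def rank_of_true_match(scores, true_id):
    """1-based rank of the true counterpart when candidates are sorted by score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(true_id) + 1


def acc_at_k(ranks, k):
    """Fraction of users whose true counterpart is among the first k candidates."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_rank(ranks):
    """Average rank of the true counterpart; lower is better."""
    return sum(ranks) / len(ranks)


example_rank = rank_of_true_match({"a": 0.9, "b": 0.5, "c": 0.7}, "c")
ranks = [1, 3, 2, 1]
```

For the four example ranks, `acc_at_k(ranks, 1)` is 0.5, `acc_at_k(ranks, 2)` is 0.75, and `mean_rank(ranks)` is 1.75.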
After analyzing the trajectories in the dataset, the invention obtains each user's daily track of frequent points of interest. Using the similarity scores defined in formulas (1)-(8), the initial user linking results shown in Table 1 below are obtained:
Table 1. Analysis of original activity similarity on the dataset. (Ratio: the proportion of users whose most similar user is themselves, calculated using equation (1) without tf-idf weights. Ratio#: the same proportion calculated with tf-idf weights.)
Dataset    Ratio    Ratio#
GeoLife    94.5%    98.9%
It can be seen that the activity similarity formula defined by the invention identifies cross-platform user identities to a large extent.
Using the POI2vec model provided by the invention, low-dimensional embeddings of the different points of interest in the activity tracks are obtained. For the model parameters, the invention sets the embedding dimension of points of interest to 80 and the context window size to 3 (1.5 hours). A larger embedding dimension preserves more of the original semantic information, but the performance gain diminishes as the dimension increases. If the window is too small, the associations between points of interest may not be captured well; if it is too large, errors are introduced and the performance of the model is reduced. We therefore recommend an embedding dimension of 80 and a context window size of 3.
Based on the POI2vec model, the invention uses the POI embedding dictionary to convert the user activity pattern $L$ into an embedded representation $v$ of the user activity pattern. More specifically, we divide one day into 48 equal-length slices (30 minutes per slice). The parameter is set to 48 because it provides the best granularity without suffering from data sparseness. Time slices lacking point-of-interest information are marked as the "missing" type in the dataset. The frequent points of interest in each time partition are represented by a weighted sum of their vectors, and the embedded vectors of the 48 time slices are concatenated into the embedded representation of the user's activity habits.
In this experiment, the invention investigates the impact of the different "missing"-type replacement strategies on the similarity ranking of user temporal activity embeddings. Strategy 1 replaces the missing value with a zero vector, strategy 2 replaces the "missing" type with the most common POI in the other time periods, and strategy 3 replaces the "missing" type with a weighted average of the vectors of all POIs in the other time periods. In the POI-type trajectories obtained, 36.39% of the time slices in the GeoLife dataset are marked as missing. Table 2 shows the impact of the three strategies on the user embedding similarity ranking. It can be seen that strategy 3 works best; it considers the user's behavior habits during all recorded periods of the day.
Table 2. Mean rank (MR) of user temporal activity embedding similarity
Dataset    MR@strategy 1    MR@strategy 2    MR@strategy 3
GeoLife    2.2197           13.3021          1.9175

Claims (5)

1. A cross-platform user identity recognition method based on activity similarity, characterized in that:
first, an activity pattern of a user is extracted by combining the time and semantic information in an activity track; second, similarity scores between user activity patterns are calculated, and, to distinguish the importance of different point-of-interest types, different weights are assigned to different point-of-interest types using the concept of inverse document frequency; third, a point-of-interest embedding layer, analogous to word embedding in natural language, is introduced to generate an embedded representation for each point of interest; a vector representation of the user's activity pattern is then generated from the user's activity pattern and the point-of-interest embeddings; and finally, user activity similarity is calculated from the generated representations of the user activity patterns, the most similar users having the same natural-person identity;
the method specifically comprises the following steps:
(1) Extracting activity patterns of a user
The point-of-interest track of the user is represented as $T(u) = \{p_1, p_2, \ldots, p_t\}$, where $p_t$ is the point-of-interest type of the address visited by the user at time point $t$ and $u$ denotes the user; because user activity patterns are strongly periodic and predictable, it is necessary to analyze the user's daily activities, so the user's point-of-interest track is divided into sub-tracks $T_{sub}(u)$, each one day long; to better analyze the user's daily activity habits, the day is divided into $m$ time partitions and the points of interest frequently visited by the user are counted in each time partition, each entry of these statistics (whose symbol is not preserved in this text) indicating that user $u_i$ visited point of interest $p_t$ in the $j$-th time period $n_t$ times; the user's daily activity pattern, denoted $L$, is defined from these per-partition statistics [equation not preserved in the source text];
(2) Analyzing and calculating similarity scores of user activity patterns
A new index is introduced to measure the similarity of activity patterns between users in the original space, the intuition behind the similarity score being that similar users tend to appear in similar types of places at similar times; the number of co-occurrences of points of interest within a given time period is calculated, and for a user $u_A$ and a user $u_B$, each with the per-partition point-of-interest statistics of step (1), the temporal activity similarity $S(u_A, u_B)$ is defined as follows: [equation not preserved in the source text]
where the per-partition term denotes the frequent point-of-interest statistics of user $u_A$ in the $j$-th time period; user linking is thus performed by computing the semantic similarity between users: for user $u_A$, the most similar user $u_i'$ on platform B is the one with the maximum temporal activity similarity score, $u_i' = \arg\max_{u_B \in B} S(u_A, u_B)$, and $u_A$ and $u_i'$, which share the most similar activity pattern, are linked together;
the TF-IDF value is calculated as the weight of each point of interest, giving the improved temporal activity similarity score $S^{\#}(u_A, u_B)$, whose co-occurrence function is defined as follows: [equation not preserved in the source text]
(3) Representation learning of track points of interest
Although the user's daily temporal activity record $L$ provides statistics of the user's activity pattern, this statistical feature is still insufficient for analysis: first, it cannot capture how different points of interest relate to one another; second, the similarity of user activity patterns calculated above is feature-based and cannot be used to further link user identities; a representation-learning method is therefore proposed to learn an embedded representation of the user's activity pattern, from which user activity similarity can be calculated easily with classical distance functions;
the distribution of a user's points of interest in a track closely resembles the word-frequency distribution in natural language, so word-embedding methods from natural language processing can be applied to the problem of embedding points of interest; inspired by the word2vec model, a POI2vec model is designed to learn low-dimensional embeddings of points of interest;
specifically, as in the continuous bag-of-words model, the target point of interest $p_t$ can be predicted from its context points of interest, that is, by maximizing the probability function $P(p_t \mid \mathrm{Context}(p_t))$, the conditional probability $P(p_t \mid \mathrm{Context}(p_t))$ being defined by a normalized exponential (softmax) function:
$$P(p_t \mid \mathrm{Context}(p_t)) = \frac{\exp\left(v(p_t) \cdot v_{Context}\right)}{\sum_{p \in V} \exp\left(v(p) \cdot v_{Context}\right)}$$
where $V$ is the set of all points of interest in the dataset, $v(p_t) \in \mathbb{R}^d$ (where $d$ is the dimension of the low-dimensional space) is the embedded representation of point of interest $p_t$, and $v_{Context}$ is the sum of the vectors of the context points of interest $\mathrm{Context}(p_t)$; finally, the training goal of POI2vec is to maximize the average log-probability over all points of interest: [equation not preserved in the source text]
(4) Representation learning of user activity patterns
Based on the point-of-interest embeddings obtained in the previous step, the user's temporal activity embedding is obtained; in the user's activity pattern $L$, the $k$ (top-$k$) points of interest that the user visits most frequently in each time partition of the day are counted; from the user's activity pattern $L$ and the point-of-interest embeddings $v(p_t)$ obtained in the previous step, the embedded vector of the user's activity pattern is expressed as $v \in \mathbb{R}^{m \times dim}$, where $m$ is the number of time partitions and $dim$ is the dimension of the point-of-interest embedding; if the user has POI records in a time period, the embedding of that period is represented by the embeddings of its frequent POIs, and the user's embedded vector for the period is computed from the number of occurrences and the tf-idf weight of each POI as follows: [equation not preserved in the source text]
where concat denotes the concatenation of vectors, $p_{jl}$ is the $l$-th frequent point of interest of the user in the $j$-th time partition, and $n_{jl}$ is its visit frequency; as in the definition of the temporal activity similarity score, the TF-IDF weights are incorporated into the representation of the user activity pattern;
if the user has no point-of-interest record within a time partition, three strategies are proposed to replace the missing value: 1) replace it with a zero vector; 2) replace it with the most frequent point of interest in the other time partitions; 3) replace it with a weighted average of the points of interest in all other time partitions;
(5) User identity linking
Through the above steps, an embedded representation of each user's temporal activity is obtained; cosine similarity is commonly used to measure the similarity between two vectors, and the similarity between the activity habits of two users is defined as follows:
$$D(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert}$$
where $v_1$ and $v_2$ are the representations of the two users' activity habits; thus, given a user on one platform, the user with the most similar activity habits is found on another platform in the dataset, and the two users are linked, that is, treated as having the same user identity.
2. The cross-platform user identity recognition method based on activity similarity according to claim 1, wherein, in step (2), to improve $S(u_A, u_B)$, the similarity function is refined by introducing the idea of TF-IDF (term frequency-inverse document frequency) to distinguish the importance of different points of interest; TF-IDF is a common weighting technique in information retrieval and data mining that reflects how important a word is to a document within a corpus; inspired by TF-IDF, the term frequency $\mathrm{tf}(p_t, t)$ and the inverse document frequency $\mathrm{idf}(p_t, T)$ of each point of interest are calculated: [equations not preserved in the source text]
the term frequency is based on the raw statistics of a point of interest in a track, for example the number of occurrences of point of interest $p_t$ in track $t$;
in the inverse document frequency, $N = |T|$ is the number of tracks in the dataset, and $|\{t \in T : p_t \in t\}|$ is the number of tracks that contain point of interest $p_t$; the TF-IDF weight of a point of interest is then calculated as follows:
$$\mathrm{tfidf}(p_t, t, T) = \mathrm{tf}(p_t, t) \cdot \mathrm{idf}(p_t, T) \qquad (7)$$
3. The cross-platform user identity recognition method based on activity similarity according to claim 1, wherein external semantic information about the user's activity track is used to overcome the limitation of physical distance, so that hidden, fixed user activity patterns can be captured even for tracks that are physically far apart.
4. The cross-platform user identity recognition method based on activity similarity according to claim 1, wherein the context information of the user's activity track is used to calculate vector representations of the activity track points by maximizing the probability function $P(p_t \mid \mathrm{Context}(p_t))$, and the training goal of POI2vec is to maximize the average log-probability: [equation not preserved in the source text]
unsupervised representation learning of the activity track points being achieved by maximizing this average over the entire dataset.
5. The cross-platform user identity recognition method based on activity similarity according to claim 1, wherein the top-k frequent activity places in the user activity pattern are analyzed, and, to address data sparseness, three strategies are provided to replace missing values: 1) replace the missing value with a zero vector; 2) replace it with the most frequent point of interest in the other time partitions; 3) replace it with a weighted average of the points of interest in all other time partitions.
CN202111389814.5A 2021-11-22 2021-11-22 Cross-platform user identity recognition method based on activity similarity Active CN114118250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111389814.5A CN114118250B (en) 2021-11-22 2021-11-22 Cross-platform user identity recognition method based on activity similarity

Publications (2)

Publication Number Publication Date
CN114118250A (en) 2022-03-01
CN114118250B (en) 2024-04-12

Family

ID=80439634

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111389814.5A Active CN114118250B (en) 2021-11-22 2021-11-22 Cross-platform user identity recognition method based on activity similarity

Country Status (1)

Country Link
CN (1) CN114118250B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013107669A1 (en) * 2012-01-20 2013-07-25 Telefónica, S.A. A method for the automatic detection and labelling of user point of interest
CN104268171A (en) * 2014-09-11 2015-01-07 东北大学 Activity similarity and social trust based social networking website friend recommendation system and method
CN107194434A (en) * 2017-06-16 2017-09-22 中国矿业大学 A kind of mobile object similarity calculating method and system based on space-time data
CN109726336A (en) * 2018-12-21 2019-05-07 长安大学 A kind of POI recommended method of combination trip interest and social preference

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11042810B2 (en) * 2017-11-15 2021-06-22 Target Brands, Inc. Similarity learning-based device attribution
US11636439B2 (en) * 2019-06-18 2023-04-25 Capital One Services, Llc Techniques to apply machine learning to schedule events of interest

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
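张莹; 李智; 张省. User trajectory similarity algorithm for location-based social networks. Journal of Sichuan University (Engineering Science Edition). 2013, (S2), full text. *
胡德敏; 杨晨. A point-of-interest recommendation model based on multiple types of contextual information. Application Research of Computers. 2017, (06), full text. *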

Also Published As

Publication number Publication date
CN114118250A (en) 2022-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant