WO2023015658A1 - Point-of-interest recommendation method and system based on brain-like spatiotemporal perception representation

Info

Publication number
WO2023015658A1
Authority
WO
WIPO (PCT)
Prior art keywords
interest
spatiotemporal
point
poi
spatial
Prior art date
Application number
PCT/CN2021/117879
Other languages
English (en)
French (fr)
Inventor
唐华锦
马歌华
燕锐
Original Assignee
浙江大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江大学
Publication of WO2023015658A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present invention relates to the technical field of artificial intelligence, and in particular to a point-of-interest (POI) recommendation method and system based on brain-like spatiotemporal perception representation.
  • By taking the time dimension into account, POI recommendation can accurately recommend the next POI.
  • Such a recommendation algorithm can mine POI-related information, provide users with a recommendation list, and guide them to a suitable next location, which benefits both users and POI owners.
  • Recommendation based on user profiling has two prominent problems. First, recommendation performance cannot be guaranteed in the cold-start case: for users with little or no access history, recommendations that rely on preferences are unreliable. Second, private data about users' personal preferences may leak, which leads to systemic ethical issues.
  • Because POIs have inherent geospatial properties, incorporating spatial information into recommendation can greatly improve recommendation quality.
  • Lian et al. proposed using power-law and normal distributions to describe the spatial distribution of POIs. Feng et al. characterized the geographic location of POIs through multi-level two-dimensional space partitioning. However, these works capture the geospatial information of POIs empirically and in fact depend entirely on hand-set priors.
  • The purpose of the present invention is to provide a POI recommendation method and system based on brain-like spatiotemporal perception representation. By mining the complex spatiotemporal characteristics and the access sequence characteristics of the POIs themselves, it uses a brain-like spatiotemporal perception embedding model, inspired by the entorhinal-hippocampal structure of the brain, to represent POIs efficiently from multiple angles.
  • a point-of-interest recommendation method based on brain-like spatiotemporal perception representation including:
  • the context graph structure of the point of interest includes an access sequence context graph, a spatial context graph and a spatiotemporal context graph of the point of interest;
  • the POI access sequence embedding model in the brain-like spatiotemporal perception embedding model is trained by an unsupervised learning method; the POI access sequence embedding model is used to extract the POI sequence representation vector;
  • the spatiotemporal embedding model in the brain-like spatiotemporal perception embedding model is trained by unsupervised learning; the spatiotemporal embedding model is used to extract the POI joint spatiotemporal representation vector, which comprises a spatial embedding representation vector and a spatiotemporal embedding representation vector;
  • a recurrent neural network recommender is trained based on the POI spatiotemporal perception representation vector, and the next POI is recommended by the trained recurrent neural network recommender.
  • the construction process of the access sequence context graph is as follows:
  • the adjacent POIs in the POI visit sequence are connected by edges to construct a context graph of the visit sequence.
  • the spatially adjacent POIs are the K POIs closest to the central POI.
  • the spatiotemporal embedding model in the brain-like spatiotemporal perception embedding model is trained through unsupervised learning, specifically including:
  • the spatiotemporal embedding model in the brain-like spatiotemporal perception embedding model is trained by unsupervised learning.
  • the construction of the POI visit time matrix is as follows:
  • the present invention also provides a point-of-interest recommendation system based on brain-like space-time perception representation, including:
  • a POI context graph structure building module for constructing the POI context graph structure based on a POI visit dataset; the POI context graph structure includes the access sequence context graph, the spatial context graph and the spatiotemporal context graph of the POIs;
  • a first sampling module configured to sample the access sequence context graph to obtain sampling samples of interest points
  • the first training module is used to train the POI access sequence embedding model in the brain-like spatiotemporal perception embedding model based on the POI sampling sample through an unsupervised learning method; the POI access sequence embedding model is used to extract the POI sequence representation vector;
  • a second sampling module configured to sample the spatial context graph and the spatiotemporal context graph, obtain spatial interest point samples and spatiotemporal interest point samples, and generate an interest point access time matrix
  • the second training module is used to train the spatiotemporal embedding model in the brain-like spatiotemporal perception embedding model through unsupervised learning based on the spatial interest point sample, the spatiotemporal interest point sample and the interest point access time matrix;
  • the spatio-temporal embedding model is used to extract the joint spatio-temporal representation vector of the interest point;
  • the joint spatio-temporal representation vector of the interest point includes a spatial embedding representation vector and a spatio-temporal embedding representation vector;
  • a synthesis module for synthesizing the interest point sequence representation vector and the interest point joint spatiotemporal representation vector into an interest point spatiotemporal perception representation vector
  • the invention discloses the following technical effects:
  • The present invention fully exploits the spatiotemporal characteristics of the POI itself, obtains a highly discriminative spatiotemporal perception representation, and can recommend POIs under extreme conditions such as cold start without violating user privacy.
  • For the spatial characteristics of the POI, the method of the present invention adopts a spatial position encoder based on the grid cell model of the entorhinal cortex to mine the multi-scale geographic distribution of POIs.
  • For the temporal characteristics of the POI itself, the method tensorizes the POI access time pattern and exploits these characteristics through a multi-level spatiotemporal coupling: adjacent access timestamps, POIs with similar access time patterns, and spatiotemporally adjacent POIs.
  • The present invention draws on the graph representation mechanism of the entorhinal-hippocampal cognitive structure and on the word embedding (Word Embedding) method from natural language processing, makes full use of the spatiotemporal and sequence context relations of the POIs themselves, constructs context graphs from different angles, and realizes unsupervised representation learning.
  • The proposed method requires no additional data annotation cost (POI tags, text screening, etc.): the order of POI visits in the sequences, the geographic location of POIs, and the visit times of POIs can all be obtained during data collection.
  • FIG. 1 is a flowchart of a method for recommending points of interest based on brain-like spatiotemporal perception representations according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a method for recommending points of interest based on brain-like spatiotemporal perception representations according to an embodiment of the present invention
  • Figure 3 is a sequence context definition diagram
  • Figure 4 is a spatial context diagram
  • Figure 5 is the tensorization of interest point access time patterns.
  • a large amount of point-of-interest-related information can be expressed through the graph structure to support the learning of point-of-interest representations;
  • the spatial encoding method of grid cells can be used as the basis for spatial modeling of points of interest;
  • the multi-sensory signal joint representation mode of place cells also brings inspiration for the utilization of the time-dimensional characteristics of interest points themselves.
  • Inspired by these findings, the present invention provides a POI recommendation method based on brain-like spatiotemporal perception representation.
  • A brain-like spatiotemporal perception embedding model inspired by the entorhinal-hippocampal structure of the brain is used to represent POIs efficiently from multiple angles.
  • The method combines the contextual features of the POI access sequence, the spatial distribution of POIs, and the joint spatiotemporal features of POIs, and trains the corresponding neural network models for representation extraction through an unsupervised learning strategy of context graph (Context Graph) construction, sampling, and representation.
  • a point-of-interest recommendation method based on brain-like spatiotemporal perception representations includes:
  • Step 101: Build the POI context graph structure based on the POI visit dataset.
  • The POI context graph structure includes the access sequence context graph (Sequential Context Graph), the spatial context graph (Spatial Context Graph) and the spatiotemporal context graph (Spatiotemporal Context Graph).
  • The POI access dataset used to build the context graphs can be the public dataset Gowalla or Instagram Check-in.
  • The Gowalla dataset was collected through the data interface of the location-based social game Gowalla and contains more than 6.44 million access records from 57,436 POIs. Each record contains the geographic location and access time of the POI.
  • The Instagram Check-in dataset was collected from the social network Instagram and includes more than 2.21 million POI access records from 13,187 POIs, generated by 78,233 users. Each record contains a timestamp and the additional text content of the post.
  • the adjacent POIs in the POI visit sequence are connected by edges to construct a context graph of the visit sequence.
  • The present invention preprocesses the data to remove outliers, that is, it filters out POIs with fewer than 10 visits and users with fewer than 10 visit records. For each user, the user's access records are selected and sorted chronologically to obtain a POI access sequence. A central POI (target POI) is selected in the access sequence, and its sequence-context (sequence-adjacent) POIs are those in the same sliding window as the central POI, as shown in Figure 3. The width of the sliding window is an adjustable hyperparameter; the method of the present invention uses 2 (excluding the target POI) as its default value. Connecting sequence-adjacent POIs with edges yields the POI access sequence context graph, which is used for subsequent sequence embedding model training.
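The preprocessing and sliding-window construction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the tuple layout of the records, the single-pass order of the two filters, and the reading of the window width as the reach on each side of the target POI are all assumptions.

```python
from collections import defaultdict

def build_sequence_context_graph(checkins, window=2, min_count=10):
    """Build an undirected sequence context graph.

    checkins: iterable of (user_id, poi_id, timestamp) tuples (assumed layout).
    window:   reach on each side of the target POI (assumed interpretation).
    Returns a set of edges, each a sorted (poi_a, poi_b) tuple.
    """
    checkins = list(checkins)

    # Drop POIs with fewer than min_count visits (single pass, an assumption).
    poi_count = defaultdict(int)
    for _, poi, _ in checkins:
        poi_count[poi] += 1
    kept = [c for c in checkins if poi_count[c[1]] >= min_count]

    # Drop users with fewer than min_count remaining records.
    user_count = defaultdict(int)
    for user, _, _ in kept:
        user_count[user] += 1
    kept = [c for c in kept if user_count[c[0]] >= min_count]

    # One chronologically ordered POI sequence per user.
    sequences = defaultdict(list)
    for user, poi, _ in sorted(kept, key=lambda c: c[2]):
        sequences[user].append(poi)

    # Connect each target POI to the POIs inside its sliding window.
    edges = set()
    for seq in sequences.values():
        for i, center in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i and seq[j] != center:
                    edges.add(tuple(sorted((center, seq[j]))))
    return edges
```

Storing edges as sorted tuples keeps the graph undirected, so each sequence-adjacent pair appears once regardless of visit order.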
  • the spatially adjacent POIs are connected by edges to construct a spatial context graph; the spatially adjacent POIs are the K POIs closest to the central POI.
  • The process of building the spatial context graph is divided into two parts: coordinate transformation and graph building.
  • The graph-building part is illustrated in Figure 4. Geographic locations are usually given as latitude and longitude, whereas the grid cell position encoder used in the method of the present invention takes a vector in two-dimensional Euclidean space as input. The method therefore uses a general coordinate conversion to map latitude and longitude to two-dimensional Euclidean coordinates. Specifically, the latitude and longitude coordinates of the WGS84 geographic coordinate system are converted into projected coordinates (two-dimensional space coordinates) under the NAD27 projected coordinate system.
  • The method of the present invention judges spatial adjacency by the distance between POIs.
  • For a central POI, the K POIs closest to it are defined as its spatially adjacent POIs; the method of the present invention uses 10 as the default value of K.
  • Spatially adjacent POIs are connected by edges to form the spatial context graph. All edges in the graph are equivalent. This avoids the defect of filtering spatially adjacent POIs with a distance threshold, which yields too many adjacent POIs in dense areas and too few in sparse areas, and thus keeps the number of adjacent POIs balanced across the graph.
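The K-nearest-neighbour adjacency rule can be sketched as below. This is an illustrative brute-force version under stated assumptions: the function name and the dictionary input are hypothetical, and the coordinates are assumed to be already projected into a two-dimensional Euclidean plane (the coordinate conversion step is not shown).

```python
import math

def build_spatial_context_graph(coords, k=10):
    """Map each POI to its K nearest POIs by Euclidean distance.

    coords: dict poi_id -> (x, y) in projected coordinates (assumed input).
    Returns dict poi_id -> list of the k closest poi_ids, nearest first.
    """
    neighbors = {}
    for center, center_xy in coords.items():
        # Distance from the central POI to every other POI (O(n^2) overall).
        ranked = sorted(
            (math.dist(center_xy, xy), poi)
            for poi, xy in coords.items()
            if poi != center
        )
        neighbors[center] = [poi for _, poi in ranked[:k]]
    return neighbors
```

Because every POI receives exactly K neighbours (or fewer only when the dataset is smaller than K + 1), dense and sparse regions contribute comparable numbers of edges. For large datasets a spatial index would replace the quadratic scan.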
  • Because the method of the present invention uses a grid cell encoder that can effectively describe multi-scale spatial distribution characteristics, the POIs selected by distance ranking retain these multi-scale characteristics while remaining close to the target POI. This helps obtain a more efficient spatial embedding representation of POIs, which is used to construct an efficient spatiotemporal perception representation.
  • Spatiotemporally adjacent POIs are connected by edges to construct the spatiotemporal context graph. Spatiotemporally adjacent POIs are POIs that are spatially adjacent and have similar access time patterns. POIs with similar access time patterns are POIs whose number of adjacent access timestamp pairs is not less than the threshold m. An adjacent access timestamp pair is a pair of timestamps that share the same "weekday or not" attribute and whose access time interval is less than the threshold h.
  • The spatiotemporal context graph is constructed to exploit the access time patterns of POIs through their joint spatiotemporal properties. Processing the access time patterns directly is not feasible: for next-POI recommendation, similarity of access time patterns has no direct relationship to the likelihood of a POI being visited, since POIs with similar access time patterns can be far apart.
  • Moreover, building the graph by direct traversal incurs a huge computational cost.
  • Using the joint spatiotemporal feature, that is, adding spatial proximity as a precondition when judging whether POIs have similar access time patterns, greatly reduces the number of candidate POIs.
  • Among spatially adjacent POIs, those with similar access time patterns are more likely to be visited, which provides the basis for defining an effective adjacency relationship and makes it possible to learn an embedding model on this graph.
  • The method of the present invention constructs the spatiotemporal context graph of POIs as follows.
  • The method defines adjacent visiting timestamps (Neighboring Visiting Timestamps) by the timestamp interval (in hours) and the timestamp attribute (whether it is a working day): two timestamps are adjacent if their attributes are the same and their interval is less than the threshold h.
  • The method of the present invention uses 2 as the default value of h.
  • POIs with similar access time patterns (Temporal Neighboring POIs) are POIs whose number of adjacent access timestamp pairs is not less than the threshold m; the method of the present invention uses 11 as the default value of m.
  • The method defines spatiotemporally adjacent POIs (Spatiotemporal Neighboring POIs) as POIs that are spatially adjacent and have similar access time patterns.
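The three definitions above compose naturally into a filter over the spatial neighbours. The sketch below is one possible reading, with several assumptions that the text does not settle: the timestamp interval is taken as the circular difference in hour of day, a working day is taken as Monday to Friday (holidays ignored), and all cross pairs between the two POIs' visit records are counted. The function names are hypothetical.

```python
from datetime import datetime

def is_adjacent_pair(t_a, t_b, h=2):
    """True if two visit timestamps form an adjacent pair.

    Assumptions: 'working day' means Monday-Friday, and the interval is the
    circular hour-of-day difference rather than the absolute time difference.
    """
    same_kind = (t_a.weekday() < 5) == (t_b.weekday() < 5)
    hour_a = t_a.hour + t_a.minute / 60.0
    hour_b = t_b.hour + t_b.minute / 60.0
    gap = abs(hour_a - hour_b)
    gap = min(gap, 24.0 - gap)  # wrap around midnight
    return same_kind and gap < h

def similar_time_pattern(times_a, times_b, h=2, m=11):
    """True if at least m cross pairs of timestamps are adjacent."""
    count = sum(is_adjacent_pair(a, b, h) for a in times_a for b in times_b)
    return count >= m

def spatiotemporal_neighbors(spatial_neighbors, visit_times, h=2, m=11):
    """Keep only spatial neighbours that also share a similar time pattern.

    spatial_neighbors: dict poi_id -> list of spatially adjacent poi_ids.
    visit_times:       dict poi_id -> list of datetime visit timestamps.
    """
    return {
        center: [n for n in near
                 if similar_time_pattern(visit_times[center], visit_times[n], h, m)]
        for center, near in spatial_neighbors.items()
    }
```

Restricting the comparison to spatial neighbours is what keeps the cost manageable: each POI is compared with at most K candidates instead of with every other POI.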
  • Step 102: Sample the access sequence context graph to obtain POI samples.
  • The access sequence embedding model is trained to correctly predict the ground-truth (Ground Truth) context POIs, that is, the sequence-adjacent POIs. As the model is updated, the distance in the embedding feature space between POIs with similar contexts keeps shrinking, which brings out the characteristics of POIs from the access sequence perspective.
  • The method of the present invention uses graph sampling to obtain positive POI pairs (directly connected by an edge in the graph) and negative POI pairs (not directly connected by an edge) in order to compute the objective function and update the initialized POI access sequence embedding model; the objective function is defined as follows:
  • O denotes the maximum likelihood objective of the access sequence embedding model
  • The sequence embedding representation vectors of POI i and POI j are written with the serial number of the POI as superscript and the type of embedded representation as subscript.
  • p j represents the target POI j
  • Noise contrastive estimation (Noise Contrastive Estimation) is usually used to construct balanced positive and negative pairs for computing the objective function actually used for model updates; it uses negative sampling (Negative Sampling) to draw a batch of non-adjacent POIs for the target POI.
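Graph sampling of positive and negative pairs can be sketched as follows. This is a hedged illustration only: the function name is hypothetical, negatives are drawn uniformly at random (the text does not state the sampling distribution), and the default of 16 negatives mirrors the count the text quotes for the spatial model, which may or may not apply to the sequence model.

```python
import random

def sample_pairs(neighbors, num_negative=16, rng=None):
    """Draw (center, positive, negatives) triples from a context graph.

    neighbors: dict node -> iterable of adjacent nodes.
    A positive is any directly connected node; negatives are sampled
    uniformly (an assumption) from nodes with no direct edge to the center.
    """
    rng = rng or random.Random(0)
    nodes = list(neighbors)
    samples = []
    for center, adjacent in neighbors.items():
        adjacent = set(adjacent)
        # Candidates for negatives: every node that is not the center
        # and not directly connected to it.
        candidates = [n for n in nodes if n != center and n not in adjacent]
        for positive in sorted(adjacent):
            k = min(num_negative, len(candidates))
            samples.append((center, positive, rng.sample(candidates, k)))
    return samples
```

Each triple pairs one true context POI with a batch of non-adjacent POIs, which is the shape of input a negative-sampling objective expects.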
  • the spatiotemporal embedding model in the brain-like spatiotemporal perception embedding model is trained by unsupervised learning.
  • The method of the invention adopts a spatial embedding model based on the grid cell encoder, updates the model weights by sampling the spatial context graph, and extracts the spatial embedding representation of POIs.
  • The grid cell encoder g spa (p i ) encodes coordinates in two-dimensional space as vectors in a multi-scale geographic information representation space. The process can be expressed as:
  • the encoding of p i is the concatenation of the position codes at scales 1, 2, ..., s, where the superscript of each position code denotes the scale number
  • the location code is calculated as follows:
  • is the scale control coefficient
  • is the position code on the scale s
  • p i is the position vector of p i .
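The exact formulas of the encoder are not legible in this text, so the following is only a sketch of a grid-cell-style multi-scale position encoder of the general kind described. Everything specific in it is an assumption: the three unit basis vectors spaced 120 degrees apart, the geometric progression of wavelengths, the default parameter values, and the function name. Any learned layers that would follow the encoder are omitted.

```python
import math

# Three unit vectors 120 degrees apart (assumed basis for the hexagonal
# firing pattern of grid cells).
BASIS = [(math.cos(a), math.sin(a))
         for a in (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)]

def grid_cell_encode(x, y, num_scales=4, min_wavelength=10.0,
                     max_wavelength=10000.0):
    """Encode a 2-D position as concatenated sinusoidal codes over scales.

    Wavelengths grow geometrically from min_wavelength to max_wavelength
    (assumed schedule). Returns a flat list of length 6 * num_scales.
    """
    code = []
    for s in range(num_scales):
        frac = s / (num_scales - 1) if num_scales > 1 else 0.0
        wavelength = min_wavelength * (max_wavelength / min_wavelength) ** frac
        for bx, by in BASIS:
            # Project the position onto the basis vector, then scale.
            phase = (x * bx + y * by) / wavelength
            code.extend((math.cos(phase), math.sin(phase)))
    return code
```

Short wavelengths separate nearby POIs while long wavelengths capture coarse regional structure, which is the multi-scale behaviour the text attributes to the grid cell encoder.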
  • Distribute pattern basis vectors for grid cells specifically:
  • σ is the sigmoid function
  • K is the number of randomly selected negative examples, which is set to 16 by default in the method of the present invention.
  • p i represents POI i
  • emb j spa denotes the spatial embedding vector of POI p j
  • emb k spa denotes the spatial embedding vector of POI p k .
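The symbols listed above (a sigmoid, K negative examples, and embedding vectors for a positive POI j and negative POIs k) match the ingredients of the standard negative-sampling objective from word embedding. The sketch below implements that standard form; whether the patent's objective is identical term by term is an assumption, and the function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def negative_sampling_objective(center, positive, negatives):
    """Standard negative-sampling objective for one training triple.

    center, positive: embedding vectors of the target POI and a true neighbour.
    negatives:        embedding vectors of K sampled non-adjacent POIs.
    Larger is better: the positive pair is pulled together and each
    negative pair is pushed apart.
    """
    objective = math.log(sigmoid(dot(center, positive)))
    for negative in negatives:
        objective += math.log(sigmoid(-dot(center, negative)))
    return objective
```

Training would maximize this quantity (or minimize its negation) over sampled triples, with gradients flowing into the embedding model that produced the vectors.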
  • The method of the invention adopts a construction-sampling-embedding approach based on the spatiotemporal context graph, which mines the geospatial characteristics of POIs while also exploiting the access time characteristics of the POIs themselves.
  • The spatiotemporal adjacency relationship defined by the method of the present invention captures the spatial proximity of POI locations while taking into account the similarity of POI visit time patterns, and can provide highly reliable suggestions for POI visits.
  • The timestamps of a POI's visit records form a set, which is difficult to use as direct input to the spatiotemporal embedding model.
  • The method of the present invention therefore proposes a time pattern encoding for POI visits, which converts the discrete timestamps of POI visit records into a fixed-size matrix.
  • The diversity of POI visit time patterns comes mainly from diurnal variation (hourly scale), weekday regularity (daily scale) and seasonal variation (monthly and daily scale), and is not sensitive on the annual scale. The method of the present invention therefore fills the visit records of a POI into a 24 × 366 zero matrix according to date and time to form a statistical matrix of POI visit times.
  • The matrix can be displayed as a heat map of 24 × 366 pixels, in which time cells with a high access frequency have larger pixel values and darker colors, as shown in Figure 5.
  • The method of the present invention normalizes the original statistical matrix of POI visit times to the (0, 1) interval by its maximum value. After normalization, the entire matrix is convolved with a standard Gaussian window of size 3 × 3, so that the cells around each original pixel are assigned values that depend on their distance from the central pixel.
  • This operation augments (Augment) the access records of POIs that have few records in a reasonable way and reduces both the variance and the sparsity of the final POI visit time pattern matrix, which helps the POI embedding model that takes the matrix as input obtain more robust parameters.
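The matrix construction, normalization, and smoothing steps can be sketched as below. This is an illustration under stated assumptions: the column index is the day of the year (so dates after February fall one column later in leap years), the Gaussian window uses a standard deviation of 1 and is normalized to sum to 1, and cells outside the matrix are treated as zero. The function names are hypothetical.

```python
import math
from datetime import datetime

def gaussian_kernel_3x3(sigma=1.0):
    """3 x 3 Gaussian window normalized to sum to 1 (sigma is an assumption)."""
    weights = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                for dx in (-1, 0, 1)] for dy in (-1, 0, 1)]
    total = sum(map(sum, weights))
    return [[w / total for w in row] for row in weights]

def smooth(matrix, kernel):
    """Convolve with a 3 x 3 kernel, treating out-of-range cells as zero."""
    rows, cols = len(matrix), len(matrix[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[dr + 1][dc + 1] * matrix[rr][cc]
            out[r][c] = acc
    return out

def visit_time_matrix(timestamps):
    """Build the 24 x 366 visit time pattern matrix for one POI."""
    matrix = [[0.0] * 366 for _ in range(24)]
    for ts in timestamps:
        day = ts.timetuple().tm_yday - 1  # 0-based day of year (assumed index)
        matrix[ts.hour][day] += 1.0
    peak = max(max(row) for row in matrix)
    if peak > 0:
        matrix = [[v / peak for v in row] for row in matrix]
    return smooth(matrix, gaussian_kernel_3x3())
```

After smoothing, a single visit spreads a small amount of weight to the neighbouring hours and days, which is how sparse POIs gain extra signal without inventing visits far from the observed ones.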
  • the spatio-temporal embedding model of interest points in the method of the present invention obtains positive/negative sample pairs in the way of spatio-temporal context graph sampling and calculates the objective function for updating model parameters.
  • σ is the sigmoid function
  • K is the number of randomly selected negative examples, and is set to 16 by default.
  • the coefficient with subscript spa is the smoothing coefficient that balances the spatial adjacency objective function.
  • the model parameters are updated according to the objective function based on spatio-temporal proximity and spatial proximity calculation, so that the optimization of the model can benefit from the rich context information of spatio-temporal and spatial context graphs.
  • Step 106 Combining the interest point sequence representation vector and the interest point joint spatiotemporal representation vector into an interest point spatiotemporal perception representation vector.
  • Step 107 Train a recurrent neural network recommender based on the spatiotemporal perception representation vector of the interest point; recommend the next interest point through the trained recurrent neural network recommender.
  • The representations obtained by the POI sequence embedding model and the spatiotemporal embedding model are synthesized into the POI spatiotemporal perception representation, and unlabeled POI access sequence data are used to train the recurrent neural network recommender.
  • The method of the present invention uses a recurrent neural network (Recurrent Neural Network) composed of long short-term memory cells (Long-Short Term Memory Cell) to recommend the next POI.
  • The method selects the corresponding spatiotemporal perception embedding vector from the spatiotemporal perception embedding vector table as the input of the recommender model, which outputs a predicted spatiotemporal perception embedding vector representing the recommended POI.
  • The goal of the recommender is to minimize the cosine distance between the predicted spatiotemporal perception embedding vector and the spatiotemporal perception embedding vector of the ground-truth (GT, Ground Truth) POI, that is, to maximize the normalized probability of recommending the correct POI.
  • During training, the method updates the recommender model through backpropagation based on the objective function. During inference, it applies the recommender to obtain the predicted spatiotemporal perception embedding vector, computes its cosine distance to the spatiotemporal perception embedding vector of each candidate POI, and generates a recommendation list by sorting on distance.
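The inference step, ranking candidate POIs by cosine distance to the predicted embedding, can be sketched as follows. The recurrent network that produces the predicted vector is not shown; the function names, the dictionary form of the embedding table, and the `top_n` cutoff are assumptions made for illustration.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 1.0
    return 1.0 - sum(a * b for a, b in zip(u, v)) / norm

def recommend(predicted, embedding_table, top_n=5):
    """Rank candidate POIs by cosine distance to the predicted embedding.

    predicted:       vector output by the recommender (assumed given).
    embedding_table: dict poi_id -> spatiotemporal perception embedding vector.
    Returns up to top_n poi_ids, closest first.
    """
    ranked = sorted(
        embedding_table,
        key=lambda poi: cosine_distance(predicted, embedding_table[poi]),
    )
    return ranked[:top_n]
```

Because the recommender predicts a vector rather than a class index, new POIs can be added to the candidate table without changing the output layer of the network.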
  • the present invention also provides a point-of-interest recommendation system based on brain-like space-time perception representation, including:
  • a POI context graph structure building module for constructing the POI context graph structure based on a POI visit dataset; the POI context graph structure includes the access sequence context graph, the spatial context graph and the spatiotemporal context graph of the POIs;
  • a first sampling module configured to sample the access sequence context graph to obtain sampling samples of interest points
  • a second sampling module configured to sample the spatial context graph and the spatiotemporal context graph, obtain spatial interest point samples and spatiotemporal interest point samples, and generate an interest point access time matrix
  • the second training module is used to train the spatiotemporal embedding model in the brain-like spatiotemporal perception embedding model through unsupervised learning based on the spatial interest point sample, the spatiotemporal interest point sample and the interest point access time matrix;
  • the spatio-temporal embedding model is used to extract the joint spatio-temporal representation vector of the interest point;
  • the joint spatio-temporal representation vector of the interest point includes a spatial embedding representation vector and a spatio-temporal embedding representation vector;
  • a synthesis module for synthesizing the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal perception representation vector;
  • the third training module is configured to train a recurrent neural network recommender based on the spatiotemporal perception representation vector of the interest point; and recommend the next interest point through the trained recurrent neural network recommender.
  • Point-of-interest recommendation methods that rely on modeling user preferences involve user profiling, which poses security risks to user privacy.
  • The POI recommendation method based on brain-like spatiotemporal perception representation proposed by the present invention fully exploits the spatiotemporal characteristics of the POI itself, obtains a highly discriminative spatiotemporal perception representation, and can recommend POIs under extreme conditions such as cold start without violating user privacy.
  • For the spatial characteristics of the POI, the method of the present invention adopts a spatial position encoder based on the grid cell model of the entorhinal cortex to mine the multi-scale geographic distribution of POIs. For the temporal characteristics of the POI itself, the method tensorizes the POI access time pattern and exploits these characteristics through a multi-level spatiotemporal coupling: adjacent access timestamps, POIs with similar access time patterns, and spatiotemporally adjacent POIs.
  • The method of the present invention draws on the information representation and processing mechanism of the entorhinal-hippocampal loop of the brain, uses spatiotemporal perception embedding vectors to describe POIs efficiently, and thereby achieves high-quality POI recommendation.
  • Table 1 and Table 2 compare the method of the present invention with several high-performance point-of-interest recommendation methods on the Instagram check-in dataset (Example 1) and the Gowalla dataset (Example 2).
  • comparative example [2] Xin Liu, Yong Liu, and Xiaoli Li. Exploring the Context of Locations for Personalized Location Recommendations. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 1188–1194, 2016.
  • The present invention draws on the graph representation mechanism of the entorhinal-hippocampal cognitive structure and on the word embedding (Word Embedding) method from natural language processing, makes full use of the spatiotemporal and sequence context relations of the POIs themselves, constructs context graphs from different angles, and realizes unsupervised representation learning.
  • The proposed method requires no additional data annotation cost (POI tags, text screening, etc.): the order of POI visits in the sequences, the geographic location of POIs, and the visit times of POIs can all be obtained during data collection.
  • each embodiment in this specification is described in a progressive manner, each embodiment focuses on the difference from other embodiments, and the same and similar parts of each embodiment can be referred to each other.
  • the description is relatively simple, and for the related information, please refer to the description of the method part.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

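A point-of-interest (POI) recommendation method and system based on brain-inspired spatiotemporal perception representation. The method comprises: constructing a POI context graph structure based on a POI check-in dataset; sampling the check-in sequence context graph and training, by unsupervised learning, the POI check-in sequence embedding model of a brain-inspired spatiotemporal-aware embedding model; sampling the spatial context graph and the spatiotemporal context graph and training the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model; combining the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal-aware representation vector; training a recurrent neural network recommender on the POI spatiotemporal-aware representation vectors; and recommending the next POI with the trained recurrent neural network recommender. By mining the complex spatiotemporal characteristics and the check-in sequence characteristics of the POIs themselves, and by using a brain-inspired spatiotemporal-aware embedding model inspired by the entorhinal-hippocampal structure of the brain, the method represents POIs efficiently from multiple perspectives.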

Description

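A point-of-interest recommendation method and system based on brain-inspired spatiotemporal perception representation
This application claims priority to Chinese patent application No. 202110930940.0, entitled "A point-of-interest recommendation method and system based on brain-inspired spatiotemporal perception representation", filed with the China Patent Office on August 13, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of artificial intelligence, and in particular to a point-of-interest recommendation method and system based on brain-inspired spatiotemporal perception representation.
Background
With the rise of location-based network services (Location-based Networks), people share large numbers of posts and reviews tagged with precise geographic locations. This has transformed the way people interact with their geographic environment and has created a huge demand for point-of-interest (POI) recommendation. By taking the time dimension into account, POI recommendation can accurately recommend the next POI. Such recommendation algorithms mine POI-related information and provide users with recommendation lists, guiding them to a suitable next location, which greatly benefits both users and POI owners.
In recent years, researchers have developed a series of recommendation methods by mining big data related to POIs. Because adjacent POIs in a check-in sequence (Check-in Sequence) are usually highly correlated, many researchers model users' POI check-in sequences with sequence analysis models such as Markov chains (Markov Chain) and then recommend POIs accordingly. However, these methods treat POIs merely as generic sequence elements and fail to make full use of their rich intrinsic characteristics, which limits recommendation quality. Most recommender systems rely heavily on user preference modeling. For POI recommendation, accurate user profiling can yield highly accurate recommendations when a user's visit history is sufficiently rich. However, profile-based recommendation has two prominent problems. First, recommendation performance cannot be guaranteed under cold-start (Cold-start) conditions: for users with little or no visit history, recommending by preference is unreliable. Second, private data about users' personal preferences is at risk of leakage, which can lead to systemic ethical problems. Because POIs have inherent geospatial attributes, taking spatial information into account can greatly improve recommendation quality. Lian et al. proposed describing the spatial distribution of POIs with power-law and normal distributions. Feng et al. characterized the geographic location of POIs through multi-level partitioning of two-dimensional space. However, the way these works capture POI geospatial information is empirical and in fact depends entirely on hand-crafted priors; moreover, they represent only the local or the global geographic distribution of POIs at a single scale, so they struggle to describe the multi-scale spatial characteristics of POIs effectively. Extensive data analysis shows that the visit times of POIs are also diverse, and this temporal characteristic of POIs can likewise support recommendation decisions. On this basis, some studies consider the temporal and spatial characteristics of POIs together and propose a series of POI recommendation methods based on analysis of the spatial distances between POIs and the intervals between visit times. For example, Li, Nabitumruksa, Zhao, et al. proposed a POI recommendation system based on recurrent neural networks and temporal-spatial transition modeling. Nevertheless, in considering the spatiotemporal characteristics of POIs, these works use generic time intervals and spatial displacements and fail to fully mine the spatiotemporal characteristics of the POIs themselves to aid recommendation. In addition, because some location-based social platforms provide posts with location tags, some researchers use POI-related text to recommend the next POI. The limitation of such methods is also obvious: in the majority of cases, where no text is available, their recommendation performance drops sharply.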
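Summary of the Invention
The object of the present invention is to provide a POI recommendation method and system based on brain-inspired spatiotemporal perception representation, which represents POIs efficiently from multiple perspectives by mining the complex spatiotemporal characteristics and the check-in sequence characteristics of the POIs themselves and by using a brain-inspired spatiotemporal-aware embedding model inspired by the entorhinal-hippocampal structure of the brain.
To achieve the above object, the present invention provides the following solution:
A point-of-interest recommendation method based on brain-inspired spatiotemporal perception representation, comprising:
constructing a POI context graph structure based on a POI check-in dataset, the POI context graph structure comprising a check-in sequence context graph, a spatial context graph, and a spatiotemporal context graph of the POIs;
sampling the check-in sequence context graph to obtain POI samples;
training, by unsupervised learning and based on the POI samples, the POI check-in sequence embedding model of a brain-inspired spatiotemporal-aware embedding model, the POI check-in sequence embedding model being used to extract POI sequence representation vectors;
sampling the spatial context graph and the spatiotemporal context graph to obtain spatial POI samples and spatiotemporal POI samples, and generating a POI visit-time matrix;
training, by unsupervised learning and based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model, the spatiotemporal embedding model being used to extract POI joint spatiotemporal representation vectors, which comprise spatial embedding representation vectors and spatiotemporal embedding representation vectors;
combining the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal-aware representation vector;
training a recurrent neural network recommender based on the POI spatiotemporal-aware representation vectors, and recommending the next POI with the trained recurrent neural network recommender.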
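Optionally, the check-in sequence context graph is constructed as follows:
sorting a user's visit records in chronological order to determine the POI check-in sequence;
connecting adjacent POIs in the POI check-in sequence with edges to construct the check-in sequence context graph.
Optionally, spatially adjacent POIs are connected with edges to construct the spatial context graph; the spatially adjacent POIs are the K POIs nearest to a center POI.
Optionally, temporally adjacent POIs are connected with edges to construct the temporal context graph; the temporally adjacent POIs are POIs that are spatially adjacent and have similar visit-time patterns; POIs with similar visit-time patterns are POIs having no fewer than a threshold m of neighboring visit-timestamp pairs; a neighboring visit-timestamp pair is a pair of timestamps that have the same "weekday or not" attribute and whose visit times are less than a threshold h apart.
Optionally, training, by unsupervised learning and based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model specifically comprises:
training a spatial embedding model based on the spatial POI samples, the trained spatial embedding model being used to extract spatially adjacent POIs;
training, by unsupervised learning and based on the spatially adjacent POIs, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model.
Optionally, the POI visit-time matrix is constructed as follows:
filling the visit records of the POIs in the spatiotemporal context graph into a zero matrix by date and time of day to construct an initial POI visit-time matrix;
normalizing the initial POI visit-time matrix and applying a convolution to it to obtain the POI visit-time matrix.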
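The present invention further provides a POI recommendation system based on brain-inspired spatiotemporal perception representation, comprising:
a POI context graph structure construction module, configured to construct a POI context graph structure based on a POI check-in dataset, the POI context graph structure comprising a check-in sequence context graph, a spatial context graph, and a spatiotemporal context graph of the POIs;
a first sampling module, configured to sample the check-in sequence context graph to obtain POI samples;
a first training module, configured to train, by unsupervised learning and based on the POI samples, the POI check-in sequence embedding model of a brain-inspired spatiotemporal-aware embedding model, the POI check-in sequence embedding model being used to extract POI sequence representation vectors;
a second sampling module, configured to sample the spatial context graph and the spatiotemporal context graph to obtain spatial POI samples and spatiotemporal POI samples, and to generate a POI visit-time matrix;
a second training module, configured to train, by unsupervised learning and based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model, the spatiotemporal embedding model being used to extract POI joint spatiotemporal representation vectors, which comprise spatial embedding representation vectors and spatiotemporal embedding representation vectors;
a synthesis module, configured to combine the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal-aware representation vector;
a third training module, configured to train a recurrent neural network recommender based on the POI spatiotemporal-aware representation vectors, the trained recurrent neural network recommender being used to recommend the next POI.
According to the specific embodiments provided herein, the present invention discloses the following technical effects:
(1) The present invention fully mines and exploits the spatiotemporal characteristics of the POIs themselves to obtain highly discriminative spatiotemporal-aware representations, enabling POI recommendation without infringing user privacy and under extreme conditions such as cold start. For the spatial characteristics of the POIs, the method adopts a spatial position encoder based on the entorhinal grid-cell model of the brain to mine the multi-scale geographic distribution of POIs; for the temporal characteristics of the POIs, the method tensorizes POI visit-time patterns and exploits these characteristics through the multi-level spatiotemporal coupling of neighboring visit timestamps, POIs with similar visit-time patterns, and spatiotemporally adjacent POIs.
(2) The method draws on the information representation and processing mechanisms of the entorhinal-hippocampal loop of the brain, namely graph representation and multi-sensory joint representation, and describes POIs efficiently with spatiotemporal-aware embedding vectors, thereby achieving high-quality POI recommendation.
(3) The present invention draws on the graph representation mechanism of the entorhinal-hippocampal cognitive structure and the word embedding (Word Embedding) method of natural language processing, makes full use of the spatiotemporal and sequential context relations of the POIs themselves, constructs context graphs from different perspectives, and realizes unsupervised representation learning. Compared with POI recommendation methods that use POI labels (such as POI categories) or other POI-related information (such as posts and reviews), the proposed method incurs no additional data annotation cost (POI labels, text screening, etc.); the POI visiting order within the sequences, the geographic locations of POIs, the POI visit times, and so on can all be obtained during data collection.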
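Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the POI recommendation method based on brain-inspired spatiotemporal perception representation according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the principle of the POI recommendation method based on brain-inspired spatiotemporal perception representation according to an embodiment of the present invention;
Fig. 3 is a diagram illustrating the definition of sequence context;
Fig. 4 is a diagram of the spatial context graph;
Fig. 5 illustrates the tensorization of POI visit-time patterns.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
Although a wide variety of POI recommendation methods already exist, there is still no good solution for POI recommendation based on the spatiotemporal characteristics of the POIs themselves.
Research on the entorhinal-hippocampal loop of the mammalian brain offers inspiration for designing efficient POI recommendation methods. Grid cells (Grid-cell) in the entorhinal cortex have been shown to provide efficient multi-scale spatial representations, while hippocampal place cells provide a joint representation of multi-sensory information by encoding the coupling among multiple individual cognitive signals. As research into the learning and representation mechanisms of the entorhinal-hippocampal structure has deepened, researchers have come to believe that joint representations across different perceptual dimensions, abstracted through graph (Graph) structures, form the basis of memory and cognition in the entorhinal-hippocampal structure. For next-POI recommendation, a large amount of POI-related information can be expressed through graph structures to support the learning of POI representations; the spatial coding scheme of grid cells can serve as the basis for spatial modeling of POIs; and the multi-sensory joint representation of place cells suggests how the temporal characteristics of the POIs themselves can be exploited.
To address the shortcomings of existing methods, and inspired by the mechanisms of the entorhinal-hippocampal cognitive structure of the mammalian brain, this patent provides a POI recommendation method based on brain-inspired spatiotemporal perception representation. By mining the complex spatiotemporal characteristics and the check-in sequence characteristics of the POIs themselves, and by using a brain-inspired spatiotemporal-aware embedding model inspired by the entorhinal-hippocampal structure of the brain, the method represents POIs efficiently from multiple perspectives. The method combines the check-in sequence context features, the spatial distribution features, and the joint spatiotemporal features of POIs, and trains the corresponding neural network models to extract representations through an unsupervised learning strategy of context graph (Context Graph) construction, sampling, and representation.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments.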
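As shown in Figs. 1 and 2, a POI recommendation method based on brain-inspired spatiotemporal perception representation comprises:
Step 101: construct a POI context graph structure based on a POI check-in dataset. The POI context graph structure comprises a check-in sequence context graph (Sequential Context Graph), a spatial context graph (Spatial Context Graph), and a spatiotemporal context graph (Spatiotemporal Context Graph) of the POIs.
The POI check-in dataset used to construct the context graphs may be the public Gowalla or Instagram Check-in dataset. The Gowalla dataset was collected through a data interface on the location-based social game Gowalla and contains more than 6.44 million visit records from 57,436 POIs; each record contains the geographic location of the POI and the visit time. The Instagram Check-in dataset was collected on the well-known social network Instagram and comprises more than 2.21 million POI visit records from 13,187 POIs, generated by 78,233 users; each record contains a timestamp and additional post content.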
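(1) The check-in sequence context graph is constructed as follows:
A user's visit records are sorted in chronological order to determine the POI check-in sequence;
Adjacent POIs in the POI check-in sequence are connected with edges to construct the check-in sequence context graph.
The raw POI visit data are preprocessed to remove outliers: POIs with fewer than 10 visits and users with fewer than 10 visit records are filtered out. For each user ID, that user's visit records are selected and sorted chronologically to obtain a POI check-in sequence. A center POI (target POI) is selected in the check-in sequence; its sequence-context (sequence-adjacent) POIs are those lying in the same sliding window as the center POI, as shown in Fig. 3. The width of the sliding window,
Figure PCTCN2021117879-appb-000001
is a tunable hyperparameter; the method uses 2 (excluding the target POI) as the default window width. By connecting sequence-adjacent POIs with edges (Edge), the present invention constructs the POI check-in sequence context graph, which is used for the subsequent training of the sequence embedding model.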
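The preprocessing and graph construction described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the record layout `(user_id, poi_id, timestamp)`, the function names, the order of the two filtering passes, and the reading of the default window width 2 as one neighbor on each side of the target (`half_width=1`) are all assumptions.

```python
from collections import Counter, defaultdict


def filter_checkins(checkins, min_count=10):
    """Drop POIs with fewer than min_count visits, then users with fewer
    than min_count remaining records (the pass order is an assumption)."""
    poi_counts = Counter(poi for _, poi, _ in checkins)
    kept = [c for c in checkins if poi_counts[c[1]] >= min_count]
    user_counts = Counter(user for user, _, _ in kept)
    return [c for c in kept if user_counts[c[0]] >= min_count]


def build_sequence_context_graph(checkins, half_width=1):
    """Return the undirected edges between sequence-adjacent POIs.

    checkins: list of (user_id, poi_id, timestamp) tuples (hypothetical layout).
    half_width: number of context POIs taken on each side of the target.
    """
    by_user = defaultdict(list)
    for user, poi, ts in checkins:
        by_user[user].append((ts, poi))

    edges = set()
    for records in by_user.values():
        # Chronological check-in sequence of one user.
        seq = [poi for _, poi in sorted(records, key=lambda r: r[0])]
        for i, center in enumerate(seq):
            lo, hi = max(0, i - half_width), min(len(seq), i + half_width + 1)
            for j in range(lo, hi):
                if j != i and seq[j] != center:  # skip self-loops
                    edges.add(frozenset((center, seq[j])))
    return edges
```

Storing each edge as a `frozenset` keeps the graph undirected and removes duplicate edges contributed by different users.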
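(2) The spatial context graph is constructed as follows:
Spatially adjacent POIs are connected with edges to construct the spatial context graph; the spatially adjacent POIs are the K POIs nearest to a center POI.
Constructing the spatial context graph consists of two main parts, coordinate conversion and graph building; the graph-building part is illustrated in Fig. 4. Geographic locations are usually given as latitude and longitude, whereas the grid-cell position encoder used by the method takes vectors in two-dimensional Euclidean space as input. The method therefore applies a standard coordinate conversion to transform latitude-longitude positions into two-dimensional Euclidean coordinates. Specifically, latitude-longitude coordinates in the WGS84 geographic coordinate system are converted into projected coordinates (two-dimensional spatial coordinates) in the NAD27 projected coordinate system. To define the spatial adjacency of POIs and to simplify the spatial context graph appropriately, the method determines spatial adjacency from the distances between POIs. For any POI, the K POIs nearest to it are defined as its spatially adjacent POIs; the method uses 10 as the default value of K. Spatially adjacent POIs are connected with edges to form the spatial context graph. All edges in this graph are equivalent. This avoids the drawback of selecting spatially adjacent POIs by a distance threshold, which may yield too many adjacent POIs (in POI-dense regions) or too few (in POI-sparse regions), and it keeps the number of edges per POI balanced. Moreover, because the method uses a grid-cell encoder that can effectively describe multi-scale spatial distributions, the POIs selected by distance ranking remain similar to the target POI while preserving multi-scale spatial distribution features, which helps to obtain more efficient POI spatial embedding representations and, in turn, efficient spatiotemporal-aware representations.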
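The K-nearest-neighbor rule above can be illustrated with a brute-force sketch. It assumes the POIs have already been converted to planar coordinates in meters; the function name and the dictionary layout are hypothetical, and a dataset the size of Gowalla would in practice call for a spatial index rather than the quadratic scan shown here.

```python
import math


def build_spatial_context_graph(coords, k=10):
    """Map each POI to its k nearest POIs (nearest first).

    coords: dict poi_id -> (x, y), planar coordinates in meters
            (hypothetical layout; the projection step is not shown).
    """
    neighbors = {}
    for p, (px, py) in coords.items():
        others = sorted(
            (q for q in coords if q != p),
            key=lambda q: math.hypot(coords[q][0] - px, coords[q][1] - py),
        )
        neighbors[p] = others[:k]  # every POI gets the same number of edges
    return neighbors
```

Because each POI contributes exactly `k` edges regardless of local density, the sketch reproduces the balancing property that the text contrasts with distance-threshold selection.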
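(3) The temporal context graph, i.e., the spatiotemporal context graph of step 101, is constructed as follows:
Temporally adjacent POIs are connected with edges to construct the temporal context graph; the temporally adjacent POIs are POIs that are spatially adjacent and have similar visit-time patterns; POIs with similar visit-time patterns are POIs having no fewer than a threshold m of neighboring visit-timestamp pairs; a neighboring visit-timestamp pair is a pair of timestamps that have the same "weekday or not" attribute and whose visit times are less than a threshold h apart.
The spatiotemporal context graph is constructed in order to exploit the visit-time pattern features of POIs through their joint spatiotemporal characteristics. Processing POI visit-time patterns directly is not feasible: for next-POI recommendation, there is no direct relationship between similarity of visit-time patterns and the likelihood that a POI will be visited, since POIs with similar visit-time patterns can be far apart. In addition, because a single POI has a large number of visit records, building the graph by direct traversal would be computationally very expensive. Exploiting visit-time patterns through joint spatiotemporal characteristics adds spatial adjacency as a precondition when judging visit-time similarity between POIs, which greatly reduces the number of candidate POIs. At the same time, among spatially adjacent POIs, those with similar visit-time patterns are more likely to be visited, which provides a basis for defining a meaningful adjacency relation and makes it possible to learn an embedding model from this graph.
The method constructs the POI spatiotemporal context graph as follows. First, neighboring visit timestamps (Neighboring Visiting Timestamps) are defined in terms of the timestamp interval (in hours) and the timestamp attribute (weekday or not): two timestamps are neighboring if they have the same attribute and the interval between them is less than a threshold of h hours; the method uses 2 as the default value of h. On this basis, POIs with similar visit-time patterns (Temporal Neighboring POIs) are POIs having no fewer than a threshold m of neighboring visit-timestamp pairs; the method uses 11 as the default value of m. Finally, the method defines spatiotemporally adjacent POIs (Spatiotemporal Neighboring POIs) as POIs that are spatially adjacent and have similar visit-time patterns.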
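The three nested definitions above (neighboring timestamps, POIs with similar visit-time patterns, spatiotemporally adjacent POIs) can be sketched as below. Several details are assumptions rather than statements of the source: the interval is measured on the time of day with wrap-around at midnight, "weekday" means Monday to Friday with holidays ignored, pairs are counted over all cross-POI timestamp combinations, and all function names are hypothetical.

```python
from datetime import datetime


def is_neighboring_pair(t1, t2, h=2):
    """True if both timestamps share the weekday/weekend attribute and their
    times of day are less than h hours apart (wrap-around is an assumption)."""
    same_kind = (t1.weekday() < 5) == (t2.weekday() < 5)
    diff = abs((t1.hour + t1.minute / 60) - (t2.hour + t2.minute / 60))
    diff = min(diff, 24 - diff)
    return same_kind and diff < h


def temporally_similar(visits_a, visits_b, h=2, m=11):
    """True if the two POIs share at least m neighboring timestamp pairs."""
    pairs = sum(is_neighboring_pair(a, b, h) for a in visits_a for b in visits_b)
    return pairs >= m


def spatiotemporal_neighbors(poi, spatial_neighbors, visits, h=2, m=11):
    """Keep only the spatial neighbors whose visit-time pattern is similar.

    spatial_neighbors: dict poi_id -> list of spatially adjacent POIs.
    visits: dict poi_id -> list of datetime visit timestamps.
    """
    return [q for q in spatial_neighbors[poi]
            if temporally_similar(visits[poi], visits[q], h, m)]
```

Restricting the comparison to `spatial_neighbors[poi]` is what keeps the cost manageable: with K = 10, each POI is compared with ten candidates instead of with every other POI.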
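Step 102: sample the check-in sequence context graph to obtain POI samples.
Step 103: based on the POI samples, train, by unsupervised learning, the POI check-in sequence embedding model of the brain-inspired spatiotemporal-aware embedding model; the POI check-in sequence embedding model is used to extract POI sequence representation vectors.
Given any target POI and the check-in sequence that contains it, the check-in sequence embedding model is trained to correctly predict the ground-truth (Ground Truth) context POIs (sequence-adjacent POIs). This ensures that, as the model is updated, POIs with similar contexts move closer together in the embedding feature space, highlighting the features of POIs from the check-in sequence perspective. The method uses graph sampling to obtain positive POI pairs (directly connected by an edge in the graph) and negative POI pairs (not directly connected by an edge), and computes an objective function to update the initialized check-in sequence embedding model. The objective function is defined as follows:
Figure PCTCN2021117879-appb-000002
where O denotes the maximum-likelihood objective of the check-in sequence embedding model,
Figure PCTCN2021117879-appb-000003
denotes the sequence embedding representation vector of POI i,
Figure PCTCN2021117879-appb-000004
denotes the sequence embedding representation vector of POI j, the superscript indicating the POI index and the subscript the type of embedding representation; p_j denotes the target POI j; and
Figure PCTCN2021117879-appb-000005
denotes the set of sequence-adjacent POIs of the target POI j. Similar objectives are widely used in element embedding tasks, especially word or phrase embedding. For any target POI, the space of negative samples is prohibitively large, so methods of the noise contrastive estimation (Noise Contrastive Estimation) type are usually used to construct balanced positive and negative pairs and to compute the objective actually used for model updates. This function uses negative sampling (Negative Sampling) to draw a batch of non-adjacent POIs for the target POI.
Figure PCTCN2021117879-appb-000006
where
Figure PCTCN2021117879-appb-000007
denotes the loss function used to update the check-in sequence embedding model, and
Figure PCTCN2021117879-appb-000008
is a sign function: γ is 1 when POIs i and j are sequence-adjacent and -1 otherwise, that is, γ marks whether POIs p_j and p_i are adjacent in a check-in sequence; the POIs p_i and p_j are drawn from
Figure PCTCN2021117879-appb-000009
, the set of all POIs.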
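As a rough illustration of negative sampling with the sign label γ, the sketch below assumes the common logistic form, in which each sampled pair contributes -log σ(γ · e_i · e_j). The exact objective of the method appears only in the equation figures and may differ; the function names, the plain-list embeddings, and the default of two negatives per target are hypothetical.

```python
import math
import random


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pair_loss(emb_i, emb_j, gamma):
    """Loss of one sampled pair; gamma is +1 for adjacent POIs, -1 otherwise."""
    dot = sum(a * b for a, b in zip(emb_i, emb_j))
    return -log_sigmoid(gamma * dot)


def sample_negatives(target, adjacency, all_pois, num_neg, rng=random):
    """Draw POIs that are not connected to the target in the context graph."""
    candidates = [p for p in all_pois if p != target and p not in adjacency[target]]
    return rng.sample(candidates, min(num_neg, len(candidates)))


def sequence_loss(target, adjacency, emb, all_pois, num_neg=2, rng=random):
    """Sum the positive-pair and sampled negative-pair losses for one target."""
    loss = sum(pair_loss(emb[target], emb[j], +1) for j in adjacency[target])
    for k in sample_negatives(target, adjacency, all_pois, num_neg, rng):
        loss += pair_loss(emb[target], emb[k], -1)
    return loss
```

Minimizing this quantity pulls the embeddings of sequence-adjacent POIs together and pushes sampled non-adjacent POIs apart, which is the behavior the text attributes to the sequence embedding model.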
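Step 104: sample the spatial context graph and the spatiotemporal context graph to obtain spatial POI samples and spatiotemporal POI samples, and generate the POI visit-time matrix.
Step 105: based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, train, by unsupervised learning, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model; the spatiotemporal embedding model is used to extract POI joint spatiotemporal representation vectors, which comprise spatial embedding representation vectors and spatiotemporal embedding representation vectors.
Specifically: a spatial embedding model is trained based on the spatial POI samples, the trained spatial embedding model being used to extract spatially adjacent POIs; then, based on the spatially adjacent POIs, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model is trained by unsupervised learning.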
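(1) Extraction of POI spatial embedding representations based on the spatial context graph. The method uses a spatial embedding model built on a grid-cell encoder, updates the model weights by sampling the spatial context graph, and extracts POI spatial embedding representations. The grid encoder g_spa(p_i) encodes the two-dimensional coordinates
Figure PCTCN2021117879-appb-000010
into a vector in the multi-scale geographic information representation space,
Figure PCTCN2021117879-appb-000011
The process can be expressed as:
ψ_i = [ψ^1, ψ^2, ..., ψ^S]
where the superscript of ψ denotes the scale index. The grid-cell position code of a single two-dimensional coordinate p_i = (x_i, y_i) is the concatenation of S groups of position codes at different scales; S is a hyperparameter specifying the number of scales, and the method uses 64 as its default value. The position code is computed as follows:
Figure PCTCN2021117879-appb-000012
where ρ is the scale control coefficient;
Figure PCTCN2021117879-appb-000013
is the position code at scale s,
Figure PCTCN2021117879-appb-000014
is the position vector of p_i, and
Figure PCTCN2021117879-appb-000015
are the basis vectors of the grid-cell firing pattern; specifically,
Figure PCTCN2021117879-appb-000016
denotes the three isotropic unit basis vectors of the grid-cell firing pattern (Grid-cell Firing Pattern). ρ = λ_max/λ_min, where λ_min and λ_max are the minimum and maximum scale parameters, respectively; the method uses λ_min = 100 m and λ_max = 1 km as defaults, and the scale parameters can also be adjusted as the situation requires. Given a target POI, the goal of the POI spatial embedding model is to maximize the probability of observing the true spatial-context POIs, i.e., the spatially adjacent POIs. The objective function of this unsupervised learning based on spatial context graph sampling is as follows:
Figure PCTCN2021117879-appb-000017
where
Figure PCTCN2021117879-appb-000018
denotes the objective function of the unsupervised learning based on spatial context graph sampling; σ is the sigmoid function;
Figure PCTCN2021117879-appb-000019
denotes the set of POIs directly connected to p_i by an edge in the spatial context graph;
Figure PCTCN2021117879-appb-000020
denotes the set of POIs not directly connected to p_i by an edge in the spatial context graph; and K is the number of randomly selected negative examples, set to 16 by default. Here p_i denotes POI i,
Figure PCTCN2021117879-appb-000021
denotes the POI spatial embedding vector of p_i, emb^j_spa that of p_j, and emb^k_spa that of p_k.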
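The multi-scale grid-cell position code of part (1) can be sketched as follows. The exact formula is given only in an equation figure that is not reproduced in this text, so the sketch assumes a widely used sinusoidal form: three unit basis vectors 120 degrees apart, wavelengths growing geometrically from λ_min to λ_max with ratio ρ = λ_max/λ_min, and a cosine and sine response per basis vector at each scale. The function name and the basis orientation are hypothetical.

```python
import math


def grid_cell_code(x, y, num_scales=64, lam_min=100.0, lam_max=1000.0):
    """Encode planar coordinates (meters) as a multi-scale position code.

    Returns a flat list of length num_scales * 6: for each scale, a cosine and
    a sine response along each of three basis directions (assumed form).
    """
    rho = lam_max / lam_min
    # Three unit vectors 120 degrees apart (orientation is an assumption).
    basis = [(math.cos(a), math.sin(a))
             for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
    code = []
    for s in range(num_scales):
        exponent = s / (num_scales - 1) if num_scales > 1 else 0.0
        wavelength = lam_min * rho ** exponent  # from lam_min up to lam_max
        for bx, by in basis:
            phase = (x * bx + y * by) / wavelength
            code.extend((math.cos(phase), math.sin(phase)))
    return code
```

With the defaults stated in the text (S = 64, λ_min = 100 m, λ_max = 1 km), this sketch produces a 384-dimensional vector per POI; short wavelengths separate nearby POIs while long wavelengths capture coarse regional structure.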
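(2) Extraction of POI spatiotemporal embedding representations based on the spatiotemporal context graph. Building on the mined geospatial characteristics of POIs, the method exploits the visit-time characteristics of the POIs themselves through spatiotemporal context graph construction, sampling, and embedding. The spatiotemporal adjacency relation defined by the method captures both the geographic proximity of POIs and the similarity of their visit-time patterns, and can therefore provide highly reliable suggestions for POI visits. However, the visit timestamps of a POI form a set, which is difficult to feed directly into the spatiotemporal embedding model. To solve this problem, the method proposes a POI visit-time pattern encoding that tensorizes the discrete visit timestamps of a POI into a fixed-size matrix. Data analysis shows that the diversity of POI visit-time patterns comes mainly from diurnal variation (hour scale), weekday regularity (day scale), and seasonal variation (month and day scales), whereas the patterns are not sensitive at the year scale. The method therefore fills the visit records of a POI, by date and time of day, into a 24×366 zero matrix to form the POI visit-time statistics matrix. This matrix can be viewed as a 24×366-pixel heat map in which time cells with higher visit frequency have larger pixel values and darker colors, as illustrated in Fig. 5. To remove the influence of the differing numbers of visit records across POIs, the method min-max normalizes the raw visit-time statistics matrix to the range 0 to 1. After normalization, the whole matrix is convolved with a standard 3×3 Gaussian window, so that the cells for times surrounding an original pixel receive values that depend on their distance from the center pixel. This operation augments (Augment) the visit records of POIs with few records in a reasonable way, reduces the variance of the final POI visit-time pattern matrix, and lowers the sparsity of the matrix, which helps the POI spatiotemporal embedding model that takes this matrix as input to obtain more robust model parameters.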
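The visit-time tensorization described above (fill, normalize, smooth) can be sketched in plain Python. The 24×366 layout follows the text; the binomial 1-2-1 approximation of the "standard 3×3 Gaussian window", the zero padding at the matrix borders, and the function names are assumptions.

```python
from datetime import datetime

# 3x3 binomial approximation of a Gaussian window (an assumption).
GAUSS_3X3 = [[1 / 16, 2 / 16, 1 / 16],
             [2 / 16, 4 / 16, 2 / 16],
             [1 / 16, 2 / 16, 1 / 16]]


def visit_time_matrix(timestamps):
    """Count visits in a 24 x 366 grid: rows are hours, columns days of year."""
    mat = [[0.0] * 366 for _ in range(24)]
    for t in timestamps:
        mat[t.hour][t.timetuple().tm_yday - 1] += 1.0
    return mat


def min_max_normalize(mat):
    """Rescale all entries to the range 0 to 1."""
    flat = [v for row in mat for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0] * len(row) for row in mat]
    return [[(v - lo) / (hi - lo) for v in row] for row in mat]


def smooth(mat, kernel=GAUSS_3X3):
    """Convolve with a 3x3 kernel, treating cells outside the matrix as zero.
    The kernel is symmetric, so correlation and convolution coincide."""
    rows, cols = len(mat), len(mat[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[dr + 1][dc + 1] * mat[rr][cc]
            out[r][c] = acc
    return out
```

A typical use chains the three steps, `smooth(min_max_normalize(visit_time_matrix(ts)))`, so that a POI with only a handful of visits still yields a non-trivial neighborhood of non-zero cells around each recorded visit.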
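The POI spatiotemporal embedding model in the method obtains positive and negative sample pairs by sampling the spatiotemporal context graph and computes an objective function for updating the model parameters. The objective function computed from the spatiotemporal adjacency relation is denoted
Figure PCTCN2021117879-appb-000022
where
Figure PCTCN2021117879-appb-000023
denotes the objective function computed from the spatiotemporal adjacency relation; σ is the sigmoid function;
Figure PCTCN2021117879-appb-000024
denotes the set of POIs spatiotemporally adjacent to p_i;
Figure PCTCN2021117879-appb-000025
denotes the set of POIs not spatiotemporally adjacent to p_i; and K is the number of randomly selected negative examples, set to 16 by default. Here p_i denotes POI i,
Figure PCTCN2021117879-appb-000026
denotes the POI spatiotemporal embedding vector of p_i, emb^j_st that of p_j, and emb^k_st that of p_k; λ_spa is the smoothing coefficient that balances the spatial adjacency objective. During optimization of the POI spatiotemporal embedding model, the model parameters are updated according to both the objective computed from spatiotemporal adjacency and the objective computed from spatial adjacency, so that the optimization benefits from the rich context information in the spatiotemporal and spatial context graphs.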
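Step 106: combine the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal-aware representation vector.
Step 107: train a recurrent neural network recommender based on the POI spatiotemporal-aware representation vectors, and recommend the next POI with the trained recurrent neural network recommender.
The representations obtained from the POI sequence embedding model and the spatiotemporal embedding model are combined into POI spatiotemporal-aware representations, and the recurrent neural network recommender is trained on unlabeled POI check-in sequence data. Building on the brain-inspired POI spatiotemporal-aware embedding model, which consists of the check-in sequence embedding model and the spatiotemporal embedding model, the method uses a recurrent neural network (Recurrent Neural Network) composed of long short-term memory cells (Long-Short Term Memory Cell) to recommend the next POI. Given the n POIs most recently visited by a user (n = 1 by default), the method looks up the corresponding spatiotemporal-aware embedding vectors in the spatiotemporal-aware embedding table, feeds them to the recommender model, and outputs a predicted spatiotemporal-aware embedding vector
Figure PCTCN2021117879-appb-000027
, which represents the recommended POI.
Figure PCTCN2021117879-appb-000028
The recommender's goal is to minimize the cosine distance between the predicted spatiotemporal-aware embedding vector and the ground-truth (GT, Ground Truth) spatiotemporal-aware embedding vector, i.e., to maximize the normalized probability of recommending the correct POI. The objective function is expressed as
Figure PCTCN2021117879-appb-000029
where
Figure PCTCN2021117879-appb-000030
denotes the objective function used to update the recommender model; p_i denotes POI i;
Figure PCTCN2021117879-appb-000031
denotes the spatiotemporal-aware representation vector of p_i;
Figure PCTCN2021117879-appb-000032
is the spatiotemporal-aware representation vector of the correct POI; σ′ is the LeakyReLU nonlinearity; s denotes a single check-in sequence; and
Figure PCTCN2021117879-appb-000033
is the set of all check-in sequences. During training, the method updates the recommender model by backpropagation based on this objective function. During inference, the method uses the recommender to obtain the predicted spatiotemporal-aware embedding vector, computes its cosine distance to the spatiotemporal-aware embedding vector of each candidate POI, and generates the recommendation list by ranking the candidates by distance.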
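The inference step, ranking candidates by cosine distance to the predicted embedding, can be sketched as follows. The recurrent network itself is not shown; `predicted` stands for its output vector, and the function names, the candidate dictionary, and the `top_n` parameter are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors are treated as maximally uncertain."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def recommend(predicted, candidates, top_n=5):
    """Rank candidate POIs by cosine distance to the predicted embedding.

    predicted: embedding vector output by the recommender (not shown here).
    candidates: dict poi_id -> spatiotemporal-aware embedding vector.
    """
    ranked = sorted(candidates,
                    key=lambda p: cosine_distance(predicted, candidates[p]))
    return ranked[:top_n]
```

Because the ranking depends only on POI embeddings and the recent check-ins fed to the recommender, no user profile is consulted at inference time, which is consistent with the privacy argument made in the text.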
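The present invention further provides a POI recommendation system based on brain-inspired spatiotemporal perception representation, comprising:
a POI context graph structure construction module, configured to construct a POI context graph structure based on a POI check-in dataset, the POI context graph structure comprising a check-in sequence context graph, a spatial context graph, and a spatiotemporal context graph of the POIs;
a first sampling module, configured to sample the check-in sequence context graph to obtain POI samples;
a first training module, configured to train, by unsupervised learning and based on the POI samples, the POI check-in sequence embedding model of a brain-inspired spatiotemporal-aware embedding model, the POI check-in sequence embedding model being used to extract POI sequence representation vectors;
a second sampling module, configured to sample the spatial context graph and the spatiotemporal context graph to obtain spatial POI samples and spatiotemporal POI samples, and to generate a POI visit-time matrix;
a second training module, configured to train, by unsupervised learning and based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model, the spatiotemporal embedding model being used to extract POI joint spatiotemporal representation vectors, which comprise spatial embedding representation vectors and spatiotemporal embedding representation vectors;
a synthesis module, configured to combine the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal-aware representation vector;
a third training module, configured to train a recurrent neural network recommender based on the POI spatiotemporal-aware representation vectors, the trained recurrent neural network recommender being used to recommend the next POI.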
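The present invention has the following advantages:
(1) Privacy and security
Conventional POI recommendation methods rely on user preference modeling, and this user profiling poses a risk to user privacy. The proposed POI recommendation method based on brain-inspired spatiotemporal perception representation fully mines and exploits the spatiotemporal characteristics of the POIs themselves to obtain highly discriminative spatiotemporal-aware representations, enabling POI recommendation without infringing user privacy and under extreme conditions such as cold start. For the spatial characteristics of the POIs, the method adopts a spatial position encoder based on the entorhinal grid-cell model of the brain to mine the multi-scale geographic distribution of POIs; for the temporal characteristics of the POIs, the method tensorizes POI visit-time patterns and exploits these characteristics through the multi-level spatiotemporal coupling of neighboring visit timestamps, POIs with similar visit-time patterns, and spatiotemporally adjacent POIs.
(2) Efficiency and robustness
The method draws on the information representation and processing mechanisms of the entorhinal-hippocampal loop of the brain and describes POIs efficiently with spatiotemporal-aware embedding vectors, thereby achieving high-quality POI recommendation. Table 1 and Table 2 compare the method with several high-performance POI recommendation methods on the Instagram check-in dataset (Example 1) and the Gowalla dataset (Example 2).
Table 1. POI recommendation performance on Instagram check-in
Figure PCTCN2021117879-appb-000034
Table 2. POI recommendation performance on Gowalla
Figure PCTCN2021117879-appb-000035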
Here, comparative example [2]: Xin Liu, Yong Liu, and Xiaoli Li. Exploring the Context of Locations for Personalized Location Recommendations. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 1188–1194, 2016.
Comparative example [3]: Buru Chang, Yonggyu Park, Donghyeon Park, Seongsoon Kim, and Jaewoo Kang. Content-aware hierarchical point-of-interest embedding model for successive POI recommendation. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 3301–3307, 2018.
Comparative example [4]: Qiang Liu, Shu Wu, Liang Wang, and Tieniu Tan. Predicting the next location: A recurrent model with spatial and temporal contexts. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 194–200, 2016.
Comparative example [5]: Pengpeng Zhao, Haifeng Zhu, Yanchi Liu, Jiajie Xu, Zhixu Li, Fuzhen Zhuang, Victor S Sheng, and Xiaofang Zhou. Where to Go Next: A Spatio-Temporal Gated Network for Next POI Recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, 2019.
Comparative example [6]: Ke Sun, Tieyun Qian, Tong Chen, Yile Liang, Quoc Viet Hung Nguyen, and Hongzhi Yin. Where to Go Next: Modeling Long- and Short-Term User Preferences for Point-of-Interest Recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 214–221, 2020.
Comparative example [7]: Nicholas Lim, Bryan Hooi, and Xueou Wang. STP-UDGAT: Spatial-Temporal-Preference User Dimensional Graph Attention Network for Next POI Recommendation. In Proceedings of the ACM International Conference on Information & Knowledge Management, pages 845–854, 2020.
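(3) Low data annotation cost
The present invention draws on the graph representation mechanism of the entorhinal-hippocampal cognitive structure and the word embedding (Word Embedding) method of natural language processing, makes full use of the spatiotemporal and sequential context relations of the POIs themselves, constructs context graphs from different perspectives, and realizes unsupervised representation learning. Compared with recommendation methods that use POI labels (such as POI categories) or other POI-related information (such as posts and reviews), the proposed method incurs no additional data annotation cost (POI labels, text screening, etc.); the POI visiting order within the sequences, the geographic locations of POIs, the POI visit times, and so on can all be obtained during data collection.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be cross-referenced. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Specific examples are used herein to explain the principles and implementations of the present invention. The description of the above embodiments is intended only to help understand the method of the present invention and its core idea; at the same time, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the present invention.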

Claims (7)

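  1. A point-of-interest recommendation method based on brain-inspired spatiotemporal perception representation, characterized by comprising:
    constructing a POI context graph structure based on a POI check-in dataset, the POI context graph structure comprising a check-in sequence context graph, a spatial context graph, and a spatiotemporal context graph of the POIs;
    sampling the check-in sequence context graph to obtain POI samples;
    training, by unsupervised learning and based on the POI samples, the POI check-in sequence embedding model of a brain-inspired spatiotemporal-aware embedding model, the POI check-in sequence embedding model being used to extract POI sequence representation vectors;
    sampling the spatial context graph and the spatiotemporal context graph to obtain spatial POI samples and spatiotemporal POI samples, and generating a POI visit-time matrix;
    training, by unsupervised learning and based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model, the spatiotemporal embedding model being used to extract POI joint spatiotemporal representation vectors, which comprise spatial embedding representation vectors and spatiotemporal embedding representation vectors;
    combining the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal-aware representation vector;
    training a recurrent neural network recommender based on the POI spatiotemporal-aware representation vectors, and recommending the next POI with the trained recurrent neural network recommender.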
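  2. The point-of-interest recommendation method based on brain-inspired spatiotemporal perception representation according to claim 1, characterized in that the check-in sequence context graph is constructed as follows:
    sorting a user's visit records in chronological order to determine the POI check-in sequence;
    connecting adjacent POIs in the POI check-in sequence with edges to construct the check-in sequence context graph.
  3. The point-of-interest recommendation method based on brain-inspired spatiotemporal perception representation according to claim 1, characterized in that spatially adjacent POIs are connected with edges to construct the spatial context graph; the spatially adjacent POIs are the K POIs nearest to a center POI.
  4. The point-of-interest recommendation method based on brain-inspired spatiotemporal perception representation according to claim 3, characterized in that temporally adjacent POIs are connected with edges to construct the temporal context graph; the temporally adjacent POIs are POIs that are spatially adjacent and have similar visit-time patterns; POIs with similar visit-time patterns are POIs having no fewer than a threshold m of neighboring visit-timestamp pairs; a neighboring visit-timestamp pair is a pair of timestamps that have the same "weekday or not" attribute and whose visit times are less than a threshold h apart.
  5. The point-of-interest recommendation method based on brain-inspired spatiotemporal perception representation according to claim 1, characterized in that training, by unsupervised learning and based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model specifically comprises:
    training a spatial embedding model based on the spatial POI samples, the trained spatial embedding model being used to extract spatially adjacent POIs;
    training, by unsupervised learning and based on the spatially adjacent POIs, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the brain-inspired spatiotemporal-aware embedding model.
  6. The point-of-interest recommendation method based on brain-inspired spatiotemporal perception representation according to claim 1, characterized in that the POI visit-time matrix is constructed as follows:
    filling the visit records of the POIs in the spatiotemporal context graph into a zero matrix by date and time of day to construct an initial POI visit-time matrix;
    normalizing the initial POI visit-time matrix and applying a convolution to it to obtain the POI visit-time matrix.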
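  7. A point-of-interest recommendation system based on brain-inspired spatiotemporal perception representation, characterized by comprising:
    a POI context graph structure construction module, configured to construct a POI context graph structure based on a POI check-in dataset, the POI context graph structure comprising a check-in sequence context graph, a spatial context graph, and a spatiotemporal context graph of the POIs;
    a first sampling module, configured to sample the check-in sequence context graph to obtain POI samples;
    a first training module, configured to train, by unsupervised learning and based on the POI samples, the POI check-in sequence embedding model of a brain-inspired spatiotemporal-aware embedding model, the POI check-in sequence embedding model being used to extract POI sequence representation vectors;
    a second sampling module, configured to sample the spatial context graph and the spatiotemporal context graph to obtain spatial POI samples and spatiotemporal POI samples, and to generate a POI visit-time matrix;
    a second training module, configured to train, by unsupervised learning and based on the spatial POI samples, the spatiotemporal POI samples, and the POI visit-time matrix, the spatiotemporal embedding model of the spatiotemporal-aware embedding model, the spatiotemporal embedding model being used to extract POI joint spatiotemporal representation vectors, which comprise spatial embedding representation vectors and spatiotemporal embedding representation vectors;
    a synthesis module, configured to combine the POI sequence representation vector and the POI joint spatiotemporal representation vector into a POI spatiotemporal-aware representation vector;
    a third training module, configured to train a recurrent neural network recommender based on the POI spatiotemporal-aware representation vectors, the trained recurrent neural network recommender being used to recommend the next POI.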
PCT/CN2021/117879 2021-08-13 2021-09-13 A point-of-interest recommendation method and system based on brain-inspired spatiotemporal perception representation WO2023015658A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110930940.0 2021-08-13
CN202110930940.0A CN113590971B (zh) 2021-08-13 2021-08-13 A point-of-interest recommendation method and system based on brain-inspired spatiotemporal perception representation

Publications (1)

Publication Number Publication Date
WO2023015658A1 true WO2023015658A1 (zh) 2023-02-16

Family

ID=78257728

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/117879 WO2023015658A1 (zh) 2021-08-13 2021-09-13 A point-of-interest recommendation method and system based on brain-inspired spatiotemporal perception representation

Country Status (2)

Country Link
CN (1) CN113590971B (zh)
WO (1) WO2023015658A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
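CN117573856A (zh) * 2024-01-15 2024-02-20 University of Science and Technology of China A memory-network-based multi-interest recall method for content in the construction domain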

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116153069B (zh) * 2023-02-09 2024-01-30 Southeast University Traffic state estimation method and device driven by the fusion of traffic flow models and data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180129971A1 (en) * 2016-11-10 2018-05-10 Adobe Systems Incorporated Learning user preferences using sequential user behavior data to predict user behavior and provide recommendations
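CN108829766A (zh) * 2018-05-29 2018-11-16 Soochow University A point-of-interest recommendation method, system, device, and computer-readable storage medium
CN110399565A (zh) * 2019-07-29 2019-11-01 Beijing Institute of Technology Recurrent neural network point-of-interest recommendation method based on a spatiotemporal periodic attention mechanism
CN111241419A (zh) * 2020-01-09 2020-06-05 Liaoning Technical University A next point-of-interest recommendation method based on a user relationship embedding model
CN113158038A (zh) * 2021-04-02 2021-07-23 Shanghai Jiao Tong University Point-of-interest recommendation method and system based on the STA-TCN neural network framework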

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8204886B2 (en) * 2009-11-06 2012-06-19 Nokia Corporation Method and apparatus for preparation of indexing structures for determining similar points-of-interests
US20110313954A1 (en) * 2010-06-18 2011-12-22 Microsoft Corporation Community model based point of interest local search
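CN104063721B (zh) * 2014-07-04 2017-06-16 Institute of Automation, Chinese Academy of Sciences A human behavior recognition method based on automatic learning and selection of semantic features
BR102016007265B1 (pt) * 2016-04-01 2022-11-16 Samsung Eletrônica da Amazônia Ltda. Multimodal, real-time method for filtering sensitive content
CN108804551B (zh) * 2018-05-21 2021-06-04 Liaoning Technical University A spatial point-of-interest recommendation method balancing diversity and personalization
CN108875007B (zh) * 2018-06-15 2019-12-17 Tencent Technology (Shenzhen) Co., Ltd. Method and apparatus for determining points of interest, storage medium, and electronic device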
US20200211053A1 (en) * 2018-12-26 2020-07-02 Yandex Europe Ag Method and system for determining fact of visit of user to point of interest
EP3828803A1 (en) * 2019-11-26 2021-06-02 Naver Corporation Ambient point-of-interest recommendation using look-alike groups
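CN110929162B (zh) * 2019-12-04 2021-08-03 Tencent Technology (Shenzhen) Co., Ltd. Point-of-interest-based recommendation method and apparatus, computer device, and storage medium
CN111949865A (zh) * 2020-08-10 2020-11-17 Hangzhou Dianzi University Point-of-interest recommendation method based on graph neural networks and users' long- and short-term preferences
CN112925893B (zh) * 2021-03-23 2023-09-15 Soochow University A conversational point-of-interest recommendation method and apparatus, electronic device, and storage medium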

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIM NICHOLAS NICHOLASLIM@U.NUS.EDU; HOOI BRYAN DCSBHK@NUS.EDU.SG; NG SEE-KIONG SEEKIONG@NUS.EDU.SG; WANG XUEOU IDSWX@NUS.EDU.SG; G: "STP-UDGAT: Spatial-Temporal-Preference User Dimensional Graph Attention Network for Next POI Recommendation", PROCEEDINGS OF THE 7TH ACM CONFERENCE ON INFORMATION-CENTRIC NETWORKING, ACMPUB27, NEW YORK, NY, USA, 19 October 2020 (2020-10-19) - 23 April 2021 (2021-04-23), New York, NY, USA , pages 845 - 854, XP058625138, ISBN: 978-1-4503-8312-7, DOI: 10.1145/3340531.3411876 *

Also Published As

Publication number Publication date
CN113590971B (zh) 2023-11-07
CN113590971A (zh) 2021-11-02

Similar Documents

Publication Publication Date Title
May Petry et al. MARC: a robust method for multiple-aspect trajectory classification via space, time, and semantic embeddings
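CN113139140B (zh) Tourist attraction recommendation method based on a spatiotemporal-aware GRU combined with user relationship preferences
CN110674323A (zh) Unsupervised cross-modal hashing retrieval method and system based on virtual label regression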
El Mohadab et al. Predicting rank for scientific research papers using supervised learning
Dong et al. High-resolution land cover mapping through learning with noise correction
Cai et al. A robust interclass and intraclass loss function for deep learning based tongue segmentation
Huang et al. Research on urban modern architectural art based on artificial intelligence and GIS image recognition system
Shi et al. Attentional memory network with correlation-based embedding for time-aware POI recommendation
Wang et al. Regularized maximum correntropy machine
Bai et al. Geographic mapping with unsupervised multi-modal representation learning from VHR images and POIs
WO2023015658A1 (zh) A point-of-interest recommendation method and system based on brain-inspired spatiotemporal perception representation
Mou et al. Personalized tourist route recommendation model with a trajectory understanding via neural networks
Wen et al. MSSRM: A multi-embedding based self-attention spatio-temporal recurrent model for human mobility prediction
Cao et al. A dual attention model based on probabilistically mask for 3D human motion prediction
CN114579892A (zh) A method for predicting users' out-of-town visit locations based on cross-city point-of-interest matching
Hagenauer et al. Contextual neural gas for spatial clustering and analysis
CN108647295B (zh) An image annotation method based on deep collaborative hashing
Qin et al. Deep top similarity hashing with class-wise loss for multi-label image retrieval
Sun et al. Deep convolutional autoencoder for urban land use classification using mobile device data
Jiang et al. Evaluation of county-level poverty alleviation progress by deep learning and satellite observations
CN117010480A (zh) Model training method, apparatus, device, storage medium, and program product
Li et al. Prediction of network public opinion features in urban planning based on geographical case-based reasoning
Chen et al. Attention-based multi-task learning for sensor analytics
Yao et al. Unsupervised land-use change detection using multi-temporal POI embedding
Huang Class prediction of cancer using probabilistic neural networks and relative correlation metric

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21953267

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE