CN110059144A

CN110059144A - A trajectory owner prediction method based on convolutional neural networks

Info

Publication number: CN110059144A
Application number: CN201910266737.0A
Authority: CN
Inventors: 罗绪成; 李升阳; 仵筱妍
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2019-04-03
Filing date: 2019-04-03
Publication date: 2019-07-26
Anticipated expiration: 2039-04-03
Also published as: CN110059144B

Abstract

The invention discloses a trajectory owner prediction method based on convolutional neural networks. A directed unweighted graph G=<V, E> is formed from the trajectories of all users, and low-dimensional real-valued vectors of the trajectory location IDs are learned through Node2Vec. The user trajectories are then sliced; in each resulting variable-length trajectory, every location ID is replaced with its corresponding low-dimensional real-valued vector, and a fixed-dimension trajectory matrix is formed by truncation or padding. Next, a four-layer convolutional neural network is constructed and trained as the prediction model. The trajectory matrix built from the longitude and latitude of the user locations to be detected is input into the trained prediction model to obtain the probability distribution over trajectory owner classes, and the index of the maximum value in that distribution is taken as the number of the owner of the trajectory.

Description

A Trajectory Owner Prediction Method Based on Convolutional Neural Networks

Technical Field

The invention belongs to the technical field of machine learning and, more particularly, relates to a trajectory owner prediction method based on a convolutional neural network.

Background

Trajectory owner prediction extracts and analyzes features of a trajectory whose owner is unknown and then determines the owner of that trajectory. It is the basis of many location-based services and is of great significance for improving their quality: service providers can use the prediction results for personalized recommendation, preference-based path planning, and the like.

Existing trajectory owner prediction methods usually treat a trajectory as a time series and learn a representation of that series with an RNN or a similar model. Although this approach learns the sequential context of a trajectory, a specific location, or a combination of a few specific locations, may be decisive for classifying the trajectory, and existing methods cannot capture such features effectively. A convolutional neural network can learn these features better, so a trajectory owner prediction method based on a convolutional neural network is proposed.

Summary of the Invention

The purpose of the present invention is to overcome the deficiencies of the prior art and to provide a trajectory owner prediction method based on a convolutional neural network, which improves prediction accuracy by improving trajectory modeling and feature extraction.

To achieve the above purpose, the present invention proposes a trajectory owner prediction method based on a convolutional neural network, characterized in that it comprises the following steps:

(1) Data preprocessing

(1.1) Collect the longitude and latitude of the historical locations of all user trajectories in chronological order to form a longitude-latitude set, in which a longitude-latitude pair that occurs repeatedly is retained only once;

Number each longitude-latitude pair in the set starting from 1, so that each pair is given a unique location ID;

(1.2) In chronological order, replace the longitude and latitude of each historical location in a user trajectory with the location ID given in step (1.1), so that each user trajectory is represented by a sequence of location IDs; at the same time, identify each user with a unique integer ID, so that users and their trajectories form owner trajectories of the form [user ID, trajectory (location ID, ..., location ID)];

(1.3) Form a directed unweighted graph G=<V, E> from the owner trajectories, where V is the set of all location IDs; if some user moves from ID_i to ID_j, then <ID_i, ID_j> is a directed edge, and all such edges constitute the edge set E of G;

(2) Trajectory representation

(2.1) Taking the constructed directed unweighted graph G as input, learn a low-dimensional real-valued vector for each location ID in G through the Node2Vec algorithm;

(2.2) Slice the trajectory of each user from step (1.2) at fixed time intervals, so that each user trajectory is divided into several location ID sequences, and label each resulting sequence with the user ID as its owner;

(2.3) In each variable-length trajectory obtained by slicing, replace every location ID with its corresponding low-dimensional real-valued vector, thereby generating the trajectory matrix of each user;

Then construct a fixed-dimension trajectory matrix for each user by truncation or padding, thereby forming the data set, where the padding vector is the average of the low-dimensional real-valued vectors of all location IDs;

(3) Building the prediction model

Construct a four-layer convolutional neural network. Its input layer is the fixed-dimension trajectory matrix; the convolutional layer uses three convolution kernels of size m*embedding_size, where m is a constant and embedding_size is the dimension of the low-dimensional real-valued vectors output by Node2Vec; the pooling layer is k-max pooling, where k denotes keeping the k largest values after convolution; the output of the fully connected layer is fed to the softmax function to obtain the probability distribution over trajectory owners;

(4) Training the prediction model

(4.1) Building the training set

Take the set X of fixed-dimension trajectory matrices of some of the users in the data set, together with the corresponding set Y of one-hot class vectors, as the training set, where X=[Vec_x₁, Vec_x₂, ..., Vec_x_n], Vec_x_n denotes the fixed-dimension trajectory matrix of the n-th user, Y=[Vec_t₁, Vec_t₂, ..., Vec_t_n], and Vec_t_n denotes the one-hot class vector of the n-th owner, whose n-th position is 1 and whose remaining positions are all 0;

(4.2) Initializing the prediction model

Initialize the weight matrix W_p of each convolution kernel in the convolutional layer from a normal distribution with mean 0 and variance 0.1; initialize the bias vector B_p of each convolution kernel to 0.1, where the number of elements of each bias vector equals the number of neurons in the corresponding layer; initialize the weight matrix W of the fully connected layer, whose dimension is [batch_size*k*number of convolution kernels, number of classes], and initialize the bias vector B of the fully connected layer to 0.1, with as many elements as there are classes; here batch_size is a constant and p = 1, 2, 3 indexes the convolution kernels in the convolutional layer;

(4.3) Input the training set into the initialized prediction model and optimize the loss function with the Adam algorithm; then use the error backpropagation (BP) algorithm to propagate the error to the preceding layer and update the weight matrices W_p and bias vectors B_p of the convolutional layer as well as the weight matrix W and bias vector B of the fully connected layer; after a number of iterations the neural network converges, which yields the trained prediction model;

(5) Trajectory owner prediction

Following the method of steps (1) and (2), construct the fixed-dimension trajectory matrix of the user from the longitude and latitude of the user locations to be detected, then input this matrix into the trained prediction model to obtain the probability distribution of the trajectory over all owner classes; the index of the maximum value in the probability distribution is the owner number of the trajectory.

The object of the invention is achieved as follows:

In the trajectory owner prediction method based on a convolutional neural network of the present invention, a directed unweighted graph G=<V, E> is formed from the trajectories of all users, and low-dimensional real-valued vectors of the trajectory location IDs are learned through Node2Vec. The user trajectories are then sliced; in each resulting variable-length trajectory, every location ID is replaced with its corresponding low-dimensional real-valued vector, and a fixed-dimension trajectory matrix is formed by truncation or padding. Next, a four-layer convolutional neural network is constructed and trained as the prediction model; the trajectory matrix built from the longitude and latitude of the user locations to be detected is input into the trained prediction model to obtain the probability distribution over trajectory owner classes, and the index of the maximum value in that distribution is taken as the number of the owner of the trajectory.

The trajectory owner prediction method based on a convolutional neural network of the present invention also has the following beneficial effects:

(1) The owner trajectory sequences are built into a network, namely a directed unweighted graph, and the low-dimensional real-valued vector of each node in the network is learned through the Node2Vec algorithm;

(2) Trajectories are padded with the average of all vectors; compared with padding entirely with zeros, this greatly improves owner prediction accuracy;

(3) The network is improved relative to a conventional convolutional neural network, which further increases the accuracy of trajectory owner prediction.

Description of the Drawings

Fig. 1 is a flow chart of the trajectory owner prediction method based on a convolutional neural network of the present invention;

Fig. 2 is a schematic diagram of the architecture of the prediction model.

Detailed Description

Specific embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note that, in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the main content of the invention.

Embodiment

Fig. 1 is a flow chart of the trajectory owner prediction method based on a convolutional neural network of the present invention.

In this embodiment, as shown in Fig. 1, the trajectory owner prediction method based on a convolutional neural network of the present invention comprises the following steps:

S1. Data preprocessing

S1.1. Collect the longitude and latitude of the historical locations of all user trajectories in chronological order to form a longitude-latitude set, in which a longitude-latitude pair that occurs repeatedly is retained only once;

As shown in Table 1, number each longitude-latitude pair in the set starting from 1, so that each pair is given a unique location ID;

Table 1 lists the location IDs assigned to the longitude-latitude pairs;

No.    Latitude and longitude
1      39.747652, -104.99251
2      39.891383, -105.070814
3      39.891077, -105.068532
4      39.750469, -104.999073
...    ...

Table 1

S1.2. In chronological order, replace the longitude and latitude of each historical location in a user trajectory with the location ID given in step S1.1, so that each user trajectory is represented by a sequence of location IDs; at the same time, identify each user with a unique integer ID, so that users and their trajectories form owner trajectories of the form [user ID, trajectory (location ID, ..., location ID)]. In this embodiment, one such owner trajectory is [0, (622, 474, 474, 474, 481, 482, 482, 83, 83, 270, 487, 270, 270, 83, 83, 471, ...)], where 0 is the user ID and the rest is the location ID sequence;
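
Steps S1.1 and S1.2 can be illustrated with a minimal sketch. The record layout (user ID, timestamp, latitude, longitude), the timestamps, and the function names are assumptions made for illustration; only the coordinates are taken from Table 1.

```python
def assign_location_ids(points):
    """Give each distinct (lat, lon) pair an ID, numbering from 1 in order of first appearance."""
    ids = {}
    for point in points:
        if point not in ids:          # a repeated pair is kept only once
            ids[point] = len(ids) + 1
    return ids


def build_owner_trajectories(records):
    """records: iterable of (user_id, timestamp, lat, lon) tuples (hypothetical layout)."""
    ordered = sorted(records, key=lambda r: r[1])            # chronological order
    loc_ids = assign_location_ids((lat, lon) for _, _, lat, lon in ordered)
    trajectories = {}
    for user, _, lat, lon in ordered:
        # Replace each coordinate pair with its location ID, per user.
        trajectories.setdefault(user, []).append(loc_ids[(lat, lon)])
    return loc_ids, trajectories


# Hypothetical records; user IDs and timestamps are made up.
records = [
    (0, 10, 39.747652, -104.99251),
    (0, 20, 39.891383, -105.070814),
    (1, 15, 39.891077, -105.068532),
    (0, 30, 39.747652, -104.99251),
    (1, 25, 39.891383, -105.070814),
]
loc_ids, owner_trajectories = build_owner_trajectories(records)
```

Each entry of `owner_trajectories` corresponds to one owner trajectory [user ID, trajectory (location ID, ..., location ID)].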

S1.3. Form a directed unweighted graph G=<V, E> from the owner trajectories, where V is the set of all location IDs; if some user moves from ID_i to ID_j, then <ID_i, ID_j> is a directed edge, and all such edges constitute the edge set E of G;
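
The graph construction of step S1.3 can be sketched as follows, using the example owner trajectory of step S1.2. The function name is hypothetical, and keeping self-loops such as <474, 474> is an assumption, since the text does not say whether consecutive repeats of the same location are excluded.

```python
def build_graph(trajectories):
    """Return (V, E) for the directed unweighted graph built from owner trajectories.

    trajectories: dict mapping user ID -> list of location IDs in time order.
    """
    vertices, edges = set(), set()
    for sequence in trajectories.values():
        vertices.update(sequence)
        # Every consecutive pair ID_i -> ID_j contributes one directed edge;
        # using a set makes the graph unweighted (repeated moves count once).
        for src, dst in zip(sequence, sequence[1:]):
            edges.add((src, dst))
    return vertices, edges


example = {0: [622, 474, 474, 474, 481, 482, 482, 83, 83, 270, 487, 270, 270, 83, 83, 471]}
V, E = build_graph(example)
```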

S2. Trajectory representation

S2.1. Taking the constructed directed unweighted graph G as input, learn a low-dimensional real-valued vector for each location ID in G through the Node2Vec algorithm. Node2Vec is a network representation learning method; its details belong to the prior art and are not repeated here.

In this embodiment, for example, the low-dimensional real-valued vector of location ID 60 is:

[60 (-0.383389, -0.826315, -1.379363, ..., -1.839076, 1.930556, 0.502587)];
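
For orientation, the sampling stage of Node2Vec can be sketched as a biased second-order random walk. This is only a simplified illustration: the return parameter p, the in-out parameter q, the walk length, and the function name are assumptions not given in the text, and the skip-gram training that turns the walks into vectors is omitted.

```python
import random


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=random):
    """Sample one biased walk on a directed graph.

    adj: dict mapping node -> set of successor nodes.
    p, q: return and in-out parameters (values here are illustrative).
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        successors = sorted(adj.get(cur, ()))
        if not successors:            # dead end: stop early
            break
        if len(walk) == 1:            # first step has no previous node
            walk.append(rng.choice(successors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in successors:
            if nxt == prev:                       # going back to the previous node
                weights.append(1.0 / p)
            elif nxt in adj.get(prev, ()):        # stays close to the previous node
                weights.append(1.0)
            else:                                 # moves further away
                weights.append(1.0 / q)
        walk.append(rng.choices(successors, weights=weights, k=1)[0])
    return walk


adj = {1: {2, 3}, 2: {1, 3}, 3: {1}}
sample_walk = node2vec_walk(adj, start=1, length=10, p=0.5, q=2.0, rng=random.Random(0))
```

Many such walks, started from every location ID, would then be treated as sentences for a skip-gram model that outputs one vector per location ID.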

S2.2. Slice the trajectory of each user from step S1.2 at fixed time intervals, so that each user trajectory is divided into several location ID sequences, and label each resulting sequence with the user ID as its owner. In this embodiment, the result of slicing looks as follows:

[0, (622, 474, 474, 474, 481, 482, 482, 83)]

[0, (83, 270, 487, 270, 270, 83, 83, 471)]

...

where 0 is the user ID and the rest is the location ID sequence;
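
A minimal sketch of the slicing in step S2.2 is given below. The text does not state the interval length or the timestamp format, so the one-hour window, the timestamps in seconds, and the function name are assumptions.

```python
def slice_trajectory(user_id, timed_ids, interval):
    """Split one user's trajectory into slices covering fixed time windows.

    timed_ids: list of (timestamp, location_id), sorted by timestamp.
    interval: window length in the same unit as the timestamps.
    Returns a list of (user_id, [location IDs]) pairs, one per non-empty window.
    """
    slices = []
    current_window = None
    for timestamp, location_id in timed_ids:
        window = timestamp // interval
        if window != current_window:       # a new time window starts a new slice
            slices.append((user_id, []))
            current_window = window
        slices[-1][1].append(location_id)
    return slices


# Hypothetical timestamps in seconds, sliced into one-hour windows.
timed = [(0, 622), (100, 474), (3500, 474), (3600, 481), (7000, 482), (7300, 83)]
sliced = slice_trajectory(0, timed, interval=3600)
```

Because the windows are defined by time, the slices generally differ in length, which is why step S2.3 brings them to a fixed size.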

S2.3. In each variable-length trajectory obtained by slicing, replace every location ID with its corresponding low-dimensional real-valued vector, thereby generating the trajectory matrix of each user;

Then construct a fixed-dimension trajectory matrix for each user by truncation or padding, where the padding vector is the average of the low-dimensional real-valued vectors of all location IDs. In this embodiment, the first 30 location IDs are kept when constructing the fixed-dimension trajectory matrix of each user;
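
The truncation and mean-vector padding of step S2.3 can be sketched as follows. The two-dimensional toy embeddings and the function names are assumptions for illustration; only the length of 30 and the mean-vector padding come from the text.

```python
def mean_vector(embeddings):
    """Element-wise average of the vectors of all location IDs."""
    vectors = list(embeddings.values())
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def to_fixed_matrix(location_ids, embeddings, max_len=30):
    """Build a max_len x embedding_size matrix from one trajectory slice."""
    pad = mean_vector(embeddings)
    # Truncate: keep only the first max_len location IDs.
    rows = [list(embeddings[loc]) for loc in location_ids[:max_len]]
    # Pad: fill the remaining rows with the mean vector rather than zeros.
    rows += [list(pad) for _ in range(max_len - len(rows))]
    return rows


toy_embeddings = {1: [1.0, 0.0], 2: [3.0, 2.0]}   # hypothetical 2-dimensional vectors
padded = to_fixed_matrix([1, 2, 1], toy_embeddings, max_len=5)
truncated = to_fixed_matrix([1, 2, 1], toy_embeddings, max_len=2)
```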

S3. Building the prediction model

Construct a four-layer convolutional neural network, as shown in Fig. 2. Its input layer is the fixed-dimension trajectory matrix. The convolutional layer uses convolution kernels of three sizes m*embedding_size, where m takes the values 2, 3 and 4, embedding_size is the dimension of the low-dimensional real-valued vectors output by Node2Vec, and there are 64 kernels of each size. The pooling layer is k-max pooling, where k denotes keeping the k largest values after convolution; in this embodiment k = 3. The output of the fully connected layer is fed to the softmax function to obtain the probability distribution over trajectory owners.
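
The convolution and k-max pooling stages can be sketched in plain Python as below. Two points are assumptions rather than statements of the text: no activation function is applied after the convolution, and the k retained values are kept in their original order, as is common for k-max pooling. The function names are hypothetical.

```python
def conv_feature_map(matrix, kernel, bias=0.0):
    """Slide one m x embedding_size kernel down the rows of the trajectory matrix."""
    m = len(kernel)
    feature_map = []
    for start in range(len(matrix) - m + 1):
        total = bias
        for offset in range(m):
            row, weights = matrix[start + offset], kernel[offset]
            total += sum(x * w for x, w in zip(row, weights))
        feature_map.append(total)
    return feature_map


def k_max_pool(values, k=3):
    """Keep the k largest values of a feature map, in their original order."""
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top)]


def pooled_features(matrix, kernels, biases, k=3):
    """Concatenate the k-max-pooled outputs of all kernels into one feature vector."""
    features = []
    for kernel, bias in zip(kernels, biases):
        features.extend(k_max_pool(conv_feature_map(matrix, kernel, bias), k))
    return features


toy_matrix = [[1, 0], [0, 1], [1, 1], [2, 0], [0, 2]]
ones_2 = [[1, 1], [1, 1]]               # one kernel with m = 2
ones_3 = [[1, 1], [1, 1], [1, 1]]       # one kernel with m = 3
features = pooled_features(toy_matrix, [ones_2, ones_3], [0.0, 0.0], k=3)
```

With the values of this embodiment (three kernel sizes, 64 kernels per size, k = 3), the concatenated feature vector of one trajectory would have 3*64*3 = 576 entries before the fully connected layer.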

S4. Training the prediction model

S4.1. Building the training set

S4.2. Initializing the prediction model

Initialize the weight matrix W_p of each convolution kernel in the convolutional layer from a normal distribution with mean 0 and variance 0.1; initialize the bias vector B_p of each convolution kernel to 0.1, where the number of elements of each bias vector equals the number of neurons in the corresponding layer; initialize the weight matrix W of the fully connected layer, whose dimension is [batch_size*k*number of convolution kernels, number of classes], and initialize the bias vector B of the fully connected layer to 0.1, with as many elements as there are classes; here batch_size is a constant with value 64, and p = 1, 2, 3 indexes the convolution kernels in the convolutional layer;
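
The convolutional-layer initialization can be sketched as follows. Note that a variance of 0.1 corresponds to a standard deviation of about 0.316, which matters when a library expects the standard deviation. Using one bias value per kernel, the embedding size of 8, the seed, and the function name are assumptions for illustration.

```python
import math
import random


def init_conv_layer(embedding_size, heights=(2, 3, 4), per_height=64, seed=0):
    """Draw kernel weights from N(0, variance 0.1) and set every bias to 0.1."""
    rng = random.Random(seed)
    std = math.sqrt(0.1)          # random.gauss takes a standard deviation
    kernels, biases = {}, {}
    for m in heights:
        # per_height kernels, each of shape m x embedding_size
        kernels[m] = [
            [[rng.gauss(0.0, std) for _ in range(embedding_size)] for _ in range(m)]
            for _ in range(per_height)
        ]
        biases[m] = [0.1] * per_height   # assumed: one bias per kernel
    return kernels, biases


kernels, biases = init_conv_layer(embedding_size=8)
```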

S4.3. Input the training set into the initialized prediction model and optimize the loss function with the Adam algorithm; the loss function being optimized is as follows:

where N = batch_size, y_j denotes the elements of Vec_t_j, and a_j is the output value of the softmax function;
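
The loss formula itself is not reproduced in this text. Given the symbols defined for it (batch size N, one-hot targets y_j, softmax outputs a_j), one plausible reading is the batch-averaged cross-entropy, L = -(1/N) * sum over j of sum over classes c of y_{j,c} * log(a_{j,c}). The sketch below implements that reading and should be taken as an assumption, not as the formula of the original publication.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one vector of class scores."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(batch_probs, batch_onehot):
    """Average cross-entropy over a batch of N samples (assumed form of the loss)."""
    n = len(batch_probs)
    loss = 0.0
    for probs, onehot in zip(batch_probs, batch_onehot):
        # Only the position where the one-hot vector is 1 contributes.
        loss -= sum(y * math.log(a) for y, a in zip(onehot, probs) if y)
    return loss / n


probs = softmax([0.0, 0.0])
loss = cross_entropy([probs], [[1, 0]])
```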

Then use the error backpropagation (BP) algorithm to propagate the error to the preceding layer and update the weight matrices W_p and bias vectors B_p of the convolutional layer as well as the weight matrix W and bias vector B of the fully connected layer; after a number of iterations the neural network converges, which yields the trained prediction model;
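
For reference, a single Adam update of one scalar parameter can be sketched as below. The text gives no hyperparameters, so the learning rate and the decay constants are the commonly used defaults and are assumptions here, as is the function name.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter.

    theta: current parameter value; grad: gradient obtained by backpropagation.
    m, v: running first and second moment estimates; t: step count starting at 1.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)          # bias correction of the first moment
    v_hat = v / (1 - beta2 ** t)          # bias correction of the second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = adam_step(theta=0.0, grad=1.0, m=0.0, v=0.0, t=1)
```

In training, the same update would be applied element-wise to W_p, B_p, W and B at every iteration.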

S5. Trajectory owner prediction

Following the method of steps S1 and S2, construct the fixed-dimension trajectory matrix of the user from the longitude and latitude of the user locations to be detected, then input this matrix into the trained prediction model to obtain the probability distribution of the trajectory over all owner classes; the index of the maximum value in the probability distribution is the owner number of the trajectory.
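
The final decision of step S5 is an argmax over the softmax output, which can be sketched as follows; the function name and the example probabilities are hypothetical.

```python
def predict_owner(probabilities):
    """Return the index of the largest probability, taken as the owner number."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


owner = predict_owner([0.1, 0.7, 0.2])   # made-up probabilities for three owners
```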

Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be clear that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, various changes are obvious as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all inventions and creations that make use of the inventive concept are protected.

Claims

1. A trajectory owner prediction method based on a convolutional neural network, characterized in that it comprises the following steps:

(1) Data preprocessing

(1.1) Collect the longitude and latitude of the historical locations of all user trajectories in chronological order to form a longitude-latitude set, in which a longitude-latitude pair that occurs repeatedly is retained only once;

Number each longitude-latitude pair in the set starting from 1, so that each pair is given a unique location ID;

(1.2) In chronological order, replace the longitude and latitude of each historical location in a user trajectory with the location ID given in step (1.1), so that each user trajectory is represented by a sequence of location IDs; at the same time, identify each user with a unique integer ID, so that users and their trajectories form owner trajectories of the form [user ID, trajectory (location ID, ..., location ID)];

(1.3) Form a directed unweighted graph G=<V, E> from the owner trajectories, where V is the set of all location IDs; if some user moves from ID_i to ID_j, then <ID_i, ID_j> is a directed edge, and all such edges constitute the edge set E of G;

(2) Trajectory representation

(2.1) Taking the constructed directed unweighted graph G as input, learn a low-dimensional real-valued vector for each location ID in G through the Node2Vec algorithm;

(2.2) Slice the trajectory of each user from step (1.2) at fixed time intervals, so that each user trajectory is divided into several location ID sequences, and label each resulting sequence with the user ID as its owner;

(2.3) In each variable-length trajectory obtained by slicing, replace every location ID with its corresponding low-dimensional real-valued vector, thereby generating the trajectory matrix of each user;

Then construct a fixed-dimension trajectory matrix for each user by truncation or padding, thereby forming the data set, where the padding vector is the average of the low-dimensional real-valued vectors of all location IDs;

(3) Building the prediction model

Construct a four-layer convolutional neural network, whose input layer is the fixed-dimension trajectory matrix; the convolutional layer uses three convolution kernels of size m*embedding_size, where m is a constant and embedding_size is the dimension of the low-dimensional real-valued vectors output by Node2Vec; the pooling layer is k-max pooling, where k denotes keeping the k largest values after convolution; the output of the fully connected layer is fed to the softmax function to obtain the probability distribution over trajectory owners;

(4) Training the prediction model

(4.1) Building the training set

Take the set X of fixed-dimension trajectory matrices of some of the users in the data set, together with the corresponding set Y of one-hot class vectors, as the training set, where X=[Vec_x₁, Vec_x₂, ..., Vec_x_n], Vec_x_n denotes the fixed-dimension trajectory matrix of the n-th user, Y=[Vec_t₁, Vec_t₂, ..., Vec_t_n], and Vec_t_n denotes the one-hot class vector of the n-th owner, whose n-th position is 1 and whose remaining positions are all 0;

(4.2) Initializing the prediction model

Initialize the weight matrix W_p of each convolution kernel in the convolutional layer from a normal distribution with mean 0 and variance 0.1; initialize the bias vector B_p of each convolution kernel to 0.1, where the number of elements of each bias vector equals the number of neurons in the corresponding layer; initialize the weight matrix W of the fully connected layer, whose dimension is [batch_size*k*number of convolution kernels, number of classes], and initialize the bias vector B of the fully connected layer to 0.1, with as many elements as there are classes; here batch_size is a constant and p = 1, 2, 3 indexes the convolution kernels in the convolutional layer;

(4.3) Input the training set into the initialized prediction model and optimize the loss function with the Adam algorithm; then use the error backpropagation (BP) algorithm to propagate the error to the preceding layer and update the weight matrices W_p and bias vectors B_p of the convolutional layer as well as the weight matrix W and bias vector B of the fully connected layer; after a number of iterations the neural network converges, which yields the trained prediction model;

(5) Trajectory owner prediction

Following the method of steps (1) and (2), construct the fixed-dimension trajectory matrix of the user from the longitude and latitude of the user locations to be detected, then input this matrix into the trained prediction model to obtain the probability distribution of the trajectory over all owner classes; the index of the maximum value in the probability distribution is the owner number of the trajectory.

2. The trajectory owner prediction method based on a convolutional neural network according to claim 1, characterized in that the loss function optimized by the Adam algorithm is as follows:

where N = batch_size, y_j denotes the elements of Vec_t_j, and a_j is the output value of the softmax function.