CN117912248A

CN117912248A - Vehicle arrival time prediction method based on representation learning

Info

Publication number: CN117912248A
Application number: CN202410087571.7A
Authority: CN
Inventors: 刘美玲; 李昕彤; 李茂源; 司浩田; 陈茜雅
Original assignee: Northeast Forestry University
Current assignee: Northeast Forestry University
Priority date: 2024-01-22
Filing date: 2024-01-22
Publication date: 2024-04-19

Abstract

A vehicle arrival time prediction method based on representation learning, belonging to the technical field of intelligent transportation. The method aims to solve the problem that existing vehicle arrival time prediction has low accuracy. The method first uses the SDNE algorithm to process an intersection type matrix and a supplementary information matrix comprising traffic flow information and important information points, and then inputs the processed matrices into an MLP (multilayer perceptron) fully connected neural network to obtain a first representation vector; processes the node information vectors with an LSTM algorithm to obtain a second representation vector; and processes the discrete semantic information vector with the xDeepFM algorithm to obtain a third representation vector. The first to third representation vectors are then concatenated along the channel dimension, convolved by a 1-dimensional ResNet, and passed through a CBAM attention module and an MLP, which outputs the predicted vehicle arrival time.

Description

Vehicle arrival time prediction method based on representation learning

Technical Field

The invention belongs to the technical field of intelligent traffic, and particularly relates to a vehicle arrival time prediction method.

Background

With the rise of intelligent transportation, people increasingly want efficient travel, and finding better models for estimating vehicle travel time is a primary task. Because the traffic system is strongly nonlinear and is influenced by factors such as weather, time and traffic volume, current vehicle arrival time prediction has low accuracy under complex weather and traffic conditions.

In 2016, Google first proposed and developed the Wide & Deep model, which combines the advantages of logistic regression and deep neural networks and laid the foundation for many later recommender system models. In 2018, Wang et al. proposed the WDR (Wide-Deep-Recurrent) algorithm, which combined the recommender system algorithm WD (Wide & Deep Learning) with LSTM and was the first to apply this combination to predicting the time a vehicle needs to reach its destination. In 2020, Shen et al. proposed the TTPNet (Travel Time Prediction Network) algorithm: all trajectory sequences are embedded into a grid map, the speed information of each grid cell within one hour is recovered with a tensor decomposition algorithm to obtain network node representations, other semantic information is then concatenated, and the result is fed into a BiLSTM network to obtain the prediction. In 2019, Li et al. proposed customizing different prediction models for different functional areas. Although these models effectively improve the accuracy of vehicle arrival time prediction, current methods still leave considerable room for improvement under complex weather and traffic conditions.

Disclosure of Invention

The method aims to solve the problem that the existing vehicle arrival time prediction accuracy is low.

First, traffic information of the area where the vehicle to be predicted is located is acquired, and prediction is then performed with an SLD joint model;

The traffic information comprises an intersection type matrix A, a supplementary information matrix B and discrete semantic information;

The intersection type matrix A is the matrix formed by the node type sequence corresponding to the node sequence obtained after the vehicle trajectory information of the trip to be predicted is mapped onto the basic road network; the nodes are intersections in the basic road network, and the node type is the intersection type; the intersection types comprise crossroads, T-shaped intersections and Y-shaped intersections, represented by the identification numbers 1, 2 and 3 respectively; matrix A stores the identification number corresponding to each intersection type;

The supplementary information matrix B has the same node order and correspondence as the intersection type matrix A; the supplementary information matrix B has size N×M, where M represents the number of feature attributes of each node, and the feature attributes comprise traffic flow information and important information points;

The traffic flow information: whether, when the vehicle reaches the current node, the traffic flow at the current node exceeds the average traffic flow of all nodes on the same day; the value is 1 if it does, and 0 otherwise;

The important information points: if there are important places around the current node, the important information point value of the node is 1, otherwise it is 0;

The discrete semantic information comprises: the total trajectory length, departure time, weather condition, day of the week, highest air temperature, lowest air temperature and vehicle ID of the vehicle trajectory of the trip to be predicted;

The process of predicting by the SLD joint model comprises the following steps:

The intersection type matrix A and the supplementary information matrix B are taken as input and processed with the SDNE algorithm; the SDNE output is then fed into an MLP fully connected neural network to obtain a representation vector, denoted the first representation vector;

A node information vector is obtained from the information of each node in the node sequence produced by mapping the vehicle trajectory information of the trip to be predicted onto the basic road network; the node information vector contains the node position coordinates, the node type and the vehicle speed through the node; the node information vectors, arranged in node sequence order, are taken as input and processed with an LSTM algorithm; the LSTM algorithm outputs one representation vector, denoted the second representation vector;

A discrete semantic information vector is determined from the discrete semantic information and processed with the xDeepFM algorithm; the resulting representation vector is denoted the third representation vector;

The first to third representation vectors are concatenated along the channel dimension, convolved by a 1-dimensional ResNet, and then processed by a CBAM attention module and an MLP; the MLP outputs the predicted vehicle arrival time.

Further, the SLD joint model is a trained model, and its loss function during training is determined based on the mean absolute percentage error (MAPE): the percentage error between the actual arrival time and the predicted arrival time is calculated for each sample, and the absolute values of these percentage errors are then averaged over all samples.
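The MAPE definition above can be illustrated with a minimal Python sketch. The function name `mape` and the sample values are hypothetical and chosen only for illustration; the code computes the plain MAPE, without any regularization term.

```python
def mape(actual, predicted):
    """Mean absolute percentage error between actual and predicted travel times."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Per-sample absolute percentage error |y_i - f(x_i)| / y_i
    errors = [abs(y - p) / y for y, p in zip(actual, predicted)]
    # Average over all samples
    return sum(errors) / len(errors)


# Hypothetical travel times in minutes
actual_minutes = [10.0, 20.0, 40.0]
predicted_minutes = [12.0, 18.0, 40.0]
score = mape(actual_minutes, predicted_minutes)
print(f"MAPE = {score:.3f}")
```

The per-sample errors here are 0.2, 0.1 and 0.0, so the printed MAPE is 0.100.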

Further, the specific form of the loss function of the SLD joint model in the training process is as follows:

$$\mathcal{L}(f) = \frac{1}{N}\sum_{i=1}^{N}\frac{\lvert y_i - f(x_i)\rvert}{y_i} + \Omega(f)$$

where $x_i$ is the feature vector corresponding to the i-th sample; $y_i$ is its actual travel time; N is the number of samples; $f(x_i)$ is the prediction result of the SLD joint model; and $\Omega(f)$ is a regularization term;

The regularization term is learned using a gradient boosting decision tree (GBDT).

Further, before the SLD joint model is trained, the LSTM algorithm is pre-trained; during pre-training, the learning rate of the LSTM algorithm decays according to an exponential function, specifically:

$$\eta_t = \eta_0 \, e^{-\mathrm{decay\_rate} \cdot t}$$

where $\eta_t$ is the learning rate at training round t, $\eta_0$ is the initial learning rate, decay_rate is the decay rate, and t is the current training round.
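A minimal sketch of this exponential decay schedule follows. The initial learning rate 0.01 is the value stated in the detailed description; the decay rate 0.1 and the function name `exponential_decay_lr` are assumptions made only for illustration.

```python
import math


def exponential_decay_lr(initial_lr, decay_rate, training_round):
    """Learning rate eta_t = eta_0 * exp(-decay_rate * t)."""
    return initial_lr * math.exp(-decay_rate * training_round)


# eta_0 = 0.01 as in the embodiment; decay_rate = 0.1 is a hypothetical value
schedule = [exponential_decay_lr(0.01, 0.1, t) for t in range(5)]
for t, lr in enumerate(schedule):
    print(f"round {t}: learning rate {lr:.6f}")
```

The schedule starts at the initial learning rate for t = 0 and decreases monotonically in later rounds.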

Further, the autoencoder structure in the SDNE algorithm comprises an encoder and a decoder, each of which comprises an input layer, a hidden layer and an output layer; the hidden layer in both the encoder and the decoder of the autoencoder is replaced by a hidden layer with convolutional layers introduced, that is, convolutional layers are introduced at the front end of the hidden layer in both the encoder and the decoder, and their activation function is PReLU.

Further, 8 convolutional layers are introduced at the front end of the hidden layer in each of the encoder and the decoder of the autoencoder.

Further, each convolutional layer introduced at the front end of the hidden layer in the encoder and the decoder of the autoencoder uses a convolution kernel of size (3, 5).

Further, the original hidden layer in the encoder and the decoder of the autoencoder is a fully connected layer, and the hidden layer with convolutional layers introduced consists of the convolutional layers followed by the fully connected layer.

Further, the network structure of the xDeepFM algorithm comprises an embedding layer, a combination layer, a pooling layer and a fully connected layer; the fully connected layer is replaced by a fully connected layer with normalization, that is, a batch normalization layer is added after the fully connected layer.

Further, the dimension of the embedding layer in the network structure of the xDeepFM algorithm is set to 50.

The beneficial effects are that:

The invention comprehensively considers multiple influencing factors and uses the SLD joint model to fuse different kinds of attribute information, which effectively improves the model's ability to mine that information and thus the accuracy of the prediction result. In addition, training the SLD joint model to minimize the mean absolute percentage error (MAPE) effectively guides the training process and improves model quality. The invention can therefore predict travel time more accurately, saving users' time and relieving traffic pressure.

Drawings

FIG. 1 is a schematic diagram of a vehicle arrival time prediction framework based on representation learning.

Detailed Description

Embodiment 1: this embodiment is described with reference to FIG. 1.

The vehicle arrival time prediction method based on representation learning according to this embodiment comprises the following steps:

Step one, data acquisition and feature extraction:

This embodiment uses the real-time traffic dynamics system data from the ACM SIGSPATIAL GISCUP 2021 competition data to build a high-dimensional feature map for the location-based system. Features are extracted from the real-time traffic dynamics system data and comprise spatial information, temporal information, traffic information and other discrete semantic information. "High-dimensional" here means that each location is described by a vector containing multiple features, which may come from different data sources or information types such as space, time, traffic and other discrete semantic information.

The data used in the invention are ride-hailing trajectory data from Shenzhen for August 2021 and September 1, 2021. Collection of a pick-up sample starts when the driver responds to the customer's request and ends when the driver picks up the customer. Collection of a trip sample starts when the passenger boards the vehicle and ends when the passenger reaches the destination. Vehicle trajectories collected from real scenes can be extremely complex and contain multiple types of routes, so accurately predicting the travel time along such a route is a challenging problem. The final raw data were obtained after removing samples with abnormal travel times. The August data were then used as the training set and the September 1 data as the test set.

Step two, constructing and training a model, and predicting:

Based on the extracted features, the ETA (Estimated Time of Arrival) learning problem is rewritten in a more standard machine learning form.

First, the ground-truth travel time labels of all trip samples are denoted by $y = [y_1, y_2, \ldots, y_N]$. Each label is a continuous value or a discrete time period representing the actual travel time of one trip sample, and the labels are organized into a vector.

Each label is the time interval between the arrival time $e_i$ and the departure time $s_i$, that is, $y_i = e_i - s_i \in \mathbb{R}_+$. In this embodiment, each trip is divided into a series of road segments, the travel time of each segment is calculated, and the segment times are summed to obtain the total travel time, which is one sample $y_i$. The segment times are summed because each trip in the data set used in the experiment is divided into several segments; in practical applications, the total travel time can also be obtained directly by subtracting the departure time from the end time of the whole trip.
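The two ways of obtaining a label described above (summing segment times, or subtracting the departure time from the arrival time) can be sketched as follows. The function names and the example timestamps are hypothetical and are not taken from the data set.

```python
from datetime import datetime


def travel_time_from_endpoints(departure, arrival):
    """Label y_i = e_i - s_i in seconds; must be strictly positive."""
    seconds = (arrival - departure).total_seconds()
    if seconds <= 0:
        raise ValueError("arrival must be later than departure")
    return seconds


def travel_time_from_segments(segment_seconds):
    """Label y_i as the sum of the per-segment travel times."""
    return float(sum(segment_seconds))


# Hypothetical trip: departs 08:00:00, arrives 08:25:30, split into three segments
departure = datetime(2021, 8, 1, 8, 0, 0)
arrival = datetime(2021, 8, 1, 8, 25, 30)
segments = [300.0, 720.0, 510.0]

label_direct = travel_time_from_endpoints(departure, arrival)
label_summed = travel_time_from_segments(segments)
print(label_direct, label_summed)
```

When the segments cover the whole trip without gaps, both routes give the same label, here 1530 seconds.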

The whole trip corresponding to one order in the data set is taken as one sample, and $X = [x_1, x_2, \ldots, x_N]$ denotes the feature matrix of the samples, where $x_i \in \mathbb{R}^d$ is the encoding vector of each sample (in practice, the concatenation of the first to third representation vectors).

The invention aims to train a model that can accurately predict the travel time for future, previously unseen data.

Statistics on existing Didi ride-hailing order information show that users' acceptance and satisfaction are related to the average difference between the time estimated by the model and the total time of the user's planned trip. The mean absolute percentage error (MAPE) is therefore a relatively more accurate, reasonable and reliable metric. MAPE is computed by calculating, for each sample, the percentage error between the actual arrival time and the predicted arrival time, and then averaging the absolute values of these percentage errors over all samples. The training objective is to minimize MAPE.

However, research has found that minimizing the plain MAPE carries a risk of overfitting. The invention therefore introduces additional regularization terms to avoid overfitting; once overfitting occurs, training must be stopped immediately and the model modified, thereby constraining the complexity of the model. After this modification, the optimization objective is expressed as follows:

$$\min_f \; \frac{1}{N}\sum_{i=1}^{N}\frac{\lvert y_i - f(x_i)\rvert}{y_i} + \Omega(f)$$

where $x_i$ is the feature vector corresponding to the i-th input sample, $i = 1, 2, \ldots, N$, N is the number of samples, $y_i$ is the actual travel time of the sample, and $f(x_i)$ is the prediction for sample $x_i$. That is, for the feature vector $x_i$ of each sample, the term $\lvert y_i - f(x_i)\rvert / y_i$ is computed in turn, and the average of these terms is added to $\Omega(f)$, a regularization term that controls model complexity and prevents overfitting.

The regularization term is learned here using a gradient boosting decision tree (GBDT). In GBDT, the prediction model can be expressed as an additive tree model:

$$f(x_i) = \sum_{t=1}^{T} f_t(x_i)$$

where $x_i$ is the feature vector of the input sample, $f_t(x_i)$ is the output of the t-th decision tree, corresponding to the t-th iteration, and T is the number of trees.
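To make the additive tree model concrete, the sketch below sums the outputs of a few one-split trees (stumps). The stump thresholds, leaf values and feature layout are hypothetical; a real GBDT would learn each tree by fitting the residuals of the previous ones, which is not shown here.

```python
def make_stump(feature_index, threshold, left_value, right_value):
    """Build a one-split decision tree as a callable f_t(x)."""
    def stump(x):
        return left_value if x[feature_index] <= threshold else right_value
    return stump


def additive_predict(trees, x):
    """Additive tree model: f(x) = sum over t of f_t(x)."""
    return sum(tree(x) for tree in trees)


# Hypothetical features: x[0] = trip length in km, x[1] = rush-hour flag
trees = [
    make_stump(0, 5.0, 600.0, 1200.0),   # base travel time in seconds
    make_stump(1, 0.5, 0.0, 180.0),      # rush-hour correction
    make_stump(0, 10.0, 0.0, 300.0),     # long-trip correction
]
x = [8.0, 1.0]
prediction = additive_predict(trees, x)
print(prediction)
```

For this input the three trees contribute 1200, 180 and 0 seconds, so the prediction is 1380 seconds.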

The optimization problem of the model is a convex optimization problem. However, because the MAPE function is not differentiable everywhere, the objective is non-smooth. The invention solves the optimization problem with the subgradient method.
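The following is a minimal sketch of a subgradient method applied to MAPE. It assumes a simple linear predictor and a diminishing step size, neither of which is specified in the source; the regularization term is omitted, and all names and data are hypothetical. Because subgradient steps do not guarantee descent, the best iterate is tracked.

```python
def predict(weights, x):
    """Linear predictor f(x) = w . x (an assumption made for illustration)."""
    return sum(w * xi for w, xi in zip(weights, x))


def mape(weights, samples, targets):
    return sum(abs(y - predict(weights, x)) / y
               for x, y in zip(samples, targets)) / len(samples)


def mape_subgradient(weights, samples, targets):
    """One subgradient of MAPE with respect to the weights."""
    n = len(samples)
    grad = [0.0] * len(weights)
    for x, y in zip(samples, targets):
        residual = predict(weights, x) - y
        # sign(residual); 0 is a valid subgradient choice at the kink
        direction = (residual > 0) - (residual < 0)
        for j, xj in enumerate(x):
            grad[j] += direction * xj / (y * n)
    return grad


def subgradient_fit(samples, targets, steps=200, initial_step=1.0):
    weights = [0.0] * len(samples[0])
    best_weights, best_score = list(weights), mape(weights, samples, targets)
    for k in range(1, steps + 1):
        grad = mape_subgradient(weights, samples, targets)
        step = initial_step / k ** 0.5          # diminishing step size
        weights = [w - step * g for w, g in zip(weights, grad)]
        score = mape(weights, samples, targets)
        if score < best_score:                  # keep the best iterate seen
            best_weights, best_score = list(weights), score
    return best_weights, best_score


# Hypothetical data: first feature is a constant bias input
samples = [[1.0, 2.0], [1.0, 4.0], [1.0, 6.0]]
targets = [5.0, 9.0, 13.0]
best_weights, best_score = subgradient_fit(samples, targets)
print(best_weights, best_score)
```

Starting from zero weights the MAPE is exactly 1.0, and the tracked best value drops well below that within a few iterations.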

In a large number of practical engineering applications, GBDT, SVR, ordinary neural networks and random forests have not been the optimal choice for the ETA problem. The model shown in FIG. 1 (the affine transformation applied before data input is omitted in the figure) is therefore constructed as the prediction model and is called the SLD joint model; its processing comprises the following steps:

(1) Graph representation processing of road network:

Spatial information: the time required to travel a route is closely related to the length of the main roads the vehicle drives on in the city and to the size of the geographic area. The main road information of the city therefore needs to be abstracted, the spatial feature set of the road network information sequence extracted, and the road network matrix constructed.

First, the trajectory information of travelling vehicles is mapped onto the basic road network structure; after mapping, the road each vehicle passes along is divided into a series of road segments. Each intersection is abstracted as a node, and an important feature of the road network, the intersection type, is extracted. Three intersection types are considered here: crossroads, T-shaped intersections and Y-shaped intersections, denoted 1, 2 and 3 respectively. In the final road network adjacency matrix, 0 therefore indicates that the nodes (in practice, intersections) corresponding to the row and column of the matrix entry are not connected; for connected nodes, because different intersection types cause different traffic pressure, the three intersection types are distinguished by 1, 2 and 3. The intersection type matrix A is obtained through these steps.
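One possible way to build such a matrix is sketched below. The source does not state which node's type is stored in an entry that links two connected nodes; the sketch assumes that entry (i, j) holds the type code of the node being entered (node j). That convention, the function name and the example network are all assumptions.

```python
# Type codes from the description: crossroad = 1, T-shaped = 2, Y-shaped = 3
INTERSECTION_CODES = {"cross": 1, "T": 2, "Y": 3}


def build_intersection_matrix(node_types, edges):
    """Adjacency-style matrix: 0 = not connected, otherwise a type code.

    Assumption: entry (i, j) stores the type code of node j.
    """
    n = len(node_types)
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = INTERSECTION_CODES[node_types[j]]
        matrix[j][i] = INTERSECTION_CODES[node_types[i]]
    return matrix


# Hypothetical trajectory passing four intersections in sequence
node_types = ["cross", "T", "Y", "cross"]
edges = [(0, 1), (1, 2), (2, 3)]
A = build_intersection_matrix(node_types, edges)
for row in A:
    print(row)
```

Unconnected pairs stay 0, and every nonzero entry is one of the three type codes.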

In addition, a supplementary information matrix B is obtained for the same nodes. The supplementary information matrix B has the same node order and correspondence as the intersection type matrix A and has size N×M, where M is the number of feature attributes of each node; in this embodiment M is 2, and the attributes are traffic flow information and important information points;

The traffic flow information: the value is 1 if the traffic flow at the current node exceeds the average traffic flow of all nodes on the same day, and 0 otherwise;

In the training stage, the traffic flow information is obtained from known flow data: the traffic flow (number of vehicles) passing through each node in the study area on the day and the corresponding average traffic flow (average number of vehicles) over all nodes are first counted; if the traffic flow (number of vehicles) through the current node on that day exceeds the average traffic flow of all nodes on that day, the traffic flow information of the node is 1, otherwise it is 0; the encoder is trained on this information.

In the prediction stage, a single-node traffic flow prediction model is first used to predict the traffic flow of the current node at a given time on the current date (the day to be predicted); at the same time, an all-node average traffic flow prediction model is used to predict the average traffic flow of all nodes on the current date;

The single-node traffic flow prediction model is a model that can predict the traffic flow of a single node at a given time on a given day. The model itself is not the focus of the invention: any model that can make this prediction may be used, and suitable model types belong to the prior art, including but not limited to neural network models and linear regression models. For example, with a neural network model, input information is fed into the model to predict the traffic flow of a node at a given time on the current date; the input information includes but is not limited to the date, the time, the node code (node ID) and the weather, and may also include the traffic flow of the current node at one or more previous times. The model type, specific structure and construction process are determined according to the required prediction accuracy and the actual situation, and training can be implemented with the prior art, so they are not described further here.

Similarly, the all-node average traffic flow prediction model is a model that can predict the average traffic flow of all nodes on a given date. This model is not the focus of the invention either: any model that can make this prediction may be used, and suitable model types belong to the prior art, including but not limited to neural network models and linear regression models. For example, with a neural network model, input information is fed into the model to predict the average traffic flow of all nodes on a given date; the input information includes but is not limited to the date, the node codes (node IDs), the weather and the number of nodes, and may also include the average traffic flow of all nodes on one or more days before the current date. The model type, specific structure, construction process and training process are determined according to the required prediction accuracy and the actual situation and can be implemented with the prior art, so they are not described further here.

It should additionally be noted that, in use, the invention can be implemented by predicting with the single-node traffic flow prediction model and the all-node average traffic flow prediction model. This embodiment, however, is based on the ACM SIGSPATIAL GISCUP 2021 competition data, in which the traffic flow of a single node at a given time on a given day and the average traffic flow of all nodes on a given date are known. Constructing the two traffic flow prediction models would require additional effort for model selection, construction and training, and their accuracy would affect the overall prediction accuracy of the invention. This embodiment therefore uses the competition data directly and obtains both quantities in the same way as for the training set used to train the SLD joint model.

The important information points: if there are important places such as a shopping mall or a school around the current node, the important information point value of the node is 1, otherwise it is 0.
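The construction of the N×2 supplementary information matrix B from these two binary attributes can be sketched as follows. The flow counts, the daily average and the set of nodes near important places are hypothetical inputs; how "around the node" is measured is not specified in the source and is left to the caller here.

```python
def build_supplementary_matrix(node_flows, daily_average_flow, nodes_near_poi):
    """Return an N x 2 matrix of [traffic flow flag, important place flag]."""
    rows = []
    for index, flow in enumerate(node_flows):
        # 1 if the node's flow exceeds the same-day average over all nodes
        flow_flag = 1 if flow > daily_average_flow else 0
        # 1 if an important place (mall, school, ...) is near the node
        poi_flag = 1 if index in nodes_near_poi else 0
        rows.append([flow_flag, poi_flag])
    return rows


# Hypothetical vehicle counts for four nodes, in the same node order as matrix A
node_flows = [120, 80, 200, 40]
daily_average_flow = 110.0
nodes_near_poi = {1, 2}
B = build_supplementary_matrix(node_flows, daily_average_flow, nodes_near_poi)
print(B)
```

Each row of B corresponds to the node at the same position in the node sequence used for matrix A.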

For the intersection type matrix A and the supplementary information matrix B, this embodiment experimented with the SDNE algorithm, the Graph2Vec algorithm and others, and finally selected SDNE, which performed best. SDNE is an algorithm for learning deep representations of graph structures. Its encoder maps nodes into a low-dimensional space, and its decoder reconstructs the nodes, finally yielding the road network matrix.

The encoder maps the intersection type matrix and the supplementary information matrix into a low-dimensional space as follows:

$$Z = f_1(B, A)$$

where $f_1$ is the feature function of the encoder; B is the supplementary information matrix, one input to the input layer; and A is the intersection type matrix, the other input to the input layer.

The decoder reconstructs the road network matrix from the mapped representation as follows:

$$\hat{A} = \sigma(WZ)$$

where $\hat{A}$ is the reconstructed road network matrix, $\sigma$ is an activation function, the weight matrix W is the parameter of the decoder, and Z is the low-dimensional representation output by the encoder.

In terms of the optimization objective, SDNE minimizes an objective function, which comprises a reconstruction error term and a regularization term, using an optimization algorithm such as stochastic gradient descent. SDNE also considers both the local and the global structure of nodes: the local structure is the connection between a node and its neighbors in the graph, and the global structure is the complex connection pattern among nodes across the whole graph.

The low-dimensional representation of the node is extracted from the encoder, passed through the decoder to obtain the road network matrix, and then available for subsequent tasks.

The theoretical process of SDNE training is described above, and in our specific experiments, we have improved on the basis of the SDNE algorithm, making the algorithm more suitable for experimental scenarios.

The original SDNE algorithm uses sigmoid as the activation function, which can suffer from vanishing gradients in some cases. PReLU is used here as the activation function instead; it lets the slope for negative inputs be a learnable parameter rather than a fixed value:

$$\mathrm{PReLU}(u_i) = \max(0, u_i) + a_i \min(0, u_i)$$

where $u_i$ is the input to the activation and $a_i$ is a learnable parameter that introduces a small slope when the input is negative, instead of truncating it completely as ReLU does. This makes PReLU more flexible in handling negative inputs.
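A minimal sketch of the PReLU activation on scalar inputs follows. The slope value 0.25 is a hypothetical fixed value used only for illustration; in training it would be a learnable parameter.

```python
def prelu(value, slope):
    """PReLU: identity for positive inputs, slope * value otherwise."""
    return value if value > 0 else slope * value


# slope = 0.25 is a hypothetical value; in practice it is learned
outputs = [prelu(v, 0.25) for v in [-2.0, 0.0, 3.0]]
print(outputs)
```

The negative input is scaled to -0.5 rather than being cut to zero, while zero and positive inputs pass through unchanged.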

The autoencoder structure in the SDNE algorithm consists of two parts, an encoder and a decoder, each comprising an input layer, a hidden layer and an output layer. The encoder takes A and B as the input of its input layer, uses a fully connected layer as its hidden layer, and generates the node representation vectors at its output layer; the decoder takes the node representation vectors generated by the encoder as the input of its input layer, typically uses a fully connected layer as its hidden layer, and outputs at its output layer a representation matrix that reconstructs the urban road network. Convolutional layers are introduced here into the hidden layer part of both the encoder and the decoder to better capture spatial features: they are added before the original fully connected layer in the hidden layer, 8 convolutional layers are used, and each uses a convolution kernel of size (3, 5) to preserve finer spatial information. The convolutional layers are introduced as follows:

$$h_i = \sigma(\mathrm{Conv}(B, A))$$

where Conv(B, A) denotes performing a convolution operation on the supplementary information matrix B and the road network matrix A, and $\sigma$ is the activation function, here PReLU;

The output of the modified SDNE is fed into the MLP fully connected neural network to obtain a representation vector, denoted the first representation vector.

(2) Representation learning process of vehicle trajectory:

Each trajectory consists of a sequence of nodes, where the nodes are those formed by the intersections of the abstracted road network, so the driving trajectory of a vehicle can be represented by this sequence. The information of each node forms a vector containing: the node position coordinates, the node type (1, 2 or 3, as assigned when constructing the road network above) and the speed through the node (the average speed of the vehicle within 20 meters before and after the node). The timestamp of each node is also obtained; during model training the timestamp indicates the order in which the nodes are passed, and in actual use the order of the nodes along the direction of travel in the route can be used directly. The node vectors are arranged in chronological order, so that the timestamps of the node sequence are increasing, and are processed as input by the improved LSTM algorithm.
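The ordering step can be sketched as below: node observations are sorted by timestamp, the increasing order is checked, and each node is turned into a vector of coordinates, type and speed. The dictionary keys and the sample values are hypothetical.

```python
def order_node_vectors(observations):
    """Sort node observations by timestamp and return their feature vectors."""
    ordered = sorted(observations, key=lambda obs: obs["timestamp"])
    timestamps = [obs["timestamp"] for obs in ordered]
    # The LSTM input requires strictly increasing timestamps
    if any(a >= b for a, b in zip(timestamps, timestamps[1:])):
        raise ValueError("node timestamps must be strictly increasing")
    # Vector layout: [x coordinate, y coordinate, node type, speed]
    return [[obs["x"], obs["y"], obs["type"], obs["speed"]] for obs in ordered]


# Hypothetical observations, deliberately out of order
observations = [
    {"timestamp": 30, "x": 114.06, "y": 22.55, "type": 2, "speed": 8.5},
    {"timestamp": 10, "x": 114.05, "y": 22.54, "type": 1, "speed": 11.0},
    {"timestamp": 20, "x": 114.055, "y": 22.545, "type": 3, "speed": 9.2},
]
sequence = order_node_vectors(observations)
for vector in sequence:
    print(vector)
```

The returned list is what would be fed to the LSTM step by step, earliest node first.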

The LSTM algorithm is a variant of the recurrent neural network (RNN) that aims to solve the vanishing and exploding gradient problems of conventional RNNs. By introducing a memory cell and three gates (input gate, forget gate and output gate), LSTM effectively captures and retains long-term dependencies in a sequence. At each step, the input gate, forget gate, output gate and modulated input are updated as follows:

$$i_t = \sigma(W_i [x_t; h_{t-1}] + b_i)$$

$$f_t = \sigma(W_f [x_t; h_{t-1}] + b_f)$$

$$o_t = \sigma(W_o [x_t; h_{t-1}] + b_o)$$

$$g_t = \tanh(W_g [x_t; h_{t-1}] + b_g)$$

where $x_t$ is the information vector of the current node; $h_{t-1}$ is the hidden state of the previous step; $W_i$, $W_f$, $W_o$ and $W_g$ are the weight matrices for the input gate, forget gate, output gate and memory cell updates, respectively; $b_i$, $b_f$, $b_o$ and $b_g$ are the corresponding bias vectors; $\odot$ denotes element-wise multiplication; tanh denotes the hyperbolic tangent activation function; and $i_t$, $f_t$, $o_t$ and $g_t$ denote the output of the input gate, the output of the forget gate, the output of the output gate and the current candidate state of the memory cell, respectively.
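A single LSTM step can be sketched in plain Python as below. The gate computations follow the four equations above; the cell and hidden state updates use the standard LSTM form (c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t and h_t = o_t ⊙ tanh(c_t)), which is assumed here because the source text does not show them. The weights and the node vector are hypothetical.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def affine(weight, bias, vector):
    """Compute W v + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weight, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    concat = x_t + h_prev  # [x_t; h_{t-1}]
    i_t = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], concat)]
    f_t = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], concat)]
    o_t = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], concat)]
    g_t = [math.tanh(v) for v in affine(params["W_g"], params["b_g"], concat)]
    # Standard LSTM state updates (assumed; not shown in the source)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical setup: 4-dimensional node vector, hidden size 2
hidden_size, input_size = 2, 4
params = {}
for gate in ("i", "f", "o", "g"):
    params[f"W_{gate}"] = [[0.1] * (input_size + hidden_size)
                           for _ in range(hidden_size)]
    params[f"b_{gate}"] = [0.0] * hidden_size

x_t = [0.2, 0.4, 1.0, 0.6]  # normalized x, y, node type, speed (illustrative)
h_t, c_t = lstm_step(x_t, [0.0] * hidden_size, [0.0] * hidden_size, params)
print(h_t, c_t)
```

Running the step over the ordered node sequence and keeping the final hidden state would give one fixed-length vector per trajectory.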

To better fit the domain and data characteristics of the invention, the LSTM hyperparameters are adjusted. The learning rate is tuned so that the model can explore more directions in parameter space: the initial learning rate is set to 0.01 and is gradually decreased during the experiments to better balance convergence speed and stability across training stages. An exponential decay strategy is used, in which the learning rate decreases according to an exponential function:

$$\eta_t = \eta_0 \, e^{-\mathrm{decay\_rate} \cdot t}$$

The Dropout rate is initially set to 0.2. So that the network can better learn features at different levels, different Dropout rates are used in different layers: a larger rate (0.5) in the middle hidden layers and a smaller rate (0.2) in the last hidden layer.

Finally, one representation vector is output through the LSTM algorithm, and the representation vector is marked as a second representation vector.

(3) Representation learning process of other discrete semantic information:

In addition to the information listed above, there are some semantic information that can affect travel route selection and travel time, including: personalized information, track total length, departure time, weather condition and air temperature, and the information is discrete semantic information. The personalized information here refers to that because different drivers may have different preferences for route selection and driving habits have larger differences, some drivers may like straight lines and some drivers do not mind to get away from some small roads with more turns in order to save time, so that personalized information is introduced into other semantic information, each user corresponds to a license plate number of the vehicle driven by the user, and the driving preference of each user, namely the personalized information, is memorized through the license plate number so as to take the personalized information as one of the basis when the driving route is recommended for the driver. Furthermore, for weather conditions, we divide it into 1-5 classes, with the number of weeks, the highest air temperature and the lowest air temperature recorded.

In the present invention, each piece of discrete semantic information includes: total track length, departure time (discretized into 5-minute intervals, giving 288 values per day), weather condition (classes 1-5, represented by the numbers 1-5), day of the week, highest air temperature, lowest air temperature, and vehicle ID.
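The record layout above can be sketched as follows. The field names, the 0-based slot index, the Monday = 1 weekday convention and the sample values are all assumptions for illustration; the text specifies only which quantities are recorded and that departure time uses 288 five-minute values per day.

```python
from datetime import datetime

SLOTS_PER_DAY = 288  # 24 h * 60 min / 5 min per slot


def departure_slot(ts):
    """Map a timestamp to its 5-minute slot index in the range 0..287."""
    return (ts.hour * 60 + ts.minute) // 5


def encode_record(track_length, departure, weather_class,
                  max_temp, min_temp, vehicle_id):
    """Build one discrete-semantic-information record (field names are hypothetical)."""
    if weather_class not in (1, 2, 3, 4, 5):
        raise ValueError("weather_class must be an integer from 1 to 5")
    return {
        "track_length": track_length,
        "departure_slot": departure_slot(departure),
        "weather_class": weather_class,
        "day_of_week": departure.isoweekday(),  # assumed: 1 = Monday ... 7 = Sunday
        "max_temp": max_temp,
        "min_temp": min_temp,
        "vehicle_id": vehicle_id,
    }


# Made-up sample trip, used only to exercise the encoder.
record = encode_record(12.5, datetime(2024, 1, 22, 8, 47), 2, 3.0, -9.0, "VEHICLE_001")
```

Discretizing the departure time keeps it compatible with the embedding-based treatment of the other categorical fields.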

The invention processes this information with a modified xDeepFM algorithm. xDeepFM is a recommendation algorithm that combines a factorization machine with a deep neural network. It aims to handle high-dimensional sparse data better by capturing both low-order and high-order feature interactions. It mainly comprises an embedding layer, a combination layer, a pooling layer and a fully connected layer.

The combination layer uses a factorization machine to capture low-order cross information through the pairwise interaction term:

Σ_{i=1}^{n} Σ_{j=i+1}^{n} ⟨v_i, v_j⟩ x_i x_j

where n is the number of features, v_i and v_j are the hidden vectors of the ith and jth features, and ⟨v_i, v_j⟩ is the inner product of v_i and v_j; the larger this value, the stronger the interaction between the ith and jth features. x_i and x_j are the values of the ith and jth features of the input feature vector.
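The pairwise interaction described here, the sum over feature pairs of ⟨v_i, v_j⟩ x_i x_j, can be sketched in plain Python. The function names and the toy values are hypothetical. The second function uses the well-known factorization-machine identity that reduces the cost from quadratic to linear in the number of features.

```python
def fm_pairwise(x, v):
    """Direct form: sum over i < j of <v_i, v_j> * x_i * x_j."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            inner = sum(a * b for a, b in zip(v[i], v[j]))
            total += inner * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Equivalent form: 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(v[0])
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        total += s * s - s_sq
    return 0.5 * total


# Toy input: three features with 2-dimensional hidden vectors (made-up values).
x = [1.0, 0.0, 2.0]
v = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]
interaction = fm_pairwise(x, v)
```

Because the second feature is zero, only the pair (1, 3) contributes here, which illustrates why this term is cheap on sparse inputs.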

To better fit the experimental scenario, the network structure of the xDeepFM algorithm is adjusted in two ways. First, the dimension of the embedding layer is increased from the initial 16 to 50; a higher-dimensional embedding carries richer information, so the model can better capture complex relationships in the input data. Second, a batch normalization layer is added after the fully connected layer so that the values it passes on are normalized, which helps prevent exploding or vanishing gradients in the intermediate layers of the network and improves the training speed and stability of the model. Batch normalization normalizes the data of each mini-batch so that its mean is close to 0 and its standard deviation is close to 1:

BN(x) = γ * (x - μ) / σ + β

where μ is the mean of the mini-batch, σ is the standard deviation of the mini-batch, and γ and β are the learnable scaling and shift parameters. For each mini-batch input x, BN(x) is obtained by normalization.

The normalized features are scaled and shifted so that the network can learn a feature representation suited to the current task.

The representation vector produced by the xDeepFM algorithm is marked as the third representation vector.
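The batch normalization step described in this part, normalizing each mini-batch with its mean μ and standard deviation σ and then applying the learnable scale γ and shift β, can be sketched for a single feature as follows. The small constant `eps` is an assumption added for numerical stability and does not appear in the text; the function name and sample values are hypothetical.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    Computes gamma * (x - mu) / sigma + beta, where sigma includes a small
    eps (an assumption, not from the text) to avoid division by zero.
    """
    n = len(batch)
    mu = sum(batch) / n
    var = sum((x - mu) ** 2 for x in batch) / n
    sigma = math.sqrt(var + eps)
    return [gamma * (x - mu) / sigma + beta for x in batch]


normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
shifted = batch_norm([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=3.0)
```

With γ = 1 and β = 0 the output has mean close to 0 and standard deviation close to 1; other values of γ and β let the network recover whatever scale suits the task.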

(4) Processing by the time prediction module:

In the three steps above, each module makes its prediction with a suitable model algorithm, and three representation vectors of the same size are finally obtained, namely the first to third representation vectors. The three vectors are concatenated along the channel dimension, a convolution operation is applied by a 1D ResNet, and the predicted vehicle arrival time is then output after a CBAM attention module and MLP processing. A loss function based on the mean absolute percentage error (MAPE) is used during training.
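As a minimal sketch of the MAPE criterion used for training, the function below averages the absolute percentage errors between actual and predicted arrival times. The function name and the sample values are hypothetical, and rejecting zero actual values is a design choice of this sketch rather than something the text specifies.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up travel times in seconds: each prediction is off by 10 percent.
loss = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the actual value, short and long trips contribute on the same relative scale.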

ResNet is a deep learning architecture designed to address problems such as vanishing and exploding gradients when training deep neural networks. Its core idea is the residual connection: shortcut connections pass information directly across layers, which helps train deeper networks. CBAM is a convolutional neural network module that strengthens the model's attention to important features, thereby improving network performance. It combines a channel attention mechanism with a spatial attention mechanism.
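The residual idea can be illustrated with a simplified 1D block applied to three equally sized vectors stacked as channels. This is only a sketch under stated assumptions: the convolution is applied to each channel separately with one shared kernel (a real ResNet mixes channels and learns its kernels), the CBAM stage is omitted, and all names and values are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that preserves the signal length."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def residual_block(channels, kernel):
    """Per channel: y = relu(x + conv(relu(conv(x)))); the `x +` is the shortcut."""
    out = []
    for x in channels:
        fx = conv1d_same(relu(conv1d_same(x, kernel)), kernel)
        out.append(relu([a + b for a, b in zip(x, fx)]))
    return out


# Three toy representation vectors of equal size, stacked as channels.
first, second, third = [1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [2.0, 0.0, 1.0]
stacked = [first, second, third]
features = residual_block(stacked, kernel=[0.0, 1.0, 0.0])
```

With an all-zero kernel the convolution branch contributes nothing and the block returns its (non-negative) input unchanged, which is the property that lets gradients flow through deep stacks of such blocks.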

To verify and demonstrate the effect of the model of the invention, a comparative test was also carried out. Predictions were made on the same data with traditional deep learning algorithms, and the results were compared with those of the present model. The comparison is shown in the following table:

TABLE 1

Compared with the traditional algorithms, the SLD joint model has the lowest mean absolute percentage error (MAPE). This shows an improvement in model quality: the new model handles the ETA estimation problem more effectively than the traditional deep learning models, achieves more accurate travel time prediction, saves users time and relieves traffic pressure to some extent.

The above examples merely describe the calculation model and calculation flow of the present invention in detail and do not limit its embodiments. Those of ordinary skill in the art can make other variations and modifications on the basis of the above description. The embodiments are not listed exhaustively here, and all such variations and modifications remain within the scope of the invention.

Claims

1. The vehicle arrival time prediction method based on representation learning is characterized in that firstly, traffic information of an area where a vehicle to be predicted is located is obtained, and then, prediction is carried out by adopting an SLD joint model;

The process of predicting by the SLD joint model comprises the following steps:

2. The vehicle arrival time prediction method based on representation learning according to claim 1, wherein the SLD joint model is trained, and the loss function of the SLD joint model during training is determined based on the mean absolute percentage error (MAPE); the MAPE is obtained by calculating, for each sample, the percentage error between the actual arrival time and the predicted arrival time, and then averaging the absolute values of the percentage errors over all samples.

3. The vehicle arrival time prediction method based on representation learning according to claim 2, wherein the loss function of the SLD joint model in the training process is specifically as follows:

4. A method for predicting vehicle arrival time based on representation learning according to claim 3, wherein the LSTM algorithm is pre-trained prior to the training of the SLD joint model, and wherein the LSTM algorithm is trained with a learning rate decreasing according to an exponential function during the pre-training process, specifically as follows:

η_t = η₀ * e^{-decay_rate*t}

5. A method for predicting vehicle arrival time based on representation learning according to any one of claims 1 to 4, wherein said SDNE algorithm has an autoencoder structure comprising an encoder and a decoder, each of the encoder and the decoder comprising an input layer, a hidden layer and an output layer; the hidden layer in the encoder and decoder of the autoencoder is replaced by a hidden layer into which a convolutional layer is introduced, i.e. a convolutional layer is introduced at the front end of the hidden layer in the encoder and decoder of the autoencoder, the activation function of the convolutional layer being PReLU.

6. A vehicle arrival time prediction method based on representation learning according to claim 5, wherein the number of convolutional layers introduced at the front end of the hidden layer in the encoder and decoder of the autoencoder is 8.

7. A vehicle arrival time prediction method based on representation learning according to claim 6, wherein a convolution process is performed using a convolution kernel of (3, 5) size in each of the convolution layers introduced by the front end of the hidden layer in the encoder and decoder in the automatic encoder.

8. The method of claim 7, wherein the hidden layer of the encoder and decoder in the autoencoder is a fully connected layer, and the hidden layer into which the convolutional layer is introduced is a convolutional layer + fully connected layer.

9. The vehicle arrival time prediction method based on representation learning according to claim 5, wherein the network structure of said xDeepFM algorithm comprises an embedding layer, a combination layer, a pooling layer and a fully connected layer; the fully connected layer is replaced by a fully connected normalization layer, i.e. a batch normalization layer is added after the fully connected layer.

10. The method of claim 9, wherein the dimension of the embedding layer in the network structure of the xDeepFM algorithm is set to 50.