CN116596170B - Intelligent prediction method for delivery time based on space-time attention mechanism

Info

Publication number: CN116596170B
Application number: CN202310875592.0A
Authority: CN
Inventors: 谢贻富; 谢飞; 李海松; 张少华
Original assignee: Hefei City Cloud Data Center Co ltd
Current assignee: Hefei City Cloud Data Center Co ltd
Priority date: 2023-07-18
Filing date: 2023-07-18
Publication date: 2023-09-22
Anticipated expiration: 2043-07-18
Also published as: CN116596170A

Abstract

The invention relates to the technical field of instant delivery time prediction, and in particular to an intelligent delivery time prediction method based on a space-time attention mechanism, comprising the following steps: calculating the cosine similarity between nodes and constructing a neighbor-relation feature data set; calculating node selection probabilities with a deep neural network and determining the delivery sequence; embedding the delivery route sequence with graph embedding, extracting features with a convolutional neural network, and calculating attention weights; inputting the route node features into a gated recurrent unit (GRU) to extract the temporal correlation among the features; and multiplying the attention weights by the GRU output to obtain a final feature vector, which is input into a multi-layer perceptron to predict the delivery time. In predicting instant delivery time, the method takes into account the influence of multiple features on delivery node selection, which improves the accuracy of sequence prediction, and uses the proposed space-time attention mechanism to capture how spatio-temporal relationships affect the duration of the delivery process, thereby achieving higher-precision prediction.

Description

Intelligent prediction method for delivery time based on space-time attention mechanism

Technical Field

The invention relates to the technical field of instant delivery time prediction, in particular to an intelligent delivery time prediction method based on a space-time attention mechanism.

Background

Instant delivery has grown rapidly in recent years as a key element supporting new retail models. Delivery time is the most intuitive indicator of service quality, and customers are highly sensitive to it, so the platform must reasonably estimate the total time an order will take before delivery in order to give the customer an accurate waiting time. Intelligent prediction of instant delivery time is therefore critical. Existing solutions to the time prediction problem include machine learning and deep learning: machine learning approaches typically use random forests, gradient boosting trees, and similar methods, while deep learning approaches predict with combinations of models such as recurrent neural networks and deep neural networks. The paper "A deep learning method for route and time prediction in food delivery service" proposes a long short-term memory (LSTM) model combined with an attention mechanism that predicts the next destination, predicts the travel time of that leg based on the prediction, and iterates to predict the order of all delivery nodes and the time of each leg (GAO C L, ZHANG F, WU G Q, et al. 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21). New York: Association for Computing Machinery, 2021: 2879-2889).

The paper "Supervised learning for arrival time estimations in restaurant meal delivery" proposes a delivery time prediction model that combines a DNN with a gradient boosting decision tree: the DNN predicts the order of the nodes, and the gradient boosting decision tree predicts the delivery time (HILDEBRANDT F D, et al. Transportation Science, 2022, 56(4): 1058-1084). Although these methods advance instant delivery time prediction and perform well, they still have shortcomings. When multiple orders are predicted simultaneously, existing methods usually predict segment by segment, and summing the segment predictions into the final result accumulates errors, which increases the likelihood of deviation. For feature learning, existing methods use models such as gradient boosting trees, LSTM, and attention mechanisms, but there is still room for improvement in learning the spatio-temporal relationships within the features and the associations between features.

Disclosure of Invention

To avoid and overcome the technical problems of the prior art, the invention provides an intelligent prediction method for delivery time based on a space-time attention mechanism. The invention learns different features through a convolutional neural network and a recurrent neural network, and combines them with an attention mechanism to capture the influence of the spatio-temporal relationships among delivery nodes on time, thereby improving the accuracy of instant delivery time prediction.

In order to achieve the above purpose, the present invention provides the following technical solutions:

An intelligent prediction method for delivery time based on a space-time attention mechanism comprises the following steps:

S1, acquiring order features of an instant delivery order, the order features comprising a merchant address and a customer address; taking the merchant address and the customer address as delivery nodes, and collecting node features of the delivery nodes;

S2, predicting the traversal order of the delivery nodes according to the node features, and recording the traversal order as a delivery route;

S3, obtaining route features of the delivery route;

S4, predicting the delivery time of each instant delivery order according to the route features.

As a further aspect of the invention, step S1 specifically comprises:

S11, acquiring the order features of each instant delivery order, the order features comprising the order number, rider speed, merchant address, customer address, distance between merchant and customer, and time stamp;

S12, taking the merchant addresses and customer addresses as delivery nodes, collecting all delivery nodes to be traversed in the set delivery area, and storing them in a delivery node set $L = \{l_1, l_2, \ldots, l_{n_l}\}$, where $l_1$ denotes the 1st delivery node in $L$, $l_2$ denotes the 2nd delivery node in $L$, $l_{n_l}$ denotes the $n_l$-th delivery node in $L$, and $n_l$ is the size of $L$, i.e., the total number of delivery nodes in the delivery area;

S13, acquiring the node features of each delivery node according to the order features, the node features comprising the order number, rider speed, merchant address, customer address, distance between merchant and customer, and time stamp, and storing the node features in a node feature matrix $F_l$.

As a further aspect of the invention, step S2 specifically comprises:

S21, calculating the similarity between all delivery nodes with the cosine similarity formula, according to the node features of the delivery nodes stored in the node feature matrix $F_l$;

S22, taking a delivery node in the node feature matrix $F_l$ as the original node, and selecting, according to the similarity results, the other delivery node with the highest similarity to the original node as its adjacent node, wherein an original node that has already been used is excluded from the selection of the next original node;

S23, horizontally concatenating the node features of the original node and its adjacent node to obtain a spliced matrix, and overwriting the node features of the original node with the spliced matrix to update them; the spliced matrices of all delivery nodes in the set delivery area are stored in a spliced node feature matrix $F_P$;

S24, inputting the node features of each delivery node in the spliced node feature matrix $F_P$ into a neural network to obtain a score for each delivery node, the score indicating the probability that the delivery node is chosen as the delivery target, and storing the scores in a scoring matrix;

S25, defining a delivery node that has several adjacent delivery nodes as a decision point, and using a mask matrix of 0s and 1s to represent the state of each delivery node when the rider is at the decision point, according to the order completion status at that decision point; delivery nodes that have already been traversed, and delivery nodes of instant delivery orders whose goods have not yet been picked up, are marked 0, meaning the delivery node is unreachable; 1 means the delivery node is reachable;

S26, multiplying the mask matrix by the scoring matrix to obtain a masked scoring matrix containing the scores of the delivery nodes; inputting the scores in this matrix into the probability calculation function to obtain the selection probability of each delivery node, and selecting the delivery node with the highest selection probability as the predicted next location from the decision point;

S27, returning to step S21 and repeating steps S21 to S26 until the traversal order of all delivery nodes has been predicted, and recording the traversal order as the delivery route.

As a further aspect of the invention, the cosine similarity formula is as follows:

$$Sim(f_l, f_\varphi) = \frac{f_l \cdot f_\varphi}{\lVert f_l \rVert \, \lVert f_\varphi \rVert}$$

where $Sim$ denotes the cosine similarity function; $Sim(f_l, f_\varphi)$ denotes the similarity between the $l$-th delivery node and the $\varphi$-th delivery node; $f_l$ denotes the node feature vector of the $l$-th delivery node; and $f_\varphi$ denotes the node feature vector of the $\varphi$-th delivery node.
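By way of illustration only, the similarity calculation and neighbor selection of steps S21 to S23 may be sketched in Python as follows. The function names `cosine_similarity` and `most_similar_neighbor`, the numeric feature values, and the handling of all-zero vectors are assumptions made for the sketch and are not part of the disclosed method; categorical features such as addresses are assumed to have been encoded numerically beforehand.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_neighbor(features, origin):
    """Index of the node most similar to `origin`, excluding `origin` itself."""
    candidates = [i for i in range(len(features)) if i != origin]
    return max(candidates,
               key=lambda i: cosine_similarity(features[origin], features[i]))


# Hypothetical, already-numeric node features (one row per delivery node).
node_features = [
    [1.0, 0.0, 2.0],
    [2.0, 0.1, 4.0],
    [0.0, 3.0, 0.5],
]
neighbor = most_similar_neighbor(node_features, 0)
# Horizontal splice of the original node's features with its neighbor's (S23).
spliced = node_features[0] + node_features[neighbor]
```

In this sketch the spliced row is twice as long as an original feature row, which is the width the scoring network of step S24 would then receive.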

As a further aspect of the invention, the probability calculation function is as follows:

$$Softmax(s_t^l) = \frac{\exp(s_t^l)}{\sum_{c=1}^{n_l} \exp(s_t^c)}$$

where $Softmax$ denotes the normalization function; $s_t^l$ denotes the score of the $l$-th delivery node at the $t$-th decision point; $s_t^c$ denotes the score of the $c$-th delivery node at the $t$-th decision point; and $n_l$ denotes the total number of delivery nodes in the set delivery area.
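As a non-limiting sketch, the masking and probability calculation of steps S25 and S26 might look as follows in Python. The scores and mask are hypothetical values. The sketch follows the text literally by multiplying scores by the 0/1 mask before the softmax; because a masked score of 0 still receives a non-zero softmax probability, the sketch additionally restricts the final choice to reachable nodes, which is an assumption not stated in the text.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_next_node(scores, mask):
    """Mask the scores, apply softmax, and pick the most probable reachable node."""
    masked_scores = [s * m for s, m in zip(scores, mask)]
    probs = softmax(masked_scores)
    reachable = [i for i, m in enumerate(mask) if m == 1]
    if not reachable:
        raise ValueError("no reachable delivery node")
    # Assumption: only nodes whose mask entry is 1 may be chosen.
    best = max(reachable, key=lambda i: probs[i])
    return best, probs


scores = [2.0, 0.5, 3.0, 1.0]  # hypothetical network scores
mask = [1, 1, 0, 1]            # node 2 is unreachable at this decision point
best, probs = select_next_node(scores, mask)
```

An implementation that wants masked nodes to receive exactly zero probability could instead replace their scores with negative infinity before the softmax; that variant is likewise an assumption rather than something the text prescribes.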

As a further aspect of the invention, the loss function of the neural network is a cross-entropy function, calculated as follows:

$$L_{seq} = -\frac{1}{Z} \sum_{z=1}^{Z} \sum_{t=1}^{n_t^z} \log p_t$$

where $L_{seq}$ denotes the loss value of the loss function; $Z$ is the number of training samples of the neural network; $n_t^z$ denotes the number of decision points of the $z$-th task; and $p_t$ denotes the predicted probability of the location actually selected at the $t$-th decision point.
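For illustration, a cross-entropy loss over route decisions can be computed as in the following Python sketch. It assumes that each training task is supplied as the list of probabilities the network assigned to the location actually chosen at each decision point, and that the loss is averaged over the $Z$ tasks; both the input layout and the averaging are assumptions, and the name `sequence_cross_entropy` is hypothetical.

```python
import math


def sequence_cross_entropy(tasks):
    """Average over tasks of the summed negative log-probabilities.

    tasks: list of tasks; each task is a list holding, per decision point,
    the predicted probability of the location that was actually selected.
    """
    if not tasks:
        raise ValueError("at least one task is required")
    total = 0.0
    for probs in tasks:
        total += sum(-math.log(p) for p in probs)
    return total / len(tasks)


# Two hypothetical tasks with two and three decision points respectively.
loss = sequence_cross_entropy([[0.9, 0.8], [0.7, 0.6, 0.95]])
```

The loss is zero only when every actually selected location was predicted with probability 1, and it grows as the predicted probabilities of the true choices fall.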

As a further aspect of the invention, step S3 specifically comprises:

S31, constructing the route features of the delivery route from the node features of the delivery nodes on the delivery route; the route features comprise the order number, rider speed, merchant address, customer address, distance between merchant and customer, and time stamp, and are stored in a delivery route feature matrix $F^r$;

S32, storing the delivery nodes, in the order in which they are traversed along the delivery route, in a delivery node sequence set $U$.

As a further aspect of the invention, step S4 specifically comprises:

S41, acquiring the delivery nodes on the delivery routes of the instant delivery orders in the set delivery area and connecting them to one another, the line between two connected delivery nodes forming an edge, so as to form a delivery node network $G = (V, E)$, where $V$ denotes the set of delivery nodes and $E$ is the relationship matrix of the delivery nodes, $E = \{e_{ij} \mid 1 \le i \le j \le n_e\}$; $e_{ij}$ denotes the association between the $i$-th delivery node and the $j$-th delivery node in $G$ and takes the value 0 or 1, where 0 means no association and 1 means association; $n_e$ is the total number of nodes;

S42, calculating the transition probabilities between the delivery nodes in the delivery node sequence set $U$ with the transition probability formula;

S43, obtaining a random-walk node sequence over the delivery node set $V$ by second-order random walk, and vectorizing the delivery nodes in the random-walk node sequence with the Skip-gram method to obtain a delivery node vector matrix $E_u$; then using $E_u$ to vectorize the delivery node sequence set $U$ into a delivery node sequence feature matrix $F^u$;

S44, inputting the delivery node sequence feature matrix $F^u \in \mathbb{R}^{n_r \times n_d}$ into the convolutional neural network module, and applying $n_c$ convolution kernels of size $n_k \times n_d$ to $F^u$ with a stride of 1 to obtain the calculated values $a_x^k$; all calculated values $a_x^k$ of one kernel are concatenated to form a combined matrix $A_k$, where $1 \le k \le n_c$ and $1 \le x \le n_r - n_k + 1$; the calculated value $a_x^k$ and the combined matrix $A_k$ are expressed as follows:

$$a_x^k = \sigma\left(W_k \cdot F^u_{x:x+n_k-1} + b_k\right)$$

$$A_k = \left[a_1^k, a_2^k, \ldots, a_{n_r-n_k+1}^k\right]$$

where $\sigma$ is an activation function; $W_k$ and $b_k$ are trainable parameters; $F^u_{x:x+n_k-1}$ denotes the node features of the delivery nodes selected by the $k$-th convolution kernel in the $x$-th window, i.e., the $x$-th to $(x+n_k-1)$-th node features, the window size being $n_k$; $n_r \times n_d$ means that $F^u$ is a matrix with $n_r$ rows and $n_d$ columns; and $a_x^k$ denotes the output of the $k$-th convolution kernel at the $x$-th window;

S45, concatenating the combined matrices $A_k$ computed by the $n_c$ convolution kernels to obtain an output matrix $E_c \in \mathbb{R}^{(n_r-n_k+1) \times n_c}$, i.e., a matrix with $n_r-n_k+1$ rows and $n_c$ columns; setting the pooling layer size to $n_p \times 1$ with a stride of 1, and applying max pooling to $E_c$ to obtain a pooled output matrix $E_K \in \mathbb{R}^{(n_r-n_p-n_k+2) \times n_c}$, i.e., a matrix with $n_r-n_p-n_k+2$ rows and $n_c$ columns;

S46, setting a weight matrix $E_Q \in \mathbb{R}^{(n_r-n_k-n_p+2) \times n_c}$, i.e., a matrix with $n_r-n_k-n_p+2$ rows and $n_c$ columns, each entry of which is initialized with a random value; adding the weight matrix $E_Q$ to the pooled output matrix $E_K$, and passing the sum through a tanh activation function and a linear layer in turn to obtain an addition matrix $E_{kq} \in \mathbb{R}^{n_r \times n_c}$, i.e., a matrix with $n_r$ rows and $n_c$ columns; inputting $E_{kq}$ into $Softmax$ to obtain the attention weight matrix; the addition matrix $E_{kq}$ is expressed as follows:

$$E_{kq} = W_{kq} \tanh\left(E_Q + E_K\right) + b_{kq}$$

where $W_{kq}$ and $b_{kq}$ are the parameters of the linear layer;

S47, vectorizing all route features in the delivery route feature matrix $F^r$ to obtain a new delivery route feature matrix with $n_r$ rows, and inputting it into a single-layer GRU model to obtain an output matrix $E_g \in \mathbb{R}^{n_r \times n_g}$, i.e., a matrix with $n_r$ rows and $n_g$ columns;

S48, multiplying the attention weight matrix by $E_g$ to obtain a multiplication matrix $E_{att} \in \mathbb{R}^{n_g \times n_c}$, i.e., a matrix with $n_g$ rows and $n_c$ columns; and inputting $E_{att}$ into a multi-layer perceptron to obtain the predicted delivery time $\hat{y}$ of the instant delivery order.
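The matrix sizes stated in steps S44 and S45 can be checked with a small Python sketch such as the one below. It slides each $n_k \times n_d$ kernel over the rows of the node sequence feature matrix with a stride of 1 and then applies max pooling of size $n_p \times 1$ with a stride of 1. The choice of ReLU as the activation function, the kernel values, and the helper names `conv_rows` and `max_pool_rows` are assumptions made purely for illustration.

```python
def conv_rows(F, kernel, bias):
    """Apply one n_k x n_d kernel to every window of n_k consecutive rows of F."""
    n_k = len(kernel)
    out = []
    for x in range(len(F) - n_k + 1):
        window = F[x:x + n_k]
        s = sum(w * f
                for w_row, f_row in zip(kernel, window)
                for w, f in zip(w_row, f_row))
        out.append(max(0.0, s + bias))  # assumption: ReLU activation
    return out


def max_pool_rows(column, n_p):
    """Max pooling of size n_p with stride 1 along one output column."""
    return [max(column[i:i + n_p]) for i in range(len(column) - n_p + 1)]


n_r, n_d, n_k, n_p, n_c = 6, 4, 3, 2, 2
# Hypothetical node sequence feature matrix with n_r rows and n_d columns.
F_u = [[float(r + c) for c in range(n_d)] for r in range(n_r)]
# Two hypothetical kernels (n_c = 2), each of size n_k x n_d.
kernels = [
    [[0.1] * n_d for _ in range(n_k)],
    [[-0.1] * n_d for _ in range(n_k)],
]
columns = [max_pool_rows(conv_rows(F_u, k, 0.0), n_p) for k in kernels]
# One column per kernel, so the pooled matrix has n_c columns.
E_pooled = [list(row) for row in zip(*columns)]
```

With these sizes the convolution yields $n_r-n_k+1 = 4$ rows per kernel and the pooling yields $n_r-n_k-n_p+2 = 3$ rows, matching the dimensions given in the text.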

As a further aspect of the invention, the transition probability formula is as follows:

$$P(u_i \mid u_{i-1}) = \frac{\alpha \cdot w_{u_{i-1} u_i}}{\sum_{v \in N(u_{i-1})} w_{v u_{i-1}}}$$

where $P(u_i \mid u_{i-1})$ denotes the probability of moving from delivery node $u_{i-1}$ to delivery node $u_i$; $\alpha$ denotes the random walk parameter; $w_{u_{i-1} u_i}$ denotes the weight of the edge between delivery nodes $u_{i-1}$ and $u_i$; $N(u_{i-1})$ denotes the set of delivery nodes connected to delivery node $u_{i-1}$; $v$ denotes the $v$-th delivery node in $N(u_{i-1})$; and $w_{v u_{i-1}}$ denotes the weight of the edge between delivery node $v$ and delivery node $u_{i-1}$.
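A single step of such a biased random walk may be sketched in Python as follows. The sketch assumes the graph is given as a dictionary of edge weights and that the random walk parameter is supplied as a function `alpha` of the candidate node; it weights each neighbor by alpha times its edge weight divided by the total edge weight at the current node, and lets `random.Random.choices` handle the final normalization when sampling. The graph, the names `transition_weights` and `next_step`, and the uniform `alpha` are hypothetical.

```python
import random


def transition_weights(prev_node, graph, alpha):
    """Weight of moving from prev_node to each neighbor.

    graph: dict mapping node -> dict of neighbor -> edge weight.
    alpha: function giving the random walk parameter for a candidate node.
    """
    neighbors = graph[prev_node]
    total = sum(neighbors.values())
    return {v: alpha(v) * w / total for v, w in neighbors.items()}


def next_step(prev_node, graph, alpha, rng):
    """Sample the next node; choices() treats the weights as relative."""
    weights = transition_weights(prev_node, graph, alpha)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[v] for v in nodes], k=1)[0]


# Hypothetical fully connected network of three delivery nodes.
graph = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0},
}


def uniform(_node):
    return 1.0  # assumption: no search bias


weights = transition_weights("A", graph, uniform)
step = next_step("A", graph, uniform, random.Random(0))
```

Repeating `next_step` from the sampled node produces the random-walk node sequences that a Skip-gram model would then consume.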

As a further aspect of the invention, the loss function used by the convolutional neural network module is the mean squared error loss function, expressed as follows:

$$L_{time} = \frac{1}{N} \sum_{n=1}^{N} \left(y_n - \hat{y}_n\right)^2$$

where $L_{time}$ denotes the mean squared error; $y_n$ is the actual delivery time of the $n$-th instant delivery order; $\hat{y}_n$ is the predicted delivery time of the $n$-th instant delivery order; and $N$ is the total number of data samples.
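As a brief worked example, the mean squared error between actual and predicted delivery times can be computed as in the Python sketch below. The delivery times (in minutes) are hypothetical values, and the function name `mean_squared_error` is chosen only for the illustration.

```python
def mean_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)


# Hypothetical delivery times in minutes for two orders.
loss = mean_squared_error([30.0, 40.0], [32.0, 37.0])
```

Here the squared errors are 4 and 9, so the loss is 6.5.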

Compared with the prior art, the invention has the beneficial effects that:

1. After the delivery route is determined, the invention represents the order delivery route by the selected delivery nodes. This representation reduces the difficulty of representing routes and greatly alleviates route sparsity; moreover, because the whole route is predicted at once, the error growth caused by segment-by-segment prediction is avoided.

2. When predicting the route, the invention calculates the cosine similarity between nodes, which fully represents the neighbor relationships between task nodes and the similarity of their task urgency, and effectively simulates a realistic selection strategy.

3. The invention represents the delivery nodes with a graph embedding method and extracts the spatial features of the delivery route with a convolutional neural network, thereby effectively capturing the spatial relationships among the delivery nodes.

4. The invention extracts the temporal relationships in the delivery route features with a gated recurrent unit (GRU) model, thereby effectively capturing the temporal relationships in the features.

5. Based on the attention mechanism, the invention uses the outputs of the convolutional neural network and the GRU model as the parameter matrices of the attention network, so that the spatio-temporal associations in the features are learned jointly and the features are extracted effectively.

Drawings

FIG. 1 is a flow chart of the operation of the present invention.

FIG. 2 is a schematic diagram of the method of the present invention.

Fig. 3 is a diagram of the structure of the model of the present invention.

FIG. 4 is a graph showing the experimental results of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to FIG. 1 to FIG. 4, in an embodiment of the invention, the intelligent prediction method for delivery time based on a space-time attention mechanism is carried out in the following steps:

1. Acquiring delivery order information, collecting the feature data related to the orders, and constructing a data set;

1.1, acquiring the orders to be delivered at a given moment, taking the merchant addresses and customer addresses as delivery nodes, and collecting them into the set of nodes to be traversed $L = \{l_1, l_2, \ldots, l_{n_l}\}$, where $n_l$ is the size of the delivery node set $L$, i.e., the total number of delivery nodes in the delivery area;

1.2, acquiring the node features of each delivery node according to the order features, the node features comprising the order number, rider speed, merchant address, customer address, distance between merchant and customer, and time stamp, and storing the node features in a node feature matrix $F_l$; this yields the features of all delivery nodes $F_l = \{f_1, f_2, \ldots, f_{n_l}\}$, where $f_l$ denotes the node features of the $l$-th delivery node and $m_l$ denotes the number of node features.

2. According to the node features, the delivery route prediction module predicts the traversal order of the delivery nodes:

2.1, at the $t$-th decision point, $1 \le t \le n_t$, where $n_t$ denotes the total number of decision points, calculating the cosine similarity between the feature vector of each node and the feature vectors of the other nodes with formula (1), where $Sim(l, \varphi)$ denotes the cosine similarity between the node feature vectors of the $l$-th and $\varphi$-th nodes;

$$Sim(l, \varphi) = \frac{f_l \cdot f_\varphi}{\lVert f_l \rVert \, \lVert f_\varphi \rVert} \tag{1}$$

2.2, according to the cosine similarity results, selecting the other delivery node with the highest cosine similarity as the adjacent delivery node, combining the information of the adjacent delivery node with the node features of the original delivery node, and constructing a new node feature matrix $F_P$;

2.3, inputting the new node feature matrix into a deep neural network module formed by stacking $k_l$ linear layers, $k_a$ ReLU activation functions, and $k_b$ normalization layers, which calculates and outputs the score of each candidate delivery node;

2.4, according to the order completion status at the $t$-th decision point, using a mask matrix of 0s and 1s to represent the state of the delivery nodes at that time: nodes that have already been traversed and delivery nodes of orders that have not yet been picked up are marked 0, meaning not selectable, i.e., unreachable; 1 means the node is selectable, i.e., reachable;

2.5, multiplying the mask matrix by the scoring matrix to obtain the scoring matrix of the selectable delivery nodes at the $t$-th decision point, and inputting it into the $Softmax$ function as shown in formula (2), where $s_t^l$ denotes the score of the $l$-th node; this yields the selection probability of each candidate delivery node, and the delivery node with the highest probability is selected as the predicted next location;

$$Softmax(s_t^l) = \frac{\exp(s_t^l)}{\sum_{c=1}^{n_l} \exp(s_t^c)} \tag{2}$$

2.6, updating the mask matrix and the node features according to the prediction result, returning to 2.1, and repeating these steps until the order of all nodes has been predicted.

2.7, during training, the delivery route prediction part uses the cross-entropy function shown in formula (3) as its loss function, where $Z$ is the number of training samples, $n_t^z$ denotes the number of decision points of the $z$-th task, and $p_t$ denotes the predicted probability of the location actually selected at the $t$-th decision point.

$$L_{seq} = -\frac{1}{Z} \sum_{z=1}^{Z} \sum_{t=1}^{n_t^z} \log p_t \tag{3}$$
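The iterative route prediction of step 2 can be illustrated with the following Python sketch of a greedy decoding loop. It is a simplification under stated assumptions: the scores are fixed per node rather than recomputed by the neural network at each decision point, the highest score is taken directly (the softmax preserves the ranking of the scores), and `pickup_of` is a hypothetical mapping from each customer node to the merchant node that must be visited first. None of these names come from the original text.

```python
def decode_route(scores, pickup_of):
    """Greedy traversal order under the reachability rule of step 2.4.

    scores: dict mapping node -> hypothetical fixed score.
    pickup_of: dict mapping customer node -> merchant node of the same order.
    """
    visited = []
    remaining = set(scores)
    while remaining:
        # A node is reachable if it has not been visited and, for a customer
        # node, the goods of its order have already been picked up.
        reachable = [n for n in remaining
                     if pickup_of.get(n) is None or pickup_of[n] in visited]
        # sorted() gives a deterministic tie-break between equal scores.
        nxt = max(sorted(reachable), key=lambda n: scores[n])
        visited.append(nxt)
        remaining.remove(nxt)
    return visited


# Two hypothetical orders: merchant m1 -> customer c1, merchant m2 -> customer c2.
scores = {"m1": 0.2, "c1": 0.9, "m2": 0.5, "c2": 0.1}
pickup_of = {"c1": "m1", "c2": "m2"}
route = decode_route(scores, pickup_of)
```

In this example customer c1 has the highest score but cannot be visited until merchant m1 has been reached, so the mask forces the rider to a merchant first.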

3. Constructing, from the delivery route prediction result, a delivery route feature matrix $F^r \in \mathbb{R}^{n_r \times m_r}$ and the delivery node sequence $U = \{u_1, u_2, \ldots, u_{n_r}\}$, where $n_r$ and $m_r$ denote the number of nodes and the number of features of the delivery route, respectively.

4. According to the delivery route features, the time prediction module predicts the delivery time of each order:

4.1, constructing a delivery node network $G = (V, E)$ from the connections among the delivery nodes, where $V$ is the node set and $E = \{e_{ij} \mid 1 \le i \le j \le n_e\}$; $e_{ij}$ denotes the association between the $i$-th delivery node and the $j$-th delivery node in $G$ and takes the value 0 or 1, where 0 means no association and 1 means association; $n_e$ is the total number of nodes.

4.2, based on the node network, embedding the nodes with the Node2vec method:

4.2.1, calculating the transition probabilities between nodes with formula (4), where $P(u_i \mid u_{i-1})$ denotes the probability of moving from delivery node $u_{i-1}$ to delivery node $u_i$; $\alpha$ denotes the random walk parameter; $w_{u_{i-1} u_i}$ denotes the weight of the edge between delivery nodes $u_{i-1}$ and $u_i$; $N(u_{i-1})$ denotes the set of delivery nodes connected to delivery node $u_{i-1}$; $v$ denotes the $v$-th delivery node in $N(u_{i-1})$; and $w_{v u_{i-1}}$ denotes the weight of the edge between delivery node $v$ and delivery node $u_{i-1}$.

The random walk parameter $\alpha$ is calculated as shown in formula (5), where $p$ and $q$ are preset parameters and $d_{i,i-2}$ denotes the shortest distance between the previous delivery node $u_{i-2}$ and the adjacent delivery node: $d_{i,i-2}=0$ when the two delivery nodes are directly connected, $d_{i,i-2}=1$ when there is an intermediate delivery node between them, and $d_{i,i-2}=2$ in all other cases;

$$P(u_i \mid u_{i-1}) = \frac{\alpha \cdot w_{u_{i-1} u_i}}{\sum_{v \in N(u_{i-1})} w_{v u_{i-1}}} \tag{4}$$

$$\alpha = \begin{cases} 1/p, & d_{i,i-2} = 0 \\ 1, & d_{i,i-2} = 1 \\ 1/q, & d_{i,i-2} = 2 \end{cases} \tag{5}$$

4.2.2, obtaining a random-walk node sequence by second-order random walk, and vectorizing the generated nodes with the Skip-gram method to obtain a node vector matrix $E_u$, where $n_d$ is the vector length; then using $E_u$ to vectorize the delivery node sequence $U$ into a node sequence feature matrix $F^u \in \mathbb{R}^{n_r \times n_d}$;

4.3, inputting the node sequence feature matrix $F^u$ into the convolutional neural network for feature extraction:

4.3.1, as shown in formulas (6) and (7), applying $n_c$ convolution kernels of size $n_k \times n_d$ to the delivery node sequence feature matrix $F^u$ with a stride of 1 to obtain the calculated values $a_x^k$; all calculated values $a_x^k$ of one kernel are concatenated to form a combined matrix $A_k$, where $1 \le k \le n_c$ and $1 \le x \le n_r - n_k + 1$; the calculated value $a_x^k$ and the combined matrix $A_k$ are expressed as follows:

$$a_x^k = \sigma\left(W_k \cdot F^u_{x:x+n_k-1} + b_k\right) \tag{6}$$

$$A_k = \left[a_1^k, a_2^k, \ldots, a_{n_r-n_k+1}^k\right] \tag{7}$$

where $\sigma$ is an activation function; $W_k$ and $b_k$ are trainable parameters; $F^u_{x:x+n_k-1}$ denotes the node features of the delivery nodes selected by the $k$-th convolution kernel in the $x$-th window, i.e., the $x$-th to $(x+n_k-1)$-th node features, the window size being $n_k$; $n_r \times n_d$ means that $F^u$ is a matrix with $n_r$ rows and $n_d$ columns; and $a_x^k$ denotes the output of the $k$-th convolution kernel at the $x$-th window;

4.3.2, concatenating the $A_k$ computed by all convolution kernels to obtain an output matrix $E_c \in \mathbb{R}^{(n_r-n_k+1) \times n_c}$, setting the pooling layer size to $n_p \times 1$ with a stride of 1, and then applying max pooling to obtain $E_K \in \mathbb{R}^{(n_r-n_k-n_p+2) \times n_c}$;

4.4, as shown in formula (8), setting and initializing a weight matrix $E_Q \in \mathbb{R}^{(n_r-n_k-n_p+2) \times n_c}$, adding $E_Q$ and $E_K$, and passing the sum through a tanh activation function and a linear layer to obtain the output $E_{kq} \in \mathbb{R}^{n_r \times n_c}$;

$$E_{kq} = W_{kq} \tanh\left(E_Q + E_K\right) + b_{kq} \tag{8}$$

4.5, inputting $E_{kq}$ into $Softmax$ to obtain the attention weights;

4.6, vectorizing all features in the delivery route features $F^r$ to obtain new route features, and inputting them into a single-layer GRU model to obtain the output $E_g \in \mathbb{R}^{n_r \times n_g}$, where $n_g$ is the output size of the GRU;

4.7, multiplying the attention weights by $E_g$ to obtain $E_{att}$, and inputting $E_{att}$ into a multi-layer perceptron to obtain the final predicted time $\hat{y}$;

4.8, the time prediction module uses the mean squared error as its loss function, shown in formula (9), where $y_n$ is the actual delivery time, $\hat{y}_n$ is the predicted delivery time, and $N$ is the total number of data samples.

$$L_{time} = \frac{1}{N} \sum_{n=1}^{N} \left(y_n - \hat{y}_n\right)^2 \tag{9}$$
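The fusion of attention weights and GRU output in steps 4.5 to 4.7 may be pictured with the Python sketch below, which only tracks matrix shapes. It assumes that the softmax is taken over the route positions (rows) of each column and that the product is formed as the transpose of the GRU output times the attention weights, which gives a matrix with $n_g$ rows and $n_c$ columns; both choices, the helper names, and the placeholder values are assumptions made for the illustration.

```python
import math


def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(A, B):
    """Plain matrix product of A (m x n) and B (n x k)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def column_softmax(M):
    """Softmax over the rows of each column (assumed normalization axis)."""
    return transpose([softmax(col) for col in transpose(M)])


n_r, n_c, n_g = 4, 2, 3
# Hypothetical attention logits (n_r x n_c) and GRU output (n_r x n_g).
E_kq = [[0.1 * (i + j) for j in range(n_c)] for i in range(n_r)]
E_g = [[1.0] * n_g for _ in range(n_r)]

attention = column_softmax(E_kq)            # n_r x n_c
E_att = matmul(transpose(E_g), attention)   # n_g x n_c
```

Flattening `E_att` would give the feature vector passed to the multi-layer perceptron; with the all-ones GRU output used here every entry equals 1 because each attention column sums to 1.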

5. The training batch size and learning rate of the method are determined by minimizing the loss function, and the model is trained on instant delivery data to obtain the instant delivery time prediction model.

To verify the effectiveness of the method, an experiment was carried out on a real data set in the following environment: Windows operating system, Intel(R) Core(TM) i5-8500 CPU, 16 GB memory, and the PyTorch deep learning framework.

Experimental data: a delivery data set from a platform was selected as the experimental data. The data set was collected in one region and spans February 1, 2020 to February 27, 2020. It contains four kinds of information: rider data, order data, distance data, and behavior data; detailed statistics are shown in Table 1. The data were integrated on the basis of the order data, and the resulting data set contains 254802 delivery orders in total, of which 80% were used for training and 20% for testing.

Table 1 data set statistics

The experimental results in FIG. 4 show that the predicted values fit the true values well, which indicates that the invention can accurately predict instant delivery time and demonstrates the feasibility of the method.

The foregoing shows and describes the basic principles, principal features, and advantages of the invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above, that the above embodiments and descriptions merely illustrate the principles of the invention, and that various changes and modifications may be made without departing from the spirit and scope of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims

1. An intelligent prediction method for delivery time based on a space-time attention mechanism, characterized by comprising the following steps:

S3, obtaining route features of the delivery route;

S4, predicting the delivery time of each instant delivery order according to the route features;

step S4 specifically comprises:

S41, acquiring the delivery nodes on the delivery routes of the instant delivery orders in the set delivery area and connecting them to one another, the line between two connected delivery nodes forming an edge, so as to form a delivery node network $G = (V, E)$, where $V$ denotes the set of delivery nodes and $E$ is the relationship matrix of the delivery nodes, $E = \{e_{ij} \mid 1 \le i \le j \le n_e\}$; $e_{ij}$ denotes the association between the $i$-th delivery node and the $j$-th delivery node in $G$ and takes the value 0 or 1, where 0 means no association and 1 means association; $n_e$ is the total number of nodes;

S44, node sequence feature matrixF ^u Input into convolutional neural network module, utilizen _c The size isn _k ×n _d Is to check the characteristic matrix of the distribution node sequencePerforming calculation with stride of 1 to obtain corresponding calculated value +.>All calculated values +.>Splicing and combining to form a combined matrixA _k Wherein, is less than or equal to 1k≤n _k ，1≤x≤n _r -n _k +1, calculated->Combination matrixA _k Is represented as follows:

wherein (1)>Is an activation function;W _k andb _k is a training parameter;represent the firstkThe convolution kernel is atxNode characteristics of the distribution nodes selected by the windows are selected, and the first node is selectedxTo the point ofx+n _k -1 node feature, window is largen _k ； n _r ×n _d Representing a distribution node sequence feature matrixF ^u Is thatn _r A row(s),n _d A matrix of columns; />Represent the firstkThe convolution kernel is atxOutput of the individual window calculations;

S45, concatenating the combination matrices A_k calculated by the n_c convolution kernels to obtain an output matrix E_c, where (n_r - n_k + 1) × n_c denotes that the output matrix E_c is a matrix of n_r - n_k + 1 rows and n_c columns; setting the size of the pooling layer to n_p × 1 with a stride of 1, and then performing max pooling on the output matrix E_c to obtain a pooled output matrix, where (n_r - n_p - n_k + 2) × n_c denotes that the pooled output matrix is a matrix of n_r - n_p - n_k + 2 rows and n_c columns;
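The max pooling of step S45 reduces the row count by n_p - 1, which is where the n_r - n_p - n_k + 2 figure comes from. A minimal sketch, with a hypothetical function name and toy input:

```python
def max_pool_rows(E_c, n_p):
    """Max-pool each column of E_c with an n_p x 1 window and stride 1."""
    rows_out = len(E_c) - n_p + 1
    n_c = len(E_c[0])
    return [[max(E_c[x + i][k] for i in range(n_p)) for k in range(n_c)]
            for x in range(rows_out)]


E_c = [[2.0, 0.0], [1.0, 0.5], [1.0, 0.0]]  # toy 3 x 2 convolution output
pooled = max_pool_rows(E_c, n_p=2)
```

Here the 3-row input becomes a 2-row output, since 3 - 2 + 1 = 2.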

S46, setting a weight matrix E_Q, each position of which takes a random initial value, where (n_r - n_k - n_p + 2) × n_c denotes that the weight matrix E_Q is a matrix of n_r - n_k - n_p + 2 rows and n_c columns; then adding the weight matrix E_Q to the pooled output matrix, and passing the sum sequentially through a tanh activation function and a linear layer to obtain an addition matrix E_kq, where n_r × n_c denotes that the addition matrix E_kq is a matrix of n_r rows and n_c columns; inputting E_kq into Softmax to calculate the attention weight matrix; the addition matrix E_kq is thus the output of the linear layer applied to the tanh of the sum of E_Q and the pooled output matrix;

wherein W_kq and b_kq are both parameters of the linear layer;
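The attention computation of step S46 might look like the sketch below. Two points are assumptions made only so that the stated shapes fit together: the linear layer is taken to act on the row dimension (W_kq has n_r rows), which maps the pooled row count back to n_r, and Softmax is taken column-wise over the n_r rows. Function and variable names are hypothetical.

```python
import math
import random


def attention_weights(E_pool, E_Q, W_kq, b_kq):
    """Return softmax(W_kq . tanh(E_Q + E_pool) + b_kq), an n_r x n_c matrix."""
    m, n_c, n_r = len(E_pool), len(E_pool[0]), len(W_kq)
    T = [[math.tanh(E_Q[i][k] + E_pool[i][k]) for k in range(n_c)]
         for i in range(m)]
    # Linear layer on the row dimension (assumption): (n_r x m) . (m x n_c).
    E_kq = [[sum(W_kq[r][i] * T[i][k] for i in range(m)) + b_kq[r]
             for k in range(n_c)] for r in range(n_r)]
    A = [[0.0] * n_c for _ in range(n_r)]
    for k in range(n_c):  # column-wise softmax (assumption)
        column = [E_kq[r][k] for r in range(n_r)]
        top = max(column)
        exps = [math.exp(v - top) for v in column]
        total = sum(exps)
        for r in range(n_r):
            A[r][k] = exps[r] / total
    return A


rng = random.Random(0)
E_pool = [[2.0, 0.5], [1.0, 0.5]]
E_Q = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
W_kq = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
b_kq = [0.0, 0.0, 0.0]
A = attention_weights(E_pool, E_Q, W_kq, b_kq)
```

Each column of the result sums to 1, so every convolution channel distributes one unit of attention over the route positions.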

S47, vectorizing all route characteristics in the distribution route feature matrix F^r to obtain a new distribution route feature matrix having n_r rows, and inputting the new distribution route feature matrix into a single-layer GRU model to obtain an output matrix E_g, where n_r × n_g denotes that the output matrix E_g is a matrix of n_r rows and n_g columns;
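A single-layer GRU forward pass of the kind used in step S47 can be sketched in plain Python. This uses the standard update-gate and reset-gate equations with the convention h = (1 - z) * h_prev + z * candidate; some libraries swap the roles of z and 1 - z. The parameter layout, the names, and the random initialisation are assumptions for illustration only.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, U, h, b):
    """Compute W x + U h + b for one gate."""
    return [sum(w * xi for w, xi in zip(W_row, x)) +
            sum(u * hi for u, hi in zip(U_row, h)) + bias
            for W_row, U_row, bias in zip(W, U, b)]


def gru_forward(X, p):
    """Run a single-layer GRU over the rows of X; returns one hidden state per row."""
    h = [0.0] * len(p["bz"])
    E_g = []
    for x_t in X:
        z = [sigmoid(v) for v in affine(p["Wz"], x_t, p["Uz"], h, p["bz"])]
        r = [sigmoid(v) for v in affine(p["Wr"], x_t, p["Ur"], h, p["br"])]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v)
                for v in affine(p["Wh"], x_t, p["Uh"], gated, p["bh"])]
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
        E_g.append(h)
    return E_g


def random_params(n_in, n_g, seed=0):
    """Hypothetical initialisation: small uniform weights, zero biases."""
    rng = random.Random(seed)
    p = {}
    for gate in "zrh":
        p["W" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                         for _ in range(n_g)]
        p["U" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(n_g)]
                         for _ in range(n_g)]
        p["b" + gate] = [0.0] * n_g
    return p


X = [[0.1, 0.2, 0.3], [0.0, 1.0, 0.5], [0.4, 0.4, 0.4], [1.0, 0.0, 0.2]]
E_g = gru_forward(X, random_params(n_in=3, n_g=2))
```

The output has one row per route position (n_r = 4 here) and n_g = 2 columns, matching the n_r × n_g shape stated for E_g.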

S48, multiplying the attention weight matrix by E_g to obtain a multiplication matrix E_att, where n_g × n_c denotes that the multiplication matrix E_att is a matrix of n_g rows and n_c columns; and inputting E_att into a multi-layer perceptron for calculation to obtain the predicted delivery time of the instant delivery order.
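Step S48 can be sketched as a matrix product followed by a small perceptron. The sketch assumes the product is the transpose of E_g times the attention weight matrix, chosen because it is the arrangement that yields an n_g × n_c result from an n_r × n_g and an n_r × n_c input. The one-hidden-layer perceptron, its weights, and all names are hypothetical.

```python
def spatiotemporal_features(A, E_g):
    """Return E_g^T . A, a matrix with n_g rows and n_c columns (assumed form)."""
    n_r, n_c, n_g = len(A), len(A[0]), len(E_g[0])
    return [[sum(E_g[r][g] * A[r][k] for r in range(n_r)) for k in range(n_c)]
            for g in range(n_g)]


def mlp_predict(E_att, W1, b1, w2, b2):
    """Flatten E_att, apply one ReLU hidden layer, return a scalar time."""
    x = [v for row in E_att for v in row]
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


A = [[0.5, 1.0], [0.5, 0.0]]              # toy attention weights, n_r = 2, n_c = 2
E_g = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]  # toy GRU output, n_r = 2, n_g = 3
E_att = spatiotemporal_features(A, E_g)
predicted_time = mlp_predict(E_att, W1=[[1.0] * 6], b1=[0.0], w2=[0.5], b2=1.0)
```

In a trained model the perceptron weights would come from minimising the delivery time loss rather than being set by hand as they are here.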

2. The intelligent prediction method of delivery time based on space-time attention mechanism of claim 1, wherein the specific steps of step S1 are as follows:

S12, taking the merchant addresses and the customer addresses as delivery nodes, collecting all delivery nodes to be traversed in the set delivery area, and storing them in a distribution node set L, L = {l₁, l₂, …, l_{n_l}}, where l₁ denotes the 1st distribution node in the distribution node set L; l₂ denotes the 2nd distribution node in the distribution node set L; l_{n_l} denotes the n_l-th distribution node in the distribution node set L; and n_l denotes the size of the distribution node set L, i.e., the total number of distribution nodes in the distribution area;

3. The intelligent prediction method of delivery time based on space-time attention mechanism according to claim 2, wherein the specific steps of step S2 are as follows:

4. The intelligent prediction method of delivery time based on a space-time attention mechanism according to claim 3, wherein the cosine similarity calculation formula is as follows:
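For reference, cosine similarity between two node feature vectors is conventionally the dot product divided by the product of the Euclidean norms. The sketch below shows that standard definition; the function name and the convention of returning 0 for a zero vector are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for an all-zero vector
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score 1 regardless of magnitude, and orthogonal vectors score 0.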

5. The intelligent prediction method for delivery time based on a spatio-temporal attention mechanism according to claim 3 or 4, wherein the probability calculation function has the following calculation formula:

wherein Softmax denotes a normalization function; the formula takes as input the scoring value of the l-th delivery node at the t-th decision point and the scoring values of the c-th delivery nodes at the t-th decision point; n_l denotes the total number of delivery nodes in the set delivery area.
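A Softmax over node scores at one decision point can be sketched as below, using the standard definition in which each score is exponentiated and divided by the sum over all n_l candidates. The function name and the max-subtraction for numerical stability are illustrative choices, not part of the claim.

```python
import math


def selection_probabilities(scores):
    """Softmax over the scores of all candidate nodes at one decision point."""
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = selection_probabilities([2.0, 1.0, 0.1])
```

The node with the highest score receives the highest selection probability, and the probabilities sum to 1.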

6. The intelligent prediction method of delivery time based on a space-time attention mechanism of claim 5, wherein the loss function of the neural network is a cross entropy function, and the calculation formula of the cross entropy function is as follows:
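As a point of reference, the usual cross entropy for a single-label classification target is the negative log of the probability assigned to the true class, averaged over samples. The sketch below shows that standard form. The names and the small clipping constant `eps` are assumptions.

```python
import math


def cross_entropy(probabilities, target_index, eps=1e-12):
    """Negative log-probability of the true next node."""
    return -math.log(max(probabilities[target_index], eps))


def mean_cross_entropy(batch):
    """Average loss over (probabilities, target_index) pairs."""
    return sum(cross_entropy(p, t) for p, t in batch) / len(batch)
```

A confident correct prediction gives a loss near 0, while a confident wrong prediction gives a large loss.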

7. The intelligent prediction method for delivery time based on space-time attention mechanism of claim 6, wherein the specific steps of step S3 are as follows:

S31, constructing the route characteristics of the distribution route from the node features of each distribution node on the distribution route; the route characteristics include: order number, rider speed, merchant address, customer address, distance between merchant and customer, and time stamp; and storing the route characteristics in a distribution route feature matrix F^r;

S32, storing the delivery nodes in a distribution node sequence set U in the order in which they are traversed along the delivery route.

8. The intelligent prediction method for delivery time based on a spatiotemporal attention mechanism according to claim 7, wherein the calculation formula of transition probability is as follows:

wherein P(u_i | u_{i-1}) denotes the probability of transferring from distribution node u_{i-1} to distribution node u_i; α denotes the random walk parameter; the formula further uses the weight of the edge between distribution node u_{i-1} and distribution node u_i, the set of distribution nodes connected to distribution node u_{i-1}, the v-th distribution node in that set, and the weight of the edge between distribution node v and distribution node u_{i-1}.
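The edge-weight part of such a transition probability can be sketched as a weight-proportional random walk step: the weight of the edge to the candidate node divided by the total weight of all edges leaving the current node. How the random walk parameter α enters cannot be determined from the definitions alone, so it is deliberately left out of this sketch. The data layout and names are hypothetical.

```python
def transition_probability(weights, u_prev, u_next):
    """Weight-proportional step probability from u_prev to u_next.

    weights maps each node to a dict {neighbor: edge weight}.
    The random walk parameter (alpha) is NOT modelled here.
    """
    neighbors = weights.get(u_prev, {})
    total = sum(neighbors.values())
    if total == 0 or u_next not in neighbors:
        return 0.0  # no edge, hence no transition
    return neighbors[u_next] / total


weights = {"a": {"b": 3.0, "c": 1.0}, "b": {"a": 3.0}, "c": {"a": 1.0}}
```

With these toy weights, a walk at node "a" moves to "b" three times as often as to "c".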

9. The intelligent prediction method of delivery time based on space-time attention mechanism of claim 8, wherein the loss function used by the convolutional neural network module is a mean square error loss function, expressed as follows:

wherein L_time denotes the mean square error value; y_n is the actual delivery time of the n-th instant delivery order, which is compared with the predicted delivery time of the n-th instant delivery order; and N is the total number of data samples.
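The mean square error described by these definitions is conventionally the average of the squared differences between actual and predicted delivery times over the N samples. A minimal sketch, with a hypothetical function name and toy values in minutes:

```python
def mse_loss(y_true, y_pred):
    """Mean of squared differences between actual and predicted times."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


loss = mse_loss([30.0, 20.0], [28.0, 24.0])
```

Here the squared errors are 4 and 16, so the loss is 10.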