CN116596170B - Intelligent prediction method for delivery time based on space-time attention mechanism - Google Patents
- Publication number
- CN116596170B (CN202310875592.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
- G06Q10/083—Shipping
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention relates to the technical field of instant delivery time prediction, and in particular to an intelligent delivery time prediction method based on a space-time attention mechanism, which comprises the following steps: calculating the cosine similarity among nodes and constructing a neighbor-relation feature data set; calculating node selection probabilities with a deep neural network and determining the delivery sequence; embedding the delivery route sequence with graph embedding, extracting its features with a convolutional neural network, and calculating attention weights; inputting the route node features into a gated recurrent unit (GRU) to extract the temporal correlation among the features; and multiplying the attention weights by the GRU output to obtain a final feature vector, which is input into a multi-layer perceptron to predict the delivery time. In predicting instant delivery time, the method fully considers the influence of the various features on delivery node selection, which improves the accuracy of sequence prediction; meanwhile, the proposed space-time attention mechanism fully mines the influence of spatio-temporal relationships on delivery time, achieving higher-precision prediction.
Description
Technical Field
The invention relates to the technical field of instant delivery time prediction, in particular to an intelligent delivery time prediction method based on a space-time attention mechanism.
Background
Instant delivery has developed rapidly in recent years as a key element supporting new retail models. Because delivery time is the most intuitive indicator of service quality and customers are highly sensitive to time, the platform must reasonably estimate the total time an order will take before delivery so that it can give the customer an accurate waiting time. Intelligent prediction of instant delivery time is therefore critical. Existing solutions to the time prediction problem include machine learning and deep learning: machine learning usually uses methods such as random forests and gradient boosting trees, while deep learning predicts with combinations of models such as recurrent neural networks and deep neural networks. The paper "A deep learning method for route and time prediction in food delivery service" proposes a long short-term memory (LSTM) model combined with an attention mechanism that predicts the next destination, predicts the travel time of that leg based on the prediction, and iteratively predicts the order of all delivery nodes and the time of each leg (GAO C L, ZHANG F, WU G Q, et al. 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21). New York: Association for Computing Machinery, 2021: 2879-2889).
The paper "Supervised learning for arrival time estimations in restaurant meal delivery" proposes a delivery time prediction model that combines a DNN with a gradient boosting decision tree: the DNN predicts the order of the nodes, and the gradient boosting decision tree predicts the delivery time (HILDEBRANDT F D, et al. Transportation Science, 2022, 56(4): 1058-1084). Although these methods improve on the instant delivery time prediction problem and achieve good prediction performance, they still have shortcomings. When several orders are predicted simultaneously, existing methods usually predict segment by segment, and summing the segments to obtain the final prediction accumulates errors and makes deviation more likely. For feature learning, existing methods use models such as gradient boosting trees, LSTM, and attention mechanisms, but there is still room for improvement in learning the spatio-temporal relationships within the features and the associations between features.
Disclosure of Invention
To avoid and overcome the technical problems in the prior art, the invention provides an intelligent prediction method for delivery time based on a space-time attention mechanism. The invention learns different features through a convolutional neural network and a recurrent neural network and, combined with the attention mechanism, fully mines the influence of the spatio-temporal relationships among delivery nodes on time, thereby improving the accuracy of instant delivery time prediction.
To achieve the above purpose, the invention provides the following technical solution:
An intelligent prediction method for delivery time based on a space-time attention mechanism, comprising the following steps:
S1, acquiring the order features of an instant delivery order, the order features comprising the merchant address and the customer address; taking the merchant address and the customer address as delivery nodes, and collecting the node features of the delivery nodes;
S2, predicting the traversal order of the delivery nodes according to the node features, and recording the traversal order as the delivery route;
S3, obtaining the route features of the delivery route;
S4, predicting the delivery time of each instant delivery order according to the route features.
As a still further aspect of the invention, the specific steps of step S1 are as follows:
S11, acquiring the order features of each instant delivery order, the order features comprising the order number, rider speed, merchant address, customer address, distance between merchant and customer, and timestamp;
S12, taking the merchant addresses and customer addresses as delivery nodes, summarizing them to obtain all delivery nodes to be traversed in the set delivery area, and storing them in the delivery node set $L=\{l_1, l_2, \dots, l_{n_l}\}$, where $l_1$ denotes the 1st delivery node in $L$; $l_2$ denotes the 2nd delivery node in $L$; $l_{n_l}$ denotes the $n_l$-th delivery node in $L$; and $n_l$ is the size of $L$, i.e. the total number of delivery nodes in the delivery area;
S13, acquiring the node features of each delivery node according to the order features, the node features comprising the order number, rider speed, merchant address, customer address, distance between merchant and customer, and timestamp, and storing the node features in the node feature matrix $F_l$.
As a still further aspect of the invention, the specific steps of step S2 are as follows:
S21, calculating the similarity among all delivery nodes with the cosine similarity formula, according to the node features of all delivery nodes stored in the node feature matrix $F_l$;
S22, selecting a delivery node from the node feature matrix $F_l$ as the original node and, according to the similarity results, selecting the other delivery node with the highest similarity to the original node as its adjacent node, wherein an original node that has already been used is excluded from the selection of the next original node;
S23, horizontally concatenating the node feature matrices of the original node and its adjacent node to obtain a concatenated matrix, and overwriting the node feature matrix of the original node with the concatenated matrix to update the node features; the concatenated matrices of all delivery nodes in the set delivery area are stored in the concatenated node feature matrix $F_P$;
S24, inputting the node features of each delivery node in $F_P$ into the neural network to obtain the score of each delivery node, the score being the probability that the delivery node is taken as the next delivery target, and storing the scores in a score matrix;
S25, defining a delivery node that has several adjacent delivery nodes as a decision point and, according to the order completion status at the decision point, using a mask matrix of 0s and 1s to represent the state of each delivery node when the rider is at the decision point: delivery nodes already traversed, and delivery nodes of instant delivery orders whose goods have not yet been picked up, are marked 0, meaning unreachable; reachable delivery nodes are marked 1;
S26, multiplying the mask matrix by the score matrix to obtain a masked score matrix containing the scores of the delivery nodes; inputting the scores in the masked score matrix into the probability calculation function to obtain the selection probability of each delivery node, and selecting the delivery node with the largest selection probability as the predicted next location from the decision point;
S27, returning to step S21 and repeating steps S21 to S26 until the traversal order of all delivery nodes has been predicted, and recording the traversal order as the delivery route.
As a still further aspect of the invention, the cosine similarity formula is as follows:
$$Sim(f_l, f_\varphi)=\frac{f_l\cdot f_\varphi}{\lVert f_l\rVert\,\lVert f_\varphi\rVert}$$
where $Sim$ denotes the cosine similarity function; $Sim(f_l, f_\varphi)$ denotes the similarity between the $l$-th and the $\varphi$-th delivery node; $f_l$ denotes the node feature matrix of the $l$-th delivery node; and $f_\varphi$ denotes the node feature matrix of the $\varphi$-th delivery node.
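The similarity calculation and the neighbor selection of step S22 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: the helper names `cosine_similarity` and `nearest_neighbor` and the toy feature values are hypothetical, and node features are assumed to be flat numeric vectors.

```python
import math

def cosine_similarity(f_l, f_phi):
    """Cosine similarity between two node feature vectors."""
    dot = sum(a * b for a, b in zip(f_l, f_phi))
    norm = math.sqrt(sum(a * a for a in f_l)) * math.sqrt(sum(b * b for b in f_phi))
    # Assumption: an all-zero vector is treated as having similarity 0.
    return dot / norm if norm else 0.0

def nearest_neighbor(F_l, l):
    """Index of the delivery node most similar to node l, excluding l itself."""
    candidates = [phi for phi in range(len(F_l)) if phi != l]
    return max(candidates, key=lambda phi: cosine_similarity(F_l[l], F_l[phi]))

# Toy node feature matrix: one row per delivery node.
F_l = [[1.0, 0.0, 2.0],
       [2.0, 0.0, 4.1],
       [0.0, 3.0, 0.0]]
neighbor = nearest_neighbor(F_l, 0)
```

Node 1 points in almost the same direction as node 0, so it is chosen as the adjacent node; node 2 is orthogonal to node 0 and scores 0.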
As a still further aspect of the invention, the probability calculation function is as follows:
$$\mathrm{Softmax}(\hat{s}_t^{\,l})=\frac{\exp(\hat{s}_t^{\,l})}{\sum_{c=1}^{n_l}\exp(\hat{s}_t^{\,c})}$$
where $\mathrm{Softmax}$ denotes the normalization function; $\hat{s}_t^{\,l}$ denotes the score of the $l$-th delivery node at the $t$-th decision point; $\hat{s}_t^{\,c}$ denotes the score of the $c$-th delivery node at the $t$-th decision point; and $n_l$ is the total number of delivery nodes in the set delivery area.
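A minimal sketch of the masked selection step is given below. The text multiplies the mask by the scores before applying Softmax; taken literally, a masked node would receive a score of 0 rather than a probability of 0, so this sketch assumes that unreachable nodes are simply left out of the normalization. The function name `masked_softmax` and the sample scores are hypothetical.

```python
import math

def masked_softmax(scores, mask):
    """Selection probabilities over reachable delivery nodes only.

    scores: raw score of each delivery node at the current decision point.
    mask:   1 if the node is reachable, 0 otherwise.
    Assumption: masked nodes are excluded from the normalization, so their
    probability is exactly 0.
    """
    reachable = [s for s, m in zip(scores, mask) if m]
    if not reachable:
        raise ValueError("no reachable delivery node")
    top = max(reachable)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 3.0, 0.5]
mask = [1, 1, 0, 1]  # node 2 is not reachable yet
probs = masked_softmax(scores, mask)
next_node = max(range(len(probs)), key=probs.__getitem__)
```

Although node 2 has the highest raw score, it is masked out, so node 0 becomes the predicted next location.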
As a still further aspect of the invention, the loss function of the neural network is a cross-entropy function, calculated as follows:
$$L_{seq}=-\frac{1}{Z}\sum_{z=1}^{Z}\sum_{t=1}^{n_t^{z}}\log \hat{p}_t^{\,z}$$
where $L_{seq}$ is the value of the loss function; $Z$ is the number of training samples of the neural network; $n_t^{z}$ is the number of decision points of the $z$-th task; and $\hat{p}_t^{\,z}$ is the predicted probability of the location actually selected at the $t$-th decision point.
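The loss can be illustrated with a short sketch. It assumes that the loss is averaged over the $Z$ tasks and that each task is given simply as the list of probabilities the model assigned to the locations actually chosen; the function name `sequence_cross_entropy` is hypothetical.

```python
import math

def sequence_cross_entropy(tasks):
    """Negative log-likelihood of the true route choices, averaged over tasks.

    tasks: list of Z tasks; each task is a list with one entry per decision
    point, holding the predicted probability of the location that was
    actually chosen at that decision point.
    Assumption: the sum over decision points is divided by Z.
    """
    Z = len(tasks)
    total = 0.0
    for task in tasks:
        for p_true in task:
            total -= math.log(p_true)
    return total / Z

perfect = sequence_cross_entropy([[1.0, 1.0], [1.0]])
uncertain = sequence_cross_entropy([[0.5, 0.5], [0.25]])
```

A model that always assigns probability 1 to the true next location has zero loss; the loss grows as the probability of the true choice falls.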
As a still further aspect of the invention, the specific steps of step S3 are as follows:
S31, constructing the route features of the delivery route according to the node features of each delivery node on the delivery route, the route features comprising the order number, rider speed, merchant address, customer address, distance between merchant and customer, and timestamp, and storing the route features in the delivery route feature matrix $F_r$;
S32, storing the delivery nodes, in the order in which they are traversed on the delivery route, in the delivery node sequence set $U$.
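Steps S31 and S32 amount to reordering the node features by the predicted visiting order. A minimal sketch, assuming the route is given as a list of row indices into the node feature matrix (the helper `build_route_features` is a hypothetical name):

```python
def build_route_features(F_l, route):
    """Reorder node features by predicted visiting order.

    F_l:   node feature matrix, one row per delivery node.
    route: predicted traversal order as a list of node indices.
    Returns the route feature matrix F_r and the node sequence U.
    """
    if sorted(route) != list(range(len(F_l))):
        raise ValueError("route must visit every delivery node exactly once")
    U = list(route)
    F_r = [list(F_l[u]) for u in U]  # row i holds the features of the i-th stop
    return F_r, U

F_l = [[10.0, 0.1], [20.0, 0.2], [30.0, 0.3]]
F_r, U = build_route_features(F_l, [2, 0, 1])
```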
As a still further aspect of the invention, the specific steps of step S4 are as follows:
S41, acquiring the delivery nodes on the delivery route of each instant delivery order in the set delivery area and connecting the delivery nodes to one another, the line between two connected delivery nodes forming an edge, so as to form a delivery node network $G=(V,E)$, where $V$ is the set of delivery nodes and $E$ is the relationship matrix of the delivery nodes, $E=\{e_{ij}\mid 1\le i\le j\le n_e\}$; $e_{ij}$ denotes the degree of association between the $i$-th and the $j$-th delivery node in $G$ and takes the value 0 or 1, where 0 means no association and 1 means association;
S42, calculating the transition probabilities among the delivery nodes in the delivery node sequence set $U$ with the transition probability formula;
S43, obtaining a random-walk node sequence over the delivery node set $V$ through a second-order random walk, and vectorizing the delivery nodes in the random-walk node sequence with the Skip-gram method to obtain the delivery node vector matrix $E_u$; then using $E_u$ to vectorize the delivery node sequence set $U$ into the delivery node sequence feature matrix $F_u$;
S44, inputting the node sequence feature matrix $F_u$ into the convolutional neural network module, and applying $n_c$ convolution kernels of size $n_k\times n_d$ to $F_u\in\mathbb{R}^{n_r\times n_d}$ with a stride of 1 to obtain the corresponding calculated values $a_x^k$; all calculated values $a_x^k$ of one kernel are spliced to form the combination matrix $A_k$, where $1\le k\le n_c$ and $1\le x\le n_r-n_k+1$; the calculated value $a_x^k$ and the combination matrix $A_k$ are expressed as follows:
$$a_x^k=\sigma\left(W_k\cdot F_u^{\,x:x+n_k-1}+b_k\right)$$
$$A_k=\left[a_1^k,\ a_2^k,\ \dots,\ a_{n_r-n_k+1}^k\right]$$
where $\sigma$ is an activation function; $W_k$ and $b_k$ are training parameters; $F_u^{\,x:x+n_k-1}$ denotes the node features of the delivery nodes selected by the $k$-th convolution kernel at the $x$-th window, i.e. the $x$-th to the $(x+n_k-1)$-th node features, the window size being $n_k$; $n_r\times n_d$ indicates that $F_u$ is a matrix with $n_r$ rows and $n_d$ columns; and $a_x^k$ denotes the output of the $k$-th convolution kernel at the $x$-th window;
S45, splicing the combination matrices $A_k$ computed by the $n_c$ convolution kernels to obtain the output matrix $E_c\in\mathbb{R}^{(n_r-n_k+1)\times n_c}$, i.e. a matrix with $n_r-n_k+1$ rows and $n_c$ columns; setting the size of the pooling layer to $n_p\times 1$ with a stride of 1, and max-pooling $E_c$ to obtain the pooled output matrix $E_K\in\mathbb{R}^{(n_r-n_p-n_k+2)\times n_c}$, i.e. a matrix with $n_r-n_p-n_k+2$ rows and $n_c$ columns;
S46, setting a weight matrix $E_Q\in\mathbb{R}^{(n_r-n_k-n_p+2)\times n_c}$, i.e. a matrix with $n_r-n_k-n_p+2$ rows and $n_c$ columns, each position of which is initialized with a random value; then adding the weight matrix $E_Q$ to the pooled output matrix $E_K$ and passing the sum through a tanh activation function and a linear layer in turn to obtain the addition matrix $E_{kq}\in\mathbb{R}^{n_r\times n_c}$, i.e. a matrix with $n_r$ rows and $n_c$ columns; inputting $E_{kq}$ into $\mathrm{Softmax}$ to obtain the attention weight matrix $E_w$; the addition matrix $E_{kq}$ is expressed as follows:
$$E_{kq}=W_{kq}\tanh\left(E_K+E_Q\right)+b_{kq}$$
where $W_{kq}$ and $b_{kq}$ are the parameters of the linear layer;
S47, vectorizing all route features in the delivery route feature matrix $F_r$ to obtain a new delivery route feature matrix $F_r'$ with $n_r$ rows and one column per vectorized feature dimension, and inputting $F_r'$ into a single-layer GRU model to obtain the output matrix $E_g\in\mathbb{R}^{n_r\times n_g}$, i.e. a matrix with $n_r$ rows and $n_g$ columns;
S48, multiplying $E_w$ and $E_g$ to obtain the multiplication matrix $E_{att}\in\mathbb{R}^{n_g\times n_c}$, i.e. a matrix with $n_g$ rows and $n_c$ columns; and inputting $E_{att}$ into a multi-layer perceptron to obtain the predicted delivery time $\hat{y}$ of the instant delivery order.
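The matrix sizes quoted for the convolution and pooling steps can be checked with a small pure-Python sketch. It is an illustration only: the activation is assumed to be ReLU, the kernel weights are arbitrary constants rather than trained parameters, and the names `conv_over_sequence`, `max_pool_rows` and `pooled` are hypothetical.

```python
def conv_over_sequence(F_u, kernels, biases):
    """Valid convolution along the rows with full-width kernels, stride 1.

    F_u:     n_r x n_d node sequence feature matrix.
    kernels: n_c kernels, each of size n_k x n_d.
    Returns a matrix with (n_r - n_k + 1) rows and n_c columns.
    """
    n_r, n_d, n_k = len(F_u), len(F_u[0]), len(kernels[0])
    E_c = []
    for x in range(n_r - n_k + 1):  # x is the first row of the window
        row = []
        for W, b in zip(kernels, biases):
            z = b + sum(W[i][j] * F_u[x + i][j]
                        for i in range(n_k) for j in range(n_d))
            row.append(max(0.0, z))  # ReLU activation (an assumption)
        E_c.append(row)
    return E_c

def max_pool_rows(E_c, n_p):
    """Max pooling with an n_p x 1 window and stride 1 along the rows."""
    rows, cols = len(E_c), len(E_c[0])
    return [[max(E_c[x + i][c] for i in range(n_p)) for c in range(cols)]
            for x in range(rows - n_p + 1)]

n_r, n_d, n_k, n_c, n_p = 6, 3, 2, 4, 2
F_u = [[float(r + d) for d in range(n_d)] for r in range(n_r)]
kernels = [[[0.1 * (k + 1)] * n_d for _ in range(n_k)] for k in range(n_c)]
biases = [0.0] * n_c

E_c = conv_over_sequence(F_u, kernels, biases)
pooled = max_pool_rows(E_c, n_p)
```

With $n_r=6$, $n_k=2$, $n_c=4$ and $n_p=2$, the convolution output has $6-2+1=5$ rows and the pooled output has $6-2-2+2=4$ rows, each with 4 columns.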
As a still further aspect of the invention, the transition probability formula is as follows:
$$P(u_i\mid u_{i-1})=\frac{\alpha\, w_{u_{i-1}u_i}}{\sum_{v\in N(u_{i-1})}\alpha\, w_{v u_{i-1}}}$$
where $P(u_i\mid u_{i-1})$ denotes the probability of moving from delivery node $u_{i-1}$ to delivery node $u_i$; $\alpha$ denotes the random-walk parameter; $w_{u_{i-1}u_i}$ denotes the weight of the edge between delivery nodes $u_{i-1}$ and $u_i$; $N(u_{i-1})$ denotes the set of delivery nodes connected to delivery node $u_{i-1}$; $v$ denotes the $v$-th delivery node in $N(u_{i-1})$; and $w_{v u_{i-1}}$ denotes the weight of the edge between delivery nodes $v$ and $u_{i-1}$.
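A minimal sketch of a second-order transition probability follows. It assumes the standard node2vec bias for $\alpha$ (weight $1/p$ for returning to the previous node, $1$ for a node adjacent to the previous node, $1/q$ otherwise), with $\alpha$ evaluated separately for each candidate; the graph, the parameter values and the function name `transition_probs` are hypothetical.

```python
def transition_probs(graph, prev, cur, p=1.0, q=1.0):
    """Second-order transition probabilities out of `cur`, given `prev`.

    graph: dict mapping node -> dict of neighbour -> edge weight (undirected).
    The bias alpha follows the standard node2vec rule (an assumption here):
    1/p to return to `prev`, 1 for neighbours of `prev`, 1/q otherwise.
    """
    unnormalized = {}
    for nxt, weight in graph[cur].items():
        if nxt == prev:
            alpha = 1.0 / p
        elif nxt in graph[prev]:
            alpha = 1.0
        else:
            alpha = 1.0 / q
        unnormalized[nxt] = alpha * weight
    total = sum(unnormalized.values())
    return {node: value / total for node, value in unnormalized.items()}

graph = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0, "d": 1.0},
    "c": {"a": 1.0, "b": 1.0},
    "d": {"b": 1.0},
}
probs = transition_probs(graph, prev="a", cur="b", p=2.0, q=4.0)
```

Here the walk arrived at `b` from `a`. Node `c` is adjacent to `a` and keeps full weight, returning to `a` is damped by $1/p$, and moving outward to `d` is damped by $1/q$.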
As a still further aspect of the invention, the loss function used by the convolutional neural network module is the mean square error loss function, expressed as follows:
$$L_{time}=\frac{1}{N}\sum_{n=1}^{N}\left(y_n-\hat{y}_n\right)^2$$
where $L_{time}$ is the mean square error; $y_n$ is the actual delivery time of the $n$-th instant delivery order; $\hat{y}_n$ is the predicted delivery time of the $n$-th instant delivery order; and $N$ is the total amount of data.
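As a quick worked example of this loss, the sketch below computes the mean square error for three orders. The delivery times, here taken to be in minutes, are made-up values and `mse_loss` is a hypothetical helper name.

```python
def mse_loss(y_true, y_pred):
    """Mean square error between actual and predicted delivery times."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)

# Errors of -2, 0 and +6 minutes give (4 + 0 + 36) / 3.
loss = mse_loss([30.0, 45.0, 20.0], [28.0, 45.0, 26.0])
```

Because the errors are squared, the single 6-minute miss dominates the loss, which is why this loss penalizes large deviations more heavily than small ones.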
Compared with the prior art, the invention has the following beneficial effects:
1. After the delivery route is determined, the invention represents the order delivery route by the selected delivery nodes. This representation reduces the difficulty of representing the route and greatly alleviates route sparsity; moreover, because the whole route is predicted at once, it avoids the growth of error caused by segment-by-segment prediction.
2. When predicting the route, the invention calculates the cosine similarity between nodes, which fully represents the neighbor relationships between task nodes and the similarity of their task urgency, and effectively simulates a realistic selection strategy.
3. The invention represents the delivery nodes with a graph embedding method and uses a convolutional neural network to extract the spatial features of the delivery route, thereby effectively extracting the spatial relationships among the delivery nodes.
4. The invention uses a gated recurrent unit (GRU) model to extract the temporal relationships present in the delivery route features.
5. Based on the attention mechanism, the outputs of the convolutional neural network and the GRU model are used as the parameter matrices of the attention network, so that the spatio-temporal associations in the features are learned jointly and the features are extracted effectively.
Drawings
FIG. 1 is a flow chart of the operation of the present invention.
FIG. 2 is a schematic diagram of the method of the present invention.
FIG. 3 is a structural diagram of the model of the present invention.
FIG. 4 is a graph showing the experimental results of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1 to fig. 4, in an embodiment of the present invention, an intelligent prediction method for delivery time based on a space-time attention mechanism is performed according to the following steps:
1. Acquiring delivery order information, collecting the feature data related to the orders, and constructing a data set;
1.1, acquiring the order information to be delivered at a given moment, taking the merchant addresses and customer addresses as delivery nodes, and summarizing them to obtain the node set to be traversed $L=\{l_1, l_2, \dots, l_{n_l}\}$, where $n_l$ is the size of the delivery node set $L$, i.e. the total number of delivery nodes in the delivery area;
1.2, acquiring the node features of each delivery node according to the order features, the node features comprising the order number, rider speed, merchant address, customer address, distance between merchant and customer, and timestamp, and storing the node features in the node feature matrix $F_l$; this gives the features of all delivery nodes $F_l=\{f_1, f_2, \dots, f_{n_l}\}\in\mathbb{R}^{n_l\times m_l}$, where $f_l$ denotes the node features of the $l$-th delivery node and $m_l$ is the number of node features.
2. According to the node features, the delivery route prediction module predicts the traversal order of the delivery nodes:
2.1, at the $t$-th decision point ($1\le t\le n_t$, where $n_t$ is the total number of decision points), using formula (1) to calculate the cosine similarity between each node feature vector and the other node feature vectors, where $Sim(l,\varphi)$ denotes the cosine similarity of the node feature vectors of the $l$-th and the $\varphi$-th node;
$$Sim(l,\varphi)=\frac{f_l\cdot f_\varphi}{\lVert f_l\rVert\,\lVert f_\varphi\rVert}\tag{1}$$
2.2, according to the cosine similarity results, selecting the other delivery node with the highest cosine similarity as the adjacent delivery node, and combining the information of the adjacent delivery node with the node features of the original delivery node to construct a new node feature matrix;
2.3, inputting the new node feature matrix into a deep neural network module, which is a stack of $k_l$ linear layers, $k_a$ ReLU activation functions and $k_b$ normalization layers; the module calculates and outputs the score of each candidate delivery node;
2.4, according to the order completion status at the $t$-th decision point, using a mask matrix $M_t$ of 0s and 1s to represent the state of the delivery nodes at that moment: nodes already traversed and delivery nodes of orders not yet picked up are marked 0, meaning not selectable, i.e. unreachable; 1 means the node can be selected, i.e. is reachable;
2.5, multiplying the mask matrix $M_t$ by the score matrix $S_t$ to obtain the score matrix $\hat{S}_t$ of the selectable delivery nodes at the $t$-th decision point, and inputting $\hat{S}_t$ into the $\mathrm{Softmax}$ function as shown in formula (2), where $\hat{s}_t^{\,l}$ denotes the score of the $l$-th node, to obtain the selection probability of each candidate delivery node; the delivery node with the highest probability is selected as the predicted next location;
$$\mathrm{Softmax}(\hat{s}_t^{\,l})=\frac{\exp(\hat{s}_t^{\,l})}{\sum_{c=1}^{n_l}\exp(\hat{s}_t^{\,c})}\tag{2}$$
2.6, updating the mask matrix and the node features according to the prediction result, returning to 2.1, and repeating the steps until the order of all nodes has been predicted.
2.7, during training, the delivery route prediction module uses the cross-entropy function as the loss function, as shown in formula (3), where $Z$ is the number of training samples, $n_t^{z}$ is the number of decision points of the $z$-th task, and $\hat{p}_t^{\,z}$ is the predicted probability of the location actually selected at the $t$-th decision point.
$$L_{seq}=-\frac{1}{Z}\sum_{z=1}^{Z}\sum_{t=1}^{n_t^{z}}\log \hat{p}_t^{\,z}\tag{3}$$
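The loop of steps 2.4 to 2.6 can be sketched as a greedy decoder. This is an illustration under stated assumptions: `score_fn` is a hypothetical stand-in for the trained scoring network, nodes are represented as `(order_no, kind)` pairs, the similarity-based feature update of steps 2.1 and 2.2 is omitted, and unreachable nodes are excluded from the normalization instead of being given a score of 0.

```python
import math

def decode_route(nodes, score_fn):
    """Greedy route decoding with a reachability mask.

    nodes:    list of (order_no, kind), kind in {"merchant", "customer"}.
    score_fn: hypothetical stand-in for the trained network; maps
              (route so far, candidate index) to a real-valued score.
    """
    visited, picked_up, route = set(), set(), []
    while len(route) < len(nodes):
        # Mask: 0 for visited nodes and for customers whose order has not
        # been picked up yet, 1 otherwise.
        mask = [0 if i in visited or (kind == "customer" and order not in picked_up)
                else 1
                for i, (order, kind) in enumerate(nodes)]
        scores = [score_fn(route, i) for i in range(len(nodes))]
        exps = [math.exp(s) if m else 0.0 for s, m in zip(scores, mask)]
        total = sum(exps)
        probs = [e / total for e in exps]
        nxt = max(range(len(nodes)), key=probs.__getitem__)
        route.append(nxt)
        visited.add(nxt)
        if nodes[nxt][1] == "merchant":
            picked_up.add(nodes[nxt][0])  # the customer of this order is now reachable
    return route

nodes = [(1, "merchant"), (1, "customer"), (2, "merchant"), (2, "customer")]
# Toy scorer that prefers higher indices; a real model would use node features.
route = decode_route(nodes, lambda route, i: float(i))
```

Even though the toy scorer prefers node 3, the mask forces the merchant of order 2 (node 2) to be visited first, so every customer is reached only after the corresponding pickup.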
3. According to the delivery route prediction result, constructing the delivery route feature matrix $F_r\in\mathbb{R}^{n_r\times m_r}$ and the delivery node sequence $U=\{u_1, u_2, \dots, u_{n_r}\}$, where $n_r$ and $m_r$ are the number of nodes and the number of features of the delivery route, respectively.
4. According to the delivery route features, the time prediction module predicts the delivery time of each order:
4.1, constructing the delivery node network $G=(V,E)$ according to the connections among the delivery nodes, where $V$ is the set of nodes and $E=\{e_{ij}\mid 1\le i\le j\le n_e\}$; $e_{ij}$ denotes the degree of association between the $i$-th and the $j$-th delivery node in $G$ and takes the value 0 or 1, where 0 means no association and 1 means association; $n_e$ is the total number of nodes.
4.2, based on the node network, obtaining embedded representations of the nodes with the Node2vec method:
4.2.1, calculating the transition probabilities between nodes as in formula (4), where $P(u_i\mid u_{i-1})$ denotes the probability of moving from delivery node $u_{i-1}$ to delivery node $u_i$; $\alpha$ denotes the random-walk parameter; $w_{u_{i-1}u_i}$ denotes the weight of the edge between delivery nodes $u_{i-1}$ and $u_i$; $N(u_{i-1})$ denotes the set of delivery nodes connected to delivery node $u_{i-1}$; $v$ denotes the $v$-th delivery node in $N(u_{i-1})$; and $w_{v u_{i-1}}$ denotes the weight of the edge between delivery nodes $v$ and $u_{i-1}$.
$\alpha$ is calculated as in formula (5), where $p$ and $q$ are preset parameters and $d_{i,i-2}$ denotes the shortest distance between the previous delivery node $u_{i-2}$ and the adjacent delivery node: $d_{i,i-2}=0$ when the two delivery nodes are directly connected, $d_{i,i-2}=1$ when there is an intermediate delivery node between them, and $d_{i,i-2}=2$ in all other cases;
$$P(u_i\mid u_{i-1})=\frac{\alpha\, w_{u_{i-1}u_i}}{\sum_{v\in N(u_{i-1})}\alpha\, w_{v u_{i-1}}}\tag{4}$$
$$\alpha=\begin{cases}1/p, & d_{i,i-2}=0\\ 1, & d_{i,i-2}=1\\ 1/q, & d_{i,i-2}=2\end{cases}\tag{5}$$
4.2.2, obtaining random walk node sequences through second-order random walks, and vectorizing the generated nodes with the Skip-gram method to obtain the node vector matrix $E_u$, where $n_d$ is the vectorization length; then using $E_u$ to vectorize the delivery node sequence $U$ into the node sequence feature matrix $F_u$;
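The second-order random walk of step 4.2 can be sketched as follows. This follows the commonly published Node2vec convention, which is an assumption here and may differ from the wording above: the bias is $1/p$ for stepping back to the previous node, 1 for a candidate that is also adjacent to the previous node, and $1/q$ otherwise. Unit edge weights, normalization over the biased weights, and the names `transition_probs` and `random_walk` are likewise illustrative choices.

```python
import random


def transition_probs(adj, prev, cur, p, q):
    """Probabilities of leaving `cur`, having arrived there from `prev`."""
    weights = {}
    for nxt in adj[cur]:
        if nxt == prev:          # stepping back to the previous node
            alpha = 1.0 / p
        elif nxt in adj[prev]:   # candidate is also a neighbor of prev
            alpha = 1.0
        else:                    # candidate moves further away from prev
            alpha = 1.0 / q
        weights[nxt] = alpha * 1.0  # assumption: every edge has weight 1
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


def random_walk(adj, start, length, p, q, rng):
    """Generate one biased walk containing `length` nodes."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        if not adj[cur]:
            break  # dead end
        if len(walk) == 1:
            walk.append(rng.choice(sorted(adj[cur])))  # first step is uniform
            continue
        probs = transition_probs(adj, walk[-2], cur, p, q)
        nodes = sorted(probs)
        walk.append(rng.choices(nodes, weights=[probs[n] for n in nodes])[0])
    return walk


adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
probs = transition_probs(adj, prev=0, cur=1, p=2.0, q=0.5)
walk = random_walk(adj, start=0, length=6, p=2.0, q=0.5, rng=random.Random(0))
```

The walks produced this way would then be fed to a Skip-gram model to learn the node vectors; that training step is not shown.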
4.3, inputting the node sequence feature matrix $F_u$ into a convolutional neural network for feature extraction:
4.3.1, as shown in formulas (6) and (7), using $n_c$ convolution kernels of size $n_k \times n_d$ to compute over the delivery node sequence feature matrix $F_u$ with a stride of 1 to obtain the corresponding calculated values, and concatenating all calculated values to form the combined matrix $A_k$, where $1 \le k \le n_k$ and $1 \le x \le n_r - n_k + 1$; the calculated values and the combined matrix $A_k$ are expressed as follows:
(6)
(7)
where the function applied in formula (6) is an activation function; $W_k$ and $b_k$ are training parameters; the input term denotes the node features of the delivery nodes selected by the $k$-th convolution kernel at the $x$-th window, i.e. the $x$-th to the $(x+n_k-1)$-th node features, the window size being $n_k$; $n_r \times n_d$ indicates that the delivery node sequence feature matrix $F_u$ is a matrix with $n_r$ rows and $n_d$ columns; the output term denotes the output calculated by the $k$-th convolution kernel at the $x$-th window;
4.3.2, concatenating the $A_k$ calculated by all convolution kernels to obtain the output matrix $E_c$; setting the pooling layer size to $n_p \times 1$ with a stride of 1, and then applying max pooling to obtain the pooled output matrix;
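The shapes in step 4.3 can be checked with a small pure-Python sketch: $n_c$ kernels of size $n_k \times n_d$ slide over the $n_r \times n_d$ matrix with stride 1, giving $(n_r - n_k + 1) \times n_c$ outputs, and an $n_p \times 1$ max pooling with stride 1 then gives $(n_r - n_k - n_p + 2) \times n_c$. The function names and the choice of ReLU as the activation function are assumptions; the text does not name the activation.

```python
def conv_over_sequence(F_u, kernels, biases):
    """Slide each n_k x n_d kernel down the rows of F_u with stride 1."""
    n_r = len(F_u)
    n_k = len(kernels[0])
    E_c = []
    for x in range(n_r - n_k + 1):           # one output row per window
        row = []
        for W_k, b_k in zip(kernels, biases):
            s = b_k
            for dx in range(n_k):
                for col, w in enumerate(W_k[dx]):
                    s += w * F_u[x + dx][col]
            row.append(max(0.0, s))          # assumption: ReLU activation
        E_c.append(row)
    return E_c


def max_pool_rows(E_c, n_p):
    """n_p x 1 max pooling with stride 1, applied column by column."""
    n_c = len(E_c[0])
    return [[max(E_c[r + d][c] for d in range(n_p)) for c in range(n_c)]
            for r in range(len(E_c) - n_p + 1)]


F_u = [[float(i), float(i)] for i in range(6)]     # n_r = 6, n_d = 2
kernels = [[[1.0, 1.0]] * 3, [[-1.0, -1.0]] * 3]   # n_c = 2, n_k = 3
E_c = conv_over_sequence(F_u, kernels, biases=[0.0, 0.0])
pooled = max_pool_rows(E_c, n_p=2)
```

With $n_r = 6$, $n_k = 3$ and $n_p = 2$, `E_c` has 4 rows and `pooled` has 3 rows, matching the two row counts stated above.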
4.4, as shown in formula (8), setting a weight matrix $E_Q$ and initializing it, then adding $E_Q$ to the pooled output matrix to obtain the output $E_{kq}$;
(8)
4.5, inputting $E_{kq}$ into Softmax to calculate the attention weights;
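Steps 4.4 and 4.5 can be sketched as below, using the tanh activation and linear layer that step S46 of the claims names. Several details are assumptions made only so that the example runs: the linear layer keeps the number of columns unchanged, the Softmax is taken along each row, and the names `softmax` and `attention_weights` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(pooled, E_Q, W_kq, b_kq):
    """tanh(pooled + E_Q), then a linear layer, then a row-wise softmax."""
    weights = []
    for p_row, q_row in zip(pooled, E_Q):
        hidden = [math.tanh(a + b) for a, b in zip(p_row, q_row)]
        # Linear layer: scores[j] = sum_i hidden[i] * W_kq[i][j] + b_kq[j]
        scores = [sum(h * W_kq[i][j] for i, h in enumerate(hidden)) + b_kq[j]
                  for j in range(len(b_kq))]
        weights.append(softmax(scores))  # assumption: softmax along each row
    return weights


pooled = [[12.0, 0.0], [18.0, 0.0]]
E_Q = [[0.0, 0.0], [0.0, 0.0]]       # the method initializes E_Q randomly
identity = [[1.0, 0.0], [0.0, 1.0]]
attn = attention_weights(pooled, E_Q, identity, [0.0, 0.0])
```

Each row of `attn` sums to 1, so it can be used to weight the GRU output in the later steps.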
4.6, vectorizing all features in the delivery route feature matrix $F_r$ to obtain new route features, and inputting them into a single-layer GRU model to obtain the output $E_g$, where $n_g$ is the output size of the GRU;
4.7, multiplying the attention weights and $E_g$ to obtain $E_{att}$, and inputting $E_{att}$ into a multi-layer perceptron to obtain the final predicted time;
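4.8, the time prediction module uses the mean squared error as its loss function, as shown in formula (9), where $y_n$ is the actual delivery time of the $n$-th sample, $\hat{y}_n$ is the corresponding predicted delivery time, and $N$ is the total number of data samples.
$$L_{time} = \frac{1}{N}\sum_{n=1}^{N}\left(y_n - \hat{y}_n\right)^2 \qquad (9)$$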
5. Determining the training batch size and the learning rate of the method by minimizing the loss function, and training the model on instant delivery data to obtain the instant delivery time prediction model.
To verify the effectiveness of the method, an experiment was carried out on a real data set. The experimental environment was as follows: Windows operating system, Intel(R) Core(TM) i5-8500 CPU, 16 GB memory, and the PyTorch deep learning framework.
Experimental data: a delivery data set from a certain platform was selected as the experimental data. The data set was collected in one region over a period in 2020 running from February 1 to the 27th. The data set includes four kinds of information: rider data, order data, distance data, and behavior data; specific statistics are shown in Table 1. Data integration was performed on the basis of the order data, and the constructed data set contains 254802 delivery order records in total, with 80% of the data used for training and 20% used for testing.
Table 1 data set statistics
The experimental results in Fig. 4 show that the values predicted by the invention fit the true values well, which indicates that the invention can accurately predict delivery times for instant delivery and demonstrates the feasibility of the method.
The foregoing shows and describes the basic principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made therein without departing from the spirit and scope of the invention, which is defined by the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (9)
1. An intelligent prediction method for delivery time based on a space-time attention mechanism, characterized by comprising the following steps:
S1, acquiring order features of an instant delivery order, the order features including a merchant address and a customer address; taking the merchant address and the customer address as delivery nodes, and collecting node features of the delivery nodes;
S2, predicting the traversal order of the delivery nodes according to the node features, and recording the traversal order as a delivery route;
S3, obtaining route features of the delivery route;
S4, predicting the delivery time of each instant delivery order according to the route features;
the specific steps of step S4 are as follows:
S41, acquiring the delivery nodes on each delivery route corresponding to each instant delivery order in the set delivery area, connecting the delivery nodes with one another, each connection between two connected delivery nodes forming an edge, so as to form the delivery node network $G = (V, E)$, where $V$ denotes the delivery node set formed by the delivery nodes; $E$ is the relation matrix of the delivery nodes, $E = \{e_{ij} \mid 1 \le i \le j \le n_e\}$; $e_{ij}$ denotes the degree of association between the $i$-th and the $j$-th delivery nodes in the node network $G$ and takes the value 0 or 1, where 0 indicates no association and 1 indicates association; $n_e$ is the total number of nodes;
S42, calculating, by means of the transition probability calculation formula, the transition probabilities among the delivery nodes in the delivery node sequence set $U$;
S43, obtaining random walk node sequences over the delivery node set $V$ through second-order random walks, and vectorizing the delivery nodes in the random walk node sequences with the Skip-gram method to obtain the delivery node vector matrix $E_u$; then using the delivery node vector matrix $E_u$ to vectorize the delivery node sequence set $U$ into the delivery node sequence feature matrix $F_u$;
S44, inputting the node sequence feature matrix $F_u$ into the convolutional neural network module, using $n_c$ convolution kernels of size $n_k \times n_d$ to compute over the delivery node sequence feature matrix $F_u$ with a stride of 1 to obtain the corresponding calculated values, and concatenating all calculated values to form the combined matrix $A_k$, where $1 \le k \le n_k$ and $1 \le x \le n_r - n_k + 1$; the calculated values and the combined matrix $A_k$ are expressed as follows:
where the function applied is an activation function; $W_k$ and $b_k$ are training parameters; the input term denotes the node features of the delivery nodes selected by the $k$-th convolution kernel at the $x$-th window, i.e. the $x$-th to the $(x+n_k-1)$-th node features, the window size being $n_k$; $n_r \times n_d$ indicates that the delivery node sequence feature matrix $F_u$ is a matrix with $n_r$ rows and $n_d$ columns; the output term denotes the output calculated by the $k$-th convolution kernel at the $x$-th window;
S45, concatenating the combined matrices $A_k$ calculated by the $n_c$ convolution kernels to obtain the output matrix $E_c$, where $(n_r - n_k + 1) \times n_c$ indicates that the output matrix $E_c$ is a matrix with $n_r - n_k + 1$ rows and $n_c$ columns; setting the pooling layer size to $n_p \times 1$ with a stride of 1, and then applying max pooling to the output matrix $E_c$ to obtain the pooled output matrix, where $(n_r - n_p - n_k + 2) \times n_c$ indicates that the pooled output matrix is a matrix with $n_r - n_p - n_k + 2$ rows and $n_c$ columns;
S46, setting a weight matrix $E_Q$, each position of which takes a random initial value, where $(n_r - n_k - n_p + 2) \times n_c$ indicates that $E_Q$ is a matrix with $n_r - n_k - n_p + 2$ rows and $n_c$ columns; then adding the weight matrix $E_Q$ and the pooled output matrix, and passing the sum sequentially through a tanh activation function and a linear layer to obtain the addition matrix $E_{kq}$, where $n_r \times n_c$ indicates that the addition matrix $E_{kq}$ is a matrix with $n_r$ rows and $n_c$ columns; inputting $E_{kq}$ into Softmax to calculate the attention weight matrix; the addition matrix $E_{kq}$ is expressed as follows:
where $W_{kq}$ and $b_{kq}$ are both parameters of the linear layer;
S47, vectorizing all route features in the delivery route feature matrix $F_r$ to obtain a new delivery route feature matrix with $n_r$ rows, and inputting the new delivery route feature matrix into a single-layer GRU model to obtain the output matrix $E_g$, where $n_r \times n_g$ indicates that the output matrix $E_g$ is a matrix with $n_r$ rows and $n_g$ columns;
S48, multiplying the attention weight matrix and $E_g$ to obtain the multiplication matrix $E_{att}$, where $n_g \times n_c$ indicates that the multiplication matrix $E_{att}$ is a matrix with $n_g$ rows and $n_c$ columns; and inputting $E_{att}$ into a multi-layer perceptron for calculation to obtain the predicted delivery time of the instant delivery order.
2. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 1, characterized in that the specific steps of step S1 are as follows:
S11, acquiring the order features of each instant delivery order, the order features including order number, rider speed, merchant address, customer address, distance between merchant and customer, and timestamp;
S12, taking the merchant addresses and the customer addresses as delivery nodes, collecting all delivery nodes to be traversed in the set delivery area, and storing them in the delivery node set $L = \{l_1, l_2, \ldots, l_{n_l}\}$, where $l_1$ denotes the 1st delivery node in $L$; $l_2$ denotes the 2nd delivery node in $L$; $l_{n_l}$ denotes the $n_l$-th delivery node in $L$; and $n_l$ denotes the size of $L$, i.e. the total number of delivery nodes in the delivery area;
S13, acquiring the node features of each delivery node according to the order features, the node features including order number, rider speed, merchant address, customer address, distance between merchant and customer, and timestamp, and storing the node features in the node feature matrix $F_l$.
3. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 2, characterized in that the specific steps of step S2 are as follows:
S21, calculating the similarity between all delivery nodes with the cosine similarity calculation formula, according to the node features of the delivery nodes stored in the node feature matrix $F_l$;
S22, selecting a delivery node in the node feature matrix $F_l$ as the original node and, according to the similarity calculation result, selecting the other delivery node with the highest similarity to the original node as its adjacent node, wherein an original node that has already been used is excluded from the selection range of the next original node;
S23, horizontally concatenating the node feature matrices of the original node and its adjacent node to obtain a spliced matrix, and overwriting the node feature matrix of the original node with the spliced matrix to update the node features; the spliced matrices of all delivery nodes in the set delivery area are stored in the spliced node feature matrix $F_P$;
S24, inputting the node features of each delivery node in the spliced node feature matrix $F_P$ into a neural network to obtain the score of each delivery node, the score being the probability that the delivery node is taken as the delivery target, and storing the score of each delivery node in a score matrix;
S25, defining a delivery node with several adjacent delivery nodes as a decision point and, according to the order completion status at the decision point, using a mask matrix of 0s and 1s to represent the state of each delivery node when the rider is at the decision point; a traversed delivery node, or a delivery node of an instant delivery order whose goods have not yet been picked up, is represented by 0, indicating that the delivery node is not reachable; a reachable delivery node is represented by 1;
S26, multiplying the mask matrix and the score matrix to obtain a score matrix containing the scores of all delivery nodes; inputting the scores of the delivery nodes in this score matrix into a probability calculation function to obtain the selection probability of each delivery node, and selecting the delivery node with the highest selection probability as the predicted next location for the decision point;
S27, returning to step S21 and repeating steps S21 to S26 until the traversal order of all delivery nodes has been predicted, and recording the traversal order as the delivery route.
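The masking and selection loop of steps S25 to S27 can be illustrated with a simplified sketch. The names `predict_route` and `pickup_of` are hypothetical, and two simplifications are assumptions: the scores are fixed rather than recomputed by a neural network at every step, and the final choice is restricted to reachable nodes so that a masked node can never be selected.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_route(scores, pickup_of):
    """Greedy route decoding with a 0/1 reachability mask.

    scores[i] is the score of node i; pickup_of maps a customer node to the
    merchant node that must be visited first (merchants have no entry).
    """
    n = len(scores)
    visited = [False] * n
    route = []
    while len(route) < n:
        mask = [0] * n
        for node in range(n):
            merchant = pickup_of.get(node)
            picked_up = merchant is None or visited[merchant]
            if not visited[node] and picked_up:
                mask[node] = 1                       # reachable
        masked = [m * s for m, s in zip(mask, scores)]
        probs = softmax(masked)
        reachable = [i for i in range(n) if mask[i] == 1]
        nxt = max(reachable, key=lambda i: probs[i])  # most probable reachable node
        visited[nxt] = True
        route.append(nxt)
    return route


# Nodes 0 and 2 are merchants; nodes 1 and 3 are their customers.
route = predict_route([0.2, 0.9, 0.5, 0.1], pickup_of={1: 0, 3: 2})
```

Although customer node 1 has the highest score, it only becomes reachable after merchant node 0 has been visited, so the predicted order is 2, 0, 1, 3.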
4. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 3, characterized in that the cosine similarity calculation formula is as follows:
$$Sim(f_l, f_\varphi) = \frac{f_l \cdot f_\varphi}{\lVert f_l \rVert \, \lVert f_\varphi \rVert}$$
where $Sim$ denotes the cosine similarity function; $Sim(f_l, f_\varphi)$ denotes the similarity between the $l$-th delivery node and the $\varphi$-th delivery node; $f_l$ denotes the node feature matrix of the $l$-th delivery node; $f_\varphi$ denotes the node feature matrix of the $\varphi$-th delivery node.
5. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 3 or 4, characterized in that the probability calculation function is calculated as follows:
where $Softmax$ denotes the normalization function; the two score terms denote the score values of the $l$-th delivery node and of the $c$-th delivery node at the $t$-th decision point, respectively; $n_l$ denotes the total number of delivery nodes in the set delivery area.
6. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 5, characterized in that the loss function of the neural network is a cross-entropy function, calculated as follows:
where $L_{seq}$ denotes the loss value of the loss function; $Z$ is the number of training sets of the neural network; the two remaining symbols denote the number of decision points of the $z$-th task and the predicted probability, at the $t$-th decision point, of the location that was actually selected.
7. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 6, characterized in that the specific steps of step S3 are as follows:
S31, constructing the route features of the delivery route according to the node features of each delivery node on the delivery route, the route features including order number, rider speed, merchant address, customer address, distance between merchant and customer, and timestamp, and storing the route features in the delivery route feature matrix $F_r$;
S32, storing the delivery nodes in order in the delivery node sequence set $U$ according to the traversal order of the delivery nodes in the delivery route.
8. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 7, characterized in that the transition probability is calculated as follows:
where $P(u_i \mid u_{i-1})$ denotes the probability of moving from delivery node $u_{i-1}$ to delivery node $u_i$; $\alpha$ denotes the random walk parameter; the remaining symbols denote, in order, the weight of the edge between delivery nodes $u_{i-1}$ and $u_i$, the set of delivery nodes connected to delivery node $u_{i-1}$, and, for the $v$-th delivery node in that set, the weight of the edge between delivery node $v$ and delivery node $u_{i-1}$.
9. The intelligent prediction method for delivery time based on a space-time attention mechanism according to claim 8, characterized in that the loss function used by the convolutional neural network module is a mean squared error loss function, expressed as follows:
$$L_{time} = \frac{1}{N}\sum_{n=1}^{N}\left(y_n - \hat{y}_n\right)^2$$
where $L_{time}$ denotes the mean squared error; $y_n$ is the actual delivery time of the $n$-th instant delivery order; $\hat{y}_n$ is the predicted delivery time of the $n$-th instant delivery order; and $N$ is the total number of data samples.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310875592.0A CN116596170B (en) | 2023-07-18 | 2023-07-18 | Intelligent prediction method for delivery time based on space-time attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116596170A CN116596170A (en) | 2023-08-15 |
CN116596170B true CN116596170B (en) | 2023-09-22 |
Family
ID=87612071
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310875592.0A Active CN116596170B (en) | 2023-07-18 | 2023-07-18 | Intelligent prediction method for delivery time based on space-time attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116596170B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||