CN117073703A

CN117073703A - Method and device for generating vehicle path problem solution

Info

Publication number: CN117073703A
Application number: CN202310968598.2A
Authority: CN
Inventors: 王振坤; 罗福; 柳斐; 林熙; 张青富
Original assignee: Southwest University of Science and Technology
Current assignee: Southwest University of Science and Technology
Priority date: 2023-08-02
Filing date: 2023-08-02
Publication date: 2023-11-17

Abstract

The invention discloses a method and a device for generating a vehicle path problem solution. The method comprises: acquiring vehicle path problem information to be processed, and determining a starting city, an ending city and a plurality of necessary cities according to the vehicle path problem information; inputting the vehicle path problem information into a trained solution generation model, wherein the solution generation model adopts an encoder and a decoder as its model framework, and the decoder comprises a plurality of attention layers; processing the vehicle path problem information in the encoder to obtain a target node embedded vector matrix; and processing all node vectors in the target node embedded vector matrix by the decoder to obtain a shortest path from the starting city to the ending city that passes through all the necessary cities. Through the multiple attention layers in the decoder, the invention can better extract the relations among a plurality of cities, thereby improving the quality of the path planning result.

Description

Method and device for generating vehicle path problem solution

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a method and a device for generating a vehicle path problem solution.

Background

In the fields of logistics transportation and travel planning, many cities are involved and the scenarios are complex. How to plan a good path, that is, how to obtain the shortest path from the starting city to the ending city that passes through all the necessary cities and thereby reduce transportation cost, is a problem that urgently needs to be solved.

Many learning-based neural combinatorial optimization methods have been proposed and applied to real-life path planning problems to obtain the shortest path. Such methods do not require manually designed rules; instead, they build a neural network model and learn a strategy for generating solutions from data through supervised learning or reinforcement learning, which can significantly reduce development cost. However, current learning-based neural combinatorial optimization models typically adopt an encoder and decoder architecture in which the encoder contains multiple attention layers, while the decoder performs only one simple attention calculation and one compatibility calculation. The drawback of such models is that, because the decoder structure is simple, the complex relations among cities cannot be captured well when planning paths among a large number of cities, so the quality of the resulting shortest path plan is low.

Accordingly, the prior art has drawbacks and needs to be improved and developed.

Disclosure of Invention

The technical problem to be solved by the invention is that the shortest path planning results of the prior art are of low quality. In view of this defect of the prior art, the invention provides a method and a device for generating a vehicle path problem solution.

The technical scheme adopted for solving the technical problems is as follows:

a method of generating a vehicle path problem solution, wherein the method comprises:

acquiring vehicle path problem information to be processed, and determining a starting city, an ending city and a plurality of necessary cities according to the vehicle path problem information;

inputting the vehicle path problem information into a trained solution generation model, wherein the solution generation model adopts an encoder and a decoder as a model frame, and the decoder comprises a plurality of attention layers;

processing the vehicle path problem information in the encoder to obtain a target node embedded vector matrix;

processing all node vectors in the target node embedded vector matrix by the decoder to obtain a shortest path from the starting city to the ending city that passes through all the necessary cities.

In one implementation, the vehicle path problem information includes characteristic information of the starting city, characteristic information of the ending city, and characteristic information of all the necessary cities; the processing the vehicle path problem information in the encoder to obtain a target node embedded vector matrix comprises the following steps:

taking the starting city as a start node, the ending city as an end node, and all the necessary cities as non-access nodes;

in the linear mapping layer of the encoder, converting the characteristic information of the start node into an initial start node vector, converting the characteristic information of the end node into an initial end node vector, and converting the characteristic information of each non-access node into a corresponding initial non-access node vector;

in the attention layer of the encoder, processing the initial start node vector to obtain a start node vector, processing the initial end node vector to obtain an end node vector, and processing each initial non-access node vector to obtain a corresponding non-access node vector, wherein all the non-access node vectors form a non-access node vector set;

storing the start node vector, the end node vector and the non-access node vector set in matrix form to obtain the target node embedded vector matrix.

In one implementation, the decoder further includes a linear layer; the processing all node vectors in the target node embedded vector matrix by the decoder to obtain a shortest path from the starting city to the ending city that passes through all the necessary cities comprises:

taking the start node vector, the end node vector and the non-access node vector set in the target node embedded vector matrix as input information of the multiple attention layers, extracting node characteristic relations of the input information layer by layer through the multiple attention layers, and outputting a corresponding node characteristic relation after each attention layer;

inputting the node characteristic relation output by the last attention layer into the linear layer to obtain an output quantity set;

processing the output quantity set through a softmax function to obtain the selection probability of each non-access node;

selecting the non-access node corresponding to the maximum value among all the selection probabilities, and taking that non-access node as the selection result;

taking the non-access node vector corresponding to the selection result as the start node vector of the next iteration, and removing the non-access node vector corresponding to the selection result from the non-access node vector set to obtain updated input information;

inputting the updated input information into the multiple attention layers for processing;

when only one non-access node vector exists in the non-access node vector set, ending iteration;

obtaining the shortest path from the starting city to the ending city and passing through all the necessary cities according to the selection result of each iteration;

the input information is used as input data of a first layer of attention layer, and the node characteristic relation output by a previous layer of attention layer is used as input data of a next layer of attention layer.

In one implementation, the processing all node vectors in the target node embedded vector matrix by the decoder to obtain a shortest path from the starting city to the ending city that passes through all the necessary cities further comprises:

extracting a portion of the shortest path according to a preset first rule to obtain a first partial solution;

determining the vehicle path sub-problem information corresponding to the first partial solution;

inputting the vehicle path sub-problem information into the trained scheme generation model to obtain a second partial solution;

in the shortest path, replacing the first partial solution with the second partial solution to obtain a first optimized path;

respectively carrying out path length calculation on the shortest path and the first optimized path to obtain a shortest path distance and a first optimized path distance;

iteratively performing the extraction and replacement on the first optimized path when the shortest path distance is greater than the first optimized path distance;

or iteratively performing the extraction and replacement on the shortest path when the shortest path distance is less than the first optimized path distance;

when reaching a preset iteration termination condition, obtaining a reconstructed shortest path;

the vehicle path sub-problem information comprises characteristic information of a sub-problem starting city, characteristic information of a sub-problem ending city and characteristic information of all cities that the sub-problem must pass through.
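By way of illustration only, the extraction and replacement procedure described above can be sketched as follows. Several elements here are assumptions that the text does not specify: the preset first rule is taken to be the random choice of a contiguous segment, the termination condition is a fixed iteration count, and the hypothetical `nearest_neighbor_solver` merely stands in for the trained model so that the sketch can run on its own.

```python
import math
import random

def path_length(path, coords):
    """Sum of Euclidean distances between consecutive cities on the path."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))

def nearest_neighbor_solver(start, end, middle, coords):
    """Hypothetical stand-in for the trained model: greedy nearest-neighbour
    route from start to end through every city in `middle`."""
    route, remaining = [start], set(middle)
    while remaining:
        nxt = min(remaining, key=lambda c: (math.dist(coords[route[-1]], coords[c]), c))
        route.append(nxt)
        remaining.remove(nxt)
    return route + [end]

def random_reconstruct(path, coords, solver, iterations=50, seed=0):
    """Repeatedly re-solve a random segment and keep the result only if shorter."""
    rng = random.Random(seed)
    best = list(path)
    for _ in range(iterations):          # assumed termination condition
        if len(best) < 4:
            break
        i = rng.randrange(0, len(best) - 3)
        j = rng.randrange(i + 3, len(best))      # segment holds at least 4 cities
        first_partial = best[i:j + 1]            # assumed "first rule": random segment
        second_partial = solver(first_partial[0], first_partial[-1],
                                first_partial[1:-1], coords)
        candidate = best[:i] + second_partial + best[j + 1:]
        if path_length(candidate, coords) < path_length(best, coords):
            best = candidate                     # continue from the optimized path
    return best

coords = {0: (0, 0), 1: (5, 5), 2: (1, 0), 3: (4, 4), 4: (2, 1), 5: (6, 6)}
initial = [0, 1, 2, 3, 4, 5]
improved = random_reconstruct(initial, coords, nearest_neighbor_solver)
print(improved, path_length(improved, coords))
```

Because a candidate is accepted only when it is strictly shorter, the returned path is never longer than the input path, and the first and last cities of every re-solved segment stay in place.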

In one implementation, each of the attention layers includes: a multi-headed attention layer and a feedforward layer;

in each attention layer, the step of extracting the node characteristic relation of the input information comprises the following steps:

Acquiring input data, inputting the input data into a multi-head attention layer for processing, and obtaining a multi-head attention calculation result corresponding to each node vector;

adding each node vector to the corresponding multi-head attention calculation result to obtain an intermediate attention calculation result of each node vector;

inputting the intermediate attention calculation result of each node vector into a feedforward layer for processing to obtain a feedforward layer calculation result of each node vector, and adding the intermediate attention calculation result of each node vector and the feedforward layer calculation result of each node vector to obtain a node characteristic relation of the input data.
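For illustration, one attention layer with the two additions described above can be sketched in plain Python. The scaled dot-product form of the attention, the ReLU activation in the feedforward layer, and the parameter names (`Wq`, `Wk`, `Wv`, `Wo`, `W1`, `W2`) are assumptions; the text specifies only the multi-head attention layer, the feedforward layer and the two additions.

```python
import math
import random

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def multi_head_attention(h, params, num_heads):
    """Assumed scaled dot-product self-attention, split across num_heads heads."""
    dk = len(h[0]) // num_heads
    q, k, v = (matmul(h, params[name]) for name in ("Wq", "Wk", "Wv"))
    heads_out = [[] for _ in h]
    for head in range(num_heads):
        lo, hi = head * dk, (head + 1) * dk
        for i, qi in enumerate(q):
            scores = [sum(a * b for a, b in zip(qi[lo:hi], kj[lo:hi])) / math.sqrt(dk)
                      for kj in k]
            weights = softmax(scores)
            heads_out[i].extend(sum(w * vj[c] for w, vj in zip(weights, v))
                                for c in range(lo, hi))
    return matmul(heads_out, params["Wo"])

def feed_forward(h, params):
    """Two linear maps with an assumed ReLU in between."""
    hidden = [[max(0.0, x) for x in row] for row in matmul(h, params["W1"])]
    return matmul(hidden, params["W2"])

def attention_layer(h, params, num_heads=2):
    # node vector + multi-head attention result -> intermediate attention result
    intermediate = add(h, multi_head_attention(h, params, num_heads))
    # intermediate result + feedforward result -> node characteristic relation
    return add(intermediate, feed_forward(intermediate, params))

def random_params(d, d_ff, seed=0):
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"Wq": mat(d, d), "Wk": mat(d, d), "Wv": mat(d, d), "Wo": mat(d, d),
            "W1": mat(d, d_ff), "W2": mat(d_ff, d)}

h = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.0, 2.0], [3.0, 3.0, 3.0, 3.0]]
out = attention_layer(h, random_params(4, 8))
```

The output has the same shape as the input, which is what allows the node characteristic relation of one attention layer to serve directly as the input data of the next.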

In one implementation, the training step of the solution generation model includes:

constructing an initial scheme generation model, wherein the initial scheme generation model adopts an encoder and a decoder as model frames, the encoder comprises a linear mapping layer and an attention layer, the decoder comprises a plurality of attention layers and a linear layer, and each attention layer comprises a multi-head attention layer and a feedforward layer;

acquiring a training data set, wherein the training data set comprises a plurality of vehicle path problem training information and corresponding labels;

And carrying out model training on the initial scheme generation model by utilizing the training data set to obtain a trained scheme generation model.

In one implementation, the vehicle path problem training information includes feature information of a start training city, feature information of a destination training city, and feature information of a plurality of training-necessary cities; the training data set is utilized to carry out model training on the initial scheme generation model to obtain a trained scheme generation model, and the method comprises the following steps:

inputting a plurality of vehicle path problem training information into an initial scheme generation model;

taking the start training city as a start training node, taking the end training city as an end training node, and taking all the training-necessary cities as unvisited training nodes;

in the linear mapping layer of the encoder, converting the characteristic information of the start training node in each vehicle path problem training information into an initial start training node vector, converting the characteristic information of the end training node into an initial end training node vector, and converting the characteristic information of each unvisited training node into an initial unvisited training node vector;

in the attention layer of the encoder, processing the initial start training node vector to obtain a start training node vector, processing the initial end training node vector to obtain an end training node vector, and processing each initial unvisited training node vector to obtain a corresponding unvisited training node vector, wherein all the unvisited training node vectors form an unvisited training node vector set;

storing the start training node vector, the end training node vector and the unvisited training node vector set in matrix form to obtain a training node embedded vector matrix corresponding to each vehicle path problem training information;

taking the start training node vector, the end training node vector and the unvisited training node vector set in the training node embedded vector matrix as training input information of the multiple attention layers, extracting training node characteristic relations of the training input information layer by layer through the multiple attention layers, and outputting a corresponding training node characteristic relation after each attention layer;

inputting the characteristic relation of the training node output by the last attention layer into the linear layer to obtain a training output quantity set;

processing the training output quantity set through a softmax function to obtain the training selection probability of each unvisited training node;

selecting the unvisited training node corresponding to the maximum value among all the training selection probabilities, and taking that unvisited training node as the training selection result;

taking the unvisited training node vector corresponding to the training selection result as the start training node vector of the next iteration, and removing the unvisited training node vector corresponding to the training selection result from the unvisited training node vector set to obtain updated training input information;

acquiring the label corresponding to the vehicle path problem training information, training the initial scheme generation model with the label as supervision information and with a preset loss function as the objective, updating the model parameters, inputting the updated training input information into the multiple attention layers for processing, and ending the current round of iteration when only one unvisited training node vector remains in the unvisited training node vector set;

after the preset round of training is completed, a trained scheme generation model is obtained;

the training input information is used as training input data of a first attention layer, and the training node characteristic relation output by a previous attention layer is used as training input data of a next attention layer.
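The text does not state which preset loss function is used. Purely as an illustrative assumption, the sketch below uses cross-entropy between the training selection probabilities and the city that the label path visits next, averaged over the construction steps; the function names `step_loss` and `path_loss` are hypothetical.

```python
import math

def step_loss(probs, candidates, label_node, eps=1e-12):
    """Cross-entropy for one construction step: -log p(label_node).
    `candidates` lists the unvisited training nodes in the same order as `probs`."""
    p = probs[candidates.index(label_node)]
    return -math.log(max(p, eps))   # eps guards against log(0)

def path_loss(step_probs, step_candidates, label_path):
    """Average the per-step losses over one labeled path."""
    losses = [step_loss(p, c, y)
              for p, c, y in zip(step_probs, step_candidates, label_path)]
    return sum(losses) / len(losses)

# Two construction steps; the label says to visit city 7 first, then city 9.
step_probs = [[0.25, 0.5, 0.25], [0.5, 0.5]]
step_candidates = [[3, 7, 9], [3, 9]]
label_path = [7, 9]
loss = path_loss(step_probs, step_candidates, label_path)
print(loss)
```

Under this assumed loss, the label city receives probability 0.5 at both steps, so the averaged loss equals ln 2; updating the model parameters to lower it raises the probability assigned to the labeled next city.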

In one implementation, in each attention layer, the step of extracting training node feature relationships of the training input data includes:

acquiring training input data, inputting the training input data into a multi-head attention layer for processing, and obtaining a training multi-head attention calculation result of each training node vector;

adding each training node vector to the corresponding training multi-head attention calculation result to obtain a training intermediate attention calculation result of each training node vector;

inputting the training intermediate attention calculation result of each training node vector into a feedforward layer for processing to obtain a training feedforward layer calculation result of each training node vector, and adding the training intermediate attention calculation result of each training node vector and the training feedforward layer calculation result of each training node vector to obtain a training node characteristic relation of the training input data.

The invention also discloses a device for generating the vehicle path problem solution, wherein the device comprises:

the information acquisition module is used for acquiring vehicle path problem information to be processed and determining a starting city, an ending city and a plurality of necessary cities according to the vehicle path problem information;

The information input module is used for inputting the vehicle path problem information into a trained scheme generation model, the scheme generation model adopts an encoder and a decoder as a model frame, and the decoder comprises a plurality of attention layers;

the encoder processing module is used for processing the vehicle path problem information in the encoder to obtain a target node embedded vector matrix;

the solution generating module is used for processing all node vectors in the target node embedded vector matrix by the decoder to obtain the shortest path from the starting city to the ending city that passes through all the necessary cities.

The invention also discloses a terminal comprising: a memory, a processor, and a vehicle path problem solution generation program stored in the memory and executable on the processor, wherein the vehicle path problem solution generation program, when executed by the processor, implements the steps of the vehicle path problem solution generation method described above.

The invention also discloses a computer-readable storage medium storing a computer program which can be executed to implement the steps of the method of generating a vehicle path problem solution as described above.

Drawings

FIG. 1 is a flow chart of a preferred embodiment of a method of generating a vehicle path problem solution in accordance with the present invention.

Fig. 2 is a schematic diagram of the encoder of the scheme generation model of the present invention processing vehicle path problem information.

Fig. 3 is a schematic diagram of the decoder of the scheme generation model of the present invention processing input data.

Fig. 4 is a schematic diagram of each attention layer in the decoder of the scheme generation model of the present invention processing input data.

FIG. 5 is a schematic diagram of training data generation in a travel planning scenario in accordance with the present invention.

Fig. 6 is a schematic diagram of training data generation in a logistics transportation planning scenario in the present invention.

Fig. 7 is a functional block diagram of a preferred embodiment of a vehicle path problem solution generation apparatus of the present invention.

Fig. 8 is a functional block diagram of a preferred embodiment of the terminal of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

In recent decades, many conventional algorithms have been proposed to solve the vehicle path problem, but they have two main limitations. First, solving each new class of problem requires experts with extensive domain knowledge to design a suitable algorithm, which leads to significant development cost. Second, because the vehicle path problem is NP-hard, these algorithms typically require lengthy run times.

Many learning-based neural combinatorial optimization methods have been proposed and applied to the shortest path planning problem in travel scenarios and logistics planning scenarios. These methods build neural network models and train them with supervised learning or reinforcement learning so that the models learn a strategy for generating shortest paths. At present such models typically adopt a Heavy Encoder and Light Decoder (HELD) structure: the heavy encoder generates node embedded vectors for all cities in a single pass, and the light decoder then builds the solution step by step from these static node embedded vectors. Based on the static node embedded vector matrix, the light decoder uses a simple attention mechanism, namely a multi-head attention mechanism and a compatibility calculation, to construct the solution and generate the shortest path without changing the relations between nodes. The performance of such a model depends largely on the quality of the static node embedded vectors, which represent the relations among all cities on the map. Such one-time embedded vector learning tends to capture scale-related features that perform well for shortest path planning over a small number of cities. However, when the model is generalized to shortest path planning over a large number of cities, the learned scale-related features prevent it from capturing the necessary relations among those cities. Moreover, because the decoder contains only a multi-head attention mechanism and a compatibility calculation, with no feedforward layer, its structure is too simple to extract the relations among many cities well, so the quality of the shortest path planning result is poor.

Meanwhile, when shortest path planning is required among hundreds to thousands of cities, supervised learning methods struggle to obtain sufficiently high-quality solutions to use as labeled training data, because the problem is NP-hard. Reinforcement learning methods, in turn, face key problems such as sparse rewards and device memory limits; training requires huge memory and computation cost and is difficult to carry out. Consequently, such models cannot be trained effectively for shortest path planning over a large number of cities and therefore cannot be applied well. Some researchers have tried to train such models on shortest path planning problems with a small number of cities and generalize them to problems with a large number of cities; however, the generalization ability of such models is very poor, resulting in poor practical performance.

The present invention proposes a novel Light Encoder and Heavy Decoder (LEHD) model that no longer learns embedded vectors for all nodes once and for all; instead, the heavy decoder dynamically captures the relations between the current partial solution and all unvisited cities. Because the number of unvisited cities changes at every construction step, the model more easily learns scale-independent features. Furthermore, the model can iteratively adjust and refine the learned node relations during training. It is therefore less sensitive to instance scale and generalizes better to large-scale problem instances. The invention also provides an efficient data training scheme called "learning to construct partial solutions" (learn to construct partial solution), under which the model of the invention is trained by supervised learning. With this scheme, the model learns to reconstruct partial solutions during the optimization process, which can be regarded as a data augmentation technique for robust model training. Finally, in a manner similar to the training process, a flexible solution construction mechanism called "Random Re-construction" is proposed for the inference phase to achieve effective active improvement; the quality of the solution is significantly improved through iterative local reconstruction.

Referring to fig. 1, a method for generating a vehicle path problem solution according to an embodiment of the present invention includes the following steps:

Step S100, acquiring vehicle path problem information to be processed, and determining a starting city, an ending city and a plurality of necessary cities according to the vehicle path problem information.

Specifically, the starting city and the ending city may be the same city or different cities.

As shown in fig. 1, the method for generating a vehicle path problem solution further includes the following steps:

Step S200, inputting the vehicle path problem information into a trained scheme generation model, wherein the scheme generation model adopts an encoder and a decoder as its model framework, and the decoder comprises a plurality of attention layers.

Specifically, the decoder comprises multiple attention layers, and the output is obtained by processing the vehicle path problem information through these attention layers. The multiple attention layers can effectively extract the relations among different cities and thus effectively improve the quality of the shortest path.

Step S300, processing the vehicle path problem information in the encoder to obtain a target node embedded vector matrix.

Specifically, the encoder comprises a linear mapping layer and an attention layer, and the vehicle path problem information is sequentially processed by the linear mapping layer and the attention layer to obtain a target node embedded vector matrix.

In one implementation, the step S300 specifically includes:

Step S310, taking the starting city as a start node, the ending city as an end node, and all the necessary cities as unvisited nodes;

Step S320, in the linear mapping layer of the encoder, converting the characteristic information of the start node into an initial start node vector, converting the characteristic information of the end node into an initial end node vector, and converting the characteristic information of each unvisited node into a corresponding initial unvisited node vector;

Step S330, in the attention layer of the encoder, processing the initial start node vector to obtain a start node vector, processing the initial end node vector to obtain an end node vector, and processing each initial unvisited node vector to obtain a corresponding unvisited node vector, wherein all the unvisited node vectors form an unvisited node vector set;

Step S340, storing the start node vector, the end node vector and the unvisited node vector set in matrix form to obtain the target node embedded vector matrix.

Specifically, the characteristic information is either the coordinates alone or the coordinates together with the cargo demand: in the travel planning scenario it is the coordinates, and in the logistics transportation planning scenario it includes the coordinates and the cargo demand.

As shown in fig. 2, let the vehicle path problem information be represented as $S = (s_1, \dots, s_n)$, where $s_i$ represents the characteristic information of one city. The vehicle path problem information is processed by the linear projection of the linear mapping layer, which converts the characteristic information $s_i$ of each city into a corresponding initial node vector. Through this processing, the characteristic information of the starting city is converted into an initial start node vector, the characteristic information of the ending city is converted into an initial end node vector, and the characteristic information of each necessary city is converted into an initial unvisited node vector. All the initial node vectors are then input into the attention layer to obtain the start node vector, the end node vector and the unvisited node vector set, which are stored in matrix form to obtain the target node embedded vector matrix $H = (h_1, \dots, h_n)$. By processing the vehicle path problem information in the encoder, the invention obtains the target node embedded vector matrix, which contains the relations among the cities and provides strong support for the subsequent generation of the shortest path.
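As a minimal sketch of the linear projection step only, the following shows city characteristic information being mapped to initial node vectors. The weight and bias values, the embedding dimension of 3 and the function name `linear_map` are illustrative assumptions; in the model they would be learned parameters, and the subsequent encoder attention layer is not shown.

```python
def linear_map(features, weight, bias):
    """Project each feature row (length k) to a d-dimensional initial node vector.
    `weight` has shape k x d and `bias` has length d."""
    columns = list(zip(*weight))
    return [[sum(f * w for f, w in zip(row, col)) + b
             for col, b in zip(columns, bias)]
            for row in features]

# Travel planning scenario: the characteristic information is (x, y) coordinates.
travel_cities = [(0.0, 0.0), (2.0, 3.0), (1.0, 1.0)]
weight = [[1.0, 0.0, 1.0],
          [0.0, 1.0, 1.0]]
bias = [0.0, 0.0, 0.5]
initial_vectors = linear_map(travel_cities, weight, bias)

# Logistics scenario: (x, y, cargo demand), so the weight needs a third row.
logistics_cities = [(0.0, 0.0, 0.0), (2.0, 3.0, 4.0)]
weight3 = weight + [[0.0, 0.0, 2.0]]
logistics_vectors = linear_map(logistics_cities, weight3, bias)
print(initial_vectors, logistics_vectors)
```

Only the input width of the projection differs between the two scenarios; every city, whether it is the starting city, the ending city or a necessary city, passes through the same mapping.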

Step S400, processing all node vectors in the target node embedded vector matrix by the decoder to obtain the shortest path from the starting city to the ending city that passes through all the necessary cities.

Specifically, all node vectors in the target node embedded vector matrix are processed layer by layer through the multiple attention layers of the decoder, so that the relations among different cities can be better extracted and the accuracy of shortest path planning improved.

In one implementation, the decoder further includes a linear layer; the step S400 specifically includes:

obtaining the shortest path from the starting city to the ending city and passing through all the necessary cities according to the selection result obtained by each iteration;

Specifically, the start node vector and the end node vector are each denoted by their own symbol, and the set of unvisited node vectors is denoted $H_a$.

In one implementation, in the logistics transportation planning scenario, the characteristic information includes coordinates and cargo demand. The logistics vehicle starts from the starting city (the city where the warehouse is located) and delivers cargo to a plurality of necessary cities to meet their cargo demand. The logistics vehicle has a maximum cargo capacity; each time the vehicle visits a necessary city its cargo load decreases, and the cargo capacity that remains is called the vehicle remaining capacity. If the vehicle remaining capacity cannot meet the cargo demand of the next city, the vehicle must return to the city where the warehouse is located to replenish cargo. That is, in the logistics transportation planning scenario, the characteristic information (coordinates and cargo demand) is converted by the encoder into the target node embedded vector matrix, which includes the start node vector, the end node vector and the unvisited node vector set. The decoder receives the start node vector, the end node vector and the unvisited node vector set, and merges the vehicle remaining capacity with the start node vector and with the end node vector to obtain a first start node vector and a first end node vector. Specifically, the vehicle remaining capacity is denoted $C_{left}$. The first start node vector, the first end node vector and the unvisited node vector set are used as the input information of the current iteration and are input into the multiple attention layers for processing to obtain the city selected in this iteration step. The iterative process is repeated until a stop condition is met, at which point the shortest path has been generated. The vehicle remaining capacity is updated at each iteration step.
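The capacity bookkeeping described above can be illustrated in isolation. The sketch below takes a given visiting order and inserts a return to the warehouse city whenever the vehicle remaining capacity cannot meet the next demand. It does not model how the decoder chooses the order; the function name `insert_depot_returns`, the depot label 0 and the example demands are assumptions made for illustration.

```python
def insert_depot_returns(visit_order, demand, capacity, depot=0):
    """Follow visit_order, returning to the depot to replenish whenever the
    remaining capacity cannot meet the next city's cargo demand."""
    route, remaining = [depot], capacity
    for city in visit_order:
        if demand[city] > capacity:
            raise ValueError(f"demand of city {city} exceeds the maximum capacity")
        if demand[city] > remaining:
            route.append(depot)      # return to the warehouse city
            remaining = capacity     # cargo replenished
        route.append(city)
        remaining -= demand[city]    # remaining capacity updated after each visit
    route.append(depot)
    return route

demand = {1: 4, 2: 3, 3: 5, 4: 2}
route = insert_depot_returns([1, 2, 3, 4], demand, capacity=8)
print(route)  # [0, 1, 2, 0, 3, 4, 0]
```

After cities 1 and 2 the remaining capacity is 1, which cannot meet the demand of 5 at city 3, so the route returns to the depot before continuing.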

As shown in fig. 3, the initial node vectorThe endpoint node vector->Non-access node vector set H _a Inputting a first attention layer of the multi-layer attention layer as input information of the multi-layer attention layer, and processing the input information by the first attention layer to obtain an output result H of the first attention layer ⁽¹⁾ . Output result H of the first attention layer ⁽¹⁾ As the input data of the second attention layer, the second attention layer processes the input data to obtain the output H of the second layer ⁽²⁾ The multiple attention layers sequentially calculate the output result of the previous layer as the input data of the attention layer of the current layer. The decoder is provided with L attention layers, and the output result H of the last attention layer is finally obtained ^(L) 。

Wherein the initial processing result of the multi-layer attention layer is expressed asThe output of the first attention layer is denoted as H ⁽¹⁾ ＝AttentionLayer ¹ (H ⁽⁰⁾ ) The output of the second attention layer is denoted as H ⁽²⁾ ＝AttentionLayer ² 9H ⁽¹⁾ ) The output of the layer L (i.e., the last layer) is denoted as H ^(L) ＝AttentionLayer ^L (H ^(L-1) )，W ₁ And W is ₂ To learn parameters, W ₁ Used to mark the starting city and W2 used to mark the ending city. The output of each attention layer represents the node characteristic relationship. AttenionLayer performs data processing for attention layerA function of line processing.

The output $H^{(L)}$ of the last attention layer is input into the linear layer to obtain an output quantity set $O$. Each node vector is converted by the linear layer into a corresponding output quantity $o_i$, with $o_i \in O$; all output quantities together form the output quantity set $O$. This calculation is expressed as $O = W_O H^{(L)}$, where $W_O$ is a learnable parameter used for the linear mapping.

In the travel planning scenario, only the selected city needs to be determined, so $W_O \in \mathbb{R}^{d \times 1}$, where $d$ is the dimension of the embedded vector. In this case the output quantity $o_i$ is one-dimensional.

In the logistics transportation planning scenario, both the selected city and the value of the preset binary variable need to be determined, so $W_O \in \mathbb{R}^{d \times 2}$, where $d$ is the dimension of the embedded vector. In this case the output quantity $o_i$ is two-dimensional.

The output quantity set $O$ is processed by a softmax function to obtain the set $P_t$ of selection probabilities of the unvisited node vectors in the unvisited node vector set for the current iteration, expressed as $P_t = \mathrm{softmax}(O)$.

The unvisited node with the largest selection probability is taken as the selection result. The unvisited node vector corresponding to the selection result becomes the start node vector of the next iteration and is removed from the unvisited node vector set $H_a$, yielding the updated input information.

The updated input information is input into the multi-layer attention layers again and computed layer by layer. The iteration is repeated until only one unvisited node vector remains in the unvisited node vector set $H_a$, at which point it ends. Each iteration of the decoder outputs a selection result $x_i$ (i.e., the unvisited city with the largest selection probability in that iteration), where $i \in \{1, \dots, n\}$. After $n$ iterations the decoder has generated the shortest path $x = \{x_1, x_2, \dots, x_n\}$.
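
The iterative selection described above (score the unvisited nodes, apply softmax, take the most probable node, remove it from the unvisited set) can be sketched as follows. The sketch is a minimal illustration under stated assumptions: `greedy_decode` and `neg_distance` are hypothetical names, and the negative Euclidean distance is merely a stand-in for the output quantities that the trained attention layers and linear layer would produce.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(coords, start, end, score_fn):
    """Build a path start -> ... -> end, always taking the most probable node."""
    unvisited = [c for c in coords if c not in (start, end)]
    path = [start]
    current = start
    while len(unvisited) > 1:
        scores = [score_fn(coords, current, end, c) for c in unvisited]
        probs = softmax(scores)  # P_t = softmax(O)
        best = max(range(len(unvisited)), key=probs.__getitem__)
        current = unvisited.pop(best)  # becomes the start node of the next step
        path.append(current)
    path.extend(unvisited)  # only one unvisited node is left: iteration ends
    path.append(end)
    return path


def neg_distance(coords, current, end, candidate):
    # Stand-in for the output quantity o_i (assumption): closer means higher score.
    (x1, y1), (x2, y2) = coords[current], coords[candidate]
    return -math.hypot(x1 - x2, y1 - y2)


coords = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (3, 0), "E": (4, 0)}
path = greedy_decode(coords, start="A", end="E", score_fn=neg_distance)
print(path)  # ['A', 'B', 'C', 'D', 'E']
```

Replacing `neg_distance` with the scores of a trained model would leave the loop unchanged, since only the scoring step depends on the model.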

In the logistics transportation planning scenario, the shortest path $x$ is represented with the aid of the binary variable. For example, the shortest path {0,1,2,3,0,4,5,0,6,7,0} in the logistics transportation planning scenario, where 0 denotes the city where the warehouse is located, can be expressed as

$$\begin{bmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 0 & 1 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$

The first row of this representation is the sequence of visited cities in the shortest path, and the second row indicates whether each city is reached from the warehouse or from another city. A binary variable value of 0 means that the city is reached from the city where the warehouse is located, and a value of 1 means that it is reached from another city. The purpose of this representation is to keep logistics transportation planning solutions aligned. In the logistics planning scenario, solutions with the same number of city nodes may contain different numbers of sub-trips, which can lead to misalignment. This representation effectively avoids that problem and thereby enables parallel computation.
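
The two-row representation can be illustrated with a small conversion sketch. The function names `to_aligned` and `from_aligned` are hypothetical, and the sketch assumes that the path starts and ends at the warehouse city and follows the convention stated above (0: reached from the warehouse city, 1: reached from another city).

```python
def to_aligned(route, depot=0):
    """Convert a depot-delimited route into (cities, flags).

    flags[i] is 0 if cities[i] is reached from the warehouse city
    and 1 if it is reached from another city.
    """
    cities, flags = [], []
    prev = None
    for node in route:
        if node != depot:
            cities.append(node)
            flags.append(0 if prev == depot else 1)
        prev = node
    return cities, flags


def from_aligned(cities, flags, depot=0):
    """Inverse conversion: rebuild the depot-delimited route."""
    route = []
    for city, flag in zip(cities, flags):
        if flag == 0:
            route.append(depot)  # this city is reached from the warehouse city
        route.append(city)
    route.append(depot)
    return route


cities, flags = to_aligned([0, 1, 2, 3, 0, 4, 5, 0, 6, 7, 0])
print(cities)  # [1, 2, 3, 4, 5, 6, 7]
print(flags)   # [0, 1, 1, 0, 1, 0, 1]
```

Both rows always have one entry per city regardless of how many sub-trips the path contains, which is what allows solutions of the same size to be stacked and processed in parallel.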

The decoder of the present invention dynamically updates the start node vector and the unvisited node vector set through the multi-layer attention layers, so that the node characteristic relationships among the corresponding cities are updated at every decoding step. In this way, the model learns to adjust and re-establish the node characteristic relationships between the path generated up to the current iteration and the unvisited city vectors. Furthermore, because the number of node embedded vectors changes continuously during construction, the model tends to learn scale-independent features. The model of the invention therefore has a strong capability of dynamically capturing the node characteristic relationships among the relevant cities and can make better decisions when selecting the most suitable unvisited node vector from unvisited node vector sets of various sizes, thereby obtaining more accurate results in shortest path planning for a large number of cities.

Specifically, the present invention refers to the above steps as Random Re-configuration (RRC). Because the solution generation model is a construction-based model, different construction directions and different starting and ending cities may yield shortest paths of different quality. In addition, the scheme of the invention is based on greedy search, which starts from one node and constructs the solution along a single direction, so the resulting solution may contain suboptimal partial solutions. Therefore, in the inference stage, the suboptimal partial solutions in the resulting solution need to be corrected.

To improve the quality of the shortest path, the shortest path output by the solution generation model can be reconstructed. A first partial solution is extracted, according to a preset first rule, from the shortest path first output by the solution generation model. The first partial solution is a portion of the shortest path; its length is $w \in [4, V]$, where $V$ is the number of nodes in the shortest path first output. The preset first rule is to randomly select a run of adjacent nodes, following the order of the solution, such that the number of selected nodes is at least 4 and at most the number of nodes in the shortest path. The corresponding vehicle path sub-problem information can be determined from the first partial solution. It comprises the characteristic information of the sub-problem starting city, of the sub-problem ending city and of all sub-problem necessary cities, all of which are cities in the first partial solution. The vehicle path sub-problem information is input into the trained solution generation model to obtain a second partial solution, which is also a path through cities. In the shortest path, the first partial solution is replaced with the second partial solution to obtain a first optimized path. The shortest path and the first optimized path are compared, and the one with the shorter distance is kept as the path to be optimized in the next iteration. The iteration is repeated until a preset time budget is reached, yielding the reconstructed shortest path.
By randomly reconstructing the shortest path, the invention can significantly improve the quality of the shortest path within the specified time budget and better serve users' travel planning and logistics transportation planning.

In one implementation, the shortest path is {A1 city, A2 city, A3 city, A4 city, A5 city, A6 city, A7 city, A8 city, A9 city, A10 city} and the first partial solution is {A4 city, A5 city, A6 city, A7 city, A8 city}. The vehicle path sub-problem information determined from the first partial solution is input into the trained solution generation model to obtain a second partial solution {A7 city, A8 city, A4 city, A5 city, A6 city}. In the shortest path, the first partial solution is replaced with the second partial solution to obtain a first optimized path {A1 city, A2 city, A3 city, A7 city, A8 city, A4 city, A5 city, A6 city, A9 city, A10 city}. The path distances of the shortest path and of the first optimized path are then calculated. When the distance of the shortest path is greater than that of the first optimized path, extraction and replacement continue iteratively on the first optimized path; when the distance of the shortest path is smaller, they continue iteratively on the shortest path. The iteration is repeated until the preset time budget is reached, yielding the reconstructed shortest path.
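
The reconstruction loop can be sketched as follows. This is a minimal sketch under several assumptions that are not part of the source: `resolve_segment` is a hypothetical stand-in for the trained solution generation model (it keeps the first and last city of the segment fixed and exhaustively tries the orders in between), a fixed iteration count replaces the time budget so that the run is reproducible, and the window length is capped by the hypothetical parameter `max_window` to keep the exhaustive stand-in cheap.

```python
import itertools
import math
import random


def path_length(path, coords):
    """Total Euclidean length of a path given as a list of city ids."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))


def resolve_segment(segment, coords):
    """Hypothetical stand-in for the trained model: keep the segment's first and
    last city fixed and try every order of the cities in between."""
    first, *middle, last = segment
    candidates = ([first, *p, last] for p in itertools.permutations(middle))
    return min(candidates, key=lambda s: path_length(s, coords))


def random_reconstruct(path, coords, iterations=200, max_window=7, seed=0):
    """Repeatedly re-solve a random window and keep the shorter path."""
    rng = random.Random(seed)
    best = list(path)
    for _ in range(iterations):  # stands in for the preset time budget
        w = rng.randint(4, min(max_window, len(best)))  # segment length w >= 4
        i = rng.randint(0, len(best) - w)               # random run of adjacent nodes
        second = resolve_segment(best[i:i + w], coords)
        candidate = best[:i] + second + best[i + w:]    # replace first partial solution
        if path_length(candidate, coords) < path_length(best, coords):
            best = candidate                            # keep the shorter path
    return best


rng = random.Random(1)
coords = {c: (rng.random(), rng.random()) for c in range(10)}
initial = list(range(10))
improved = random_reconstruct(initial, coords)
print(path_length(initial, coords), path_length(improved, coords))
```

Because a candidate is accepted only when it is strictly shorter, the path length never increases from one iteration to the next, whatever sub-solver is plugged in.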

Specifically, as shown in fig. 4, taking the calculation of the $l$-th attention layer as an example, input data is first obtained, the input data being the output of the previous attention layer. The multi-head attention layer in the $l$-th attention layer processes the input data to obtain a multi-head attention calculation result for each node vector, and each node vector is added to its multi-head attention calculation result to obtain the intermediate attention calculation result of that node vector. The intermediate attention calculation result of each node vector is expressed as $\hat{h}_i^{(l)} = h_i^{(l-1)} + \mathrm{MHA}(h_i^{(l-1)}, H^{(l-1)})$, where $H^{(l-1)}$ is the output of the previous attention layer, $h_i^{(l-1)}$ is the selected node vector, and $\mathrm{MHA}(h_i^{(l-1)}, H^{(l-1)})$ is the multi-head attention calculation result.

The inputs of the multi-head attention are denoted $Q$ and $K$, and the multi-head attention is calculated as follows:

$$\mathrm{MHA}(Q, K) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^{O}$$

$$\mathrm{head}_i = \mathrm{Attention}(W_i^{Q} Q,\; W_i^{K} K,\; W_i^{V} K)$$

where $\mathrm{head}_i$ is the processing result of the $i$-th attention head; $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$ and $W^{O}$ are learnable parameters; $h$ is the number of attention heads; $d$ is the node vector dimension; and $d_k = d_v = d/h$.

The intermediate attention calculation results of all node vectors are input into the feed-forward layer to obtain the feed-forward layer calculation result of each node vector, and the intermediate attention calculation result of each node vector is added to its feed-forward layer calculation result to obtain the node characteristic relationship of each node in the input data. The node characteristic relationship of each node can be expressed as $h_i^{(l)} = \hat{h}_i^{(l)} + \mathrm{FF}(\hat{h}_i^{(l)})$, where $\hat{h}_i^{(l)}$ is the intermediate attention calculation result of each node vector and $\mathrm{FF}(\hat{h}_i^{(l)})$ is the feed-forward layer calculation result.

In the feed-forward layer calculation, $x$ is the feed-forward layer input, the parameters of the layer are all learnable, and $d_f$ is the feed-forward layer dimension.

This process can be expressed as $H^{(l)} = \mathrm{AttentionLayer}^{l}(H^{(l-1)})$; that is, the output of the previous attention layer is processed to obtain the output of the current attention layer.
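
A single attention layer of this kind (multi-head attention with a residual addition, then a feed-forward layer with a residual addition, and no normalization) can be sketched in plain Python as follows. The sketch rests on assumptions not stated in the source: scaled dot-product attention is used for `Attention`, the feed-forward layer is taken to be two linear maps with a ReLU in between and no biases, node vectors are stored as rows (so weights multiply from the right), and all names and the random initialization are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_layer(d, h, d_f, rng):
    """Random parameters for one layer; d must be divisible by h."""
    d_k = d // h  # d_k = d_v = d / h
    return {
        "heads": [{n: rand_matrix(d, d_k, rng) for n in ("Wq", "Wk", "Wv")}
                  for _ in range(h)],
        "Wo": rand_matrix(d, d, rng),
        "W1": rand_matrix(d, d_f, rng),
        "W2": rand_matrix(d_f, d, rng),
    }


def attention_layer(H, p):
    """H: list of n node vectors (rows of length d). Returns the same shape."""
    head_outputs = []
    for hd in p["heads"]:
        Q, K, V = matmul(H, hd["Wq"]), matmul(H, hd["Wk"]), matmul(H, hd["Wv"])
        scale = math.sqrt(len(Q[0]))
        weights = [softmax([sum(q * k for q, k in zip(qr, kr)) / scale for kr in K])
                   for qr in Q]
        head_outputs.append(matmul(weights, V))
    concat = [sum((ho[i] for ho in head_outputs), []) for i in range(len(H))]
    mha = matmul(concat, p["Wo"])
    # Residual addition without normalization: intermediate attention result.
    h_hat = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(H, mha)]
    hidden = [[max(0.0, v) for v in row] for row in matmul(h_hat, p["W1"])]  # ReLU assumed
    ff = matmul(hidden, p["W2"])
    # Second residual addition, again without normalization.
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(h_hat, ff)]


rng = random.Random(0)
d, h, d_f, L = 8, 2, 16, 3
layers = [init_layer(d, h, d_f, rng) for _ in range(L)]
H = rand_matrix(5, d, rng)      # H^(0): five node vectors
for p in layers:
    H = attention_layer(H, p)   # H^(l) = AttentionLayer^l(H^(l-1))
```

The layer accepts any number of node vectors, so the same parameters can be reused as the unvisited node vector set shrinks during decoding.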

By removing normalization from the attention layers, the method improves the generalization capability of the solution generation model and effectively improves the accuracy of shortest path calculation for a large number of cities.

Specifically, a plurality of sets of data, each including a starting city, an ending city and a plurality of necessary cities, are obtained together with preset optimal solution labels. An optimal solution label is the optimal solution computed by an exact solver, i.e., the shortest path from the starting city to the ending city through all the necessary cities. By the optimality invariance property, any partial solution of an optimal solution is itself optimal. To enrich the training data, the invention randomly extracts partial solutions from the optimal solution in the training stage. Generating a partial solution has two phases: first, the optimal solution is randomly reversed; next, a portion of the optimal solution with a specified length is randomly selected as the partial solution. The length of each partial solution is $w \in [4, V]$. The minimum value of $w$ is set to 4 because a portion shorter than 4 is too trivial to construct.

In the travel planning scenario, as shown in fig. 5, the optimal solution {1,2,3,4,5,6,7,8,9} is reversed to obtain {9,8,7,6,5,4,3,2,1}, and a partial solution {6,5,4,3,2} is then randomly extracted. Each number in the optimal solution represents a city.

In the logistics transportation planning scenario, sub-paths are randomly selected and reversed, and the partial solution is constrained to end at a specified city, namely the city where the warehouse is located. As shown in fig. 6, for the optimal solution {0,1,2,3,0,4,5,6,0,7,8,9,0}, one sub-path {0,4,5,6,0} is randomly selected and reversed to become {0,6,5,4,0} while the other sub-paths remain unchanged, and a partial solution {2,3,0,6,5,4,0} is then randomly extracted.
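
The two extraction procedures can be sketched as follows. The function names are hypothetical, and several details are assumptions rather than statements of the source: the reversal probability of 0.5 in the travel scenario, reversing exactly one sub-path per sample in the logistics scenario, and the uniform choice of window position.

```python
import random


def sample_partial_travel(optimal, rng, min_len=4):
    """Randomly reverse the optimal solution, then cut a random window of length w in [4, V]."""
    tour = optimal[::-1] if rng.random() < 0.5 else list(optimal)
    w = rng.randint(min_len, len(tour))
    i = rng.randint(0, len(tour) - w)
    return tour[i:i + w]


def sample_partial_logistics(optimal, rng, depot=0, min_len=4):
    """Reverse one random sub-path, then cut a window that ends at the warehouse city."""
    depots = [i for i, n in enumerate(optimal) if n == depot]
    k = rng.randrange(len(depots) - 1)
    a, b = depots[k], depots[k + 1]
    # Reverse the cities strictly between two consecutive warehouse visits.
    tour = optimal[:a + 1] + optimal[a + 1:b][::-1] + optimal[b:]
    end = rng.choice([i for i in depots if i + 1 >= min_len])  # window must end at the depot
    start = rng.randint(0, end + 1 - min_len)
    return tour[start:end + 1]


rng = random.Random(0)
travel_optimal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
logistics_optimal = [0, 1, 2, 3, 0, 4, 5, 6, 0, 7, 8, 9, 0]
print(sample_partial_travel(travel_optimal, rng))
print(sample_partial_logistics(logistics_optimal, rng))
```

Each call yields one training instance whose label is the extracted partial solution itself, since a partial solution of an optimal solution is optimal for its own sub-problem.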

After the partial solution is obtained, the corresponding vehicle path problem training information can be determined, together with its characteristic information of the starting training city, of the ending training city and of the necessary training cities. The training data set is composed of the vehicle path problem training information and the corresponding labels. Because the vehicle path problem training information contains data with different directions, different start nodes and different end nodes, the model can be trained better and generalizes better to shortest path planning for a large number of cities.

Specifically, the characteristic information is either coordinates alone or coordinates together with cargo demand. A plurality of pieces of vehicle path problem training information are input into the initial solution generation model simultaneously. Each piece is processed by the encoder to obtain a corresponding training node embedded vector matrix. The training node embedded vector matrix is input into the multi-layer attention layers, which extract the training node characteristic relationships layer by layer. The training node characteristic relationships output by the last attention layer are input into the linear layer to obtain a training output quantity set, which is processed by a softmax function to obtain the training selection probabilities $P_t = \{p_1, p_2, \dots, p_u\}$ of the unvisited training nodes. The unvisited training node with the largest selection probability is taken as the training selection result. The unvisited training node vector corresponding to the training selection result becomes the start training node vector of the next iteration and is removed from the unvisited training node vector set, yielding the updated training input information.
The label $Y = \{y_1, y_2, \dots, y_u\}$ corresponding to the vehicle path problem training information is obtained. With the label as supervision information and a preset loss function as the objective, the initial solution generation model is trained and its learnable parameters are updated; the updated training input information is then input into the multi-layer attention layers for processing, and the current round of iteration ends when only one unvisited node vector remains in every unvisited node vector set. In the loss function, $y_i$ is the label of a city and $p_i$ is the selection probability of that city. The label indicates whether a city must be selected in the current iteration to form the shortest path: if a city should be selected in the current iteration, its label is 1; otherwise its label is 0. For example, suppose that in the current iteration the unvisited cities are B1, B2, B3, B4 and B5, and that the optimal solution requires city B3 to be selected in the current iteration to form the shortest path. The label of city B3 is then 1 and the labels of B1, B2, B4 and B5 are 0, so that $Y = \{y_1, y_2, \dots, y_u\} = \{0, 0, 1, 0, 0\}$. In each iteration the model can be trained according to the label.
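
How a 0/1 label and the selection probabilities might enter a loss can be sketched as follows. The cross-entropy form used here is an assumption, chosen because it is the usual loss for this kind of one-of-u supervised selection; the names and the probability values are hypothetical.

```python
import math


def cross_entropy(labels, probs, eps=1e-12):
    """Assumed loss: -sum_i y_i * log(p_i); eps guards against log(0)."""
    return -sum(y * math.log(p + eps) for y, p in zip(labels, probs))


# Unvisited cities B1..B5; the optimal solution selects B3 in this iteration.
labels = [0, 0, 1, 0, 0]
probs = [0.1, 0.2, 0.5, 0.1, 0.1]  # illustrative selection probabilities
loss = cross_entropy(labels, probs)
print(round(loss, 4))  # 0.6931
```

With a one-hot label only the probability assigned to the labelled city contributes, so the loss falls to zero as that probability approaches 1.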

In the invention, a plurality of pieces of vehicle path problem training information are trained on simultaneously, and the trained solution generation model is obtained after a preset number of training rounds. Through multiple rounds of training on a large amount of vehicle path problem training information, the solution generation model can obtain higher-quality shortest paths in path planning scenarios with either a small or a large number of cities.

In one implementation, a batch of vehicle path problem training information of a preset batch size may be trained on at the same time.

Specifically, training input data is obtained, the training input data being the training output of the previous attention layer. The training input data is processed to obtain its training node characteristic relationships. The method removes normalization from the attention layers, which increases the expressive capacity of the model and improves the generalization capability of the solution generation model, so that the model generalizes better when applied to shortest path planning for a large number of cities.

In one implementation, the node vector dimension is set to 128 and the decoder has 6 attention layers. In each attention layer, the multi-head attention layer has 8 attention heads and the feed-forward layer dimension is set to 512. One million training instances are used, from which training data are drawn batch by batch; one complete pass over all the data is called a round. Each training instance contains 100 cities. The model is optimized with the Adam algorithm, with the initial learning rate set to 1e-4. For data of the travel path planning scenario, the learning rate decay is set to 0.97 per round, and the solution generation model is trained for 150 rounds with a batch size of 1024. For data of the logistics transportation planning scenario, the learning rate decay is set to 0.9 per round, and the solution generation model is trained for 40 rounds with a batch size of 1024. Training is performed on a single NVIDIA GeForce RTX 3090 GPU with 24 GB of video memory. The trained solution generation model performs well in path planning for a large number of cities and can provide shortest path planning for 1000 cities in both travel path planning and logistics transportation planning, generating near-optimal solutions.

In an embodiment, as shown in fig. 7, based on the method for generating a vehicle path problem solution, the present invention further provides a device for generating a vehicle path problem solution, including:

the information acquisition module 100 is configured to acquire vehicle path problem information to be processed, and determine a start city, an end city, and a plurality of necessary cities according to the vehicle path problem information;

an information input module 200 for inputting the vehicle path problem information into a trained solution generation model, the solution generation model employing an encoder and a decoder as a model framework, the decoder including a plurality of attention layers;

an encoder processing module 300, configured to process the vehicle path problem information in the encoder to obtain a target node embedded vector matrix;

the solution generating module 400 is configured to process all node vectors in the target node embedded vector matrix with the decoder to obtain a shortest path from the start city to the end city and through all the necessary cities.

In an implementation manner, the invention further provides a terminal correspondingly, as shown in fig. 8, including: the system comprises a memory 20, a processor 10 and a vehicle path problem solution generation program 30 stored on the memory 20 and executable on the processor 10, wherein the vehicle path problem solution generation program 30, when executed by the processor 10, implements the steps of the vehicle path problem solution generation method as described above.

The present invention also provides a computer-readable storage medium storing a computer program executable for implementing the steps of the method of generating a vehicle path problem solution as described above.

In summary, the invention discloses a method and a device for generating a vehicle path problem solution, wherein the method comprises the following steps: acquiring vehicle path problem information to be processed, and determining a starting city, an ending city and a plurality of necessary cities according to the vehicle path problem information; inputting the vehicle path problem information into a trained solution generation model, wherein the solution generation model adopts an encoder and a decoder as a model framework, and the decoder comprises a plurality of attention layers; processing the vehicle path problem information in the encoder to obtain a target node embedded vector matrix; and processing all node vectors in the target node embedded vector matrix with the decoder to obtain a shortest path from the starting city to the ending city and through all the necessary cities. Through the multi-layer attention layers in the decoder, the invention can better extract the relationships among a plurality of cities, thereby improving the quality of the path planning result.

It is to be understood that the invention is not limited in its application to the examples described above, but is capable of modification and variation in light of the above teachings by those skilled in the art, and that all such modifications and variations are intended to be included within the scope of the appended claims.

Claims

1. A method of generating a vehicle path problem solution, comprising:

2. The method of generating a vehicle path problem solution according to claim 1, wherein the vehicle path problem information includes characteristic information of the start city, characteristic information of the end city, and characteristic information of all the necessary cities; the processing the vehicle path problem information in the encoder to obtain a target node embedded vector matrix comprises the following steps:

3. The method of generating a vehicle path problem solution according to claim 2, wherein the decoder further comprises a linear layer; said processing all node vectors in the target node embedded vector matrix with the decoder to obtain a shortest path from the start city to the end city and through all the necessary cities comprises:

4. The method of generating a vehicle path problem solution according to claim 3, wherein said processing all node vectors in the target node embedded vector matrix with the decoder to obtain the shortest path from the start city to the end city and through all the necessary cities further comprises:

5. The method of generating a vehicle path problem solution according to claim 3, wherein each of the attention layers comprises: a multi-head attention layer and a feed-forward layer;

6. The method of generating a vehicle path problem solution according to claim 1, wherein the training step of generating a model of the solution includes:

7. The method of claim 6, wherein the vehicle path problem training information includes characteristic information of a start training city, characteristic information of an end training city, and characteristic information of a plurality of must-be-trained cities; the training data set is utilized to carry out model training on the initial scheme generation model to obtain a trained scheme generation model, and the method comprises the following steps:

taking the initial training city as an initial training node, taking the terminal training city as a terminal training node, and taking all the training cities as unvisited training nodes; in a linear mapping layer of the encoder, converting the characteristic information of the initial training node in each vehicle path problem training information into an initial training node vector, converting the characteristic information of the final training node into an initial final training node vector, and converting the characteristic information of each unvisited training node into an initial unvisited training node vector;

8. The method of generating a vehicle path problem solution according to claim 7, wherein in each attention layer, the step of extracting training node characteristic relationships of the training input data includes:

9. A vehicle path problem solution generation apparatus, comprising:

10. A terminal, comprising: a memory, a processor and a vehicle path problem solution generation program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the vehicle path problem solution generation method according to any one of claims 1 to 8.

11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program executable for implementing the steps of the vehicle path problem solution generation method according to any one of claims 1 to 8.