CN116629479A - AGV multi-target traffic control method and system for single-way collision-free AGV
- Publication number
- CN116629479A (application number CN202310634753.7A)
- Authority
- CN
- China
- Prior art keywords
- agv
- traffic control
- collision
- free
- path
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q10/047—Optimisation of routes or paths, e.g. travelling salesman problem
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0499—Feedforward networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Molecular Biology (AREA)
- Health & Medical Sciences (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Mathematical Physics (AREA)
- Operations Research (AREA)
- Geometry (AREA)
- Computer Hardware Design (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- General Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Quality & Reliability (AREA)
- Marketing (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Development Economics (AREA)
- Traffic Control Systems (AREA)
Abstract
The application discloses a single-way collision-free AGV multi-target traffic control method and system, relating to the technical field of collision-free AGV operation. The method comprises the following steps: S10, digitally marking the single-way path resources and the AGV path tasks of a target real-world map and storing them in a database; S20, acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining an AGV collision-free evaluation coefficient; S30, constructing a traffic control deep reinforcement learning model based on the real-time environment state information. The traffic control deep reinforcement learning model performs normalization and fully connected linear operations to obtain a sample training traffic control model, and the next scheduling option is predicted from that model's output. This helps the AGV find the optimal scheduling value in each scheduling task, so that collision-free travel on single-way roads is completed automatically, collision-free running efficiency is improved, and AGV running cost is reduced.
Description
Technical Field
The application relates to the technical field of collision-free AGV operation, and in particular to a single-way collision-free AGV multi-target traffic control method and system.
Background
AGV is short for automated guided vehicle: a vehicle that is loaded with goods automatically or manually and then travels along a set route, or tows a loaded cart, to a set place where it unloads automatically. With the continuing development of science and technology, AGVs use computer and artificial intelligence techniques to understand tasks and plan reasonable paths, which improves AGV performance and reduces manufacturing costs, so AGVs are finding wider use in the automobile assembly line industry.
In traffic control for the automobile assembly line industry, rule-driven and heuristic-learning methods have made great progress. However, most of these methods emphasize a single scheduling rule, neglect the real-time dynamic environment information of the whole system, and suit only one particular AGV scheduling scenario; they also fail to account for the time AGVs spend waiting to avoid collisions during actual operation. How to use real-time environment state information under the collision-free constraint so that fewer environmental resources are wasted, while also reducing collision-induced blocking during AGV travel, is therefore a problem that traffic control still needs to solve.
Disclosure of Invention
In order to overcome the above-mentioned drawbacks of the prior art, embodiments of the present application provide a single-way collision-free AGV multi-target traffic control method and system, so as to solve the problems set forth in the above-mentioned background art.
In order to achieve the above purpose, the present application provides the following technical solutions: a single-way collision-free AGV multi-target traffic control method comprises the following steps:
S10, digitally marking the single-way path resources and the AGV path tasks of a target real-world map and storing them in a database;
S20, acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining an AGV collision-free evaluation coefficient;
S30, constructing a traffic control deep reinforcement learning model based on the real-time environment state information; the traffic control deep reinforcement learning model is realized by the following formulas:
$x_i = (S_t, A_t, R_t, S_{t+1})$
$A_t = (a_k)$
wherein $x_i$ is the $i$-th training sample, $S_t$ is the environmental state at time $t$, $A_t$ is the action taken at time $t$, $S_{t+1}$ represents the AGV assigned to the next departure, and $R_t$ is the scheduling value obtained after taking the action at time $t$;
$C_{path,t}$ indicates the idle loss of the single-way path resources in the map module at time $t$, with further terms indicating the collision waiting loss of the AGV at time $t$ and the total length of time lost for all AGVs to complete their tasks;
$Q(S_t, A_t)$ represents the traffic control deep reinforcement learning model, which is obtained by iteratively accumulating the scheduling values at time $t$ and taking the expectation; the Q value is used for reinforcement learning;
S40, based on the traffic control deep reinforcement learning model, normalizing the feature vectors of the training samples, obtaining a sample training traffic control model through a prediction submodule, and completing iterative optimization by gradient descent during the prediction process;
S50, performing deployment prediction with the sample training traffic control model, wherein deployment prediction means inputting real-time environment state information into the sample training traffic control model and predicting the next scheduling option from the model's output.
Preferably, the step S10 includes the following:
S101, marking the abscissa and ordinate of the workshop range and of the drivable single-way paths as $[x_i, y_i]$;
S102, marking the abscissa and ordinate of the start point and end point of the AGV travel path as $[x_0, y_0]$ and $[x_n, y_n]$;
S103, marking the path resource number of the AGV's travel as $p_j$, the travel speed of the AGV as $v_j$, the category number of the AGV as $a_j$, and the safety distance as $d$.
Preferably, the step S20 includes the following:
S201, the real-time environment state information comprises an AGV state and a map state;
the AGV state comprises a departure point, a target point, the travel path resource number $p_j$, the travel speed $v_j$, the category number $a_j$, and the waiting time $w$;
the map state comprises the coordinates $[x_i, y_i]$ of the workshop range and of the drivable single-way paths, the time $t$ at which a path resource is occupied, and the AGV travel safety distance $d$.
S202, integrating the AGV state and the map state to form an AGV collision-free evaluation coefficient;
the AGV collision-free evaluation coefficient yields the collision-free departure time of an AGV from the occupancy of the single-way path resources, and is calculated as follows:
$p = p_k \cap p_j$
$[x, y] = p[0][0]$
$s_j = t[x][y] + d \times v_j - (p_j - p) \times v_j$
$w[x][y] = p \times (v_k - v_j) + C$
wherein $p$ is the set of collision paths of the two AGVs, $p_k$ and $p_j$ are the travel path resource numbers of the two AGVs respectively, $p[0][0]$ is the first coordinate of the first collision path segment in the same-direction or opposite-direction case, $s_j$ is the earliest possible departure time of $a_j$, $w[x][y]$ is the waiting time of $a_j$, and $C$ is a correction coefficient for the waiting time, set by the user according to actual conditions.
S203, comparing the AGV collision-free evaluation coefficient with a preset threshold; if the coefficient falls outside the preset range, the AGV is prone to collision, and its travel speed or waiting time is adaptively adjusted when a collision risk is imminent.
Preferably, the step S40 includes the following:
S401, normalizing the feature vector of the training sample; the feature vector of the current input training sample is calculated as follows:
wherein the formula yields the feature vector of the current input sample, $x_{i1}, x_{i2}, \ldots, x_{in}$ represent the currently required input information, and $M$ represents a learnable embedding transformation matrix.
Preferably, the step S40 further includes the following:
S402, inputting the feature vector of the training sample into a prediction submodule to obtain a sample training traffic control model; the prediction submodule process comprises:
performing normalization and a fully connected linear operation on the feature vectors of the training samples to finally obtain the sample training traffic control model, wherein the specific formula is as follows:
wherein the normalization result is computed first, $z_j$ is the sample training traffic control model after the fully connected linear operation, $W_1$ and $b_1$ are adaptively learnable parameters, $LN(\alpha_1)$ denotes the layer normalization operation, and the remaining operator denotes the layer feedforward neural network operation.
Preferably, the step S40 further includes the following:
performing iterative optimization by gradient descent during the training sample prediction process, wherein the iterative optimization conforms to the following expression:
wherein the direction term represents the negative direction of the gradient, and $C_j$ represents the search step in that direction.
Preferably, the deployment prediction refers to inputting real-time environmental state information into the trained traffic control model and, according to the output of the model, predicting the multi-objective optimized AGV departure sequence with minimum traffic time and minimum waiting delay together with the corresponding departure times; for the next scheduling option of the deployment prediction, the feature vector is passed through a single fully connected layer and a Softmax operation in the prediction submodule to obtain a score for each scheduling option, and the option with the highest score is taken as the predicted next scheduling option in the current state.
A single-way collision-free AGV multi-target traffic control system comprising:
the traffic control digitizing module is used for digitally marking the single-way path resource and the AGV path task of the target real map and storing the single-way path resource and the AGV path task into a database;
the collision-free evaluation coefficient module is used for acquiring real-time environment state information, calculating the current single-way road path resource occupation condition and acquiring an AGV collision-free evaluation coefficient;
the traffic control deep reinforcement module is used for constructing a traffic control deep reinforcement learning model based on the real-time environmental state information;
the traffic control iteration optimization module is used for carrying out normalization processing on the feature vector of the training sample based on the traffic control deep reinforcement learning model, obtaining a sample training traffic control model through the prediction submodule, and completing iteration optimization by using a gradient descent method in the prediction process;
the traffic control deployment prediction sub-module inputs the real-time environment state information into the sample training traffic control model to conduct deployment prediction, and predicts the next scheduling option according to the output result of the sample training traffic control model.
The application has the technical effects and advantages that:
(1) The application digitally marks the AGV path tasks and the real-time map data to form real-time environment state information, calculates the current occupancy of the single-way path resources, and obtains an AGV collision-free evaluation coefficient, according to which the departure speed and departure time of each AGV are adjusted automatically. The traffic control deep reinforcement learning model then performs normalization and fully connected linear operations to obtain a sample training traffic control model, and the next scheduling option is predicted from that model's output. This helps the AGV find the optimal scheduling value in each scheduling task, completes single-way collision-free travel automatically, improves collision-free running efficiency, and reduces AGV running cost;
(2) The single-way collision-free AGV multi-target traffic control method uses a resource occupation rule, so that real-time road condition information of the travel path is taken into account while the AGV is running, and waste of single-way path resources is reduced;
(3) The traffic control method also introduces a deep reinforcement learning traffic control model, which attends to global environment information and dynamically feeds back the real-time environment state. It also introduces the AGV collision waiting time, so that collision-induced blocking during AGV travel is taken into account. The method therefore better fits AGV traffic control in specific scenarios, and multi-objective scheduling substantially improves traffic control efficiency.
Drawings
FIG. 1 is a flow chart of the method of the present application.
Fig. 2 is a block diagram of the system architecture of the present application.
Detailed Description
The embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the application.
Traffic control: a manufacturing system contains multiple material transport tasks and multiple AGVs. Under given constraints, the execution order and departure time of the AGVs must be specified, and real-time scheduling under dynamic conditions must be carried out to complete the material transport tasks, so that the whole system meets given performance optimization targets while remaining collision-free on a single-way road map.
Example 1
Referring to fig. 1-2, the embodiment provides a single-way collision-free AGV multi-target traffic control method, which specifically includes the following steps:
S10, digitally marking the single-way path resources and the AGV path tasks of a target real-world map and storing them in a database;
Step S10 includes the following:
S101, marking the abscissa and ordinate of the workshop range and of the drivable single-way paths as $[x_i, y_i]$;
S102, marking the abscissa and ordinate of the start point and end point of the AGV travel path as $[x_0, y_0]$ and $[x_n, y_n]$;
S103, marking the path resource number of the AGV's travel as $p_j$, the travel speed of the AGV as $v_j$, the category number of the AGV as $a_j$, and the safety distance as $d$.
In this embodiment, the target real-world map and the path states of the AGVs are digitally marked to form a dynamic path planning map, which makes the collision-free calculation among multiple AGVs straightforward.
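As an illustration of the markings in S101 to S103, the following is a minimal sketch of how the digitized map and an AGV path task might be held in memory before being stored. The class names `OneWayMap` and `AgvTask`, the integer grid coordinates, and the representation of the path resource $p_j$ as an ordered list of cells are assumptions made for this example; the application does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Coord = Tuple[int, int]  # [x_i, y_i]; an integer grid is an assumption


@dataclass
class OneWayMap:
    """Drivable single-way path cells inside the workshop range (S101)."""
    drivable: Set[Coord]

    def contains_path(self, path: List[Coord]) -> bool:
        # A path task is usable only if every cell is a marked drivable cell.
        return all(cell in self.drivable for cell in path)


@dataclass
class AgvTask:
    """One AGV path task carrying the S102/S103 markings."""
    category: int            # a_j, category number of the AGV
    path: List[Coord]        # p_j, ordered cells of the path resource
    speed: float             # v_j, travel speed
    safety_distance: float   # d

    @property
    def start(self) -> Coord:  # [x_0, y_0]
        return self.path[0]

    @property
    def end(self) -> Coord:    # [x_n, y_n]
        return self.path[-1]


workshop = OneWayMap(drivable={(0, 0), (0, 1), (0, 2), (1, 2)})
task = AgvTask(category=1, path=[(0, 0), (0, 1), (0, 2)],
               speed=1.5, safety_distance=1.0)
```

Keeping the path as an ordered list makes the start and end points of S102 derivable from the path itself, so they cannot drift out of sync with it.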
S20, acquiring real-time environment state information, calculating the current single-way road path resource occupation condition, and acquiring an AGV collision-free evaluation coefficient;
Step S20 includes the following:
S201, the real-time environment state information comprises an AGV state and a map state;
the AGV state comprises a departure point, a target point, the travel path resource number $p_j$, the travel speed $v_j$, the category number $a_j$, and the waiting time $w$;
the map state comprises the coordinates $[x_i, y_i]$ of the workshop range and of the drivable single-way paths, the time $t$ at which a path resource is occupied, and the AGV travel safety distance $d$.
S202, integrating the AGV state and the map state to form an AGV collision-free evaluation coefficient;
the AGV collision-free evaluation coefficient yields the collision-free departure time of an AGV from the occupancy of the single-way path resources, and is calculated as follows:
$p = p_k \cap p_j$
$[x, y] = p[0][0]$
$s_j = t[x][y] + d \times v_j - (p_j - p) \times v_j$
$w[x][y] = p \times (v_k - v_j) + C$
wherein $p$ is the set of collision paths of the two AGVs, $p_k$ and $p_j$ are the travel path resource numbers of the two AGVs respectively, $p[0][0]$ is the first coordinate of the first collision path segment in the same-direction or opposite-direction case, $s_j$ is the earliest possible departure time of $a_j$, $w[x][y]$ is the waiting time of $a_j$, and $C$ is a correction coefficient for the waiting time, set by the user according to actual conditions.
S203, comparing the AGV collision-free evaluation coefficient with a preset threshold; if the coefficient falls outside the preset range, the AGV is prone to collision, and its travel speed or waiting time is adaptively adjusted when a collision risk is imminent.
In this embodiment, the path state of the AGV and the map state information are obtained and associated to form the AGV collision-free evaluation coefficient, and a suitable waiting time correction coefficient is selected according to actual conditions. Judging the coefficient comprehensively lets the AGV assess collision risk from the calculation result and trace the cause of that risk, which makes further processing by the AGV easier.
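The S202 formulas can be traced with a small numerical sketch. Several readings here are assumptions rather than statements from the application: paths are ordered lists of grid cells, the shared path $p$ is kept as a flat list (so its first cell plays the role of $p[0][0]$), and wherever $p_j$ and $p$ appear in arithmetic they are read as path lengths in cells. The function names and the `occupied_at` lookup, which stands in for $t[x][y]$, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]


def shared_path(p_k: List[Coord], p_j: List[Coord]) -> List[Coord]:
    """p = p_k ∩ p_j, kept in the travel order of AGV j."""
    cells_k = set(p_k)
    return [cell for cell in p_j if cell in cells_k]


def evaluate(p_k: List[Coord], p_j: List[Coord], v_k: float, v_j: float,
             d: float, occupied_at: Dict[Coord, float],
             C: float = 0.0) -> Optional[Tuple[Coord, float, float]]:
    """Return (first conflict cell, s_j, w) or None if the paths never meet."""
    p = shared_path(p_k, p_j)
    if not p:
        return None  # no shared path resource, hence no conflict to evaluate
    first = p[0]     # [x, y], first coordinate of the collision path
    # s_j = t[x][y] + d * v_j - (p_j - p) * v_j, lengths read as cell counts
    s_j = occupied_at[first] + d * v_j - (len(p_j) - len(p)) * v_j
    # w[x][y] = p * (v_k - v_j) + C
    w = len(p) * (v_k - v_j) + C
    return first, s_j, w


result = evaluate(
    p_k=[(0, 0), (0, 1), (0, 2)],
    p_j=[(1, 1), (0, 1), (0, 2), (0, 3)],
    v_k=2.0, v_j=1.0, d=1.0,
    occupied_at={(0, 1): 10.0},
    C=0.5,
)
```

With these inputs the two paths share two cells starting at `(0, 1)`, so the sketch evaluates $s_j = 10 + 1 - 2 = 9$ and $w = 2 \times 1 + 0.5 = 2.5$. A caller could then compare the result against a preset threshold as in S203.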
S30, constructing a traffic control deep reinforcement learning model based on real-time environment state information;
Step S30 includes the following:
the traffic control deep reinforcement learning model is realized by the following formulas:
$x_i = (S_t, A_t, R_t, S_{t+1})$
$A_t = (a_k)$
wherein $x_i$ is the $i$-th training sample, $S_t$ is the environmental state at time $t$, $A_t$ is the action taken at time $t$, $S_{t+1}$ represents the AGV assigned to the next departure, and $R_t$ is the scheduling value obtained after taking the action at time $t$;
$C_{path,t}$ indicates the idle loss of the single-way path resources in the map module at time $t$, with further terms indicating the collision waiting loss of the AGV at time $t$ and the total length of time lost for all AGVs to complete their tasks;
$Q(S_t, A_t)$ represents the traffic control deep reinforcement learning model, which is obtained by iteratively accumulating the scheduling values at time $t$ and taking the expectation; the Q value is used for reinforcement learning.
In this embodiment, a convolutional neural network extracts training sample features from the AGV state and map state data and reduces the data dimension, forming the traffic control deep reinforcement learning model. This allows different values to be assigned to different AGV states at initialization, further improves the accuracy and stability of the AGV collision-free data, and reduces the AGVs' collision waiting time and task completion time while keeping path resource waste in view.
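To make the role of the training sample $x_i = (S_t, A_t, R_t, S_{t+1})$ concrete, the sketch below applies a standard tabular Q-learning update to one such sample. This is a stand-in chosen for brevity, not the application's network: the table replaces the deep model, and the learning rate `alpha` and discount factor `gamma` are assumptions that do not appear in the text. The scheduling value $R_t$ is taken as a given number, since the way its loss terms are combined is not reproduced here.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, List, NamedTuple, Tuple


class Transition(NamedTuple):
    """One training sample x_i = (S_t, A_t, R_t, S_{t+1})."""
    state: Hashable
    action: int
    reward: float        # R_t, the scheduling value (supplied by the caller)
    next_state: Hashable


QTable = DefaultDict[Tuple[Hashable, int], float]


def q_update(q: QTable, sample: Transition, actions: List[int],
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Move Q(S_t, A_t) toward R_t + gamma * max_a Q(S_{t+1}, a)."""
    best_next = max(q[(sample.next_state, a)] for a in actions)
    target = sample.reward + gamma * best_next
    key = (sample.state, sample.action)
    q[key] += alpha * (target - q[key])
    return q[key]


q: QTable = defaultdict(float)
sample = Transition(state="s0", action=1, reward=-2.0, next_state="s1")
first = q_update(q, sample, actions=[0, 1])
second = q_update(q, sample, actions=[0, 1])
```

Repeating the update over many samples accumulates scheduling values into an estimate of the expected return, which is the sense in which the Q value is "obtained by iteratively accumulating scheduling values and taking the expectation".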
S40, based on the traffic control deep reinforcement learning model, carrying out normalization processing on feature vectors of training samples, obtaining a sample training traffic control model through a prediction submodule, and completing iterative optimization by using a gradient descent method in the prediction process;
Step S40 includes the following:
S401, normalizing the feature vector of the training sample; the feature vector of the current input training sample is calculated as follows:
wherein the formula yields the feature vector of the current input sample, $x_{i1}, x_{i2}, \ldots, x_{in}$ represent the currently required input information, and $M$ represents a learnable embedding transformation matrix.
S402, inputting the feature vector of the training sample into a prediction submodule to obtain a sample training traffic control model;
the prediction submodule process comprises the following steps:
performing normalization and a fully connected linear operation on the feature vectors of the training samples to finally obtain the sample training traffic control model, wherein the specific formula is as follows:
wherein the normalization result is computed first, $z_j$ is the sample training traffic control model after the fully connected linear operation, $W_1$ and $b_1$ are adaptively learnable parameters, $LN(\alpha_1)$ denotes the layer normalization operation, and the remaining operator denotes the layer feedforward neural network operation.
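The normalization followed by a fully connected linear operation described above can be sketched in a few lines of plain Python. This is an illustrative reading only: the layer normalization here has no learnable scale or shift, the small constant `eps` is an assumption added for numerical stability, and the example weights stand in for the learned parameters $W_1$ and $b_1$.

```python
import math
from typing import List


def layer_norm(h: List[float], eps: float = 1e-5) -> List[float]:
    """Normalize one feature vector to zero mean and (near) unit variance."""
    mean = sum(h) / len(h)
    var = sum((v - mean) ** 2 for v in h) / len(h)
    return [(v - mean) / math.sqrt(var + eps) for v in h]


def linear(h: List[float], W: List[List[float]], b: List[float]) -> List[float]:
    """Fully connected layer: z = W h + b, one output per row of W."""
    return [sum(w * v for w, v in zip(row, h)) + bias
            for row, bias in zip(W, b)]


features = [1.0, 2.0, 3.0]
normalized = layer_norm(features)
# Hypothetical weights; in training these would be learned.
W1 = [[1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0]]
b1 = [0.5, 0.5]
z = linear(normalized, W1, b1)
```

Normalizing before the linear layer keeps the inputs on a comparable scale regardless of the units of the raw AGV and map features.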
S403, performing iterative optimization by gradient descent during the training sample prediction process.
The iterative optimization conforms to the following expression:
wherein the direction term represents the negative direction of the gradient, and $C_j$ represents the search step in that direction.
In this implementation, the prediction submodule obtains the sample training traffic control model by gradient descent, iteratively optimizing the model along the negative direction of the gradient. The gradient direction is obtained from the normalization processing, and the search step is determined by a line search: the score coordinate of the next point is taken as $z_{k+1}$, and the $z_{k+1}$ that minimizes $f(z_{k+1})$ is then sought. Iterating this optimization yields the sample training traffic control model and thereby the optimal solution of the objective function, i.e. the optimal AGV scheduling value.
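The update described above, a step along the negative gradient with the step length chosen by a line search, can be sketched as follows. A simple backtracking search (halving the step until the objective decreases) is used here as a stand-in, since the application does not state which line search it uses. The quadratic objective `f` and its gradient are hypothetical test functions, not the application's objective.

```python
from typing import Callable, List

Vector = List[float]


def line_search(f: Callable[[Vector], float], z: Vector, direction: Vector,
                step: float = 1.0, shrink: float = 0.5,
                max_halvings: int = 30) -> float:
    """Backtracking: shrink the step until f decreases along `direction`."""
    base = f(z)
    for _ in range(max_halvings):
        candidate = [zi + step * di for zi, di in zip(z, direction)]
        if f(candidate) < base:
            return step
        step *= shrink
    return 0.0  # no improving step found


def gradient_descent(f: Callable[[Vector], float],
                     grad: Callable[[Vector], Vector],
                     z0: Vector, iterations: int = 50) -> Vector:
    z = list(z0)
    for _ in range(iterations):
        direction = [-g for g in grad(z)]      # negative gradient direction
        if all(di == 0.0 for di in direction):
            break                              # stationary point reached
        step = line_search(f, z, direction)    # search step along direction
        if step == 0.0:
            break
        z = [zi + step * di for zi, di in zip(z, direction)]  # z_{k+1}
    return z


def f(z: Vector) -> float:
    return (z[0] - 3.0) ** 2 + (z[1] + 1.0) ** 2


def grad_f(z: Vector) -> Vector:
    return [2.0 * (z[0] - 3.0), 2.0 * (z[1] + 1.0)]


z_opt = gradient_descent(f, grad_f, [0.0, 0.0])
```

On this test function the minimum lies at `(3, -1)`; the sketch stops as soon as the gradient vanishes or no step improves the objective.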
S50, carrying out deployment prediction on the sample training traffic control model, wherein the deployment prediction is to input real-time environment state information into the sample training traffic control model, and predicting the next scheduling option according to the output result of the sample training traffic control model.
In this embodiment, it should be noted that deployment prediction refers to inputting real-time environmental state information into the trained traffic control model and, according to the output of the model, predicting the multi-objective optimized AGV departure sequence with minimum traffic time and minimum waiting delay together with the corresponding departure times. For the next scheduling option of the deployment prediction, the feature vector is passed through a single fully connected layer and a Softmax operation in the prediction submodule to obtain a score for each scheduling option, and the option with the highest score is taken as the predicted next scheduling option in the current state.
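The scoring step described above, a single fully connected layer followed by Softmax and selection of the highest-scoring option, can be sketched as below. The weight matrix, bias, and feature values are hypothetical placeholders for trained parameters, and the function names are chosen for this example.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable Softmax: subtract the max before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_scheduling_option(feature: List[float], W: List[List[float]],
                           b: List[float]) -> Tuple[int, List[float]]:
    """Score every scheduling option and return the index of the best one."""
    logits = [sum(w * x for w, x in zip(row, feature)) + bias
              for row, bias in zip(W, b)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Hypothetical parameters: three scheduling options, two input features.
W = [[0.1, 0.0],
     [2.0, 0.0],
     [-1.0, 0.0]]
b = [0.0, 0.0, 0.0]
best, probs = next_scheduling_option([1.0, 0.0], W, b)
```

Because Softmax is monotonic, the highest-scoring option is the same as the option with the largest pre-Softmax value; the Softmax output is useful when the scores also need to be read as relative preferences.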
Example 2
Referring to fig. 1-2, the present embodiment provides a single-way collision-free AGV multi-target traffic control system, which specifically includes the following:
the traffic control digitizing module is used for digitally marking the single-way path resource and the AGV path task of the target real map and storing the single-way path resource and the AGV path task into a database;
the collision-free evaluation coefficient module is used for acquiring real-time environment state information, calculating the current single-way road path resource occupation condition and acquiring an AGV collision-free evaluation coefficient;
the traffic control deep reinforcement module is used for constructing a traffic control deep reinforcement learning model based on the real-time environmental state information;
the traffic control iteration optimization module is used for carrying out normalization processing on the feature vector of the training sample based on the traffic control deep reinforcement learning model, obtaining a sample training traffic control model through the prediction submodule, and completing iteration optimization by using a gradient descent method in the prediction process;
the traffic control deployment prediction sub-module inputs the real-time environment state information into the sample training traffic control model to conduct deployment prediction, and predicts the next scheduling option according to the output result of the sample training traffic control model.
In summary, the application digitally marks the AGV path tasks and the real-time map data to form real-time environment state information, calculates the current occupancy of the single-way path resources, and obtains the AGV collision-free evaluation coefficient. The departure speed and departure time of each AGV are adjusted automatically according to this coefficient. The traffic control deep reinforcement learning model then applies normalization and a fully connected linear operation to obtain the sample training traffic control model, and the next scheduling option is predicted from the output of that model. In this way the AGV can find the optimal scheduling value in each scheduling task and complete single-way collision-free travel automatically, which improves the collision-free operating efficiency of the AGV and reduces its operating cost.
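For illustration, the formulas of step S202 for the collision-free evaluation coefficient can be transcribed into Python roughly as follows. The sketch makes several assumptions that the text does not state: each path resource is an ordered tuple of coordinates, shared resources are matched only when traversed in the same direction, the set-valued terms p and (p_j - p) are read as segment counts, and the occupancy table is a dictionary keyed by coordinate. The formulas are transcribed as written, without checking units, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]
Segment = Tuple[Coord, ...]  # one single-way path resource as ordered coordinates


def shared_segments(p_k: List[Segment], p_j: List[Segment]) -> List[Segment]:
    """p = p_k ∩ p_j, kept in the order in which AGV j traverses the segments."""
    other = set(p_k)
    return [seg for seg in p_j if seg in other]


def earliest_departure(p_k: List[Segment], p_j: List[Segment],
                       occupied_until: Dict[Coord, float],
                       d: float, v_j: float) -> float:
    """s_j = t[x][y] + d * v_j - (p_j - p) * v_j, with counts for the set terms."""
    p = shared_segments(p_k, p_j)
    if not p:
        return 0.0  # assumption: no shared resource means no departure delay
    x, y = p[0][0]  # [x, y] = p[0][0], first coordinate of the first shared segment
    return occupied_until[(x, y)] + d * v_j - (len(p_j) - len(p)) * v_j


def waiting_cost(p_k: List[Segment], p_j: List[Segment],
                 v_k: float, v_j: float, c: float) -> float:
    """w[x][y] = p * (v_k - v_j) + C, with p read as the number of shared segments."""
    return len(shared_segments(p_k, p_j)) * (v_k - v_j) + c


# Hypothetical routes: both AGVs use the segment from (0, 1) to (0, 2).
route_k = [((0, 0), (0, 1)), ((0, 1), (0, 2))]
route_j = [((1, 1), (0, 1)), ((0, 1), (0, 2))]
s_j = earliest_departure(route_k, route_j, {(0, 1): 10.0}, d=2.0, v_j=1.5)
w_j = waiting_cost(route_k, route_j, v_k=2.0, v_j=1.5, c=0.5)
```

Handling the opposite-direction case would require matching a segment against its reversed coordinate order as well, which this sketch omits.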
It should be noted that the working procedure described above is merely illustrative and does not limit the scope of the present application; in practical applications, a person skilled in the art may select some or all of the steps according to actual needs to achieve the purpose of the embodiment, which is not limited herein.
Other embodiments or specific implementations of the single-way collision-free AGV multi-target traffic control method and system according to the present application may refer to the above method embodiments and are not described again here.
Finally, the foregoing description of the preferred embodiments is not intended to limit the application to the precise form disclosed; any modifications, equivalents, and alternatives falling within the spirit and principles of the application are intended to be included within its scope.
Claims (8)
1. A single-way collision-free AGV multi-target traffic control method, characterized in that the method comprises the following steps:
S10, digitally marking the single-way path resources and the AGV path tasks of a target real map and storing them in a database;
S20, acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining an AGV collision-free evaluation coefficient;
S30, constructing a traffic control deep reinforcement learning model based on the real-time environment state information; the traffic control deep reinforcement learning model is realized by the following formulas:
$x_i = (S_t, A_t, R_t, S_{t+1})$
$A_t = (a_k)$
wherein $x_i$ is the i-th training sample, $S_t$ is the environmental state at time t, $A_t$ is the action taken at time t, $S_{t+1}$ represents an AGV assigned to the next departure, and $R_t$ is the scheduling value obtained after taking the action at time t,
$C_{path,t}$ indicates the idle loss of the single-way path resources in the map module at time t,
the remaining terms indicate the collision waiting loss of the AGV at time t
and the total length of time lost for all AGVs to complete their tasks,
$Q(S_t, A_t)$ represents the traffic control deep reinforcement learning model, which is obtained by iteratively accumulating the scheduling values from time t and taking the expectation; the Q value is used for reinforcement learning;
S40, based on the traffic control deep reinforcement learning model, normalizing the feature vectors of the training samples, obtaining a sample training traffic control model through a prediction submodule, and completing iterative optimization by a gradient descent method during the prediction process;
S50, performing deployment prediction with the sample training traffic control model, wherein deployment prediction means inputting real-time environment state information into the sample training traffic control model and predicting the next scheduling option according to the output result of the sample training traffic control model.
2. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S10 includes the following:
S101, marking the abscissa and ordinate of the workshop range and of the drivable single-way paths as $[x_i, y_i]$;
S102, marking the abscissa and ordinate of the start point and end point of the AGV travel path as $[x_0, y_0]$ and $[x_n, y_n]$;
S103, marking the path resource number of the AGV's travel as $p_j$, the travel speed of the AGV as $v_j$, the category number of the AGV as $a_j$, and the safety distance as $d$.
3. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S20 includes the following:
S201, the real-time environment state information comprises an AGV state and a map state;
the AGV state comprises a departure point, a target point, a travel path resource number $p_j$, a travel speed $v_j$, a category number $a_j$ and a waiting time $w$;
the map state comprises the abscissa and ordinate $[x_i, y_i]$ of the workshop range and of the drivable single-way paths, the time $t$ at which a path resource is occupied, and the AGV travel safety distance $d$;
S202, integrating the AGV state and the map state to form the AGV collision-free evaluation coefficient;
the AGV collision-free evaluation coefficient gives the collision-free departure time of the AGV based on the occupancy of the single-way path resources, and is calculated as follows:
$p = p_k \cap p_j$
$[x, y] = p[0][0]$
$s_j = t[x][y] + d \times v_j - (p_j - p) \times v_j$
$w[x][y] = p \times (v_k - v_j) + C$
wherein $p$ is the set of collision paths of the two AGVs; $p_k$ and $p_j$ denote the travel path resource numbers of the two AGVs respectively; $p[0][0]$ is the first coordinate of the first collision path segment in the same-direction or opposite-direction case; $s_j$ is the earliest possible departure time of $a_j$; $w[x][y]$ is the waiting time consumption of $a_j$; and $C$ is a correction coefficient for the waiting time consumption, set by the user according to actual conditions.
S203, comparing the AGV collision-free evaluation coefficient with a preset AGV collision-free evaluation coefficient threshold; if the coefficient exceeds the preset range, the AGV is prone to collision, and its travel speed or waiting time is adaptively adjusted before the collision risk arises.
4. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S40 includes the following:
S401, normalizing the feature vector of the training sample, the feature vector of the current input training sample being given by a calculation formula
in which $x_{i1}, x_{i2}, \ldots, x_{in}$ represent the currently required input information and $M$ represents a learnable embedding transformation matrix.
5. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S40 further includes the following:
S402, inputting the feature vector of the training sample into the prediction submodule to obtain the sample training traffic control model; the prediction submodule process comprises the following:
performing normalization and a fully connected linear operation on the feature vector of the training sample to finally obtain the sample training traffic control model,
wherein, in the specific formula, the normalization result is obtained first; $z_j$ is the sample training traffic control model after the fully connected linear operation; $W_1$ and $b_1$ are adaptively learnable parameters; LN denotes the layer normalization operation; and a further operator denotes the feedforward neural network layer operation.
6. The single-way collision-free AGV multi-target traffic control method according to claim 5, wherein: the step S40 further includes the following:
performing iterative optimization by a gradient descent method during the training sample prediction process, the iterative optimization conforming to an expression
in which the gradient term represents the negative direction of the gradient and $C_j$ represents the search step size in that direction.
7. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the deployment prediction means inputting real-time environmental state information into the trained traffic control model and predicting, from the model's output, a multi-objective optimized AGV departure sequence with minimum traffic time and minimum waiting delay, together with the corresponding departure times; to predict the next scheduling option, the feature vector is passed through a single fully connected layer and a Softmax operation in the prediction submodule to obtain a score for each scheduling option, and the scheduling option with the highest score is taken as the prediction result for the next scheduling option in the current state.
8. A single-way collision-free AGV multi-target traffic control system comprising:
the traffic control digitizing module is used for digitally marking the single-way path resource and the AGV path task of the target real map and storing the single-way path resource and the AGV path task into a database;
the collision-free evaluation coefficient module is used for acquiring real-time environment state information, calculating the current single-way road path resource occupation condition and acquiring an AGV collision-free evaluation coefficient;
the traffic control deep reinforcement module is used for constructing a traffic control deep reinforcement learning model based on the real-time environmental state information;
the traffic control iteration optimization module is used for carrying out normalization processing on the feature vector of the training sample based on the traffic control deep reinforcement learning model, obtaining a sample training traffic control model through the prediction submodule, and completing iteration optimization by using a gradient descent method in the prediction process;
the traffic control deployment prediction sub-module is used for inputting the real-time environment state information into the sample training traffic control model to perform deployment prediction, and for predicting the next scheduling option according to the output result of the sample training traffic control model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310634753.7A CN116629479A (en) | 2023-05-31 | 2023-05-31 | AGV multi-target traffic control method and system for single-way collision-free AGV |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310634753.7A CN116629479A (en) | 2023-05-31 | 2023-05-31 | AGV multi-target traffic control method and system for single-way collision-free AGV |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116629479A (en) | 2023-08-22 |
Family
ID=87613172
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310634753.7A Pending CN116629479A (en) | 2023-05-31 | 2023-05-31 | AGV multi-target traffic control method and system for single-way collision-free AGV |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116629479A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117077882A (en) * | 2023-10-17 | 2023-11-17 | 之江实验室 | Unmanned equipment scheduling method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108459503B (en) | Unmanned surface vehicle track planning method based on quantum ant colony algorithm | |
CN112833905B (en) | Distributed multi-AGV collision-free path planning method based on improved A* algorithm | |
CN110807236A (en) | Warehouse logistics simulation system based on multiple robots | |
CN106779252B (en) | AGV real-time route planning method based on improved quantum ant colony algorithm | |
CN116629479A (en) | AGV multi-target traffic control method and system for single-way collision-free AGV | |
US11774947B2 (en) | Industrial internet of things for material transportation control, control methods and media thereof | |
CN112465192B (en) | Task scheduling method, device, equipment and medium | |
CN111352713B (en) | Automatic driving reasoning task workflow scheduling method oriented to time delay optimization | |
CN116151499A (en) | Intelligent multi-mode intermodal route planning method based on improved simulated annealing algorithm | |
CN117093009B (en) | Logistics AGV trolley navigation control method and system based on machine vision | |
CN114444809A (en) | Data-driven multi-target strip mine card path optimization method | |
Xia et al. | A multi-AGV optimal scheduling algorithm based on particle swarm optimization | |
CN116720703A (en) | AGV multi-target task scheduling method and system based on deep reinforcement learning | |
CN113222248B (en) | Automatic taxi-driving charging pile selection method | |
Wang et al. | Research on optimization of multi-AGV path based on genetic algorithm considering charge utilization | |
US11614491B2 (en) | Systems and methods for predicting the cycle life of cycling protocols | |
CN112734111B (en) | Horizontal transport task AGV dynamic time prediction method | |
Yuan et al. | Research on flexible job shop scheduling problem with AGV using double DQN | |
Lin et al. | Multi-task assignment of logistics distribution based on modified ant colony optimization | |
Jiao et al. | The Optimization Model of E‐Commerce Logistics Distribution Path Based on GIS Technology | |
Okubo et al. | Multi-Agent Action Graph Based Task Allocation and Path Planning Considering Changes in Environment | |
CN112149921A (en) | Large-scale electric logistics vehicle path planning method and system and charging planning method | |
CN112465176B (en) | Driving route planning method and device | |
CN117151596B (en) | Logistics management method, system and storage medium for storage AGVs (automatic guided vehicle) through Internet of things | |
CN114924593B (en) | Quick planning method for vehicle and multi-unmanned aerial vehicle combined route |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||