CN116629479A - AGV multi-target traffic control method and system for single-way collision-free AGV

Publication number
CN116629479A
Authority
CN
China
Prior art keywords
agv
traffic control
collision
free
path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310634753.7A
Other languages
Chinese (zh)
Inventor
吴小倩
郑益民
吴庆耀
秦卓睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Bangqi Technology Intelligent Development Co ltd
Original Assignee
Shenzhen Bangqi Technology Intelligent Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Bangqi Technology Intelligent Development Co ltd filed Critical Shenzhen Bangqi Technology Intelligent Development Co ltd
Priority to CN202310634753.7A priority Critical patent/CN116629479A/en
Publication of CN116629479A publication Critical patent/CN116629479A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q10/047 Optimisation of routes or paths, e.g. travelling salesman problem
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Mathematical Physics (AREA)
  • Operations Research (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application discloses a single-way collision-free AGV multi-target traffic control method and system, and relates to the technical field of AGV collision avoidance. The method comprises the following steps: S10, digitally marking the single-way path resources and the AGV path tasks of a target real-world map and storing them in a database; S20, acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining an AGV collision-free evaluation coefficient; S30, constructing a traffic control deep reinforcement learning model based on the real-time environment state information. The traffic control deep reinforcement learning model is used to perform normalization and a fully connected linear operation, finally yielding a sample training traffic control model, and the next scheduling option is predicted from the output of that model. This helps the AGV find the optimal scheduling value in each scheduling task and complete collision-free travel on single-way paths automatically, which improves the efficiency of collision-free AGV travel and reduces AGV operating cost.

Description

AGV multi-target traffic control method and system for single-way collision-free AGV
Technical Field
The application relates to the technical field of AGV collision avoidance, and in particular to a single-way collision-free AGV multi-target traffic control method and system.
Background
AGV is short for automated guided vehicle: a vehicle that is loaded with goods automatically or manually and then travels along a set route, or tows a loaded cart, to a set place where the goods are unloaded automatically. With the continuous development of science and technology, AGVs use computer technology and artificial intelligence to understand tasks and plan reasonable paths, which improves AGV performance and reduces manufacturing cost, so AGVs are increasingly widely used in the automobile assembly line industry.
Traffic control in the automobile assembly line industry has made great progress with rule-driven and heuristic methods. Most of these methods, however, emphasize one specific scheduling rule, ignore the real-time dynamic environment information of the whole system, and fit only one particular AGV scheduling scenario; they also fail to account for the time an AGV spends waiting because of collisions during actual operation. The problem to be solved in traffic control is therefore how to incorporate real-time environment state information under the collision-free constraint, so that fewer environment resources are wasted while traffic collisions and blocking during AGV travel are also reduced.
Disclosure of Invention
In order to overcome the above-mentioned drawbacks of the prior art, embodiments of the present application provide a single-way collision-free AGV multi-target traffic control method and system, so as to solve the problems set forth in the above-mentioned background art.
To achieve the above purpose, the present application provides the following technical solution: a single-way collision-free AGV multi-target traffic control method, comprising the following steps:
S10, digitally marking the single-way path resources and the AGV path tasks of a target real-world map and storing them in a database;
S20, acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining an AGV collision-free evaluation coefficient;
S30, constructing a traffic control deep reinforcement learning model based on the real-time environment state information; the traffic control deep reinforcement learning model is realized by the following formulas:
$x_i = (S_t, A_t, R_t, S_{t+1})$
$A_t = (a_k)$
wherein $x_i$ is the $i$-th training sample, $S_t$ is the environment state at time $t$, $A_t$ is the action taken at time $t$, $S_{t+1}$ represents the AGV assigned to depart next, and $R_t$ is the scheduling value obtained after taking the action at time $t$; the terms that enter $R_t$ are:
$C_{path,t}$, the idle loss of the single-way path resources in the map module at time $t$,
the collision-waiting loss of the AGV at time $t$, and
the total time lost for all AGVs to complete their tasks;
$Q(S_t, A_t)$ represents the traffic control deep reinforcement learning model, which is obtained by iteratively accumulating the scheduling values at time $t$ and taking the expectation; the Q value is used for reinforcement learning;
S40, based on the traffic control deep reinforcement learning model, normalizing the feature vectors of the training samples, obtaining a sample training traffic control model through a prediction submodule, and completing iterative optimization by a gradient descent method during prediction;
S50, carrying out deployment prediction with the sample training traffic control model, wherein deployment prediction means inputting the real-time environment state information into the sample training traffic control model and predicting the next scheduling option from the output of that model.
Preferably, the step S10 includes the following:
S101, marking the abscissa and ordinate of the workshop range and of the drivable single-way paths as $[x_i, y_i]$;
S102, marking the abscissa and ordinate of the start point and end point of the AGV travel path as $[x_0, y_0]$ and $[x_n, y_n]$;
S103, marking the path resource number of the AGV's route as $p_j$, the travel speed of the AGV as $v_j$, the type number of the AGV as $a_j$, and the safety distance as $d$.
Preferably, the step S20 includes the following:
S201, the real-time environment state information comprises an AGV state and a map state;
the AGV state comprises the departure point, the target point, the path resource number $p_j$, the travel speed $v_j$, the type number $a_j$, and the waiting time $w$;
the map state comprises the abscissa and ordinate $[x_i, y_i]$ of the workshop range and of the drivable single-way paths, the time $t$ at which each path resource is occupied, and the AGV travel safety distance $d$.
S202, combining the AGV state and the map state to form the AGV collision-free evaluation coefficient;
the AGV collision-free evaluation coefficient yields the collision-free departure time of the AGV from the occupancy of the single-way path resources, and is calculated as follows:
$p = p_k \cap p_j$
$[x, y] = p[0][0]$
$s_j = t[x][y] + d \times v_j - (p_j - p) \times v_j$
$w[x][y] = p \times (v_k - v_j) + C$
wherein $p$ is the set of colliding path segments of the two AGVs, $p_k$ and $p_j$ are the path resource numbers of the two AGVs, $p[0][0]$ is the first coordinate of the first colliding path segment in the same-direction or opposite-direction case, $s_j$ is the earliest possible departure time of $a_j$, $w[x][y]$ is the waiting time consumed by $a_j$, and $C$ is a correction coefficient for the waiting time consumption, set by the user according to actual conditions.
S203, comparing the AGV collision-free evaluation coefficient with a preset threshold; if the coefficient falls outside the preset range, the AGV is prone to collision, and its travel speed or waiting time is adaptively adjusted when a collision risk is about to arise.
Preferably, the step S40 includes the following:
S401, normalizing the feature vector of the training sample to obtain the feature vector of the current input training sample; in the calculation formula, $x_{i1}, x_{i2}, \ldots, x_{in}$ represent the currently required input information and $M$ represents a learnable embedding transformation matrix.
Preferably, the step S40 further includes the following:
S402, inputting the feature vector of the training sample into a prediction submodule to obtain a sample training traffic control model; the prediction submodule process comprises:
performing normalization and a fully connected linear operation on the feature vectors of the training samples to finally obtain the sample training traffic control model; in the formula, the normalization result is produced by the layer normalization operation $\mathrm{LN}(\cdot)$, $z_j$ is the sample training traffic control model after the fully connected linear operation, $W_1$ and $b_1$ are adaptively learnable parameters, and a feed-forward neural network operation is also applied.
Preferably, the step S40 further includes the following:
iterative optimization is carried out by a gradient descent method during training-sample prediction; in the update expression, the search moves along the negative direction of the gradient and $C_j$ represents the search step in that direction.
Preferably, the deployment prediction means inputting the real-time environment state information into the trained traffic control model and predicting, from the output of the model, the multi-objective-optimized AGV departure sequence with minimum traffic time and minimum waiting delay, together with the corresponding departure times; for the next scheduling option of the deployment prediction, the feature vector is passed through a single fully connected layer and a Softmax operation in the prediction submodule to obtain a score for each scheduling option, and the scheduling option with the highest score is taken as the predicted next scheduling option in the current state.
A single-way collision-free AGV multi-target traffic control system comprising:
a traffic control digitizing module, used for digitally marking the single-way path resources and the AGV path tasks of the target real-world map and storing them in a database;
a collision-free evaluation coefficient module, used for acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining the AGV collision-free evaluation coefficient;
a traffic control deep reinforcement module, used for constructing the traffic control deep reinforcement learning model based on the real-time environment state information;
a traffic control iterative optimization module, used for normalizing the feature vectors of the training samples based on the traffic control deep reinforcement learning model, obtaining the sample training traffic control model through the prediction submodule, and completing iterative optimization by a gradient descent method during prediction;
a traffic control deployment prediction submodule, which inputs the real-time environment state information into the sample training traffic control model for deployment prediction and predicts the next scheduling option from the output of that model.
The application has the technical effects and advantages that:
(1) The application digitally marks the AGV path tasks and the real-time map data to form real-time environment state information, calculates the current occupancy of the single-way path resources, and obtains the AGV collision-free evaluation coefficient, from which the departure speed and departure time of the AGV are adjusted automatically. The traffic control deep reinforcement learning model is then used to perform normalization and a fully connected linear operation, finally yielding the sample training traffic control model, and the next scheduling option is predicted from the output of that model. This helps the AGV find the optimal scheduling value in each scheduling task and complete collision-free travel on single-way paths automatically, which improves the efficiency of collision-free AGV travel and reduces AGV operating cost;
(2) The single-way collision-free AGV multi-target traffic control method uses a resource occupation rule, so that real-time road condition information of the travel path is taken into account while the AGV is travelling, and the waste of single-way path resources is reduced;
(3) The traffic control method also introduces a deep reinforcement learning traffic control model, which attends to global environment information and dynamically feeds back the real-time environment state. It further introduces the AGV collision waiting time, so that traffic collisions and blocking during AGV travel are taken into account. The method therefore better fits AGV traffic control in a specific scenario, and its multi-objective scheduling substantially improves traffic control efficiency.
Drawings
FIG. 1 is a flow chart of the method of the present application.
Fig. 2 is a block diagram of the system architecture of the present application.
Detailed Description
The embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the application.
Traffic control: a manufacturing system has multiple material transport tasks and multiple AGVs. Under given constraints, the execution order and departure times of the AGVs must be specified, and real-time scheduling under dynamic conditions must be carried out to complete the material transport tasks, so that the whole system meets given performance optimization targets while remaining collision-free on the one-way road map.
Example 1
Referring to fig. 1-2, the embodiment provides a single-way collision-free AGV multi-target traffic control method, which specifically includes the following steps:
S10, digitally marking the single-way path resources and the AGV path tasks of a target real-world map and storing them in a database;
Referring to FIGS. 1-2, step S10 comprises the following:
S101, marking the abscissa and ordinate of the workshop range and of the drivable single-way paths as $[x_i, y_i]$;
S102, marking the abscissa and ordinate of the start point and end point of the AGV travel path as $[x_0, y_0]$ and $[x_n, y_n]$;
S103, marking the path resource number of the AGV's route as $p_j$, the travel speed of the AGV as $v_j$, the type number of the AGV as $a_j$, and the safety distance as $d$.
In this embodiment, the target real-world map and the path states of the AGVs are digitally marked to form a dynamic path planning map, which makes it convenient to carry out collision-free calculations among multiple AGVs.
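The digital marking of step S10 can be pictured as a small set of records. The following is a minimal sketch, not the application's implementation: the class names `MapResources` and `AgvTask`, the field layout, the reading of $p_j$ as an ordered list of grid cells, and the in-memory dictionary standing in for the database are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

Coord = tuple[int, int]  # [x_i, y_i]; an integer grid is assumed here


@dataclass
class MapResources:
    """Workshop range and drivable single-way path cells (S101)."""
    cells: set[Coord] = field(default_factory=set)


@dataclass
class AgvTask:
    """One AGV path task (S102, S103); field names are hypothetical."""
    a_j: int           # type number of the AGV
    p_j: list[Coord]   # path resource, read here as ordered cells from start to end
    v_j: float         # travel speed
    d: float           # safety distance

    @property
    def start(self) -> Coord:  # [x_0, y_0]
        return self.p_j[0]

    @property
    def end(self) -> Coord:    # [x_n, y_n]
        return self.p_j[-1]


def store(db: dict, workshop: MapResources, task: AgvTask) -> None:
    """Store a task after checking that its path only uses marked cells."""
    if not set(task.p_j) <= workshop.cells:
        raise ValueError("path leaves the marked single-way resources")
    db[task.a_j] = task


workshop = MapResources(cells={(0, 0), (1, 0), (2, 0), (2, 1)})
db: dict[int, AgvTask] = {}
store(db, workshop, AgvTask(a_j=1, p_j=[(0, 0), (1, 0), (2, 0)], v_j=1.0, d=0.5))
```

Keeping the path as an ordered list makes the start and end points of S102 derivable from the path itself, so they cannot drift out of sync with it.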
S20, acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining an AGV collision-free evaluation coefficient;
Referring to FIGS. 1-2, step S20 comprises the following:
S201, the real-time environment state information comprises an AGV state and a map state;
the AGV state comprises the departure point, the target point, the path resource number $p_j$, the travel speed $v_j$, the type number $a_j$, and the waiting time $w$;
the map state comprises the abscissa and ordinate $[x_i, y_i]$ of the workshop range and of the drivable single-way paths, the time $t$ at which each path resource is occupied, and the AGV travel safety distance $d$.
S202, combining the AGV state and the map state to form the AGV collision-free evaluation coefficient;
the AGV collision-free evaluation coefficient yields the collision-free departure time of the AGV from the occupancy of the single-way path resources, and is calculated as follows:
$p = p_k \cap p_j$
$[x, y] = p[0][0]$
$s_j = t[x][y] + d \times v_j - (p_j - p) \times v_j$
$w[x][y] = p \times (v_k - v_j) + C$
wherein $p$ is the set of colliding path segments of the two AGVs, $p_k$ and $p_j$ are the path resource numbers of the two AGVs, $p[0][0]$ is the first coordinate of the first colliding path segment in the same-direction or opposite-direction case, $s_j$ is the earliest possible departure time of $a_j$, $w[x][y]$ is the waiting time consumed by $a_j$, and $C$ is a correction coefficient for the waiting time consumption, set by the user according to actual conditions.
S203, comparing the AGV collision-free evaluation coefficient with a preset threshold; if the coefficient falls outside the preset range, the AGV is prone to collision, and its travel speed or waiting time is adaptively adjusted when a collision risk is about to arise.
In this embodiment, the path state of the AGV and the map state information are acquired and associated to form the AGV collision-free evaluation coefficient, and a suitable waiting-time correction coefficient is selected according to actual conditions. Judging the coefficient as a whole lets the AGV assess collision risk from the calculation result and trace the cause of that risk, which makes further handling by the AGV easier.
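The formulas of S202 leave the arithmetic on path sets open to interpretation. The sketch below is one possible reading under explicit assumptions: each path is an ordered list of cells, $p_j$ and $p$ enter the arithmetic through their cell counts, and `occupied_at` is a hypothetical lookup standing in for $t[x][y]$. The function and parameter names are illustrative and do not come from the application.

```python
from typing import Optional

Coord = tuple[int, int]


def collision_free_coefficient(
    path_k: list[Coord],
    path_j: list[Coord],
    v_k: float,
    v_j: float,
    occupied_at: dict[Coord, float],
    d: float,
    C: float = 0.0,
) -> Optional[tuple[Coord, float, float]]:
    """Return ([x, y], s_j, w[x][y]) for AGV a_j, or None if the paths never meet."""
    cells_k = set(path_k)
    # p = p_k ∩ p_j, kept in the order in which a_j traverses it.
    p = [cell for cell in path_j if cell in cells_k]
    if not p:
        return None
    xy = p[0]  # [x, y] = p[0][0]: first coordinate of the first shared segment
    # ASSUMPTION: "p_j - p" and "p" are read as numbers of cells.
    s_j = occupied_at[xy] + d * v_j - (len(path_j) - len(p)) * v_j
    w = len(p) * (v_k - v_j) + C
    return xy, s_j, w


result = collision_free_coefficient(
    path_k=[(0, 0), (1, 0), (2, 0)],
    path_j=[(1, 1), (1, 0), (2, 0), (3, 0)],
    v_k=2.0,
    v_j=1.0,
    occupied_at={(1, 0): 10.0},
    d=0.5,
    C=1.0,
)
print(result)  # ((1, 0), 8.5, 3.0)
```

In this example the two paths share the cells (1, 0) and (2, 0), so the first shared coordinate is (1, 0). A threshold comparison as in S203 could then be applied to the returned values.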
S30, constructing a traffic control deep reinforcement learning model based on the real-time environment state information;
Referring to FIGS. 1-2, step S30 comprises the following:
the traffic control deep reinforcement learning model is realized by the following formulas:
$x_i = (S_t, A_t, R_t, S_{t+1})$
$A_t = (a_k)$
wherein $x_i$ is the $i$-th training sample, $S_t$ is the environment state at time $t$, $A_t$ is the action taken at time $t$, $S_{t+1}$ represents the AGV assigned to depart next, and $R_t$ is the scheduling value obtained after taking the action at time $t$; the terms that enter $R_t$ are:
$C_{path,t}$, the idle loss of the single-way path resources in the map module at time $t$,
the collision-waiting loss of the AGV at time $t$, and
the total time lost for all AGVs to complete their tasks;
$Q(S_t, A_t)$ represents the traffic control deep reinforcement learning model, which is obtained by iteratively accumulating the scheduling values at time $t$ and taking the expectation; the Q value is used for reinforcement learning.
In this embodiment, a convolutional neural network extracts training sample features from the AGV state and map state data and reduces the data dimension, forming the traffic control deep reinforcement learning model. This makes it possible to assign different values to different AGV states at initialization, further improves the accuracy and stability of the AGV collision-free data, and reduces the collision waiting time and task completion time of the AGVs while keeping path resource waste in view.
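The structure of a training sample and of a Q-value update can be illustrated as follows. This is a sketch under stated assumptions rather than the application's model: the formula for $R_t$ is not reproduced in this text, so `scheduling_value` simply assumes a negated weighted sum of the three losses, and a standard tabular Q-learning step stands in for the deep network. All names, weights, and hyperparameters are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    """Training sample x_i = (S_t, A_t, R_t, S_{t+1})."""
    s_t: tuple
    a_t: int
    r_t: float
    s_next: tuple


def scheduling_value(c_path: float, c_wait: float, c_total: float,
                     weights=(1.0, 1.0, 1.0)) -> float:
    """ASSUMPTION: R_t is the negated weighted sum of the three losses."""
    w1, w2, w3 = weights
    return -(w1 * c_path + w2 * c_wait + w3 * c_total)


def q_update(q, sample: Sample, actions, alpha=0.5, gamma=0.9) -> None:
    """One standard Q-learning step (a tabular stand-in for the deep model)."""
    best_next = max(q[(sample.s_next, a)] for a in actions)
    target = sample.r_t + gamma * best_next
    key = (sample.s_t, sample.a_t)
    q[key] += alpha * (target - q[key])


q = defaultdict(float)  # Q(S_t, A_t), initialised to zero
x_i = Sample(s_t=("idle",), a_t=0,
             r_t=scheduling_value(1.0, 2.0, 3.0), s_next=("busy",))
q_update(q, x_i, actions=[0, 1])
```

With losses treated as costs, a larger idle loss, waiting loss, or total time gives a lower scheduling value, so maximizing Q pushes the scheduler toward low-loss departures.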
S40, based on the traffic control deep reinforcement learning model, normalizing the feature vectors of the training samples, obtaining a sample training traffic control model through a prediction submodule, and completing iterative optimization by a gradient descent method during prediction;
Referring to FIGS. 1-2, step S40 comprises the following:
S401, normalizing the feature vector of the training sample to obtain the feature vector of the current input training sample; in the calculation formula, $x_{i1}, x_{i2}, \ldots, x_{in}$ represent the currently required input information and $M$ represents a learnable embedding transformation matrix;
S402, inputting the feature vector of the training sample into a prediction submodule to obtain a sample training traffic control model;
the prediction submodule process comprises the following:
performing normalization and a fully connected linear operation on the feature vectors of the training samples to finally obtain the sample training traffic control model; in the formula, the normalization result is produced by the layer normalization operation $\mathrm{LN}(\cdot)$, $z_j$ is the sample training traffic control model after the fully connected linear operation, $W_1$ and $b_1$ are adaptively learnable parameters, and a feed-forward neural network operation is also applied.
S403, performing iterative optimization by a gradient descent method during training-sample prediction.
In the update expression for the iterative optimization of the scheduling options, the search moves along the negative direction of the gradient and $C_j$ represents the search step in that direction.
In this implementation, the prediction submodule obtains the sample training traffic control model by gradient descent: the model is iteratively optimized along the negative gradient direction. The gradient direction is obtained from the normalization processing, and the search step is determined by a line search algorithm, that is, the score coordinate of the next point is treated as $z_{k+1}$, and the $z_{k+1}$ that minimizes $f(z_{k+1})$ is then found. Iterating in this way finds the optimal solution of the objective function and thus the optimal AGV scheduling value.
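Gradient descent with a line-searched step can be sketched as below. The application does not say which line search is used, so this example assumes a backtracking (Armijo) search, and the quadratic objective `f` is a hypothetical stand-in for the real training loss.

```python
def gradient_descent(f, grad, z, iters=50, c0=1.0, shrink=0.5, armijo=1e-4):
    """Minimise f from z, choosing each step C_j by backtracking line search."""
    for _ in range(iters):
        g = grad(z)
        g_sq = sum(gi * gi for gi in g)
        if g_sq < 1e-12:  # gradient vanished: stationary point reached
            break
        step = c0
        # Shrink the step until f decreases enough along the negative gradient.
        while step > 1e-12:
            z_next = [zi - step * gi for zi, gi in zip(z, g)]
            if f(z_next) <= f(z) - armijo * step * g_sq:
                break
            step *= shrink
        z = [zi - step * gi for zi, gi in zip(z, g)]
    return z


# Toy objective (hypothetical): f(z) = (z0 - 3)^2 + 2 * (z1 + 1)^2
def f(z):
    return (z[0] - 3) ** 2 + 2 * (z[1] + 1) ** 2


def grad(z):
    return [2 * (z[0] - 3), 4 * (z[1] + 1)]


z_star = gradient_descent(f, grad, [0.0, 0.0])
```

Each outer iteration corresponds to one move from $z_k$ to $z_{k+1}$. The inner loop plays the role of choosing the search step $C_j$, here by sufficient decrease rather than by exact minimization of $f(z_{k+1})$.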
S50, carrying out deployment prediction with the sample training traffic control model, wherein deployment prediction means inputting the real-time environment state information into the sample training traffic control model and predicting the next scheduling option from the output of that model.
In this embodiment, deployment prediction specifically means inputting the real-time environment state information into the trained traffic control model and predicting, from the output of the model, the multi-objective-optimized AGV departure sequence with minimum traffic time and minimum waiting delay, together with the corresponding departure times. For the next scheduling option of the deployment prediction, the feature vector is passed through a single fully connected layer and a Softmax operation in the prediction submodule to obtain a score for each scheduling option, and the scheduling option with the highest score is taken as the predicted next scheduling option in the current state.
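The scoring path described for deployment prediction (layer normalization, one fully connected layer, Softmax, then the highest-scoring option) can be sketched in plain Python. The weights `W1` and `b1` below are hand-set, hypothetical values rather than trained parameters, and the three-feature state is invented for illustration.

```python
import math


def layer_norm(x, eps=1e-5):
    """LN(.): zero mean and unit variance over the feature dimension."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(x, W, b):
    """Fully connected layer z = W x + b (one row of W per scheduling option)."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict_next_option(features, W1, b1):
    """Score every scheduling option and return (best index, scores)."""
    scores = softmax(linear(layer_norm(features), W1, b1))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical 3-feature state and 2 scheduling options.
W1 = [[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]
b1 = [0.0, 0.0]
best, scores = predict_next_option([3.0, 2.0, 1.0], W1, b1)
```

Because Softmax is monotonic, the chosen option is the same as taking the largest fully connected output directly; the Softmax only turns the outputs into scores that sum to one.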
Example 2
Referring to FIGS. 1-2, the present embodiment provides a single-way collision-free AGV multi-target traffic control system, which specifically comprises the following:
a traffic control digitizing module, used for digitally marking the single-way path resources and the AGV path tasks of the target real-world map and storing them in a database;
a collision-free evaluation coefficient module, used for acquiring real-time environment state information, calculating the current occupancy of the single-way path resources, and obtaining the AGV collision-free evaluation coefficient;
a traffic control deep reinforcement module, used for constructing the traffic control deep reinforcement learning model based on the real-time environment state information;
a traffic control iterative optimization module, used for normalizing the feature vectors of the training samples based on the traffic control deep reinforcement learning model, obtaining the sample training traffic control model through the prediction submodule, and completing iterative optimization by a gradient descent method during prediction;
a traffic control deployment prediction submodule, which inputs the real-time environment state information into the sample training traffic control model for deployment prediction and predicts the next scheduling option from the output of that model.
In summary, the application digitally marks the AGV path tasks and the real-time map data to form real-time environment state information, calculates the current occupancy of the single-way path resources, and obtains the AGV collision-free evaluation coefficient, from which the departure speed and departure time of the AGV are adjusted automatically. The traffic control deep reinforcement learning model is then used to perform normalization and a fully connected linear operation, finally yielding the sample training traffic control model, and the next scheduling option is predicted from the output of that model. This helps the AGV find the optimal scheduling value in each scheduling task and complete collision-free travel on single-way paths automatically, which improves the efficiency of collision-free AGV travel and reduces AGV operating cost.
It should be noted that the above-described working procedure is merely illustrative, and does not limit the scope of the present application, and in practical application, a person skilled in the art may select part or all of them according to actual needs to achieve the purpose of the embodiment, which is not limited herein.
Other embodiments or specific implementations of a single-way collision-free AGV multi-target traffic control method and system according to the present application may refer to the above method embodiments, and are not described herein.
Finally, the foregoing description of the preferred embodiments is not intended to limit the application; any modifications, equivalents, and alternatives falling within the spirit and principles of the application are intended to be included within the scope of the application.

Claims (8)

1. A single-way collision-free AGV multi-target traffic control method is characterized in that: the method comprises the following steps:
S10, digitally marking the single-way path resources and the AGV path tasks of a target real-world map and storing them in a database;
S20, acquiring real-time environment state information, calculating the current single-way path resource occupation, and acquiring an AGV collision-free evaluation coefficient;
S30, constructing a traffic control deep reinforcement learning model based on the real-time environment state information; the traffic control deep reinforcement learning model is realized by the following formulas:
$x_i = (S_t, A_t, R_t, S_{t+1})$
$A_t = (a_k)$
wherein $x_i$ is the $i$-th training sample, $S_t$ is the environment state at time $t$, $A_t$ is the action taken at time $t$, $S_{t+1}$ represents an AGV assigned to the next departure, and $R_t$ is the scheduling value obtained after taking the action at time $t$,
$C_{path,t}$ indicates the idle loss of the single-way path resources in the map module at time $t$,
indicating the crash waiting loss of the AGV at time t,
indicating the total length of time lost for all AGVs to complete the task,
$Q(S_t, A_t)$ represents the traffic control deep reinforcement learning model, which is obtained by iteratively accumulating the scheduling values from time $t$ onward and taking the expectation; the Q value is used for reinforcement learning;
S40, based on the traffic control deep reinforcement learning model, carrying out normalization processing on the feature vectors of the training samples, obtaining a sample training traffic control model through a prediction submodule, and completing iterative optimization by using a gradient descent method in the prediction process;
S50, carrying out deployment prediction with the sample training traffic control model, wherein the deployment prediction inputs real-time environment state information into the sample training traffic control model and predicts the next scheduling option according to the output result of the sample training traffic control model.
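The training sample $x_i = (S_t, A_t, R_t, S_{t+1})$ and the role of the Q value in step S30 can be illustrated with a small sketch. It is a tabular stand-in, not the deep model of the claim: the state encoding, the learning rate `alpha`, the discount `gamma`, and the names `Transition` and `q_update` are assumptions introduced here, and the reward values are arbitrary because the formula for $R_t$ is not reproduced in this text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

State = Tuple[int, ...]  # hypothetical encoding of the environment state S_t


@dataclass(frozen=True)
class Transition:
    """One training sample x_i = (S_t, A_t, R_t, S_{t+1})."""
    state: State
    action: int       # index k of the chosen scheduling option a_k
    reward: float     # scheduling value R_t (illustrative number only)
    next_state: State


def q_update(q: Dict[Tuple[State, int], float], tr: Transition,
             n_actions: int, alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Move Q(S_t, A_t) toward R_t + gamma * max_a Q(S_{t+1}, a).

    Repeating this update accumulates scheduling values over time, which is
    the tabular analogue of the expected accumulated value in the claim.
    """
    best_next = max(q.get((tr.next_state, a), 0.0) for a in range(n_actions))
    old = q.get((tr.state, tr.action), 0.0)
    new = old + alpha * (tr.reward + gamma * best_next - old)
    q[(tr.state, tr.action)] = new
    return new
```

A deep variant would replace the dictionary with a neural network and fit it by gradient descent, as steps S40 and S50 describe.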
2. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S10 includes the following:
S101, marking the abscissa and ordinate of the workshop range and of the drivable single-way paths as $[x_i, y_i]$;
S102, marking the abscissa and ordinate of the start point and end point of the AGV driving path as $[x_0, y_0]$ and $[x_n, y_n]$;
S103, marking the path resource number of the AGV to travel as $p_j$, the travel speed of the AGV as $v_j$, the type number of the AGV as $a_j$, and the safety distance as $d$.
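The digital marking of steps S101 to S103 can be pictured as a small record per AGV path task. This is a minimal sketch under assumptions: the class name `AgvPathTask`, the representation of a path resource as an ordered list of grid coordinates, and the helper `is_on_map` are hypothetical and do not come from the claims.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Coord = Tuple[int, int]  # [x_i, y_i] of a drivable single-way path cell


@dataclass
class AgvPathTask:
    agv_type: int           # a_j, type number of the AGV
    path: List[Coord]       # p_j, assumed here to be an ordered list of cells
    speed: float            # v_j, travel speed
    safety_distance: float  # d

    @property
    def start(self) -> Coord:
        """[x_0, y_0], the start point of the driving path."""
        return self.path[0]

    @property
    def end(self) -> Coord:
        """[x_n, y_n], the end point of the driving path."""
        return self.path[-1]


def is_on_map(task: AgvPathTask, drivable: Set[Coord]) -> bool:
    """Check that every cell of the task lies on a marked drivable path."""
    return all(cell in drivable for cell in task.path)
```

Records of this shape could then be written to whatever database the deployment uses; the storage layer is left out of the sketch.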
3. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S20 includes the following:
S201, the real-time environment state information comprises an AGV state and a map state;
the AGV state comprises a departure point, a target point, a travel path resource number $p_j$, a travel speed $v_j$, a type number $a_j$ and a waiting time $w$;
the map state comprises the coordinates $[x_i, y_i]$ of the workshop range and the drivable single-way paths, the time $t$ at which each path resource is occupied, and the AGV driving safety distance $d$.
S202, integrating the AGV state and the map state to form an AGV collision-free evaluation coefficient;
the AGV collision-free evaluation coefficient is based on the occupation of the single-way path resources and yields the collision-free departure time of the AGV; it is calculated as follows:
$p = p_k \cap p_j$
$[x, y] = p[0][0]$
$s_j = t[x][y] + d \times v_j - (p_j - p) \times v_j$
$w[x][y] = p \times (v_k - v_j) + C$
wherein $p$ is the set of collision paths of the two AGVs; $p_k$ and $p_j$ denote the path resource numbers of the two AGVs, respectively; $p[0][0]$ is the first coordinate of the first collision path segment in the same-direction or opposite-direction case; $s_j$ is the earliest possible departure time of $a_j$; $w[x][y]$ is the waiting time consumed by $a_j$; and $C$ is a correction coefficient for the waiting time consumption, set by the user according to actual conditions.
S203, comparing the AGV collision-free evaluation coefficient with a preset AGV collision-free evaluation coefficient threshold; if the coefficient exceeds the preset range, the AGV is prone to collision, and the running speed or waiting time of the AGV is adaptively adjusted when a collision risk is about to occur.
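The formulas of step S202 can be transcribed into a short sketch. Several readings are assumptions made only for illustration: each path is a flat ordered list of cells (so the first collision coordinate is `p[0]` rather than `p[0][0]`), the set-valued terms $p$ and $p_j - p$ are read as cell counts, and the occupancy table `t` is a dictionary keyed by coordinate. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]


def collision_cells(p_k: List[Coord], p_j: List[Coord]) -> List[Coord]:
    """p = p_k ∩ p_j, kept in the travel order of AGV j."""
    shared = set(p_k)
    return [cell for cell in p_j if cell in shared]


def evaluate(p_k: List[Coord], p_j: List[Coord], v_k: float, v_j: float,
             t: Dict[Coord, float], d: float,
             C: float = 0.0) -> Optional[Tuple[Coord, float, float]]:
    """Return ([x, y], s_j, w[x][y]) or None when the paths never overlap."""
    p = collision_cells(p_k, p_j)
    if not p:
        return None  # no shared path resource, so no adjustment is needed
    xy = p[0]  # first coordinate of the first collision segment
    # s_j = t[x][y] + d * v_j - (p_j - p) * v_j, with sets read as counts
    s_j = t[xy] + d * v_j - (len(p_j) - len(p)) * v_j
    # w[x][y] = p * (v_k - v_j) + C, with p read as the number of shared cells
    w = len(p) * (v_k - v_j) + C
    return xy, s_j, w
```

For example, with `p_k = [(0, 0), (1, 0), (2, 0)]`, `p_j = [(2, 1), (2, 0), (1, 0)]`, `v_k = 2.0`, `v_j = 1.0`, `d = 1.0`, `C = 0.5` and `t = {(2, 0): 10.0}`, the shared cells are `(2, 0)` and `(1, 0)`, giving `s_j = 10.0` and `w = 2.5`. The result would then be compared with the preset threshold of step S203.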
4. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S40 includes the following:
S401, carrying out normalization processing on the feature vector of the training sample, the feature vector of the current input training sample being calculated by the following formula:
wherein the result of the formula represents the feature vector of the current input sample, $x_{i1}, x_{i2}, \ldots, x_{in}$ represent the currently required input information, and $M$ represents a learnable embedding transformation matrix.
5. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the step S40 further includes the following:
S402, inputting the feature vector of the training sample into a prediction submodule to obtain a sample training traffic control model; the prediction submodule process comprises the following steps:
carrying out normalization and a fully connected linear operation on the feature vectors of the training samples to finally obtain a sample training traffic control model, wherein the specific formula is as follows:
wherein the normalized result is obtained by the layer normalization operation $\mathrm{LN}(\cdot)$; $z_j$ is the sample training traffic control model obtained after the fully connected linear operation; $W_1$ and $b_1$ are parameters that can be adaptively learned; and the formula further includes a feedforward neural network layer operation.
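The two operations named in step S402, layer normalization followed by a fully connected linear operation with learnable $W_1$ and $b_1$, can be sketched in plain Python. The exact formula is not reproduced in this text, so the order of operations, the absence of a learnable scale and shift in the normalization, the stabilizer `eps`, and the function names are all assumptions.

```python
import math
from typing import List


def layer_norm(x: List[float], eps: float = 1e-5) -> List[float]:
    """LN(x): subtract the mean and divide by the standard deviation."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(x: List[float], W: List[List[float]], b: List[float]) -> List[float]:
    """Fully connected layer: one output per row of W, plus its bias."""
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]


def predict_features(x: List[float], W1: List[List[float]],
                     b1: List[float]) -> List[float]:
    """z = W1 * LN(x) + b1, the assumed form of the prediction submodule."""
    return linear(layer_norm(x), W1, b1)
```

In training, `W1` and `b1` would be the quantities updated by the gradient descent procedure of claim 6.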
6. A method of single-way collision-free AGV multi-target traffic control as in claim 5, wherein: the step S40 further includes the following:
and carrying out iterative optimization by adopting a gradient descent method in the training sample prediction process, wherein the iterative optimization accords with the following expression:
wherein the search proceeds along the negative direction of the gradient, and $C_j$ represents the search step size in that direction.
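The iterative optimization of claim 6, a step of size $C_j$ along the negative gradient, can be shown on a toy objective. This sketch assumes a fixed step size and a fixed iteration count, and the function name `gradient_descent` is hypothetical; the claim does not state how $C_j$ is chosen.

```python
from typing import Callable, List


def gradient_descent(grad: Callable[[List[float]], List[float]],
                     theta: List[float], step: float = 0.1,
                     iters: int = 100) -> List[float]:
    """Repeat theta <- theta - step * grad(theta)."""
    for _ in range(iters):
        g = grad(theta)
        # Move against the gradient; `step` plays the role of C_j.
        theta = [t - step * gi for t, gi in zip(theta, g)]
    return theta


def toy_grad(theta: List[float]) -> List[float]:
    """Gradient of (theta0 - 3)^2 + (theta1 + 1)^2, minimized at (3, -1)."""
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]
```

Calling `gradient_descent(toy_grad, [0.0, 0.0])` moves the parameters toward `(3, -1)`; with this step size each iteration shrinks the remaining error by a factor of 0.8.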
7. The single-way collision-free AGV multi-target traffic control method according to claim 1, wherein: the deployment prediction refers to inputting real-time environment state information into the trained traffic control model and predicting, from the output result of the model, the multi-objective optimized AGV departure sequence with minimum traffic time and minimum waiting delay, together with the corresponding departure times; the next scheduling option of the deployment prediction is obtained by passing the feature vector through a single fully connected layer and a Softmax operation in the prediction submodule to obtain the score of each scheduling option, and taking the scheduling option with the highest score as the prediction result of the next scheduling option in the current state.
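The selection rule of claim 7, a single fully connected layer followed by Softmax and a choice of the highest score, can be sketched as follows. The weights in the usage note are arbitrary illustrative numbers and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable Softmax: shift by the maximum before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_option(h: List[float], W: List[List[float]],
                b: List[float]) -> Tuple[int, List[float]]:
    """Score every scheduling option and return the index of the best one."""
    logits = [sum(w * v for w, v in zip(row, h)) + bias
              for row, bias in zip(W, b)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs
```

With `h = [1.0, 0.0]`, `W = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]]` and zero biases, the logits are `[0, 2, 1]`, so option 1 receives the highest score and is selected.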
8. A single-way collision-free AGV multi-target traffic control system comprising:
the traffic control digitizing module is used for digitally marking the single-way path resource and the AGV path task of the target real map and storing the single-way path resource and the AGV path task into a database;
the collision-free evaluation coefficient module is used for acquiring real-time environment state information, calculating the current single-way road path resource occupation condition and acquiring an AGV collision-free evaluation coefficient;
the traffic control deep reinforcement module is used for constructing a traffic control deep reinforcement learning model based on the real-time environment state information;
the traffic control iteration optimization module is used for carrying out normalization processing on the feature vector of the training sample based on the traffic control deep reinforcement learning model, obtaining a sample training traffic control model through the prediction submodule, and completing iteration optimization by using a gradient descent method in the prediction process;
the traffic control deployment prediction sub-module inputs the real-time environment state information into the sample training traffic control model to conduct deployment prediction, and predicts the next scheduling option according to the output result of the sample training traffic control model.
CN202310634753.7A 2023-05-31 2023-05-31 AGV multi-target traffic control method and system for single-way collision-free AGV Pending CN116629479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310634753.7A CN116629479A (en) 2023-05-31 2023-05-31 AGV multi-target traffic control method and system for single-way collision-free AGV

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310634753.7A CN116629479A (en) 2023-05-31 2023-05-31 AGV multi-target traffic control method and system for single-way collision-free AGV

Publications (1)

Publication Number Publication Date
CN116629479A true CN116629479A (en) 2023-08-22

Family

ID=87613172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310634753.7A Pending CN116629479A (en) 2023-05-31 2023-05-31 AGV multi-target traffic control method and system for single-way collision-free AGV

Country Status (1)

Country Link
CN (1) CN116629479A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117077882A (en) * 2023-10-17 2023-11-17 之江实验室 Unmanned equipment scheduling method and device, storage medium and electronic equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination