WO2023079690A1 - Control device, traffic rectification system, control method, and program - Google Patents
Control device, traffic rectification system, control method, and program
- Publication number
- WO2023079690A1 (PCT/JP2021/040810)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- state
- control device
- dynamics
- vehicle
- traffic
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims description 50
- 238000013528 artificial neural network Methods 0.000 claims description 10
- 238000013459 approach Methods 0.000 claims description 8
- 238000004364 calculation method Methods 0.000 claims description 8
- 230000006870 function Effects 0.000 description 26
- 238000004891 communication Methods 0.000 description 16
- 230000007704 transition Effects 0.000 description 16
- 238000005516 engineering process Methods 0.000 description 12
- 238000013500 data storage Methods 0.000 description 9
- 238000004422 calculation algorithm Methods 0.000 description 8
- 238000000354 decomposition reaction Methods 0.000 description 8
- 239000011159 matrix material Substances 0.000 description 8
- 238000010586 diagram Methods 0.000 description 7
- 238000002474 experimental method Methods 0.000 description 7
- 238000012545 processing Methods 0.000 description 7
- 239000013598 vector Substances 0.000 description 7
- 230000008569 process Effects 0.000 description 5
- 230000001133 acceleration Effects 0.000 description 4
- 238000009795 derivation Methods 0.000 description 4
- 238000013507 mapping Methods 0.000 description 4
- 238000007430 reference method Methods 0.000 description 4
- 238000007796 conventional method Methods 0.000 description 3
- 230000001537 neural effect Effects 0.000 description 3
- 206010039203 Road traffic accident Diseases 0.000 description 2
- 230000006399 behavior Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 238000012887 quadratic function Methods 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 230000004913 activation Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 230000001684 chronic effect Effects 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 230000009977 dual effect Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000003062 neural network model Methods 0.000 description 1
- 238000005312 nonlinear dynamic Methods 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000002265 prevention Effects 0.000 description 1
- 230000000306 recurrent effect Effects 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/16—Anti-collision systems
Definitions
- the present invention relates to technology for autonomously rectifying traffic (which may also be referred to as traffic control) by multiple vehicles.
- the present invention has been made in view of the above points, and aims to provide technology for realizing traffic rectification without using traffic lights by autonomously controlling each vehicle.
- a control device in a traffic rectification system that includes a plurality of mobile bodies each equipped with the control device, the plurality of mobile bodies autonomously rectifying traffic so as to prevent collisions between the mobile bodies.
- the control device updates the state of the mobile body based on state update dynamics that include sub-dynamics for state updates of the mobile body and sub-dynamics for message passing with other mobile bodies in proximity to the mobile body.
- a state updating unit updates the state of the mobile body under constraints for deterring collisions, and an output unit outputs the state updated by the state updating unit and a message.
- a controller comprising these units is provided.
- a technology for realizing traffic rectification without using traffic lights by autonomously controlling each vehicle.
- FIG. 3 shows a NODE-based DNN architecture
- Fig. 3 shows an extended NODE-based DNN architecture following a state-space model
- Fig. 3 shows a NOS-based DNN architecture
- FIG. 1 illustrates Algorithm 1
- FIG. 4 is a diagram showing representative parameters. Further figures are diagrams for explaining the experiments.
- FIG. 2 is a diagram showing an example of the functional configuration of the vehicle 1;
- FIG. 2 is a diagram showing a functional configuration of a control device 100;
- FIG. 3 is a diagram showing a functional configuration of a control server 200;
- A figure shows a hardware configuration example of the apparatus.
- FIG. 1 shows a configuration example of a traffic control system according to this embodiment.
- the traffic rectification system has a plurality of nodes, and each node wirelessly communicates with other nearby nodes. Connections between nodes (connections for wireless communication) are called edges.
- with this traffic control system, for example, vehicles can move without collisions.
- a node is not limited to a specific object, but in this embodiment, it is assumed that a node is a vehicle traveling on a road. In the following, a node may also be referred to as a "vehicle”. A node may also be called a "moving object”.
- Figure 2 shows an image of multiple vehicles running on the road.
- Each vehicle performs autonomous speed control, based on communication with nearby vehicles and state updates by a DNN (deep neural network), so as to avoid collisions with other vehicles while approaching a target speed.
- speed is used as the state, but this is an example. States other than speed, or a combination of speed and other quantities, may also be used. Examples of states other than speed include route, lane, and steering direction.
- the neural network model installed in each vehicle is a model for solving the initial value problem of ordinary differential equations expressing state update dynamics.
- the ordinary differential equation can be expressed, for example, as follows.
- M1 corresponds to in-vehicle state updates
- M2 corresponds to near vehicle-to-vehicle communication (message passing).
- the state (x) is updated according to the dynamics (M1 + M2) under the collision-avoidance constraint (Ax + b ≥ 0).
- A detailed example of M1 and M2 will be described later.
- an ordinary differential equation is discretized to form a control algorithm (neural network) that alternately repeats (1) vehicle internal state update and (2) vehicle-to-vehicle communication.
- Fig. 3 shows an image of the above control algorithm when the number of vehicles is N.
- the in-vehicle state update in each vehicle and the communication between each vehicle's neighboring vehicles are alternately repeated.
- the overall computation is distributed and parallelized, so the computation on each vehicle is lightweight.
- inter-vehicle communication is sparse, short-range communication that does not strain the network bandwidth.
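As an illustrative sketch of the alternating scheme described above — (1) in-vehicle state update, then (2) message passing with nearby vehicles — the loop below uses placeholder dynamics (relaxation toward a target speed, averaging with neighbor messages); the function names and dynamics are assumptions for illustration, not the patent's learned M1/M2.

```python
# Alternating control loop: (1) internal state update, (2) message passing.
# The dynamics here are simplified stand-ins for the learned sub-dynamics.

def run_control(x, neighbors, x_tar=1.0, gamma=0.3, steps=10):
    """x: list of normalized speeds; neighbors: dict i -> list of adjacent node indices."""
    for _ in range(steps):
        # (1) in-vehicle state update (stands in for sub-dynamics M1):
        # each vehicle moves a fraction gamma toward the target speed
        x = [xi + gamma * (x_tar - xi) for xi in x]
        # (2) message passing (stands in for sub-dynamics M2):
        # each vehicle averages its speed with the mean of its neighbors' messages
        msgs = {i: [x[j] for j in neighbors[i]] for i in range(len(x))}
        x = [xi if not msgs[i] else 0.5 * (xi + sum(msgs[i]) / len(msgs[i]))
             for i, xi in enumerate(x)]
    return x
```

Because each vehicle only reads messages from its own neighbor list, the computation is naturally distributed: every vehicle executes the same lightweight per-step update.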
- the adjoint method is used to learn the parameter θ of the neural network that updates the state as described above. That is, efficient gradient calculation is performed by back-propagation through the neural network based on the adjoint method, and the parameter θ is learned so that the average speed accelerates toward the target speed under the collision-prevention constraint.
- the parameter ( ⁇ ) is common to all vehicles, but it is also possible to learn so that each vehicle uses a different parameter ⁇ .
- FIG. 4 is a diagram more specifically showing the system when the node is a vehicle.
- the control device 100 has the DNN model described above, and performs state acquisition, state update, and state output.
- a control server 200 is provided.
- the control server 200 receives information such as the state from each vehicle (or the cost calculated by each vehicle), learns the parameter θ, and transmits the learned parameter θ to each vehicle.
- The provision of the control server 200 is an example. Each vehicle may learn the parameter θ by itself without the control server 200.
- NODE: Neural Ordinary Differential Equation
- IVPs: Initial Value Problems
- FIG. 5 shows a NODE-based DNN architecture. As shown in FIG. 5, the states are updated sequentially and the cost is calculated based on each state.
- an external control input o(t) is further introduced, which NODE does not have.
- IVP can be defined by the following equation (1).
- NODE-based DNN architecture can be constructed as a basic IVP discretization
- different discretization methods in ODE solvers, such as higher-order Runge-Kutta solvers, are available for constructing different NODE-based DNN architectures.
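The basic discretization can be sketched as follows: unrolling a forward-Euler step of dx/dt = f(x, o) yields a layer stack in which each layer corresponds to one discrete time step. The function `f` and the control-input sequence are illustrative assumptions, not the patent's concrete dynamics M.

```python
def euler_unroll(f, x0, o_seq, dt):
    """Forward-Euler discretization of dx/dt = f(x, o):
    x_{k+1} = x_k + dt * f(x_k, o_k).
    Returns the whole trajectory, mirroring a NODE-based DNN layer stack
    (one layer per discrete time step)."""
    traj = [x0]
    x = x0
    for o in o_seq:
        x = x + dt * f(x, o)
        traj.append(x)
    return traj
```

Swapping the single Euler step for a higher-order Runge-Kutta step changes the per-layer computation but leaves the unrolled structure the same.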
- while NODE can handle basic IVPs, its application to large-scale systems with complex dynamics M, which can be represented as a graph consisting of many subsystems (nodes) and their connections (edges), is difficult.
- An example of such a system is the system of vehicles that rectifies traffic as described above.
- the present embodiment adopts federated dynamics learning using neural operator splitting (NOS) as an extension of NODE.
- NOS neural operator splitting
- the overall dynamics M is decomposed, for efficient management of large-scale systems in graph form, into (i) M1 for the state transitions of each node and (ii) M2 for message passing between nodes, and is represented by the following formula (2).
- M1 can be further decomposed into sub-dynamics at each node.
- M2 is set to have no learnable parameters as it relates to message passing.
- the state transition can take residual, recurrent, and alternating forms.
- state domain relaxation is performed to impose physical constraints. That is, in equation (2), M1 and M2 share the same state domain as x, but in the NOS of this embodiment, the state domain constraint of equation (2) is relaxed in order to impose physical constraints on the state variables of many nodes.
- by designing M2 appropriately, some distance is preserved from other vehicles. As a result, it is expected that the parameter search space will be appropriately limited, that vehicle collisions will, for example, be prevented, and that θ will be learned quickly and stably.
- NOS is associated with a constrained cost minimization problem derived by the ADMM method (Alternating Direction Method of Multipliers).
- extended NODE following the state-space model
- extended NODE is an extended NODE-based DNN architecture following a state-space model assuming a noisy, nonlinear, and indirect observation process for the state variable x in real applications.
- R(t) is the covariance matrix of the measurement noise ⁇
- T represents the transpose of the matrix.
- the measurement noise follows a Gaussian distribution ε ∼ Norm(0, σ²)
- J(x, ⁇ ) for learning ⁇ is given by equation (5) below: given by the constrained cost integral minimization problem.
- H in equation (4) is the identity operator, as assumed in NODE.
- the learning of ⁇ for the optimal dynamics model M is performed by alternately/repeatingly performing successive forward propagation (3) and successive backward propagation (7) using g done.
- the forward Euler method (8), backward Euler method (9), and Crank-Nicholson method (second-order Runge-Kutta) shown below can be used to approximate differential equations with discrete state transition rules.
- the NODE-based DNN architecture can be constructed as a recursive stack of K iterations based on residual state updates ((8), (9), or (10)), as shown in the figures.
- the forward Euler method can be regarded as a ResNet.
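As a small numerical sketch, the three discretization rules (8)-(10) can be compared on scalar linear dynamics dx/dt = a·x (an assumption chosen so that the implicit steps have closed forms); the Crank-Nicholson step is second-order accurate and lands closest to the exact solution.

```python
import math

def step_forward_euler(x, a, dt):
    # (8): x_{k+1} = x_k + dt * f(x_k), here f(x) = a * x
    return x + dt * a * x

def step_backward_euler(x, a, dt):
    # (9): x_{k+1} = x_k + dt * f(x_{k+1}); solvable in closed form for f(x) = a * x
    return x / (1.0 - dt * a)

def step_crank_nicolson(x, a, dt):
    # (10): trapezoidal average of the forward and backward stages (second order)
    return x * (1.0 + 0.5 * dt * a) / (1.0 - 0.5 * dt * a)
```

For a = -1 and dt = 0.1, the Crank-Nicholson step is markedly closer to exp(a·dt) than either Euler variant, which is why the second-order splittings below are preferred for accurate state transitions.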
- the adjoint method can be used to calculate the gradient in learning ⁇ .
- the NODE-based DNN architecture is extended to the form of FIG. 6 by discretizing the continuous backward propagation (7), as shown in equation (11) below.
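The adjoint-based gradient can be sketched on a toy unrolled system: for scalar dynamics x_{k+1} = x_k(1 + Δt·θ) and terminal loss L = ½(x_K − x_tar)², the reverse (adjoint) recursion accumulates dL/dθ in one backward sweep. The dynamics and loss here are illustrative assumptions, not the patent's equations (3)/(7)/(11).

```python
def forward(theta, x0, K, dt):
    """Unrolled forward pass of x_{k+1} = x_k * (1 + dt * theta)."""
    xs = [x0]
    for _ in range(K):
        xs.append(xs[-1] * (1.0 + dt * theta))
    return xs

def adjoint_grad(theta, x0, x_tar, K, dt):
    """Gradient of L = 0.5*(x_K - x_tar)^2 w.r.t. theta via the adjoint
    (reverse-time) recursion, i.e. backprop through the unrolled steps."""
    xs = forward(theta, x0, K, dt)
    lam = xs[-1] - x_tar           # adjoint variable at the final time
    grad = 0.0
    for k in range(K - 1, -1, -1):
        grad += lam * dt * xs[k]   # partial of x_{k+1} w.r.t. theta is dt * x_k
        lam *= (1.0 + dt * theta)  # partial of x_{k+1} w.r.t. x_k
    return grad
```

The backward sweep costs the same order of work as the forward sweep, which is the efficiency argument for adjoint-based learning of θ.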
- a NOS-based DNN architecture for federated dynamics learning is introduced, and the traffic rectification problem is solved by alternately performing (i) vehicle state updates and (ii) message passing between vehicles.
- NOS: Neural Operator Splitting
- Among these, PRS and DRS are selected. Because both are based on the Crank-Nicholson method (10) with second-order accuracy, accurate state transitions are expected.
- PRS-Net and DRS-Net are shown in FIG. From equations (12) and (13), PRS-Net and DRS-Net differ in that DRS-Net uses an averaging operation of twice the step size. Learning of ⁇ in NOS (PRS-Net, DRS-Net, etc.) can be performed according to the adjoint method described above.
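The difference between the two splittings can be sketched on a toy problem: minimizing the sum of two scalar quadratics via their proximal operators. The quadratic costs are an assumption for illustration (the patent's V1/V2 are a learnable local cost and a constraint term); the structural point — PRS applies the double reflection directly, while DRS averages it with the identity — carries over.

```python
def prox_quad(v, t, a):
    """Prox of f(x) = 0.5*(x - a)^2: argmin_x f(x) + (1/(2t))*(x - v)^2."""
    return (v + t * a) / (1.0 + t)

def reflect(v, t, a):
    """Reflected resolvent 2*prox - identity."""
    return 2.0 * prox_quad(v, t, a) - v

def prs_solve(a, b, t=1.0, iters=50, z=0.0):
    """Peaceman-Rachford splitting for min 0.5(x-a)^2 + 0.5(x-b)^2:
    z <- R2(R1(z)), the full double reflection."""
    for _ in range(iters):
        z = reflect(reflect(z, t, a), t, b)
    return prox_quad(z, t, a)

def drs_solve(a, b, t=1.0, iters=50, z=0.0):
    """Douglas-Rachford splitting: averages the double reflection with
    the identity, z <- z/2 + R2(R1(z))/2 (the halved-step averaging)."""
    for _ in range(iters):
        z = 0.5 * z + 0.5 * reflect(reflect(z, t, a), t, b)
    return prox_quad(z, t, a)
```

Both solvers recover the minimizer (a + b)/2; DRS trades some speed for robustness via its averaging step, which mirrors the step-size averaging noted for DRS-Net.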
- the ADMM flow takes the discrete time step to its continuous limit to obtain a continuous ODE form of (discrete) ADMM.
- ⁇ M 1 , M 2 ⁇ are specified by the (sub)differential of the cost function V minimized with an affine transformation using (A T A) ⁇ 1 .
- V1 is a smooth but non-convex function
- V2 is a convex function but may not be differentiable.
- Equation (14) differs from the ADMM flow in the following ways: (i) the parameter θ in V1 is learnable and V1 may be non-convex, (ii) V2 may be non-differentiable, (iii) a bias b is added to generalize the transformation, (iv) {A, b} can be time-varying.
- Equation (14) is related to the constrained cost minimization problem as follows.
- ⁇ V 1 , V 2 ⁇ we constrain ⁇ V 1 , V 2 ⁇ to be convex and make the constraint parameters ⁇ A, b ⁇ time-invariant.
- a linearly constrained convex minimization problem is given by the following equation (15).
- this yields equation (16) below.
- V1 in equation (14) can be non-convex, but if Δt is small enough, it can be Taylor-expanded around the previous state x_{k−1} as in equation (17) below and approximated by a local quadratic form.
- G(t) consists of a set N(t) of N(t) nodes and a set E(t) of E(t) edges.
- E i (t) ⁇ j ⁇ N(t)
- control input o i includes the edge information E i and the time-varying constraint parameters ⁇ A i
- j ⁇ is associated with ⁇ A,b ⁇ in equation (14) as follows.
- for federated dynamics learning, a cost function for the state transition is formulated as shown in equation (18) below.
- V_local,i is designed to accelerate the velocity state toward the target.
- the inequality constraints are designed to maintain the inter-vehicle distance.
- equation (18) is rewritten as equation (19) below.
- the indicator function is expressed as Equation (20) below.
- the cost function (21) having the quadratic approximation (17) is substituted into the dynamics (14) of PRS-Net (12) and DRS-Net (13), thereby defining a NOS-based DNN architecture for federated dynamics learning.
- Each vehicle autonomously updates its state (eg, speed) using external control inputs (eg, inputs from cameras/LiDAR) and message passing between nearby vehicles.
- the behavior of the distributed control system is a function of state transition dynamics, and the inputs are the previous state x and the control input o. Also, NOS-based federated dynamics learning is used to optimize the parameter ⁇ of the state transition dynamics.
- This technology describes using an ADMM to set each vehicle to a target speed while maintaining a certain distance between vehicles.
- this reference method has the following problems: (i) extra link nodes are needed to connect vehicle nodes; (ii) there are no learnable parameters in the state transition dynamics.
- in contrast, the technology according to the present embodiment (i) realizes completely decentralized traffic rectification consisting only of vehicle nodes, and (ii) realizes federated dynamics learning for obtaining optimal autonomous state transition dynamics.
- a character with an overbar is denoted by placing the bar in front of the character, as in "-θ".
- the goal of federated dynamics learning for decentralized signal-free traffic rectification is to find -θ that keeps x_i(t) close to the target x_tar ∈ [0,1] while keeping some distance between vehicles to avoid collisions.
- the following function is selected as the cost function in equation (4).
- Each vehicle runs in the center of a single-lane road, and overtaking is prohibited. That is, each vehicle follows the vehicle ahead of it, including across intersections.
- the control inputs o_i of each vehicle are the normalized velocity o_spd,i, the 2D direction vector o_dir,i, the 2D position vector o_pos,i, the surrounding image o_img,i, and the mapping vector o_map,ij, which converts 2D positions into scalar values for measuring the distance from the i-th vehicle to the j-th vehicle.
- the quadratic function represents the learnable acceleration.
- This term uses ⁇ (0,1) to approximate x i (t) to the point between x tar and the previous state x i,k ⁇ 1 as follows: be.
- the value range limit for x i is x i ⁇ [0,1].
- γ differs between PRS-Net and DRS-Net. [·]_[0,1] indicates that the element values are clamped to the range between 0 and 1. {A_ij, b_ij} are chosen to make the left-hand side of equation (23) positive.
- ⁇ A, b ⁇ cannot be uniquely determined because there is ambiguity when converting equation (23) into the form of the inequality constraint Ax+b ⁇ 0 shown in equation (18).
- ⁇ A,b ⁇ is associated with each vehicle's acceleration/deceleration. For example, when A i
- a front flag/rear (back) flag to prevent the vehicle in front from braking as much as possible.
- j ⁇ in this embodiment is as shown in Equation (24) below.
- Equation (25) above corresponds to equation (23).
- from the update rule for x in equation (22), it can be seen when vehicle i brakes according to A_ij. The intent of {A_ij, b_ij} is to avoid collisions by adjusting the acceleration/deceleration of the vehicle behind, after assigning a front/rear flag to each vehicle pair.
- the front/rear allocation is determined by the position and orientation of the vehicle pair.
- the current speed and position can be used to determine the assignment of the front/rear flag via the estimated remaining time to the center of the intersection.
- the ambiguity in equation (23) is resolved by normalizing A_ij so as to accelerate/decelerate the trailing vehicle as shown in equation (24).
- Algorithm 1 (Alg. 1) proceeds as follows.
- first, each node i acquires its control input, its edge connections, and the parameters {A_ij, b_ij}.
- next, each node i performs message passing (receiving the messages z_j from adjacent nodes j).
- each node then updates its internal state based on equation (22). This allows node i to compute the step-(k+1) velocity x_i and the step-(k+1) messages z_ij.
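One forward round of the per-node procedure can be sketched as below. The braking rule (halve the speed when a vehicle flagged as being in front is within a minimum distance) and all names are hypothetical simplifications standing in for the constrained update of equation (22).

```python
# Hypothetical sketch of one round of Alg. 1 for node i: acquire inputs,
# consume neighbor messages, update (x_i, z_i) under a keep-your-distance rule.

def node_round(x_i, pos_i, msgs, x_tar=1.0, gamma=0.3, d_min=1.0):
    """msgs: list of (pos_j, is_front) messages received from adjacent vehicles."""
    # accelerate toward the target speed (internal state update)
    x_next = x_i + gamma * (x_tar - x_i)
    # constraint handling: brake if any vehicle flagged as 'front' is too close
    for pos_j, is_front in msgs:
        if is_front and abs(pos_j - pos_i) < d_min:
            x_next = min(x_next, 0.5 * x_i)
    x_next = max(0.0, min(1.0, x_next))   # value-range limit on x_i
    z_next = (pos_i, x_next)              # message broadcast to neighbors next round
    return x_next, z_next
```

Because only the trailing vehicle reacts (via the front flag), the vehicle ahead is never forced to brake, matching the front/rear flag design described above.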
- Each vehicle (or control server 200) performs backward propagation (11) by using the recorded data ⁇ x, o, A, b ⁇ to update ⁇ .
- Federated dynamics learning is performed by iterative I-rounds of forward-propagation/back-propagation.
- FIG. 7 shows an image in which the control server 200 records the data {x, o, A, b} at each discrete time and performs parameter learning by backward propagation based on the cost calculation.
- N = 30 vehicles were randomly placed in the traffic simulator.
- 10 road maps were prepared. To avoid overfitting to the training data, each road map has small random perturbations for straight road lengths and intersection locations.
- An image o img,i of each vehicle and its surroundings was generated at each discrete time instant as the control input for determining the acceleration/deceleration of each vehicle.
- the size of the image is defined as 64 (W) x 64 (H) x 5 (Ch).
- Each channel consists of (1) the surrounding roads, (2) the surrounding vehicle positions, (3) their normalized velocities, and (4,5) the 2D direction vectors of the surrounding vehicles.
- Each image is rotated so that each vehicle is facing the same direction. Examples of o img for vehicle 1 and vehicle 30 in FIG. 10 are shown in FIGS. 11 and 12, respectively.
- FIG. 13 shows an image of repeating state update and learning for each round.
- PRS-Net and DRS-Net are set to Alg. 1.
- PnP-Net uses state transitions implemented with recursive DNNs for updating x. This method avoids analytical derivation as much as possible and learns in a data-driven manner. The DNN parameter sizes of PnP-Net were initially set as close as possible to the NOS parameter sizes.
- both PnP-Net and the non-learnable NOS alternately perform (i) internal state updates and (ii) message passing between adjacent vehicles.
- FIG. 14 shows the experimental results.
- FIG. 14 shows how the average normalized speed changes as the number of rounds increases.
- the performance difference between them is small; that is, both methods are effective for NOS-based federated dynamics learning.
- -θ was updated to maintain the inter-vehicle distance as much as possible.
- the average normalized speed is 0.30 for the non-learnable PRS-Net, 0.31 for the non-learnable DRS-Net, and 0.56 with SUMO's native traffic control system.
- the experimental results show that NOS-based federated dynamics learning is effective.
- FIG. 15 is a diagram that more clearly shows the difference between NOS-based federated dynamics learning (proposed method) and the SUMO implementation (conventional method). As shown in FIG. 15, in the proposed method the normalized speed approaches the target value as learning progresses, and the speed is greatly improved over the conventional method (SUMO implementation).
- FIG. 16 shows a convergence curve for the evaluation set, showing better performance as it approaches 0.0.
- the loss value proportional to the difference between the normalized speed target value and the current value decreases with learning.
- FIG. 17 shows a configuration example of the vehicle 1.
- the vehicle 1 has a camera 11, a sensor 12, a control device 100, a communication unit 13, and a drive unit 14.
- Camera 11 acquires an image of the surroundings.
- the sensor 12 acquires its own position information by, for example, LiDAR or GPS.
- The sensor 12 may include the ability to acquire the vehicle's own velocity. Sensors that acquire other information may also be mounted.
- the control device 100 inputs external information acquired by the camera 11 and the sensor 12, performs the processing of Algorithm 1, and outputs the state (speed) x and the message z.
- the control device 100 may also include a function of parameter learning by backward propagation based on Equation (11).
- the communication unit 13 receives messages transmitted from other adjacent vehicles, passes them to the control device 100, and transmits messages output from the control device 100 to the other adjacent vehicles. In addition, when learning is performed by the control server 200, the communication unit 13 transmits the recorded data {x, o, A, b} obtained in the state update to the control server 200 and receives the latest learned parameter θ from the control server 200.
- the drive unit 14 includes a function (engine, motor, etc.) for running according to the state x output from the control device 100. For example, when a certain speed is output as the state, the vehicle is driven so as to run at that speed.
- FIG. 18 shows a configuration example of the control device 100.
- the control device 100 includes an input unit 110, a state update unit 120, an output unit 130, a data storage unit 140, and a learning unit 150. Note that when learning is performed by the control server 200, the learning unit 150 need not be provided in the control device 100.
- the input unit 110 inputs external information o acquired by the camera 11 and the sensor 12 and a message z received from an adjacent vehicle.
- the state update unit 120 is a DNN implementing NOS, which updates the states x and z according to Algorithm 1.
- the output unit 130 outputs the state x obtained by the state update unit 120 and the message z.
- the data storage unit 140 records the data ⁇ x, o, A, b ⁇ obtained in the process of processing by the state update unit 120 for each discrete time. Further, the data storage unit 140 stores the latest learned parameter ⁇ , and the state update unit 120 executes state update processing using the latest learned parameter ⁇ .
- the learning unit 150 learns the parameter ⁇ by backward propagation (11) using the recorded data ⁇ x, o, A, b ⁇ , and stores the learned parameter ⁇ in the data storage unit 140.
- FIG. 19 shows a configuration example of the control server 200.
- the control server 200 includes an input unit 210, a learning unit 220, an output unit 230, and a data storage unit 240.
- the input unit 210 receives recorded data ⁇ x, o, A, b ⁇ from each vehicle.
- the data storage unit 240 stores recorded data ⁇ x, o, A, b ⁇ received from each vehicle.
- the learning unit 220 learns the parameter ⁇ by backward propagation (11) using the recorded data ⁇ x, o, A, b ⁇ stored in the data storage unit 240 .
- the output unit 230 transmits the learned parameter ⁇ to each vehicle.
- Both the control device 100 and the control server 200 can be realized, for example, by causing a computer to execute a program.
- the device can be realized by executing a program corresponding to the processing performed by the device using hardware resources such as a CPU, GPU, and memory built into the computer.
- the above program can be recorded in a computer-readable recording medium (portable memory, etc.), saved, or distributed. It is also possible to provide the above program through a network such as the Internet or e-mail.
- FIG. 20 is a diagram showing a hardware configuration example of the computer.
- the computer of FIG. 20 has a drive device 1000, an auxiliary storage device 1002, a memory device 1003, a processor 1004, an interface device 1005, a display device 1006, an input device 1007, an output device 1008, etc., which are interconnected by a bus BS.
- the processor 1004 may be a CPU, a GPU, or both a CPU and a GPU.
- a program that implements the processing in the computer is provided by a recording medium 1001 such as a CD-ROM or memory card, for example.
- a recording medium 1001 such as a CD-ROM or memory card
- the program is installed from the recording medium 1001 to the auxiliary storage device 1002 via the drive device 1000 .
- the program does not necessarily need to be installed from the recording medium 1001, and may be downloaded from another computer via the network.
- the auxiliary storage device 1002 stores installed programs, as well as necessary files and data.
- the memory device 1003 reads and stores the program from the auxiliary storage device 1002 when a program activation instruction is received.
- the processor 1004 implements the functions of the device (for example, the control device 100) according to programs stored in the memory device 1003.
- the interface device 1005 is used as an interface for connecting to a network or the like.
- a display device 1006 displays a GUI (Graphical User Interface) or the like by a program.
- An input device 1007 is composed of a keyboard, a mouse, buttons, a touch panel, or the like, and is used to input various operational instructions.
- the output device 1008 outputs the calculation result.
- as described above, NOS is proposed as an extension of NODE for federated dynamics learning, and the two sub-dynamics of equation (2) are assigned to (i) internal state updating and (ii) message passing.
- NOS-based DNN architectures such as PRS-Net and DRS-Net are obtained through the discretization of equation (2) based on the operator splitting method.
- NOS has also been successfully applied to the problem of signal-free traffic rectification, with the goal of finding the dynamics parameter -θ that drives the average speed toward a target value.
- (Section 1) A control device in a traffic rectification system comprising a plurality of mobile bodies each equipped with the control device, wherein the plurality of mobile bodies autonomously rectify traffic so as to prevent collisions between the mobile bodies, the control device comprising: a state updating unit that updates the state of the mobile body, under constraints for deterring collisions, based on state update dynamics including sub-dynamics for state updates of the mobile body and sub-dynamics for message passing with other mobile bodies in proximity to the mobile body; and an output unit that outputs the state updated by the state updating unit and a message.
- (Section 2) The control device as above, wherein the control device uses a neural network to alternately repeat the state update of the mobile body and the message passing.
- (Sections 3-4) The control device as above, wherein the state update unit calculates the state at the next point in time based on the state at the current point in time, the external control input at that point in time, and the messages received from other mobile bodies at that point in time.
- (Section 5) The control device as above, wherein the constraint is to keep the distance between the mobile body and another mobile body, calculated based on the external control input, at or above a predetermined distance.
- (Section 6) A traffic rectification system comprising: a plurality of mobile bodies equipped with the control device according to any one of items 1 to 5; and a control server that updates the parameters of the neural network provided in the state update unit, by back-propagation calculation based on the adjoint method, so that the average speed of the plurality of mobile bodies approaches a target speed under the constraints.
- (Section 7) A control method executed by a control device in a traffic rectification system comprising a plurality of mobile bodies equipped with the control device, the plurality of mobile bodies autonomously rectifying traffic so as to prevent collisions between the mobile bodies.
- (Section 8) A program for causing a computer to function as each unit in the control device according to any one of items 1 to 5.
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Traffic Control Systems (AREA)
Abstract
A control device provided in each of a plurality of mobile bodies present in a traffic rectification system, the plurality of mobile bodies autonomously performing traffic rectification to prevent collisions between the mobile bodies, the control device comprising: a state updating unit that updates the states of the mobile bodies, under constraints for preventing collisions between the mobile bodies, based on state update dynamics comprising sub-dynamics for updating the states of the mobile bodies and sub-dynamics for message passing between each mobile body and other nearby mobile bodies; and an output unit that outputs the states updated by the state updating unit and a message.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/040810 WO2023079690A1 (fr) | 2021-11-05 | 2021-11-05 | Control device, traffic control system, control method, and program |
JP2023557546A JPWO2023079690A1 (fr) | 2021-11-05 | 2021-11-05 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/040810 WO2023079690A1 (fr) | 2021-11-05 | 2021-11-05 | Control device, traffic control system, control method, and program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023079690A1 true WO2023079690A1 (fr) | 2023-05-11 |
Family
ID=86240905
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/040810 WO2023079690A1 (fr) | Control device, traffic control system, control method, and program | 2021-11-05 | 2021-11-05 |
Country Status (2)
Country | Link |
---|---|
JP (1) | JPWO2023079690A1 (fr) |
WO (1) | WO2023079690A1 (fr) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2010176353A (ja) * | 2009-01-29 | 2010-08-12 | Toyota Motor Corp | Platoon travel control system |
JP2019155949A (ja) * | 2018-03-07 | 2019-09-19 | Wabco Japan Co., Ltd. | Platooning method and vehicle |
2021
- 2021-11-05 JP JP2023557546A patent/JPWO2023079690A1/ja active Pending
- 2021-11-05 WO PCT/JP2021/040810 patent/WO2023079690A1/fr active Application Filing
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2010176353A (ja) * | 2009-01-29 | 2010-08-12 | Toyota Motor Corp | Platoon travel control system |
JP2019155949A (ja) * | 2018-03-07 | 2019-09-19 | Wabco Japan Co., Ltd. | Platooning method and vehicle |
Also Published As
Publication number | Publication date |
---|---|
JPWO2023079690A1 (fr) | 2023-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhu et al. | A survey of deep RL and IL for autonomous driving policy learning | |
Huegle et al. | Dynamic input for deep reinforcement learning in autonomous driving | |
Chen et al. | Fastrack: a modular framework for real-time motion planning and guaranteed safe tracking | |
Gidado et al. | A survey on deep learning for steering angle prediction in autonomous vehicles | |
Xu et al. | Two-layer distributed hybrid affine formation control of networked Euler–Lagrange systems | |
Zhao et al. | Combined longitudinal and lateral control for heterogeneous nodes in mixed vehicle platoon under V2I communication | |
Xu et al. | Heuristic and random search algorithm in optimization of route planning for Robot’s geomagnetic navigation | |
Kurt | Hybrid-state system modelling for control, estimation and prediction in vehicular autonomy | |
Liu et al. | Heuristics‐oriented overtaking decision making for autonomous vehicles using reinforcement learning | |
Németh et al. | Optimal control of overtaking maneuver for intelligent vehicles | |
Liu et al. | A new path plan method based on hybrid algorithm of reinforcement learning and particle swarm optimization | |
Eilers et al. | Learning the human longitudinal control behavior with a modular hierarchical bayesian mixture-of-behaviors model | |
Banerjee et al. | A survey on physics informed reinforcement learning: Review and open problems | |
Youssef et al. | Comparative study of end-to-end deep learning methods for self-driving car | |
WO2023079690A1 (fr) | Control device, traffic control system, control method, and program | |
Sun et al. | Feedback enhanced motion planning for autonomous vehicles | |
Wang et al. | Safe Reinforcement Learning for Automated Vehicles via Online Reachability Analysis | |
Xidias | A decision algorithm for motion planning of car-like robots in dynamic environments | |
Maheshwari et al. | PIAug--Physics Informed Augmentation for Learning Vehicle Dynamics for Off-Road Navigation | |
He et al. | Trustworthy autonomous driving via defense-aware robust reinforcement learning against worst-case observational perturbations | |
Choi et al. | Density matching reward learning | |
COVACIU | Development of a virtual reality simulator for an autonomous vehicle | |
Vikas et al. | Multi-robot path planning using a hybrid dynamic window approach and modified chaotic neural oscillator-based hyperbolic gravitational search algorithm in a complex terrain | |
Hu et al. | Constraint‐Following Approach for Platoon Control Strategy of Connected Autonomous Vehicles | |
Zarei et al. | Experimental study on optimal motion planning of wheeled mobile robot using convex optimization and receding horizon concept |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21963285 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2023557546 Country of ref document: JP |