CN112867083A - Delay tolerant network routing algorithm based on multi-agent reinforcement learning - Google Patents

Delay tolerant network routing algorithm based on multi-agent reinforcement learning

Info

Publication number
CN112867083A
Authority
CN
China
Prior art keywords
delay tolerant
tolerant network
community
algorithm
agent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011588326.2A
Other languages
Chinese (zh)
Inventor
姚海鹏 (Yao Haipeng)
韩晨晨 (Han Chenchen)
忻向军 (Xin Xiangjun)
张尼 (Zhang Ni)
童炉 (Tong Lu)
李韵聪 (Li Yuncong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tibet Gaochi Science And Technology Information Industry Group Co ltd
Beijing University of Posts and Telecommunications
Original Assignee
Tibet Gaochi Science And Technology Information Industry Group Co ltd
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tibet Gaochi Science And Technology Information Industry Group Co ltd, Beijing University of Posts and Telecommunications filed Critical Tibet Gaochi Science And Technology Information Industry Group Co ltd
Priority to CN202011588326.2A
Publication of CN112867083A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 40/00 Communication routing or communication path finding
    • H04W 40/02 Communication route or path selection, e.g. power-based or shortest path routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 45/00 Routing or path finding of packets in data switching networks
    • H04L 45/14 Routing performance; Theoretical aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 45/00 Routing or path finding of packets in data switching networks
    • H04L 45/20 Hop count for routing purposes, e.g. TTL
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 84/00 Network topologies
    • H04W 84/18 Self-organising networks, e.g. ad-hoc networks or sensor networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a delay tolerant network routing algorithm based on multi-agent reinforcement learning, characterized by comprising the following steps: first, Louvain clustering is applied to the delay tolerant network nodes, and a centralized-distributed hierarchical architecture is proposed; second, the DTN node's next-hop selection problem is modeled as a decentralized partially observable Markov decision process (Dec-POMDP) by incorporating positive social characteristics; finally, the model is solved via the cooperative multi-agent reinforcement learning algorithm QMIX. Compared with existing delay tolerant network routing schemes based on social attributes, the technical scheme of this patent provides a hierarchical architecture that conveniently captures the social information of edge devices: routing decisions issued by the computing center are executed in a distributed manner, while the routing algorithm is trained centrally at the computing center from the states uploaded by the service units. The method exploits social characteristics more effectively for route forwarding in delay tolerant networks, improving the delivery rate and reducing the average delay.

Description

Delay tolerant network routing algorithm based on multi-agent reinforcement learning
Technical Field
The invention relates to the technical field of network routing algorithms, in particular to a delay tolerant network routing algorithm based on multi-agent reinforcement learning.
Background
A Delay Tolerant Network (DTN) is a wireless ad hoc network that adopts a store-carry-forward routing strategy in network environments where no end-to-end path exists a priori. Compared with traditional wireless networks, DTNs are more flexible and better suited to environments with high latency and frequently disrupted links.
Currently, many routing protocols have been proposed for delay tolerant networks, most of which form forwarding policies by comparing per-node metrics. However, message delivery efficiency remains poor due to link unreliability. Social-based approaches are more promising than opportunity-based routing protocols, because social attributes are more stable than mobility when predicting and handling routes in DTNs. However, these algorithms transfer large volumes of messages to nodes with high social indicators without restriction, which inflates the queue length of node buffers and degrades overall routing performance.
Disclosure of Invention
To address the poor delivery rate of conventional routing algorithms in delay tolerant networks, the invention provides a delay tolerant network routing algorithm based on multi-agent reinforcement learning: first, Louvain clustering is applied to the DTN nodes and a centralized-distributed hierarchical architecture is proposed; subsequently, the DTN node's next-hop selection problem is modeled as a decentralized partially observable Markov decision process (Dec-POMDP) by incorporating positive social characteristics; finally, the model is solved via the cooperative multi-agent reinforcement learning algorithm QMIX.
A delay tolerant network routing algorithm based on multi-agent reinforcement learning, characterized by comprising the following steps:
firstly, Louvain clustering is applied to the delay tolerant network nodes, and a centralized-distributed hierarchical architecture is proposed;
secondly, the DTN node's next-hop selection problem is modeled as a decentralized partially observable Markov decision process (Dec-POMDP) model by incorporating positive social characteristics;
and thirdly, the model is solved via the cooperative multi-agent reinforcement learning algorithm QMIX.
Preferably, the algorithm introduces two different centrality indices at the community level while accounting for a time-based discount factor, where w_local denotes community C_i's local centrality index:

[local centrality equation, shown as an image in the original]

where M is the total number of connections between node s and node d, φ (0 < φ < 1) is the discount coefficient, and t is the time index:

[time index equation, shown as an image in the original]

where T_now is the current simulation time and T_interval is a tunable time-slice parameter. Similarly, community C_i's global centrality index can be obtained:

[global centrality equation, shown as an image in the original]
preferably, the maximum buffer occupancy calculation formula in the community Ci is as follows:
Figure RE-GDA0003007077110000025
thus, for each agent (community), the optimization goal is:
maxα·ΔwC-β·Lbuf
we consider the DTN routing strategy based on social attribute clustering as a fully collaborative multi-agent task scenario, which can be described by Dec-POMDP. The Dec-POMDP can be represented by a primitive, N represents the number of agents, A is the action space, is the observation space, and is the discount factor.
Preferably, the QMIX algorithm uses a mixing network to combine the local value functions of individual agents possessing positive social features (such as community and centrality), extracting decentralized policies consistent with the centralized policy.
Compared with the prior art, and in particular with existing delay tolerant network routing schemes based on social attributes, the technical scheme of this patent provides a hierarchical architecture that conveniently captures the social information of edge devices: routing decisions issued by the computing center are executed in a distributed manner, while the routing algorithm is trained centrally at the computing center from the states uploaded by the service units. Meanwhile, the reward function accounts for each community's bottleneck buffer occupancy at a fine granularity, so social characteristics can be exploited more effectively for route forwarding in the delay tolerant network, improving the delivery rate and reducing the average delay.
Drawings
The invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a diagram of the centralized-distributed hierarchical DTN architecture of the multi-agent reinforcement learning based delay tolerant network routing algorithm according to the present invention;
FIG. 2 is a diagram of a QMIX algorithm of cooperative multi-agent reinforcement learning according to the present invention;
FIG. 3 is a flow chart of the cooperative MARL routing algorithm of the present invention;
FIG. 4 is a graph of the average reward of MARL and DQN in the present invention;
FIG. 5 is a graph of delivery rates of different routing algorithms under the INFOCOM05 in the present invention;
fig. 6 is a graph of average packet delay for different routing algorithms under INFOCOM05 in the present invention.
Detailed Description
In order that the objects and advantages of the invention will be more clearly understood, the invention is further described below with reference to examples; it should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and do not limit the scope of the present invention.
It should be noted that in the description of the present invention, the terms of direction or positional relationship indicated by the terms "upper", "lower", "left", "right", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, which are only for convenience of description, and do not indicate or imply that the device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
Firstly, Louvain clustering is applied to the delay tolerant network nodes, and a centralized-distributed hierarchical architecture is proposed; subsequently, the DTN node's next-hop selection problem is modeled as a decentralized partially observable Markov decision process (Dec-POMDP) model by incorporating positive social characteristics. Finally, the model is solved via the cooperative multi-agent reinforcement learning algorithm QMIX.
Referring to fig. 1, the delay tolerant network is first modeled as a directed graph whose edges are weighted by the total connection time between two nodes. Community clustering is then performed with a round-by-round heuristic modularity-optimization algorithm. The service units are user-side units: each integrates and analyzes the social information of its own community and uploads these attributes to a computing center, which performs centralized training and issues routing policies, as sketched below.
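As an illustration of this clustering step, the sketch below builds a small contact graph weighted by total connection time and runs Louvain community detection with networkx (version 3.x provides louvain_communities). The contact trace and node names are invented for the example and are not data from the patent; for brevity an undirected graph is used here, whereas the patent models the DTN as a directed graph.

```python
# A minimal sketch of the community-clustering step, assuming networkx >= 3.0.
import networkx as nx

# Illustrative contact trace: (node, node, total connection time in seconds).
contacts = [
    ("a", "b", 1200.0),
    ("b", "c", 300.0),
    ("a", "c", 950.0),
    ("d", "e", 700.0),
]

G = nx.Graph()
for s, d, total_time in contacts:
    # Accumulate connection time if this pair has met before.
    w = G.get_edge_data(s, d, default={"weight": 0.0})["weight"]
    G.add_edge(s, d, weight=w + total_time)

# Louvain clustering: round-by-round heuristic modularity optimization.
communities = nx.community.louvain_communities(G, weight="weight", seed=42)
for i, c in enumerate(communities):
    print(f"community C{i}: {sorted(c)}")
```

Each resulting community then acts as one agent in the hierarchy, with its service unit reporting community-level state to the computing center.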
Referring to fig. 2, a diagram of the cooperative multi-agent reinforcement learning algorithm QMIX is shown; it adopts a centralized-training, distributed-execution framework for learning routing decisions. The social attribute definitions, problem description, QMIX algorithm, and overall flow introduced by the invention are presented next.
First, social attribute definition
The invention introduces two different centrality indices at the community level and incorporates a discount factor over time. Community C_i's local centrality index is:

[local centrality equation, shown as an image in the original]

where M is the total number of connections between node s and node d, φ (0 < φ < 1) is the discount coefficient, and t is the time index:

[time index equation, shown as an image in the original]

where T_now is the current simulation time and T_interval is a tunable time-slice parameter. Similarly, community C_i's global centrality index can be obtained:

[global centrality equation, shown as an image in the original]
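The exact formulas above are images in the original document, so the following sketch implements only one plausible reading of the surrounding text: each of the M contacts between nodes s and d contributes φ^t, where t counts how many T_interval slices have elapsed since the contact. All parameter values and function names are illustrative assumptions, not taken from the patent.

```python
# A hedged sketch of a time-discounted centrality score (one plausible
# reading of the patent's image-only formulas).
def time_index(t_now: float, t_contact: float, t_interval: float) -> int:
    """Time index t: number of T_interval slices elapsed since the contact."""
    return int((t_now - t_contact) // t_interval)

def discounted_centrality(contact_times, t_now, t_interval, phi=0.9):
    """Sum phi**t over all M contacts between a node pair (0 < phi < 1),
    so recent contacts contribute more than old ones."""
    assert 0.0 < phi < 1.0
    return sum(phi ** time_index(t_now, tc, t_interval) for tc in contact_times)

# Example: four contacts between node s and node d, with T_now = 10000 s.
contacts_sd = [9800.0, 9100.0, 5000.0, 400.0]
print(discounted_centrality(contacts_sd, t_now=10000.0, t_interval=1000.0))
```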
second, description of the problem
If a node transmits a message toward a more popular community, the data is more likely to reach its destination. However, unrestricted transmission can overflow node buffers, a major cause of reduced delivery rate, so we must trade popularity off against link congestion. Specifically, with the proposed DTN hierarchy we can detect and control buffer occupancy at fine granularity at every time step, rather than passively accepting congestion information. Considering the bottleneck effect, the maximum buffer occupancy within community C_i is calculated as follows:
[maximum buffer occupancy equation, shown as an image in the original]
thus, for each agent (community), the optimization goal is:
max α·Δw_C − β·L_buf
we consider the DTN routing strategy based on social attribute clustering as a fully collaborative multi-agent task scenario, which can be described by Dec-POMDP. The Dec-POMDP can be represented by a primitive, N represents the number of agents, A is the action space, is the observation space, and is the discount factor. Specifically, the observation space of each agent refers to which agents have the agent as a relay in a time slice, and the occupancy rate of the bottleneck buffer area: (ii) a
The action space represents which agent the agent is transmitted to before the next action arrives; the reward is designed as follows:
[reward function, Equation (7) in the original, shown as an image]
if the package is transmitted within a community, and it is transmitted to other communities or communities less central than the original community, the reward is obviously negative. Instead, this action facilitates packet transmission. We also consider buffer occupancy as an important factor affecting message transmission.
Three, QMIX algorithm
The QMIX algorithm is a cooperative multi-agent reinforcement learning algorithm that uses a mixing network to combine the local value functions of individual agents, improving overall performance. Nodes with positive social characteristics (such as community and centrality) can thereby extract decentralized policies consistent with the centralized policy. We only need to ensure that a global argmax on Q_tot yields the same result as a set of individual argmax operations on each Q_a:
argmax_u Q_tot(τ, u) = ( argmax_{u^1} Q_1(τ^1, u^1), …, argmax_{u^N} Q_N(τ^N, u^N) )

∂Q_tot / ∂Q_a ≥ 0, ∀a ∈ {1, …, N}
the above two equations indicate that monotonicity can be achieved by constraining the relationship between Qtot and each Qa. For each Qa, agent a may perform a distributed greedy operation. It is very easy to calculate argmaxaQtot. Also, the policy of each agent can be extracted explicitly. To implement the above constraints, the hybrid network takes agent network outputs as inputs and mixes monotonically, producing Qtot values.
The final loss function of QMIX is as follows:

[QMIX loss equation, shown as an image in the original; it is the standard squared TD error between Q_tot and the target y^tot = r + γ·max_{u'} Q_tot(τ', u', s'; θ⁻), summed over a minibatch]
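For concreteness, the following PyTorch sketch shows the standard QMIX mixing network (per Rashid et al., 2018), to which the description above corresponds: hypernetworks conditioned on the global state generate the mixing weights, and taking their absolute value enforces ∂Q_tot/∂Q_a ≥ 0. Layer sizes and example dimensions are illustrative, not taken from the patent.

```python
# A minimal sketch of QMIX's monotonic mixing network, assuming the standard
# construction: state-conditioned hypernetworks emit non-negative mixing weights.
import torch
import torch.nn as nn

class QMixer(nn.Module):
    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        # Hypernetworks: map global state -> mixing weights and biases.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(nn.Linear(state_dim, embed_dim),
                                      nn.ReLU(),
                                      nn.Linear(embed_dim, 1))

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        b = agent_qs.size(0)
        # abs() keeps all mixing weights non-negative -> Q_tot monotone in each Q_a.
        w1 = torch.abs(self.hyper_w1(state)).view(b, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(b, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_qs.view(b, 1, -1), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(b, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(b, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(b)  # Q_tot: (batch,)

# Example: 4 agents (communities), a 16-dim global state, batch of 8.
mixer = QMixer(n_agents=4, state_dim=16)
q_tot = mixer(torch.randn(8, 4), torch.randn(8, 16))
print(q_tot.shape)  # torch.Size([8])
```

Because every mixing weight is non-negative, per-agent greedy action selection at execution time recovers the same joint action as a global argmax on Q_tot, which is exactly the consistency condition stated above.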
please refer to FIG. 3 for the overall flowchart.
Referring to fig. 4, we ran experiments on the real dataset INFOCOM05, which contains 41 DTN nodes with a simulation time of 275000 s. The network card transmission rate is 250 Kbps, the node buffer size is 5 MB, packet arrival times are randomly distributed in (300 s, 400 s), and individual packet sizes are randomly distributed in (0.8 KB, 1 KB).
We first compared the QMIX algorithm with Deep Q-Network (DQN), setting the discount coefficient to 0.95 and using a 3-layer neural network with 64 neurons per layer. On INFOCOM05:
our cooperative MARL routing algorithm has a higher return and more stable performance than the DQN algorithm. The lack of cooperation between agents for distributed training and distributed execution of DQN results in higher rewards for individual agents and less rewards for other agents. In a specific routing scenario where we choose the next relay node, one agent (community) sends its messages to other agents with high centrality without restriction. To increase the second portion of the reward of equation 7, i.e., reduce buffer occupancy, a node will be reluctant to receive messages from other agents. Therefore, the training curve is unstable, and the training process is not as good as our cooperative multi-agent strong chemistry habit. By adopting the cooperative multi-agent reinforcement learning QMIX, good performance can be obtained due to the consideration of the cooperation among the multi-agents.
To evaluate the routing algorithm's effectiveness, we compare the proposed algorithm with the classical algorithms BubbleRap and direct delivery (DirectDelivery). BubbleRap forwards messages greedily: each message seeks the node with the highest global centrality as its relay until it reaches a node belonging to the same community as the destination; from then on, forwarding is determined by local centrality ranking. Because this forwarding policy resembles a rising bubble, it is called the BubbleRap routing algorithm. In DirectDelivery routing, each node carries its own messages and keeps moving until it meets the destination node, meaning no other node ever serves as a relay during the whole communication process. Taking packet lifetime as the independent variable, we simulate and analyze the delivery rates and average delays of the different routing algorithms:
referring to fig. 5, the delivery rate of the cooperative MARL routing algorithm is higher than that of the BubbleRap algorithm, which selects relay nodes according to the rank of "popular" nodes, resulting in buffer overload at high-center nodes. As the time-to-live (TTL) of a packet increases, it may reduce the instances where the packet is deleted in the source node buffer before forwarding. There will be more opportunities for messages to be transmitted between nodes. However, it can cause congestion of the node links, making the increased delivery rate less noticeable. In our proposed algorithm, agent rewards are designed taking into account not only positive social attributes but also the occupancy of the buffer. By proper weight parameters alpha and beta, the advantage of the forwarding capability of the 'popular' community can be fully utilized, and the problem of buffer overflow caused by unlimited relays is avoided.
Referring to fig. 6, the average delay of the three routing algorithms is depicted with the message TTL as the independent variable. DirectDelivery has the highest average delay because messages can only passively wait to encounter their destination node. BubbleRap, meanwhile, spends considerable time selecting the best relay among dozens of nodes, whether forwarding inside or outside a community. In contrast, our proposed routing algorithm uses an improved community detection algorithm, reducing routing over tens of nodes to forwarding over a small, countable set of communities. The average delay is therefore lower than that of BubbleRap.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (4)

1. A delay tolerant network routing algorithm based on multi-agent reinforcement learning, characterized by comprising the following steps:
firstly, Louvain clustering is applied to the delay tolerant network nodes, and a centralized-distributed hierarchical architecture is proposed;
secondly, the DTN node's next-hop selection problem is modeled as a decentralized partially observable Markov decision process (Dec-POMDP) model by incorporating positive social characteristics;
and thirdly, the model is solved via the cooperative multi-agent reinforcement learning algorithm QMIX.
2. The multi-agent reinforcement learning-based delay tolerant network routing algorithm as claimed in claim 1, wherein the algorithm introduces two different centrality indices at the community level while considering a discount factor over time, w_local representing community C_i's local centrality index:

[local centrality equation, shown as an image in the original]

where M is the total number of connections between node s and node d, φ (0 < φ < 1) is the discount coefficient, and t is the time index:

[time index equation, shown as an image in the original]

where T_now is the current simulation time and T_interval is a tunable time-slice parameter. Similarly, community C_i's global centrality index can be obtained:

[global centrality equation, shown as an image in the original]
3. The multi-agent reinforcement learning-based delay tolerant network routing algorithm according to claim 2, wherein the maximum buffer occupancy within community C_i is calculated as follows:

[maximum buffer occupancy equation, shown as an image in the original]

Thus, for each agent (community), the optimization goal is:

max α·Δw_C − β·L_buf

We treat the DTN routing strategy based on social-attribute clustering as a fully cooperative multi-agent task scenario, which can be described by a Dec-POMDP, represented as a tuple in which N is the number of agents, A is the action space, O is the observation space, and γ is the discount factor.
4. The multi-agent reinforcement learning-based delay tolerant network routing algorithm of claim 1, wherein the QMIX algorithm employs a mixing network to combine the local value functions of individual agents possessing positive social features (such as community and centrality), so as to extract decentralized policies consistent with the centralized policy.
CN202011588326.2A 2020-12-29 2020-12-29 Delay tolerant network routing algorithm based on multi-agent reinforcement learning Pending CN112867083A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011588326.2A CN112867083A (en) 2020-12-29 2020-12-29 Delay tolerant network routing algorithm based on multi-agent reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011588326.2A CN112867083A (en) 2020-12-29 2020-12-29 Delay tolerant network routing algorithm based on multi-agent reinforcement learning

Publications (1)

Publication Number Publication Date
CN112867083A true CN112867083A (en) 2021-05-28

Family

ID=75998022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011588326.2A Pending CN112867083A (en) 2020-12-29 2020-12-29 Delay tolerant network routing algorithm based on multi-agent reinforcement learning

Country Status (1)

Country Link
CN (1) CN112867083A (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150126192A1 (en) * 2013-11-07 2015-05-07 Transpacific Ip Management Group Ltd. Cell selection or handover in wireless networks
CN111431804A (en) * 2020-03-12 2020-07-17 中国人民解放军陆军炮兵防空兵学院 DTN routing algorithm based on behavior prediction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHANG BINBIN et al.: "2020 2nd International Conference on Advanced Computer Control", 29 March 2010 *
孙彧 (Sun Yu) et al.: "A Survey of Multi-Agent Deep Reinforcement Learning" (多智能体深度强化学习研究综述), Computer Engineering and Applications (《计算机工程与应用》) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113645589B (en) * 2021-07-09 2024-05-17 北京邮电大学 Unmanned aerial vehicle cluster route calculation method based on inverse fact policy gradient
CN114244767A (en) * 2021-11-01 2022-03-25 北京邮电大学 Load balancing-based link minimum end-to-end delay routing algorithm
CN114244767B (en) * 2021-11-01 2023-09-26 北京邮电大学 Link minimum end-to-end delay routing algorithm based on load balancing
CN118433112A (en) * 2024-07-04 2024-08-02 北京邮电大学 Heterogeneous network dynamic load balancing method and device with interrupt delay tolerance
CN118433112B (en) * 2024-07-04 2024-09-24 北京邮电大学 Heterogeneous network dynamic load balancing method and device with interrupt delay tolerance

Similar Documents

Publication Publication Date Title
Tang et al. Survey on machine learning for intelligent end-to-end communication toward 6G: From network access, routing to traffic control and streaming adaption
Kaur et al. Edge computing in the industrial internet of things environment: Software-defined-networks-based edge-cloud interplay
Kout et al. AODVCS, a new bio-inspired routing protocol based on cuckoo search algorithm for mobile ad hoc networks
Lindgren et al. Probabilistic routing in intermittently connected networks
Grundy et al. Promoting congestion control in opportunistic networks
Hossain et al. Multi-objective Harris hawks optimization algorithm based 2-Hop routing algorithm for CR-VANET
Li et al. QGrid: Q-learning based routing protocol for vehicular ad hoc networks
Han et al. QMIX aided routing in social-based delay-tolerant networks
Ding et al. Intelligent data transportation in smart cities: A spectrum-aware approach
CN112867083A (en) Delay tolerant network routing algorithm based on multi-agent reinforcement learning
Boldrini et al. Modelling social-aware forwarding in opportunistic networks
Zhao et al. An improved ant colony optimization for communication network routing problem
CN105228215A (en) Based on many copies method for routing of decision tree mechanism in vehicular ad hoc network
Vafaei et al. QoS-aware multi-path video streaming for urban VANETs using ACO algorithm
Pirzadi et al. A novel routing method in hybrid DTN–MANET networks in the critical situations
Kaddoura et al. SDODV: A smart and adaptive on-demand distance vector routing protocol for MANETs
Zhang et al. V2V routing in VANET based on fuzzy logic and reinforcement learning
CN117041132A (en) Distributed load balancing satellite routing method based on deep reinforcement learning
CN107995114A (en) Delay Tolerant Network method for routing based on Density Clustering
CN110417572B (en) Method for predicting message transfer node based on target node meeting probability
CN107483560A (en) It is a kind of towards the multimode group-net communication of shared electricity consumption and system of selection
Nigam et al. Bonding based technique for message forwarding in social opportunistic network
CN109769283B (en) Family-based hierarchical opportunistic routing method based on genetic relationship under MSN (multiple spanning tree)
Quy et al. An adaptive on-demand routing protocol with QoS support for urban-MANETs
Toorchi et al. Deep reinforcement learning enhanced skeleton based pipe routing for high-throughput transmission in flying ad-hoc networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210528