CN113660710A - Routing method of mobile ad hoc network based on reinforcement learning - Google Patents

Routing method of mobile ad hoc network based on reinforcement learning

Info

Publication number
CN113660710A
Authority
CN
China
Prior art keywords: node, value, nodes, neighbor, network
Prior art date
Legal status
Granted
Application number
CN202110756598.7A
Other languages
Chinese (zh)
Other versions
CN113660710B (en)
Inventor
Wang Yinghe (王英赫)
Current Assignee
Shanghai Dianji University
Original Assignee
Shanghai Dianji University
Priority date
Filing date
Publication date
Application filed by Shanghai Dianji University
Priority to CN202110756598.7A
Publication of CN113660710A
Application granted
Publication of CN113660710B
Status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W40/00: Communication routing or communication path finding
    • H04W40/02: Communication route or path selection, e.g. power-based or shortest path routing
    • H04W40/04: Communication route or path selection based on wireless node resources
    • H04W40/10: Communication route or path selection based on available power or energy
    • H04W40/12: Communication route or path selection based on transmission quality or channel quality
    • H04W84/00: Network topologies
    • H04W84/18: Self-organising networks, e.g. ad-hoc networks or sensor networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a reinforcement learning-based mobile ad hoc network routing method that overcomes two shortcomings of existing routing protocols: they are not suited to networks with non-uniformly distributed nodes, and they cannot adequately measure the relationship between nodes and the network. A complex-network method is used as the basis for generating the Q-value table under the reinforcement learning framework, providing a standard for the preliminary evaluation of node quality. The method can effectively establish the network topology, reduce the cost of maintaining the network structure, and exploit the characteristics of a non-uniformly distributed network to achieve efficient data transmission.

Description

Routing method of mobile ad hoc network based on reinforcement learning
Technical Field
The invention relates to a wireless communication technology, in particular to a mobile ad hoc network routing method based on reinforcement learning.
Background
A mobile ad hoc network is a multi-hop wireless communication network formed by self-organizing mobile nodes that participate in data transmission without the management of central nodes such as base stations. This network form is decentralized, flexible to deploy, simple to configure, and highly survivable. In the development of mobile ad hoc networks, techniques combining network topology control with transmission routing strategies have attracted great interest. According to the scope of the routing information involved, routing protocols can be grouped into local-information routing, global-information routing, and mixed-information routing. Local-information routing includes the random-walk routing strategy, the maximum-degree routing strategy, the local-betweenness routing strategy, the preferential routing strategy, and so on. Of particular interest is the preferential routing strategy with a tunable parameter: it introduces an order parameter describing the position of the network phase-transition point in order to measure the critical point of network congestion. Global-information routing includes the shortest-path routing strategy, the efficient-path routing strategy, and the optimized random-walk routing strategy, and it focuses more on the overall transmission capability of the network. Besides local and global routing protocols there is also mixed-information routing, which blends various factors present in the network as the basis for data-transfer decisions.
The routing protocols in the studies above have two shortcomings. First, the networks to which each protocol applies are essentially built on topologies with uniformly distributed nodes; the characteristics of networks with non-uniformly distributed nodes are not considered, so these protocols do not apply to such networks. Second, most routing protocols focus on a single objective, that is, the reward strategy is built around one target, so the relationship between nodes and the network cannot be measured well, leaving room for improvement.
Disclosure of Invention
The object of the invention is to provide a reinforcement learning-based mobile ad hoc network routing method that can effectively establish the network topology, reduce the cost of maintaining the network structure, and exploit the characteristics of a non-uniformly distributed network to achieve efficient data transmission.
The technical purpose of the invention is achieved by the following technical scheme:
A routing method of a mobile ad hoc network based on reinforcement learning comprises the following steps:
S1, calculating the remaining-energy percentage of the opposite node to determine its forwarding willingness, and calculating the Hello packet delivery rate between this node and the opposite node to determine the link quality between them;
S2, determining the neighbor nodes through probabilistic connection according to the residual-energy factor and the Hello packet delivery-rate factor, completing construction of the network topology;
S3, calculating the instantaneous reward value R_s(i) from the residual-energy factor and the Hello packet delivery-rate factor to evaluate the quality of the neighbor nodes, and iterating the update periodically to obtain the Q values of all nodes within the coverage range;
S4, when a node needs to send data, calculating the forwarding reward value R_s(d, i) from the mean betweenness of the nodes on the shortest path from the node to the destination node;
S5, calculating a selection factor Q_s(d, i) from the current node's evaluation value Q_s(i) of each neighbor node and the forwarding reward value R_s(d, i), sorting the neighbor nodes by Q_s(d, i), and selecting the node with the largest Q_s(d, i) as the next-hop node for data transmission.
In conclusion, the invention has the following beneficial effects:
the routing strategy is divided into two stages, wherein the first stage is a network structure establishment stage based on a complex network, and the second stage is a routing stage based on reinforcement learning. In the stage of establishing the network structure, the invention takes a complex network correlation method as a generation basis of a Q value table under a reinforcement learning frame, and provides a standard for preliminary evaluation of the node quality. In the second stage of routing selection, the routing strategy adopts the node betweenness on the whole path as the calculation basis of routing reward, and fully expresses the requirement of the shortest path in the non-uniform network. And integrating the two stages to form a routing strategy based on network topology control, wherein the strategy can effectively reduce the time delay and the congestion probability of the network, improve the survival time of the nodes and further improve the routing capability.
Drawings
FIG. 1 is a schematic flow diagram of the process.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
According to one or more embodiments, a mobile ad hoc network routing method based on reinforcement learning is disclosed, which comprises the following steps.
The nodes maintain and update their Q-value tables periodically: each node regularly broadcasts Hello messages and receives the response messages fed back by other nodes within its coverage range.
S1, calculating the remaining-energy percentage of the opposite node to determine its forwarding willingness, and calculating the Hello packet delivery rate between this node and the opposite node to determine the link quality between them;
S2, determining the neighbor nodes through probabilistic connection according to the residual-energy factor and the Hello packet delivery-rate factor, completing construction of the network topology;
S3, calculating the instantaneous reward value R_s(i) from the residual-energy factor and the Hello packet delivery-rate factor to evaluate the quality of the neighbor nodes, and iterating the update periodically to obtain the Q values of all nodes within the coverage range;
S4, when a node needs to send data, calculating the forwarding reward value R_s(d, i) from the mean betweenness of the nodes on the shortest path from the node to the destination node;
S5, calculating a selection factor Q_s(d, i) from the current node's evaluation value Q_s(i) of each neighbor node and the forwarding reward value R_s(d, i), sorting the neighbor nodes by Q_s(d, i), and selecting the node with the largest Q_s(d, i) as the next-hop node for data transmission.
A mobile ad hoc network with non-uniformly distributed nodes is one in which the nodes are not distributed at random over the network scene, so that different areas have different node densities. This non-uniform topology affects the applicability of mobile ad hoc routing strategies.
The network node refers to a mobile terminal participating in data transmission of the mobile ad hoc network. Connected edges (simply "edges") refer to relationships between network nodes. The edges determine the topology of the network.
The neighbors of a node are the set of all nodes that share a connecting edge with it. In the mobile ad hoc network considered in the present invention, the other nodes within a node's coverage range are not necessarily all its neighbors.
The betweenness of a node x is the number of shortest paths in the network on which node x lies. Nodes with large betweenness do not necessarily have a large degree, nor do they necessarily occupy a central position in the network topology; nevertheless, network betweenness can generally characterize the degree of centralization of a network.
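As a minimal illustration of node betweenness (not part of the patent; the toy topology and the use of the networkx library are assumptions made here):

```python
import networkx as nx

# A small toy topology purely for illustration.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e")])

# For each node x, betweenness_centrality gives the fraction of all-pairs
# shortest paths that pass through x (normalized to [0, 1]).
betweenness = nx.betweenness_centrality(G, normalized=True)
print(betweenness)  # "b" lies on the most shortest paths, so it has the largest value
```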
Routing strategy based on a non-uniformly distributed network: the strategy comprises two parts, namely (1) network topology establishment and node evaluation, which is responsible for generating neighbor relations according to node willingness and link quality and completing the quality evaluation of the neighbor nodes; and (2) a data-forwarding selection process, which selects the next-hop node during data forwarding according to the betweenness characteristics of the network.
The routing strategy is divided into two stages, wherein the first stage is a network structure establishment stage based on a complex network, and the second stage is a routing stage based on reinforcement learning.
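Purely as an illustration of these two stages (not the patented implementation; the function names maintenance_round, forwarding_round, connect and reward_fn are assumptions introduced here), the per-node procedure of steps S1-S5 could be organized roughly as follows:

```python
def maintenance_round(q_table, replies, connect, alpha=0.5, eta=0.5, gamma=0.8):
    """Phase 1 (S1-S3): process Hello replies, rebuild the neighbor set, update Q_s(i).

    q_table -- dict mapping neighbor id -> Q_s(i) for the current node s
    replies -- dict mapping candidate id -> (remaining_energy_ratio, delivery_rate, best_q),
               where best_q is the largest Q value reported in that candidate's own table
    connect -- callable deciding probabilistic neighbor admission (step S2)
    """
    for i, (energy, delivery, best_next_q) in replies.items():
        if not connect(energy, delivery):                        # S2: probabilistic connection
            q_table.pop(i, None)                                 # drop nodes no longer kept as neighbors
            continue
        reward = (energy ** alpha) * (delivery ** (1 - alpha))   # S3: instantaneous reward R_s(i)
        old = q_table.get(i, 0.0)                                # a new neighbor starts from Q_s(i) = 0
        q_table[i] = (1 - eta) * old + eta * (reward + gamma * best_next_q)


def forwarding_round(q_table, destination, reward_fn):
    """Phase 2 (S4-S5): pick the neighbor maximizing Q_s(d, i) = Q_s(i) + R_s(d, i)."""
    return max(q_table, key=lambda i: q_table[i] + reward_fn(i, destination))
```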
Establishing a network structure:
In a large-scale ad hoc network with numerous nodes, if a node i establishes links with every node in its coverage area as a neighbor, the node is burdened and much unnecessary signaling is transmitted in the network, adding load to its operation. Therefore, during construction of the network topology, the establishment of node links is restricted, and only nodes that can express willingness to the network are selected to build neighbor relations.
In the routing strategy, a network topology is determined according to node residual energy and a Hello packet receiving ratio.
1) Node residual energy calculation
The node residual energy directly indicates how long the node can survive in the network. Residual energy is generally considered to influence a node's forwarding willingness: when much energy remains, the node is willing to participate in data forwarding, and when little remains, the node refuses unnecessary forwarding in order to prolong its own lifetime. The amount of residual energy therefore reflects the node's forwarding willingness and becomes a factor in establishing neighbor relations.
g(E) is an arbitrary monotonically increasing function of the node residual energy, usually taken as g(E) = E^τ with E ≠ 0. It represents the role of the node residual energy E in selecting the next-hop node, and this role varies somewhat with the form of g(E). In this model, τ = 1.
2) Hello packet delivery rate (reception ratio) between nodes
Besides taking the node residual energy into consideration as the node's forwarding willingness, the characteristics of the link between nodes are also considered: the strategy uses the Hello packet delivery rate (reception ratio) as a reference factor for the link quality between nodes. The Hello packet delivery rate is defined as the ratio of the Hello packets received by a node i within the coverage range to the Hello packets transmitted by this node. This value measures the transmission quality of the link between the nodes well and helps ensure the stability of data forwarding. The Hello packet delivery rate is calculated with the following formula:
H(i) = λ · h_r(i) / h_t(i)
where H(i) is the delivery rate between this node and node i within the coverage range, h_t(i) is the number of Hello packets sent by this node, and h_r(i) is the number of Hello packets received by node i. λ ∈ [0, 1] is an adjusting parameter indicating the importance of the delivery rate. Since too few Hello packets cannot reliably determine the link quality, the strategy stipulates that the delivery rate is 0 when h_t(i) < 20.
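As a sketch only (the function name is invented, and treating λ as a direct weight on the raw ratio is one plausible reading of the formula above), the delivery-rate factor with the h_t(i) < 20 cut-off could be computed as:

```python
def hello_delivery_rate(h_sent: int, h_received: int, lam: float = 1.0) -> float:
    """Hello packet delivery rate H(i) between this node and node i.

    h_sent     -- h_t(i), Hello packets broadcast by this node
    h_received -- h_r(i), Hello packets from this node received by node i
    lam        -- adjusting parameter in [0, 1] weighting the importance of the rate
                  (assumed here to scale the raw ratio; the patent's exact form is unclear)
    """
    # Too few Hello packets cannot characterize link quality, so the policy
    # stipulates a delivery rate of 0 when fewer than 20 packets were sent.
    if h_sent < 20:
        return 0.0
    return lam * (h_received / h_sent)


print(hello_delivery_rate(50, 42, lam=0.8))  # 0.672
```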
3) Calculation of Q value
The nodes periodically broadcast Hello data packets in the network in order to find nodes within their coverage range that are suitable to become neighbors. The packet requires the nodes within the coverage range to return an acknowledgement message (ACK) containing their own remaining-energy ratio. The selection principle for neighbor nodes is to take as neighbors those nodes that meet a certain energy requirement and whose links have good communication quality. The neighbor selection is defined by the following formula.
Suppose the probability that node i is connected to this node is π_i; this probability is constrained by the node residual energy and the Hello packet delivery rate:
π_i = f(g(E_i), H_i) / Σ_{j∈N_s} f(g(E_j), H_j)
where f(g(E), H) = g(E)^α · H^(1-α), g(E) is a monotonic function of the node residual energy, H is the delivery success rate of the Hello packet, and α is an adjustable parameter that balances the energy against the packet reception rate. N_s is the set of neighbors of this node s, and j is a neighbor of node s.
When the determination of the neighbor relations is complete, an instantaneous reward value R_s(i) is defined from the node residual-energy factor and the Hello packet reception-rate factor to evaluate the routing tendency:
R_s(i) = E_{s,i} · H_{s,i} = g(E_i)^α · H_i^(1-α)
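As a small illustration of the probabilistic connection rule and the instantaneous reward defined above (a sketch only, assuming g(E) = E^τ with τ = 1; the function and variable names are invented, and normalizing over the candidate set rather than strictly over N_s is a simplification):

```python
import random

def preference(energy: float, delivery: float, alpha: float = 0.5, tau: float = 1.0) -> float:
    """f(g(E), H) = g(E)^alpha * H^(1 - alpha) with g(E) = E^tau; the same form as R_s(i)."""
    return (energy ** tau) ** alpha * delivery ** (1.0 - alpha)

def connection_probabilities(candidates: dict, alpha: float = 0.5) -> dict:
    """pi_i for every candidate i, where candidates maps id -> (remaining energy, delivery rate)."""
    weights = {i: preference(e, h, alpha) for i, (e, h) in candidates.items()}
    total = sum(weights.values()) or 1.0          # guard against an all-zero denominator
    return {i: w / total for i, w in weights.items()}

def choose_neighbors(candidates: dict, alpha: float = 0.5) -> list:
    """Keep candidate i as a neighbor with probability pi_i (step S2)."""
    probs = connection_probabilities(candidates, alpha)
    return [i for i, p in probs.items() if random.random() < p]

# Example: two candidates described by (remaining energy ratio, Hello delivery rate).
print(connection_probabilities({"A": (0.9, 0.8), "B": (0.4, 0.5)}))
```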
After the instantaneous reward value of the next-hop node is defined, the current node s updates the corresponding entry Q_s(i) in its Q-value table as:
Q_s(i) = (1 - η) · Q_s(i) + η · [R_s(i) + γ · max_{j∈N_i} Q_i(j)]
where η is the learning rate (the larger η, the less of the original Q value is retained), γ is the discount factor, and max_{j∈N_i} Q_i(j) denotes the maximum Q value, attained by some node j, in the Q-value table of the neighbor node i. If the neighbor node i is a node newly added within the coverage range of the current node s, then Q_s(i) = 0 in the Q-value table of node s.
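A sketch of this update (assuming the garbled expression is the standard Q-routing form implied by the surrounding text; the function and parameter names are invented for illustration):

```python
def update_q_entry(q_s: dict, neighbor_q_values, i, reward: float,
                   eta: float = 0.5, gamma: float = 0.8) -> None:
    """One update of Q_s(i) for neighbor i of the current node s.

    q_s               -- Q-value table of node s (neighbor id -> Q value)
    neighbor_q_values -- Q values reported by neighbor i from its own table
    reward            -- instantaneous reward R_s(i) = g(E_i)^alpha * H_i^(1 - alpha)
    eta               -- learning rate; larger eta keeps less of the old Q value
    gamma             -- discount factor
    """
    old = q_s.get(i, 0.0)                            # a newly added neighbor starts at 0
    best_next = max(neighbor_q_values, default=0.0)  # max_j Q_i(j) over i's neighbors j
    q_s[i] = (1.0 - eta) * old + eta * (reward + gamma * best_next)


q = {}
update_q_entry(q, [0.4, 0.7], "node_i", reward=0.6)
print(q)  # {'node_i': 0.58}
```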
Using the residual-energy factor and the Hello packet delivery-rate factor, the node connects probabilistically to the other nodes within its coverage range to form the network topology. From these two factors an initial Q value is calculated according to the reinforcement learning method, and the Q-value table is formed and maintained. The probabilistic connection calculation is performed periodically on the nodes within the coverage range; the result determines whether each neighbor remains connected, and if not, the corresponding neighbor entry is deleted from the Q-value table.
Through the formulas above, the link-establishment strategy is specified from two aspects: the network topology is interpreted both from the network's global capability and from the link level, laying a foundation for the subsequent route establishment.
2. Data forwarding method
The current node must periodically maintain and update the Q-value entries of its neighbor nodes in the Q-value table, evaluating the quality of the neighbor nodes. When data needs to be transmitted, the mean betweenness of the nodes on the shortest path from each neighbor node i to the destination node d is considered, and from this mean a betweenness-based forwarding reward value R_s(d, i) is defined. The larger the value, the larger the forwarding reward:
R_s(d, i) = (1/L) · Σ_{k∈P(i,d)} B(k)
The forwarding reward value R_s(d, i) is the average of the betweenness values B(k) of all nodes on the shortest path P(i, d) from the neighbor node i of the current node s to the destination node d, with R_s(d, i) ∈ (0, 1]; L is the number of nodes on the path. It can be seen that the closer the current node is to the destination node, the larger R_s(d, i), and thus the larger the forwarding reward.
3. Routing policy flow
By calculating the forwarding reward R_s(d, i) and combining it with the Q value of neighbor node i in the current node's Q-value table, the next-hop forwarding node is determined. Q_s(d, i) is defined as the Q value of selecting the neighbor node i as the next-hop node when the current node s forwards data toward the destination node d, and is expressed as:
Q_s(d, i) = Q_s(i) + R_s(d, i)
Assuming the current node s has N neighbor nodes, Q_s(d, i), i = 1, 2, ..., N, is computed in turn for the N neighbors from the Q-value table entries of node s and the forwarding reward based on path betweenness. The neighbor node with the largest Q_s(d, i) is selected as the data-forwarding node.
From the above description, the roles of the two main phases of this routing strategy are summarized as follows. 1) First phase: network topology establishment and node evaluation. A node with no data packets to transmit must periodically broadcast Hello packets to the nodes within its coverage range, maintain the network structure through the received responses, and update its Q-value table. 2) Second phase: data-forwarding selection. If data needs to be sent, the forwarding reward value R_s(d, i) on the shortest path from every neighbor node to the destination node is calculated and combined with the Q-value table entries of the current node; the neighbor node with the largest final value Q_s(d, i) is selected as the next-hop node, and the data is sent to it.
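The second phase just summarized could look roughly like the following sketch (illustrative assumptions: the networkx library supplies shortest paths and normalized betweenness, and all function names are invented; the exact normalization keeping R_s(d, i) within (0, 1] is not spelled out in the text):

```python
import networkx as nx

def forwarding_reward(graph: nx.Graph, neighbor, destination, betweenness) -> float:
    """R_s(d, i): mean betweenness of the L nodes on the shortest path from neighbor i to d."""
    path = nx.shortest_path(graph, source=neighbor, target=destination)
    return sum(betweenness[node] for node in path) / len(path)

def select_next_hop(q_s: dict, graph: nx.Graph, destination):
    """Return the neighbor i maximizing Q_s(d, i) = Q_s(i) + R_s(d, i)."""
    betweenness = nx.betweenness_centrality(graph, normalized=True)
    return max(q_s, key=lambda i: q_s[i] + forwarding_reward(graph, i, destination, betweenness))
```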
Reinforcement learning is an important development direction in the field of artificial intelligence that has attracted growing attention and extensive research in recent years. It comprises four elements: agent, environment, action, and reward. The agent selects a suitable action according to some policy; the environment gives feedback, i.e. a reward, according to the action the agent selects in a given state; the agent then adjusts its policy according to the reward and updates its behavior. Through this repeated adjustment, decisions are progressively optimized. The earliest algorithm to apply reinforcement learning to mobile ad hoc networks was the Q-routing algorithm, which keeps the weights measuring path quality in a Q table maintained by each node and selects the next-hop node according to that table. Other reinforcement learning-based routing algorithms include the following. An algorithm that adjusts the reinforcement learning rate according to node degree in the network topology uses less time to probe the real state of the network; obtaining the Q values of neighbor nodes from their broadcast messages reduces the time needed to explore the network state and the performance loss of the algorithm during learning. Adaptive Q-routing that randomly polls neighbor nodes improves routing stability under high load. A distributed reinforcement learning routing protocol suited to scenarios where vehicle nodes move at high speed estimates the state information of the network topology and uses unicast control packets to check the availability of inter-vehicle paths. A reinforcement learning-based mobile adaptive routing protocol addresses the disorganized and unstable network topology and improves dynamic adaptability to node changes through a distributed Q-learning algorithm. In summary, the reinforcement learning framework can be applied to a routing algorithm for mobile ad hoc networks: by continuously iterating the reward values, routing paths are planned toward a given routing objective, and the data-transmission task is completed well.
In the network-structure establishment stage, the invention uses a complex-network method as the basis for generating the Q-value table under the reinforcement learning framework, providing a standard for the preliminary evaluation of node quality. In the route-selection stage, the routing strategy uses the node betweenness along the whole path as the basis for computing the routing reward, fully expressing the requirement for the shortest path in a non-uniform network. Integrating the two stages yields a routing strategy based on network topology control that can effectively reduce network delay and congestion probability, extend node lifetime, and thereby improve routing capability.
Compared with the prior art, the invention adopts a dual-objective decision technique for constructing the mobile ad hoc network topology, which can comprehensively take the characteristics of the network into account and establish the network structure reasonably. Unlike a network with infrastructure, the multi-hop nature of a mobile ad hoc network means that the capabilities of the nodes and links participating in data transmission determine the transmission efficiency, so a single objective cannot comprehensively measure the network's characteristics as the basis for constructing the topology.
Secondly, the invention not only adopts multi-objective decision to construct the network topology, but also introduces the node betweenness index as an important reference for data forwarding. Node betweenness, as an important measure of network centrality, is well suited to reflecting the structural characteristics of a non-uniformly distributed network. Since most mobile ad hoc networks exhibit non-uniform node distribution, the proposed routing method can plan the routing path from the source node to the destination node more quickly and efficiently, improving the efficiency of data transmission.
Thirdly, the invention adopts a routing strategy combining a complex-network method with reinforcement learning, continuously optimizing the set of nodes participating in transmission according to the transmission reward value during route selection, thereby further ensuring efficient data transmission.
This embodiment only explains the invention and does not limit it; after reading this specification, those skilled in the art may modify the embodiment as needed without inventive contribution, and such modifications are protected by patent law within the scope of the claims of the invention.

Claims (4)

1. A routing method of a mobile ad hoc network based on reinforcement learning is characterized by comprising the following steps:
S1, calculating the remaining-energy percentage of the opposite node to determine its forwarding willingness, and calculating the Hello packet delivery rate between this node and the opposite node to determine the link quality between them;
S2, determining the neighbor nodes through probabilistic connection according to the residual-energy factor and the Hello packet delivery-rate factor, completing construction of the network topology;
S3, calculating the instantaneous reward value R_s(i) from the residual-energy factor and the Hello packet delivery-rate factor to evaluate the quality of the neighbor nodes, and iterating the update periodically to obtain the Q values of all nodes within the coverage range;
S4, when a node needs to send data, calculating the forwarding reward value R_s(d, i) from the mean betweenness of the nodes on the shortest path from the node to the destination node;
S5, calculating a selection factor Q_s(d, i) from the current node's evaluation value Q_s(i) of each neighbor node and the forwarding reward value R_s(d, i), sorting the neighbor nodes by Q_s(d, i), and selecting the node with the largest Q_s(d, i) as the next-hop node for data transmission.
2. The reinforcement learning-based mobile ad hoc network routing method according to claim 1, wherein the determination of the neighbor nodes in step S2 is specifically:
suppose the probability that node i is connected to this node is π_i, and the probability is constrained by the node residual energy and the Hello packet delivery rate:
π_i = f(g(E_i), H_i) / Σ_{j∈N_s} f(g(E_j), H_j)
wherein f(g(E), H) = g(E)^α · H^(1-α), g(E) is a monotonic function of the node residual energy, H is the delivery success rate of the Hello packet, and α is an adjustable parameter that can adjust the relation between the energy and the packet reception rate; N_s is the set of neighbors of the node s; j is a neighbor of the node s.
3. The reinforcement learning-based mobile ad hoc network routing method according to claim 2, wherein the calculation of the instantaneous reward value and the update of the Q-value table are specifically:
defining an instantaneous reward value R_s(i) to evaluate the routing tendency,
R_s(i) = E_{s,i} · H_{s,i} = g(E_i)^α · H_i^(1-α)
after the instantaneous reward value of the next-hop node is defined, the current node s updates the corresponding entry Q_s(i) in its Q-value table as
Q_s(i) = (1 - η) · Q_s(i) + η · [R_s(i) + γ · max_{j∈N_i} Q_i(j)]
wherein η is the learning rate (the larger η, the less of the original Q value is retained), γ is the discount factor, and max_{j∈N_i} Q_i(j) denotes the maximum Q value, attained by node j, in the Q-value table of the neighbor node i;
if the neighbor node i is a node newly added within the coverage range of the current node s, then Q_s(i) = 0 in the Q-value table of node s.
4. The reinforcement learning-based mobile ad hoc network routing method according to claim 3, wherein the data-forwarding routing policy is specifically:
when data needs to be transmitted, considering the mean betweenness of the nodes on the shortest path from the neighbor node i to the destination node d, and defining a betweenness-based forwarding reward value R_s(d, i),
R_s(d, i) = (1/L) · Σ_{k∈P(i,d)} B(k)
where the forwarding reward value R_s(d, i) is the average of the betweenness values B(k) of all nodes on the shortest path P(i, d) from the neighbor node i of the current node s to the destination node d, with R_s(d, i) ∈ (0, 1]; L is the number of nodes on the path;
determining the next-hop forwarding node by combining the Q values of the neighbor nodes i in the Q-value table of the current node; Q_s(d, i) is defined as the Q value of selecting the neighbor node i as the next-hop node when the current node s forwards data to the destination node d, expressed as
Q_s(d, i) = Q_s(i) + R_s(d, i)
assuming the current node s has N neighbor nodes, computing Q_s(d, i), i = 1, 2, ..., N, in turn for the N neighbor nodes from the Q-value table entries of node s and the forwarding reward value based on path betweenness; and
selecting the neighbor node with the largest Q_s(d, i) as the data-forwarding node for data transmission.
CN202110756598.7A 2021-07-05 2021-07-05 Mobile self-organizing network routing method based on reinforcement learning Active CN113660710B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110756598.7A CN113660710B (en) 2021-07-05 2021-07-05 Mobile self-organizing network routing method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110756598.7A CN113660710B (en) 2021-07-05 2021-07-05 Mobile self-organizing network routing method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN113660710A true CN113660710A (en) 2021-11-16
CN113660710B CN113660710B (en) 2023-10-31

Family

ID=78477952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110756598.7A Active CN113660710B (en) 2021-07-05 2021-07-05 Mobile self-organizing network routing method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN113660710B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114449608A (en) * 2022-01-21 2022-05-06 重庆邮电大学 Unmanned aerial vehicle ad hoc network self-adaptive routing method based on Q-Learning
CN114900255A (en) * 2022-05-05 2022-08-12 吉林大学 Near-surface wireless network link gradient field construction method based on link potential energy

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107104899A (en) * 2017-06-09 2017-08-29 中山大学 A kind of method for routing based on ant group algorithm being applied in vehicular ad hoc network
CN111479306A (en) * 2020-04-02 2020-07-31 中国科学院上海微系统与信息技术研究所 Q-learning-based QoS (quality of service) routing method for self-organizing network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107104899A (en) * 2017-06-09 2017-08-29 中山大学 A kind of method for routing based on ant group algorithm being applied in vehicular ad hoc network
CN111479306A (en) * 2020-04-02 2020-07-31 中国科学院上海微系统与信息技术研究所 Q-learning-based QoS (quality of service) routing method for self-organizing network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG YINGHE: "Evolving Network Model with Local-Area Preference for Mobile Ad Hoc Network", NETWORK TECHNOLOGY AND APPLICATION *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114449608A (en) * 2022-01-21 2022-05-06 重庆邮电大学 Unmanned aerial vehicle ad hoc network self-adaptive routing method based on Q-Learning
CN114900255A (en) * 2022-05-05 2022-08-12 吉林大学 Near-surface wireless network link gradient field construction method based on link potential energy
CN114900255B (en) * 2022-05-05 2023-03-21 吉林大学 Near-surface wireless network link gradient field construction method based on link potential energy

Also Published As

Publication number Publication date
CN113660710B (en) 2023-10-31

Similar Documents

Publication Publication Date Title
Qasim et al. Mobile Ad Hoc Networking Protocols' Evaluation through Simulation for Quality of Service.
CN1886942B (en) Method and system for routing traffic in AD HOC networks
Yadav et al. Performance comparison and analysis of table-driven and on-demand routing protocols for mobile ad-hoc networks
WO2019169874A1 (en) Wireless mesh network opportunistic routing algorithm based on quality of service assurance
CN101945432A (en) Multi-rate opportunistic routing method for wireless mesh network
Deepalakshmi et al. Ant colony based QoS routing algorithm for mobile ad hoc networks
Qasim et al. Mobile Ad hoc Networks simulations using Routing protocols for Performance comparisons
Hendriks et al. Q 2-routing: A Qos-aware Q-routing algorithm for wireless ad hoc networks
CN108449271B (en) Routing method for monitoring path node energy and queue length
CN113660710B (en) Mobile self-organizing network routing method based on reinforcement learning
CN108684063A (en) A kind of on-demand routing protocol improved method based on network topology change
CN110932969B (en) Advanced metering system AMI network anti-interference attack routing algorithm for smart grid
Wannawilai et al. AOMDV with sufficient bandwidth aware
Ferdous et al. Randomized energy-based AODV protocol for wireless ad-Hoc network
Chettibi et al. FEA-OLSR: An adaptive energy aware routing protocol for manets using zero-order sugeno fuzzy system
Zhang et al. LocalTree: An efficient algorithm for mobile peer-to-peer live streaming
Abolhasan et al. LPAR: an adaptive routing strategy for MANETs
CN112533262B (en) Multi-path on-demand routing method of rechargeable wireless sensor network
Lafta et al. Efficient routing protocol in the mobile ad-hoc network (MANET) by using genetic algorithm (GA)
Liu et al. A biologically inspired congestion control routing algorithm for MANETs
Chaudhari et al. Multilayered distributed routing for power efficient MANET performance
Chetret et al. Reinforcement learning and CMAC-based adaptive routing for manets
Bokhari et al. AMIRA: interference-aware routing using ant colony optimization in wireless mesh networks
Ramakrishnan et al. Mathematical modeling of routing protocol selection for optimal performance of MANET
Yi et al. A node-disjoin multipath routing in mobile ad hoc networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant