CN111065105B - Distributed intelligent routing method for unmanned aerial vehicle network slice

Info

Publication number: CN111065105B (earlier published as application CN111065105A)
Application number: CN201911395351.6A
Authority: CN (China)
Inventors: 陈博伦, 孙耀, 秦爽, 冯钢
Assignee: University of Electronic Science and Technology of China
Legal status: Active (application granted)

Classifications

    • H04W 16/22 — Network planning: traffic simulation tools or models
    • H04W 40/12 — Communication route or path selection based on transmission quality or channel quality
    • H04W 40/22 — Communication route or path selection using selective relaying for reaching a base transceiver station or an access point
    • H04W 40/248 — Connectivity information update
    • Y02T 10/40 — Engine management systems


Abstract

The invention discloses a distributed intelligent routing method for unmanned aerial vehicle network slicing, which comprises the following steps: modeling the unmanned aerial vehicle network as a network model; setting constraint conditions for the network model, the constraint conditions comprising a time delay limit, a rate limit and a packet loss rate limit; for low-delay-oriented slices, formulating the constrained network model as a multi-constraint optimization model; solving the multi-constraint optimization model through a reinforcement learning model to obtain a solution, wherein during the solving process each communication node independently stores the link conditions between itself and its neighbor nodes and updates them in real time; and carrying out dynamic routing according to the solution and the link conditions. The invention effectively addresses the inability of prior-art routing methods to adapt to the strict delay requirements and constantly changing topology of unmanned aerial vehicle networks, filling a technical gap and creating better conditions for the unmanned aerial vehicle network environment.

Description

Distributed intelligent routing method for unmanned aerial vehicle network slice
Technical Field
The invention relates to a wireless network routing method, and in particular to a distributed intelligent routing method for unmanned aerial vehicle network slicing.
Background
Traditional routing strategies based on the shortest-path algorithm are difficult to apply to unmanned aerial vehicle network slicing scenarios. They adapt poorly to a network environment that changes in real time, and because the shortest-path algorithm selects a single path, the probability of network congestion increases significantly as the number of traffic flows in the network grows. Reducing congestion on the basis of the shortest-path algorithm comes at the cost of the number of service flows that can be carried. In addition, in practical scenarios, node mobility causes the communication quality of links to change, and this network dynamism also affects the result of the routing algorithm.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: prior-art routing algorithms are inflexible and cannot adapt to network dynamics, so that network congestion grows in proportion to the number of service flows, making them unsuitable for unmanned aerial vehicle network slicing scenarios. The invention aims to provide a distributed intelligent routing method for unmanned aerial vehicle network slicing, solving the problem of how to improve communication quality in the dynamic network environment of such a scenario.
The invention is realized by the following technical scheme:
a distributed intelligent routing method facing unmanned aerial vehicle network slicing comprises the following steps:
s1: modeling an unmanned aerial vehicle network into a network model;
s2: setting constraints on the network model, wherein the constraints comprise: time delay limit, rate limit, packet loss rate limit;
s3: when the slice is oriented to low time delay, the constrained network model is built into a multi-constraint optimization model;
s4: solving the multi-constraint optimization model through a reinforcement learning model to obtain a solved value, wherein each communication node independently stores the link conditions of the communication node and the neighbor nodes and updates the link conditions in real time in the solving process;
s5: and carrying out dynamic routing according to the solved value and the link condition.
Firstly, the unmanned aerial vehicle network is established as a network model in a static environment, and constraint conditions of time delay limit, rate limit and packet loss rate limit are set for the network model. Aiming at the requirements of the unmanned aerial vehicle network, for low-delay-oriented slices the constrained network model is formulated as a multi-constraint optimization model. A reinforcement learning model is established and used to solve the multi-constraint optimization model; during the solving process, each communication node in the network independently stores the link conditions between itself and its neighbor nodes and updates its current link conditions in real time. Finally, according to the solution and the link states of the communication nodes, a routing path between the source node and the destination node is found that minimizes delay while meeting the rate, packet-loss and other requirements. This path changes dynamically over time as network conditions and constraints change.
Further, the network model in the static environment is established as an undirected graph G(V, E, W) with weighted edges, where V is the set of communication nodes in the network, E is the set of communication links in the network, and W represents the weight values of the links in the network. The techniques used by a communication link include TDMA, CSMA, or polling.
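The weighted-multigraph model above can be sketched in code. This is an illustrative data structure, not from the patent text: class and field names, and the sample link weights, are assumptions; each undirected link between nodes i and j may carry one edge per communication technology k.

```python
# Sketch of the undirected weighted multigraph G(V, E, W): each link e_ijk is
# keyed by (i, j, k), with k the communication technology, and its weight is
# the triple (delay, rate, packet loss).  All names/values are illustrative.
TECHS = {1: "TDMA", 2: "CSMA", 3: "polling"}

class UavNetwork:
    def __init__(self):
        self.links = {}  # (i, j, k) -> (delay_ms, rate_kbps, loss)

    def add_link(self, i, j, k, delay_ms, rate_kbps, loss):
        # e_ijk and e_jik denote the same undirected link, so store a
        # canonical key with the smaller node id first.
        key = (min(i, j), max(i, j), k)
        self.links[key] = (delay_ms, rate_kbps, loss)

    def weight(self, i, j, k):
        return self.links[(min(i, j), max(i, j), k)]

net = UavNetwork()
net.add_link(4, 5, 1, delay_ms=3.0, rate_kbps=200.0, loss=0.05)  # TDMA edge
net.add_link(4, 5, 2, delay_ms=5.0, rate_kbps=450.0, loss=0.10)  # CSMA edge
net.add_link(5, 4, 3, delay_ms=2.0, rate_kbps=100.0, loss=0.02)  # polling edge
```

The three parallel edges between nodes 4 and 5 mirror the multi-technology link of fig. 2.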
Further, the constraint conditions of the network model are expressed by QoS = (δ, ν, γ), where δ denotes delay, ν denotes rate, and γ denotes packet loss rate. The constraint conditions are:
Σ_{i,j=1…n} D(L_ij) ≤ δ,
min_{i,j=1…n} V(L_ij) ≥ ν,
1 − Π_{i,j=1…n} (1 − R(L_ij)) ≤ γ;
where L_ij represents the link from communication node i to communication node j; D(L_ij) represents the time delay of link L_ij; V(L_ij) represents the rate of link L_ij; and R(L_ij) represents the packet loss rate of link L_ij.
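The three constraints above compose along a path as a sum (delay), a minimum (rate) and a complementary product (loss). A minimal sketch of the feasibility test, assuming each link is a (delay, rate, loss) tuple:

```python
# Sketch: checking a candidate path against the three QoS constraints.
# Each element of path_links is an assumed (delay_ms, rate_kbps, loss) tuple.
def satisfies_qos(path_links, delta, nu, gamma):
    total_delay = sum(d for d, _, _ in path_links)   # Σ D(L_ij) ≤ δ
    min_rate = min(v for _, v, _ in path_links)      # min V(L_ij) ≥ ν
    success = 1.0
    for _, _, r in path_links:
        success *= (1.0 - r)
    path_loss = 1.0 - success                        # 1 − Π(1 − R(L_ij)) ≤ γ
    return total_delay <= delta and min_rate >= nu and path_loss <= gamma

links = [(5.0, 120.0, 0.05), (4.0, 200.0, 0.02)]
assert satisfies_qos(links, delta=10.0, nu=100.0, gamma=0.1)       # feasible
assert not satisfies_qos(links, delta=8.0, nu=100.0, gamma=0.1)    # delay 9 > 8
```

Note that loss composes multiplicatively through the link success probabilities, which is why the third constraint is not a simple sum.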
Further, for a low-delay slice of the unmanned aerial vehicle network, the multi-constraint optimization model is:
minimize Delay = E[Σ_{i,j=1…n} D(L_ij)]_t
subject to Error = E[1 − Π_{i,j=1…n} (1 − R(L_ij))]_t ≤ γ,
TransRate = E[min_{i,j=1…n} V(L_ij)]_t ≥ ν,
L_ij ∈ p = (L_si, …, L_ij, …, L_jd);
where E[θ]_t denotes the expectation of θ over its duration, t denotes the duration of the traffic flow, Error denotes the packet loss rate of the traffic flow, and TransRate denotes the transmission rate of the link; the constraint on L_ij requires the path's start node to be the source node s and its end node to be the destination node d. The objective "minimize Delay" minimizes the transmission delay of the path.
Further, the communication node measures the QoS quality of the link between itself and the destination node using a Q value, and step S5 comprises the following sub-steps:
S51: defining the communication node currently sending a data packet as the data packet node;
S52: the data packet node sends the data packet to a neighbor node;
S53: judging the communication node of the next hop of the data packet according to the Q values of the neighbor nodes and the link QoS, and taking that next-hop communication node as the new data packet node;
S54: repeating steps S52-S53 until the data packet node is the destination node.
The data packet node independently calculates the Q value from itself to the destination node.
The communication node uses a Q value to measure the QoS quality of the link between the current communication node and the destination node. The communication node sends data packets to its neighbor nodes, judges which communication node should be the next hop of the data packet according to the neighbor nodes' Q values and link QoS, and forwards the data packet according to this judgment until it reaches the destination node, completing the iterative routing process from a source node to the destination node. On this basis, a distributed algorithm is preferably adopted: each communication node independently calculates the Q value from itself to the destination node, exchanges Q values and link QoS with its neighbor nodes, and uses the obtained Q values and link QoS values as the selection criterion for the next communication node.
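The next-hop decision described above can be sketched as follows. This is one plausible way to combine the neighbors' Q values with the link QoS (filter on the loss and rate Q values, then take the neighbor with the smallest delay Q value); the data layout and threshold rule are assumptions, and the patent's concrete procedure is given in the later embodiments.

```python
# Sketch: among neighbours whose accumulated QoS (via their Q values) still
# satisfies the flow's loss (γ) and rate (ν) requirements, pick the one with
# the smallest accumulated-delay Q value as the next hop.
def choose_next_hop(q_d, q_e, q_v, neighbors, gamma, nu):
    feasible = [j for j in neighbors if q_e[j] <= gamma and q_v[j] >= nu]
    if not feasible:
        return None  # no neighbour can satisfy the QoS requirement
    return min(feasible, key=lambda j: q_d[j])

q_d = {2: 12.0, 3: 9.0, 4: 15.0}    # accumulated delay to destination via j
q_e = {2: 0.08, 3: 0.20, 4: 0.05}   # accumulated loss rate via j
q_v = {2: 150.0, 3: 300.0, 4: 80.0} # path rate via j
assert choose_next_hop(q_d, q_e, q_v, [2, 3, 4], gamma=0.10, nu=100.0) == 2
```

Here neighbor 3 is excluded by the loss requirement and neighbor 4 by the rate requirement, so the packet is forwarded to neighbor 2.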
Further, when the network fluctuates greatly, the communication node discards the currently stored link condition and performs the link condition calculation again.
Further, the reinforcement learning adopts a value iteration method.
Further, the routing problem is modeled as a Markov Decision Process (MDP), the MDP model comprising: a state set, an action set, a probability transition matrix, a reward matrix, and a discount factor, the discount factor being used to calculate the cumulative reward.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention establishes a multi-constraint optimization model for low-delay network slicing, the model has the dynamic characteristic, the optimal delay path from a source node to a destination node can be found by solving the multi-constraint optimization model, and the path dynamically changes at any time along with the network condition. The method is particularly suitable for the unmanned aerial vehicle network environment, and has the characteristics of high requirement on time delay and random change of network nodes. Meanwhile, the network node adopts a distributed method, and each user independently stores the link condition between the user and the neighbor and updates the link state, so that the selection of the path is faster and more efficient.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a schematic view of example 1.
Fig. 2 is a schematic diagram of a network model in embodiment 1.
Fig. 3 is a schematic diagram of a Q learning method in embodiment 2.
FIG. 4 is a diagram of a simulation environment of embodiment 5.
FIG. 5 is a graph showing the convergence of the average Q values at different ε in example 5.
FIG. 6 is a diagram showing the convergence of example 5 with a small number of iterations.
Fig. 7 shows the routing results of example 6 under different QoS requirements.
Fig. 8 is a diagram of node loss in the network according to embodiment 6.
Fig. 9 is a diagram of link failure in the network according to embodiment 6.
Fig. 10 is a graph comparing the service transmission rates of the present invention and the DSDV algorithm.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
Example 1.
As shown in fig. 1. Embodiment 1 is a distributed intelligent routing method for unmanned aerial vehicle network slicing. Firstly, modeling an unmanned aerial vehicle network into a network model; and setting constraint conditions for the network model, wherein the constraint conditions comprise: time delay limit, rate limit, packet loss rate limit; when the slice is oriented to low time delay, the constrained network model is built into a multi-constraint optimization model; then solving the multi-constraint optimization model through a reinforcement learning model to obtain a solved value, wherein each communication node independently stores the link conditions of the communication node and the neighbor nodes and updates the link conditions in real time in the solving process; and then, carrying out dynamic routing selection according to the solved value and the link condition.
In a static network environment, the network model in embodiment 1 is established as an undirected graph G(V, E, W) with weighted edges, where V is the set of communication nodes in the network, denoted V = {v_1, v_2, …, v_n}, with n the number of nodes; E is the set of communication links in the network, E = {e_111, e_121, …, e_ijk}, where e_ijk denotes a link from node i to node j and k denotes the communication technology used by the link, k taking one of the three values {1, 2, 3}, representing the TDMA, CSMA and polling techniques respectively; W represents the weight values of the links in the network, where W = (d_ijk, v_ijk, p_ijk) denotes the link delay, link transmission rate and packet loss rate corresponding to link e_ijk. In the undirected graph, links e_ijk and e_jik denote the same link. As shown in fig. 2, node 4 and node 5 are joined by 3 parallel edges, indicating that the link from node 4 to node 5 supports 3 communication technologies.
On the basis of the network model, each service flow applying for access to the network has corresponding delay, rate and packet loss rate requirements, represented by a triple QoS = (δ, ν, γ) (the corresponding values for each link can be obtained by sending probe packets). The purpose of QoS routing is to find, in the set P of paths between the source node s and the destination node d of traffic flow f, a suitable path p = (L_si, …, L_ij, …, L_jd) satisfying the following constraints:
1. Delay limit of the service request:
Σ_{i,j=1…n} D(L_ij) ≤ δ
2. Rate limit of the service request:
min_{i,j=1…n} V(L_ij) ≥ ν
3. Packet loss rate limit of the service request:
1 − Π_{i,j=1…n} (1 − R(L_ij)) ≤ γ
where D(L_ij) represents the transmission delay of link L_ij, V(L_ij) represents the available bandwidth of link L_ij, and R(L_ij) represents the packet loss rate of link L_ij.
In a slicing scenario, different types of services have different requirements on different QoS characteristics. Taking low-delay-oriented slices as an example, when a service has a strict delay requirement, the following multi-constraint optimization model can be established:
minimize Delay = E[Σ_{i,j=1…n} D(L_ij)]_t
subject to Error = E[1 − Π_{i,j=1…n} (1 − R(L_ij))]_t ≤ γ;
TransRate = E[min_{i,j=1…n} V(L_ij)]_t ≥ ν;
L_ij ∈ p = (L_si, …, L_ij, …, L_jd).
Here E[θ]_t denotes the expectation of θ over its duration, where t is the duration of the traffic flow. Constraint one (Error) is the packet loss (bit error) rate constraint of the traffic flow, constraint two (TransRate) is the transmission rate of the link, and constraint three (L_ij ∈ p) requires that the start node of a route be the source node s and the end node be the destination node d; the optimization objective (minimize Delay) is to minimize the transmission delay of the path.
And establishing a reinforcement learning model to solve the multi-constraint optimization model.
The routing problem is first modeled as a Markov Decision Process (MDP) and solved using a reinforcement learning model. Under this model, each node of the network is regarded as a state (State) in the MDP, selecting a neighbor node as the next hop of the route in each state is treated as action (Action) selection, each node acts as an agent that independently performs its own action selection and reward (Reward) calculation, and updating of the whole network is realized through information interaction among the nodes.
The MDP model contains five main elements ⟨S, A, P, R, γ⟩: S represents the state set, A the action set, P the probability transition matrix, R the reward matrix, and γ the discount factor used to calculate the cumulative reward. Since the state transition after selecting a given action in a given state is deterministic, the transition probability is p = 1. The QoS indexes of the link are used as the immediate reward value r; specifically, d_ij(t), e_ij(t) and v_ij(t) represent the delay expectation, bit error rate expectation and rate of link i-j, respectively, over a period of time. In addition, three Q values are defined, Q_ij(d), Q_ij(e) and Q_ij(v), denoting the cumulative delay, cumulative error rate and cumulative rate over all links of the whole path from node i to the destination node when node i selects node j as the next hop. Corresponding to the optimization model, the goal is to minimize the accumulated delay expectation over all links of the selected path, so the discount factor γ is set to 1, which simplifies the Q-value update strategy. The general update formula of the Q-learning value function is Q(s, a) = Q(s, a) + α(r + γQ(s', a') − Q(s, a)), where Q(s, a) is the value of the state-action pair (s, a), α is the learning rate, γ is the discount factor, and r is the immediate reward. For the routing problem, the formula can be reduced to Q(s, a) = r + Q(s', a').
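With γ = 1, p = 1 and deterministic transitions, iterating the simplified update Q(s, a) = r + min_a' Q(s', a') is a value iteration whose fixed point is the shortest-path delay to the destination. A minimal sketch on a toy topology (the four-node graph and its delays are illustrative assumptions, not from the patent):

```python
# Sketch: value iteration with Q(s,a) = r + min_a' Q(s',a') converges to the
# shortest-path delay.  Toy undirected topology with per-link delays (ms).
INF = float("inf")
delay = {(0, 1): 4.0, (1, 3): 3.0, (0, 2): 2.0, (2, 3): 7.0}
edges = {}
for (i, j), d in delay.items():
    edges.setdefault(i, {})[j] = d   # store both directions:
    edges.setdefault(j, {})[i] = d   # the graph is undirected

dest = 3
Q = {i: {j: INF for j in nbrs} for i, nbrs in edges.items()}
for _ in range(10):                  # sweep until the values stop changing
    for i, nbrs in edges.items():
        for j, d in nbrs.items():
            downstream = 0.0 if j == dest else min(Q[j].values())
            Q[i][j] = d + downstream # Q(s,a) = r + min_a' Q(s',a')

assert min(Q[0].values()) == 7.0     # best route 0 -> 1 -> 3: 4 + 3 ms
```

Each Q[i][j] is the delay of the best path from i to the destination whose first hop is j, matching the Q_ij(d) semantics defined above.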
In the learning model, the QoS performance of the links is obtained through interaction among nodes, so a global link-QoS performance matrix (the return R) can be obtained at any time t after the system starts to operate. At time t, ⟨S, A, P, R, γ⟩ are all known, so the problem can be solved with model-based reinforcement learning.
According to the Q-value update formula Q(s, a) = r + Q(s', a'), Q(s, a) represents the shortest path delay from node s to node d when node s selects its neighbor node a as the next hop. Since the optimization objective considered is to minimize delay, minimizing Q(s, a) is equivalent to minimizing r + Q(s', a'); and because the immediate reward term r is a fixed value at a given time, only Q(s', a') needs to be minimized. Specifically, after the next-hop node j of a node i in the network is found by the ε-greedy algorithm, only the entry with the minimum Q value in node j's Q table (i.e. min_k Q_jk) is needed for the update, and the Q_ij value thus calculated is the optimal value for the current stage.
Example 2.
The specific process of Q learning is described taking fig. 3 as an example: a path satisfying the QoS requirements is to be found from v1 to v7. Each node corresponds to a state; in the network shown in fig. 3 the number of states is 7. In each state, the actions are defined as the selectable next hops of the node: taking v1 as an example, the action set size is 3, the three selectable actions being v2, v3 and v4. Three Q values are defined in each state, describing the delay, packet loss rate and bandwidth from the current node to the destination node, respectively. v1 first sends a data packet using the ε-greedy strategy and judges, according to the Q values of v2, v3 and v4, to which node the next hop of the data packet should be sent. If the packet loss rate and bandwidth requirements are met when v2 is selected (judged from Q12(e) and Q12(v)), Q12(d) is updated; v2 then repeats the same steps until v7 is reached, completing one iteration.
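The ε-greedy rule used in the walk-through can be sketched as follows: with probability ε a random neighbor is explored, otherwise the neighbor with the smallest delay Q value is exploited. The function and variable names are illustrative assumptions.

```python
# Sketch of the ε-greedy next-hop rule: explore a random neighbour with
# probability ε, otherwise exploit the neighbour with the lowest delay Q value.
import random

def epsilon_greedy_next_hop(q_d, neighbors, epsilon, rng=random):
    if rng.random() < epsilon:
        return rng.choice(neighbors)             # explore
    return min(neighbors, key=lambda j: q_d[j])  # exploit

q_d = {"v2": 11.0, "v3": 8.0, "v4": 14.0}        # delay Q values at v1
hop = epsilon_greedy_next_hop(q_d, ["v2", "v3", "v4"], epsilon=0.0)
assert hop == "v3"  # ε = 0 always exploits the minimum-Q neighbour
```

Embodiment 5 below studies how the choice of ε (fixed versus linearly varying) affects the convergence speed of exactly this rule.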
A large amount of node and link information is needed in the network environment. If all of it were stored in the SDN controller, it would occupy a large amount of the controller's cache space, and the complexity of updating and accessing the information would be on the order of O(|V||E|); taking a complete graph as an example, the complexity is O(|V|³), which would greatly increase the pressure on the SDN controller.
In view of this, and considering that each node has a certain computing power and storage space, the invention adopts a distributed method in which each user independently stores the link conditions between itself and its neighbors and updates the Q values. That is, each node independently calculates its Q value to v7 in the same way using RREQ data packets, while the Q values and link QoS values of other nodes are obtained through interaction between nodes and used as the criterion for action selection, so that training is faster and complexity lower.
Example 3.
It is assumed that a service flow in the current network initiates a routing request with source node s and destination node d. Q_ij(d), Q_ij(e) and Q_ij(v) respectively represent the accumulated link delay, accumulated packet loss rate and path rate from the current node to the destination node d, and ∞ represents infinity. Each node maintains these 3 types of Q-value information for itself; i denotes a neighbor node of node j, and l denotes the experimental round.
It is known that: the source node s, the destination node d, the service type (low-delay service is taken as an example here), and the QoS requirements of the traffic flow.
The goal is to obtain: the path-selection scheme for this traffic flow.
Initialization stage:
Q_ij(d) = ∞, Q_ij(e) = 1, Q_ij(v) = 0 // initialize Q values
Online learning stage: (the pseudo-code of the online learning loop and of the output step is given as figures in the original filing)
example 4.
This embodiment introduces the method by which the network copes with link fluctuations. Each node i ∈ V is regarded as an independent agent with a set of neighbor users N_i. Within a fixed iteration period, a node sends a Route Request message (RREQ) according to the ε-greedy rule to probe the QoS values of the wireless links between itself and its neighboring users.
A given node i in the network is set, according to its own carrying and computing capacity, to maintain the information of a number of neighbors (all neighbors can be updated when their number is not large), selecting k neighbors each time an RREQ is sent. On receiving the RREQ, a neighbor node returns a Route Reply message (RREP) carrying the instantaneous state of the link as probed by the RREQ (including the bit error rate e_ij, delay d_ij and rate v_ij) together with the Q values stored by the neighbor node, Q_jk(d), Q_jk(e) and Q_jk(v); the link state is thus returned to node i through the RREP, completing the link-state update.
After node i has selected k neighbors and sent RREQs, it receives the RREPs sent back by the neighbor nodes and updates the QoS values of the links i-j using the information carried in the RREPs.
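Folding an RREP's instantaneous measurements into the stored link-state expectations can be sketched as follows. The patent's exact update formulas appear only as figures in the original filing, so an exponential moving average is assumed here purely for illustration; the function name and α parameter are hypothetical.

```python
# Sketch (assumed update rule): blend the stored link-state expectation with
# the instantaneous values carried by an RREP using an exponential moving
# average.  The patent's own formulas are given as figures, not reproduced here.
def update_link_state(stored, rrep, alpha=0.1):
    """stored / rrep: dicts with keys 'delay', 'error', 'rate'."""
    return {k: (1 - alpha) * stored[k] + alpha * rrep[k] for k in stored}

state = {"delay": 10.0, "error": 0.10, "rate": 200.0}
rrep = {"delay": 20.0, "error": 0.20, "rate": 100.0}
state = update_link_state(state, rrep, alpha=0.5)
assert state["delay"] == 15.0 and state["rate"] == 150.0
```

A small α weights the stored expectation heavily (smooth but slow to react); discarding the stored values after large fluctuations, as described below, corresponds to restarting this average.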
Off-line stage:
Initialization:
d_ij(0) = 0, e_ij(0) = 0, v_ij(0) = 0
(the iterative update formulas for the link-state expectations are given as figures in the original filing)
As can be seen from the above, if the network fluctuates greatly over a period of time, the currently stored expected values of the link state can be discarded and the link-state calculation performed anew.
Example 5.
The algorithm was simulated using python. Based on the system model, the following parameters are set for the simulation environment:
Number of nodes: 50; area size: 1000 km × 1000 km; node communication range: 300 km; link error rate: normally distributed; link rate: uniformly distributed over 50 kbps to 450 kbps; link delay: normally distributed; source node, destination node: random integers from 0 to 49; service requirements: randomly generated.
The error rate of a link is set as a random number following a normal distribution whose parameters are related to the degree of the node (its number of neighbors), and the delay of a link is likewise a normally distributed random number whose parameters are related to the length of the link. To generate a dynamic environment, some dynamism is added to the link QoS values while the link connection state remains unchanged: the QoS values vary within a small range, simulating link fluctuation in a real scenario.
In the simulation, each node is defined as a python class with the following attributes: node position, neighbor node set, and Q-value tables for the different QoS metrics, where each type of Q-value table of a node is stored as a list whose length equals the number of its neighbors. The adjacency matrix (a two-dimensional array) of the network environment map is stored, and the positions of the 50 nodes are obtained by randomly generating horizontal and vertical coordinates within the 1000 km × 1000 km area; the resulting network map is shown in fig. 4.
In fig. 4, the source node of the traffic flow is set to node 2 and the destination node to node 20; their locations are marked in the figure. To facilitate the verification of convergence, the QoS requirements of the traffic flow are first set as: rate requirement ν ≥ 50 kbps and error rate requirement e ≤ 1. Under this requirement, the problem reduces to an unconstrained minimum-delay routing problem.
Fig. 5 shows the convergence process of the algorithm under different values of ε, and fig. 6 shows the convergence details within the first 7000 iterations for a linearly varying ε and for ε = 0.5; the algorithm can be considered converged when the Q value no longer changes over a period of time. When ε is small, the algorithm explores with small probability and exploits with large probability, so network information cannot be fully acquired in the initial stage of system operation and convergence is slow; the convergence speed of the algorithm increases with ε. As can be seen in fig. 6, the fastest convergence is obtained with a linearly varying ε tied to the iteration round, reaching convergence after about 2000 iterations. After the algorithm converges, the source node starts to use the Q values, selecting at each node the neighbor with the smallest Q value as the next hop; the resulting path is [2, 36, 49, 23, 20], which is the lowest-delay route in this network, with a delay of 16.9647 ms.
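The path-extraction step just described (repeatedly moving to the neighbor with the smallest Q value) can be sketched as follows. The small Q table below is an illustrative assumption chosen to reproduce the reported route shape, not actual simulation data.

```python
# Sketch: after convergence, the source extracts the route by greedily
# following the minimum-Q neighbour at each hop until the destination.
def extract_path(q_tables, source, dest, max_hops=50):
    path, node = [source], source
    while node != dest and len(path) <= max_hops:   # max_hops guards cycles
        node = min(q_tables[node], key=q_tables[node].get)
        path.append(node)
    return path

q_tables = {  # per-node delay Q values (illustrative)
    2: {36: 5.1, 10: 9.7},
    36: {49: 3.2, 2: 20.0},
    49: {23: 2.8, 36: 15.0},
    23: {20: 1.9, 49: 12.0},
}
assert extract_path(q_tables, source=2, dest=20) == [2, 36, 49, 23, 20]
```

Because every node holds its own Q table, this extraction needs no central controller: each hop's decision is local, matching the distributed design of the method.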
Example 6.
As shown in fig. 7, paths from node 32 (source) to node 20 (destination) are found for traffic flows under different bit error rate and rate requirements using the method of the present invention; the simulation data gives the detailed QoS requirement settings and algorithm results:
QoS requirements of traffic flow 1: γ ≤ 0.15, ν ≥ 100 kbps; actual routing result: γ = 0.119, ν = 106.78 kbps, δ = 19.294 ms.
QoS requirements of traffic flow 2: γ ≤ 0.15, ν ≥ 120 kbps; actual routing result: γ = 0.137, ν = 169.497 kbps, δ = 19.74 ms.
QoS requirements of traffic flow 3: γ ≤ 0.25, ν ≥ 250 kbps; actual routing result: γ = 0.238, ν = 272.289 kbps, δ = 23.769 ms.
QoS requirements of traffic flow 4: γ ≤ 0.75, ν ≥ 450 kbps; actual routing result: γ = 1, ν = 0, δ = ∞.
QoS requirements of traffic flow 5: γ ≤ 0.05, ν ≥ 50 kbps; actual routing result: γ = 1, ν = 0, δ = ∞.
For traffic flows 1, 2 and 3, the present invention finds the route with the shortest delay under the given requirements; the specific paths are shown in fig. 7. Traffic flow 3 has tighter QoS constraints than flows 1 and 2, so its routing path is longer, has more hops, and its delay is noticeably larger. For traffic flows 4 and 5, no path satisfying the requirements exists in the network, so the algorithm returns an infinite delay: for flow 4 the network cannot meet the required communication rate, and for flow 5 it cannot meet the required error rate.
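The infeasibility reported for flows 4 and 5 follows directly from the path-level QoS aggregation used by the model: delay is the sum of link delays, loss is 1 − Π(1 − R), and rate is the bottleneck minimum. A minimal Python sketch, with hypothetical helper names:

```python
import math

def path_qos(links):
    """Aggregate per-link QoS along a path.
    Each link is a tuple (delay_ms, loss_rate, rate_kbps).
    Path delay is the sum of link delays, path loss is
    1 - prod(1 - loss), and path rate is the minimum link rate."""
    delay = sum(d for d, _, _ in links)
    loss = 1.0 - math.prod(1.0 - r for _, r, _ in links)
    rate = min(v for _, _, v in links)
    return delay, loss, rate

def feasible(links, gamma, nu):
    """Check a path against the QoS limits loss <= gamma and rate >= nu.
    When no feasible path exists, report delta = infinity, as the
    method does for flows 4 and 5."""
    delay, loss, rate = path_qos(links)
    if loss <= gamma and rate >= nu:
        return delay, loss, rate
    return math.inf, 1.0, 0.0
```

For a two-link path with per-link loss 0.05, for example, the path loss is 1 − 0.95² = 0.0975, which passes a γ ≤ 0.15 requirement but fails γ ≤ 0.05.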
Fig. 8 shows the case of partial node loss in the network, and fig. 9 the case of partial link failure. In the simulation the failures are set to occur at iteration 7000. As the figures show, when some nodes or links in the network become abnormal, the method of the present invention senses the change and adaptively recovers the route. Because each node stores Q value information for every action, if a node or link on the optimal route fails, the method selects an alternative path to forward the traffic, enhancing network survivability.
To evaluate QoS performance, the distributed Q learning method (QLRA) is compared with the DSDV routing algorithm. From source node s to destination node d, QLRA obtains a path with delay 13.6436 ms, bit error rate 0.1140 and rate 131.0551 kbps, while DSDV obtains delay 8.3547 ms, bit error rate 0.1274 and rate 57.2383 kbps. DSDV uses hop count as its path metric, so its routing information can satisfy only the single QoS requirement of hop count; when the number of QoS requirement types of a service grows, DSDV cannot satisfy them all. When the service requires a bit error rate ≤ 0.12 and a communication rate ≥ 50 kbps, DSDV cannot provide a route meeting the requirements and can only return its shortest-delay result.
Fig. 10 shows the probability of correct traffic delivery at different interaction frequencies in a dynamic network environment. The QLRA method senses changes in the network environment, so its delivery accuracy gradually increases with the learning rounds; DSDV can only store the most recently obtained local topology information, so its accuracy cannot be guaranteed, and it must exchange information and rebuild its routing table again every time the environment changes. Fig. 10(a) shows the delivery success rate at a high interaction frequency and fig. 10(b) at a low interaction frequency. The success rates of both algorithms fall, to different degrees, as the interaction frequency decreases. However, because QLRA updates its information in return for the long-term expectation of the environment state, it still learns the network's trend of change to some degree even when inter-node interaction becomes less frequent, and thus preserves a certain accuracy. The accuracy of DSDV drops more sharply, because it can only guarantee that the link information exchanged at a given moment is correct, so dynamic environment changes degrade it heavily.
The DSDV algorithm exchanges and acquires routing information by flooding, which produces a large number of invalid messages in the network and causes the broadcast storm problem. In addition, the routing table maintained by DSDV does not provide a standby route for a node to forward messages. When the network environment fluctuates, for example when some links or nodes in the network are damaged, DSDV cannot guarantee the service delivery rate, whereas the QLRA method of the present invention can select a backup path for forwarding according to the Q value information of the other nodes in its Q table.
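The backup-path behaviour attributed to QLRA above can be illustrated with a small sketch: because a node keeps one Q value per neighbor, it can fall back to the next-best surviving neighbor when the preferred next hop fails. The function and variable names here are illustrative, not from the patent:

```python
def next_hop_with_fallback(q_values, neighbors, failed):
    """Pick the neighbor with the smallest Q value; if the best
    next hop (or its link) has failed, fall back to the next-best
    surviving neighbor instead of dropping the packet."""
    alive = [n for n in neighbors if n not in failed]
    if not alive:
        return None  # no surviving neighbor: delivery fails
    return min(alive, key=lambda n: q_values[n])
```

A hop-count protocol that stores only a single next hop per destination has no such per-neighbor estimate to fall back on, which matches the survivability contrast drawn above.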
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A distributed intelligent routing method for unmanned aerial vehicle network slices, characterized by comprising the following steps:
s1: modeling an unmanned aerial vehicle network as a network model; the network model is a weighted undirected graph G(V, E, W), where V is the set of communication nodes in the network, E is the set of communication links in the network, and W represents the weights of the links in the network;
s2: setting constraints on the network model, wherein the constraints comprise: time delay limit, rate limit, packet loss rate limit;
s3: when the slice is oriented to low time delay, building the constrained network model into a multi-constraint optimization model; the multi-constraint optimization model is:
minimize Delay = E[Σ_{i,j=1…n} D(L_ij)]_t
subject to Error = E[1 − Π_{i,j=1…n} (1 − R(L_ij))]_t ≤ γ,
TransRate = E[min_{i,j=1…n} V(L_ij)]_t ≥ ν,
L_ij ∈ p = (L_si, …, L_ij, …, L_jd);
where E[θ]_t denotes the expectation of θ over the duration t of the traffic flow; Error denotes the packet loss rate of the traffic flow; TransRate denotes the transmission rate of the link; L_ij denotes a link on the path p whose start node is the source node s and whose end node is the destination node d; and minimize Delay expresses the optimization goal of minimizing the transmission delay of the path;
s4: solving the multi-constraint optimization model through a reinforcement learning model to obtain a solved value, wherein each communication node independently stores the link conditions of the communication node and the neighbor nodes and updates the link conditions in real time in the solving process;
s5: and carrying out dynamic routing according to the solved value and the link condition.
2. The distributed intelligent routing method for unmanned aerial vehicle network slices of claim 1, wherein the technologies used by the communication link include: TDMA, CSMA, or polling.
3. The distributed intelligent routing method for unmanned aerial vehicle network slices of claim 1, wherein the constraint is expressed as QoS = (δ, ν, γ), where δ denotes the time delay, ν denotes the rate and γ denotes the packet loss rate, and the constraints comprise:
Σ_{i,j=1…n} D(L_ij) ≤ δ,
min_{i,j=1…n} V(L_ij) ≥ ν,
1 − Π_{i,j=1…n} (1 − R(L_ij)) ≤ γ;
where L_ij represents the link from communication node i to communication node j; D(L_ij) represents the time delay of link L_ij; V(L_ij) represents the rate of link L_ij; and R(L_ij) represents the packet loss rate of link L_ij.
4. The distributed intelligent routing method for unmanned aerial vehicle network slice according to claim 1, wherein the communication node measures the quality of the link QoS between the communication node and the destination node by using a Q value, and the step S5 includes the following sub-steps:
s51: defining a communication node for sending a data packet as a data packet node;
s52: the data packet node sends a data packet to a neighbor node;
s53: judging a communication node of a next hop of a data packet according to the Q value of the neighbor node and the link QoS, and taking the communication node of the next hop as a data packet node;
s54: repeating the steps S52-S54 until the data packet node is the destination node.
5. The distributed intelligent routing method for unmanned aerial vehicle network slices of claim 4, wherein the packet node independently calculates a Q value from the packet node to a destination node.
6. The distributed intelligent routing method for unmanned aerial vehicle network slices of claim 1, wherein when the network fluctuates greatly, the communication node discards the currently stored link condition and performs the link condition calculation again.
7. The distributed intelligent routing method for unmanned aerial vehicle network slices of claim 1, wherein the reinforcement learning employs a value iteration method.
8. The distributed intelligent routing method for unmanned aerial vehicle network slices of claim 1, wherein the routing problem is modeled as a Markov Decision Process (MDP), the MDP model comprising: a state set, an action set, a probability transition matrix, a reward matrix and a discount factor, the discount factor being used to calculate the cumulative reward.
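As a rough illustration of the distributed Q-value maintenance described in claims 4, 5 and 8, the following Python sketch follows the classic Q-routing update, in which each node keeps one delay estimate per neighbor and refines it from the delay target reported back by that neighbor. The class name, parameters and exact update rule are assumptions and may differ from the patented QLRA method:

```python
class QLRANode:
    """Minimal sketch of one node in a distributed Q-routing scheme.
    self.q[n] estimates the delay to the destination when forwarding
    via neighbor n; smaller is better."""

    def __init__(self, neighbors, alpha=0.5, discount=1.0):
        self.q = {n: 0.0 for n in neighbors}  # optimistic initial estimates
        self.alpha = alpha        # learning rate (assumed value)
        self.discount = discount  # discount factor from the MDP model

    def best_next_hop(self):
        """Exploit: forward to the neighbor with the smallest Q value."""
        return min(self.q, key=self.q.get)

    def update(self, via, link_delay, reported_min_q):
        """After forwarding via `via`, that neighbor reports its own
        smallest Q toward the destination; move our estimate toward
        link_delay + discount * reported_min_q."""
        target = link_delay + self.discount * reported_min_q
        self.q[via] += self.alpha * (target - self.q[via])
```

Because each node updates only its own table from locally exchanged information, no global topology flooding is required, which is the contrast drawn with DSDV in the description above.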
CN201911395351.6A 2019-12-30 2019-12-30 Distributed intelligent routing method for unmanned aerial vehicle network slice Active CN111065105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911395351.6A CN111065105B (en) 2019-12-30 2019-12-30 Distributed intelligent routing method for unmanned aerial vehicle network slice


Publications (2)

Publication Number Publication Date
CN111065105A CN111065105A (en) 2020-04-24
CN111065105B true CN111065105B (en) 2021-06-11

Family

ID=70304598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911395351.6A Active CN111065105B (en) 2019-12-30 2019-12-30 Distributed intelligent routing method for unmanned aerial vehicle network slice

Country Status (1)

Country Link
CN (1) CN111065105B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112202848B (en) * 2020-09-15 2021-11-30 中国科学院计算技术研究所 Unmanned system network self-adaptive routing method and system based on deep reinforcement learning
CN112161630B (en) * 2020-10-12 2022-07-15 北京化工大学 AGV (automatic guided vehicle) online collision-free path planning method suitable for large-scale storage system
CN112383482B (en) * 2020-11-16 2021-10-08 北京邮电大学 Dynamic Q value route calculation method and device based on data plane
CN112672110B (en) * 2020-12-16 2023-05-26 深圳市国电科技通信有限公司 Unmanned aerial vehicle inspection real-time video transmission system based on network slicing
CN112822109B (en) * 2020-12-31 2023-04-07 上海缔安科技股份有限公司 SDN core network QoS route optimization method based on reinforcement learning
CN112887999B (en) * 2021-01-27 2022-04-01 重庆邮电大学 Intelligent access control and resource allocation method based on distributed A-C
CN113347102B (en) * 2021-05-20 2022-08-16 中国电子科技集团公司第七研究所 SDN link surviving method, storage medium and system based on Q-learning
CN115843037A (en) * 2021-08-17 2023-03-24 华为技术有限公司 Data processing method and device
CN114499648B (en) * 2022-03-10 2024-05-24 南京理工大学 Unmanned aerial vehicle cluster network intelligent multi-hop routing method based on multi-agent cooperation
CN114726434B (en) * 2022-03-18 2023-09-19 电子科技大学 Millisecond-level rapid path-finding method suitable for large-scale optical network
CN116208560B (en) * 2023-03-03 2024-04-30 济南大学 SDN data center network load balancing method and system for elephant flow

Citations (5)

Publication number Priority date Publication date Assignee Title
CN104168620A (en) * 2014-05-13 2014-11-26 北京邮电大学 Route establishing method in wireless multi-hop backhaul network
CN104581862A (en) * 2014-12-27 2015-04-29 中国人民解放军63655部队 Measurement and control communication method and system based on low-altitude unmanned aerial vehicle self-network
KR20150123120A (en) * 2014-04-24 2015-11-03 울산대학교 산학협력단 Apparatus for providing qos in wireless ad hoc networks
CN107517158A (en) * 2017-08-29 2017-12-26 北京航空航天大学 The design method of Communication Network for UAVS joint route agreement
CN108521375A (en) * 2018-04-17 2018-09-11 中国矿业大学 The transmission of the network multi-service flow QoS based on SDN a kind of and dispatching method

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US20080008202A1 (en) * 2002-10-31 2008-01-10 Terrell William C Router with routing processors and methods for virtualization
CN102393747B (en) * 2011-08-17 2015-07-29 清华大学 The collaborative interactive method of unmanned plane cluster
CN103246204B (en) * 2013-05-02 2016-01-20 天津大学 Multiple no-manned plane system emulation and verification method and device
US9565689B2 (en) * 2013-10-23 2017-02-07 Texas Instruments Incorporated Near-optimal QoS-based resource allocation for hybrid-medium communication networks
KR102339925B1 (en) * 2017-01-10 2021-12-16 한국전자통신연구원 Apparatus for controlling multi-drone networks and method for the same
CN108040353A (en) * 2017-12-18 2018-05-15 北京工业大学 A kind of unmanned plane swarm intelligence Geographic routing method of Q study


Non-Patent Citations (3)

Title
A QoS routing algorithm for group communications in Cognitive Radio ad hoc networks;Liming Xie; Jingjing Xi;《2011 International Conference on Mechatronic Science, Electric Engineering and Computer (MEC)》;20110923;1953-1956 *
Design and Implementation of a QoS Adaptive Routing System Based on Overlay Networks; Lü Baoping; China Master's Theses Full-text Database, Information Science and Technology (monthly); 20110515; full text *
Research on Routing Protocols and QoS Routing Algorithms in Tactical MANETs; Du Qingsong; China Doctoral Dissertations Full-text Database, Information Science and Technology (monthly); 20170215; full text *

Also Published As

Publication number Publication date
CN111065105A (en) 2020-04-24

Similar Documents

Publication Publication Date Title
CN111065105B (en) Distributed intelligent routing method for unmanned aerial vehicle network slice
Jung et al. QGeo: Q-learning-based geographic ad hoc routing protocol for unmanned robotic networks
Zheng et al. A mobility and load aware OLSR routing protocol for UAV mobile ad-hoc networks
CN111416771B (en) Method for controlling routing action based on multi-agent reinforcement learning routing strategy
De Rango et al. Link-stability and energy aware routing protocol in distributed wireless networks
Magaia et al. A multi-objective routing algorithm for wireless multimedia sensor networks
CN109962773B (en) Wide-area quantum cryptography network data encryption routing method
CN108684063B (en) On-demand routing protocol improvement method based on network topology change
Hendriks et al. Q 2-routing: A QoS-aware Q-routing algorithm for wireless ad hoc networks
CN111130853B (en) Future route prediction method of software defined vehicle network based on time information
KR101506586B1 (en) Method for routing and load balancing in communication networks
Harshavardhana et al. Power control and cross-layer design of RPL objective function for low power and lossy networks
CN107094112A (en) Bandwidth constraint multicast routing optimization method based on drosophila optimized algorithm
CN105848238A (en) IPv6 routing method of wireless sensor networks based on multiple parameters
Arkian et al. FcVcA: A fuzzy clustering-based vehicular cloud architecture
Long et al. Research on applying hierachical clustered based routing technique using artificial intelligence algorithms for quality of service of service based routing
CN113727408A (en) Unmanned aerial vehicle ad hoc network improved AODV routing method based on speed and energy perception
Gallardo et al. Multipath routing using generalized load sharing for wireless sensor networks
Bokhari et al. AMIRA: interference-aware routing using ant colony optimization in wireless mesh networks
Wang et al. A reliability-aware adaptive greedy-multicast routing protocol for 3D highly dynamic networks
Dong et al. Topology control mechanism based on link available probability in aeronautical ad hoc network
Ziane et al. Inductive routing based on dynamic end-to-end delay for mobile networks
Garai et al. A novel architecture for qos provision on vanet
Onwuegbuzie et al. Shortest Path Priority-based RPL (SPPB-RPL): The Case of a Smart Campus
KR101639149B1 (en) A method of information transmission using location information including measurement errors in wireless mobile networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant