CN111629415B - Opportunistic routing protocol design method based on Markov decision process model - Google Patents
- Publication number: CN111629415B
- Application number: CN202010331293.7A
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04W 40/10 — Communication route or path selection based on wireless node resources, based on available power or energy
- H04W 40/12 — Communication route or path selection based on transmission quality or channel quality
- H04W 4/38 — Services specially adapted for particular environments, situations or purposes, for collecting sensor information
- Y02D 30/70 — Reducing energy consumption in wireless communication networks
Abstract
The invention discloses an opportunistic routing protocol based on a Markov decision process model. First, the environmental link quality and the packet reception rate are evaluated: packet reception rate data at given RSSI values and LQI mean values, and at different communication distances, are acquired to establish a sample space, and curve-family regression fitting of the LQI mean values against the packet reception rate data yields an estimation formula for the packet reception rate. Wireless sensor nodes are then scattered to construct a wireless sensor network; each sensor node periodically broadcasts and receives detection packets and establishes a neighbor information table; each sensor node establishes a candidate node set; the node holding the valid data packet broadcasts it, each candidate node that receives the packet recalculates its corresponding state value according to the value iteration formula, and the sender selects the node returning the largest corresponding state value as the next-hop forwarding node. The invention optimizes and balances the energy use of the wireless sensor network.
Description
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to a method for designing an opportunistic routing protocol based on a Markov decision process model.
Background
A wireless sensor network is a network formed by many sensor nodes in a multi-hop, self-organizing manner, and it has very broad application prospects. Within the extensive research on wireless sensor networks, routing protocols have always been a key topic, since reasonable routing design can effectively improve network performance. Because sensor nodes are scattered randomly in unattended areas where batteries are difficult to replace, saving node energy and balancing network energy use are unavoidable problems.
Conventional wireless sensor network routing protocols select one or more optimized fixed paths before data transmission begins, and data packets travel along these predetermined paths. In opportunistic routing, by contrast, every node that receives a data packet may act as a relay node, and the routing path from source to destination is not fixed. Each node in the network acquires neighbor information and network parameters by periodically sending and receiving detection packets, and selects suitable neighbor nodes as candidates to form a candidate node set (CRS). During forwarding, the node selects the optimal next-hop forwarding node from the candidate nodes that successfully received the data packet. This process repeats until the packet is forwarded to the destination node.
The drawback of traditional routing protocols is that the routing path is fixed and cannot adapt well to changes in the network environment, so network performance is strongly affected by changes in link quality, node residual energy, node position, and similar factors. If the designated next-hop node fails to receive a packet during a transmission, the sender retransmits it until that node succeeds, while other nodes that did receive the packet do not forward it, wasting network resources. The drawback of opportunistic routing is that the packet is forwarded hop by hop toward the destination and the forwarding node has no global view of the network; it must make routing decisions from the position information of its neighbors and the destination node alone to keep the packet moving toward the destination. In this case, positioning errors or unknown changes in node positions severely degrade protocol performance. Beyond position-information errors, an opportunistic routing protocol that considers only neighbor-node parameters seeks merely the single-step optimum and cannot achieve a global optimum. These problems limit further improvement of opportunistic routing performance; overcoming the dependence on position information and selecting the best forwarding node from a global perspective remain major difficulties.
Disclosure of Invention
The invention aims to provide an opportunistic routing protocol based on a Markov decision process model, solving two problems of the prior art that degrade routing performance: reliance on node position information and the inability to reach a globally optimal solution. The invention designs the opportunistic routing protocol through the Markov decision process of reinforcement learning, so that the energy use of the wireless sensor network is optimized and balanced and the network life cycle is extended.
The technical solution for realizing the purpose of the invention is as follows:
an opportunistic routing protocol based on a markov decision process model comprising the steps of:
Step 1, evaluating the environmental link quality and the packet reception rate: acquiring packet reception rate data at given RSSI values and LQI mean values and at different communication distances, establishing a sample space, and performing curve-family regression fitting on the LQI mean values and the packet reception rate data to obtain an estimation formula for the packet reception rate;
Step 6, repeating the data packet forwarding process of the step 5 until the data packet is forwarded to the sink node; and finally, obtaining an optimal routing path by continuously carrying out data packet forwarding and state value iteration.
Compared with the prior art, the invention has the remarkable advantages that:
(1) The invention applies the Markov decision process of classical reinforcement learning, offering a new approach to wireless sensor network routing protocol design. When modeling the opportunistic routing optimization problem as a Markov Decision Process (MDP), the probability transition matrix P is derived from the packet reception rates between sensor nodes, and the optimal solution of the MDP model is then found by dynamic programming, making the algorithm more efficient and better-converging.
(2) The invention calculates the packet receiving rate between the sensor nodes in real time by utilizing the Received Signal Strength Indication (RSSI) and the Link Quality Indication (LQI) provided by the network physical layer, so that the algorithm can adapt to the change of the network state.
(3) The invention designs the opportunistic routing protocol with a reinforcement-learning Markov decision process model; it does not depend on the position information of the sensor nodes and, after continuous learning, can find a suitable forwarding path from a globally optimal perspective.
Drawings
FIG. 1 is a schematic view of the present invention
FIG. 2 is a schematic diagram of a sensor network layout
FIG. 3 is a schematic diagram of state transition matrix computation
FIG. 4 is a plot of the relation between the action reward R_s and the current node's remaining-energy ratio k_E
FIG. 5 is a schematic diagram of a learned opportunistic routing path
FIG. 6 is a schematic diagram of an end-of-life wireless sensor network
Detailed Description
The invention is further described with reference to the drawings and specific embodiments.
The invention provides an opportunity routing protocol based on a Markov decision process model, which utilizes reinforcement learning to find an energy optimal forwarding path, and comprises the following specific steps:
A certain area is selected as the data acquisition area. Two sensor nodes carry out multiple communication experiments in this area at different communication distances, acquiring packet reception rate data at given RSSI values, and a sample space of LQI mean values and packet reception rate data at different communication distances is established. When link communication quality is good, the RSSI value correlates best with the packet reception rate, so the RSSI value is used to estimate it: when RSSI ≥ −70 dBm, the packet reception rate is 100%; when −75 dBm ≤ RSSI < −70 dBm, it is 99%; when −80 dBm ≤ RSSI < −75 dBm, it is 98%; when −85 dBm ≤ RSSI < −80 dBm, it is (RSSI + 177)%; when RSSI < −85 dBm, the packet reception rate is estimated from the LQI mean value, and curve-family regression fitting of the LQI mean values against the packet reception rate data yields the estimation formula. By estimating the packet reception rate in real time from the RSSI and LQI information carried in data packet transmissions, the routing protocol can adapt to changes in the network.
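The piecewise estimator above can be sketched in Python. The thresholds and the linear segment follow the text; the LQI regression used below −85 dBm is left as a caller-supplied fitted model (`lqi_fit`), since the fitted formula itself is not reproduced here, and the `(RSSI + 177)%` segment is a reconstruction of the garbled original.

```python
def estimate_prr(rssi_dbm, lqi_mean=None, lqi_fit=None):
    """Estimate packet reception rate (PRR, in [0, 1]) from RSSI,
    falling back to an LQI-based regression below -85 dBm.

    `lqi_fit` stands in for the fitted curve-family regression from
    step 1, which is not reproduced in the text (an assumption).
    """
    if rssi_dbm >= -70:
        return 1.00
    if rssi_dbm >= -75:
        return 0.99
    if rssi_dbm >= -80:
        return 0.98
    if rssi_dbm >= -85:
        # Linear segment: PRR = (RSSI + 177) percent
        return (rssi_dbm + 177) / 100.0
    # Below -85 dBm, RSSI correlates poorly with PRR,
    # so the fitted LQI regression is used instead.
    if lqi_fit is not None and lqi_mean is not None:
        return max(0.0, min(1.0, lqi_fit(lqi_mean)))
    raise ValueError("LQI mean and fitted model required below -85 dBm")
```

A node would call this on every received detection packet to keep its neighbor table's reception-rate estimates current.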
As shown in fig. 2, a plurality of wireless sensor nodes are randomly scattered in the selected data acquisition area to form a wireless sensor network, which includes a sink node responsible for collecting the data acquired by the common nodes in the area and uploading it to the network. The common nodes have limited energy, while the sink node has sufficient energy.
Each sensor node in the network periodically broadcasts a detection packet containing the node ID, the node's corresponding state value, the node's sleep/listening duty cycle, and the node's candidate node set. The corresponding state value evaluates the worth of the node as a data-packet forwarding node, and the candidate node set is the set of nodes that can receive and forward data packets from this node. Each sensor node receives detection packets from its neighbors, builds a neighbor information table, records the RSSI and LQI values observed on reception, and estimates the packet reception rate to each neighbor with the fitting formula obtained in step 1. Taking the sleep/listening cycle of a node into account, the probability L_ij that node j successfully receives a packet broadcast by node i is

L_ij = p_ij · k_jw

where p_ij is the packet reception rate between node i and node j, and k_jw is the listening-time duty cycle of node j. This step is performed periodically throughout the life cycle of the sensor network.
The sensor node sorts its neighbor nodes by their corresponding state values and sequentially selects neighbor nodes whose state values are greater than or equal to its own as candidate nodes, until the probability that a data packet it broadcasts is successfully received by at least one candidate node exceeds 90%, or no selectable neighbor node remains. The selected candidates constitute the candidate node set. If a data packet broadcast by a sensor node is not received by any candidate node, the node rebroadcasts it. As neighbor state values are updated, the node periodically repeats this step and rebuilds the candidate node set according to the new state values.
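Steps 3 and 4 above — computing L_ij and growing the candidate set until the cumulative reception probability crosses 90% — can be sketched as follows. The `(node_id, state_value, L_ij)` tuple layout is an illustrative assumption, not from the patent text.

```python
def reception_prob(p_ij, k_jw):
    """L_ij = p_ij * k_jw: probability that neighbor j, listening a
    fraction k_jw of the time, receives node i's broadcast."""
    return p_ij * k_jw

def build_candidate_set(own_value, neighbors, threshold=0.90):
    """Select candidates in decreasing state-value order until the
    probability that at least one of them receives the broadcast
    exceeds `threshold` (or eligible neighbors run out).

    `neighbors` is a list of (node_id, state_value, L_ij) tuples.
    """
    eligible = sorted(
        (n for n in neighbors if n[1] >= own_value),
        key=lambda n: n[1], reverse=True)
    crs, p_none = [], 1.0  # p_none: probability nobody receives it
    for node_id, value, l_ij in eligible:
        crs.append(node_id)
        p_none *= (1.0 - l_ij)
        if 1.0 - p_none > threshold:
            break
    return crs
```

For instance, two candidates with reception probabilities 0.8 and 0.7 already give a combined success probability of 1 − 0.2 · 0.3 = 0.94, so selection stops there.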
5.1 modeling opportunistic routing problems with a Markov decision process model:
In the opportunistic routing problem, the agent is the valid data packet that needs to be forwarded; the modeling follows the standard MDP formulation. S is the state set, with states denoted s: the different states of the valid data packet correspond to its residing at different sensor nodes, so each state s corresponds to a node, and the value of that state — the node's corresponding state value — measures the worth of the node as a forwarding node. A is the action set, with actions denoted a: an action broadcasts the data packet and selects the next-hop forwarding node according to some rule, and different actions differ only in that selection rule, which in turn yields different state transition probability matrices P. P is the state transition matrix, giving the probability that the valid data packet is at a given node after an action is taken; different actions produce different transition probabilities. R is the action reward: taking an action in a given state yields a corresponding reward.
5.2 calculating a state transition probability matrix P:
according to the invention, a state transition probability matrix P is obtained according to the packet receiving rate between nodes, and fig. 3 is a schematic diagram of state transition probability matrix calculation. In the graph, CRS i is a candidate node set of node i, and the node set has m candidate nodes and corresponds to a state value v (j) 1 )>v(j 2 )>v(j 3 )>v(j 4 )>…>v( m), wherein j1 、j 2 、j 3 、j 4 、j m Are candidate nodes for node i. The action taken by the node where the effective data packet is located is broadcasting the data packet, and the node with the largest state value is selected from candidate nodes receiving the data packet as a forwarding node of the next hop according to a greedy strategy, so that the effective data packet is transferred from the node i to the node j y And is composed of j y Probability of forwardingCan be calculated by the following formula:
wherein ,representing node j y Probability of successful reception of a node i broadcast packet, < >>Representing node j t The probability of successfully receiving a broadcast packet by node i, t is the amount of change from 1 to y-1, and y is the amount of change from 1 to m. Node j other than the candidate node x (x=m+1, m+2.,. N) will not act as a forwarding node for the node i broadcast packet, and therefore +.>Is the amount varying from m+1 to N. Probability calculated by node i>The ith row and the jth row of the state transition probability matrix P respectively 1 、j 2 、…、j m 、…、j N And calculating the value of the column and the value of the corresponding row of the state transition probability matrix P by each node to obtain the complete state transition probability matrix P.
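One row of P can be computed as sketched below. The self-transition term — the packet staying at node i when no candidate receives it, which triggers a rebroadcast as described in step 4 — is an assumption not spelled out in the matrix derivation itself.

```python
def transition_row(i, candidates, L, n_nodes):
    """Row i of the transition matrix P for the broadcast action.

    `candidates` lists node i's candidate set in decreasing
    state-value order; L[j] is the probability that candidate j
    receives i's broadcast.  The packet moves to the highest-valued
    candidate that receives it; if none does, it stays at i.
    """
    row = [0.0] * n_nodes
    p_all_fail = 1.0
    for j in candidates:
        # j forwards iff it receives the packet and every
        # higher-valued candidate before it failed to.
        row[j] = L[j] * p_all_fail
        p_all_fail *= (1.0 - L[j])
    row[i] = p_all_fail  # nobody received it: rebroadcast from i
    return row
```

Each row sums to 1 by construction, as a stochastic matrix requires.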
5.3 developing a reward function R:
In reinforcement learning, each action generates an action reward R; R_s^a denotes the reward obtained by taking action a in state s. In each state, the action set A consists of broadcasting the data packet and selecting the next-hop candidate by some rule. Since the broadcast is the actual physical action, all actions in the same action set yield the same reward, and the action reward depends only on the current state.
A well-designed action reward lets the reinforcement learning algorithm realize the intended optimization task. The ultimate goal of energy-aware opportunistic routing is to optimize network energy use and extend the network life cycle. This requires, on the one hand, saving network energy by forwarding the data packet to the destination along the shortest possible path, and, on the other hand, balancing network energy so that frequently used nodes do not exhaust their energy prematurely. To balance these two concerns, the invention formulates the action reward function R_s = −1 + f(k_E), where R_s is the reward for broadcasting a data packet in state s, k_E is the ratio of the current node's remaining energy to its initial energy, and f(k_E) is a function of that ratio. Broadcasting consumes energy, so all action rewards are negative. The base reward of −1 per transmission ensures that, after sufficient learning, the state value functions of nodes far from the destination node are smaller. f(k_E) captures energy balance: the lower the current node's remaining energy, the greater the cost of forwarding a packet and the smaller the action reward. Following this principle, the expression of f(k_E) is as follows:
The relationship between the action reward R_s and the current node's remaining-energy ratio k_E is shown in fig. 4.
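Since the exact expression of f(k_E) is given only in the original figure, the sketch below uses a hypothetical stand-in `default_f` that merely satisfies the stated property (less remaining energy gives a smaller, i.e. more negative, reward); it is not the patent's actual function.

```python
def action_reward(k_e, f=None):
    """R_s = -1 + f(k_E): base cost of one broadcast plus an
    energy-balance penalty depending on the node's remaining
    energy ratio k_e in [0, 1]."""
    if f is None:
        f = default_f
    return -1.0 + f(k_e)

def default_f(k_e):
    # Hypothetical stand-in: zero penalty at full energy,
    # growing penalty as k_e approaches 0.
    return -(1.0 - k_e)
```

Any monotone f with this shape preserves the intended behavior: depleted nodes become expensive relays, steering traffic toward fresher nodes.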
5.4, formulating an action strategy:
The action strategy in the Markov decision process model is a greedy strategy: in state s, take the optimal action, so that the state value of s after iteration is maximized. To explore the state space better, most algorithms inject some randomness into the action strategy, giving the agent a probability of taking random actions to find potentially better solutions. In the opportunistic routing problem, however, the reward function is negative, so every visited state acquires a negative state value while unvisited nodes keep their initial, higher state value; a purely greedy policy therefore automatically explores the unknown state space, preferentially using idle nodes to forward data packets. Through this continuous learning process, the data packet reaches the destination node along the path with the minimum forwarding cost. The algorithm therefore uses the greedy strategy as its action strategy.
5.5 the candidate nodes iterate corresponding state values and return the state values:
The node holding the valid data packet broadcasts it, and each candidate node that receives the packet recalculates its own state value according to the dynamic-programming value iteration formula

v_{k+1}(s) = max_{a∈A} ( R_s^a + γ Σ_{s'} P^a_{ss'} v_k(s') )

but, instead of immediately replacing the original state value, returns the new value to the source node. Here k denotes the k-th iteration and k+1 the (k+1)-th; v is a state value; s is the current state, i.e., the state corresponding to the candidate node that received the packet; s' is the next-time state; v_{k+1}(s) is the value of state s at iteration k+1 and v_k(s') the value of state s' at iteration k; a is an action available in the current state s and A is the action set; γ is the discount factor; R is the action reward, with R_s^a the reward for taking action a in state s; P is the state transition probability matrix, with P^a_{ss'} the probability that the state becomes s' at the next moment after taking action a in state s, whose value can be found in P; and max indicates that the action policy is a greedy policy.
And 5.6 selecting a next hop forwarding node.
The data-packet sending node receives the state values returned by the candidate nodes, selects the successfully returning node with the highest state value as the next-hop forwarding node, and broadcasts this selection. The candidate node chosen as forwarding node adopts the state value computed by the value iteration formula in step 5.5 as its new state value; the other candidate nodes do not update their state values and discard the received data packet.
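Because the action set at every state here reduces to the single broadcast action, the max over actions in the value iteration formula collapses to one term; one update, as performed by a candidate node, might look like this minimal sketch.

```python
def iterate_state_value(reward, gamma, p_row, values):
    """One dynamic-programming value-iteration update for state s:

        v_{k+1}(s) = max_a [ R_s^a + gamma * sum_{s'} P^a_{ss'} v_k(s') ]

    With a single broadcast action, the max is over one term.
    `p_row` is state s's row of P and `values` holds v_k for every
    state; the returned value is sent back to the source node."""
    return reward + gamma * sum(p * v for p, v in zip(p_row, values))
```

Repeated over many packet forwardings, these per-node updates converge toward the state values of the optimal routing policy, as step 6 describes.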
Step 6: repeat the data packet forwarding process of step 5 until the data packet is forwarded to the sink node. Through continuous data packet forwarding and state value iteration, an optimal routing path is finally obtained. Fig. 5 is a schematic diagram of a learned routing path. If a data packet has no path to the sink node, its transmission fails; when 30% of data packet transmissions fail within a sufficiently long period, the life cycle of the wireless sensor network is considered ended. Fig. 6 is a schematic diagram of the end of the network's life, where black filled circles indicate sensor nodes that have died from energy depletion.
Claims (1)
1. The opportunistic routing protocol design method based on the Markov decision process model is characterized by comprising the following steps of:
step 1, evaluating the environmental link quality and the packet receiving rate:
carrying out multiple communication experiments in a data acquisition area using two sensor nodes at different communication distances, acquiring packet reception rate data at given RSSI values, and establishing a sample space of LQI mean values and packet reception rate data at different communication distances; when RSSI ≥ −70 dBm, the packet reception rate is 100%; when −75 dBm ≤ RSSI < −70 dBm, the packet reception rate is 99%; when −80 dBm ≤ RSSI < −75 dBm, the packet reception rate is 98%; when −85 dBm ≤ RSSI < −80 dBm, the packet reception rate is (RSSI + 177)%; when RSSI < −85 dBm, the packet reception rate is estimated from the LQI mean value, and curve-family regression fitting of the LQI mean values against the packet reception rate data yields an estimation formula for the packet reception rate;
step 2, scattering wireless sensor nodes and constructing a wireless sensor network: the wireless sensor network comprises a sink node responsible for collecting the data acquired by the common nodes in the area and uploading it to the network;
step 3, each wireless sensor node periodically broadcasts and receives detection packets and establishes a neighbor information table, which comprises the neighbor node IDs, the neighbor nodes' corresponding state values, the neighbor nodes' sleep/listening duty cycles, the neighbor nodes' candidate node sets, and the packet reception rate from this node to each neighbor node;
step 4, the wireless sensor node establishes a candidate node set: the wireless sensor node sorts its neighbor nodes by their corresponding state values and sequentially selects neighbor nodes whose state values are greater than or equal to its own as candidate nodes, until the probability that a data packet it broadcasts is successfully received by at least one candidate node exceeds a set value, or no selectable neighbor node remains; the selected candidate nodes form the candidate node set;
step 5, solving the forwarding-node selection problem in opportunistic routing with a Markov decision process: the node holding the valid data packet broadcasts it, each candidate node that receives the packet recalculates its corresponding state value according to a value iteration formula, and the sender selects the node returning the largest corresponding state value as the next-hop forwarding node; solving the forwarding-node selection problem with the Markov decision process specifically comprises:
5.1 modeling the opportunistic routing problem by using a Markov decision process model;
5.2, calculating a state transition probability matrix P: a state transition probability matrix P is obtained from the packet reception rates between nodes; the probability P_{i,j_y} that the valid data packet is transferred from node i to candidate node j_y and forwarded by j_y (y = 1, 2, …, m) is calculated as

P_{i,j_y} = L_{i,j_y} · ∏_{t=1}^{y−1} (1 − L_{i,j_t})

where L_{i,j_y} is the probability that node j_y successfully receives node i's broadcast data packet, L_{i,j_t} is that of node j_t, and m is the number of candidate nodes in node i's candidate node set; the remaining nodes other than the candidates are j_x (x = m+1, m+2, …, N), with P_{i,j_x} = 0, where N is the total number of wireless sensor nodes in the network; the probabilities computed by node i fill row i of the state transition probability matrix P in columns j_1, j_2, …, j_m, …, j_N, and with every node computing the values of its corresponding row, the complete state transition probability matrix P is obtained;
5.3, formulating the action reward function:
the action reward function is R_s = −1 + f(k_E), where R_s is the action reward for broadcasting a data packet in state s, k_E is the ratio of the current node's remaining energy to its initial energy, and f(k_E) is a function of that ratio;
the expression of f(k_E) is designed as follows:
the less remaining energy the current node has, the greater the cost of forwarding a data packet and the smaller the action reward;
5.4, formulating an action strategy: adopting a greedy strategy, namely taking optimal action under the state s, so that the state value of the state s after iteration is maximized;
5.5 the candidate nodes iterate corresponding state values and return the state values:
the node holding the valid data packet broadcasts it, and each candidate node that receives the packet recalculates its own state value according to the dynamic-programming value iteration formula

v_{k+1}(s) = max_{a∈A} ( R_s^a + γ Σ_{s'} P^a_{ss'} v_k(s') )

but, instead of immediately replacing the original state value with it, returns the value to the source node; here k denotes the k-th iteration and k+1 the (k+1)-th, v is a state value, s is the current state, i.e., the state corresponding to the candidate node that received the packet, s' is the next-time state, v_{k+1}(s) is the value of state s at iteration k+1, v_k(s') is the value of state s' at iteration k, a is an action available in the current state s, A is the action set, γ is the discount factor, R is the action reward with R_s^a the reward for taking action a in state s, P is the state transition probability matrix with P^a_{ss'} the probability that the state becomes s' at the next moment after taking action a in state s, whose value can be found in the state transition probability matrix P, and max indicates that the action policy is a greedy policy;
5.6, selecting the next-hop forwarding node: the packet-sending node receives the state values returned by the candidate nodes and, among the candidates whose state values were successfully returned, selects the one with the highest state value as the next-hop forwarding node, then broadcasts the message;
step 6, repeating the packet-forwarding process of step 5 until the data packet reaches the sink node; through continuous packet forwarding and state-value iteration, the optimal routing path is finally obtained.
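The next-hop selection of steps 5.5–5.6 amounts to an argmax over the returned state values. A minimal sketch, with an assumed dictionary mapping candidate node id to its returned value (the patent does not prescribe this data structure):

```python
def select_next_hop(candidate_values):
    """Pick the next-hop forwarder from the state values candidates returned.

    candidate_values: dict mapping candidate node id -> the state value that
    node computed and returned after its value-iteration update.  The sender
    simply chooses the candidate with the highest returned state value.
    """
    if not candidate_values:
        return None  # no candidate answered; the sender must rebroadcast
    return max(candidate_values, key=candidate_values.get)
```

Repeating this selection hop by hop, as in step 6, drives the packet along nodes of monotonically improving state value until it reaches the sink.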
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010331293.7A CN111629415B (en) | 2020-04-24 | 2020-04-24 | Opportunistic routing protocol design method based on Markov decision process model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111629415A CN111629415A (en) | 2020-09-04 |
CN111629415B true CN111629415B (en) | 2023-04-28 |
Family
ID=72260539
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010331293.7A Active CN111629415B (en) | 2020-04-24 | 2020-04-24 | Opportunistic routing protocol design method based on Markov decision process model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111629415B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112702710A (en) * | 2020-12-22 | 2021-04-23 | 杭州电子科技大学 | Opportunistic routing optimization method based on link correlation in low duty ratio network |
CN112954769B (en) * | 2021-01-25 | 2022-06-21 | 哈尔滨工程大学 | Underwater wireless sensor network routing method based on reinforcement learning |
CN113950113B (en) * | 2021-10-08 | 2022-10-25 | 东北大学 | Internet of vehicles switching decision method based on hidden Markov |
CN114125984B (en) * | 2021-11-22 | 2023-05-16 | 北京邮电大学 | Efficient opportunistic routing method and device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105848247A (en) * | 2016-05-17 | 2016-08-10 | 中山大学 | Vehicular Ad Hoc network self-adaption routing protocol method |
2020-04-24 CN CN202010331293.7A patent/CN111629415B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN111629415A (en) | 2020-09-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111629415B (en) | Opportunistic routing protocol design method based on Markov decision process model | |
CN112469100B (en) | Hierarchical routing algorithm based on rechargeable multi-base-station wireless heterogeneous sensor network | |
CN108712767B (en) | Inter-cluster multi-hop routing control method with balanced energy consumption in wireless sensor network | |
CN104301965A (en) | Wireless sensor network inhomogeneous cluster node scheduling method | |
Haberman et al. | Overlapping particle swarms for energy-efficient routing in sensor networks | |
Barbato et al. | Resource oriented and energy efficient routing protocol for IPv6 wireless sensor networks | |
CN108566658B (en) | Clustering algorithm for balancing energy consumption in wireless sensor network | |
Micheletti et al. | CER-CH: combining election and routing amongst cluster heads in heterogeneous WSNs | |
CN114501576B (en) | SDWSN optimal path calculation method based on reinforcement learning | |
CN105764110B (en) | A kind of wireless sensor network routing optimization method based on immune clonal selection | |
CN116261202A (en) | Farmland data opportunity transmission method and device, electronic equipment and medium | |
CN113316214B (en) | Self-adaptive cooperative routing method of energy heterogeneous wireless sensor | |
Darabkh et al. | An innovative RPL objective function for broad range of IoT domains utilizing fuzzy logic and multiple metrics | |
CN106685819B (en) | A kind of AOMDV agreement power-economizing method divided based on node energy | |
Fatima et al. | Route discovery by cross layer approach for MANET | |
Bongale et al. | Firefly algorithm inspired energy aware clustering protocol for wireless sensor network | |
CN114567917A (en) | Multi-channel Internet of things routing method based on fuzzy hierarchical analysis | |
Prakash et al. | Best cluster head selection and route optimization for cluster based sensor network using (M-pso) and Ga algorithms | |
Rabelo et al. | An approach based on fuzzy inference system and ant colony optimization for improving the performance of routing protocols in Wireless Sensor Networks | |
CN114501575A (en) | Agricultural Internet of things self-adaptive routing method based on fuzzy logic | |
CN114531716A (en) | Routing method based on energy consumption and link quality | |
CN113965943A (en) | Method for optimizing AODV (Ad hoc on-demand distance vector) routing based on bidirectional Q-Learning | |
Sharmin et al. | Efficient and scalable ant colony optimization based WSN routing protocol for IoT | |
Riva et al. | Pheromone-based in-network processing for wireless sensor network monitoring systems | |
Rasheed et al. | Cluster-quality based hybrid routing for large scale mobile multi-hop networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||