CN112351400B - Underwater multi-modal network routing strategy generation method based on improved reinforcement learning - Google Patents

Underwater multi-modal network routing strategy generation method based on improved reinforcement learning Download PDF

Info

Publication number
CN112351400B
CN112351400B CN202011103398.3A CN202011103398A CN112351400B CN 112351400 B CN112351400 B CN 112351400B CN 202011103398 A CN202011103398 A CN 202011103398A CN 112351400 B CN112351400 B CN 112351400B
Authority
CN
China
Prior art keywords
node
transmission
data
underwater
information value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011103398.3A
Other languages
Chinese (zh)
Other versions
CN112351400A (en
Inventor
刘春凤
赵昭
曲雯毓
广晓芸
余涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202011103398.3A priority Critical patent/CN112351400B/en
Publication of CN112351400A publication Critical patent/CN112351400A/en
Application granted granted Critical
Publication of CN112351400B publication Critical patent/CN112351400B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30Services specially adapted for particular environments, situations or purposes
    • H04W4/38Services specially adapted for particular environments, situations or purposes for collecting sensor information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W40/00Communication routing or communication path finding
    • H04W40/02Communication route or path selection, e.g. power-based or shortest path routing
    • H04W40/04Communication route or path selection, e.g. power-based or shortest path routing based on wireless node resources
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W40/00Communication routing or communication path finding
    • H04W40/24Connectivity information management, e.g. connectivity discovery or connectivity update
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an underwater multi-modal network routing strategy generation method based on improved reinforcement learning, which comprises the following steps: in the off-line stage of the routing strategy implementation initial stage: preliminarily learning the transmission relationship between the network nodes in an iterative mode from the water surface sink node, so that each node obtains the maximum transmission benefit from each information value quantity grade data to the sink node; in the online phase of network operation: the expected income of the water surface sink node is obtained by combining the relay node and the transmission frequency band for each node through a reinforcement learning model, so that a transmission path suitable for data with different information value levels is constructed; the method reduces the transmission delay of high information value data; reduce and balance network energy consumption, prolong network operation time.

Description

Underwater multi-modal network routing strategy generation method based on improved reinforcement learning
Technical Field
The invention mainly relates to the technical field of underwater wireless sensor networks, in particular to an underwater multi-modal network routing strategy generation method based on improved reinforcement learning.
Background
The underwater wireless sensor network can help people know and understand the sea more conveniently, obtain valuable marine data information, and improve the monitoring and predicting capability of the marine environment and the capability of processing marine emergencies. The system can be widely applied to marine information acquisition, environment monitoring, deep sea detection, disaster prediction, auxiliary navigation, distributed tactical monitoring and the like. Various oceans are increasingly applied, and the requirements on ocean data transmission performance are different due to different application types and different time sensitivity. The underwater wireless sensor network provider needs to consider how to further optimize the network performance on the premise of meeting the data transmission requirements of marine applications, so that the network benefit is improved.
Specifically, subsea data typically includes an event type and an event timeliness, which may be referred to as a data value measure. The more important the event type of a piece of data is, the stronger the event timeliness is, and the higher the data value quantity of the data is; the data needs to be transmitted quickly and conversely can be transmitted slowly in order to improve network performance. Currently, in order to improve the transmission efficiency of ocean data, a multi-mode underwater wireless sensor network is proposed. In the network, the sensor nodes are provided with a plurality of non-interfering underwater communication module combinations which can simultaneously communicate, for example: the underwater acoustic communication and underwater optical communication are combined, or the underwater acoustic communication combination containing a plurality of frequency bands which are mutually orthogonal is combined. In addition, the nodes of the underwater wireless sensor network are usually powered by batteries, the energy of the batteries of the nodes is limited, and underwater energy charging is difficult; therefore, for a multi-modal underwater wireless sensor network, one of the most fundamental problems is: under the premise of meeting the data transmission delay requirements of different marine applications, a network provider needs to design a routing strategy suitable for an underwater dynamic communication environment, so that the network energy consumption is further reduced and balanced, and the service life of the network is prolonged.
However, as far as we know, the existing reinforcement learning-based multi-modal underwater wireless sensor network does not comprehensively consider the data value quantity and the network life of marine applications. For example, a journal article "MarLIN-Q," Multi-modal communications for reliable and low-latency underserver data delivery, "proposes a Multi-modal underwater wireless sensor network routing strategy based on reinforcement learning, which aims at minimizing transmission delay and improving data transmission reliability, and dynamically selects a relay node and a communication frequency band according to information fed back by a current neighbor node. Although the method can effectively reduce transmission delay and improve the success rate of data transmission, the method does not analyze data transmission characteristics, data value quantity and balance network energy consumption; therefore, the problems of high energy consumption of algorithm operation, high transmission delay of part of important data, short service life of a network and the like are caused. The invention provides an underwater multi-mode network transmission strategy generation method based on improved reinforcement learning, aiming at the transmission problem of an underwater wireless sensor network containing multi-type data. The network energy consumption is balanced to prolong the service life of the network while the transmission delay of high-value data is effectively reduced.
Disclosure of Invention
In order to solve the technical problems, the invention provides an underwater multi-mode network transmission strategy generation method based on improved reinforcement learning, which can reduce the transmission delay of high information value data; reduce and balance network energy consumption, prolong network operation time.
Aiming at the problems in the prior art, the invention adopts the following technical scheme:
an underwater multi-mode network routing strategy generation method based on improved reinforcement learning,
in the off-line stage of the routing strategy implementation initial stage: preliminarily learning the transmission relationship between the network nodes in an iterative mode from the water surface sink node, so that each node obtains the maximum transmission benefit from each information value quantity grade data to the sink node;
in the online phase of network operation: the expected income of the water surface sink node is obtained by combining the relay node and the transmission frequency band for each node through a reinforcement learning model, so that a transmission path suitable for data with different information value levels is constructed.
Further, each node obtains the maximum transmission benefit in the off-line stage of the initial stage of the implementation of the routing strategy:
s1, the sink node on the water surface generates an advertisement packet for each transmission frequency band combination; then the packet is sent out in a broadcast mode through the corresponding transmission frequency band combination;
s2, the underwater node calculates the final reward function of the convergent node from the underwater node to the water surface through the reward function, namely the final reward function
Figure BDA0002726150550000024
Wherein: the reward function is embodied as
Figure BDA0002726150550000021
Wherein Nr (i) is a node niNode set for receiving ADV data packet, g represents node niSlave node njTransmission of received ADV packetID, G of transmission band combinationijRepresenting a node niSlave node njA set of transmission band combinations of one ADV packet received,
Figure BDA0002726150550000022
representing a node niTransmitting data with information value quantity level l to node n by using transmission mode gjThe cost of the transfer of time;
s3, broadcasting the advertisement packet containing the ID information of the transmission frequency band combination in a broadcasting mode according to the final reward value of each underwater node;
and S4, judging whether all the underwater nodes obtain the final reward value of the sink nodes from the underwater nodes to the water surface.
Further, a transmission path process suitable for data with different information value levels is constructed in the online stage of network operation:
s1, when the underwater nodes transmit data packets, each node calculates the current state S by using a revenue function according to the information value quantity level l of the datahTimely profit per action a of
Figure BDA0002726150550000023
S2, calculating the current state S by the underwater node by using the Q value functionhThe final profit Q of each action aπ(sh,a);
S3, the underwater node is according to the current state ShThe final profit Q of each action aπ(shA) calculating a profit value for the optimal strategy and the optimal strategy, wherein the optimal strategy is calculated as expressed by
Figure BDA0002726150550000031
In the formula
Figure BDA0002726150550000032
Representing a node niIn a state shOptimal for data with lower transmission information value quantity level lAnd (4) strategy.
Advantageous effects
1. The invention initially learns the transmission relationship between the network nodes in an iterative mode from the water surface sink node, so that each node obtains the maximum transmission benefit from each information value level data to the sink node. Then, in the online stage of network operation, a multi-level link cost function comprehensively considering link communication delay, node residual energy and transmission load is designed through a reinforcement learning model, so that expected income of each node reaching a water surface sink node by adopting different transmission strategies (combination of a relay node and a transmission frequency band) is calculated, and a transmission path suitable for data of different information value quantities is constructed; and the nodes distribute corresponding paths for data transmission according to the information value quantity grades of the collected data. Generally, a path with high transmission efficiency transmits data with high information value quantity, so that the time delay of the data with high information value quantity is reduced; meanwhile, the common goals of balancing network energy consumption and reducing data time delay are achieved, and data with low information value quantity are transmitted through a path with high energy efficiency. Therefore, the network can reduce the data transmission delay and balance the network energy at the same time, and the service life of the network is prolonged.
2. The invention designs a multi-mode underwater wireless sensor network routing strategy suitable for various data transmission by utilizing a reinforcement learning model, can adaptively and dynamically select a transmission path for a data packet, meets the requirement of marine application on data delay and prolongs the service life of the network.
3. The invention utilizes an iterative method to quickly obtain network connection and transmission delay information when the network does not start to operate, thereby accelerating the convergence speed of the reinforcement learning model adopted in the online selection stage and reducing energy consumption.
Drawings
FIG. 1 is a flow chart of an underwater multi-modal network routing strategy generation method based on improved reinforcement learning according to the present invention
The specific implementation mode is as follows:
for a more clear description of the embodiment, it is assumed that there are K levels of information value in the network that need to be transmitted; each node has G combinations of transmission bands. The following detailed description of the specific modes, structures, features and functions of the underwater data routing strategy designed according to the present invention will be provided with reference to fig. 1.
1. Off-line training phase
Step 1: a sink node (sink node) positioned on the water surface generates an advertisement packet (ADV packet) for each transmission frequency band combination; then the packet is sent out in a broadcast form through the corresponding transmission frequency band combination, and the back-off time T is setbThe timer is started at 0. The ADV contains sink node coordinate information and back-off time TbEach information value quantity level being the final reward Re of the datas(l) And currently broadcasting transmission band combination ID information of the ADV packet.
Step 2: suppose a certain node niReceives n from a certain node (including a sink node)jA certain transmission band combination g, node niStores the information in the ADV packet and waits for a time T at this momentwAnd starting timing. When T iswWhen the predetermined value is reached, the node niCalculating the final reward of sending data with the information value quantity level of l to the sink node through a reward function
Figure BDA0002726150550000041
When it gives way for time TbArrival deadline, node niThe ADV packet is transmitted in a broadcast form through the corresponding transmission frequency band combination, and the node niIncluding its ID, coordinates, each information value level
Figure BDA0002726150550000042
The transmission band combination ID information of the ADV packet is currently broadcast.
Wherein the waiting time TwIs a fixed value in order to enable the node to collect ADV packets from other nodes more comprehensively.
Wherein the reward function is expressed as formula (1)
Figure BDA0002726150550000043
Wherein Nr (i) is a node niA set of nodes for ADV packets is received. g represents a node niSlave node njID of the transmission band combination of the received one ADV packet. GijRepresenting a node niSlave node njA set of transmission band combinations of one ADV packet received.
Figure BDA0002726150550000044
Representing a node niTransmitting data with information value quantity level l to node n by using transmission mode gjThe cost of the transfer of time.
Wherein the transmission costs
Figure BDA0002726150550000045
Is expressed by formula (2)
Figure BDA0002726150550000046
Wherein beta (l) is an adjustment coefficient corresponding to the information value quantity level of data as l, and is used for adjusting the transmission efficiency cost
Figure BDA0002726150550000047
And energy efficiency cost
Figure BDA0002726150550000048
Weight therebetween, β (l) is ∈ [0,1 ]]。
Wherein the transmission efficiency costs
Figure BDA0002726150550000049
Is expressed as formula (3)
Figure BDA0002726150550000051
In the formula
Figure BDA0002726150550000052
Representing nodesniTransmitting data packets with information value quantity level l to node n by using transmission mode gjThe time of transmission. PTijRepresenting data packets under water from node niTo node njThe propagation time of (c). TR (transmitter-receiver)maxRepresenting a node niCombining the transmission frequency bands with the lowest transmission rate to the node njThe transmission time of the transmission packet. PTmaxRepresenting the propagation time of the data packet when it propagates underwater to the maximum communication distance of all the transmission band combinations.
Figure BDA0002726150550000053
Data packet with information value quantity grade x at node njThe queuing time in the transmission queue.
Wherein the cost of energy efficiency
Figure BDA0002726150550000054
Is expressed as formula (4)
Figure BDA0002726150550000055
In the formula E0Representing the node initial energy value. ErjRepresenting a node njThe remaining energy of (c).
Figure BDA0002726150550000056
Representing a node niTransmitting data with information value quantity level l to node n by using transmission mode gjThe transmission power consumption of (2). e.g. of the typemaxData with information value quantity grade of l is transmitted to node n by using transmission frequency band combination with maximum energy consumptionjThe transmission power consumption of (2).
Wherein the back-off time TbIs expressed as formula (5)
Figure BDA0002726150550000057
Wherein Nr (i) is a node niA set of nodes for ADV packets is received. g represents a node niSlave node njID of the transmission band combination of the received one ADV packet. GijRepresenting a node niSlave node njA set of transmission band combinations of one ADV packet received.
Figure BDA0002726150550000058
Representing a node niTransmitting data packets with information value quantity level l to node n by using transmission mode gjThe time of transmission. PTijRepresenting data packets under water from node niTo node njThe propagation time of (c). TW is the waiting time.
And 4, step 4: repeating the step 3; until all nodes obtain their Re for each information value levelni(l) In that respect The above steps are only run a limited number of times according to the underwater communication environment in the off-line phase performed by the transmission strategy generation method of the present invention.
2. On-line selection phase
At this stage, the invention adopts a data transmission strategy selection method based on reinforcement learning, so that the nodes dynamically select the next hop relay node and the corresponding transmission frequency band combination according to the information value quantity grade of the data packet. The reinforced learning model mainly comprises six components: the system comprises an agent, a state set S, an action set A, a strategy set pi, benefits R and a state transition probability matrix P. In the method, an intelligent agent is an underwater sensor node, and a state set S is composed of retransmission times h, successful transmission suc and data discarding drop. The action set is formed by combining the relay nodes and the corresponding transmission frequency bands. A policy set is composed of a combined mapping of states and actions. The benefit represents the corresponding reward when the node adopts a certain strategy. The state transition probability matrix represents a probability matrix of the current state of the node being transferred to some other state; in our method, the state transition includes 1) a transition from a retransmission number H state to a retransmission number H +1, 2) a transition from a retransmission number H state to a transmission success suc, and 3) a transition from a retransmission number H to a data discard drop state when a maximum retransmission number H is reached.
Step 1: when any node niWhen data is required to be transmitted, it is based on the information price of the data to be transmittedValue class l, calculating current state s by using gain functionhEach action a ═ njTimely gain of k
Figure BDA0002726150550000061
Wherein the revenue function is expressed as formula (6)
Figure BDA0002726150550000062
In the formula
Figure BDA0002726150550000063
Representing a node niUsing the action a ═ nj,k>Temporal state slave shTransition to the probability of transmitting a successful suc state.
Figure BDA0002726150550000064
Representing a node njIn a state s0Maximum benefit for transmitting data with information value volume level l. It is important to note that at the start of the network operation, each node niObtain its own initial
Figure BDA0002726150550000065
By passing
Figure BDA0002726150550000066
s0Indicating that the number of retransmissions is 0. H denotes the maximum number of retransmissions of the data. h represents the current number of retransmissions of the data. ε represents an adjustment coefficient by which the data transmission success rate can be improved, and is usually set to [0,10 ]]。
Figure BDA0002726150550000067
Representing a node niTransmitting data with information value quantity level l to node n by using transmission mode gjThe transmission cost of time is expressed by the formula (2).
Wherein
Figure BDA0002726150550000068
Is expressed as formula (7)
Figure BDA0002726150550000071
Wherein f represents the frequency band of one underwater acoustic communication module in the transmission frequency band combination k. Pij(f) Representing a node niUsing frequency band f to node njA data packet transmission success rate when transmitting data; it may be composed of node n in generaljListening to node niDividing the number of transmitted data packets by the number of nodes niThe number of data packets actually transmitted in total is determined.
Step 2: then, the node niCalculating the current state s using a Q-value functionhThe final profit Q of each action aπ(sh,a)。
Wherein the Q value function is expressed by the formula (8)
Figure BDA0002726150550000072
In the formula
Figure BDA0002726150550000073
Representing a node niState slave s when action a is takenhTransition to State s'hThe probability of (c). Gamma is the discount coefficient, and takes on the value range [0,1 ].
Figure BDA0002726150550000074
Representing a node njAt state s'hMaximum benefit for transmitting data of rank l, its expression is formula (9)
Figure BDA0002726150550000075
In the formula
Figure BDA0002726150550000076
To representNode niIn a state shThe final benefit of action a is then taken and is calculated from equation (8).
And step 3: then, the node niAccording to the current state shThe final profit Q of each action aπ(shA) calculating a profit value for the optimal strategy, and the optimal strategy
Figure BDA0002726150550000077
(i.e. the combination of the relay node selected by the node and the corresponding communication frequency band under the current retransmission times).
Wherein the optimal strategy is represented by formula (10)
Figure BDA0002726150550000078
In the formula
Figure BDA0002726150550000079
Representing a node niIn a state shAnd (5) the optimal strategy adopted for transmitting data with the information value quantity level l is adopted.
The present invention is not limited to the above-described embodiments. The foregoing description of the specific embodiments is intended to describe and illustrate the technical solutions of the present invention, and the above specific embodiments are merely illustrative and not restrictive. Those skilled in the art can make many changes and modifications to the invention without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (2)

1. An underwater multi-mode network routing strategy generation method based on improved reinforcement learning is characterized by comprising the following steps:
in the off-line stage of the routing strategy implementation initial stage: preliminarily learning the transmission relationship between the network nodes in an iterative mode from the water surface sink node, so that each node obtains the maximum transmission benefit from each information value quantity grade data to the sink node;
in the online phase of network operation: the expected income of the water surface sink node is obtained by combining the relay node and the transmission frequency band for each node through a reinforcement learning model, so that a transmission path suitable for data with different information value levels is constructed; wherein:
the process of constructing transmission paths suitable for data with different information value levels at the online stage of network operation comprises the following steps:
s1, when the underwater nodes transmit data packets, each node calculates the current state S by using a revenue function according to the information value quantity level l of the datahTimely profit per action a of
Figure FDA0003326877710000011
S2, calculating the current state S by the underwater node by using the Q value functionhThe final profit Q of each action aπ(sh,a);
S3, the underwater node is according to the current state ShThe final profit Q of each action aπ(shAnd a) calculating the profit value of the optimal strategy and the optimal strategy, wherein the optimal strategy is calculated and expressed by the following formula:
Figure FDA0003326877710000012
in the formula
Figure FDA0003326877710000013
Representing a node niIn a state shAnd (5) the optimal strategy adopted for transmitting data with the information value quantity level l is adopted.
2. The underwater multi-modal network routing strategy generation method based on the improved reinforcement learning of claim 1 is characterized in that: the method comprises the following steps of obtaining the maximum transmission profit of each node in an off-line stage at the initial stage of routing strategy implementation:
s1, the sink node on the water surface generates an advertisement packet for each transmission frequency band combination; then the packet is sent out in a broadcast mode through the corresponding transmission frequency band combination;
s2, the underwater node calculates the final reward function of the convergent node from the underwater node to the water surface through the reward function, namely the final reward function
Figure FDA0003326877710000014
Wherein: the reward function is embodied as
Figure FDA0003326877710000015
Wherein Nr (i) is a node niNode set for receiving ADV data packet, g represents node niSlave node njID, G of a received transmission band combination of an ADV packetijRepresenting a node niSlave node njA set of transmission band combinations of one ADV packet received,
Figure FDA0003326877710000016
representing a node niTransmitting data with information value quantity level l to node n by using transmission mode gjThe cost of the transfer of time;
s3, broadcasting the advertisement packet containing the ID information of the transmission frequency band combination in a broadcasting mode according to the final reward value of each underwater node;
and S4, judging whether all the underwater nodes obtain the final reward value of the sink nodes from the underwater nodes to the water surface.
CN202011103398.3A 2020-10-15 2020-10-15 Underwater multi-modal network routing strategy generation method based on improved reinforcement learning Active CN112351400B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011103398.3A CN112351400B (en) 2020-10-15 2020-10-15 Underwater multi-modal network routing strategy generation method based on improved reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011103398.3A CN112351400B (en) 2020-10-15 2020-10-15 Underwater multi-modal network routing strategy generation method based on improved reinforcement learning

Publications (2)

Publication Number Publication Date
CN112351400A CN112351400A (en) 2021-02-09
CN112351400B true CN112351400B (en) 2022-03-11

Family

ID=74360733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011103398.3A Active CN112351400B (en) 2020-10-15 2020-10-15 Underwater multi-modal network routing strategy generation method based on improved reinforcement learning

Country Status (1)

Country Link
CN (1) CN112351400B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113141592B (en) * 2021-04-11 2022-08-19 西北工业大学 Long-life-cycle underwater acoustic sensor network self-adaptive multi-path routing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109362113A (en) * 2018-11-06 2019-02-19 哈尔滨工程大学 A kind of water sound sensor network cooperation exploration intensified learning method for routing
CN110996349A (en) * 2019-11-09 2020-04-10 天津大学 Multi-stage transmission strategy generation method based on underwater wireless sensor network
CN111065145A (en) * 2020-01-13 2020-04-24 清华大学 Q learning ant colony routing method for underwater multi-agent
CN111405513A (en) * 2020-03-19 2020-07-10 北京工商大学 Event-driven water quality sensor network route optimization algorithm

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010529782A (en) * 2007-06-04 2010-08-26 ニュー ジャージー インスティチュート オブ テクノロジー Multi-criteria optimization for relaying in multi-hop wireless ad hoc and sensor networks

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109362113A (en) * 2018-11-06 2019-02-19 哈尔滨工程大学 A kind of water sound sensor network cooperation exploration intensified learning method for routing
CN110996349A (en) * 2019-11-09 2020-04-10 天津大学 Multi-stage transmission strategy generation method based on underwater wireless sensor network
CN111065145A (en) * 2020-01-13 2020-04-24 清华大学 Q learning ant colony routing method for underwater multi-agent
CN111405513A (en) * 2020-03-19 2020-07-10 北京工商大学 Event-driven water quality sensor network route optimization algorithm

Also Published As

Publication number Publication date
CN112351400A (en) 2021-02-09

Similar Documents

Publication Publication Date Title
Zhang et al. A multi-path routing protocol based on link lifetime and energy consumption prediction for mobile edge computing
Zhang et al. New approach of multi-path reliable transmission for marginal wireless sensor network
CN106993320B (en) Wireless sensor network cooperative transmission routing method based on multiple relays and multiple hops
CN108809443B (en) Underwater optical communication network routing method based on multi-agent reinforcement learning
Chithaluru et al. ARIOR: adaptive ranking based improved opportunistic routing in wireless sensor networks
CN113141592B (en) Long-life-cycle underwater acoustic sensor network self-adaptive multi-path routing method
Ahmed et al. Energy harvesting techniques for routing issues in wireless sensor networks
CN110167054A (en) A kind of QoS CR- LDP method towards the optimization of edge calculations node energy
CN112351400B (en) Underwater multi-modal network routing strategy generation method based on improved reinforcement learning
Peng et al. Energy harvesting reconfigurable intelligent surface for UAV based on robust deep reinforcement learning
CN116170844A (en) Digital twin auxiliary task unloading method for industrial Internet of things scene
CN113923743B (en) Routing method, device, terminal and storage medium for electric power underground pipe gallery
CN110932969A (en) Advanced metering system AMI network anti-interference attack routing algorithm for smart grid
CN114154685A (en) Electric energy data scheduling method in smart power grid
CN109660375B (en) High-reliability self-adaptive MAC (media Access control) layer scheduling method
Deldouzi et al. A novel harvesting-aware rl-based opportunistic routing protocol for underwater sensor networks
Yang et al. Energy-aware real-time opportunistic routing for wireless ad hoc networks
Zhao et al. MLRS-RL: An energy efficient multi-level routing strategy based on reinforcement learning in multimodal UWSNs
CN116113008A (en) Multi-agent routing algorithm for unmanned aerial vehicle self-organizing network
Ren et al. An opportunistic routing for energy-harvesting wireless sensor networks with dynamic transmission power and duty cycle
Dai et al. MEC enabled cooperative sensing and resource allocation for industrial IoT systems
CN115173926A (en) Communication method and communication system of satellite-ground converged relay network based on auction mechanism
Nagadivya et al. Energy efficient Markov prediction based opportunistic routing (Eempor) for wireless sensor networks
Fang et al. Heterogeneous multi-AUV aided green internet of underwater things
Lv et al. A dynamic spectrum access method based on Q-learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant