CN114024639A - Distributed channel allocation method in wireless multi-hop network - Google Patents


Info

Publication number
CN114024639A
CN114024639A (application CN202111318928.0A)
Authority
CN
China
Prior art keywords: node, network, channel, nodes, hop
Prior art date
Legal status (an assumption, not a legal conclusion): Granted
Application number
CN202111318928.0A
Other languages
Chinese (zh)
Other versions
CN114024639B (en)
Inventor
雷建军
尚凤军
王颖
刘捷
周盈
Current Assignee (the listed assignees may be inaccurate)
Chengdu Skysoft Info & Tech Co ltd
Shenzhen Hongyue Information Technology Co ltd
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (an assumption, not a legal conclusion)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202111318928.0A
Publication of CN114024639A
Application granted
Publication of CN114024639B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04B: TRANSMISSION
    • H04B 17/00: Monitoring; Testing
    • H04B 17/30: Monitoring; Testing of propagation channels
    • H04B 17/382: Monitoring; Testing of propagation channels for resource allocation, admission control or handover
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04B: TRANSMISSION
    • H04B 17/00: Monitoring; Testing
    • H04B 17/30: Monitoring; Testing of propagation channels
    • H04B 17/391: Modelling the propagation channel
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00: Reducing energy consumption in communication networks
    • Y02D 30/70: Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention relates to the field of wireless network communication, and in particular to a distributed channel allocation method in a wireless multi-hop network. The method adopts a physical framework comprising at least a physical device layer, a computation layer, and a network service layer. The physical device layer forms a multi-hop wireless communication network from n wireless nodes randomly deployed in the network, and each node acts as an autonomous Agent that interacts with an uncertain network environment through a local decision module. The aggregation node of the computation layer is responsible for aggregating, analyzing, and processing data collected by other sites in the network; it has an edge computation function and can train an asynchronous DRL (deep reinforcement learning) model based on experience information acquired by the nodes in a distributed manner, model the multi-channel allocation problem as a POMDP (partially observable Markov decision process) problem, and perform channel allocation using the trained asynchronous DRL model. The invention solves the problems of hidden terminals and exposed terminals in a high-density multi-hop wireless network, and effectively avoids data collision and the waste of channel resources.

Description

Distributed channel allocation method in wireless multi-hop network
Technical Field
The invention relates to the field of wireless network communication, in particular to a distributed channel allocation method in a wireless multi-hop network.
Background
Multi-channel medium access control (MMAC) technology may enable communication links that interfere with each other in single-channel communications to achieve interference-free data transmission in multiple orthogonal channels. MMAC can effectively avoid the problem of single channel interference and improve the throughput of the whole network, and therefore, is considered to be a very potential technology for alleviating the shortage of wireless network channel resources at present. Although multi-channel communication has many advantages over single-channel communication, it brings with it many new problems:
Channel allocation and negotiation: the most basic and important problem in multi-channel MAC communication is how to allocate channel resources reasonably so that, under the premise of normal communication, the capacity of the entire network is maximized. Furthermore, prior to communication, nodes must negotiate channel usage to ensure that the two communicating nodes operate on the same channel during data transmission.
Multi-channel broadcasting: the wireless network based on the single-channel model can easily realize broadcasting because each sensor node is in the same channel; however, in a multi-channel environment, when a certain node performs broadcasting, some nodes cannot receive the broadcasting content because the nodes are distributed over a plurality of channels. The broadcast function plays an important role in network applications, and therefore, how to implement the broadcast function is another difficult problem facing multi-channel communication.
Multi-hop hidden and exposed terminals: as shown in fig. 1, a multi-hop hidden terminal is a node within the communication range of the receiving node but outside the communication range of the transmitting node. Such a node cannot hear the data sent by the transmitting node, yet may send data to the same receiving node, causing transmission collisions. At high densities, the hidden terminal problem leads to unnecessary data collisions and greatly reduces network performance. A multi-hop exposed terminal is a node within the coverage of the transmitting node but outside the coverage of the receiving node; after overhearing the sender's transmission, it needlessly delays its own transmission. The presence of exposed terminals therefore wastes channel resources.
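To make the geometry concrete, the sketch below identifies hidden and exposed terminals under a simple unit-disk radio model; the node names, coordinates, and helper functions are hypothetical illustrations, not part of the patent.

```python
import math

def in_range(a, b, r):
    """True if nodes at positions a and b (x, y tuples) are within range r."""
    return math.dist(a, b) <= r

def hidden_terminals(nodes, sender, receiver, r):
    """Nodes that can interfere at the receiver but cannot hear the sender."""
    return {n for n, p in nodes.items()
            if n not in (sender, receiver)
            and in_range(p, nodes[receiver], r)
            and not in_range(p, nodes[sender], r)}

def exposed_terminals(nodes, sender, receiver, r):
    """Nodes that hear the sender but cannot interfere at the receiver."""
    return {n for n, p in nodes.items()
            if n not in (sender, receiver)
            and in_range(p, nodes[sender], r)
            and not in_range(p, nodes[receiver], r)}

# A line topology: with range 1.0 and D transmitting to C, node B is hidden
# (inside C's range, outside D's range).
nodes = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (3, 0)}
print(hidden_terminals(nodes, "D", "C", 1.0))  # {'B'}
```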
Disclosure of Invention
In order to effectively reduce interference and data collision in the network, improve channel utilization and system throughput, and ensure the reliability of data service transmission between nodes, the invention provides a distributed channel allocation method in a wireless multi-hop network. The method adopts a physical framework comprising at least a physical device layer, a computation layer, and a network service layer. The physical device layer forms a multi-hop wireless communication network from n wireless nodes randomly deployed in the network; each node acts as an autonomous Agent and interacts with an uncertain network environment through a local decision module. The aggregation node of the computation layer is responsible for aggregating, analyzing, and processing data collected by other sites in the network. This node has an edge computation function or uses a dedicated edge server node, i.e., the node's computation tasks can be offloaded; an asynchronous DRL model can be trained based on experience information collected by the nodes in a distributed manner, the multi-channel allocation problem is modeled as a POMDP problem, and distributed channel allocation is performed using the asynchronous DRL model trained by the centralized node or edge server.
Further, the multi-channel allocation problem is modeled as a POMDP problem: the Agent observes the current network state s and performs action a in time period t; after performing action a, it transitions to the network state s' of the next time period with state transition probability P and obtains a corresponding reward R from the environment. The POMDP problem is expressed as:
M = <S, A, P, R, γ>;
where M represents the POMDP problem model; S is the state set representing the state space; A is the action set representing the action space, and an action a ∈ A is the channel number to which the node switches; P is the state transition probability; R is the reward function; γ is the discount factor. That is, given an environment state s ∈ S, the Agent performs an action a ∈ A, the environment state migrates from s to s' (s → s'), and the Agent obtains the corresponding reward R from the environment.
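The role of the discount factor γ in M = <S, A, P, R, γ> can be illustrated with a minimal computation of the discounted return an Agent accumulates over successive time periods; this is a generic reinforcement-learning identity, not code from the patent.

```python
def discounted_return(rewards, gamma):
    """Return sum of gamma**t * R_t, the objective a POMDP policy maximizes."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Three time periods of unit reward with gamma = 0.9: 1 + 0.9 + 0.81
print(discounted_return([1.0, 1.0, 1.0], 0.9))
```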
Further, the environmental state observed by node i in the t-th time period, s_i^t, is expressed as:

s_i^t = {c_{i,1}^t, c_{i,2}^t, ..., c_{i,K}^t}

where c_{i,j}^t represents the occupation of each wireless channel by the neighbor nodes of node i, i.e., the potential interference degree of each channel; K is the number of available channels and N is the number of nodes. c_{i,j}^t indicates the occupation of channel j by the neighbor nodes of node i in the t-th time period: c_{i,j}^t = 1 indicates that there exists a neighbor node of node i using channel j, and c_{i,j}^t = 0 indicates that no neighbor node of node i uses channel j; n_{i,o} is the total number of neighbor nodes of node i.
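A minimal sketch of how a node could assemble this per-channel occupancy observation from its neighbors' current channels; the function name and input encoding are assumptions for illustration.

```python
def observe_state(neighbor_channels, K):
    """Build node i's observation (c_1, ..., c_K): c_j = 1 if at least one
    neighbor of node i currently uses channel j, else 0."""
    used = set(neighbor_channels)
    return [1 if j in used else 0 for j in range(1, K + 1)]

# Node i has three neighbors, on channels 1, 1 and 3; K = 4 channels available.
print(observe_state([1, 1, 3], 4))  # [1, 0, 1, 0]
```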
Further, the reward R obtained from the environment when the node, after performing action a, transitions from state s to the next state s' may be expressed as:

R(s, a) = p_{s,i}^t,             if u_{i,k}^t = 0
R(s, a) = p_{s,i}^t / n_{i,k}^t, if u_{i,k}^t = 1

where R(s, a) is the reward R obtained by node i after switching to channel k in the t-th data period, i.e., R = R(s, a); u_{i,k}^t indicates whether a neighbor node of node i uses channel k in the current period: if no neighbor node of node i uses channel k, then u_{i,k}^t = 0; otherwise, u_{i,k}^t = 1; n_{i,k}^t is the number of neighbor nodes of node i using channel k; and p_{s,i}^t is the successful transmission probability of node i in the t-th time period.
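One plausible reading of this reward, assumed here purely for illustration since the original equation image is lost, is that the node receives the full success probability when no neighbor occupies the chosen channel and a share of it otherwise.

```python
def reward(p_success, n_neighbors_on_k):
    """Assumed piecewise reward for switching to channel k: the success
    probability p_success if no neighbor uses k, otherwise p_success shared
    among the n neighbors already occupying the channel."""
    if n_neighbors_on_k == 0:
        return p_success
    return p_success / n_neighbors_on_k

print(reward(0.8, 0))  # free channel: full reward
print(reward(0.8, 4))  # congested channel: reduced reward
```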
Further, the asynchronous DRL model deployed in the computation layer comprises a current network, a target network, an error computation module, an experience pool, and a decision module deployed locally in each wireless node. The network structure of the local decision module is the same as that of the current network, and its parameters are periodically acquired from the edge node. Specifically:
the target network fixes its network parameters and produces the target value y_t;
the current network is used to evaluate the policy, update parameters, and approximate the value function;
the parameter θ of the current network is updated every time period, whereas the parameter θ⁻ of the target network is updated only once every several fixed time periods and remains unchanged in between;
experiences e = <s, a, r, s'>, with s, s' ∈ S and a ∈ A, are collected asynchronously by the nodes in the network from the wireless multi-hop network environment;
the error calculation module updates the parameters of the current network through the TD error calculated from the target network and the current network; in addition, the parameters of the current network are copied to the target network at regular intervals.
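The periodic parameter copy from the current network θ to the target network θ⁻ can be sketched as follows; the tiny weight table, sync period, and update noise are stand-ins for illustration, not the patent's model.

```python
import random

random.seed(0)

class QNetwork:
    """Minimal weight-table stand-in for a Q network (illustrative only)."""
    def __init__(self, n_states, n_actions):
        self.W = [[random.gauss(0, 1) for _ in range(n_actions)]
                  for _ in range(n_states)]

current = QNetwork(4, 3)
target = QNetwork(4, 3)

SYNC_PERIOD = 10  # theta_minus <- theta once every SYNC_PERIOD time periods
for step in range(1, 31):
    # Stand-in for one gradient update of the current network each period.
    current.W = [[w - 0.01 * random.gauss(0, 1) for w in row]
                 for row in current.W]
    if step % SYNC_PERIOD == 0:
        target.W = [row[:] for row in current.W]  # copy parameters

print(current.W == target.W)  # True immediately after a sync step
```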
Further, the calculation of the target value y_t includes:

y_t = R(s_t, a_t) + γ · max_{a_{t+1} ∈ A} Q(s_{t+1}, a_{t+1}; θ⁻)

where R(s_t, a_t) is the reward obtained in the t-th time period by node i ∈ [1, N] (N is the number of nodes) after performing action a_t ∈ A in state s_t ∈ S; Q(s_{t+1}, a_{t+1}; θ⁻), with s_{t+1} ∈ S and a_{t+1} ∈ A, is the target network (parameter θ⁻) evaluated for node i performing action a_{t+1} in state s_{t+1} in the (t+1)-th time period; s_{t+1} is the state of node i in the (t+1)-th time period; a_{t+1} is the action performed by node i in the (t+1)-th time period; and max_{a_{t+1} ∈ A} Q(s_{t+1}, a_{t+1}; θ⁻) represents node i, based on the target network (parameter θ⁻), selecting in state s_{t+1} the action a_{t+1} that maximizes the corresponding Q value.
Further, the error calculation module calculates the error between the current network Q(s_t, a_t; θ) and the target value y_t:

L(θ) = E[(y_t − Q(s_t, a_t; θ))²]

and updates the neural network parameters with gradient descent:

θ ← θ − α · ∇_θ L(θ)

where L(θ) is the TD error function of the model; E[·] denotes the expectation over the selected mini-batch of experience data; θ is the parameter of the current network, updated in real time; α is the learning rate; ∇_θ L(θ) is the corresponding gradient; and Q(s_t, a_t; θ) is the current network (parameter θ) evaluated for node i performing action a_t in state s_t in the t-th time period.
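The target-value and TD-error update can be traced on a tabular Q function, where the gradient step on the squared error reduces to a simple increment (the factor 2 is absorbed into the learning rate α); all numbers here are illustrative, not from the patent.

```python
# One TD update: y_t = r + gamma * max_a' Q(s', a'; theta-), then a gradient
# step on L = (y_t - Q(s, a; theta))^2, which for a table becomes
# Q(s, a) += alpha * (y_t - Q(s, a)).
GAMMA, ALPHA = 0.9, 0.5

Q_current = {("s0", "a0"): 0.0, ("s0", "a1"): 0.0,
             ("s1", "a0"): 1.0, ("s1", "a1"): 2.0}
Q_target = dict(Q_current)  # frozen snapshot playing the role of theta-

def td_update(s, a, r, s_next, actions):
    y = r + GAMMA * max(Q_target[(s_next, b)] for b in actions)
    td_error = y - Q_current[(s, a)]
    Q_current[(s, a)] += ALPHA * td_error
    return y, td_error

# Target: y = 1 + 0.9 * max(1, 2) = 2.8; update: Q(s0, a0) = 0 + 0.5 * 2.8 = 1.4
y, err = td_update("s0", "a0", r=1.0, s_next="s1", actions=["a0", "a1"])
print(y, Q_current[("s0", "a0")])
```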
Furthermore, the whole system time is divided into a plurality of consecutive superframes, one superframe being one time period. Each superframe comprises a beacon frame, a control period, and a data transmission period. The control period uses a fixed control channel to transmit related control information and channel allocation decisions, while the data transmission period uses K non-overlapping channels to support interference-free parallel data transmission. In the control period, all nodes in the network switch to the control channel to listen for and send related control information; in the data transmission period, a node with data to send switches to the channel of its parent node and transmits data based on a channel access mechanism.
Further, in the process of performing action a, the node adopts a channel access mechanism based on RTS/DCTS, which includes:
if node d is located in the m-th hop and node i is the adjacent node in the (m+1)-th hop, node d is the parent node of node i; likewise, if node e is located in the m-th hop and node j is the adjacent node in the (m+1)-th hop, node e is the parent node of node j; the four nodes operate on the same channel, and the backoff values of node i and node j are both 0;
when node i sends an RTS frame to node d, node d waits for a CIFS time and returns a CTS frame;
after receiving the RTS frame of node i or the CTS frame of node d, the child nodes of node d set a corresponding NAV based on the information in the Duration field;
when node e receives the RTS frame from node i, it waits for an SIFS and returns a CTS frame to inform its child nodes to delay data transmission during node i's transmission period;
where RTS means request to send; CTS means clear to send; CIFS is the interframe space used by the destination node before returning a CTS; SIFS is the short interframe space separating frames belonging to one dialogue, and CIFS is slightly larger than SIFS.
Further, if node j is located within the communication range of node i and its parent node is not located within the communication range of node i, after node j receives the RTS frame and waits for an RIFS, node j sends the RTS frame to parent node e.
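A toy trace of the NAV bookkeeping in this RTS/DCTS exchange; the durations, the overhearing topology, and the hypothetical child node c1 are assumptions for illustration only.

```python
# Illustrative interframe-space and data durations (not from the patent).
SIFS, CIFS, DATA = 1, 2, 10

def nav_after(frames):
    """Compute each overhearing node's NAV from the Duration fields it hears."""
    nav = {}
    for frame in frames:
        for listener in frame["heard_by"]:
            nav[listener] = max(nav.get(listener, 0), frame["duration"])
    return nav

frames = [
    # i -> d RTS; c1 is a hypothetical child of d that must defer.
    {"type": "RTS", "src": "i", "dst": "d", "duration": CIFS + SIFS + DATA,
     "heard_by": ["c1"]},
    # e overheard i's RTS and, after an SIFS, silences its own child j.
    {"type": "DCTS", "src": "e", "dst": "j", "duration": DATA,
     "heard_by": ["j"]},
]
nav = nav_after(frames)
print(nav)  # both c1 and j now defer for the advertised durations
```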
The invention solves the problems of hidden terminals and exposed terminals in a high-density multi-hop wireless network, effectively avoids data collision and the waste of channel resources, and improves overall network performance. In addition, an asynchronous DRL model is provided that dynamically optimizes the channel allocation strategy of each node in the wireless multi-hop multi-channel network based on the node's channel access performance and the channel occupancy during the data transmission period. A novel wireless architecture based on Mobile Edge Computing (MEC) relieves the computing and storage pressure on terminal nodes, and a framework of distributed interaction (micro-learning) and centralized training (macro-learning) is designed to train the asynchronous DRL model. Therefore, the asynchronous DRL model proposed by the present invention can be implemented even on resource-constrained terminals. In addition, the invention addresses the non-stationarity problem in the multi-agent setting (MAS): by using only local neighbor information, it avoids severe dynamic changes of the network while further accelerating network convergence.
Drawings
Fig. 1 is a diagram of an example of a hidden and exposed terminal in multiple channels provided in the prior art;
FIG. 2 is a diagram of an edge computing enabled system architecture according to an embodiment of the present invention;
FIG. 3 is a superframe structure diagram employed by the present invention;
FIG. 4 is an asynchronous DRL model based on a distributed decision making architecture in accordance with the present invention;
fig. 5 is a centralized training flow of the asynchronous DRL model according to the embodiment of the present invention.
Fig. 6 is one of the schematic diagrams of RTS/DCTS operation provided by the embodiment of the present invention;
fig. 7 is a second schematic diagram of RTS/DCTS operation according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a distributed channel allocation method in a wireless multi-hop network. The method adopts a physical framework comprising at least a physical device layer, a computation layer, and a network service layer. The physical device layer forms a multi-hop wireless communication network from n wireless nodes randomly deployed in the network; each node acts as an autonomous Agent and interacts with an uncertain network environment through a local decision module. The aggregation node of the computation layer is responsible for aggregating, analyzing, and processing data collected by other sites in the network; this node has an edge computation function, i.e., its computation tasks can be offloaded, an asynchronous DRL model can be trained based on experience information acquired by the nodes in a distributed manner, the multi-channel allocation problem is modeled as a POMDP problem, and channel allocation is performed using the trained asynchronous DRL model.
Example 1
The present embodiment presents a system architecture diagram, as shown in fig. 2, the system architecture includes a physical device layer, a computing layer, and a network service layer. The physical equipment layer is a multi-hop wireless communication network consisting of n wireless nodes randomly deployed in the network, and each node is used as an autonomous Agent and interacts with an uncertain network environment through a local decision module; the aggregation node of the computation layer is responsible for aggregating, analyzing and processing data collected by other sites in the network, has an edge computation function and can unload computation tasks of the node, and an asynchronous DRL model can be trained on the basis of experience information acquired in a distributed mode by the node.
In the data transmission process, the present embodiment selects to perform data transmission in a superframe structure, where the superframe structure is shown in fig. 3, the system time is divided into a plurality of consecutive superframe times, and each superframe includes a beacon frame, a control period and a data transmission period. Wherein, the control period adopts a fixed control channel to transmit the relevant control information and channel allocation decision; the data transmission period employs K non-overlapping channels to support interference-free parallel data transmission. Thus, during a control period, all nodes in the network are to switch to a control channel to listen and transmit related control information (routing, time synchronization, channel switching, etc.); in the data transmission period, a node to be sent data is switched to a channel where a parent node of the node is located to transmit data based on a channel access mechanism.
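The superframe timing above can be sketched by computing period boundaries; the durations below are assumed for illustration, as the patent does not specify them.

```python
def superframe_layout(beacon, control, data, n_superframes):
    """Start offsets of the beacon, control, and data periods in each of
    n consecutive superframes."""
    period = beacon + control + data
    layout = []
    for k in range(n_superframes):
        base = k * period
        layout.append({"beacon": base,
                       "control": base + beacon,
                       "data": base + beacon + control})
    return layout

# Assumed durations: beacon 2 ms, control 8 ms, data 40 ms; two superframes.
for sf in superframe_layout(2, 8, 40, 2):
    print(sf)
```

Each node would switch to the control channel at the `control` offset and to its parent's data channel at the `data` offset of every superframe.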
As shown in fig. 4, the asynchronous DRL model adopted in this embodiment solves the problem of dynamic multi-channel allocation in a multi-hop wireless network by using DRL. The embodiment of the invention combines the DQN function approximation capability and an A3C asynchronous empirical sampling framework, provides an asynchronous DRL model, and aims to reasonably allocate channels for nodes so as to improve the reliability of data transmission to the maximum extent. The DRL model deployed on the edge server adopts a DQN framework, DNN is introduced to extract features from original data to approximate a behavior value function, and an asynchronous training framework of A3C is combined to solve the problem that the DQN is not suitable for a high-dimensional action space and an MAS, so that the correlation between experiences is broken, the convergence speed of the network is remarkably improved, and the problem that an A3C algorithm cannot be realized on a wireless node with limited resources is solved.
This embodiment considers that in certain scenarios the limited computing capability, energy, and memory of a wireless node cause computing bottlenecks and low performance, limiting its support for high-level applications and for running compute-intensive tasks such as the training of the DRL model. Therefore, the embodiment of the invention adopts a wireless network architecture enabled by edge computing and transfers the computing task of training the asynchronous DRL model to resource-rich edge nodes (sink nodes). As shown in fig. 2, the asynchronous DRL model deployed at the computation layer is composed of a current network (main), a target network (target), and an experience pool (experience replay). The edge-computing-enabled sink nodes thus complete the training and updating tasks of the model.
When the asynchronous DRL model is adopted for channel allocation, the invention combines the function approximation capability of DQN with the asynchronous interaction architecture of A3C. The distributed interaction module (micro-learning) in the asynchronous DRL model presented in fig. 4 allows a terminal node to select channel resources asynchronously using local observation information. In addition, a centralized training module (macro-learning) trains the asynchronous DRL model by adjusting operating parameters, thereby directing the system toward an application-specific global optimization goal (e.g., maximizing the reliability of data transfer). Each terminal node maintains a DRL prediction model to allocate channels independently. In particular, embodiments of the present invention model the multi-channel allocation problem as a POMDP problem, which consists of the five-tuple M = <S, A, P, R, γ>: state s, action a, state transition probability P, reward function R, and discount factor γ. The Agent observes the current network state s and executes action a in the control period of each time step t. Then, the system transfers to the next state according to the state transition probability and obtains the reward R_{t+1} from the environment.
State space: S = {s_1, s_2, ..., s_{2K+N}}, where K is the number of available channels and N is the number of nodes. For a particular node i, its state vector in the t-th cycle is

s_{i,t} = {S_{i,t,1}, S_{i,t,2}, ..., S_{i,t,K}}

where S_{i,t,j} indicates the occupancy of channel j by the neighbor nodes of node i: S_{i,t,j} = 1 indicates that a neighbor node of node i occupies channel j; otherwise, S_{i,t,j} = 0. n_{i,o} is the total number of neighbor nodes of node i.
Action space: A = {a_1, a_2, ..., a_K}, a_k ∈ A, where a_k indicates the channel number to which node i switches in the next data transmission period: a_k = ch_{i,t,k}, ch_{i,t,k} = k ∈ [1, K].
Reward function, R: when node i, in the t-th data period with local observation state s_{i,t}, performs action a_k and switches to channel ch_{i,t,k}, the environment returns an immediate reward value R(s, a) to the node at the end of the data transmission cycle, which can be solved by the following function:

R(s, a) = p_{s,i}^t,             if no neighbor node of node i uses channel ch_{i,t,k} in the current data cycle
R(s, a) = p_{s,i}^t / n_{i,k}^t, otherwise

where n_{i,k}^t is the number of neighbor nodes of node i using channel ch_{i,t,k}, and p_{s,i}^t is the probability that node i successfully transmits data on channel ch_{i,t,k}.
The edge computing enabled sink node trains the DRL model in a centralized mode based on experience information acquired by each node in a distributed asynchronous mode in the network, updated network model parameters are sent to the nodes, and each node can acquire the latest network parameters from a parent node of the node.
The centralized training process of the DRL model is shown in fig. 5. Two networks with the same structure but different parameters exist in the asynchronous DRL model: the current network predicts the estimated Q value using the latest parameters, and the target network predicts the target Q value using earlier parameters. In this embodiment, the state of a node serves as the input of the neural network, the different actions the node can execute serve as its output classes, and the network outputs a value for each action, i.e., the Q value; for example, Q(s, a; θ) represents the value of executing action a given input state s when the neural network parameter is θ.
When the model is trained, a mini-batch of experiences is taken at random from the experience pool to break the correlation between experiences. Moreover, because the experience information in the pool is provided by the agents through asynchronous sampling, the correlation between experiences is broken further and richer experiences are provided.
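A minimal experience pool with asynchronous contributions from several nodes and random mini-batch sampling, sketched under assumed data shapes (the tuple contents are synthetic).

```python
import random
from collections import deque

random.seed(1)

class ReplayPool:
    """Experience pool: stores e = (s, a, r, s') tuples from many agents and
    samples random mini-batches to break correlation between experiences."""
    def __init__(self, capacity):
        self.pool = deque(maxlen=capacity)
    def add(self, experience):
        self.pool.append(experience)
    def sample(self, batch_size):
        return random.sample(list(self.pool), batch_size)

pool = ReplayPool(capacity=100)
# Four nodes contribute experiences asynchronously (synthetic data).
for node in range(4):
    for t in range(10):
        pool.add((f"s{t}", f"ch{node % 3}", 0.5, f"s{t + 1}"))

batch = pool.sample(8)
print(len(batch))  # 8
```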
From fig. 5 it can be seen that the <s, a> information is used as the input of the current value network to obtain Q(s, a; θ), which evaluates the current state-behavior value function, and the s' information is used as the input of the target value network to obtain the corresponding max Q(s', a'; θ⁻). The target value y_t is then calculated as:

y_t = R(s_t, a_t) + γ · max_{a' ∈ A} Q(s', a'; θ⁻)

Thus, based on y_t, the DQN error function module can further calculate:

L(θ) = E[(y_t − Q(s, a; θ))²]

The current network updates its parameters based on the gradient of the error function:

θ ← θ − α · ∇_θ L(θ)

where s ∈ S and a ∈ A. After a certain number of iterations, the parameters of the current value network are copied to the target value network:

θ⁻ ← θ

The above process is repeated until the network reaches a stable state.
Although the asynchronous-DRL-based channel allocation model improves network performance through multiple parallel data transmissions, the hidden and exposed terminal problems on a specific channel are further exacerbated in high-density wireless multi-hop network scenarios. Fig. 1 illustrates the hidden and exposed terminal problems in a wireless multi-hop network. When node D is transmitting data to node C, node B is located outside the communication range of node D; node B therefore mistakenly considers the channel idle, so when node B sends data to nodes C and A at the same time, a collision occurs at node C, causing unnecessary retransmissions and further aggravating network congestion. Furthermore, when node B1 transmits data to node A1, since node B2 is in the communication range of node B1 while nodes B2 and A2 are not in the communication ranges of node A1 and node B1 respectively, node B2 mistakenly considers the channel busy and delays its data transmission, causing unnecessary waste of channel resources. Therefore, the embodiment of the present invention proposes to solve the hidden and exposed terminal problems in the wireless multi-hop network based on the RTS/DCTS mechanism, which is further described below by way of example.
Fig. 6 is a diagram illustrating a solution to the hidden terminal problem in the wireless multi-hop network based on RTS/DCTS according to a preferred embodiment of the present invention. Wherein, nodes i and j, and nodes d and e are respectively located at m and m +1 hops (which refer to different and adjacent hop counts) and operate on the same channel. Node d is a parent node of node i and node e is a parent node of node j. Node e is also a neighbor node of node i. Assume that the backoff values of nodes i and j are both 0 at this time.
When the node i sends an RTS frame to the node d, the node d waits for a CIFS time and returns a CTS frame;
after receiving the RTS frame of the node i or the CTS frame of the node d, the child node of the node d sets a corresponding NAV based on the information in the Duration field;
when node e receives the RTS frame from node i, it waits for an SIFS and returns a CTS frame to inform its child nodes to delay data transmission during node i's transmission, thereby avoiding the hidden terminal problem.
In the channel access mechanism in a multi-hop environment, the hidden terminal problem is unavoidable, so the probability of successful transmission of node i on a particular channel k, p_{s,i}^{t,k}, can be calculated with the following formula:

p_{s,i}^{t,k} = τ(1 − τ)^{(n_s − 1) + n_a + n_f}

where τ is the transmission probability in the channel access slot; n_s is the total number of child nodes of the parent node of the node; n_a is the number of neighbor nodes of node i; and n_f is the number of neighbor nodes of the parent node of node i (excluding the child nodes of the parent node).
Referring to Fig. 7, Fig. 7 is a schematic diagram illustrating an example of solving the exposed-terminal problem in a wireless multi-hop network based on RTS/DCTS according to a preferred embodiment of the present invention. Nodes d and e are located at the m-th hop and nodes i and j at the (m+1)-th hop (m and m+1 denoting adjacent hop counts), and all four nodes operate on the same channel. Node d is the parent node of node i, and node e is the parent node of node j; node j is also a neighbor node of node i. Assume that the backoff values of nodes i and j are both 0 at this time.
When node i sends an RTS to node d, node d waits for a CIFS and returns a CTS frame. Since node j is within the communication range of node i, node j also receives the RTS frame; but because node j is not the destination node of that RTS frame, it does not set its NAV according to the Duration field of the RTS.
After receiving the RTS frame, node j waits for a RIFS and checks whether a CTS frame has been received. Because its parent node e is not within the communication range of node i, node e returns no CTS after the SIFS; node j therefore receives no CTS frame within the RIFS and sends its own RTS frame to parent node e.
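Node j's rule in this exposed-terminal case can be sketched as a small decision function. This is illustrative only; the string labels and the symbolic RIFS/SIFS constants (chosen so that RIFS exceeds SIFS plus a CTS transmission) are assumptions, not the patent's notation.

```python
SIFS, RIFS = 1, 3  # symbolic interframe spaces; RIFS > SIFS is assumed

def overhearing_node_decision(addressed_to_me: bool,
                              cts_heard_within_rifs: bool) -> str:
    """On overhearing an RTS: a non-addressed node does not set its NAV;
    it waits one RIFS, and only an overheard CTS (hidden-terminal case)
    makes it defer -- otherwise it is merely exposed and may transmit."""
    if addressed_to_me:
        return "reply CTS after CIFS"
    return "defer (set NAV)" if cts_heard_within_rifs else "send own RTS to parent"
```

In Fig. 7, node j hears no CTS within the RIFS, so it proceeds with its own RTS instead of wasting the channel opportunity.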
The nodes in the network execute the above processes, so the data collisions and channel-resource waste caused by hidden and exposed terminals in the network can be effectively avoided; the successful transmission probability can thus be rewritten as the formula in image BDA0003344541690000111.
Based on the RTS/DCTS mechanism, data collisions between data links under adjacent parent nodes on the same channel are effectively avoided through the SIFS and CTS; in addition, the channel access mechanism introduces the RIFS interframe space to solve the exposed-terminal problem in the network, thereby improving the successful transmission probability of the nodes, i.e., as given in image BDA0003344541690000112.
Therefore, the channel access mechanism can improve the successful transmission probability of the nodes in the network. In addition, it can be seen from the above formulas that P_s is directly related to the parameters n_s (image BDA0003344541690000113), n_a and n_f, and these parameters can be further optimized by optimizing the channel allocation strategy. Therefore, the embodiment of the present invention takes the successful transmission probability of a node on its channel (image BDA0003344541690000114) as part of the reward function of the channel allocation model, so as to further optimize network performance.
The channel allocation and channel access mechanisms provided by the embodiment of the present invention optimize channel resources at different levels: channel allocation optimizes channel resources in the frequency domain, and channel access in the time domain. In addition, a reasonable channel allocation mechanism further alleviates interference during channel access, while the channel access performance of the nodes in turn informs further optimization of the channel allocation strategy.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A distributed channel allocation method in a wireless multi-hop network, characterized in that a physical architecture comprising at least a physical device layer, a computation layer and a network service layer is adopted; the physical device layer forms a multi-hop wireless communication network from n wireless nodes randomly deployed in the network; the multi-channel allocation problem is modeled as a POMDP problem, and distributed channel allocation is realized using an asynchronous DRL model; each node acts as an autonomous Agent whose local decision module interacts with the uncertain network environment; the sink node of the computation layer is responsible for aggregating, analyzing and processing the data collected by the other nodes in the network and has an edge computing function, so that the computation tasks of the nodes can be offloaded to it; the asynchronous DRL model is trained on experience information collected by the nodes in a distributed manner, and the wireless nodes periodically update their local decision module parameters from the sink node.
2. The method of claim 1, wherein the multi-channel allocation problem is modeled as a POMDP problem: when an Agent observes the current network state s and performs an action a in time period t, it transitions to the network state s' of the next time period with state transition probability P and obtains a corresponding reward R from the environment; the POMDP problem is expressed as:
M=<S,A,P,r,γ>;
wherein M represents the POMDP problem model; S is the state set, representing the state space; A is the action set, representing the action space, where an action a ∈ A is the channel number to which the node will switch; r is the reward function; γ is the discount factor. That is, given the environment state s ∈ S, when the Agent performs action a ∈ A, the environment state migrates from s to s', i.e., s → s', while the Agent obtains the corresponding reward r from the environment.
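The tuple M = <S, A, P, r, γ> can be held in a small container, shown here as an illustrative sketch; the field names and the toy state/action encodings are assumptions, not the claim's notation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class ChannelPOMDP:
    """Sketch of M = <S, A, P, r, gamma>: in the patent an action a in A
    is the channel number the node switches to, and a state is the node's
    (partial) observation of channel occupancy."""
    states: Sequence[tuple]   # S: observed channel-occupancy vectors
    actions: Sequence[int]    # A: available channel numbers
    transition: Callable      # P(s, a) -> s'
    reward: Callable          # r(s, a)
    gamma: float              # discount factor

# Toy instance: two occupancy states, two channels, identity dynamics.
m = ChannelPOMDP(states=[(0, 1), (1, 0)], actions=[1, 2],
                 transition=lambda s, a: s,
                 reward=lambda s, a: 1.0, gamma=0.9)
```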
3. The method of claim 2, wherein the environmental state observed by node i in the t-th time period (denoted by the symbol in image FDA0003344541680000011) is expressed by the formula in image FDA0003344541680000012, wherein the quantity in image FDA0003344541680000013 represents the occupation of each wireless channel by the neighbor nodes of node i, i.e., the potential interference degree of each channel; K is the number of available channels and N is the number of nodes; the quantity in image FDA0003344541680000014 indicates the occupation of channel j by the neighbor nodes of node i in the t-th time period, the value in image FDA0003344541680000015 indicating that no neighbor node of node i uses channel j and the value in image FDA0003344541680000016 indicating that a neighbor node of node i uses channel j; and n_{i,o} (image FDA0003344541680000017) is the total number of neighbor nodes of node i.
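A minimal sketch of this per-node observation, under the assumption that the occupancy indicator is 1 when some neighbor uses the channel and 0 otherwise (the claim's exact symbols appear only as images):

```python
def observe_state(neighbor_channels, K):
    """Build node i's K-entry observation: entry j is 1 if some neighbor
    currently uses channel j (potential interference) and 0 otherwise.
    Channel numbering 1..K is assumed."""
    in_use = set(neighbor_channels)
    return tuple(1 if ch in in_use else 0 for ch in range(1, K + 1))

# Two neighbors on channel 2 and one on channel 4, out of K = 4 channels.
s = observe_state(neighbor_channels=[2, 2, 4], K=4)
```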
4. The method according to claim 1, wherein the reward function obtained from the environment when the node performs action a and moves from state s to the next state s' is expressed by the formula in image FDA0003344541680000021, wherein R(s, a) represents the reward value obtained from the environment when node i switches to channel k in the t-th data period; the indicator in image FDA0003344541680000022 denotes whether any neighbor node of node i uses channel k in the current period: if no neighbor node of node i uses channel k, it takes the value in image FDA0003344541680000023, and otherwise the value in image FDA0003344541680000024; the quantity in image FDA0003344541680000025 is the number of nodes using channel k among the neighbor nodes of node i in the t-th time period; and the quantity in image FDA0003344541680000026 is the probability that the data transmission of node i on channel k succeeds.
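The structure of this reward (its exact formula is an image in the source) can be sketched with assumed weights: reward a clean channel via the success probability, and penalize a channel already occupied by neighbors. Both the branch shape and the penalty weight are illustrative assumptions.

```python
def channel_reward(neighbor_uses_k: bool, n_users_k: int,
                   p_success: float, penalty: float = 1.0) -> float:
    """Illustrative reward for switching to channel k: the claim makes R
    depend on (i) whether any neighbor already uses k, (ii) how many
    neighbors use k, and (iii) the success probability on k."""
    if not neighbor_uses_k:
        return p_success                 # clean channel: reward success odds
    return p_success - penalty * n_users_k   # congested channel: penalize
```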
5. The method according to claim 2, wherein the asynchronous DRL model deployed in the computation layer includes a current network, a target network, an error calculation module, an experience pool, and a decision module deployed locally at each wireless node; the network structure of the local decision module is the same as that of the current network, and its parameters are periodically obtained from the edge node; wherein:
the target network fixes its network parameters and produces the target value function (image FDA0003344541680000027);
the current network is used to evaluate the policy, update the parameters, and approximate the value function;
the parameter θ of the current network is updated every time period, while the parameter θ⁻ of the target network is updated only every fixed number of time periods and remains unchanged in between;
the experiences e = <s, a, r, s'>, with s, s' ∈ S and a ∈ A, in the experience pool are collected asynchronously by the nodes in the network from the wireless multi-hop network environment;
the error calculation module updates the parameters of the current network through the TD error calculated from the target network and the current network; in addition, the parameters of the current network are copied to the target network at regular intervals.
6. The method of claim 5, wherein the calculation of the target value function (image FDA0003344541680000028) is given by the formula in image FDA0003344541680000029, wherein R(s_t, a_t) is the reward obtained in the t-th time period after node i, i ∈ {1, ..., N} (N being the number of nodes), performs action a_t ∈ A in the t-th time period; Q(s_{t+1}, a_{t+1}; θ⁻) represents the target network, i.e., the network with parameter θ⁻ in which node i performs action a_{t+1} in state s_{t+1} in the (t+1)-th time period; s_{t+1} is the state of node i in the (t+1)-th time period; a_{t+1} is the action performed by node i in the (t+1)-th time period; and the expression in image FDA0003344541680000031 denotes that, with network parameter θ⁻, node i selects the action a_{t+1} that maximizes the corresponding Q value in state s_{t+1}.
7. The method of claim 5, wherein the TD error between the current network Q(s_t, a_t; θ) and the target network Q(s_{t+1}, a_{t+1}; θ⁻), calculated by the error calculation module, is expressed by the formula in image FDA0003344541680000032, and the neural network parameters are updated by gradient descent according to the formula in image FDA0003344541680000033, wherein L(θ) (image FDA0003344541680000034) is the TD error function of the model; the symbol in image FDA0003344541680000035 denotes the expectation; θ is the network parameter updated in real time; α is the learning rate; the expression in image FDA0003344541680000036 is the gradient of L(θ); and Q(s_t, a_t; θ) represents the current network, i.e., the network with parameter θ in which node i, i ∈ {1, ..., N}, performs action a_t in state s_t in the t-th time period.
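Claims 5 to 7 together describe a DQN-style training loop: a current network updated every period, a frozen target network synchronized periodically, a shared experience pool filled asynchronously, and a TD-error update. A minimal tabular stand-in can sketch the data flow; it is illustrative only, using a value table instead of the claimed neural networks, and all hyperparameter values are assumptions.

```python
import random
from collections import deque

class AsyncDQLSketch:
    """Tabular stand-in: `current` is updated every step, `target` is
    frozen and overwritten from `current` every `sync_every` steps, and
    `pool` holds experiences e = <s, a, r, s'> gathered asynchronously."""

    def __init__(self, gamma=0.9, alpha=0.5, sync_every=10):
        self.current, self.target = {}, {}
        self.pool = deque(maxlen=1000)
        self.gamma, self.alpha, self.sync_every = gamma, alpha, sync_every
        self.step = 0

    def q(self, table, s, a):
        return table.get((s, a), 0.0)

    def train_once(self, actions):
        s, a, r, s2 = random.choice(self.pool)
        # TD target from the frozen target table (max over next actions)
        y = r + self.gamma * max(self.q(self.target, s2, a2) for a2 in actions)
        td = y - self.q(self.current, s, a)          # TD error
        self.current[(s, a)] = self.q(self.current, s, a) + self.alpha * td
        self.step += 1
        if self.step % self.sync_every == 0:         # periodic parameter copy
            self.target = dict(self.current)

# Toy run: one repeated experience (s=0, a=1, r=1, s'=0) drives Q upward.
agent = AsyncDQLSketch()
agent.pool.append((0, 1, 1.0, 0))
for _ in range(20):
    agent.train_once(actions=[1])
```

Freezing the target table between synchronizations is what stabilizes the TD target, mirroring the role of θ⁻ in the claims.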
8. The method of claim 2, wherein the whole system time is divided into a plurality of consecutive superframes, one superframe being one time period; each superframe comprises a beacon frame, a control period and a data transmission period; the control period uses a fixed control channel to transmit the related control information and channel allocation decisions, while the data transmission period adopts K non-overlapping channels to support interference-free parallel data transmission; in the control period, all nodes in the network switch to the control channel to monitor and send the related control information; in the data transmission period, a node with data to send switches to the channel of its parent node and transmits data based on the channel access mechanism.
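The superframe layout can be sketched as a slot list; the slot counts and labels below are illustrative assumptions, since the claim fixes only the ordering beacon → control period → data period.

```python
def superframe_schedule(beacon_slots=1, control_slots=4, data_slots=20):
    """One superframe: a beacon, then a control period on the fixed
    control channel, then a data period during which each sender hops
    to its parent's channel (one of K non-overlapping data channels)."""
    return (["beacon"] * beacon_slots
            + ["control:fixed-channel"] * control_slots
            + ["data:parent-channel"] * data_slots)

frame = superframe_schedule()
```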
9. The method according to claim 2, wherein the node employs a channel access mechanism based on RTS/DCTS in performing action a, and the method comprises:
if node d is located at the m-th hop and node i at the adjacent (m+1)-th hop, node d is the parent node of node i; if node e is located at the m-th hop and node j at the adjacent (m+1)-th hop, node e is the parent node of node j; all four nodes operate on the same channel, and the backoff values of node i and node j are 0;
when the node i sends an RTS frame to the node d, the node d waits for a CIFS time and returns a CTS frame;
after receiving the RTS frame of node i or the CTS frame of node d, the child nodes of node d set a corresponding NAV based on the information in the Duration field;
when node e receives an RTS frame from node i, it waits for a SIFS and returns a CTS frame to inform its child nodes to delay data transmission during the transmission of node i;
wherein RTS refers to request-to-send; CTS refers to clear-to-send; CIFS is the interframe space used by the destination node before returning a CTS; SIFS is the short interframe space separating frames belonging to one dialog; and CIFS is slightly larger than SIFS.
10. The method according to claim 9, wherein, if node j is located within the communication range of node i while its parent node is not, then after node j receives the RTS frame it waits for a RIFS and then sends an RTS frame to parent node e.
CN202111318928.0A 2021-11-09 2021-11-09 Distributed channel allocation method in wireless multi-hop network Active CN114024639B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111318928.0A CN114024639B (en) 2021-11-09 2021-11-09 Distributed channel allocation method in wireless multi-hop network


Publications (2)

Publication Number Publication Date
CN114024639A true CN114024639A (en) 2022-02-08
CN114024639B CN114024639B (en) 2024-01-05

Family

ID=80062994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111318928.0A Active CN114024639B (en) 2021-11-09 2021-11-09 Distributed channel allocation method in wireless multi-hop network

Country Status (1)

Country Link
CN (1) CN114024639B (en)


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090298526A1 (en) * 2005-09-28 2009-12-03 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Signaling method for decentralized allocation of online transmission power in a wireless network
CN101951612A (en) * 2010-09-01 2011-01-19 南京航空航天大学 DCF protocol fairness guarantee method suitable for multi-hop ad hoc network
US20110038358A1 (en) * 2009-08-14 2011-02-17 Li-Chun Wang Apparatus And Method For Neighbor-Aware Concurrent Transmission Media Access Control Protocol
CN103415018A (en) * 2013-08-23 2013-11-27 山东省计算中心 Communication resource allocation method of wireless sensor network
US20140286156A1 (en) * 2013-03-15 2014-09-25 Sanjai Kohli Distribution Node and Client Node for Next Generation Data Network
CN105245608A (en) * 2015-10-23 2016-01-13 同济大学 Telematics network node screening and accessibility routing construction method based on self-encoding network
US20200351839A1 (en) * 2019-05-01 2020-11-05 Qualcomm Incorporated Dynamic physical downlink control channel (pdcch) resource sharing between pdcch monitoring and pdcch transmission in a multi-hop network
US20200359296A1 (en) * 2018-02-07 2020-11-12 Hochschule Anhalt Method of adaptive route selection in a node of a wireless mesh communication network corresponding apparatus for performing the method of adaptive route selection and corresponding computer program
CN112954736A (en) * 2019-12-10 2021-06-11 深圳先进技术研究院 Policy-based computation offload of wireless energy-carrying internet-of-things equipment
CN113613339A (en) * 2021-07-10 2021-11-05 西北农林科技大学 Channel access method of multi-priority wireless terminal based on deep reinforcement learning


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JING XU 等: "Monitoring Multi-Hop Multi-Channel Wireless Networks: Online Sniffer Channel Assignment", 《2016 IEEE 41ST CONFERENCE ON LOCAL COMPUTER NETWORKS (LCN)》 *
YANG LEI 等: "An Energy Efficient Multiple-Hop Routing Protocol for Wireless Sensor Networks", 《2008 FIRST INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKS AND INTELLIGENT SYSTEMS》 *
丁凯琪: "Research on multi-channel allocation and access technology for high-density wireless networks", Wanfang Dissertations *
尚凤军 et al.: "Research on distributed multi-hop routing algorithms in wireless sensor networks", Chinese Journal of Sensors and Actuators, vol. 25, no. 4
张震 et al.: "Research on channel allocation optimization algorithms for wireless multi-hop networks", Journal of Yichun University, vol. 42, no. 3

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115016263A (en) * 2022-05-27 2022-09-06 福州大学 DRL-based control logic design method under continuous microfluidic biochip
CN116054982A (en) * 2022-06-30 2023-05-02 荣耀终端有限公司 Data processing method and terminal
CN116054982B (en) * 2022-06-30 2023-11-14 荣耀终端有限公司 Data processing method and terminal

Also Published As

Publication number Publication date
CN114024639B (en) 2024-01-05

Similar Documents

Publication Publication Date Title
Tang et al. Survey on machine learning for intelligent end-to-end communication toward 6G: From network access, routing to traffic control and streaming adaption
Koushik et al. Intelligent spectrum management based on transfer actor-critic learning for rateless transmissions in cognitive radio networks
CN113543074B (en) Joint computing migration and resource allocation method based on vehicle-road cloud cooperation
CN114024639B (en) Distributed channel allocation method in wireless multi-hop network
US20080107069A1 (en) Joint Channel Assignment and Routing in Wireless Networks
CN110753319B (en) Heterogeneous service-oriented distributed resource allocation method and system in heterogeneous Internet of vehicles
CN112954651A (en) Low-delay high-reliability V2V resource allocation method based on deep reinforcement learning
Naveen Raj et al. A survey and performance evaluation of reinforcement learning based spectrum aware routing in cognitive radio ad hoc networks
Zhou et al. DRL-based low-latency content delivery for 6G massive vehicular IoT
Ashtari et al. Knowledge-defined networking: Applications, challenges and future work
CN111601398A (en) Ad hoc network medium access control method based on reinforcement learning
Balcı et al. Massive connectivity with machine learning for the Internet of Things
Wang et al. Energy-efficient and delay-guaranteed routing algorithm for software-defined wireless sensor networks: A cooperative deep reinforcement learning approach
Wang et al. Reliability optimization for channel resource allocation in multihop wireless network: A multigranularity deep reinforcement learning approach
Dai et al. Multi-objective intelligent handover in satellite-terrestrial integrated networks
Mazandarani et al. Self-sustaining multiple access with continual deep reinforcement learning for dynamic metaverse applications
Yang et al. Task-driven semantic-aware green cooperative transmission strategy for vehicular networks
CN113316156B (en) Intelligent coexistence method on unlicensed frequency band
Dinh et al. Deep reinforcement learning-based offloading for latency minimization in 3-tier v2x networks
Li et al. A Lightweight Transmission Parameter Selection Scheme Using Reinforcement Learning for LoRaWAN
CN115865833A (en) Power service access system and access method based on terminal resource scheduling
Tian et al. Deep reinforcement learning based resource allocation with heterogeneous QoS for cellular V2X
Zhao et al. Multi-agent deep reinforcement learning based resource management in heterogeneous V2X networks
Mondal et al. Station Grouping Mechanism using Machine Learning Approach for IEEE 802.11 ah
Lei et al. QoS-oriented media access control using reinforcement learning for next-generation WLANs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231212

Address after: No. 702, 703, 7th Floor, Building 7, No. 219 Tianhua Second Road, Chengdu High tech Zone, China (Sichuan) Pilot Free Trade Zone, Chengdu City, Sichuan Province, 610041

Applicant after: CHENGDU SKYSOFT INFO & TECH CO.,LTD.

Address before: 518000 1104, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Applicant before: Shenzhen Hongyue Information Technology Co.,Ltd.

Effective date of registration: 20231212

Address after: 518000 1104, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Hongyue Information Technology Co.,Ltd.

Address before: 400065 Chongwen Road, Nanshan Street, Nanan District, Chongqing

Applicant before: CHONGQING University OF POSTS AND TELECOMMUNICATIONS

GR01 Patent grant
GR01 Patent grant