CN114024639B - Distributed channel allocation method in wireless multi-hop network - Google Patents

Publication number
CN114024639B
Authority
CN
China
Prior art keywords
node
network
channel
hop
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111318928.0A
Other languages
Chinese (zh)
Other versions
CN114024639A (en)
Inventor
雷建军
尚凤军
王颖
刘捷
周盈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Skysoft Info & Tech Co ltd
Shenzhen Hongyue Information Technology Co ltd
Original Assignee
Chengdu Skysoft Info & Tech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Skysoft Info & Tech Co ltd
Priority to CN202111318928.0A
Publication of CN114024639A
Application granted
Publication of CN114024639B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/382 Monitoring; Testing of propagation channels for resource allocation, admission control or handover
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/391 Modelling the propagation channel
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention relates to the field of wireless network communication, and in particular to a distributed channel allocation method in a wireless multi-hop network. The method uses a physical architecture comprising at least a physical device layer, a computing layer and a network service layer. The physical device layer comprises n wireless nodes that are randomly deployed in the network to form a multi-hop wireless communication network; each node acts as an autonomous Agent and interacts with the uncertain network environment through a local decision module. The aggregation node of the computing layer aggregates, analyzes and processes the data collected by the other stations in the network. It has an edge computing function and can train an asynchronous deep reinforcement learning (DRL) model on the experience information that the nodes collect in a distributed manner. The multi-channel allocation problem is modeled as a partially observable Markov decision process (POMDP) problem, and channels are allocated with the trained asynchronous DRL model. The invention addresses the hidden terminal and exposed terminal problems in high-density multi-hop wireless networks and effectively avoids data collisions and wasted channel resources.

Description

Distributed channel allocation method in wireless multi-hop network
Technical Field
The invention relates to the field of wireless network communication, in particular to a distributed channel allocation method in a wireless multi-hop network.
Background
Multi-channel medium access control (MMAC) techniques allow communication links that would interfere with each other on a single channel to transmit data without interference on multiple orthogonal channels. MMAC effectively avoids single-channel interference and improves the throughput of the whole network, so it is regarded as a promising technology for relieving the current shortage of wireless channel resources. Although multi-channel communication has many advantages over single-channel communication, it also introduces new problems:
Channel allocation and negotiation: the most basic and important problem of multi-channel MAC communication is how to allocate channel resources reasonably so that the capacity of the whole network is maximized while every node can still communicate normally. In addition, before communicating, nodes must negotiate channel usage to ensure that the two communicating nodes operate on the same channel during data transmission.
Multi-channel broadcasting: wireless networks based on a single-channel model can easily implement broadcasting because every sensor node is on the same channel. In a multi-channel environment, however, the nodes are distributed over several channels, so when a node broadcasts, some nodes cannot receive the broadcast content. Broadcasting plays an important role in network applications, so how to implement it is another challenge for multi-channel communication.
Multi-hop hidden terminals and exposed terminals: as shown in fig. 1, a multi-hop hidden terminal is a node that is within the communication range of the receiving node but outside the communication range of the transmitting node. Because such nodes cannot hear the transmitting node, they may send data to the same receiving node at the same time, which causes data collisions. In high-density scenarios, the hidden terminal problem leads to unnecessary data collisions and greatly degrades network performance. A multi-hop exposed terminal is a node that is within the coverage of the transmitting node but outside the coverage of the receiving node; on hearing the transmitting node's transmission, the exposed terminal delays its own transmission. Exposed terminals therefore cause unnecessary waste of channel resources.
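The two definitions above can be illustrated with a minimal Python sketch. It assumes a simple unit-disc model in which two nodes hear each other whenever their distance is at most `comm_range`; the coordinates, the function names and the range model are hypothetical and are not taken from the patent.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y) positions."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def classify_terminal(node, sender, receiver, comm_range):
    """Classify a third node relative to an ongoing sender -> receiver link.

    Assumption (not from the source): a unit-disc radio model with one
    shared communication range for every node.
    """
    hears_sender = distance(node, sender) <= comm_range
    hears_receiver = distance(node, receiver) <= comm_range
    if hears_receiver and not hears_sender:
        # Inside the receiver's range, outside the sender's range.
        return "hidden"
    if hears_sender and not hears_receiver:
        # Inside the sender's range, outside the receiver's range.
        return "exposed"
    return "neither"
```

With a sender at (0, 0), a receiver at (10, 0) and a range of 12, a node at (20, 0) is classified as hidden and a node at (-10, 0) as exposed.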
Disclosure of Invention
In order to effectively reduce interference and data collisions in the network, improve channel utilization and system throughput, and ensure reliable data service transmission between nodes, the invention provides a distributed channel allocation method in a wireless multi-hop network. The method adopts a physical architecture comprising at least a physical device layer, a computing layer and a network service layer. The physical device layer comprises n wireless nodes that are randomly deployed in the network to form a multi-hop wireless communication network; each node acts as an autonomous Agent and interacts with the uncertain network environment through a local decision module. The aggregation node of the computing layer aggregates, analyzes and processes the data collected by the other stations in the network. This node either has an edge computing function itself or uses a dedicated edge server node, so that the computing tasks of the nodes can be offloaded to it and an asynchronous DRL model can be trained on the experience information that the nodes collect in a distributed manner. The multi-channel allocation problem is modeled as a POMDP problem, and distributed channel allocation is performed with the asynchronous DRL model trained by the centralized node or the edge server.
Further, the multi-channel allocation problem is modeled as a POMDP problem: in a time period t, an Agent observes the current network state s and performs an action a; after performing action a, the Agent transitions to a network state s' in the next time period with state transition probability P and obtains a corresponding reward R from the environment. The POMDP problem is expressed as:
$$M = \langle S, A, P, R, \gamma \rangle$$
where M denotes the POMDP problem model; S is the state set, representing the state space; A is the action set, representing the action space, where an action $a \in A$ is the number of the channel to which the node switches; P is the state transition probability; R is the reward function; and $\gamma$ is the discount factor. That is, given an environment state $s \in S$, when an Agent performs an action $a \in A$, the environment state migrates from s to s' ($s \to s'$), and the Agent obtains a corresponding reward R from the environment.
Further, the environment state observed by node i in the t-th time period is expressed as:
where the channel-occupancy component describes how the neighbor nodes of node i occupy each wireless channel, that is, the potential interference level of each channel; K is the number of available channels and N is the number of nodes; $S_{i,t,j}$ represents the occupancy of channel j by the neighbor nodes of node i during the t-th time period, with $S_{i,t,j} = 1$ indicating that some neighbor node of node i uses channel j and $S_{i,t,j} = 0$ indicating that no neighbor node of node i uses channel j; and $n_{i,o}$ is the total number of neighbor nodes of node i.
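As an illustration of the per-channel occupancy indicator described above, the following Python sketch builds the K binary entries from a map of neighbor channels. It covers only this indicator part, not the full state vector; the function name and the dictionary input format are hypothetical, and the 1/0 encoding is an assumption about how the indicator is represented.

```python
def occupancy_vector(neighbor_channels, num_channels):
    """Return [S_1, ..., S_K] for one node and one time period.

    neighbor_channels: dict mapping a neighbor id to the channel number
    (1..K) that neighbor currently uses (hypothetical input format).
    S_j is 1 if at least one neighbor uses channel j, otherwise 0.
    """
    used = set(neighbor_channels.values())
    return [1 if j in used else 0 for j in range(1, num_channels + 1)]


def neighbor_count(neighbor_channels):
    """Total number of neighbor nodes of the observing node."""
    return len(neighbor_channels)
```

For example, with neighbors on channels 1, 3 and 3 and K = 4, the indicator vector is `[1, 0, 1, 0]` and the neighbor count is 3.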
Further, the reward R that the node obtains from the environment after performing action a and transitioning from state s to the next state s' can be expressed as:
where R(s, a) is the reward obtained after node i switches to channel k in the t-th data period, i.e., R = R(s, a); the indicator term states whether any neighbor node of node i uses channel k in the current period, taking one value if no neighbor node of node i uses channel k and the other value otherwise; and the probability term is the probability of successful transmission of the neighbors of node i in time period t.
Further, the asynchronous DRL model comprises a current network, a target network, an error calculation module and an experience pool, which are deployed in the computing layer, and a decision module deployed in each wireless node. The network structure of the local decision module is the same as that of the current network, and its parameters are periodically obtained from the edge node; wherein:
the target network keeps its network parameters fixed and produces the target value function; the current network is used to evaluate the policy, update the parameters and approximate the value function;
the parameter $\theta$ of the current network is updated every time period; the parameter $\theta^-$ of the target network is updated once every fixed number of time periods and is kept unchanged in between;
each experience $e = \langle s, a, r, s' \rangle$ in the experience pool, with $s, s' \in S$ and $a \in A$, is collected asynchronously by the nodes in the network from the wireless multi-hop network environment;
the error calculation module updates the parameters of the current network using the TD error computed from the target network and the current network; in addition, the parameters of the current network are copied to the target network at regular intervals.
Further, the target value function, written here as $y_t$, is calculated as:
$$y_t = R(s_t, a_t) + \gamma \max_{a_{t+1} \in A} Q(s_{t+1}, a_{t+1}; \theta^-)$$
where $R(s_t, a_t)$ is the reward that node $i \in [1, N]$ (N is the number of nodes) obtains in the t-th time period after executing action $a_t \in A$ in state $s_t \in S$; $Q(s_{t+1}, a_{t+1}; \theta^-)$, with $s_{t+1} \in S$ and $a_{t+1} \in A$, is the output of the target network with parameter $\theta^-$ when node i executes action $a_{t+1}$ in state $s_{t+1}$ in the (t+1)-th time period; $s_{t+1}$ is the state of node i in the (t+1)-th time period; $a_{t+1}$ is the action performed by node i in the (t+1)-th time period; and $\max_{a_{t+1} \in A} Q(s_{t+1}, a_{t+1}; \theta^-)$ means that node i, based on the target network (parameter $\theta^-$), selects the action $a_{t+1}$ in state $s_{t+1}$ that maximizes the corresponding Q value.
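The target value described above has the form of the standard DQN target: the immediate reward plus the discounted maximum Q value of the target network in the next state. A minimal sketch follows, assuming the target network's Q values for the next state are already available as an array; the function name and argument layout are hypothetical.

```python
import numpy as np


def td_target(reward, next_q_values, gamma):
    """Compute reward + gamma * max over a' of Q(s_next, a'; theta_minus).

    next_q_values: one Q value per channel (action), as produced by the
    target network for the next state (assumed to be precomputed).
    """
    return reward + gamma * float(np.max(next_q_values))
```

For example, a reward of 1.0, next-state Q values of 0.2, 0.5 and 0.1, and a discount factor of 0.9 give a target of 1.0 + 0.9 × 0.5 = 1.45.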
Further, the error calculation module calculates the error between the current network output $Q(s_t, a_t; \theta)$ and the target value $y_t$ (the target value function computed from the target network):
$$L(\theta) = \mathbb{E}\left[\left(y_t - Q(s_t, a_t; \theta)\right)^2\right]$$
Gradient descent is used to update the neural network parameters:
$$\theta \leftarrow \theta - \alpha \nabla_\theta L(\theta)$$
where $L(\theta)$ is the TD error function of the model; $\mathbb{E}[\cdot]$ denotes the expectation over the selected mini-batch of experience data; $\theta$ is the parameter of the current network, updated in real time; $\alpha$ is the learning rate; $\nabla_\theta L(\theta)$ is the corresponding gradient; and $Q(s_t, a_t; \theta)$ is the output of the network with parameter $\theta$ when node i executes action $a_t$ in state $s_t$ in the t-th time period.
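To make the error calculation and the gradient-descent update concrete, the sketch below performs one update step on a mini-batch. It swaps the deep network for a linear Q function, Q(s, a; θ) = θ[a] · s, purely to keep the example short and self-contained; that substitution, the function names and the batch format are assumptions and not the patent's model.

```python
import numpy as np


def q_values(theta, state):
    """Linear stand-in for the Q network: one value per action (channel)."""
    return theta @ state


def td_step(theta, theta_target, batch, gamma, alpha):
    """One gradient-descent step on the mean squared TD error.

    batch: list of (state, action, reward, next_state) experiences.
    theta_target holds the frozen target-network parameters.
    Returns the updated current-network parameters and the batch loss.
    """
    grad = np.zeros_like(theta)
    loss = 0.0
    for s, a, r, s_next in batch:
        y = r + gamma * np.max(q_values(theta_target, s_next))  # target value
        err = y - q_values(theta, s)[a]                         # TD error
        loss += err ** 2
        grad[a] += -2.0 * err * s       # gradient of err**2 w.r.t. theta[a]
    n = len(batch)
    return theta - alpha * grad / n, loss / n
```

Repeating `td_step` on the same batch while keeping `theta_target` fixed should reduce the returned loss, which is the behavior the frozen target network is meant to stabilize.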
Further, the whole system time is divided into a plurality of consecutive superframes, one superframe being one time period. Each superframe comprises a beacon frame, a control period and a data transmission period. The control period uses a fixed control channel to transmit the related control information and the channel allocation decisions; the data transmission period uses K non-overlapping channels to support interference-free parallel data transmission. In the control period, all nodes in the network switch to the control channel to listen for and send the related control information; in the data transmission period, a node that has data to send switches to the channel of its parent node and transmits the data based on a channel access mechanism.
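The superframe timing described above can be sketched as a small Python data structure. The durations, the class and function names, and the choice to stay on the control channel during the beacon are all hypothetical; the source only states that the control period uses a fixed control channel and that the data period uses the parent node's channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Superframe:
    """One time period: beacon, then control period, then data period."""
    beacon: int    # duration in arbitrary time units (assumed values)
    control: int
    data: int

    @property
    def length(self):
        return self.beacon + self.control + self.data

    def phase_at(self, t):
        """Return the phase that contains absolute time t."""
        offset = t % self.length
        if offset < self.beacon:
            return "beacon"
        if offset < self.beacon + self.control:
            return "control"
        return "data"


def channel_for(phase, control_channel, parent_channel):
    """Pick the channel a node with pending data tunes to in a given phase.

    Assumption: the beacon is also received on the control channel.
    """
    return parent_channel if phase == "data" else control_channel
```

With `Superframe(beacon=1, control=4, data=15)`, time 3 falls in the control period and time 5 in the data period, after which the pattern repeats every 20 units.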
Further, while executing action a, the node adopts a channel access mechanism based on RTS/DCTS, which comprises the following steps:
suppose node d is located at the m-th hop and its next-hop node at the (m+1)-th hop is node i, i.e., node d is the parent node of node i; suppose node e is located at the m-th hop and its next-hop node at the (m+1)-th hop is node j, i.e., node e is the parent node of node j; the four nodes all operate on the same channel, and the backoff values of node i and node j are 0;
when node i sends an RTS frame to node d, node d waits for one CIFS and returns a CTS frame;
after receiving the RTS frame of node i or the CTS frame of node d, each child node of node d sets a corresponding NAV based on the information in the Duration field;
when node e receives the RTS frame from node i, it waits for one SIFS and returns a CTS frame to inform its child nodes that they must delay data transmission while node i is transmitting;
where RTS means request to send; CTS means clear to send; CIFS is the interframe space after which the destination node returns a CTS; SIFS (short interframe space) is the interval that separates frames belonging to one session; and CIFS is slightly larger than SIFS.
Further, if node j is within the communication range of node i but its parent node is not, then after receiving the RTS frame node j waits for one RIFS and then sends an RTS frame to its parent node e.
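The reactions to an overheard RTS frame described above can be summarized as a small decision function. This is a simplified sketch: the enum, the function signature and the boolean flags are hypothetical, the `DEFER` branch reflects the statement that a parent's CTS tells its children to delay, and timing, backoff and the NAV duration itself are not modeled.

```python
from enum import Enum, auto


class Reaction(Enum):
    REPLY_CTS_AFTER_CIFS = auto()  # destination parent answers the RTS
    REPLY_CTS_AFTER_SIFS = auto()  # neighboring parent warns its children
    SET_NAV = auto()               # sibling of the sender defers via NAV
    DEFER = auto()                 # own parent sent a CTS, so hold data
    SEND_RTS_AFTER_RIFS = auto()   # exposed node may start its own exchange


def react_to_rts(node_id, own_parent, rts_destination, is_parent, parent_cts_heard):
    """Decide how a node reacts to an RTS frame it has overheard.

    is_parent: the node has child nodes of its own (hypothetical flag).
    parent_cts_heard: the node heard a CTS from its own parent by the
    time the RIFS wait ends (hypothetical flag).
    """
    if node_id == rts_destination:
        return Reaction.REPLY_CTS_AFTER_CIFS
    if is_parent:
        return Reaction.REPLY_CTS_AFTER_SIFS
    if own_parent == rts_destination:
        return Reaction.SET_NAV
    if parent_cts_heard:
        return Reaction.DEFER
    return Reaction.SEND_RTS_AFTER_RIFS
```

In the scenario of the text, node d answers after CIFS, node e answers after SIFS, another child of d sets its NAV, and node j either defers (if it hears a CTS from e) or sends its own RTS to e after RIFS.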
The invention addresses the hidden terminal and exposed terminal problems in high-density multi-hop wireless networks and effectively avoids data collisions and wasted channel resources, thereby improving overall network performance. In addition, an asynchronous DRL model is provided for wireless multi-hop multi-channel networks; it dynamically optimizes each node's channel allocation strategy based on the node's channel access performance and the channel occupancy during the data transmission period. A novel wireless network architecture based on mobile edge computing (MEC) is provided to relieve the computing and storage pressure on terminal nodes, and a framework of distributed interaction (micro learning) and centralized training (macro learning) is designed to train the asynchronous DRL model. The asynchronous DRL model proposed by the invention can therefore be used even on resource-constrained terminals. Furthermore, the invention takes into account the non-stationarity problem in multi-agent systems (MAS) and uses only local neighbor information, which avoids drastic dynamic changes in the network and accelerates network convergence.
Drawings
Fig. 1 is an exemplary diagram of hidden and exposed terminals in multiple channels provided in the prior art;
Fig. 2 is a diagram of the edge-computing-enabled system architecture provided by an embodiment of the present invention;
Fig. 3 is a diagram of the superframe structure used in the present invention;
Fig. 4 shows the asynchronous DRL model based on a distributed decision architecture in the present invention;
Fig. 5 shows the centralized training flow of the asynchronous DRL model according to an embodiment of the present invention;
Fig. 6 is the first schematic diagram of RTS/DCTS operation provided by an embodiment of the present invention;
Fig. 7 is the second schematic diagram of RTS/DCTS operation provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
The invention provides a distributed channel allocation method in a wireless multi-hop network. The method adopts a physical architecture comprising at least a physical device layer, a computing layer and a network service layer. The physical device layer comprises n wireless nodes that are randomly deployed in the network to form a multi-hop wireless communication network; each node acts as an autonomous Agent and interacts with the uncertain network environment through a local decision module. The aggregation node of the computing layer aggregates, analyzes and processes the data collected by the other stations in the network. This node has an edge computing function, so the computing tasks of the nodes can be offloaded to it and an asynchronous DRL model can be trained on the experience information that the nodes collect in a distributed manner. The multi-channel allocation problem is modeled as a POMDP problem, and channels are allocated with the trained asynchronous DRL model.
Example 1
This embodiment provides a system architecture, shown in fig. 2, that comprises a physical device layer, a computing layer and a network service layer. The physical device layer is a multi-hop wireless communication network formed by n wireless nodes randomly deployed in the network; each node acts as an autonomous Agent and interacts with the uncertain network environment through a local decision module. The aggregation node of the computing layer aggregates, analyzes and processes the data collected by the other stations in the network; it has an edge computing function, can take over computing tasks offloaded by the nodes, and can train an asynchronous DRL model on the experience information that the nodes collect in a distributed manner.
For data transmission, this embodiment uses the superframe structure shown in fig. 3: the system time is divided into a plurality of consecutive superframes, and each superframe comprises a beacon frame, a control period and a data transmission period. The control period uses a fixed control channel to transmit the related control information and the channel allocation decisions; the data transmission period uses K non-overlapping channels to support interference-free parallel data transmission. Thus, during the control period, all nodes in the network switch to the control channel to listen for and send the related control information (routing, time synchronization, channel switching, etc.); during the data transmission period, a node that has data to send switches to the channel of its parent node and transmits the data based on a channel access mechanism.
The asynchronous DRL model adopted in this embodiment is shown in fig. 4; DRL is used to solve the dynamic multi-channel allocation problem in the multi-hop wireless network. The embodiment combines the function approximation capability of DQN with the asynchronous experience sampling architecture of A3C to provide an asynchronous DRL model whose goal is to allocate channels to nodes reasonably so as to maximize the reliability of data transmission. The DRL model deployed on the edge server adopts a DQN architecture and introduces a DNN that extracts features from raw data to approximate the action-value function. It is combined with the asynchronous training framework of A3C to address the problem that DQN is not suitable for high-dimensional action spaces and MAS. This breaks the correlation between experiences, significantly improves the convergence speed of the network, and overcomes the problem that the A3C algorithm cannot be run on resource-limited wireless nodes.
This embodiment takes into account that, in some scenarios, the limited computing power, energy and storage capacity of wireless nodes cause computing bottlenecks and low performance, which limits support for advanced applications and for computationally intensive tasks such as training the DRL model. The embodiment therefore adopts an edge-computing-enabled wireless network architecture and transfers the task of training the asynchronous DRL model from the nodes to the resource-rich edge node (sink node). As shown in fig. 2, the asynchronous DRL model deployed at the computing layer consists of a current network (main), a target network (target) and an experience pool (experience replay). The edge-computing-enabled sink node thus performs the training and updating tasks of the model.
When the asynchronous DRL model is used for channel allocation, the method combines the function approximation capability of DQN with the asynchronous interaction architecture of A3C. The distributed interaction module (micro learning) of the asynchronous DRL model shown in fig. 4 allows each terminal node to select channel resources asynchronously using its local observations. The centralized training module (macro learning) trains the asynchronous DRL model by adjusting the operating parameters, steering the system towards an application-specific global optimization objective (e.g., maximizing the reliability of data transmission). Each terminal node maintains a DRL prediction model so that it can allocate channels independently. Specifically, the embodiment models the multi-channel allocation problem as a POMDP problem defined by a five-tuple $M = \langle S, A, P, R, \gamma \rangle$: state space S, action space A, state transition probability P, reward function R and discount factor $\gamma$. In each control cycle at time step t, the Agent observes the current network state s and performs an action a. It then transitions to the next state with the state transition probability and obtains a reward $R_{t+1}$ from the environment.
State space: $S = \{S_1, S_2, \ldots, S_{2K+N}\}$, where K is the number of available channels and N is the number of nodes. For a particular node i, the state vector in the t-th cycle is:
where $S_{i,t,j}$ represents the occupancy of channel j by the neighbor nodes of node i: $S_{i,t,j} = 1$ indicates that a neighbor of node i occupies channel j; otherwise, $S_{i,t,j} = 0$. $n_{i,o}$ is the total number of neighbor nodes of node i.
Action space: $A = \{a_1, a_2, \ldots, a_K\}$, $a_k \in A$, where $a_k$ indicates the channel number to which node i switches in the next data transmission period: $a_k = ch_{i,t,k}$, $ch_{i,t,k} = k \in [1, K]$.
Reward function R: when node i, in the t-th data period and in its locally observed state, executes an action and switches to channel $ch_{i,t,k}$, the environment returns an immediate reward value R = R(s, a) to the node at the end of the data transmission period, which can be computed by the following function:
where the first case applies when, in the current data period, no neighbor node of node i uses channel $ch_{i,t,k}$, and the second case applies otherwise; the neighbor-count term is the number of neighbor nodes of node i that use channel $ch_{i,t,k} = k$; and the probability term is the successful transmission probability of data transmission on channel $ch_{i,t,k}$.
The edge-computing-enabled aggregation node trains the DRL model centrally on the experience information that the nodes in the network collect in a distributed, asynchronous manner, and it sends the updated network model parameters to the nodes; each node can obtain the latest network parameters from its parent node.
The centralized training process of the DRL model is shown in fig. 5. The asynchronous DRL model contains two networks with identical structures but different parameters: the current value network predicts the Q estimate and uses the latest parameters, whereas the target value network predicts the Q target and uses earlier, older parameters. In this embodiment, the state of the node is the input of the neural network and each action the node can perform is treated as a class; the neural network predicts a value for each action, described in this embodiment as the probability of performing that action, and outputs it as the Q value. For example, Q(s, a; θ) denotes the value for performing action a when the node state s is the input and the neural network parameter is θ.
During model training, a mini-batch of experiences is drawn at random from the experience pool to break the correlation between experiences. In addition, because the experience information in the pool is supplied asynchronously by the agents, the correlation between experiences is broken further and a richer set of experiences is available.
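A minimal sketch of such an experience pool with random mini-batch sampling is shown below. The fixed capacity, the class and method names and the optional seed are assumptions for illustration; the source does not specify the pool size or how old experiences are discarded.

```python
import random
from collections import deque


class ExperiencePool:
    """Bounded pool of (state, action, reward, next_state) experiences."""

    def __init__(self, capacity, seed=None):
        # Assumption: once the pool is full, the oldest experience is dropped.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        """Store one experience reported asynchronously by a node."""
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw a random mini-batch without replacement."""
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)
```

Nodes would call `add` whenever they finish a data period, and the training node would call `sample` once per training iteration.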
As can be seen from fig. 5, the $\langle s, a \rangle$ information is used as the input of the current value network to obtain $Q(s, a; \theta)$, which evaluates the current state-action value function; the s' information is used as the input of the target value network to obtain the corresponding $\max_{a'} Q(s', a'; \theta^-)$. The target value y is then calculated as:
$$y = R(s, a) + \gamma \max_{a'} Q(s', a'; \theta^-)$$
Based on this target value, the DQN error function module calculates the error value:
$$L(\theta) = \mathbb{E}\left[\left(y - Q(s, a; \theta)\right)^2\right]$$
The current network updates the parameters of the current value network based on the gradient of the error function:
$$\theta \leftarrow \theta - \alpha \nabla_\theta L(\theta)$$
where $s \in S$ and $a \in A$. Each time a certain number of iterations has been performed, the parameters of the current value network are copied to the target value network:
$$\theta^- \leftarrow \theta$$
The above process is repeated until the network reaches a stable state.
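The periodic copy of the current-network parameters to the target network can be sketched as follows. The class name, the `sync_every` parameter and the use of plain Python lists for the parameters are hypothetical; the source only states that the copy happens after a certain number of iterations.

```python
import copy


class TwoNetworkTrainer:
    """Tracks current parameters and a periodically refreshed target copy."""

    def __init__(self, theta, sync_every):
        self.theta = copy.deepcopy(theta)          # current network parameters
        self.theta_target = copy.deepcopy(theta)   # target network parameters
        self.sync_every = sync_every               # assumed sync interval
        self.iterations = 0

    def step(self, new_theta):
        """Record one training iteration and sync the target when due."""
        self.theta = new_theta
        self.iterations += 1
        if self.iterations % self.sync_every == 0:
            # Copy the current parameters into the target network.
            self.theta_target = copy.deepcopy(self.theta)
```

Between two synchronizations the target parameters stay frozen, so the target values used in the error function do not move while the current network is being updated.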
Although the asynchronous DRL based channel allocation model improves network performance by enabling multiple parallel data transmissions, the hidden terminal and exposed terminal problems on a specific channel become more severe in highly dense wireless multi-hop network scenarios. Fig. 1 illustrates the hidden terminal and exposed terminal problems in a wireless multi-hop network. When node D is transmitting data to node C, node B is outside the communication range of node D and therefore mistakenly regards the channel as idle; if node B then sends data to node C or node A, a data collision occurs at node C, which causes unnecessary data retransmissions and further aggravates network congestion. Further, when node B1 transmits data to node A1, node B2 is within the communication range of node B1, while nodes B2 and A2 are not within the communication range of nodes A1 and B1, respectively; node B2 nevertheless mistakenly regards the channel as busy and delays its data transmission, which wastes channel resources unnecessarily. The embodiment of the invention therefore proposes an RTS/DCTS mechanism to solve the hidden terminal and exposed terminal problems in the wireless multi-hop network. The RTS/DCTS mechanism is further described below by way of example.
Fig. 6 is a schematic diagram of solving the hidden terminal problem in a wireless multi-hop network based on RTS/DCTS according to a preferred embodiment of the present invention. Nodes i and j and nodes d and e are located at hops m and m+1, respectively (two different, adjacent hop counts), and operate on the same channel. Node d is the parent of node i and node e is the parent of node j. Node e is also a neighbor node of node i. Let the backoff values of nodes i and j be 0 at this time.
When node i sends an RTS frame to node d, node d waits for one CIFS and returns a CTS frame;
after receiving the RTS frame of node i or the CTS frame of node d, each child node of node d sets a corresponding NAV based on the information in the Duration field;
when node e receives the RTS frame from node i, it waits for one SIFS and returns a CTS frame to inform its child nodes that they must delay data transmission while node i is transmitting, thereby avoiding the hidden terminal problem.
In a channel access mechanism in a multi-hop environment, hidden terminal problems are unavoidable, so the probability $P_s$ of successful transmission of node i on a specific channel k can be calculated using the following formula:
where $\tau$ is the probability of transmission in a channel access slot; $n_s$ is the total number of child nodes of the node's parent node; $n_a$ is the number of neighbor nodes of node i; and $n_f$ is the number of neighbor nodes of the parent node of node i (not including the parent node's child nodes).
Referring to fig. 7, fig. 7 is a schematic diagram illustrating an example of the problem of the exposed terminal in the wireless multi-hop network based on RTS/DCTS according to the preferred embodiment of the present invention. Nodes i and j, nodes d and e are located in m and m+1 hops (referring to different and adjacent hop counts), respectively, and operate on the same channel. Node d is the parent of node i and node e is the parent of node j. Node j is also a neighbor node to node i. Let the backoff values of nodes i and j be 0 at this time.
When the node i sends RTS to the node d, the node d waits for a CIFS time and returns a CTS frame; because node j is located within the communication range of node i. Therefore, node j will also receive the RTS frame, but since the destination node of the RTS frame is not the destination node of node j, node j will not set a NAV according to the Duration field information of the RTS;
after node j receives the RTS frame and waits for a RIFS, judging whether a CTS frame is received or not; since its parent node e is not within communication range of node i, node e does not return a CTS after SIFS; therefore, node j does not receive a CTS frame after RIFS; node j sends an RTS frame to parent node e;
When the nodes in the network follow this process, the data collisions and wasted channel resources caused by hidden terminals and exposed terminals can be effectively avoided. The probability of successful transmission can therefore be rewritten as:
Based on the RTS/DCTS mechanism, data collisions between data links under adjacent parent nodes on the same channel can be effectively avoided through SIFS and CTS. In addition, the channel access mechanism introduces the RIFS interframe space to solve the exposed terminal problem in the network. The channel access mechanism therefore improves the probability of successful transmission of the nodes in the network.
Furthermore, the above formula shows that $P_s$ is directly related to the parameters $n_s$, $n_a$ and $n_f$, and these parameters can be further optimized by optimizing the channel allocation strategy. Therefore, the embodiment of the invention uses the successful transmission probability of the node on the channel as part of the reward function of the channel allocation model, with the aim of further optimizing network performance.
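The reactions of the different nodes to an overheard RTS frame, as described in the preceding paragraphs, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the names `Reaction`, `can_hear` and `react_to_rts`, the dictionaries `parent_of` and `hop_of`, and the representation of radio range as a set of links are all hypothetical, and the sketch assumes that every listener has data to send and a backoff value of 0.

```python
from enum import Enum


class Reaction(Enum):
    REPLY_CTS_AFTER_CIFS = "destination parent: wait one CIFS, return CTS"
    REPLY_CTS_AFTER_SIFS = "other parent in the same hop: wait one SIFS, return CTS"
    SET_NAV = "defer: set NAV from the Duration field"
    SEND_OWN_RTS = "exposed terminal: no CTS within RIFS, send own RTS"
    IGNORE = "nothing heard"


def can_hear(links, a, b):
    """Radio range is modeled as a set of unordered node pairs (an assumption)."""
    return frozenset((a, b)) in links


def react_to_rts(listener, sender, parent_of, hop_of, links):
    """Decide how `listener` reacts when `sender` transmits an RTS to its parent."""
    dest = parent_of[sender]
    hears_rts = can_hear(links, listener, sender)

    if listener == dest:
        # The destination parent answers after one CIFS.
        return Reaction.REPLY_CTS_AFTER_CIFS

    if hop_of[listener] == hop_of[dest]:
        # Another parent (such as node e) that overhears the RTS answers after
        # one SIFS so that its own children defer.
        return Reaction.REPLY_CTS_AFTER_SIFS if hears_rts else Reaction.IGNORE

    own_parent = parent_of.get(listener)
    parent_sends_cts = own_parent is not None and (
        own_parent == dest or can_hear(links, own_parent, sender)
    )
    hears_cts = parent_sends_cts and can_hear(links, listener, own_parent)

    if hears_cts or (hears_rts and own_parent == dest):
        # Children of a parent that returned a CTS, or siblings of the sender
        # that heard the RTS, defer for the announced duration.
        return Reaction.SET_NAV
    if hears_rts:
        # Node j in Fig. 7: it hears the RTS, but its parent is out of range of
        # the sender, so no CTS arrives within one RIFS.
        return Reaction.SEND_OWN_RTS
    return Reaction.IGNORE


# Fig. 7 topology: d is the parent of i, e is the parent of j, and j hears i.
parent_of = {"i": "d", "j": "e"}
hop_of = {"d": 1, "e": 1, "i": 2, "j": 2}
links = {frozenset(p) for p in [("i", "d"), ("j", "e"), ("i", "j")]}
print(react_to_rts("j", "i", parent_of, hop_of, links).name)
```

Adding the link between node i and node e to `links` turns the example into the hidden terminal case: node e then returns a CTS after one SIFS, and node j sets its NAV instead of sending its own RTS.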
The channel allocation and channel access mechanisms provided by the embodiment of the invention optimize channel resources at different levels: channel allocation optimizes them in the frequency domain, and channel access optimizes them in the time domain. In addition, a reasonable channel allocation mechanism further alleviates interference during channel access, and the channel access performance of the nodes in turn helps to further optimize the channel allocation strategy.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A distributed channel allocation method in a wireless multi-hop network, characterized in that a physical architecture comprising at least a physical device layer, a computing layer and a network service layer is adopted; the physical device layer comprises n wireless nodes randomly deployed in the network to form a multi-hop wireless communication network; the multi-channel allocation problem is modeled as a POMDP problem, and an asynchronous DRL model is used to realize distributed channel allocation; each node acts as an autonomous Agent and interacts with the uncertain network environment through a local decision module; the aggregation node of the computing layer is responsible for aggregating, analyzing and processing the data collected by the other stations in the network; the aggregation node has an edge computing function, so that the computing tasks of the nodes can be offloaded to it and the asynchronous DRL model can be trained on the experience information collected by the nodes in a distributed manner; the wireless nodes periodically update the parameters of their local decision modules from the aggregation node; and the method specifically comprises the following steps:
the POMDP problem is described by a five-tuple M = <S, A, P, R, γ>, comprising the state S, the action A, the state transition probability P, the reward function R and the discount factor γ;
in the control period of each time step t, the Agent observes the current network state s and executes action a; it then transitions to the next state according to the state transition probability and obtains the reward $R_{t+1}$ from the environment;
the state space S […], where K is the number of available channels and N is the number of nodes; for a specific node i in the t-th period, its state vector is […];
wherein $S_{i,t,j}$ represents the occupancy of channel j by the neighbor nodes of node i: […] indicates that a neighbor node of node i occupies channel j, and otherwise $S_{i,t,j} = 0$; […] is the total number of neighbor nodes of node i;
the action space $A = \{a_1, a_2, \ldots, a_K\}$, $a_k \in A$, wherein $a_k = ch_{i,t,k}$ indicates the channel number to which node i is to switch in the next data transmission period, with $ch_{i,t,k} = k \in [1, K]$;
the reward function R: when node i, in the t-th data period, locally observes the state […] and executes the action […], switching to channel $ch_{i,t,k}$, the environment returns to the node, at the end of the data transmission period, an immediate reward value $r = r(s, a)$, which can be calculated by the following function:
wherein […] indicates that, in the current data period, no neighbor node of node i uses channel $ch_{i,t,k}$, and otherwise […]; […] is the number of neighbor nodes of node i that use channel $ch_{i,t,k} = k$; and […] is the probability of successful data transmission of the node on channel $ch_{i,t,k}$;
the aggregation node, empowered by edge computing, centrally trains the DRL model based on the experience information collected by each node in the network in a distributed and asynchronous manner, and sends the updated network model parameters to the nodes, wherein each node can obtain the latest network parameters from its parent node;
the states of the nodes are taken as the input of the neural network, the different actions executed by each node are taken as the categories of the node, and the probability of each action executed by the node is predicted by the neural network and taken as its output; namely, the <s, a> information is used as the input of the current value network to obtain $Q(s, a; \theta)$, which evaluates the current state-action value function; the s' information is used as the input of the target value network to obtain the corresponding $\max Q(s', a'; \theta^-)$; […] is calculated as follows:
based on the […] value, the error value is then calculated using the DQN error function:
the current network updates the parameters of the current value network based on the gradient of the error function:
wherein $s \in S$ and $a \in A$; every time a certain number of iterations has been performed, the parameters of the current value network are copied to the target value network:
$\theta^- \leftarrow \theta$
the above process is repeated until the network reaches a stable state.
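The training loop recited at the end of claim 1 (current value network, target value network, error function, gradient update and periodic parameter copy) can be illustrated with the following minimal sketch. It assumes the standard DQN forms, namely the target r + γ·max Q(s', a'; θ⁻) and a squared-error loss, because the exact formulas are not reproduced in this text. A linear function stands in for the neural network, and `q_values`, `dqn_step`, `train`, `gamma`, `lr` and `sync_every` are hypothetical names and parameters chosen for illustration.

```python
import copy


def q_values(theta, state):
    """Linear stand-in for the value network: one weight row per channel (action)."""
    return [sum(w * x for w, x in zip(row, state)) for row in theta]


def dqn_step(theta, theta_target, s, a, r, s_next, gamma=0.9, lr=0.1):
    """One update of the current value network; returns the squared error."""
    # Target computed with the target value network (parameters theta^-).
    target = r + gamma * max(q_values(theta_target, s_next))
    # Current estimate Q(s, a; theta) from the current value network.
    q_sa = q_values(theta, s)[a]
    td_error = target - q_sa
    loss = td_error ** 2
    # Gradient step on the squared error; the constant factor 2 is folded into lr.
    theta[a] = [w + lr * td_error * x for w, x in zip(theta[a], s)]
    return loss


def train(transitions, n_actions, n_features, sync_every=4, gamma=0.9, lr=0.1):
    """Run the updates and copy theta to theta^- every `sync_every` iterations."""
    theta = [[0.0] * n_features for _ in range(n_actions)]
    theta_target = copy.deepcopy(theta)
    losses = []
    for step, (s, a, r, s_next) in enumerate(transitions, start=1):
        losses.append(dqn_step(theta, theta_target, s, a, r, s_next, gamma, lr))
        if step % sync_every == 0:
            theta_target = copy.deepcopy(theta)  # theta^- <- theta
    return theta, theta_target, losses


# Toy experience: in state [1, 0], choosing channel index 0 yields reward 1.
experience = [([1.0, 0.0], 0, 1.0, [0.0, 1.0])] * 4
theta, theta_target, losses = train(experience, n_actions=2, n_features=2)
print(losses[0] > losses[-1])
```

In the architecture of claim 1, a loop of this kind would run on the aggregation node over the experience collected from all nodes, while each node would only evaluate the value network locally to choose its channel.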
2. The method of claim 1, wherein the entire system time is divided into a plurality of consecutive superframes, one superframe being one time period; each superframe comprises a beacon frame, a control period and a data transmission period; the control period uses a fixed control channel to transmit the relevant control information and channel allocation decisions; the data transmission period uses K non-overlapping channels to support interference-free parallel data transmission; in the control period, all nodes in the network switch to the control channel to listen for and send the relevant control information; and in the data transmission period, a node with data to send switches to the channel of its parent node and transmits the data based on the channel access mechanism.
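The superframe structure of claim 2 can be illustrated with a short sketch that maps a time offset to a phase and a phase to the channel a node tunes to. The durations, the channel identifier `CONTROL_CHANNEL` and the function names are hypothetical. The claim does not state which channel carries the beacon frame or what a node without pending data does in the data transmission period, so the sketch assumes the control channel for the beacon and returns `None` in the second case.

```python
from enum import Enum

CONTROL_CHANNEL = 0  # hypothetical identifier of the fixed control channel


class Phase(Enum):
    BEACON = "beacon"
    CONTROL = "control"
    DATA = "data"


def phase_at(t, beacon_len, control_len, data_len):
    """Return the superframe phase that contains time offset t."""
    offset = t % (beacon_len + control_len + data_len)
    if offset < beacon_len:
        return Phase.BEACON
    if offset < beacon_len + control_len:
        return Phase.CONTROL
    return Phase.DATA


def tuned_channel(phase, parent_channel, has_data):
    """Channel a node listens or sends on during the given phase."""
    if phase is Phase.DATA:
        # A node with pending data switches to the channel of its parent node.
        # The behavior of a node without data is not specified by the claim.
        return parent_channel if has_data else None
    # Control period: every node is on the fixed control channel.
    # Beacon: assumed here to use the control channel as well.
    return CONTROL_CHANNEL


# One superframe of 20 slots: 1 beacon slot, 4 control slots, 15 data slots.
for t in (0, 3, 7):
    phase = phase_at(t, 1, 4, 15)
    print(t, phase.name, tuned_channel(phase, parent_channel=3, has_data=True))
```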
3. The method for distributed channel allocation in a wireless multi-hop network according to claim 1, wherein the node uses an RTS/DCTS-based channel access mechanism when performing action a, comprising:
node d is located in the m-th hop and its next-hop node in the (m+1)-th hop is node i, that is, node d is the parent node of node i; node e is located in the m-th hop and its next-hop node in the (m+1)-th hop is node j, that is, node e is the parent node of node j; all four nodes operate on the same channel, and the backoff values of node i and node j are both 0;
when node i sends an RTS frame to node d, node d waits for one CIFS and returns a CTS frame;
after receiving the RTS frame of node i or the CTS frame of node d, the child nodes of node d set a corresponding NAV based on the information in the Duration field;
when node e receives the RTS frame from node i, it waits for one SIFS and returns a CTS frame to inform its child nodes to defer data transmission while node i is transmitting;
wherein RTS refers to request to send; CTS refers to clear to send; CIFS is the interframe space after which the destination node returns a CTS; SIFS refers to the short interframe space used to separate frames belonging to one session; and CIFS is slightly larger than SIFS.
4. The method for distributed channel allocation in a wireless multi-hop network according to claim 3, wherein if node j is within the communication range of node i and its parent node e is not within the communication range of node i, then when node j receives the RTS frame from node i, it waits for one RIFS and then sends an RTS frame to its parent node e.
CN202111318928.0A 2021-11-09 2021-11-09 Distributed channel allocation method in wireless multi-hop network Active CN114024639B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111318928.0A CN114024639B (en) 2021-11-09 2021-11-09 Distributed channel allocation method in wireless multi-hop network

Publications (2)

Publication Number Publication Date
CN114024639A (en) 2022-02-08
CN114024639B (en) 2024-01-05

Family

ID=80062994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111318928.0A Active CN114024639B (en) 2021-11-09 2021-11-09 Distributed channel allocation method in wireless multi-hop network

Country Status (1)

Country Link
CN (1) CN114024639B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115016263A (en) * 2022-05-27 2022-09-06 福州大学 DRL-based control logic design method under continuous microfluidic biochip
CN116054982B (en) * 2022-06-30 2023-11-14 荣耀终端有限公司 Data processing method and terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101951612A (en) * 2010-09-01 2011-01-19 南京航空航天大学 DCF protocol fairness guarantee method suitable for multi-hop ad hoc network
CN103415018A (en) * 2013-08-23 2013-11-27 山东省计算中心 Communication resource allocation method of wireless sensor network
CN105245608A (en) * 2015-10-23 2016-01-13 同济大学 Telematics network node screening and accessibility routing construction method based on self-encoding network
CN112954736A (en) * 2019-12-10 2021-06-11 深圳先进技术研究院 Policy-based computation offload of wireless energy-carrying internet-of-things equipment
CN113613339A (en) * 2021-07-10 2021-11-05 西北农林科技大学 Channel access method of multi-priority wireless terminal based on deep reinforcement learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102005047753B4 (en) * 2005-09-28 2007-10-11 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Signaling method for decentralized online transmission power allocation in a wireless network
TWI387381B (en) * 2009-08-14 2013-02-21 Ind Tech Res Inst Apparatus and method for neighbor-aware concurrent transmission media access control protocol
US9277480B2 (en) * 2013-03-15 2016-03-01 Facebook, Inc. Cloud controller for next generation data network
RU2757663C1 (en) * 2018-02-07 2021-10-20 Хохшуле Анхальт Method for adaptive route selection in a node of a wireless cellular communication network, associated apparatus for implementing the method for adaptive route selection, and associated computer program
US11265865B2 (en) * 2019-05-01 2022-03-01 Qualcomm Incorporated Dynamic physical downlink control channel (PDCCH) resource sharing between PDCCH monitoring and PDCCH transmission in a multi-hop network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
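An Energy Efficient Multiple-Hop Routing Protocol for Wireless Sensor Networks; Yang Lei et al.; 《2008 First International Conference on Intelligent Networks and Intelligent Systems》; full text *
Monitoring Multi-Hop Multi-Channel Wireless Networks: Online Sniffer Channel Assignment; Jing Xu et al.; 《2016 IEEE 41st Conference on Local Computer Networks (LCN)》; full text *
Research on distributed multi-hop routing algorithms in wireless sensor networks; 尚凤军 et al.; 《传感技术学报》; Vol. 25 (No. 4); full text *
Research on channel allocation optimization algorithms for wireless multi-hop networks; 张震 et al.; 《宜春学院学报》; Vol. 42 (No. 3); full text *
Research on multi-channel allocation and access technology for high-density wireless networks; 丁凯琪; 《万方学位论文》; full text *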

Also Published As

Publication number Publication date
CN114024639A (en) 2022-02-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231212

Address after: No. 702, 703, 7th Floor, Building 7, No. 219 Tianhua Second Road, Chengdu High tech Zone, China (Sichuan) Pilot Free Trade Zone, Chengdu City, Sichuan Province, 610041

Applicant after: CHENGDU SKYSOFT INFO & TECH CO.,LTD.

Address before: 518000 1104, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Applicant before: Shenzhen Hongyue Information Technology Co.,Ltd.

Effective date of registration: 20231212

Address after: 518000 1104, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Hongyue Information Technology Co.,Ltd.

Address before: 400065 Chongwen Road, Nanshan Street, Nanan District, Chongqing

Applicant before: CHONGQING University OF POSTS AND TELECOMMUNICATIONS

TA01 Transfer of patent application right
GR01 Patent grant