CN115499849A

CN115499849A - Wireless access point and reconfigurable intelligent surface cooperation method

Info

Publication number: CN115499849A
Application number: CN202211429707.5A
Authority: CN
Inventors: 罗弦; 廖荣涛; 杨荣浩; 李想; 姚渭箐; 董亮; 刘芬; 张岱; 郭岳; 王逸兮; 李磊; 孟浩华; 王敬靖; 胡欢君; 龙霏; 袁翔宇; 王博涛
Original assignee: Information and Telecommunication Branch of State Grid Hubei Electric Power Co Ltd
Current assignee: Information and Telecommunication Branch of State Grid Hubei Electric Power Co Ltd
Priority date: 2022-11-16
Filing date: 2022-11-16
Publication date: 2022-12-20
Anticipated expiration: 2042-11-16
Also published as: CN115499849B

Abstract

The application relates to a method for cooperation between a wireless access point and a reconfigurable intelligent surface, comprising the following steps: building a device communication architecture based on the power internet of things; designing, according to the established architecture, a corresponding access point and reconfigurable intelligent surface cooperation method that aims to maximize system energy efficiency while meeting the quality-of-service requirements of massive devices in the power internet of things in terms of transmission data rate and reliability; and having each access point cooperate with the reconfigurable intelligent surface according to the trained model so as to meet the access requirements of massive devices in the power internet of things. The method models the large-scale wireless communication network as a graph and reduces its dimension with a graph embedding method to obtain an efficient graph representation, which reduces model training complexity and enables highly customized communication.

Description

Wireless access point and reconfigurable intelligent surface cooperation method

Technical Field

The application belongs to the technical field of power Internet of things, and particularly relates to a wireless access point and reconfigurable intelligent surface cooperation method.

Background

In recent years, with the rapid development of the power internet of things, massive numbers of devices have been deployed at its network edge. Because the power network system is complex and large, managing and controlling it by manpower alone is difficult and costly, so new information and communication technologies need to be introduced to improve the operating performance and the management and control efficiency of the power system. To realize intelligent management and control of the power internet of things, the configuration and performance of the power network need to be sensed and measured in real time. The power internet of things therefore needs to support the access of network edge internet of things devices and the transmission of massive data, so as to guarantee its efficient and reliable operation. With the continuous development of information and communication technology, the new generation of mobile communication technology can provide high-speed and stable service when a large number of power devices access the power network. However, because of the heterogeneity of network edge devices, highly customized and intelligent communication, that is, dynamic configuration of network resources to support ultra-dense connections, cannot yet be realized.

The reconfigurable intelligent surface is a new technology that intelligently reconfigures the wireless propagation environment by integrating a large number of low-cost passive reflective elements on a planar surface, thereby significantly improving the performance of wireless communication networks. It makes high customization possible: by reconfiguring the wireless propagation environment through highly controllable, intelligent signal reflection, it provides a new degree of freedom for further improving wireless link performance and paves the way for an intelligent, programmable wireless environment. With reconfigurable intelligent surface technology, the wireless access point and the reconfigurable intelligent surface cooperate to flexibly configure hybrid spatial beams, enhance data on demand, flexibly suppress interference, and perform efficient hybrid spatial-domain and power-domain multiplexing, so that highly customized and intelligent communication can be carried out effectively. Therefore, in the power internet of things scenario with heterogeneous power grids and massive devices, an effective wireless access point and reconfigurable intelligent surface cooperation technique urgently needs to be designed to realize highly customized and intelligent communication.

Disclosure of Invention

The embodiment of the application aims to provide a method for cooperation between a wireless access point and a reconfigurable intelligent surface, in which the wireless communication network is modeled as a graph and a graph embedding method is used to obtain a low-dimensional embedded representation of the network, which reduces model training complexity and enables highly customized communication.

In order to achieve the above purpose, the present application provides the following technical solutions:

The embodiment of the application provides a method for cooperation between a wireless access point and a reconfigurable intelligent surface, characterized by comprising the following steps:

step 1: building a device communication architecture based on the power internet of things, wherein the network architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces; the cooperative relation of each access point with its adjacent access points and reconfigurable intelligent surfaces is modeled as an interaction between agents, that is, as an edge in the graph neural network input; the input topology of a message passing graph neural network is thus constructed, and an embedded representation of the topology is obtained through the message passing graph neural network, so as to provide services for power internet of things terminals;

step 2: according to the established device communication architecture based on the power internet of things, designing a corresponding access point and reconfigurable intelligent surface cooperation method that aims to maximize system energy efficiency while meeting the quality-of-service requirements of massive devices in the power internet of things in terms of data transmission rate and reliability;

step 3: based on the access point and reconfigurable intelligent surface cooperation method provided in step 2, each access point cooperates with the reconfigurable intelligent surface according to the trained model so as to meet the access requirements of massive devices in the power internet of things.

The step 1 is specifically as follows:

step 1: in the device communication architecture of the power internet of things, a preinstalled access point in the network is represented as

Expressing reconfigurable intelligent surfaces in a network as

The M wireless access points and J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as nodes in the graph neural network input. The access information of the power internet of things devices and the hybrid spatial beam configuration between the wireless access points and the reconfigurable intelligent surfaces are treated as features of the graph topology and input into the message passing graph neural network, and a stable graph-embedded representation of the node features is obtained through the message passing mechanism of the message passing graph neural network.
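As an illustration of the graph input described in step 1, the following is a minimal Python sketch, not the applicant's implementation. The class `CommGraph`, the function `build_graph`, the rule `toy_cooperation_rule`, and all placeholder feature values are assumptions introduced here; the text only states that access points and reconfigurable intelligent surfaces become nodes and that cooperative relations become directed edges carrying features.

```python
from dataclasses import dataclass, field


@dataclass
class CommGraph:
    """Directed graph holding a feature vector per node and per edge."""
    node_features: dict = field(default_factory=dict)  # node id -> list of floats
    edge_features: dict = field(default_factory=dict)  # (src, dst) -> list of floats

    def add_node(self, node_id, features):
        self.node_features[node_id] = list(features)

    def add_edge(self, src, dst, features):
        if src not in self.node_features or dst not in self.node_features:
            raise KeyError("add both endpoints as nodes before adding an edge")
        self.edge_features[(src, dst)] = list(features)

    def in_neighbors(self, node_id):
        """Nodes with a directed edge pointing at node_id, in sorted order."""
        return sorted(src for (src, dst) in self.edge_features if dst == node_id)


def toy_cooperation_rule(src, dst):
    """Hypothetical rule: every AP-RIS pair interacts; APs with adjacent indices interact."""
    (kind_s, idx_s), (kind_d, idx_d) = src, dst
    if kind_s != kind_d:
        return True
    return kind_s == "AP" and abs(idx_s - idx_d) == 1


def build_graph(num_aps, num_ris, cooperates):
    """Create M access-point nodes and J RIS nodes, then add cooperation edges."""
    graph = CommGraph()
    for m in range(num_aps):
        graph.add_node(("AP", m), [0.0, 0.0])   # placeholder node features
    for j in range(num_ris):
        graph.add_node(("RIS", j), [0.0, 0.0])  # placeholder node features
    nodes = list(graph.node_features)
    for src in nodes:
        for dst in nodes:
            if src != dst and cooperates(src, dst):
                graph.add_edge(src, dst, [1.0])  # placeholder edge feature
    return graph


graph = build_graph(num_aps=3, num_ris=2, cooperates=toy_cooperation_rule)
```

In a real deployment the placeholder vectors would be replaced by the device access information and beam configuration features named in the text, and the cooperation rule would follow the actual network topology.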

The step 2 is specifically as follows:

step 2.1: in order to achieve a dynamic maximization of the system energy efficiency of the cooperation of the wireless access point and the reconfigurable intelligent surface, the objective function of the system can be expressed as:

wherein

represents the network energy efficiency of time slot t, and

represents the user parameters. Combining reconfigurable intelligent surface unit selection, coordinated discrete phase shift control, and the power allocation strategy, the long-term energy efficiency optimization problem is modeled as a decentralized partially observable Markov decision process. After the optimization problem is converted into a decentralized partially observable Markov decision process, the converted optimization function is as follows:

wherein

is a positive coefficient that controls the trade-off between energy efficiency and transmission reliability,

is a non-negative parameter that imposes a penalty for violating the data rate requirement,

denotes the data rate limit,

is a fixed value in each time slot,

denotes the data rate in each time slot,

denotes the number of antennas, and

denotes the users served cooperatively by the access point and the reconfigurable intelligent surface.

Its global reward function can be expressed as:

step 2.2: more efficient cooperative learning is achieved by integrating two techniques, graph embedding and different rewards. The agents represent the wireless access points and reconfigurable intelligent surfaces, the interactions between agents represent the wireless communication environment and its communication mode, and the agents and their interactions are modeled as a directed communication graph

, where agents are modeled as nodes I and the interactions between agents are modeled as directed edges

,

represents the node features, and

represents the edge features.

the node characteristics of a wireless access point i include spatial channel information of the access point to its associated devices, queue information of associated users, and local action observation history of the access point:

the edge features describe the interaction from agent

to agent

and can be expressed mathematically as:

step 2.3: because graph nodes and edges have high-dimensional features in a large-scale network, an action generation module based on graph embedding is provided, and a message passing graph neural network is maintained at each distributed node i. Similar to a multilayer perceptron, the message passing graph neural network adopts a layered structure. In each layer, each agent first transmits its embedded information to its neighboring agents, then aggregates the embedded information received from them and updates its local hidden state. The message passing process is as follows:

wherein

represents the message function, and

represents the update operation. After the graph embedding module, the agent

uses a gated recurrent unit and, according to the output local embedding state

, predicts its local action, where the gated recurrent unit is a simplified variant of the long short-term memory network. The local embedding state is given as follows:

Agent

takes the local action

, which is obtained by sampling from the action generation sub-policy

.

step 2.4: the combined parameters of the graph embedding module and the action generation module in the distributed policy are denoted as

Our goal is to maximize the performance function:

wherein

is the joint state transition that follows the joint policy

. The policy gradient is calculated based on the advantage function and is given by:

wherein

is the actual input to the graph embedding, and

represents the temporal-difference advantage, given by:

wherein

represents the global state value, and

represents the global state-action value. To solve the credit assignment problem during training, the distributed network is trained using value decomposition, and the global state value

is decomposed into a combination of local state values through a mixing function, as shown in the following equation:

wherein

represents the local state value of agent

. In the centralized training process, each agent obtains a different reward by evaluating its contribution to the global reward improvement based on its local graph embedding features, which further facilitates coordination between agents.

denotes the weight parameters of the distributed network, which are shared among agents, and

denotes the weights of the hybrid network

. The distributed and hybrid networks are optimized by mini-batch gradient descent so that the following loss is minimized:

wherein

is an n-step return bootstrapped from the last state, with n upper-bounded by T. The parameters of the hybrid network can be updated by the following formula:

wherein

is the learning rate of the hybrid network update. The weight parameters of the non-output layers in the distributed network are further shared, and the combined weight parameters of the distributed network are denoted as

. The gradient with respect to

can be calculated as:

the update rule for a distributed network can be derived as:

wherein

and

respectively represent the policy improvement learning rate and the critic learning rate.
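The value decomposition and advantage computation in step 2.4 can be sketched as follows. The exact mixing function and advantage formula are not reproduced in this text, so the sketch assumes the simplest possible forms: a weighted sum of local state values as the mixing function, and the standard one-step temporal-difference advantage. The names `mix_values` and `td_advantage`, the weights, and the discount factor `gamma` are hypothetical.

```python
def mix_values(local_values, weights, bias=0.0):
    """Assumed mixing function: weighted sum of per-agent local state values."""
    if len(local_values) != len(weights):
        raise ValueError("one weight is required per local value")
    return sum(w * v for w, v in zip(weights, local_values)) + bias


def td_advantage(reward, value_now, value_next, gamma=0.99):
    """One-step temporal-difference advantage: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now


# Two agents with equal (hypothetical) mixing weights.
value_now = mix_values([1.0, 2.0], [0.5, 0.5])
value_next = mix_values([2.0, 2.0], [0.5, 0.5])
advantage = td_advantage(reward=1.0, value_now=value_now,
                         value_next=value_next, gamma=0.5)
```

A positive advantage indicates that the joint action did better than the current global value estimate predicted, which is the signal the policy gradient scales by.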

The step 3 is specifically as follows:

step 3.1: the power internet of things data obtained by actual observation are input, as the observation states of the agents and the environment information, into the graph-embedding-based network update algorithm, and the network parameters and the network learning rates are initialized

，

step 3.2: a batch of data B is extracted from the experience pool, and, according to the formulas derived in step 2.4, the policy gradient

and the network loss

are calculated; the hybrid network parameters are then updated based on the hybrid network parameter update formula in step 2.4;

step 3.3: the network parameters in the power internet of things are further updated according to the distributed network parameter update algorithm in step 2.4 until the network converges;

step 3.4: the trained network parameters are updated periodically, or retrained and updated when the power internet of things changes substantially, so that the access requirements of the devices in the power internet of things are met and customized communication is realized.

Compared with the prior art, the beneficial effects of this application are: the application provides a wireless access point and reconfigurable intelligent surface cooperation framework aiming at the demand of an electric power internet of things, so that the access demand of mass equipment is met. According to the method and the device, the cooperation between the wireless access point and the reconfigurable intelligent surface is realized, and the system energy efficiency is dynamically maximized, so that high-efficiency communication is realized. In addition, the application provides a graph embedding-based wireless network representation method, which models a huge wireless communication network into a graph and reduces the dimension of the graph to obtain efficient graph representation by using the graph embedding method. The method provided by the application can effectively reduce the complexity of model training and realize highly customized communication.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.

FIG. 1 is a flow chart of a method according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.

Referring to fig. 1, the present application provides a method for a wireless access point to cooperate with a reconfigurable intelligent surface, which includes the following steps.

Preferably, the step 1 is as follows:

step 1: in the device communication architecture of the power internet of things, the pre-installed access points in the network are denoted as

and the reconfigurable intelligent surfaces in the network are denoted as

The M wireless access points and J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as nodes in the graph neural network input. The access information of the power internet of things devices and the hybrid spatial beam configuration between the wireless access points and the reconfigurable intelligent surfaces are treated as features of the graph topology and input into the message passing graph neural network, and a stable graph-embedded representation of the node features is obtained through the message passing mechanism of the message passing graph neural network.

Preferably, the step 2 is specifically as follows:

step 2.1: because massive numbers of devices are deployed at the network edge of the power internet of things, a high-performance access framework for massive devices needs to be carefully designed. By designing the cooperation between the access point and the reconfigurable intelligent surface, hybrid beams can be reconfigured flexibly and in a coordinated manner, so that devices access the communication network in a coordinated way and customizable intelligent communication is realized. Therefore, to dynamically maximize the system energy efficiency of the cooperation between the wireless access point and the reconfigurable intelligent surface, the objective function of the system can be expressed as:

wherein

represents the network energy efficiency of time slot t. This objective function can be modeled as a constrained Markov decision process. However, because the joint state-action space is large and the high-dimensional information exchange from the multiple wireless access points and reconfigurable intelligent surfaces to a centralized controller is very expensive, solving the above problem in a centralized manner is computationally inefficient. To address the problem in an efficient, low-complexity manner and to maximize network energy efficiency while ensuring diversified user performance, we model the long-term energy efficiency optimization problem as a decentralized partially observable Markov decision process that combines reconfigurable intelligent surface unit selection, coordinated discrete phase shift control, and the power allocation strategy. In particular, the partially observable Markov decision process provides a general framework for describing a Markov decision process with incomplete information, and the decentralized partially observable Markov decision process extends it to decentralized settings.

Based on the Lyapunov optimization theory, we can convert the above optimization problem into a decentralized partially observable markov decision process, and the converted optimization function is as follows:

wherein

is a non-negative parameter that imposes a penalty for violating the data rate requirement,

denotes the data rate limit,

is a fixed value in each time slot,

denotes the data rate in each time slot, and

denotes the number of antennas.

Its global reward function can be expressed as:
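The reward expression itself is not reproduced in this text. As a hedged illustration of the kind of penalty-shaped reward the surrounding definitions suggest (a positive trade-off coefficient, a non-negative penalty parameter, and a per-slot data rate limit), the following Python sketch assumes a simple form: weighted energy efficiency minus a penalty proportional to the total data rate shortfall. The function name `penalized_reward` and this exact form are assumptions, not the patent's formula.

```python
def penalized_reward(energy_efficiency, rates, rate_limits,
                     tradeoff=1.0, penalty=0.5):
    """Assumed reward: tradeoff * EE minus penalty * total data-rate shortfall."""
    if tradeoff <= 0:
        raise ValueError("the trade-off coefficient must be positive")
    if penalty < 0:
        raise ValueError("the penalty parameter must be non-negative")
    if len(rates) != len(rate_limits):
        raise ValueError("one rate limit is required per user")
    # Only users whose achieved rate falls below the limit contribute a penalty.
    shortfall = sum(max(0.0, limit - rate)
                    for rate, limit in zip(rates, rate_limits))
    return tradeoff * energy_efficiency - penalty * shortfall


# One user meets the limit, the other falls 0.5 short of it.
reward_violated = penalized_reward(4.0, [2.0, 0.5], [1.0, 1.0],
                                   tradeoff=1.0, penalty=2.0)
# Both users meet the limit, so no penalty applies.
reward_satisfied = penalized_reward(4.0, [2.0, 1.5], [1.0, 1.0])
```

Raising the penalty parameter pushes the learned policy toward reliability, while raising the trade-off coefficient favors energy efficiency.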

step 2.2: the optimization problem described in step 2.1 can be solved with conventional multi-agent reinforcement learning methods. However, because adjacent agents need to exchange information to cooperate, these methods incur high communication overhead and delay when processing high-dimensional information, and are therefore inefficient at solving the highly coupled decentralized partially observable Markov decision process problem. We extend the centralized training and decentralized execution scheme that is common in existing multi-agent reinforcement learning algorithms and achieve more efficient cooperative learning by integrating two techniques, graph embedding and different rewards. The agents represent the wireless access points and reconfigurable intelligent surfaces. The interactions between agents represent the wireless communication environment and its communication mode. The agents and the interactions between them are modeled as a directed communication graph

. Here, agents are modeled as nodes I, and the interactions between agents are modeled as directed edges

,

represents the node features, and

represents the edge features.

The edge features describe the interaction from agent

to agent

and can be expressed mathematically as:

step 2.3: since graph nodes and edges have high-dimensional features in a large-scale network, an action generation module based on graph embedding is provided. The module uses a message passing graph neural network to learn low-dimensional embedded features of the directed graph, which improves the generalization capability of the network and enhances the cooperation between the wireless access point and the reconfigurable intelligent surface while requiring only low information exchange overhead.

We maintain a message passing graph neural network at each distributed node i. Similar to a multilayer perceptron, the message passing graph neural network adopts a layered structure. Within each layer, each agent first transmits its embedded information to its neighboring agents, then aggregates the embedded information received from them and updates its local hidden state. The message passing process is as follows:

wherein

represents the message function, and

represents the update operation. After the graph embedding module, the agent

uses a gated recurrent unit to predict its local action according to the output local embedding state

. Agent

takes the local action

, which is obtained by sampling from the action generation sub-policy

.
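The message passing and action sampling described in step 2.3 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented network: summation is assumed as the aggregation, `toy_message` and `toy_update` are hypothetical stand-ins for the learned message and update functions, and a softmax over the embedding replaces the gated recurrent unit that the text uses to produce the action distribution.

```python
import math
import random


def message_passing_layer(hidden, edges, message_fn, update_fn):
    """One layer: each node sums messages from its in-neighbors, then updates."""
    new_hidden = {}
    for node, state in hidden.items():
        aggregated = [0.0] * len(state)
        for (src, dst), edge_feature in edges.items():
            if dst == node:
                message = message_fn(hidden[src], edge_feature)
                aggregated = [a + m for a, m in zip(aggregated, message)]
        # All nodes read the previous-layer states, so the update is synchronous.
        new_hidden[node] = update_fn(state, aggregated)
    return new_hidden


def toy_message(src_state, edge_weight):
    """Hypothetical message function: scale the sender state by the edge weight."""
    return [edge_weight * x for x in src_state]


def toy_update(state, aggregated):
    """Hypothetical update function: squash the sum of state and messages."""
    return [math.tanh(x + a) for x, a in zip(state, aggregated)]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(embedding, rng):
    """Sample a discrete action index from a softmax sub-policy (assumed form)."""
    probs = softmax(embedding)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


hidden = {"ap0": [1.0, 0.0], "ris0": [0.0, 1.0]}
edges = {("ap0", "ris0"): 0.5, ("ris0", "ap0"): 1.0}
hidden = message_passing_layer(hidden, edges, toy_message, toy_update)
action = sample_action(hidden["ap0"], random.Random(0))
```

Stacking several such layers lets each agent's embedding reflect information from agents more than one hop away, while each layer only requires exchanges with direct neighbors.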

Our goal is to maximize the performance function:

wherein

is the joint state transition that follows the joint policy

. Therefore, we compute the policy gradient based on the advantage function, which is given by:

wherein

is the actual input to the graph embedding, and

represents the temporal-difference advantage, given by:

wherein

represents the global state value, and

represents the global state-action value. To solve the credit assignment problem during training, we train the distributed network using value decomposition, applied to the global state value

wherein

represents the local state value of agent

. In the centralized training process, each agent obtains a different reward by evaluating its contribution to the global reward improvement based on its local graph embedding features, which further facilitates coordination between agents. Let

denote the weights of the hybrid network

. The distributed and hybrid networks are optimized by mini-batch gradient descent, minimizing the following loss:

wherein

is an n-step return bootstrapped from the last state, with n upper-bounded by T. Thus, the parameters of the hybrid network can be updated by:

wherein

is the learning rate of the hybrid network update. To reduce complexity, we further share the weight parameters of the non-output layers in the distributed network and denote the combined weight parameters of the distributed network as

. Thus, the gradient with respect to

can be calculated as:

thus, the update rule for a distributed network can be derived as:

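wherein

and

respectively represent the policy improvement learning rate and the critic learning rate.

Preferably, the step 3 is specifically as follows:

Step 3.1: the power internet of things data obtained by actual observation are input, as the observation states of the agents and the environment information, into the graph-embedding-based network update algorithm, and the network parameters and the network learning rates are initialized.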

Step 3.2: a batch of data B is extracted from the experience pool, and, according to the formulas derived in step 2.4, the policy gradient

and the network loss

are calculated. The hybrid network parameters are then updated based on the hybrid network parameter update formula in step 2.4.

Step 3.3: the network parameters in the power internet of things are further updated according to the distributed network parameter update algorithm in step 2.4 until the network converges.

Step 3.4: the trained network parameters are updated periodically, or retrained and updated when the power internet of things changes substantially. In this way, the access requirements of the devices in the power internet of things are met and customized communication is realized.
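The batch extraction and loss computation in steps 3.2 and 3.3 can be sketched as follows. This is a simplified illustration under assumptions: the class `ExperiencePool`, the transition layout (a dictionary with `state`, `rewards`, and `last_value`), the squared-error form of the loss, and the discount factor `gamma` are all hypothetical, since the loss formula is not reproduced in this text. Only the n-step return bootstrapped from the last state follows directly from the description.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity buffer of transitions with uniform mini-batch sampling."""

    def __init__(self, capacity, seed=0):
        self._buffer = deque(maxlen=capacity)  # oldest entries are dropped first
        self._rng = random.Random(seed)

    def add(self, transition):
        self._buffer.append(transition)

    def sample(self, batch_size):
        size = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), size)

    def __len__(self):
        return len(self._buffer)


def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted sum of n rewards plus the discounted value of the last state."""
    result = bootstrap_value
    for reward in reversed(rewards):
        result = reward + gamma * result
    return result


def critic_loss(batch, value_fn, gamma):
    """Assumed loss: mean squared error between n-step returns and predictions."""
    errors = [
        (n_step_return(t["rewards"], t["last_value"], gamma) - value_fn(t["state"])) ** 2
        for t in batch
    ]
    return sum(errors) / len(errors)


pool = ExperiencePool(capacity=8)
for step in range(10):  # the two oldest transitions are evicted
    pool.add({"state": step, "rewards": [1.0, 1.0], "last_value": 4.0})

batch = pool.sample(batch_size=4)
loss = critic_loss(batch, value_fn=lambda state: 0.0, gamma=0.5)
```

In a full training loop this loss would be differentiated with respect to the distributed and hybrid network parameters and applied with the two learning rates mentioned in step 2.4, repeating until convergence.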

The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. A method for cooperation between a wireless access point and a reconfigurable intelligent surface is characterized by comprising the following steps:

step 1: building a device communication architecture based on the power internet of things, wherein the device communication architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces; the cooperative relation of each access point with its adjacent access points and reconfigurable intelligent surfaces is modeled as an interaction between agents, that is, as an edge in the graph neural network input; the input topology of a message passing graph neural network is thus constructed, and an embedded representation of the topology is obtained through the message passing graph neural network, so as to provide services for power internet of things terminals;

2. The method for the cooperation of the wireless access point and the reconfigurable intelligent surface according to claim 1, wherein the step 1 is as follows:

in the device communication architecture of the power internet of things, a preinstalled access point in the network is represented as

Expressing a reconfigurable intelligent surface in a network as

The M wireless access points and J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as nodes in the graph neural network input; the access information of the power internet of things devices and the hybrid spatial beam configuration between the wireless access points and the reconfigurable intelligent surfaces are treated as features of the graph topology and input into the message passing graph neural network, and a stable graph-embedded representation of the node features is obtained through the message passing mechanism of the message passing graph neural network.

3. The method for cooperation between a wireless access point and a reconfigurable intelligent surface according to claim 1, wherein the step 2 is specifically as follows:

step 2.1: modeling the system energy efficiency optimization problem as a decentralized part observable Markov decision process;

step 2.2: more efficient cooperative learning is realized by integrating two techniques, graph embedding and different rewards;

step 2.3: maintaining a message passing graph neural network at each distributed node i, wherein in each message passing graph neural network layer, each agent firstly transmits embedded information to adjacent agents, and then aggregates the embedded information from the adjacent agents and updates the local hidden state of the agents;

step 2.4: each agent obtains a different reward by evaluating its contribution to the global reward improvement based on its local graph embedding features, thereby further facilitating coordination between agents.

4. The method for the cooperation of the wireless access point and the reconfigurable intelligent surface according to claim 3, wherein the step 3 is as follows:

，

Step 3.2: a batch of data B is extracted from the experience pool, and, according to the formulas derived in step 2.4, the policy gradient

and the network loss

are calculated; the hybrid network parameters are updated based on the hybrid network parameter update formula in step 2.4,