CN115499849A - Wireless access point and reconfigurable intelligent surface cooperation method - Google Patents

Info

Publication number
CN115499849A
Authority
CN
China
Prior art keywords
things
network
access point
graph
power internet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211429707.5A
Other languages
Chinese (zh)
Other versions
CN115499849B (en)
Inventor
罗弦
廖荣涛
杨荣浩
李想
姚渭箐
董亮
刘芬
张岱
郭岳
王逸兮
李磊
孟浩华
王敬靖
胡欢君
龙霏
袁翔宇
王博涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information and Telecommunication Branch of State Grid Hubei Electric Power Co Ltd
Original Assignee
Information and Telecommunication Branch of State Grid Hubei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information and Telecommunication Branch of State Grid Hubei Electric Power Co Ltd
Priority to CN202211429707.5A
Publication of CN115499849A
Application granted
Publication of CN115499849B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W16/00: Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/18: Network planning tools
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W84/00: Network topologies
    • H04W84/18: Self-organising networks, e.g. ad-hoc networks or sensor networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The application relates to a method for cooperation between a wireless access point and a reconfigurable intelligent surface, comprising the following steps: building a device communication architecture based on the power Internet of Things; designing, according to that architecture, a corresponding access point and reconfigurable intelligent surface cooperation method that maximizes system energy efficiency while meeting the quality-of-service requirements of massive devices in the power Internet of Things with respect to data transmission rate and reliability; and having each access point cooperate with the reconfigurable intelligent surface according to the trained model so as to meet the access requirements of the massive devices in the power Internet of Things. In this method, the large-scale wireless communication network is modeled as a graph, and a graph embedding method is used to reduce the dimensionality of the graph and obtain an efficient graph representation, which can effectively reduce model training complexity and enable highly customized communication.

Description

Wireless access point and reconfigurable intelligent surface cooperation method
Technical Field
The application belongs to the technical field of power Internet of things, and particularly relates to a wireless access point and reconfigurable intelligent surface cooperation method.
Background
In recent years, with the rapid development of the power Internet of Things, massive numbers of devices have been deployed at its network edge. Because the power network is large and complex, managing and controlling it by manpower alone is difficult and costly, so new information and communication technologies need to be introduced to improve the operating performance and management efficiency of the power system. To realize intelligent management and control of the power Internet of Things, the configuration and performance of the power network need to be sensed and measured in real time. The power Internet of Things therefore needs to support the access of network-edge Internet of Things devices and the transmission of massive data, so as to guarantee its efficient and reliable operation. With the continuous development of information and communication technology, new-generation mobile communication technology can provide high-speed and stable service when large numbers of power devices access the power network. However, owing to the heterogeneity of network-edge devices, highly customized and intelligent communication, that is, dynamic configuration of network resources to support ultra-dense connections, cannot yet be realized.
The reconfigurable intelligent surface is a new technology that can intelligently reconfigure the wireless propagation environment by integrating a large number of low-cost passive reflective elements on a planar surface, thereby significantly improving the performance of wireless communication networks. It makes a high degree of customization possible: through highly controllable and intelligent signal reflection, it provides a new degree of freedom for further improving wireless link performance and paves the way for an intelligent, programmable wireless environment. With reconfigurable intelligent surface technology, a wireless access point and a reconfigurable intelligent surface can cooperate to flexibly configure hybrid spatial beams, enhance data transmission on demand, flexibly suppress interference, and perform efficient hybrid spatial-domain and power-domain multiplexing, so that highly customized and intelligent communication can be carried out effectively. Therefore, in power Internet of Things scenarios with heterogeneous power grids and massive devices, an effective cooperation technique between wireless access points and reconfigurable intelligent surfaces urgently needs to be designed to realize highly customized and intelligent communication.
Disclosure of Invention
The embodiments of the application aim to provide a method for cooperation between a wireless access point and a reconfigurable intelligent surface, in which the wireless communication network is modeled as a graph and an embedded representation of the network is obtained using a graph embedding method. The graph embedding method effectively yields a low-dimensional representation of the graph, which reduces model training complexity and enables highly customized communication.
In order to achieve the above purpose, the present application provides the following technical solutions:
An embodiment of the application provides a method for cooperation between a wireless access point and a reconfigurable intelligent surface, characterized by comprising the following steps:
step 1: building a device communication architecture based on the power Internet of Things, wherein the network architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces; the cooperative relation between each access point and its adjacent access points and reconfigurable intelligent surfaces is modeled as an interaction between agents, that is, as an edge in the graph neural network input; the input topology of a message passing graph neural network is thereby constructed, and an embedded representation of the topology is obtained through the message passing graph neural network, so as to provide services for power Internet of Things terminals;
step 2: designing, according to the established device communication architecture based on the power Internet of Things, a corresponding access point and reconfigurable intelligent surface cooperation method that takes maximizing system energy efficiency as its objective while meeting the quality-of-service requirements of massive devices in the power Internet of Things with respect to data transmission rate and reliability;
step 3: based on the access point and reconfigurable intelligent surface cooperation method proposed in step 2, each access point cooperates with the reconfigurable intelligent surface according to the trained model so as to meet the access requirements of massive devices in the power Internet of Things.
The step 1 is specifically as follows:
step 1: in the device communication architecture of the power Internet of Things, the pre-installed access points in the network are denoted as
Figure 127484DEST_PATH_IMAGE001
and the reconfigurable intelligent surfaces in the network are denoted as
Figure 687034DEST_PATH_IMAGE002
The M wireless access points and J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as nodes in the graph neural network input. The access information of the power Internet of Things devices and the hybrid spatial beam configuration between the wireless access points and the reconfigurable intelligent surfaces are treated as features of the graph topology and input into the message passing graph neural network, and a stable graph embedding of the node features is obtained through the message passing mechanism of that network.
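As an illustration of the graph construction in step 1, the following is a minimal sketch and not the implementation of the application. The names `AgentNode` and `build_cooperation_graph`, the placeholder feature vectors, and the connectivity rule (every access point linked to every other access point and every surface, with no surface-to-surface edges) are assumptions introduced only for this example; the text itself specifies only that cooperative relations with adjacent access points and surfaces become edges.

```python
from dataclasses import dataclass


@dataclass
class AgentNode:
    name: str       # hypothetical label such as "AP0" or "RIS1"
    kind: str       # "AP" (wireless access point) or "RIS"
    features: list  # placeholder for device access information


def build_cooperation_graph(num_aps, num_ris, feature_dim=4):
    """Return (nodes, edges) for M access points and J surfaces."""
    nodes = [AgentNode(f"AP{m}", "AP", [0.0] * feature_dim) for m in range(num_aps)]
    nodes += [AgentNode(f"RIS{j}", "RIS", [0.0] * feature_dim) for j in range(num_ris)]

    edges = {}
    for src in nodes:
        for dst in nodes:
            if src.name == dst.name:
                continue  # no self-loops
            if src.kind == "RIS" and dst.kind == "RIS":
                continue  # assumption: surfaces cooperate only through access points
            # Placeholder edge feature standing in for the hybrid spatial beam configuration.
            edges[(src.name, dst.name)] = [0.0]
    return nodes, edges


nodes, edges = build_cooperation_graph(num_aps=2, num_ris=3)
print(len(nodes), "nodes and", len(edges), "directed edges")
```

In a real deployment the edge set would presumably be restricted to physically adjacent devices; full connectivity is used here only to keep the sketch short.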
The step 2 is specifically as follows:
step 2.1: in order to achieve a dynamic maximization of the system energy efficiency of the cooperation of the wireless access point and the reconfigurable intelligent surface, the objective function of the system can be expressed as:
Figure 342138DEST_PATH_IMAGE003
wherein
Figure 206188DEST_PATH_IMAGE004
denotes the network energy efficiency in time slot t, and
Figure 394462DEST_PATH_IMAGE005
denotes the user parameters. Combining reconfigurable intelligent surface unit selection, coordinated discrete phase shift control, and the power allocation strategy, the long-term energy efficiency optimization problem is modeled as a decentralized partially observable Markov decision process. After this conversion, the optimization function is as follows:
Figure 561132DEST_PATH_IMAGE007
wherein
Figure 532893DEST_PATH_IMAGE008
is a positive coefficient that controls the trade-off between energy efficiency and transmission reliability,
Figure 76001DEST_PATH_IMAGE009
is a non-negative parameter that imposes a penalty for violating the data rate requirement,
Figure 384360DEST_PATH_IMAGE010
denotes the data rate limit,
Figure 107948DEST_PATH_IMAGE011
is a fixed value in each time slot,
Figure 65540DEST_PATH_IMAGE012
denotes the data rate in each time slot,
Figure 677918DEST_PATH_IMAGE013
denotes the number of antennas, and
Figure 44046DEST_PATH_IMAGE014
denotes the users served jointly by the access point and the reconfigurable intelligent surface.
Its global reward function can be expressed as:
Figure 942732DEST_PATH_IMAGE016
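The exact reward expression is contained in the equation image above and is not reproduced here. Purely to illustrate the roles described for the trade-off coefficient, the penalty parameter, and the data rate limit, the sketch below uses one hypothetical penalized form: weighted energy efficiency minus a penalty proportional to the total rate shortfall. The function name `penalized_reward`, its parameters, and the specific formula are assumptions, not the formula of the application.

```python
def penalized_reward(energy_efficiency, rates, rate_limit, tradeoff=1.0, penalty=0.5):
    """Hypothetical global reward: weighted energy efficiency minus rate-violation penalty.

    energy_efficiency: network energy efficiency in the current time slot
    rates:             per-user data rates in the current time slot
    rate_limit:        minimum data rate each user should reach
    tradeoff:          positive coefficient weighting energy efficiency (assumed role)
    penalty:           non-negative coefficient applied to rate shortfalls (assumed role)
    """
    # Only users below the limit contribute to the shortfall.
    shortfall = sum(max(0.0, rate_limit - r) for r in rates)
    return tradeoff * energy_efficiency - penalty * shortfall


# One user misses the limit by 1.0, so the reward is reduced by penalty * 1.0.
print(penalized_reward(2.0, [1.0, 3.0], rate_limit=2.0))
```

With this shape, a larger penalty value pushes the learned policy toward satisfying the rate requirement, while a larger trade-off value favors energy efficiency.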
step 2.2: more efficient cooperative learning is achieved by integrating two techniques, graph embedding and different rewards. The agents represent the wireless access points and reconfigurable intelligent surfaces, the interactions between agents represent the wireless communication environment and its communication mode, and the agents and their interactions are modeled as a directed communication graph
Figure 561189DEST_PATH_IMAGE017
Where agents are modeled as nodes I, the interactions between agents are modeled as directed edges
Figure 977258DEST_PATH_IMAGE018
Figure 994630DEST_PATH_IMAGE019
denotes the node features, and
Figure 408425DEST_PATH_IMAGE020
denotes the edge features.
The node features of a wireless access point i include the spatial channel information from the access point to its associated devices, the queue information of its associated users, and the local action-observation history of the access point:
Figure 606188DEST_PATH_IMAGE021
The edge feature characterizes the interaction from agent
Figure 530675DEST_PATH_IMAGE022
to agent
Figure 169598DEST_PATH_IMAGE023
which can be expressed mathematically as:
Figure 908622DEST_PATH_IMAGE025
step 2.3: because graph nodes and edges have high-dimensional features in a large-scale network, a graph-embedding-based action generation module is proposed. Each distributed node
Figure 203468DEST_PATH_IMAGE022
maintains a message passing graph neural network. Similar to a multilayer perceptron, the message passing graph neural network adopts a layered structure. Within each layer, each agent first transmits its embedded information to its neighboring agents, then aggregates the embedded information received from the neighboring agents and updates its local hidden state. The message passing process is as follows:
Figure 758077DEST_PATH_IMAGE026
wherein
Figure 752971DEST_PATH_IMAGE027
denotes the message function and
Figure 898782DEST_PATH_IMAGE028
denotes the update operation. After the graph embedding module, agent
Figure 179459DEST_PATH_IMAGE022
uses a gated recurrent unit (GRU), a simplified variant of the long short-term memory network, to predict its local action from the output local embedding state
Figure 741022DEST_PATH_IMAGE029
The local embedding state is given by:
Figure 682433DEST_PATH_IMAGE030
Agent
Figure 375976DEST_PATH_IMAGE022
takes the local action
Figure 442152DEST_PATH_IMAGE031
which is sampled from the action generation sub-policy
Figure 368258DEST_PATH_IMAGE032
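The send, aggregate, and update pattern of one message passing layer can be sketched as follows. This is a toy illustration under stated assumptions: the message function is taken to be the identity, aggregation is a mean over in-neighbors, and the update is an element-wise tanh. The actual message and update functions are defined in the equation images and are not reproduced here, and the gated recurrent unit stage is omitted. The name `message_passing_layer` is hypothetical.

```python
import math


def message_passing_layer(hidden, in_neighbors):
    """One synchronous message passing round over all agents.

    hidden:       dict mapping agent name -> hidden state (list of floats)
    in_neighbors: dict mapping agent name -> list of agents that send to it
    """
    new_hidden = {}
    for node, state in hidden.items():
        senders = in_neighbors.get(node, [])
        aggregated = [0.0] * len(state)
        for sender in senders:
            message = hidden[sender]  # assumption: identity message function
            aggregated = [a + m for a, m in zip(aggregated, message)]
        if senders:
            aggregated = [a / len(senders) for a in aggregated]  # mean aggregation
        # Assumption: the update combines the old state and the aggregate through tanh.
        new_hidden[node] = [math.tanh(s + a) for s, a in zip(state, aggregated)]
    return new_hidden


hidden = {"AP0": [1.0, 0.0], "RIS0": [0.0, 1.0]}
in_neighbors = {"AP0": ["RIS0"], "RIS0": ["AP0"]}
print(message_passing_layer(hidden, in_neighbors))
```

Stacking several such rounds lets information from agents more than one hop away reach each node, which is the layered structure the text compares to a multilayer perceptron.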
step 2.4: representing combined parameters of graph embedding module and action generating module in distributed strategy as
Figure 773962DEST_PATH_IMAGE033
Our goal is to maximize the performance function:
Figure 700723DEST_PATH_IMAGE034
wherein
Figure 316512DEST_PATH_IMAGE035
Is to follow a federated policy
Figure 954298DEST_PATH_IMAGE036
Based on the dominance function, a policy gradient is calculated, which is given by:
Figure 103258DEST_PATH_IMAGE038
wherein
Figure 371559DEST_PATH_IMAGE039
is the actual input to the graph embedding, and
Figure 740223DEST_PATH_IMAGE040
denotes the temporal-difference advantage, given by:
Figure 417586DEST_PATH_IMAGE041
wherein
Figure 922517DEST_PATH_IMAGE042
denotes the global state value and
Figure 922571DEST_PATH_IMAGE043
denotes the global state-action value. To solve the credit assignment problem during training, the distributed network is trained using value decomposition, and the global state value
Figure 653898DEST_PATH_IMAGE044
is decomposed into a combined form with a mixing function, as shown in the following equation:
Figure 492541DEST_PATH_IMAGE046
wherein
Figure 986668DEST_PATH_IMAGE047
represents agent
Figure 924668DEST_PATH_IMAGE022
In the centralized training process, each agent may receive a different reward by evaluating its contribution to the improvement of the global reward based on its local graph embedding features, which further facilitates coordination between agents.
Figure 500880DEST_PATH_IMAGE048
denotes the weight parameters of the distributed network, which are shared among agents, and
Figure 549739DEST_PATH_IMAGE049
is used to denote the hybrid network
Figure 406093DEST_PATH_IMAGE050
Using mini-batch gradient descent, the distributed network and the hybrid network are optimized so that the following loss is minimized:
Figure 842890DEST_PATH_IMAGE051
wherein
Figure 17651DEST_PATH_IMAGE052
is the n-step return bootstrapped from the last state, where n is at most T. The parameters of the hybrid network can be updated by the following formula:
Figure 696632DEST_PATH_IMAGE053
wherein
Figure 702765DEST_PATH_IMAGE054
is the learning rate of the hybrid network update. The weight parameters of the non-output layers are further shared in the distributed network, and the combined weight parameters of the distributed network are denoted as
Figure 421716DEST_PATH_IMAGE055
The gradient with respect to
Figure 146089DEST_PATH_IMAGE056
can be calculated as:
Figure 864646DEST_PATH_IMAGE057
The update rule for the distributed network can then be derived as:
Figure 223821DEST_PATH_IMAGE058
wherein
Figure 877788DEST_PATH_IMAGE059
and
Figure 918818DEST_PATH_IMAGE060
denote the policy improvement learning rate and the critic learning rate, respectively.
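Two quantities used in step 2.4, the n-step return bootstrapped from the last state and the temporal-difference advantage, can be sketched in their standard textbook form. The discount factor `gamma` and the function names are assumptions, since the corresponding equations are only available as images; the expressions in the application may differ in detail.

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Discounted sum of n rewards plus the discounted value of the last state.

    rewards:         rewards r_t, ..., r_{t+n-1} collected along the trajectory
    bootstrap_value: value estimate of the state reached after the n steps
    gamma:           discount factor (assumed; not stated in the text)
    """
    total = bootstrap_value
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def td_advantage(reward, value, next_value, gamma=0.99):
    """One-step temporal-difference advantage: r + gamma * V(s') - V(s)."""
    return reward + gamma * next_value - value


print(n_step_return([1.0, 1.0], bootstrap_value=2.0, gamma=0.5))
print(td_advantage(reward=1.0, value=0.5, next_value=2.0, gamma=0.5))
```

The squared difference between such a return and the critic's value estimate is a common choice for the loss that mini-batch gradient descent minimizes; whether the application uses exactly that loss cannot be confirmed from the extracted text.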
The step 3 is specifically as follows:
step 3.1: inputting the data of the power internet of things obtained by actual observation as the observation state of the intelligent agent and environmental information into a network updating algorithm based on graph embedding, initializing network parameters and initializing network learning rate
Figure 50853DEST_PATH_IMAGE061
Step 3.2: extracting data of a batch from an experience pool
Figure 530114DEST_PATH_IMAGE062
Calculating the policy gradient according to the formula derived in step 2.4
Figure 449922DEST_PATH_IMAGE063
And network loss
Figure 791298DEST_PATH_IMAGE064
Based on step 2.4The hybrid network parameter update formula updates the hybrid network parameters,
step 3.3: further updating the network parameters in the power internet of things according to the distributed network parameter updating algorithm in the step 2.4 until the network converges,
step 3.4: the trained network parameters are updated regularly, or the network parameters are retrained and updated when the power internet of things is changed greatly, so that the access requirements of equipment in the circuit internet of things are met, and customized communication is realized.
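The control flow of steps 3.2 and 3.3 (sample a batch from the experience pool, apply an update, repeat until convergence) can be sketched as below. The class `ExperiencePool`, the function `train_until_converged`, the callback `update_fn`, and the convergence test on the change in loss are all hypothetical; the gradient and parameter updates themselves are left to the callback because their formulas are not reproduced in the extracted text.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity buffer of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        size = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), size)


def train_until_converged(pool, update_fn, batch_size=4, max_iters=100, tol=1e-3):
    """Repeat batch updates until the loss changes by less than tol.

    update_fn is a hypothetical callback that applies the hybrid and distributed
    network updates for one batch and returns the resulting loss.
    """
    previous_loss = None
    for iteration in range(1, max_iters + 1):
        loss = update_fn(pool.sample(batch_size))
        if previous_loss is not None and abs(previous_loss - loss) < tol:
            return iteration, loss
        previous_loss = loss
    return max_iters, previous_loss


pool = ExperiencePool(capacity=5)
for step in range(10):
    pool.add(step)
fake_losses = iter([1.0, 0.5, 0.4999, 0.1])  # stand-in for real training losses
print(train_until_converged(pool, lambda batch: next(fake_losses), batch_size=2))
```

Periodic retraining as in step 3.4 would amount to calling the same loop again after new observations have been added to the pool.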
Compared with the prior art, the beneficial effects of this application are: the application provides a wireless access point and reconfigurable intelligent surface cooperation framework aiming at the demand of an electric power internet of things, so that the access demand of mass equipment is met. According to the method and the device, the cooperation between the wireless access point and the reconfigurable intelligent surface is realized, and the system energy efficiency is dynamically maximized, so that high-efficiency communication is realized. In addition, the application provides a graph embedding-based wireless network representation method, which models a huge wireless communication network into a graph and reduces the dimension of the graph to obtain efficient graph representation by using the graph embedding method. The method provided by the application can effectively reduce the complexity of model training and realize highly customized communication.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a flow chart of a method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
Referring to fig. 1, the present application provides a method for a wireless access point to cooperate with a reconfigurable intelligent surface, which includes the following steps.
Step 1: building a device communication architecture based on the power Internet of Things, wherein the network architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces; the cooperative relation between each access point and its adjacent access points and reconfigurable intelligent surfaces is modeled as an interaction between agents, that is, as an edge in the graph neural network input; the input topology of a message passing graph neural network is thereby constructed, and an embedded representation of the topology is obtained through the message passing graph neural network, so as to provide services for power Internet of Things terminals;
step 2: designing, according to the established device communication architecture based on the power Internet of Things, a corresponding access point and reconfigurable intelligent surface cooperation method that takes maximizing system energy efficiency as its objective while meeting the quality-of-service requirements of massive devices in the power Internet of Things with respect to data transmission rate and reliability;
step 3: based on the access point and reconfigurable intelligent surface cooperation method proposed in step 2, each access point cooperates with the reconfigurable intelligent surface according to the trained model so as to meet the access requirements of massive devices in the power Internet of Things.
Preferably, the step 1 is as follows:
step 1: in the device communication architecture of the power Internet of Things, the pre-installed access points in the network are denoted as
Figure 789341DEST_PATH_IMAGE001
and the reconfigurable intelligent surfaces in the network are denoted as
Figure 857529DEST_PATH_IMAGE002
The M wireless access points and J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as nodes in the graph neural network input. The access information of the power Internet of Things devices and the hybrid spatial beam configuration between the multiple wireless access points and the multiple reconfigurable intelligent surfaces are treated as features of the topology graph and input into the message passing graph neural network, and a stable graph embedding of the node features is obtained through the message passing mechanism of that network.
Preferably, the step 2 is specifically as follows:
step 2.1: because the network edge of the power internet of things is provided with mass equipment, and a high-performance mass equipment access framework needs to be elaborately designed, the hybrid beams can be flexibly and coordinately reconstructed by designing the cooperation between the access point and the reconfigurable intelligent surface, so that the equipment is coordinately accessed into a communication network, and the customizable intelligent communication is realized. Therefore, to achieve a system energy efficiency that dynamically maximizes the cooperation of the wireless access point and the reconfigurable intelligent surface, the objective function of the system can be expressed as:
Figure 977932DEST_PATH_IMAGE065
wherein
Figure 898614DEST_PATH_IMAGE004
denotes the network energy efficiency in time slot t. This objective function can be modeled as a constrained Markov decision process. However, because the joint state-action space is large and the high-dimensional information exchange from the multiple wireless access points and reconfigurable intelligent surfaces to a centralized controller is very expensive, solving the above problem in a centralized manner is computationally inefficient. To address the problem efficiently and with low complexity, and to maximize network energy efficiency while ensuring diversified user performance, the long-term energy efficiency optimization problem is modeled as a decentralized partially observable Markov decision process, combining reconfigurable intelligent surface unit selection, coordinated discrete phase shift control, and the power allocation strategy. In particular, the partially observable Markov decision process provides a general framework for describing a Markov decision process with incomplete information, and the decentralized partially observable Markov decision process extends it to decentralized settings.
Based on the Lyapunov optimization theory, we can convert the above optimization problem into a decentralized partially observable markov decision process, and the converted optimization function is as follows:
Figure 405075DEST_PATH_IMAGE066
wherein
Figure 298076DEST_PATH_IMAGE008
is a positive coefficient that controls the trade-off between energy efficiency and transmission reliability,
Figure 353495DEST_PATH_IMAGE009
is a non-negative parameter that imposes a penalty for violating the data rate requirement,
Figure 495894DEST_PATH_IMAGE010
denotes the data rate limit,
Figure 337204DEST_PATH_IMAGE011
is a fixed value in each time slot,
Figure 819133DEST_PATH_IMAGE012
denotes the data rate in each time slot,
Figure 717556DEST_PATH_IMAGE013
denotes the number of antennas, and
Figure 940727DEST_PATH_IMAGE014
denotes the users served jointly by the access point and the reconfigurable intelligent surface.
Its global reward function can be expressed as:
Figure 756367DEST_PATH_IMAGE067
step 2.2: the optimization problem described in step 2.1 can be solved using conventional multi-agent reinforcement learning methods. However, because adjacent agents need to exchange information to cooperate, such methods incur high communication overhead and delay when processing high-dimensional information, and are therefore inefficient at solving the highly coupled decentralized partially observable Markov decision process problem. The centralized training and decentralized execution scheme commonly used in existing multi-agent reinforcement learning algorithms is therefore extended, and more efficient cooperative learning is realized by integrating two techniques, graph embedding and different rewards. The agents represent the wireless access points and reconfigurable intelligent surfaces. The interactions between agents represent the wireless communication environment and its communication mode. The agents and the interactions between them are modeled as a directed communication graph
Figure 633146DEST_PATH_IMAGE017
where agents are modeled as nodes I and the interactions between agents are modeled as directed edges
Figure 203936DEST_PATH_IMAGE018
Figure 147359DEST_PATH_IMAGE019
denotes the node features, and
Figure 766690DEST_PATH_IMAGE020
denotes the edge features.
The node features of a wireless access point i include the spatial channel information from the access point to its associated devices, the queue information of its associated users, and the local action-observation history of the access point:
Figure 879003DEST_PATH_IMAGE021
The edge feature characterizes the interaction from agent
Figure 856579DEST_PATH_IMAGE022
to agent
Figure 195288DEST_PATH_IMAGE023
which can be expressed mathematically as:
Figure 913583DEST_PATH_IMAGE068
step 2.3: since graph nodes and edges have high-dimensional features in a large-scale network, a graph-embedding-based action generation module is proposed. The module uses a message passing graph neural network to learn low-dimensional embedded features of the directed graph, which can effectively improve the generalization ability of the network and enhance the cooperation between the wireless access points and the reconfigurable intelligent surfaces while requiring only low information exchange overhead.
Each distributed node
Figure 490189DEST_PATH_IMAGE022
maintains a message passing graph neural network. Similar to a multilayer perceptron, the message passing graph neural network adopts a layered structure. Within each layer, each agent first transmits its embedded information to its neighboring agents, then aggregates the embedded information received from the neighboring agents and updates its local hidden state. The message passing process is as follows:
Figure 169825DEST_PATH_IMAGE026
where
Figure 464672DEST_PATH_IMAGE027
denotes the message function and
Figure 986658DEST_PATH_IMAGE028
denotes the update operation. After the graph embedding module, the agent
Figure 542404DEST_PATH_IMAGE022
uses a gated recurrent unit to predict its local action from the output local embedding state
Figure 330625DEST_PATH_IMAGE029
, where the gated recurrent unit is a simplified variant of the long short-term memory network. The local embedding state is given by:
Figure 909505DEST_PATH_IMAGE030
The agent
Figure 533384DEST_PATH_IMAGE022
takes the local action
Figure 848697DEST_PATH_IMAGE031
, which is sampled from the action generation sub-policy
Figure 837512DEST_PATH_IMAGE032
.
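The flow of step 2.3 (message passing, a gated recurrent unit, then sampling an action) can be sketched as follows. This is an illustration under stated assumptions, not the patented formulas: the message function (neighbor state plus edge feature), the update operation (tanh of own state plus mean of incoming messages), the elementwise GRU weights, the discrete action space, and the names `mp_layer`, `gru_step` and `sample_action` are all hypothetical choices made for brevity.

```python
import math
import random


def mp_layer(hidden, edges, edge_feat):
    """One message passing layer.

    hidden: {agent: [float]}, edges: [(src, dst)], edge_feat: {(src, dst): [float]}.
    Assumed message function: neighbor hidden state plus edge feature.
    Assumed update operation: tanh(own state + mean of incoming messages).
    """
    new_hidden = {}
    for i, h_i in hidden.items():
        msgs = [[hj + e for hj, e in zip(hidden[s], edge_feat[(s, d)])]
                for (s, d) in edges if d == i]
        if msgs:
            agg = [sum(col) / len(msgs) for col in zip(*msgs)]
        else:
            agg = [0.0] * len(h_i)  # no neighbors: nothing to aggregate
        new_hidden[i] = [math.tanh(a + b) for a, b in zip(h_i, agg)]
    return new_hidden


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, w):
    """Minimal GRU cell with scalar, elementwise weights (a simplification)."""
    z = [sigmoid(w["wz"] * a + w["uz"] * b) for a, b in zip(x, h)]   # update gate
    r = [sigmoid(w["wr"] * a + w["ur"] * b) for a, b in zip(x, h)]   # reset gate
    cand = [math.tanh(w["wh"] * a + w["uh"] * rr * b)
            for a, b, rr in zip(x, h, r)]                            # candidate state
    # One common convention: interpolate between old state and candidate.
    return [(1.0 - zz) * b + zz * c for zz, b, c in zip(z, h, cand)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(state, readout, rng):
    """Sample a discrete local action from a softmax sub-policy over linear logits."""
    logits = [sum(wi * si for wi, si in zip(row, state)) for row in readout]
    probs = softmax(logits)
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action, probs


if __name__ == "__main__":
    hidden = {"AP1": [0.0, 0.0], "RIS1": [1.0, -1.0]}
    edges = [("RIS1", "AP1"), ("AP1", "RIS1")]
    edge_feat = {("RIS1", "AP1"): [0.5, 0.5], ("AP1", "RIS1"): [0.5, 0.5]}
    embedded = mp_layer(hidden, edges, edge_feat)
    weights = {"wz": 0.5, "uz": 0.5, "wr": 0.5, "ur": 0.5, "wh": 1.0, "uh": 1.0}
    state = gru_step(embedded["AP1"], [0.0, 0.0], weights)
    print(sample_action(state, [[1.0, 0.0], [0.0, 1.0]], random.Random(0)))
```

Each agent only reads the states of its direct in-neighbors, so one layer needs one exchange per edge; stacking layers widens the neighborhood an agent can take into account.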
Step 2.4: The combined parameters of the graph embedding module and the action generation module in the distributed policy are denoted as
Figure 467470DEST_PATH_IMAGE033
. The goal is to maximize the performance function:
Figure 832724DEST_PATH_IMAGE034
where
Figure 2543DEST_PATH_IMAGE035
is the joint state transition under the joint policy
Figure 490156DEST_PATH_IMAGE036
. The policy gradient is therefore computed from the advantage function and is given by:
Figure 43628DEST_PATH_IMAGE037
where
Figure 182879DEST_PATH_IMAGE039
is the actual input to the graph embedding module, and
Figure 708669DEST_PATH_IMAGE040
denotes the temporal-difference advantage, given by:
Figure 537823DEST_PATH_IMAGE041
where
Figure 47432DEST_PATH_IMAGE042
denotes the global state value and
Figure 255953DEST_PATH_IMAGE043
denotes the global state-action value. To solve the credit assignment problem during training, the distributed network is trained using value decomposition, in which the global state value
Figure 760884DEST_PATH_IMAGE044
is decomposed into a combination of local state values through a mixing function, as shown in the following equation:
Figure 760939DEST_PATH_IMAGE069
where
Figure 492266DEST_PATH_IMAGE047
denotes the local state value of agent
Figure 973319DEST_PATH_IMAGE022
. During centralized training, each agent obtains a difference reward by evaluating, based on its local graph embedding features, its contribution to the improvement of the global reward, which further facilitates coordination between agents. Let
Figure 332756DEST_PATH_IMAGE048
denote the weight parameters of the distributed network, which are shared among the agents, and let
Figure 270756DEST_PATH_IMAGE049
denote the weights of the hybrid (mixing) network
Figure 456756DEST_PATH_IMAGE050
. The distributed network and the hybrid network are optimized by mini-batch gradient descent, minimizing the following loss:
Figure 771194DEST_PATH_IMAGE070
where
Figure 510040DEST_PATH_IMAGE052
is the n-step return from the last state, with n upper-bounded by T. The parameters of the hybrid network can therefore be updated by:
Figure 87783DEST_PATH_IMAGE071
where
Figure 652756DEST_PATH_IMAGE054
is the learning rate for updating the hybrid network. To reduce complexity, the weight parameters of the non-output layers of the distributed network are additionally shared, and the combined weight parameters of the distributed network are denoted as
Figure 941524DEST_PATH_IMAGE055
. The gradient with respect to
Figure 275553DEST_PATH_IMAGE056
can then be calculated as:
Figure 56821DEST_PATH_IMAGE072
The update rule for the distributed network can thus be derived as:
Figure 515615DEST_PATH_IMAGE073
where
Figure 280178DEST_PATH_IMAGE059
and
Figure 406397DEST_PATH_IMAGE060
denote the policy improvement learning rate and the critic learning rate, respectively.
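The quantities used in step 2.4 (n-step return, advantage, value decomposition through a mixing function, squared loss, and a gradient update of the hybrid network) can be sketched numerically. Everything specific here is an assumption made for illustration: the discount factor `gamma`, the use of the n-step return as the estimate of the global state-action value, the linear mixing function with non-negative weights, and the function names. The patent's own formulas are the ones referenced above.

```python
def n_step_returns(rewards, bootstrap_value, gamma):
    """Discounted return for every step, bootstrapped from the value of the last state."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def mix(local_values, weights, bias):
    """Assumed mixing function: weighted sum with non-negative weights (via abs),
    so the global value never decreases when a local value increases."""
    return sum(abs(w) * v for w, v in zip(weights, local_values)) + bias


def advantages(returns, global_values):
    """Advantage estimate: n-step return minus the global state value."""
    return [ret - v for ret, v in zip(returns, global_values)]


def loss(batch_locals, returns, weights, bias):
    """Mean squared error between n-step returns and mixed global values."""
    errs = [ret - mix(loc, weights, bias) for loc, ret in zip(batch_locals, returns)]
    return sum(e * e for e in errs) / len(errs)


def sgd_step(batch_locals, returns, weights, bias, lr):
    """One mini-batch gradient descent step on the mixing weights and bias."""
    n = len(returns)
    errs = [ret - mix(loc, weights, bias) for loc, ret in zip(batch_locals, returns)]
    new_weights = []
    for k, w in enumerate(weights):
        sign = 1.0 if w >= 0 else -1.0  # derivative of abs(w); subgradient at 0
        grad = sum(-2.0 * e * sign * loc[k] for e, loc in zip(errs, batch_locals)) / n
        new_weights.append(w - lr * grad)
    grad_b = sum(-2.0 * e for e in errs) / n
    return new_weights, bias - lr * grad_b


if __name__ == "__main__":
    rets = n_step_returns([1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5)
    locals_per_step = [[1.0, 2.0], [0.5, 1.0], [0.2, 0.4]]
    w, b = [0.5, 0.5], 0.0
    values = [mix(loc, w, b) for loc in locals_per_step]
    print(rets, advantages(rets, values))
    print(loss(locals_per_step, rets, w, b), sgd_step(locals_per_step, rets, w, b, 0.05))
```

In this sketch the local values are given as plain numbers; in the method described above they would come from the distributed network, whose shared weights are updated by the separate rule with the policy improvement and critic learning rates.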
Preferably, step 3 is specifically as follows:
Step 3.1: The power Internet of Things data obtained by actual observation is input, as the observation states of the agents and the environment information, into the graph-embedding-based network update algorithm, and the network parameters and the network learning rate
Figure 686462DEST_PATH_IMAGE061
are initialized.
Step 3.2: A batch of data
Figure 835814DEST_PATH_IMAGE062
is drawn from the experience pool, the policy gradient
Figure 358063DEST_PATH_IMAGE063
and the network loss
Figure 837323DEST_PATH_IMAGE064
are calculated according to the formulas derived in step 2.4, and the hybrid network parameters are updated using the hybrid network parameter update formula in step 2.4.
Step 3.3: The network parameters in the power Internet of Things are further updated according to the distributed network parameter update rule in step 2.4 until the network converges.
Step 3.4: The trained network parameters are updated periodically, or the network is retrained and its parameters updated when the power Internet of Things changes significantly. In this way, the access requirements of devices in the power Internet of Things are met and customized communication is realized.
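The control flow of steps 3.1 to 3.3 can be sketched as a generic training loop. This is a skeleton under stated assumptions: `ExperiencePool`, `train`, the three callables, the batch size, and the convergence test (change in loss below a tolerance) are hypothetical, and the actual gradient, loss and update formulas are those of step 2.4.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity pool of observed transitions (oldest entries are dropped first)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        items = list(self.buffer)
        return rng.sample(items, min(batch_size, len(items)))


def train(pool, compute_loss, update_hybrid, update_distributed,
          batch_size=32, tol=1e-6, max_iters=10000, seed=0):
    """Repeat steps 3.2 and 3.3 until the loss stops changing or max_iters is reached.

    compute_loss, update_hybrid and update_distributed are caller-supplied
    callables standing in for the formulas of step 2.4.
    Returns (iterations used, last loss).
    """
    rng = random.Random(seed)
    prev_loss = float("inf")
    for it in range(1, max_iters + 1):
        batch = pool.sample(batch_size, rng)   # step 3.2: draw a batch
        current = compute_loss(batch)          # step 3.2: evaluate the loss
        update_hybrid(batch)                   # step 3.2: hybrid network update
        update_distributed(batch)              # step 3.3: distributed network update
        if abs(prev_loss - current) < tol:     # step 3.3: assumed convergence test
            return it, current
        prev_loss = current
    return max_iters, prev_loss


if __name__ == "__main__":
    # Toy stand-in: fit one scalar parameter to the mean of the stored samples.
    pool = ExperiencePool(capacity=100)
    for x in (1.0, 2.0, 3.0):
        pool.add(x)
    params = {"theta": 0.0}

    def toy_loss(batch):
        return sum((x - params["theta"]) ** 2 for x in batch) / len(batch)

    def toy_update(batch):
        grad = sum(-2.0 * (x - params["theta"]) for x in batch) / len(batch)
        params["theta"] -= 0.1 * grad

    print(train(pool, toy_loss, lambda batch: None, toy_update, batch_size=3))
    print(params["theta"])
```

Step 3.4 then corresponds to calling such a loop again, either on a schedule or when the observed network conditions change significantly, with the pool refilled from fresh observations.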
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (4)

1. A method for cooperation between a wireless access point and a reconfigurable intelligent surface, characterized by comprising the following steps:
Step 1: building a device communication architecture based on the power Internet of Things, wherein the device communication architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces; the cooperative relations between each access point and its adjacent access points and reconfigurable intelligent surfaces are modeled as interactions between agents, that is, as edges in the graph neural network input, so as to construct the input topology of a message passing graph neural network; and an embedded representation of the topology is obtained through the message passing graph neural network, so as to provide services for power Internet of Things terminals;
Step 2: designing, according to the established device communication architecture based on the power Internet of Things, a corresponding cooperation method for the access points and the reconfigurable intelligent surfaces, with the goal of maximizing system energy efficiency while meeting the quality-of-service requirements of massive devices in the power Internet of Things in terms of data transmission rate and reliability;
Step 3: based on the cooperation method for the access points and the reconfigurable intelligent surfaces designed in step 2, each access point cooperates with the reconfigurable intelligent surfaces according to the trained model, so as to meet the access requirements of massive devices in the power Internet of Things.
2. The method for cooperation between a wireless access point and a reconfigurable intelligent surface according to claim 1, wherein step 1 is specifically as follows:
in the device communication architecture of the power Internet of Things, the pre-installed access points in the network are denoted as
Figure 347463DEST_PATH_IMAGE001
, and the reconfigurable intelligent surfaces in the network are denoted as
Figure 656216DEST_PATH_IMAGE002
; the M wireless access points and the J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as nodes in the graph neural network input; the access information of the power Internet of Things devices and the hybrid spatial beam configuration between the wireless access points and the reconfigurable intelligent surfaces are taken as features of the graph topology and input into the message passing graph neural network; and a stable graph-embedded representation of the node features is obtained through the message passing mechanism of the message passing graph neural network.
3. The method for cooperation between a wireless access point and a reconfigurable intelligent surface according to claim 1, wherein step 2 is specifically as follows:
Step 2.1: modeling the system energy efficiency optimization problem as a decentralized partially observable Markov decision process;
Step 2.2: achieving more efficient cooperative learning by integrating two techniques, graph embedding and difference rewards;
Step 2.3: maintaining a message passing graph neural network at each distributed node i, wherein, within each message passing graph neural network layer, each agent first transmits its embedded information to its neighboring agents, then aggregates the embedded information received from the neighboring agents and updates its local hidden state;
Step 2.4: each agent obtains a difference reward by evaluating, based on its local graph embedding features, its contribution to the improvement of the global reward, thereby further facilitating coordination between agents.
4. The method for cooperation between a wireless access point and a reconfigurable intelligent surface according to claim 3, wherein step 3 is specifically as follows:
Step 3.1: inputting the power Internet of Things data obtained by actual observation, as the observation states of the agents and the environment information, into the graph-embedding-based network update algorithm, and initializing the network parameters and the network learning rate
Figure 841209DEST_PATH_IMAGE003
;
Step 3.2: extracting a batch of data B from the experience pool, and calculating, according to the formulas derived in step 2.4, the policy gradient
Figure 801075DEST_PATH_IMAGE004
and the network loss
Figure 531265DEST_PATH_IMAGE005
, and updating the hybrid network parameters based on the hybrid network parameter update formula in step 2.4;
Step 3.3: further updating the network parameters in the power Internet of Things according to the distributed network parameter update rule in step 2.4 until the network converges;
Step 3.4: updating the trained network parameters periodically, or retraining the network and updating its parameters when the power Internet of Things changes significantly, so as to meet the access requirements of devices in the power Internet of Things and realize customized communication.
CN202211429707.5A 2022-11-16 2022-11-16 Wireless access point and reconfigurable intelligent surface cooperation method Active CN115499849B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211429707.5A CN115499849B (en) 2022-11-16 2022-11-16 Wireless access point and reconfigurable intelligent surface cooperation method

Publications (2)

Publication Number Publication Date
CN115499849A true CN115499849A (en) 2022-12-20
CN115499849B CN115499849B (en) 2023-04-07

Family

ID=85115737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211429707.5A Active CN115499849B (en) 2022-11-16 2022-11-16 Wireless access point and reconfigurable intelligent surface cooperation method

Country Status (1)

Country Link
CN (1) CN115499849B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019219969A1 (en) * 2018-05-18 2019-11-21 Deepmind Technologies Limited Graph neural network systems for behavior prediction and reinforcement learning in multple agent environments
CN111786713A (en) * 2020-06-04 2020-10-16 大连理工大学 Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning
CN113472419A (en) * 2021-06-23 2021-10-01 西北工业大学 Safe transmission method and system based on space-based reconfigurable intelligent surface
WO2021208771A1 (en) * 2020-04-18 2021-10-21 华为技术有限公司 Reinforced learning method and device
CN113573293A (en) * 2021-07-14 2021-10-29 南通大学 Intelligent emergency communication system based on RIS
US20210344384A1 (en) * 2020-04-29 2021-11-04 The Regents Of The University Of California Virtual mimo with smart surfaces
CN114286369A (en) * 2021-12-28 2022-04-05 杭州电子科技大学 AP and RIS combined selection method of RIS auxiliary communication system
CN114422056A (en) * 2021-12-03 2022-04-29 北京航空航天大学 Air-ground non-orthogonal multiple access uplink transmission method based on intelligent reflecting surface
CN114466388A (en) * 2022-02-16 2022-05-10 北京航空航天大学 Intelligent super-surface-assisted wireless energy-carrying communication method
US20220247480A1 (en) * 2021-02-01 2022-08-04 Ntt Docomo, Inc. Method and apparatus for user localization and tracking using radio signals reflected by reconfigurable smart surfaces
CN115103372A (en) * 2022-06-17 2022-09-23 东南大学 Multi-user MIMO system user scheduling method based on deep reinforcement learning
CN115146538A (en) * 2022-07-11 2022-10-04 河海大学 Power system state estimation method based on message passing graph neural network
CN115310775A (en) * 2022-07-13 2022-11-08 武汉大学 Multi-agent reinforcement learning rolling scheduling method, device, equipment and storage medium
CN115333143A (en) * 2022-07-08 2022-11-11 国网黑龙江省电力有限公司大庆供电公司 Deep learning multi-agent micro-grid cooperative control method based on double neural networks

Also Published As

Publication number Publication date
CN115499849B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
Qi et al. Knowledge-driven service offloading decision for vehicular edge computing: A deep reinforcement learning approach
Mocanu et al. On-line building energy optimization using deep reinforcement learning
CN113282368B (en) Edge computing resource scheduling method for substation inspection
Gurgen et al. Self-aware cyber-physical systems and applications in smart buildings and cities
Geetha et al. Green energy aware and cluster based communication for future load prediction in IoT
Liu et al. Federated reinforcement learning for decentralized voltage control in distribution networks
Shi et al. Machine learning for large-scale optimization in 6g wireless networks
CN112598150A (en) Method for improving fire detection effect based on federal learning in intelligent power plant
Kumari et al. An energy efficient smart metering system using edge computing in LoRa network
CN110267292A (en) Cellular network method for predicting based on Three dimensional convolution neural network
CN114885340B (en) Ultra-dense wireless network power distribution method based on deep migration learning
Ebrahim et al. A deep learning approach for task offloading in multi-UAV aided mobile edge computing
CN115409431A (en) Distributed power resource scheduling method based on neural network
Shen et al. Edgematrix: A resource-redefined scheduling framework for sla-guaranteed multi-tier edge-cloud computing systems
CN112330021A (en) Network coordination control method of distributed optical storage system
Mishra et al. Enabling cyber‐physical demand response in smart grids via conjoint communication and controller design
Hlophe et al. AI meets CRNs: A prospective review on the application of deep architectures in spectrum management
Qin et al. Dynamic IoT service placement based on shared parallel architecture in fog-cloud computing
CN115499849B (en) Wireless access point and reconfigurable intelligent surface cooperation method
Wang et al. A resource allocation strategy for edge services based on intelligent prediction
Bellavista et al. Edge Cloud as an Enabler for Distributed AI in Industrial IoT Applications: the Experience of the IoTwins Project.
Zhou et al. Binary quantum elite particle swarm optimization algorithm for spectrum allocation in cognitive wireless medical sensor network
Zhang et al. Application of artificial intelligence for space-air-ground-sea integrated network
Marinescu et al. Deep learning–based coverage and capacity optimization
Rodway et al. Differential evolution optimized fuzzy controller for wireless sensor network energy management

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant