CN116321236A - RIS-assisted secure cell-free massive MIMO system energy efficiency optimization method - Google Patents
- Publication number
- CN116321236A (application number CN202310367185.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/02—Arrangements for optimising operational condition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B7/00—Radio transmission systems, i.e. using radiation field
- H04B7/02—Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas
- H04B7/04—Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas
- H04B7/06—Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station
- H04B7/0613—Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station using simultaneous transmission
- H04B7/0615—Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station using simultaneous transmission of weighted versions of same signal
- H04B7/0619—Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station using simultaneous transmission of weighted versions of same signal using feedback from receiving side
- H04B7/0621—Feedback content
- H04B7/0626—Channel coefficients, e.g. channel state information [CSI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/06—Testing, supervising or monitoring using simulated traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. Transmission Power Control [TPC] or power classes
- H04W52/04—Transmission power control [TPC]
- H04W52/18—TPC being performed according to specific parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. Transmission Power Control [TPC] or power classes
- H04W52/04—Transmission power control [TPC]
- H04W52/18—TPC being performed according to specific parameters
- H04W52/24—TPC being performed according to specific parameters using SIR [Signal to Interference Ratio] or other wireless path parameters
- H04W52/243—TPC being performed according to specific parameters using SIR [Signal to Interference Ratio] or other wireless path parameters taking into account interferences
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention relates to the technical field of wireless communication networks and discloses an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system. S1: each AP performs channel estimation from the pilot sequences sent by the legitimate users and the active eavesdropper, obtains their channel state information and beam coefficients, and sends these to the central processing unit (CPU) over the backhaul link. S2: the CPU calculates the interference vectors among the legitimate users and between the legitimate users and the active eavesdropper, and sends them to all APs and reconfigurable intelligent surfaces connected to it. S3: each AP adjusts its power according to the interference vector from the CPU and the transmission data of the legitimate users and the active eavesdropper, and each reconfigurable intelligent surface adjusts its phase. The invention can improve the energy efficiency of the system while ensuring secure communication.
Description
Technical Field
The invention relates to the technical field of wireless communication networks, and in particular to an energy efficiency optimization design method for an RIS-assisted secure cell-free massive MIMO system.
Background
In recent decades, multiple-input multiple-output (MIMO) technology has received widespread attention. However, because conventional wireless communication networks use a cellular structure, the performance of multi-cell MIMO systems is often limited by inter-cell interference. To address this, a new user-centric network paradigm, the cell-free network, has recently been proposed. Unlike conventional cellular design, a cell-free network uses user-centric transmission: all access points in the network jointly serve all users, with no cell boundaries. Efficient cooperation among the distributed access points effectively mitigates inter-cell interference and offers high potential for improving network spectral and energy efficiency. The technique is considered a candidate for future communication systems and has attracted growing research interest in areas such as resource allocation, precoding/beamforming, and channel estimation. However, deploying a large number of distributed devices raises cost and power consumption, which leaves the energy efficiency below its optimum.
Fortunately, a new technology, the reconfigurable intelligent surface (RIS), has been identified as a low-cost, low-energy way to improve spectral efficiency, and it can effectively address this problem. An RIS consists of a large number of low-cost passive elements whose phase shifts are controlled by simple programmable PIN diodes; it reflects the incident signal and forms a directional beam toward the user. Unlike the large-scale phased-array antennas implemented with phase shifters in a cell-free network, an RIS needs no additional hardware such as complex digital phase-shifting circuitry, which greatly reduces the energy consumption and complexity of signal processing. Deploying RISs in a cell-free network therefore provides the same service at lower power consumption.
The RIS-assisted cell-free massive MIMO system addresses the high cost and high energy consumption caused by the large-scale deployment that existing cell-free networks require: RISs take over part of the communication, so that communication is achieved at relatively low cost and low energy consumption. In addition, a cell-free network is vulnerable to an active eavesdropper during the uplink channel estimation phase.
Therefore, how to optimize the energy efficiency of the RIS-assisted cell-free massive MIMO system while ensuring secure communication is a problem to be solved.
Disclosure of Invention
The invention provides an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system to solve the above problems.
The invention is realized by the following technical scheme:
An energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system, wherein the system comprises a central processing unit (CPU), M multi-antenna APs, K single-antenna legitimate users, R reconfigurable intelligent surfaces each with N reflection units, and one single-antenna active eavesdropper. The specific steps are:
S1, in the uplink training stage, channel estimation is performed from the pilot sequences sent by the legitimate users and the eavesdropper, yielding their channel state information and beam coefficients, which are sent to the CPU over the backhaul link;
S2, in the downlink transmission stage, the CPU calculates an interference vector from the received channel state information, the received beam coefficients, and the spatial relationship among the antennas, and sends the interference vector to all APs and reconfigurable intelligent surfaces connected to it;
S3, each AP adjusts its power, and each reconfigurable intelligent surface adjusts its phase, according to the interference vector from the CPU and the transmission data of the legitimate users and the eavesdropper; after adjustment, the APs and the reconfigurable intelligent surfaces send the transmission data to all legitimate users and active eavesdroppers connected to them.
As an optimization, in S3 the specific steps of the phase adjustment and power adjustment according to the interference vector from the CPU and the transmission data of the legitimate users are:
S3.1, according to the interference vector and the transmission data, setting the relevant constraints under which the AP sends the transmission information and the reconfigurable intelligent surface reflects it to the legitimate users and the eavesdropper;
S3.2, establishing a total energy consumption model of the RIS-assisted secure cell-free massive MIMO system;
S3.3, combining the constraints and the total energy consumption model to obtain an optimization problem;
S3.4, solving the optimization problem with a deep deterministic policy gradient algorithm with prioritized experience replay to obtain the optimal phase of the reconfigurable intelligent surface and the optimal power control coefficient of the AP;
S3.5, adjusting the phase of the reconfigurable intelligent surface according to the optimal phase, and adjusting the power of the AP according to the optimal power control coefficient.
As an optimization, in S3.1 the relevant constraints are:
$C_3: R_{\mathrm{sec}} \geq R_{\mathrm{th}}$
$C_1$ states that the backhaul link capacity of the system does not exceed the maximum capacity (the backhaul link capacity constraint); $C_2$ gives the value range of the power control coefficient; $C_3$ requires that the secrecy rate of the eavesdropped legitimate user be no less than the minimum secrecy rate, so that this user can communicate securely; $C_4$ requires that the total power control coefficient of the transmission not exceed 1 (the transmit power constraint); $C_5$ requires that the spectral efficiency of each legitimate user be no lower than the minimum spectral efficiency, which guarantees the communication quality of the legitimate users; $C_6$ requires that the gain magnitude of each element of the reconfigurable intelligent surface be 1, a constraint on the modulus of the RIS reflection coefficient. Here,
$S_{ek}$ denotes the spectral efficiency of a legitimate user; $\tau_c$ is the length of the coherence interval and $\tau_p$ is the pilot sequence length; $\eta_{mk}$ is the power control coefficient with which the $m$-th AP transmits the signal for the $k$-th user; $\theta_{r,n}$ is the reflection coefficient of the $n$-th reflection unit on the $r$-th reconfigurable intelligent surface; $a_m$ indicates that the data rate carried by the $m$-th fronthaul link should be $a_m$ times the total achievable rate of the $m$-th AP; the upper bound in $C_1$ is the maximum capacity of the backhaul link between the $m$-th AP and the CPU; $R_{\mathrm{sec}}$ is the secrecy rate of the eavesdropped user and $R_{\mathrm{th}}$ is the secrecy rate threshold; $S_{ok}$ is the minimum spectral efficiency; $M$ is the total number of APs, $K$ the total number of users, and $N$ the total number of reflective elements in a reconfigurable intelligent surface.
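Only $C_3$ survives as a formula in this text, so the block below is a tentative reading of the six constraints based purely on the verbal descriptions above. $C_2$, $C_3$ and $C_6$ follow directly from the stated ranges and symbols. The forms of $C_1$, $C_4$ and $C_5$ are assumptions, and the symbols $R_{\mathrm{fh},m}$ (rate carried by the $m$-th link) and $C_{\max,m}$ (its maximum capacity) are hypothetical names introduced here for illustration.

```latex
\begin{aligned}
C_1 &: R_{\mathrm{fh},m} \le C_{\max,m}, && \forall m \quad \text{(assumed form)} \\
C_2 &: 0 \le \eta_{mk} \le 1, && \forall m, k \\
C_3 &: R_{\mathrm{sec}} \ge R_{\mathrm{th}} \\
C_4 &: \sum_{k=1}^{K} \eta_{mk} \le 1, && \forall m \quad \text{(assumed form)} \\
C_5 &: S_{ek} \ge S_{ok}, && \forall k \quad \text{(assumed form)} \\
C_6 &: \lvert \theta_{r,n} \rvert = 1, && \forall r, n
\end{aligned}
```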
As an optimization, the total energy consumption model of the RIS-assisted secure cell-free massive MIMO system is:
where $P_m$ is the power consumed by the amplifier and circuits at the $m$-th AP, including the transceiver chain and signal processing; $P_k$ is the circuit power consumption of the $k$-th user; $P_{r,n}$ is the (low) power consumed by the $n$-th reflective element of the $r$-th reconfigurable intelligent surface; $P_{\mathrm{fh},m}$ is the power consumed by the backhaul link that connects the CPU and the $m$-th AP and carries data between them; $M$ is the total number of APs, $K$ the total number of users, $N$ the total number of reflective elements in a reconfigurable intelligent surface, and $R$ the total number of reconfigurable intelligent surfaces.
As an optimization, the total throughput of the RIS-assisted secure cell-free massive MIMO system is:
where $B$ is the system bandwidth, $S_{ek}(\{\eta_{mk}, \theta_{r,n}\})$ is the spectral efficiency of each user, and $K$ is the total number of users.
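The two models above can be sketched numerically. This is a minimal illustration under stated assumptions, not the patent's formula: it assumes the total power is the plain sum of the four terms named in the text, that throughput is the bandwidth times the sum of per-user spectral efficiencies, and that energy efficiency is their ratio. All function names are hypothetical.

```python
def total_power(p_ap, p_user, p_ris, p_fh):
    """Assumed total power in watts: sum of AP, user, RIS-element and link terms.

    p_ap:   list of P_m, one per AP (length M)
    p_user: list of P_k, one per user (length K)
    p_ris:  R lists of N values P_{r,n}, one list per RIS
    p_fh:   list of P_{fh,m}, one per AP (length M)
    """
    ris_total = sum(sum(row) for row in p_ris)
    return sum(p_ap) + sum(p_user) + ris_total + sum(p_fh)


def total_throughput(bandwidth_hz, spectral_eff):
    """Assumed throughput in bit/s: B times the sum of S_ek over users."""
    return bandwidth_hz * sum(spectral_eff)


def energy_efficiency(bandwidth_hz, spectral_eff, p_ap, p_user, p_ris, p_fh):
    """Assumed energy efficiency in bit/J: throughput divided by total power."""
    power = total_power(p_ap, p_user, p_ris, p_fh)
    return total_throughput(bandwidth_hz, spectral_eff) / power


if __name__ == "__main__":
    # Illustrative numbers only: M=2 APs, K=3 users, R=2 RISs with N=4 elements.
    ee = energy_efficiency(
        20e6, [1.0, 2.0, 3.0],
        p_ap=[2.0, 2.0], p_user=[0.1, 0.1, 0.1],
        p_ris=[[0.01] * 4, [0.01] * 4], p_fh=[1.0, 1.0],
    )
    print(f"energy efficiency: {ee:.3e} bit/J")
```

Because the RIS term scales with R times N small per-element powers, this form makes it easy to see how adding reflective elements changes the denominator far less than adding APs.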
As an optimization, the power consumption $P_m$ generated by the amplifier and circuits at the $m$-th AP is:
where $0 < \alpha_m \leq 1$ is the power amplifier efficiency, $N_0$ is the noise power, $b$ is the number of antennas of the AP, $P_{\mathrm{tc},m}$ is the power required to run the circuit components associated with each antenna of the AP, and $\rho_d$ is the maximum normalized transmit power of the AP.
As an optimization, the power $P_{\mathrm{fh},m}$ consumed by the fronthaul link connecting the CPU and the $m$-th AP is:
$P_{\mathrm{fh},m} = P_{0,m} + B \cdot S_e(\{\eta_{mk}, \theta_{r,n}\}) \cdot P_{\mathrm{bt},m}$
where $P_{0,m}$ is the fixed power consumption of each backhaul link, $B$ is the system bandwidth, $P_{\mathrm{bt},m}$ is the traffic-dependent power consumption in W/(Gbit/s), and $S_e(\{\eta_{mk}, \theta_{r,n}\})$ is the overall spectral efficiency of the system.
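The link power formula above is simple enough to evaluate directly. The sketch below assumes the bandwidth is given in Hz and the spectral efficiency in bit/s/Hz, so the traffic is converted to Gbit/s before it is multiplied by $P_{\mathrm{bt},m}$; that unit conversion is an assumption made to match the stated unit W/(Gbit/s). The function name and the numbers are illustrative only.

```python
def fronthaul_power(p0_w, bandwidth_hz, spectral_eff, p_bt_w_per_gbps):
    """P_fh,m = P_0,m + B * S_e * P_bt,m, with B * S_e expressed in Gbit/s."""
    traffic_gbps = bandwidth_hz * spectral_eff / 1e9  # bit/s -> Gbit/s
    return p0_w + traffic_gbps * p_bt_w_per_gbps


if __name__ == "__main__":
    # Illustrative values: 20 MHz bandwidth, 10 bit/s/Hz overall spectral efficiency.
    print(fronthaul_power(0.825, 20e6, 10.0, 0.25))
```

With these example numbers the traffic-dependent part (0.05 W) is small next to the fixed part (0.825 W), which illustrates why the fixed per-link term tends to dominate when many APs are deployed.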
as optimization, solving the optimization problem by adopting a DDPG-PER algorithm, wherein the DDPG-PER algorithm flow is specifically as follows:
t1, initializing the parameters of an Actor network and a Critic network, setting a priority experience playback buffer zone, and defining learning rate parameters of the Actor and Critic, a reward attenuation coefficient, the number of training rounds and the number of training steps per round;
t2, in each training round, generating an action value by the Actor network according to the current state value;
t3, the action value obtained by the T2 is interacted with the environment to obtain a rewarding value, and a next state value is generated;
t4, storing experience into a priority experience playback buffer, wherein the experience consists of a current state value, an action value, a reward value and a generated next state value;
t5, sampling a batch of experiences with high priority from the priority experience playback buffer area according to the weight of each experience, and training an Actor network and a Critic network;
t6, calculating the Q value of the value function in the current state, and updating the parameters of the cost function by using the Critic network;
t7, using the Actor network and the current state value as inputs, calculating a value function Q value of the generated action value, and updating the Actor network parameter by using the value;
t8, repeating the steps from T2 to T7 until a preset training step number is reached or a stopping condition is met;
and T9, making a decision by using the trained Actor network to obtain a final strategy.
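Steps T4 and T5 rely on a prioritized replay buffer. The following is a minimal sketch of one common variant (proportional prioritization with importance-sampling weights), written with the standard library only. The exponents `alpha` and `beta`, the class name, and the choice to give new experiences the current maximum priority are assumptions taken from common practice, not details stated in the patent.

```python
import random


class PrioritizedReplayBuffer:
    """Ring buffer that samples experiences in proportion to priority**alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, state, action, reward, next_state):
        # New experiences get the current maximum priority so they are seen soon.
        max_p = max(self.priorities, default=1.0)
        item = (state, action, reward, next_state)
        if len(self.data) < self.capacity:
            self.data.append(item)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = item
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        n = len(self.data)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, indices, td_errors):
        # After a training step, priorities follow the absolute TD error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

A linear scan over priorities is used here for clarity; implementations that need large buffers usually replace it with a sum tree so that sampling costs O(log n).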
As an optimization,
the state value is the signal-to-interference-plus-noise ratio (SINR) of the legitimate users and the SINR of the active eavesdropper;
the action value is the phase shift matrix $\theta$ and the power control coefficient matrix $\eta$;
the reward value is the energy efficiency of the system.
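One practical question the action definition raises is how a raw Actor output becomes a valid pair ($\theta$, $\eta$). The sketch below shows one possible mapping, assuming the Actor emits values in [-1, 1] (for example from a tanh output layer), that unit-modulus reflection coefficients are obtained as $e^{j\pi x}$, and that the power coefficients of each AP are rescaled so their sum does not exceed 1. None of these choices is stated in the patent; `decode_action` is a hypothetical helper.

```python
import cmath
import math


def decode_action(raw_phase, raw_power):
    """Map raw Actor outputs in [-1, 1] to (theta, eta).

    raw_phase: R lists of N values -> theta[r][n] = exp(j * pi * x), so |theta| = 1
    raw_power: M lists of K values -> eta[m][k] in [0, 1] with sum over k <= 1
    """
    theta = [[cmath.exp(1j * math.pi * x) for x in row] for row in raw_phase]
    eta = []
    for row in raw_power:
        shifted = [(x + 1.0) / 2.0 for x in row]        # [-1, 1] -> [0, 1]
        total = sum(shifted)
        scale = 1.0 / total if total > 1.0 else 1.0     # shrink only if needed
        eta.append([v * scale for v in shifted])
    return theta, eta


if __name__ == "__main__":
    theta, eta = decode_action([[0.0, 0.5, -1.0]], [[1.0, 1.0, 1.0], [-1.0, 0.0, -1.0]])
    print([round(abs(t), 6) for t in theta[0]], eta)
```

Enforcing the modulus and power constraints inside the mapping means the learning agent only has to handle the remaining constraints (for instance the secrecy-rate threshold) through the reward.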
Compared with the prior art, the invention has the following advantages and beneficial effects:
The invention provides an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system. By jointly optimizing the RIS reflection coefficient $\theta$ and the power control coefficient $\eta$, it improves the energy efficiency of the system while ensuring secure communication. Simulation verification shows that it can significantly improve the energy efficiency of the cell-free massive MIMO system while guaranteeing secure communication for the eavesdropped user and reliable communication for the other legitimate users, at no additional cost, which gives it strong application value and development potential.
The invention uses a deep reinforcement learning algorithm to jointly optimize the RIS reflection coefficient $\theta$ and the power control coefficient $\eta$, maximizing the energy efficiency of the system subject to secure communication for the legitimate users.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, the drawings that are needed in the examples will be briefly described below, it being understood that the following drawings only illustrate some examples of the present invention and therefore should not be considered as limiting the scope, and that other related drawings may be obtained from these drawings without inventive effort for a person skilled in the art. In the drawings:
FIG. 1 is a schematic diagram of the composition of the RIS-assisted secure cell-free massive MIMO system according to the present invention;
FIG. 2 is a graph of energy efficiency versus AP transmit power;
FIG. 3 is a graph of energy efficiency versus number of legitimate users;
FIG. 4 is a graph of secrecy energy efficiency versus secrecy rate threshold.
Detailed Description
For the purpose of making apparent the objects, technical solutions and advantages of the present invention, the present invention will be further described in detail with reference to the following examples and the accompanying drawings, wherein the exemplary embodiments of the present invention and the descriptions thereof are for illustrating the present invention only and are not to be construed as limiting the present invention.
Embodiment 1 provides an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system. The composition of this system (hereinafter "the system") is shown in FIG. 1: it includes a central processing unit (CPU), M multi-antenna APs (which can be understood as base stations), K single-antenna legitimate users, R reconfigurable intelligent surfaces each with N reflection units, and one single-antenna active eavesdropper, where M, K, and R are positive integers.
The basic working principle of the RIS-assisted cell-free massive MIMO system is as follows. Data is mostly transmitted in time-division duplex (TDD) mode, in which uplink and downlink transmissions occupy the same frequency band. In the uplink training stage, a legitimate user sends a training sequence in a given time slot; on receiving it, an access point (AP) performs channel estimation and beam selection, then sends the resulting channel state information (CSI) and the selected beam coefficients to the CPU over the backhaul link. In the downlink transmission stage, the CPU calculates an interference vector from the received CSI, the beam coefficients, and the spatial relationship among the antennas, and sends it to all APs connected to it. After receiving the interference vector, each AP adjusts its transmit power, and the RIS phase shifts are adjusted, according to the interference vector and the data to be transmitted; the data is then sent to all connected legitimate users and to the active eavesdropper. Each user receives the signals from all APs and RISs serving it, then performs interference cancellation and signal decoding to obtain the required data.
S3.1, according to the interference vector and the transmission data, setting the relevant constraints under which the access point sends the transmission information to the legitimate users;
S3.2, establishing a total energy consumption model of the RIS-assisted secure cell-free massive MIMO system;
S3.3, combining the constraints and the total energy consumption model to obtain an optimization problem;
S3.4, solving the optimization problem with a deep deterministic policy gradient algorithm with prioritized experience replay to obtain the optimal phase of the reconfigurable intelligent surface and the optimal power control coefficient of the AP;
S3.5, adjusting the phase of the reconfigurable intelligent surface according to the optimal phase, and adjusting the power of the AP according to the optimal power control coefficient.
From the above description, the signal received by the $k$-th legitimate user is:
where $\rho_d$ is the maximum normalized transmit power of the AP, $0 \leq \eta_{mk} \leq 1$ is the power control coefficient, the composite channel is that between the $m$-th AP and the $k$-th legitimate user, $w_{mk}$ is the precoding vector with which the AP transmits data to the legitimate user, and the remaining terms are the data signal sent to legitimate user $k$ and the additive white Gaussian noise (AWGN) at legitimate user $k$.
Similarly, the signal received by the active eavesdropper is:
The signal-to-interference-plus-noise ratios of the legitimate user and the active eavesdropper are, respectively:
The secrecy rate of legitimate user $j$ is then:
$R_{\mathrm{sec},j} = \log_2(1+\gamma_j) - \log_2(1+\gamma_E)$
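The secrecy rate expression above can be checked with a few lines of code. This sketch evaluates the formula exactly as written, taking the two SINR values as linear (not dB) inputs; the helper `meets_threshold`, which mirrors constraint $C_3$, is a hypothetical name. The formula as given in the source does not clip negative values to zero, so neither does the code.

```python
import math


def secrecy_rate(sinr_user, sinr_eve):
    """R_sec,j = log2(1 + gamma_j) - log2(1 + gamma_E), in bit/s/Hz."""
    return math.log2(1.0 + sinr_user) - math.log2(1.0 + sinr_eve)


def meets_threshold(rate, r_th):
    """Constraint C_3: the secrecy rate must be at least the threshold R_th."""
    return rate >= r_th


if __name__ == "__main__":
    # Linear SINRs of 15 (user) and 3 (eavesdropper): log2(16) - log2(4) = 2.
    r = secrecy_rate(15.0, 3.0)
    print(r, meets_threshold(r, 1.0))
```

The result turns negative whenever the eavesdropper's SINR exceeds the user's, which is exactly the case the threshold constraint is meant to rule out.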
Furthermore, the total energy consumption of the system can be modeled as:
where $P_m$ is the power consumed by the amplifier and circuits at the $m$-th AP (including the transceiver chain and signal processing); $P_k$ is the circuit power consumption of the $k$-th user; $P_{r,n}$ is the (low) power consumed by the $n$-th reflective element of the $r$-th RIS; $P_{\mathrm{fh},m}$ is the power consumed by the fronthaul link that connects the CPU and the $m$-th AP and carries data between them; $M$ is the total number of APs, $K$ the total number of users, $N$ the total number of reflective elements in a reconfigurable intelligent surface, and $R$ the total number of reconfigurable intelligent surfaces.
The power consumption $P_m$ generated by the amplifier and circuits at the $m$-th AP can be modeled as:
where $0 < \alpha_m \leq 1$ is the power amplifier efficiency, $N_0$ is the noise power, $b$ is the number of antennas, $P_{\mathrm{tc},m}$ is the power required to run the circuit components associated with each antenna of the AP, and $\rho_d$ is the maximum normalized transmit power.
The backhaul carries data between an AP and the CPU (each backhaul link represents one such AP-to-CPU data transfer path). Its power consumption grows linearly with the spectral efficiency and can be expressed as:
$P_{\mathrm{fh},m} = P_{0,m} + B \cdot S_e(\{\eta_{mk}, \theta_{r,n}\}) \cdot P_{\mathrm{bt},m}$
where $P_{0,m}$ is the fixed power consumption of each backhaul link, $B$ is the system bandwidth, $P_{\mathrm{bt},m}$ is the traffic-dependent power consumption in W/(Gbit/s), and $S_e(\{\eta_{mk}, \theta_{r,n}\})$ is the overall spectral efficiency of the system, whose expression is:
The spectral efficiency of each user can be expressed as:
where $\tau_c$ is the length of the coherence interval and $\tau_p$ is the pilot length. Both are standard terms from pilot training and are not detailed further here.
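The per-user expression itself is not reproduced in this text. A common form in the cell-free literature, assumed here purely for illustration, discounts the rate by the fraction of the coherence interval spent on pilots: $S_{ek} = (1 - \tau_p/\tau_c)\log_2(1 + \gamma_k)$. The sketch below evaluates that assumed form and, as a further assumption, takes the overall spectral efficiency $S_e$ to be the sum over users.

```python
import math


def user_spectral_efficiency(sinr, tau_c, tau_p):
    """Assumed form: (1 - tau_p / tau_c) * log2(1 + SINR), in bit/s/Hz."""
    if not 0 <= tau_p < tau_c:
        raise ValueError("pilot length must satisfy 0 <= tau_p < tau_c")
    return (1.0 - tau_p / tau_c) * math.log2(1.0 + sinr)


def overall_spectral_efficiency(sinrs, tau_c, tau_p):
    """Assumed overall S_e: sum of the per-user spectral efficiencies."""
    return sum(user_spectral_efficiency(s, tau_c, tau_p) for s in sinrs)


if __name__ == "__main__":
    # Coherence interval of 200 samples, 20 of them used for pilots.
    print(user_spectral_efficiency(15.0, tau_c=200, tau_p=20))
```

Under this form, longer pilots improve channel estimates (and hence SINR) but shrink the pre-log factor, which is why $\tau_p$ appears alongside $\tau_c$ in the definitions.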
To address the low energy efficiency and poor security of cell-free massive MIMO systems, the invention introduces RISs and maximizes the energy efficiency of the RIS-assisted secure cell-free massive MIMO system by jointly optimizing the RIS reflection coefficient $\theta$ and the AP power control coefficient $\eta$.
$C_3: R_{\mathrm{sec}} \geq R_{\mathrm{th}}$
$C_1$ states that the backhaul link capacity of the system does not exceed the maximum capacity (the backhaul link capacity constraint); $C_2$ gives the value range of the power control coefficient; $C_3$ requires that the secrecy rate of the eavesdropped legitimate user be no less than the minimum secrecy rate, so that this user can communicate securely; $C_4$ requires that the total power control coefficient of the transmission not exceed 1 (the transmit power constraint); $C_5$ requires that the spectral efficiency of each legitimate user be no lower than the minimum spectral efficiency, which guarantees the communication quality of the legitimate users; $C_6$ requires that the gain magnitude of each element of the reconfigurable intelligent surface be 1, a constraint on the modulus of the RIS reflection coefficient. Here,
$S_{ek}$ denotes the spectral efficiency of a legitimate user; $\tau_c$ is the length of the coherence interval and $\tau_p$ is the pilot sequence length; $\eta_{mk}$ is the power control coefficient with which the $m$-th AP transmits the signal for the $k$-th user; $\theta_{r,n}$ is the reflection coefficient of the $n$-th reflection unit on the $r$-th reconfigurable intelligent surface; $a_m$ indicates that the data rate carried by the $m$-th fronthaul link should be $a_m$ times the total achievable rate of the $m$-th AP; the upper bound in $C_1$ is the maximum capacity of the backhaul link between the $m$-th AP and the CPU; $R_{\mathrm{sec}}$ is the secrecy rate of the eavesdropped legitimate user and $R_{\mathrm{th}}$ is the secrecy rate threshold; $S_{ok}$ is the minimum spectral efficiency; $M$ is the total number of APs, $K$ the total number of users, and $N$ the total number of reflective elements in a reconfigurable intelligent surface.
Because the optimization problem has a non-convex objective function and constraint functions, it is a non-convex problem that is hard to solve. Rather than solving it directly by mathematical means, this scheme uses a deep reinforcement learning algorithm.
The invention solves the optimization problem with the deep deterministic policy gradient algorithm with prioritized experience replay (Deep Deterministic Policy Gradient with Prioritized Experience Replay, DDPG-PER). The algorithm comprises four neural networks (an Actor network, a Critic network, a Target Actor network, and a Target Critic network) and a prioritized experience replay pool. The Actor network and the Target Actor network select the action to take in the current state, that is, they determine the current policy. The Critic network and the Target Critic network evaluate the value of the current state-action pair, which guides the choice of the best action in the current state. The prioritized experience replay pool stores experiences.
The optimization problem is solved with the DDPG-PER algorithm, which comprises the following steps:
T1, initializing the parameters of the Actor network and the Critic network, setting up a prioritized experience replay buffer, and defining the learning rates of the Actor and the Critic, the reward discount factor, the number of training rounds and the number of training steps per round;
T2, in each training round, the Actor network generates an action value from the current state value;
T3, the action value obtained in T2 interacts with the environment to obtain a reward value and to generate the next state value;
T4, storing the experience, which consists of the current state value, the action value, the reward value and the generated next state value, in the prioritized experience replay buffer;
T5, sampling a batch of high-priority experiences from the prioritized experience replay buffer according to the weight of each experience, and using them to train the Actor network and the Critic network;
T6, calculating the value function Q value in the current state, and updating the parameters of the value function with the Critic network;
T7, taking the Actor network and the current state value as inputs, calculating the value function Q value of the generated action value, and using this value to update the Actor network parameters;
T8, repeating steps T2 to T7 until the preset number of training steps is reached or a stopping condition is met;
and T9, making decisions with the trained Actor network to obtain the final policy.
Specifically, the Actor network takes the current state as input and outputs the current policy; the Critic network takes the current state and the action under the current policy as input and outputs the value Q of the current state. The Target Actor network and the Target Critic network are the target networks of the Actor network and the Critic network, respectively, and are used to estimate the target policy and the target value. The prioritized experience replay pool assigns each experience a priority according to its importance. The priority can be calculated from the TD error of the experience, that is, the difference between the target value (the reward obtained by taking an action in the current state plus the discounted value of the next state under the current policy) and the current value estimate. Experiences with higher priority are sampled more frequently, so that important experiences are used more often in training.
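The prioritized replay mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: it assumes proportional prioritization with priority |TD error| + eps, and the class name `PrioritizedReplayBuffer`, the exponents `alpha` and `beta`, the discount `gamma` and the importance-sampling weights are all hypothetical choices that do not appear in the source.

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Proportional prioritized replay: P(i) = p_i**alpha / sum_j p_j**alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6, seed=0):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew sampling (assumed value)
        self.eps = eps      # keeps every priority strictly positive
        self.data = []      # stored (state, action, reward, next_state) tuples
        self.priorities = np.zeros(capacity)
        self.pos = 0        # next write position (ring buffer)
        self.rng = np.random.default_rng(seed)

    def add(self, state, action, reward, next_state):
        # A new experience gets the current maximum priority so it is replayed soon.
        max_p = self.priorities[:len(self.data)].max() if self.data else 1.0
        item = (state, action, reward, next_state)
        if len(self.data) < self.capacity:
            self.data.append(item)
        else:
            self.data[self.pos] = item  # overwrite the oldest entry
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        p = self.priorities[:len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = self.rng.choice(len(self.data), size=batch_size, p=probs)
        # Importance-sampling weights offset the bias of non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        # Priority grows with the magnitude of the TD error.
        self.priorities[idx] = np.abs(td_errors) + self.eps


def td_error(reward, q_current, q_next_target, gamma=0.99):
    """TD error: target value (reward + discounted next value) minus current estimate."""
    return reward + gamma * q_next_target - q_current
```

In a training loop, `add` would correspond to step T4, `sample` to step T5, and `update_priorities` would be called with the TD errors computed by the Critic after each update.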
At the start of the algorithm, the prioritized experience replay pool, the parameters of the four neural networks, and the actions θ and η are initialized.
The algorithm runs for E rounds in total, with T training steps per round. Each round terminates when the algorithm converges or when the maximum allowed number of training steps is reached. In addition, the invention uses DRL to obtain the optimal θ and η, rather than to train a neural network for online processing.
1. State: the state at time t provides useful information about the environment and helps to train the network. The invention takes the signal-to-interference-plus-noise ratios of the users and of the active eavesdropper as the state vector.
2. Action: the invention takes the phase shift matrix θ and the power control coefficient matrix η as the action a_t at time t. Because the input of the neural network must be real-valued while the RIS phase shift matrix is complex-valued, the invention converts each reflection coefficient into an angle before feeding it to the neural network.
3. Reward: the invention takes the energy efficiency of the system as the reward value. The system receives this reward only when the action output by the Actor network satisfies all constraint conditions; otherwise it receives a penalty.
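The action encoding and the constrained reward described in items 2 and 3 can be sketched as follows. This is only an illustration under assumptions: the function names, the penalty value of -1.0, and the concrete forms of the checks (non-negative power control coefficients, per-AP sum of coefficients not larger than 1) are hypothetical, and the backhaul capacity constraint is omitted because its formula is not given in the text.

```python
import numpy as np


def coeffs_to_angles(theta):
    """Map complex unit-modulus reflection coefficients to real angles in (-pi, pi]."""
    return np.angle(np.asarray(theta))


def angles_to_coeffs(phi):
    """Map real angles back to unit-modulus reflection coefficients exp(j*phi)."""
    return np.exp(1j * np.asarray(phi, dtype=float))


def constraints_satisfied(eta, theta, r_sec, se, r_th=0.2, s_ok=0.7, tol=1e-9):
    """Check the constraints that can be read from the text (assumed forms)."""
    eta = np.asarray(eta, dtype=float)  # shape (M, K): AP m, user k
    se = np.asarray(se, dtype=float)    # spectral efficiency of each user
    return bool(
        np.all(eta >= 0)                         # assumed lower bound on eta
        and np.all(eta.sum(axis=1) <= 1 + tol)   # per-AP total coefficient <= 1
        and r_sec >= r_th                        # secrecy rate threshold
        and np.all(se >= s_ok)                   # minimum spectral efficiency
        and np.allclose(np.abs(theta), 1.0)      # unit-modulus reflection coefficients
    )


def reward(energy_efficiency, eta, theta, r_sec, se, penalty=-1.0, **limits):
    """Return the energy efficiency if all checks pass, otherwise a fixed penalty."""
    if constraints_satisfied(eta, theta, r_sec, se, **limits):
        return energy_efficiency
    return penalty
```

A fixed penalty is the simplest choice; a penalty that scales with the amount of constraint violation would be an equally valid reading of "punished to a certain degree".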
To verify the superior energy efficiency of the RIS-assisted cell-free massive MIMO system and the superiority of the DDPG-PER algorithm used in the invention, two other schemes and one other algorithm are designed for comparison: 1. a scheme without RIS but with optimized power control coefficients; 2. a scheme with RIS but without the power control coefficient scheme of the invention; 3. the conventional DDPG algorithm.
The superiority of the algorithm and the system performance of the invention are verified by simulation. The simulation parameters are set as follows: the number of APs M and the number of reconfigurable intelligent surfaces R total 10; the number of users K = 5; the number of eavesdroppers is 1; the number of antennas of each AP b = 2; the total number of reflection units of all reconfigurable intelligent surfaces is 20 to 120; the transmit power of the APs p_d = 0 to 30 dBm; the noise figure is 9 dB; the system bandwidth B = 20 MHz; the pilot sequence length τ_p = 30; the coherence interval length τ_c = 200; the pilot transmit power of the legitimate users p_u = 0.1 W; the pilot transmit power of the active eavesdropper p_E = 0.1 W; the power amplifier efficiency α_m = 0.4; the power consumption of each antenna P_tc,m = 0.2 W; the fixed power consumption of each backhaul link P_0,m = 0.825 W; the traffic-dependent power consumption P_bt,m = 0.25 W/(Gbit/s); the backhaul link capacity between the mth AP and the CPU [...]; the secrecy rate threshold R_th = 0.2 bit/s; and the minimum spectral efficiency requirement of the kth user S_ok = 0.7 bit/s/Hz. The channels from the AP to all legitimate users, from the AP to the eavesdropper, and from the reconfigurable intelligent surface to the users are modeled as Rayleigh channels; the channels from the AP to the reconfigurable intelligent surface are modeled as Rician channels. The path loss at the reference distance of 1 m is 30 dB, and the path-loss exponents of the AP-RIS link, the RIS-user link and the AP-user link are 2.2, 2.8 and 3.5, respectively.
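The following sketch shows how two of the quantities above might be used in a simulation. It rests on assumptions: the log-distance path-loss form PL(d) = PL(1 m) + 10·α·log10(d) is a common model that the text does not state explicitly, the function and dictionary names are hypothetical, and the link distances in the usage note are made up for illustration.

```python
import math

# Path-loss exponents quoted in the simulation setup.
PATH_LOSS_EXPONENTS = {"AP-RIS": 2.2, "RIS-user": 2.8, "AP-user": 3.5}


def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)


def path_loss_db(distance_m, exponent, ref_loss_db=30.0, ref_distance_m=1.0):
    """Log-distance path loss in dB (assumed model), 30 dB at the 1 m reference."""
    return ref_loss_db + 10 * exponent * math.log10(distance_m / ref_distance_m)
```

For example, `dbm_to_watt(30)` gives 1 W, the upper end of the stated transmit power range, and `path_loss_db(10, PATH_LOSS_EXPONENTS["AP-RIS"])` evaluates the assumed model for a hypothetical 10 m AP-RIS link.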
Fig. 2 shows the energy efficiency versus transmit power under the different schemes and algorithms. As the transmit power increases, the energy efficiency under every scheme and algorithm first increases and then stabilizes. This is because, as the transmit power threshold increases, the AP allocates more power to the users, which raises the energy efficiency; once the threshold increases further, the transmit power reaches its optimal value and the energy efficiency levels off. In addition, at the same transmit power, the scheme and algorithm of the invention outperform the other schemes and the conventional DDPG algorithm, because optimizing the RIS reflection phases gives the legitimate users a higher transmission rate, and the priority mechanism improves the algorithm's use of important experiences.
Fig. 3 shows the energy efficiency versus the number of legitimate users. For different numbers of APs and RISs, the energy efficiency increases monotonically with the number of users, because multi-user diversity is exploited. When the total number of APs and RISs is fixed, deploying more RISs achieves better performance, because an RIS consists of a large number of passive components, which reduces the system power consumption, while its reconfigurable nature improves the spectral efficiency of the legitimate users. Furthermore, in Fig. 3 the total number of reflective elements over all RISs is unchanged, and for a fixed number of users the energy efficiency increases with the number of RISs, which illustrates the advantage of deploying RISs in a distributed manner.
Fig. 4 shows the secure energy efficiency versus the secrecy rate threshold under the different schemes and algorithms. As the secrecy rate threshold increases, the energy efficiency of the system first remains stable and then decreases. Under a small secrecy rate threshold, the user throughput easily satisfies the secrecy rate constraint and remains unchanged; as the threshold increases, the transmit power must be increased to satisfy the constraint, which raises the system power consumption and lowers the energy efficiency. Fig. 4 also shows that the proposed scheme outperforms the other schemes in secure energy efficiency.
The foregoing embodiments are described to illustrate the general principles of the invention and are not intended to limit its scope to the particular embodiments; any modifications, equivalents, improvements and the like that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (9)
- 1. An energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system, characterized in that the RIS-assisted cell-free massive MIMO system comprises a central processing unit (CPU), M multi-antenna APs, K single-antenna legitimate users, R reconfigurable intelligent surfaces with N reflecting units, and a single-antenna active eavesdropper, and the method comprises the following steps: S1, in the uplink training stage, the AP performs channel estimation based on the pilot sequences sent by the legitimate users and the active eavesdropper to obtain the channel state information and the beam coefficients of the transmission data, and sends the channel state information and the beam coefficients to the CPU through the backhaul link; S2, in the downlink transmission stage, the CPU calculates the interference vectors among the legitimate users and between the legitimate users and the active eavesdropper from the received channel state information, the received beam coefficients and the spatial relationship among the antennas, and sends the interference vectors to all the APs and reconfigurable intelligent surfaces connected to the CPU; and S3, the AP performs power adjustment according to the interference vectors sent by the CPU and the transmission data of the legitimate users, the reconfigurable intelligent surface performs phase adjustment according to the interference vectors sent by the CPU and the transmission data of the legitimate users, and after adjustment the AP and the reconfigurable intelligent surface send the transmission data to all the legitimate users and active eavesdroppers connected to them.
- 2. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 1, wherein in S3 the specific steps of performing phase adjustment and power adjustment according to the interference vectors sent by the CPU and the transmission data of the legitimate users and the active eavesdropper are as follows: S3.1, setting, according to the interference vectors and the transmission data, the relevant constraint conditions on the AP sending transmission information and on the reconfigurable intelligent surface reflecting transmission information to the legitimate users and the eavesdropper; S3.2, establishing a total energy consumption model of the RIS-assisted secure cell-free massive MIMO system; S3.3, combining the relevant constraint conditions and the total energy consumption model to obtain an optimization problem; S3.4, solving the optimization problem with a deep deterministic policy gradient algorithm based on prioritized experience replay to obtain the optimal phase of the reconfigurable intelligent surface and the optimal power control coefficient of the AP; and S3.5, adjusting the phase of the reconfigurable intelligent surface according to the optimal phase, and adjusting the power of the AP according to the optimal power control coefficient.
- 3. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 2, wherein in S3.1 the relevant constraint conditions are: C_3: R_sec ≥ R_th; C_1 indicates that the backhaul link capacity of the system is not greater than the maximum capacity, which is the backhaul link capacity constraint; C_2 represents the value range of the power control coefficient; C_3 indicates that the secrecy rate of the eavesdropped legitimate user is not less than the minimum secrecy rate, which ensures that the legitimate users can communicate securely; C_4 indicates that the total power control coefficient of the transmission of the AP is not larger than 1, which is the transmit power constraint of the AP; C_5 indicates that the spectral efficiency of each legitimate user is not lower than the minimum spectral efficiency, which ensures the communication quality of the legitimate users; C_6 indicates that the gain magnitude of each element of the reconfigurable intelligent surface is 1, which is the constraint on the modulus of the reflection coefficient of the reconfigurable intelligent surface; wherein S_ek represents the spectral efficiency of the legitimate users; τ_c is the length of the coherence interval and τ_p is the pilot sequence length; η_mk represents the power control coefficient of the signal transmitted by the mth AP to the kth user; θ_r,n represents the reflection coefficient of the nth reflection unit on the rth reconfigurable intelligent surface; a_m ≥ 1 indicates that the data rate of the mth forward link should be a_m times the total achievable rate of the mth AP; [...] represents the maximum capacity of the backhaul link between the mth AP and the CPU; R_sec represents the secrecy rate of the eavesdropped user; R_th represents the secrecy rate threshold; S_ok represents the lowest spectral efficiency; M represents the total number of APs, K the total number of users, and N the total number of reflective elements in the reconfigurable intelligent surface.
- 4. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 3, wherein the total energy consumption model of the RIS-assisted secure cell-free massive MIMO system is: [...], wherein P_m is the power consumption generated by the amplifier and the circuits at the mth AP, including the power consumption of the transceiver chain and of signal processing; P_k is the circuit loss of the kth user; P_r,n represents the low power consumption generated by the nth reflective element in the rth reconfigurable intelligent surface; P_fh,m is the power consumed by the backhaul link connecting the CPU and the mth AP, which is used for transmitting data between the AP and the CPU; M represents the total number of APs, K the total number of users, N the total number of reflective elements in the reconfigurable intelligent surface, and R the total number of reconfigurable intelligent surfaces.
- 5. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 4, wherein the total throughput of the RIS-assisted secure cell-free massive MIMO system is: [...], wherein B is the system bandwidth, S_ek({η_mk, θ_r,n}) is the spectral efficiency of each user, and K is the total number of users.
- 6. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 5, wherein the power consumption P_m generated by the amplifier and the circuits at the mth AP is: [...], wherein 0 < α_m ≤ 1 is the efficiency of the power amplifier, N_0 is the noise power, b is the number of antennas of the AP, P_tc,m is the power required to operate the circuit components associated with each antenna of the AP, and ρ_d is the maximum normalized transmit power of the AP.
- 7. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 5, wherein the power P_fh,m consumed by the backhaul link connecting the CPU and the mth AP is: P_fh,m = P_0,m + B·S_e({η_mk, θ_r,n})·P_bt,m.
- 8. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 1, wherein the optimization problem is solved with a DDPG-PER algorithm, and the DDPG-PER algorithm flow is as follows: T1, initializing the parameters of the Actor network and the Critic network, setting up a prioritized experience replay buffer, and defining the learning rates of the Actor and the Critic, the reward discount factor, the number of training rounds and the number of training steps per round; T2, in each training round, the Actor network generates an action value from the current state value; T3, the action value obtained in T2 interacts with the environment to obtain a reward value and to generate the next state value; T4, storing the experience, which consists of the current state value, the action value, the reward value and the generated next state value, in the prioritized experience replay buffer; T5, sampling a batch of high-priority experiences from the prioritized experience replay buffer according to the weight of each experience, and using them to train the Actor network and the Critic network; T6, calculating the value function Q value in the current state, and updating the parameters of the value function with the Critic network; T7, taking the Actor network and the current state value as inputs, calculating the value function Q value of the generated action value, and using this value to update the Actor network parameters; T8, repeating steps T2 to T7 until the preset number of training steps is reached or a stopping condition is met; and T9, making decisions with the trained Actor network to obtain the final policy.
- 9. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 8, wherein the state value is: the signal-to-interference-plus-noise ratio of the legitimate users: [...]; and the signal-to-interference-plus-noise ratio of the active eavesdropper: [...]; the action value is: the phase shift matrix θ and the power control coefficient matrix η; and the reward value is: the energy efficiency of the system.
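As an illustration of how the quantities named in claims 4 to 7 fit together, the sketch below evaluates the backhaul power formula of claim 7 and an energy efficiency figure. Several points are assumptions rather than statements of the claims: the total throughput is taken as B times the sum of the per-user spectral efficiencies, the total power as the plain sum of the four components listed in claim 4, S_e as the sum spectral efficiency, and the function names and the numeric inputs in the usage note are hypothetical.

```python
def backhaul_power(se_sum, bandwidth_hz=20e6, p0=0.825, p_bt=0.25):
    """P_fh,m = P_0,m + B * S_e * P_bt,m, with P_bt,m given in W per Gbit/s."""
    rate_gbps = bandwidth_hz * se_sum / 1e9  # convert bit/s to Gbit/s
    return p0 + rate_gbps * p_bt


def energy_efficiency(se_per_user, p_ap, p_user, p_ris_elem, bandwidth_hz=20e6):
    """Throughput divided by total power, in bit/J (assumed composition)."""
    se_sum = sum(se_per_user)
    throughput = bandwidth_hz * se_sum  # bit/s
    # One backhaul link per AP, each carrying the assumed sum rate.
    p_backhaul = sum(backhaul_power(se_sum, bandwidth_hz) for _ in p_ap)
    p_total = sum(p_ap) + sum(p_user) + sum(p_ris_elem) + p_backhaul
    return throughput / p_total
```

With five users at 1 bit/s/Hz each, `backhaul_power(5.0)` uses the 20 MHz bandwidth and the P_0,m and P_bt,m values from the simulation setup; the per-AP, per-user and per-element powers passed to `energy_efficiency` would have to come from the models of claims 4 and 6.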
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310367185.9A CN116321236A (en) | 2023-04-07 | 2023-04-07 | RIS-assisted safe honeycomb-free large-scale MIMO system energy efficiency optimization method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116321236A (en) | 2023-06-23 |
Family
ID=86792585
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310367185.9A Pending CN116321236A (en) | 2023-04-07 | 2023-04-07 | RIS-assisted safe honeycomb-free large-scale MIMO system energy efficiency optimization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116321236A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116865798A (en) * | 2023-07-06 | 2023-10-10 | 河北大学 | Intelligent super-surface phase shift method for high-speed railway honeycomb removing large-scale MIMO system |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116865798A (en) * | 2023-07-06 | 2023-10-10 | 河北大学 | Intelligent super-surface phase shift method for high-speed railway honeycomb removing large-scale MIMO system |
CN116865798B (en) * | 2023-07-06 | 2024-01-05 | 河北大学 | Intelligent super-surface phase shift method for high-speed railway honeycomb removing large-scale MIMO system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||