CN116321236A - Energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system - Google Patents


Info

Publication number
CN116321236A
Authority
CN
China
Prior art keywords
ris
value
mimo system
energy efficiency
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310367185.9A
Other languages
Chinese (zh)
Inventor
宋清洋
李慧
孙巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202310367185.9A
Publication of CN116321236A

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 - Supervisory, monitoring or testing arrangements
    • H04W24/02 - Arrangements for optimising operational condition
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04B - TRANSMISSION
    • H04B7/00 - Radio transmission systems, i.e. using radiation field
    • H04B7/02 - Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas
    • H04B7/04 - Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas
    • H04B7/06 - Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station
    • H04B7/0613 - Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station using simultaneous transmission
    • H04B7/0615 - Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station using simultaneous transmission of weighted versions of same signal
    • H04B7/0619 - Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas at the transmitting station using simultaneous transmission of weighted versions of same signal using feedback from receiving side
    • H04B7/0621 - Feedback content
    • H04B7/0626 - Channel coefficients, e.g. channel state information [CSI]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 - Supervisory, monitoring or testing arrangements
    • H04W24/06 - Testing, supervising or monitoring using simulated traffic
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W52/00 - Power management, e.g. Transmission Power Control [TPC] or power classes
    • H04W52/04 - Transmission power control [TPC]
    • H04W52/18 - TPC being performed according to specific parameters
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W52/00 - Power management, e.g. Transmission Power Control [TPC] or power classes
    • H04W52/04 - Transmission power control [TPC]
    • H04W52/18 - TPC being performed according to specific parameters
    • H04W52/24 - TPC being performed according to specific parameters using SIR [Signal to Interference Ratio] or other wireless path parameters
    • H04W52/243 - TPC being performed according to specific parameters using SIR [Signal to Interference Ratio] or other wireless path parameters taking into account interferences
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 - Reducing energy consumption in communication networks
    • Y02D30/70 - Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention relates to the technical field of wireless communication networks and discloses an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system. S1: each AP performs channel estimation from the pilot sequences sent by the legitimate users and the active eavesdropper, obtains their channel state information and beam coefficients, and sends these to the central processing unit (CPU) over the backhaul link. S2: the CPU calculates the interference vectors among the legitimate users and between the legitimate users and the active eavesdropper, and sends them to all APs and reconfigurable intelligent surfaces connected to it. S3: each AP adjusts its power, and each reconfigurable intelligent surface adjusts its phase, according to the interference vector sent by the CPU and the transmission data of the legitimate users and the active eavesdropper. The invention can improve the energy efficiency of the system while maintaining secure communication.

Description

Energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system
Technical Field
The invention relates to the technical field of wireless communication networks, and in particular to an energy efficiency optimization design method for an RIS-assisted secure cell-free massive MIMO system.
Background
Over the last few decades, multiple-input multiple-output (MIMO) technology has received widespread attention. However, because conventional wireless networks use a cellular structure, the performance of multi-cell MIMO systems is often limited by inter-cell interference. To solve this problem, a new user-centric network paradigm, the cell-free network, has recently been proposed. Unlike conventional cellular designs, a cell-free network uses user-centric transmission: all access points (APs) jointly serve all users, with no cell boundaries. Efficient cooperation among the distributed APs effectively mitigates inter-cell interference and has high potential to improve spectral and energy efficiency. The technique is regarded as a candidate for future communication systems and has attracted growing research interest in areas such as resource allocation, precoding/beamforming, and channel estimation. However, deploying a large number of distributed APs raises cost and power consumption, which lowers energy efficiency.
A newer technology, the reconfigurable intelligent surface (RIS), has been identified as a low-cost, low-power way to improve spectral efficiency and can address this problem. An RIS consists of a large number of low-cost passive elements whose phase shifts are controlled by simple programmable PIN diodes; it reflects the incident signal and forms a directional beam toward the user. Unlike the large phased-array antennas implemented with phase shifters in a cell-free network, an RIS needs no additional hardware such as complex digital phase-shifting circuitry, which greatly reduces energy consumption and signal-processing complexity. Deploying RISs in a cell-free network can therefore deliver the same service at lower power consumption.
The RIS-assisted cell-free massive MIMO system addresses the high cost and high energy consumption caused by the large-scale AP deployment that existing cell-free networks require: RISs replace some of the APs to achieve comparatively low-cost, low-energy communication. In addition, cell-free networks are vulnerable to active eavesdroppers during the uplink channel estimation phase.
Therefore, how to optimize and improve the energy efficiency of an RIS-assisted cell-free massive MIMO system while maintaining secure communication is a problem to be solved.
Disclosure of Invention
The invention provides an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system to solve the problems described above.
The invention is realized by the following technical scheme:
An energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system, where the system comprises a central processing unit (CPU), M multi-antenna APs, K single-antenna legitimate users, R reconfigurable intelligent surfaces each with N reflecting elements, and one single-antenna active eavesdropper. The specific steps are:
S1, in the uplink training phase, each AP performs channel estimation from the pilot sequences sent by the legitimate users and the eavesdropper, obtains their channel state information and beam coefficients, and sends these to the CPU over the backhaul link;
S2, in the downlink transmission phase, the CPU calculates an interference vector from the received channel state information, the received beam coefficients, and the spatial relationship among the antennas, and sends the interference vector to all APs and reconfigurable intelligent surfaces connected to it;
S3, each AP adjusts its power, and each reconfigurable intelligent surface adjusts its phase, according to the interference vector sent by the CPU and the transmission data of the legitimate users and the eavesdropper; after adjustment, the APs and reconfigurable intelligent surfaces send the transmission data to all legitimate users and active eavesdroppers connected to them.
As an optimization, in S3, the specific steps by which the AP adjusts its power and the reconfigurable intelligent surface adjusts its phase, according to the interference vector sent by the CPU and the transmission data of the legitimate users, are:
S3.1, according to the interference vector and the transmission data, set the constraints under which the AP sends, and the reconfigurable intelligent surface reflects, the transmission information to the legitimate users and the eavesdropper;
S3.2, establish the total energy consumption model of the RIS-assisted secure cell-free massive MIMO system;
S3.3, combine the constraints and the total energy consumption model to obtain an optimization problem;
S3.4, solve the optimization problem with a deep deterministic policy gradient algorithm with prioritized experience replay (DDPG-PER) to obtain the optimal phase of the reconfigurable intelligent surface and the optimal power control coefficient of the AP;
S3.5, adjust the phase of the reconfigurable intelligent surface according to the optimal phase, and adjust the power of the AP according to the optimal power control coefficient.
As an optimization, in S3.1, the relevant constraints are:
C1:
Figure BDA0004167191930000031
C2:
Figure BDA0004167191930000032
C3: R_sec ≥ R_th
C4:
Figure BDA0004167191930000033
C5:
Figure BDA0004167191930000034
C6:
Figure BDA0004167191930000035
C1 states that the backhaul link load of the system does not exceed the maximum capacity (the backhaul link capacity constraint). C2 gives the value range of the power control coefficients. C3 requires the secrecy rate of the eavesdropped legitimate user to be no less than the minimum secrecy rate, which ensures that this user can communicate securely. C4 requires the total transmit power control coefficient to be no more than 1 (the transmit power constraint). C5 requires the spectral efficiency of each legitimate user to be no lower than the minimum spectral efficiency, which guarantees the communication quality of the legitimate users. C6 requires the gain magnitude of each element of the reconfigurable intelligent surface to be 1 (the unit-modulus constraint on the reflection coefficients). Here,
Figure BDA0004167191930000036
represents the spectral efficiency of a legitimate user; τ_c is the length of the coherence interval and τ_p is the pilot sequence length; η_mk is the power control coefficient for the signal transmitted by the m-th AP to the k-th user; θ_{r,n} is the reflection coefficient of the n-th reflecting element on the r-th reconfigurable intelligent surface; a_m ≥ 1 indicates that the data rate carried on the m-th fronthaul link should be a_m times the total achievable rate of the m-th AP;
Figure BDA0004167191930000037
represents the maximum capacity of the backhaul link between the m-th AP and the CPU; R_sec is the secrecy rate of the eavesdropped user and R_th is the secrecy rate threshold; S_ok is the minimum spectral efficiency; M is the total number of APs, K is the total number of users, and N is the total number of reflecting elements in a reconfigurable intelligent surface.
As an optimization, the total energy consumption model of the RIS-assisted secure cell-free massive MIMO system is:
Figure BDA0004167191930000038
where P_m is the power consumed by the amplifier and circuits at the m-th AP, including the transceiver chain and signal processing; P_k is the circuit power of the k-th user; P_{r,n} is the (low) power consumed by the n-th reflecting element of the r-th reconfigurable intelligent surface; P_{fh,m} is the power consumed by the backhaul link that carries data between the CPU and the m-th AP; M is the total number of APs, K is the total number of users, N is the total number of reflecting elements in a reconfigurable intelligent surface, and R is the total number of reconfigurable intelligent surfaces.
As an optimization, the total throughput of the RIS-assisted secure cell-free massive MIMO system is:
Figure BDA0004167191930000039
where B is the system bandwidth, S_ek({η_mk, θ_{r,n}}) is the spectral efficiency of each user, and K is the total number of users.
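The two models above combine into the energy efficiency that the method maximizes. The sketch below is illustrative only: it assumes the total power is the plain sum of the four terms named in the text, that throughput is B times the sum of per-user spectral efficiencies, and that energy efficiency is their ratio in bit/joule. The function and argument names are hypothetical and do not come from the patent.

```python
def total_power(p_ap, p_user, p_ris_elem, p_fronthaul):
    """Total power in watts, assuming a plain sum of the four terms.

    p_ap:        list of P_m, one entry per AP (length M)
    p_user:      list of P_k, one entry per user (length K)
    p_ris_elem:  R lists of N entries each, holding P_{r,n}
    p_fronthaul: list of P_{fh,m}, one entry per AP (length M)
    """
    ris_total = sum(sum(surface) for surface in p_ris_elem)
    return sum(p_ap) + sum(p_user) + ris_total + sum(p_fronthaul)


def total_throughput(bandwidth_hz, se_per_user):
    """Throughput in bit/s: bandwidth times the summed spectral efficiency."""
    return bandwidth_hz * sum(se_per_user)


def energy_efficiency(bandwidth_hz, se_per_user, p_total_w):
    """Energy efficiency in bit/joule (assumed definition: throughput / power)."""
    return total_throughput(bandwidth_hz, se_per_user) / p_total_w
```

For example, with two APs, one user, and one two-element surface, `energy_efficiency(20e6, [2.0], total_power([1.0, 1.0], [0.1], [[0.01, 0.01]], [0.9, 0.9]))` divides a 40 Mbit/s throughput by the summed power.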
As an optimization, the power P_m consumed by the amplifier and circuits at the m-th AP is:
Figure BDA0004167191930000041
where 0 < α_m ≤ 1 is the efficiency of the power amplifier, N_0 is the noise power, b is the number of antennas of the AP, P_{tc,m} is the power required to operate the circuit components associated with each antenna on the AP, and ρ_d is the maximum normalized transmit power of the AP;
as an optimization, the power P_{fh,m} consumed by the fronthaul link connecting the CPU and the m-th AP is:
P_{fh,m} = P_{0,m} + B · S_e({η_mk, θ_{r,n}}) · P_{bt,m}
where P_{0,m} is the fixed energy consumption of each backhaul link, B is the system bandwidth, P_{bt,m} is the traffic-dependent power consumption in Watt/(Gbit/s), and S_e({η_mk, θ_{r,n}}) is the overall spectral efficiency of the system:
Figure BDA0004167191930000042
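As a small worked illustration of the link power formula above, the sketch below evaluates P_{fh,m} for given inputs. It assumes B is supplied in Hz and S_e in bit/s/Hz, so the traffic term is converted to Gbit/s to match the Watt/(Gbit/s) unit of P_{bt,m}; the function name and the sample numbers in the note are hypothetical.

```python
def fronthaul_power(p_fixed_w, bandwidth_hz, se_total, p_bt_w_per_gbps):
    """P_fh,m = P_0,m + B * S_e * P_bt,m, with the rate expressed in Gbit/s.

    p_fixed_w:        fixed power P_0,m of the link, in watts
    bandwidth_hz:     system bandwidth B, in Hz
    se_total:         overall spectral efficiency S_e, in bit/s/Hz
    p_bt_w_per_gbps:  traffic-dependent power P_bt,m, in W per Gbit/s
    """
    rate_gbps = bandwidth_hz * se_total / 1e9  # bit/s -> Gbit/s
    return p_fixed_w + rate_gbps * p_bt_w_per_gbps
```

With illustrative values of a 20 MHz bandwidth, S_e = 10 bit/s/Hz, P_0 = 0.825 W, and P_bt = 0.25 W/(Gbit/s), the traffic term is 0.2 Gbit/s × 0.25 = 0.05 W, so the link consumes about 0.875 W.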
As an optimization, the optimization problem is solved with the DDPG-PER algorithm, whose flow is as follows:
T1, initialize the parameters of the Actor network and the Critic network, set up a prioritized experience replay buffer, and define the learning rates of the Actor and Critic, the reward discount factor, the number of training episodes, and the number of training steps per episode;
T2, in each training episode, the Actor network generates an action value from the current state value;
T3, the action value obtained in T2 is applied to the environment to obtain a reward value and generate the next state value;
T4, store the experience in the prioritized experience replay buffer, where an experience consists of the current state value, the action value, the reward value, and the generated next state value;
T5, sample a batch of experiences from the prioritized experience replay buffer according to the weight of each experience, so that high-priority experiences are drawn more often, and use the batch to train the Actor network and the Critic network;
T6, compute the Q value of the value function for the current state and update the parameters of the Critic network;
T7, with the Actor network and the current state value as inputs, compute the Q value of the generated action value and use it to update the Actor network parameters;
T8, repeat steps T2 to T7 until the preset number of training steps is reached or a stopping condition is met;
T9, make decisions with the trained Actor network to obtain the final policy.
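Steps T4 and T5 rely on a prioritized replay buffer. The following is a minimal sketch of one common variant (proportional prioritization with importance-sampling weights), written with the standard library only. The patent does not specify the buffer internals, so the class name, the exponents `alpha` and `beta`, and the choice to give new experiences the current maximum priority are all assumptions.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (hypothetical helper, not from the patent)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6, seed=0):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling
        self.eps = eps          # keeps every priority strictly positive
        self.data = []          # stored (state, action, reward, next_state)
        self.priorities = []    # one priority per stored experience
        self.pos = 0            # next slot to overwrite once the buffer is full
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        # New experiences take the current maximum priority so that they
        # are likely to be replayed soon after being stored.
        p = max(self.priorities, default=1.0)
        item = (state, action, reward, next_state)
        if len(self.data) < self.capacity:
            self.data.append(item)
            self.priorities.append(p)
        else:
            self.data[self.pos] = item
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        n = len(self.data)
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = self.rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform sampling.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        w_max = max(weights)
        weights = [w / w_max for w in weights]
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors):
        # Priority is the magnitude of the TD error plus a small constant.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, `add` corresponds to T4, `sample` to T5, and `update_priorities` would be called after the Critic update with the freshly computed TD errors.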
As an optimization:
the state value is the signal-to-interference-plus-noise ratio (SINR) of the legitimate users:
Figure BDA0004167191930000051
and the SINR of the active eavesdropper:
Figure BDA0004167191930000052
the action value is the phase shift matrix θ and the power control coefficient matrix η;
the reward value is the energy efficiency of the system.
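One way to turn a raw Actor output into a valid action (θ, η) is sketched below. The mapping is an assumption, not the patent's method: it supposes the Actor emits values in [-1, 1] (for example from a tanh output layer), maps the first R·N entries to unit-modulus reflection coefficients, and maps the remaining M·K entries to power control coefficients in [0, 1], rescaling each AP's row when its sum exceeds 1. The function name `decode_action` and the flat vector layout are hypothetical.

```python
import cmath
import math


def decode_action(raw, num_ris, num_elems, num_aps, num_users):
    """Map a flat vector in [-1, 1] to (theta, eta).

    theta[r][n] is exp(j * pi * raw_value), so |theta[r][n]| = 1.
    eta[m][k] lies in [0, 1]; each row is rescaled if its sum exceeds 1
    (this per-AP sum limit is an assumed reading of the power constraint).
    """
    n_phase = num_ris * num_elems
    if len(raw) != n_phase + num_aps * num_users:
        raise ValueError("raw action has the wrong length")

    theta = [
        [cmath.exp(1j * math.pi * raw[r * num_elems + n]) for n in range(num_elems)]
        for r in range(num_ris)
    ]

    eta = []
    for m in range(num_aps):
        start = n_phase + m * num_users
        row = [(raw[start + k] + 1.0) / 2.0 for k in range(num_users)]
        total = sum(row)
        if total > 1.0:
            row = [x / total for x in row]
        eta.append(row)
    return theta, eta
```

Constraints that cannot be enforced by construction, such as the secrecy rate and minimum spectral efficiency thresholds, would still have to be handled elsewhere, for instance through a penalty in the reward.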
Compared with the prior art, the invention has the following advantages and beneficial effects:
The invention provides an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system. By jointly optimizing the reflection coefficients θ of the reconfigurable intelligent surfaces and the power control coefficients η, it improves the energy efficiency of the system while maintaining secure communication. Simulations show that it significantly improves the energy efficiency of the cell-free massive MIMO system while ensuring secure communication for the eavesdropped user and reliable communication for the other legitimate users, at no additional cost, so it has strong application value and development potential;
the invention uses a deep reinforcement learning algorithm to jointly optimize the reflection coefficients θ and the power control coefficients η, maximizing the energy efficiency of the system subject to secure communication for the legitimate users.
Drawings
To illustrate the technical solutions of the exemplary embodiments of the present invention more clearly, the drawings used in the examples are briefly described below. The drawings illustrate only some examples of the invention and should not be considered as limiting its scope; a person skilled in the art can obtain other related drawings from them without inventive effort. In the drawings:
FIG. 1 is a schematic diagram of the structure of an RIS-assisted secure cell-free massive MIMO system according to the present invention;
FIG. 2 is a graph of energy efficiency versus AP transmit power;
FIG. 3 is a graph of energy efficiency versus number of legitimate users;
FIG. 4 is a graph of secrecy energy efficiency versus secrecy rate threshold.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clear, the invention is described in further detail below with reference to the examples and the accompanying drawings. The exemplary embodiments and their descriptions serve only to illustrate the invention and are not to be construed as limiting it.
Embodiment 1 is an energy efficiency optimization method for an RIS-assisted secure cell-free massive MIMO system. The composition of the RIS-assisted secure cell-free massive MIMO system (the "system") is shown in FIG. 1: it includes a central processing unit (CPU), M multi-antenna APs (which can be understood as base stations), K single-antenna legitimate users, R reconfigurable intelligent surfaces each with N reflecting elements, and one single-antenna active eavesdropper, where M, K, and R are all positive integers.
The basic working principle of the RIS-assisted cell-free massive MIMO system is as follows. Data is usually transmitted in time-division duplex (TDD) mode, in which uplink and downlink transmissions use the same frequency band. In the uplink training phase, each legitimate user sends a training sequence in an assigned time slot; after receiving it, the access point (AP) performs channel estimation and beam selection, and then sends the resulting channel state information (CSI) and the selected beam coefficients to the central processing unit (CPU) over the backhaul link. In the downlink transmission phase, the CPU calculates the interference vector from the received CSI, the beam coefficients, and the spatial relationship among the antennas, and sends it to all connected APs. After receiving the interference vector, each AP adjusts its transmit power, and the phase shifts of the RISs are adjusted, according to the interference vector and the data to be transmitted; the data is then sent to all connected legitimate users and to the active eavesdropper. Each user receives the signals from all APs and RISs serving it and performs interference cancellation and signal decoding to recover the required data.
S3.1, according to the interference vector and the transmission data, set the constraints under which the access point sends transmission information to the legitimate users;
S3.2, establish the total energy consumption model of the RIS-assisted secure cell-free massive MIMO system;
S3.3, combine the constraints and the total energy consumption model to obtain an optimization problem;
S3.4, solve the optimization problem with a deep deterministic policy gradient algorithm with prioritized experience replay (DDPG-PER) to obtain the optimal phase of the reconfigurable intelligent surface and the optimal power control coefficient of the AP;
S3.5, adjust the phase of the reconfigurable intelligent surface according to the optimal phase, and adjust the power of the AP according to the optimal power control coefficient.
From the above description, the signal received by the k-th legitimate user is:
Figure BDA0004167191930000061
where ρ_d is the maximum normalized transmit power of the AP; 0 ≤ η_mk ≤ 1 is the power control coefficient;
Figure BDA0004167191930000062
is the composite channel between the m-th AP and the k-th legitimate user; w_mk is the precoding vector the AP uses to transmit data to the legitimate user;
Figure BDA0004167191930000063
is the data signal sent to legitimate user k, with
Figure BDA0004167191930000064
and
Figure BDA0004167191930000065
is the additive white Gaussian noise (AWGN) at legitimate user k.
Similarly, the signal received by the active eavesdropper is:
Figure BDA0004167191930000071
where
Figure BDA0004167191930000072
is the additive white Gaussian noise at the eavesdropper.
The signal-to-interference-plus-noise ratios (SINRs) of the legitimate user and the active eavesdropper are, respectively:
Figure BDA0004167191930000073
Figure BDA0004167191930000074
The secrecy rate of the eavesdropped legitimate user, indexed j here, is therefore:
R_{sec,j} = log_2(1 + γ_j) − log_2(1 + γ_E)
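The secrecy rate expression can be evaluated directly once the two SINRs are known. The sketch below implements the formula as written and adds a hypothetical helper for checking it against the threshold R_th of constraint C3; both function names are illustrative.

```python
import math


def secrecy_rate(sinr_user, sinr_eve):
    """R_sec = log2(1 + gamma_j) - log2(1 + gamma_E), in bit/s/Hz."""
    return math.log2(1.0 + sinr_user) - math.log2(1.0 + sinr_eve)


def meets_secrecy_threshold(sinr_user, sinr_eve, r_th):
    """True when the secrecy rate is at least the threshold R_th (constraint C3)."""
    return secrecy_rate(sinr_user, sinr_eve) >= r_th
```

For instance, γ_j = 15 and γ_E = 3 give log_2(16) − log_2(4) = 4 − 2 = 2 bit/s/Hz. The value becomes negative whenever the eavesdropper's SINR exceeds the user's, which is exactly the situation C3 is meant to rule out.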
Furthermore, the total energy consumption of the system can be modeled as:
Figure BDA0004167191930000075
where P_m is the power consumed by the amplifier and circuits at the m-th AP (including the transceiver chain and signal processing); P_k is the circuit power of the k-th user; P_{r,n} is the (low) power consumed by the n-th reflecting element of the r-th RIS; P_{fh,m} is the power consumed by the fronthaul link that carries data between the CPU and the m-th AP; M is the total number of APs, K is the total number of users, N is the total number of reflecting elements in a reconfigurable intelligent surface, and R is the total number of reconfigurable intelligent surfaces.
The power P_m consumed by the amplifier and circuits at the m-th AP can be modeled as:
Figure BDA0004167191930000076
where 0 < α_m ≤ 1 is the efficiency of the power amplifier, N_0 is the noise power, b is the number of antennas, P_{tc,m} is the power required to operate the circuit components associated with each antenna on the AP, and ρ_d is the maximum normalized transmit power.
The backhaul transfers data between an AP and the CPU (each backhaul link represents one such AP-to-CPU data path). Its power consumption grows linearly with the spectral efficiency and can be expressed as:
P_{fh,m} = P_{0,m} + B · S_e({η_mk, θ_{r,n}}) · P_{bt,m}
where P_{0,m} is the fixed energy consumption of each backhaul link, B is the system bandwidth, P_{bt,m} is the traffic-dependent power consumption (unit: Watt/(Gbit/s)), and S_e({η_mk, θ_{r,n}}) is the overall spectral efficiency of the system, given by:
Figure BDA0004167191930000081
The spectral efficiency of each user can be expressed as:
Figure BDA0004167191930000082
where τ_c is the length of the coherence interval and τ_p is the pilot length. Both are standard terms from pilot training and are not described further here.
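The per-user spectral efficiency formula itself is only available as an image here. A common form in the cell-free massive MIMO literature, assumed in the sketch below, scales the log-rate by the fraction of the coherence interval left after pilot training: S_ek = (1 − τ_p/τ_c) · log_2(1 + γ_k). Treat this as an illustration of how τ_c and τ_p enter, not as the patent's exact expression; the function name is hypothetical.

```python
import math


def spectral_efficiency(sinr, tau_c, tau_p):
    """Assumed form: (1 - tau_p / tau_c) * log2(1 + SINR), in bit/s/Hz.

    tau_c: coherence interval length in samples
    tau_p: pilot sequence length in samples (must be shorter than tau_c)
    """
    if not 0 <= tau_p < tau_c:
        raise ValueError("pilot length must satisfy 0 <= tau_p < tau_c")
    prelog = 1.0 - tau_p / tau_c  # share of the interval carrying data
    return prelog * math.log2(1.0 + sinr)
```

Under this assumed form, a coherence interval of 200 samples with 20 pilot samples leaves 90% of the interval for data, so an SINR of 3 yields 0.9 × 2 = 1.8 bit/s/Hz.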
To address the low energy efficiency and poor security of cell-free massive MIMO systems, the invention introduces RISs and maximizes the energy efficiency of the RIS-assisted secure cell-free massive MIMO system by jointly optimizing the RIS reflection coefficients θ and the AP power control coefficients η:
Figure BDA0004167191930000083
C1:
Figure BDA0004167191930000084
C2:
Figure BDA0004167191930000085
C3: R_sec ≥ R_th
C4:
Figure BDA0004167191930000086
C5:
Figure BDA0004167191930000087
C6:
Figure BDA0004167191930000088
C1 states that the backhaul link load of the system does not exceed the maximum capacity (the backhaul link capacity constraint). C2 gives the value range of the power control coefficients. C3 requires the secrecy rate of the eavesdropped legitimate user to be no less than the minimum secrecy rate, which ensures that this user can communicate securely. C4 requires the total transmit power control coefficient to be no more than 1 (the transmit power constraint). C5 requires the spectral efficiency of each legitimate user to be no lower than the minimum spectral efficiency, which guarantees the communication quality of the legitimate users. C6 requires the gain magnitude of each element of the reconfigurable intelligent surface to be 1 (the unit-modulus constraint on the reflection coefficients). Here,
Figure BDA0004167191930000089
represents the spectral efficiency of a legitimate user; τ_c is the length of the coherence interval and τ_p is the pilot sequence length; η_mk is the power control coefficient for the signal transmitted by the m-th AP to the k-th user; θ_{r,n} is the reflection coefficient of the n-th reflecting element on the r-th reconfigurable intelligent surface; a_m ≥ 1 indicates that the data rate carried on the m-th fronthaul link should be a_m times the total achievable rate of the m-th AP;
Figure BDA0004167191930000091
represents the maximum capacity of the backhaul link between the m-th AP and the CPU; R_sec is the secrecy rate of the eavesdropped legitimate user and R_th is the secrecy rate threshold; S_ok is the minimum spectral efficiency; M is the total number of APs, K is the total number of users, and N is the total number of reflecting elements in a reconfigurable intelligent surface.
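Read together, the objective and the six constraints can be written compactly. The block below is a reconstruction from the prose descriptions only, since the formula images are not reproduced here. The exact form of C1 and the summation index in C4 are assumptions, and C_{\max,m} and P_{\mathrm{total}} are placeholder names for the maximum backhaul capacity and the total power consumption.

```latex
\begin{aligned}
\max_{\{\eta_{mk}\},\,\{\theta_{r,n}\}} \quad
  & \frac{B \sum_{k=1}^{K} S_{ek}\left(\{\eta_{mk},\theta_{r,n}\}\right)}{P_{\mathrm{total}}} \\
\text{s.t.} \quad
  & C_1:\; a_m \, B \, S_e\left(\{\eta_{mk},\theta_{r,n}\}\right) \le C_{\max,m},
      && \forall m \quad \text{(assumed form)} \\
  & C_2:\; 0 \le \eta_{mk} \le 1, && \forall m, k \\
  & C_3:\; R_{\mathrm{sec}} \ge R_{\mathrm{th}} \\
  & C_4:\; \sum_{k=1}^{K} \eta_{mk} \le 1, && \forall m \quad \text{(assumed index)} \\
  & C_5:\; S_{ek} \ge S_{ok}, && \forall k \\
  & C_6:\; \left|\theta_{r,n}\right| = 1, && \forall r, n
\end{aligned}
```

The unit-modulus constraint C6 and the coupling of η and θ inside S_{ek} are what make the problem non-convex.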
Because the optimization problem has a non-convex objective function and non-convex constraints, it is hard to solve directly. Rather than solving it analytically, this scheme uses a deep reinforcement learning algorithm.
The invention solves the optimization problem with a deep deterministic policy gradient algorithm with prioritized experience replay (Deep Deterministic Policy Gradient with Prioritized Experience Replay, DDPG-PER). The algorithm comprises four neural networks (an Actor network, a Critic network, a Target Actor network, and a Target Critic network) plus a prioritized experience replay pool. The Actor network and the Target Actor network map the current state to an action, that is, they determine the current policy; the Critic network and the Target Critic network evaluate the value of the current state-action pair, which guides the choice of the best action in the current state; and the prioritized experience replay pool stores experiences.
The DDPG-PER algorithm solves the optimization problem through the following steps:
T1. Initialize the parameters of the Actor network and the Critic network, set up the prioritized experience replay buffer, and define the learning rates of the Actor and the Critic, the reward discount coefficient, the number of training rounds, and the number of training steps per round;
T2. In each training round, the Actor network generates an action value from the current state value;
T3. The action value obtained in T2 is applied to the environment to obtain a reward value and generate the next state value;
T4. Store the experience, which consists of the current state value, the action value, the reward value, and the generated next state value, in the prioritized experience replay buffer;
T5. Sample a batch of experiences from the prioritized experience replay buffer according to the weight of each experience, so that high-priority experiences are favored, and use the batch to train the Actor network and the Critic network;
T6. Compute the Q value of the value function for the current state and update the parameters of the value function, that is, of the Critic network;
T7. With the current state value as input to the Actor network, compute the value function Q value of the generated action value and use it to update the Actor network parameters;
T8. Repeat T2 to T7 until the preset number of training steps is reached or a stopping condition is met;
T9. Use the trained Actor network to make decisions and obtain the final policy.
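The updates in T5 to T7 can be written compactly with the standard DDPG update rules. The patent text does not state these formulas, so the following is a sketch under assumptions: γ is the reward attenuation (discount) coefficient defined in T1, μ(·; φ) and Q(·; ω) denote the Actor and Critic networks, primed symbols denote the target networks, N_b is the batch size, and w_i are importance-sampling weights of the kind commonly paired with prioritized replay.

```latex
\begin{aligned}
y_i &= r_i + \gamma\, Q'\big(s_{i+1},\, \mu'(s_{i+1};\phi');\, \omega'\big), \\
\delta_i &= y_i - Q(s_i, a_i;\omega), \\
L(\omega) &= \frac{1}{N_b}\sum_{i=1}^{N_b} w_i\, \delta_i^{2}, \\
\nabla_{\phi} J &\approx \frac{1}{N_b}\sum_{i=1}^{N_b}
  \nabla_{a} Q(s_i, a;\omega)\big|_{a=\mu(s_i;\phi)}\; \nabla_{\phi}\,\mu(s_i;\phi).
\end{aligned}
```

Here the Critic is trained by minimizing L(ω) (T6), the Actor is updated along the gradient of J (T7), and the magnitude of δ_i is a natural candidate for the priority of experience i.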
Specifically, the Actor network takes the current state as input and outputs the action given by the current policy. The Critic network takes the current state and the action under the current policy as input and outputs the value Q of that state-action pair. The Target Actor network and the Target Critic network are the target networks of the Actor network and the Critic network, respectively, and are used to estimate the target policy and the target value. The prioritized experience replay pool assigns each experience a priority according to its importance. The priority can be computed from the temporal-difference (TD) error of the experience, which is the difference between the target value (the reward obtained by taking an action in the current state plus the discounted value of the next state under the current policy) and the current value estimate. Experiences with higher priority are sampled more frequently, which increases their influence on training.
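A minimal sketch of such a prioritized replay pool is given below. It assumes proportional prioritization, in which the sampling probability of an experience is proportional to (|TD error| + eps) raised to the power alpha; the class name, the exponent `alpha`, and the offset `eps` are illustrative assumptions and are not specified in the patent.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay: P(i) is proportional to (|TD error| + eps) ** alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # assumed exponent: how strongly priority skews sampling
        self.eps = eps      # assumed offset: keeps zero-error experiences drawable
        self.data = []
        self.priorities = []
        self.pos = 0        # index of the next slot to write

    def add(self, experience, td_error):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(experience)
            self.priorities.append(priority)
        else:
            # Overwrite the oldest entry once the buffer is full.
            self.data[self.pos] = experience
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, rng=random):
        # Draw indices with replacement, weighted by priority.
        indices = rng.choices(range(len(self.data)), weights=self.priorities, k=batch_size)
        return indices, [self.data[i] for i in indices]

    def update(self, indices, td_errors):
        # Refresh priorities after the Critic has recomputed the TD errors.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha
```

An experience here would be the tuple (state, action, reward, next state) described in T4; after each training step, `update` would be called with the freshly computed TD errors of the sampled batch.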
At the start of the algorithm, the prioritized experience replay pool, the parameters of the four neural networks, and the actions θ and η must be initialized.
The algorithm runs for a total of E rounds, each comprising T training steps. A round ends when the algorithm converges or reaches the maximum allowed number of training steps. In addition, the invention uses DRL to obtain the optimal θ and η rather than to train a neural network for online processing.
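The round and step structure can be sketched as a plain loop. This is an illustration under assumptions rather than the patent's implementation: the function name `train` and the callables `env_reset`, `env_step`, `act`, and `learn` are hypothetical stand-ins for the environment and the Actor and Critic updates, and the loop returns the best action found, in line with the stated goal of obtaining the optimal θ and η.

```python
def train(env_reset, env_step, act, learn, rounds, steps_per_round):
    """Run `rounds` rounds of at most `steps_per_round` steps; return the best (reward, action)."""
    best = (float("-inf"), None)
    for _ in range(rounds):
        state = env_reset()
        for _ in range(steps_per_round):
            action = act(state)                          # T2: Actor proposes an action
            reward, next_state, done = env_step(action)  # T3: environment responds
            learn(state, action, reward, next_state)     # T4 to T7: store and update
            if reward > best[0]:
                best = (reward, action)
            state = next_state
            if done:                                     # stopping condition for this round
                break
    return best
```

With a toy environment whose reward peaks at action 3, for example, the loop stops as soon as that action is reached and returns it together with its reward.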
The state
Figure BDA0004167191930000101
, the actions θ and η, and the instantaneous reward are described in detail as follows:
1. State: the state at time t,
Figure BDA0004167191930000102
, provides useful information about the environment and helps train the networks. The invention takes the signal-to-interference-plus-noise ratios (SINRs) of the legitimate users and of the active eavesdropper as the state vector.
2. Action: the invention takes the phase shift matrix θ and the power control coefficient matrix η as the action a_t at time t. Because the neural network accepts only real-valued inputs while the RIS phase shift matrix is complex-valued, the invention converts each reflection coefficient into its phase angle before feeding it to the neural network.
3. Reward: the invention takes the energy efficiency of the system as the reward value. The reward is granted only when the action output by the Actor network satisfies all constraints; otherwise a penalty is applied.
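The action encoding and the constrained reward can be sketched as follows. All function names are hypothetical. The flattening of η into a list, the penalty value of -1.0, and the reduced set of checks in `check_constraints` (secrecy rate, per-user spectral efficiency, and per-AP total power coefficient) are assumptions made for illustration; the exact constraint expressions appear only in the patent's figures.

```python
import cmath
import math


def coefficients_to_angles(coefficients):
    """Map complex reflection coefficients to real phase angles in (-pi, pi]."""
    return [cmath.phase(c) for c in coefficients]


def angles_to_coefficients(angles):
    """Rebuild unit-modulus reflection coefficients exp(j * angle) from real angles."""
    return [cmath.exp(1j * a) for a in angles]


def build_action(angles, power_coefficients):
    """Concatenate phase angles and power control coefficients into one real vector."""
    return list(angles) + list(power_coefficients)


def check_constraints(secrecy_rate, r_th, spectral_eff, s_min, eta_totals):
    """Evaluate an assumed subset of the constraints; returns one boolean per check."""
    return [
        secrecy_rate >= r_th,                      # secrecy rate threshold
        all(s >= s_min for s in spectral_eff),     # minimum spectral efficiency per user
        all(total <= 1.0 for total in eta_totals), # per-AP total power coefficient
    ]


def reward(energy_efficiency, constraints_satisfied, penalty=-1.0):
    """Return the energy efficiency when every constraint holds, otherwise the penalty."""
    if all(constraints_satisfied):
        return energy_efficiency
    return penalty
```

Working with angles has the side effect that the unit-modulus constraint on the reflection coefficients holds by construction, since exp(j * angle) always has magnitude 1.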
To verify the energy efficiency advantage of the RIS-assisted cell-free massive MIMO system and the advantage of the DDPG-PER algorithm used in the invention, two alternative schemes and one alternative algorithm are designed for comparison: 1. the power control coefficients are optimized but no RIS is deployed; 2. RIS is deployed but the power control coefficient scheme of the invention is not used; 3. the conventional DDPG algorithm is used.
Simulation experiments verify the advantages of the algorithm and the system performance of the invention. The system simulation parameters are set as follows: the total number of APs (M) and reconfigurable intelligent surfaces (R) is 10; the number of users is K = 5; the number of eavesdroppers is 1; each AP has b = 2 antennas; the total number of reflecting elements over all reconfigurable intelligent surfaces ranges from 20 to 120; the AP transmit power is p_d = 0 to 30 dBm; the noise figure is 9 dB; the system bandwidth is B = 20 MHz; the pilot sequence length is τ_p = 30; the coherence interval length is τ_c = 200; the pilot transmit power of the legitimate users is p_u = 0.1 W; the pilot transmit power of the active eavesdropper is p_E = 0.1 W; the power amplifier efficiency is α_m = 0.4; the power consumption per antenna is P_{tc,m} = 0.2 W; the fixed power consumption per backhaul link is P_{0,m} = 0.825 W; the traffic-dependent power consumption is P_{bt,m} = 0.25 W/(Gbit/s); the backhaul link capacity between the m-th AP and the CPU is
Figure BDA0004167191930000103
; the secrecy rate threshold is R_{th} = 0.2 bit/s; and the minimum spectral efficiency requirement of the k-th user is S_{ok} = 0.7 bit/s/Hz. The channels from the APs to all legitimate users, from the APs to the eavesdropper, and from the reconfigurable intelligent surfaces to the users are modeled as Rayleigh channels, and the channels from the APs to the reconfigurable intelligent surfaces are modeled as Rician channels. The path loss at the reference distance of 1 m is 30 dB, and the path loss exponents of the AP-RIS, RIS-user, and AP-user links are 2.2, 2.8, and 3.5, respectively.
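The path loss figures can be turned into numbers with a short sketch. It assumes the common log-distance model, PL(d) = PL(1 m) + 10 n log10(d); the patent states only the 30 dB reference loss and the three exponents, so the model itself, the function name, and the dictionary keys are assumptions.

```python
import math


def path_loss_db(distance_m, exponent, reference_loss_db=30.0):
    """Log-distance path loss in dB relative to a 1 m reference distance."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return reference_loss_db + 10.0 * exponent * math.log10(distance_m)


# Path loss exponents quoted in the simulation setup.
EXPONENTS = {"AP-RIS": 2.2, "RIS-user": 2.8, "AP-user": 3.5}
```

Under this model a link of the same length loses less power on the AP-RIS hop (exponent 2.2) than on the direct AP-user path (exponent 3.5).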
Fig. 2 shows energy efficiency versus transmit power under the different schemes and algorithms. As the transmit power increases, the energy efficiency under every scheme and algorithm first rises and then levels off. The rise occurs because a higher transmit power threshold lets the APs allocate more power to the users, which raises the energy efficiency. Once the threshold grows further, the transmit power reaches its optimal value and the energy efficiency stabilizes. In addition, at the same transmit power, the scheme and algorithm of the invention outperform the other schemes and the conventional DDPG algorithm, because optimizing the RIS reflection phases gives the legitimate users a higher transmission rate and the priority mechanism improves the algorithm's use of important experiences.
Fig. 3 shows energy efficiency versus the number of legitimate users. For every combination of AP and RIS counts, the energy efficiency increases monotonically with the number of users, because multi-user diversity is exploited. When the total number of APs and RISs is fixed, deploying more RISs achieves better performance: a RIS consists of a large number of passive elements, which reduces system power consumption, while its reconfigurability improves the spectral efficiency of the legitimate users. Furthermore, the total number of reflecting elements over all RISs is held constant in Fig. 3, and for a fixed number of users the energy efficiency increases with the number of RISs, which illustrates the advantage of deploying RISs in a distributed manner.
Fig. 4 shows secrecy energy efficiency versus the secrecy rate threshold under the different schemes and algorithms. As the threshold increases, the energy efficiency of the system first remains stable and then decreases. Under a small threshold the user throughput easily satisfies the secrecy rate constraint, so the energy efficiency is unchanged; as the threshold grows, the transmit power must be raised to satisfy the constraint, which increases system power consumption and lowers the energy efficiency. Fig. 4 also shows that the proposed scheme outperforms the other schemes in secrecy energy efficiency.
The foregoing description of the embodiments is provided to illustrate the general principles of the invention and is not intended to limit its scope or to restrict the invention to the particular embodiments described. Any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the invention are intended to fall within its scope of protection.

Claims (9)

  1. An energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system, characterized in that the RIS-assisted cell-free massive MIMO system comprises a central processing unit (CPU), M multi-antenna APs, K single-antenna legitimate users, R reconfigurable intelligent surfaces with N reflecting elements, and a single-antenna active eavesdropper, and the method comprises the following steps:
    S1. in an uplink training stage, the AP performs channel estimation from the pilot sequences sent by the legitimate users and the active eavesdropper to obtain channel state information and the beam coefficients of the transmission data, and sends the channel state information and the beam coefficients to the CPU over a backhaul link;
    S2. in a downlink transmission stage, the CPU calculates, from the received channel state information, the received beam coefficients, and the spatial relationship among the antennas, the interference vectors among the legitimate users and between the legitimate users and the active eavesdropper, and sends the interference vectors to all APs and reconfigurable intelligent surfaces connected to the CPU;
    S3. the AP adjusts its power according to the interference vectors sent by the CPU and the transmission data of the legitimate users, the reconfigurable intelligent surface adjusts its phases according to the interference vectors sent by the CPU and the transmission data of the legitimate users, and after the adjustment the AP and the reconfigurable intelligent surface send the transmission data to all legitimate users and active eavesdroppers connected to them.
  2. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 1, wherein in S3 the specific steps of performing the phase adjustment and the power adjustment according to the interference vectors sent by the CPU and the transmission data of the legitimate users and the active eavesdropper are as follows:
    S3.1. setting, according to the interference vectors and the transmission data, the relevant constraints on the AP sending transmission information and the reconfigurable intelligent surface reflecting the transmission information to the legitimate users and the eavesdropper;
    S3.2. establishing a total energy consumption model of the RIS-assisted secure cell-free massive MIMO system;
    S3.3. combining the relevant constraints and the total energy consumption model to obtain an optimization problem;
    S3.4. solving the optimization problem with a deep deterministic policy gradient algorithm with prioritized experience replay to obtain the optimal phases of the reconfigurable intelligent surface and the optimal power control coefficients of the AP;
    S3.5. adjusting the phases of the reconfigurable intelligent surface according to the optimal phases, and adjusting the power of the AP according to the optimal power control coefficients.
  3. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 2, wherein in S3.1 the relevant constraints are:
    C_1:
    Figure QLYQS_1
    C_2:
    Figure QLYQS_2
    C_3: R_{sec} ≥ R_{th}
    C_4:
    Figure QLYQS_3
    C_5:
    Figure QLYQS_4
    C_6:
    Figure QLYQS_5
    C_1 indicates that the backhaul link capacity of the system does not exceed the maximum capacity, which is the backhaul link capacity constraint; C_2 specifies the value range of the power control coefficients; C_3 indicates that the secrecy rate of the eavesdropped legitimate user is not less than the minimum secrecy rate, which ensures that the legitimate user can communicate securely; C_4 indicates that the total transmit power control coefficient of the AP is not larger than 1, which is the transmit power constraint of the AP; C_5 indicates that the spectral efficiency of each legitimate user is not lower than the minimum spectral efficiency, which guarantees the communication quality of the legitimate users; C_6 indicates that the gain magnitude of each element of the reconfigurable intelligent surface is 1, which is the modulus constraint on the reflection coefficients of the reconfigurable intelligent surface, wherein
    Figure QLYQS_6
    represents the spectral efficiency of the legitimate users; τ_c is the length of the coherence interval and τ_p is the pilot sequence length; η_{mk} is the power control coefficient of the signal transmitted by the m-th AP to the k-th user; θ_{r,n} is the reflection coefficient of the n-th reflecting element on the r-th reconfigurable intelligent surface; a_m ≥ 1 indicates that the data rate carried on the m-th backhaul link should be a_m times the total achievable rate of the m-th AP;
    Figure QLYQS_7
    represents the maximum capacity of the backhaul link between the m-th AP and the CPU; R_{sec} is the secrecy rate of the eavesdropped user; R_{th} is the secrecy rate threshold; S_{ok} is the minimum spectral efficiency; M is the total number of APs, K is the total number of users, and N is the total number of reflecting elements in a reconfigurable intelligent surface.
  4. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 3, wherein the total energy consumption model of the RIS-assisted secure cell-free massive MIMO system is:
    Figure QLYQS_8
    wherein P_m is the power consumed by the amplifier and circuits at the m-th AP, including the power consumption of the transceiver chain and of signal processing; P_k is the circuit power consumption of the k-th user; P_{r,n} is the low power consumption of the n-th reflecting element in the r-th reconfigurable intelligent surface; P_{fh,m} is the power consumed by the backhaul link connecting the CPU and the m-th AP, which carries data between the AP and the CPU; M is the total number of APs, K is the total number of users, N is the total number of reflecting elements in a reconfigurable intelligent surface, and R is the total number of reconfigurable intelligent surfaces.
  5. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 4, wherein the total throughput of the RIS-assisted secure cell-free massive MIMO system is:
    Figure QLYQS_9
    wherein B is the system bandwidth, S_{ek}({η_{mk}, θ_{r,n}}) is the spectral efficiency of each user, and K is the total number of users.
  6. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 5, wherein the power P_m consumed by the amplifier and circuits at the m-th AP is:
    Figure QLYQS_10
    wherein 0 < α_m ≤ 1 is the efficiency of the power amplifier, N_0 is the noise power, b is the number of antennas of the AP, P_{tc,m} is the power required to operate the circuit components associated with each antenna of the AP, and ρ_d is the maximum normalized transmit power of the AP.
  7. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 5, wherein the power P_{fh,m} consumed by the backhaul link connecting the CPU and the m-th AP is:
    P_{fh,m} = P_{0,m} + B · S_e({η_{mk}, θ_{r,n}}) · P_{bt,m}
    wherein P_{0,m} is the fixed power consumption of each backhaul link, B is the system bandwidth, P_{bt,m} is the traffic-dependent power consumption in W/(Gbit/s), and S_e({η_{mk}, θ_{r,n}}) is the overall spectral efficiency of the system:
    Figure QLYQS_11
  8. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 1, wherein the optimization problem is solved with the DDPG-PER algorithm, and the DDPG-PER algorithm proceeds as follows:
    T1. initializing the parameters of the Actor network and the Critic network, setting up the prioritized experience replay buffer, and defining the learning rates of the Actor and the Critic, the reward discount coefficient, the number of training rounds, and the number of training steps per round;
    T2. in each training round, generating, by the Actor network, an action value from the current state value;
    T3. applying the action value obtained in T2 to the environment to obtain a reward value and generate the next state value;
    T4. storing the experience, which consists of the current state value, the action value, the reward value, and the generated next state value, in the prioritized experience replay buffer;
    T5. sampling a batch of experiences from the prioritized experience replay buffer according to the weight of each experience, so that high-priority experiences are favored, and using the batch to train the Actor network and the Critic network;
    T6. computing the Q value of the value function for the current state and updating the parameters of the value function, that is, of the Critic network;
    T7. with the current state value as input to the Actor network, computing the value function Q value of the generated action value and using it to update the Actor network parameters;
    T8. repeating T2 to T7 until the preset number of training steps is reached or a stopping condition is met;
    T9. using the trained Actor network to make decisions and obtain the final policy.
  9. The energy efficiency optimization method for a RIS-assisted secure cell-free massive MIMO system according to claim 8, wherein
    the state value comprises the signal-to-interference-plus-noise ratio of the legitimate users:
    Figure QLYQS_12
    and the signal-to-interference-plus-noise ratio of the active eavesdropper:
    Figure QLYQS_13
    the action value is: the phase shift matrix θ and the power control coefficient matrix η;
    the reward value is: the energy efficiency of the system.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310367185.9A CN116321236A (en) 2023-04-07 2023-04-07 RIS-assisted safe honeycomb-free large-scale MIMO system energy efficiency optimization method

Publications (1)

Publication Number Publication Date
CN116321236A true CN116321236A (en) 2023-06-23

Family

ID=86792585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310367185.9A Pending CN116321236A (en) 2023-04-07 2023-04-07 RIS-assisted safe honeycomb-free large-scale MIMO system energy efficiency optimization method

Country Status (1)

Country Link
CN (1) CN116321236A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116865798A (en) * 2023-07-06 2023-10-10 河北大学 Intelligent super-surface phase shift method for high-speed railway honeycomb removing large-scale MIMO system
CN116865798B (en) * 2023-07-06 2024-01-05 河北大学 Intelligent super-surface phase shift method for high-speed railway honeycomb removing large-scale MIMO system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination