CN116822329A - Decision method for multi-user power control in wireless network - Google Patents

Decision method for multi-user power control in wireless network

Info

Publication number
CN116822329A
Authority
CN
China
Prior art keywords
user
power control
power
wolf
algorithm
Legal status
Pending
Application number
CN202310529999.8A
Other languages
Chinese (zh)
Inventor
贾文生
刘露萍
Current Assignee
Guizhou University
Original Assignee
Guizhou University

Landscapes

  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a decision method for multi-user power control in a wireless network, comprising the following steps: modeling the multi-user power control problem as an n-person non-cooperative game and proving the existence of a Nash equilibrium of the multi-user power control game; and designing a mean-field gradient ascent algorithm with the WoLF criterion (WoLF-MFGA) by introducing a mean-field term into gradient ascent with the WoLF criterion. The WoLF-MFGA algorithm allows each player to update its strategy based on the current gradient and a variable learning rate.

Description

Decision method for multi-user power control in wireless network
Technical Field
The invention relates to the technical field of multi-user power control games, in particular to a decision method for multi-user power control in a wireless network.
Background
The gradient descent algorithm is an optimization method that iterates step by step to reduce the value of a loss function and thereby minimize it. In the real world, players typically do not exhibit "full rationality". This may be because a player's decisions are myopic, the decision-update process is gradual, and computational power is limited; this feature of decision-making is called "bounded rationality". Studies have shown that naive expectations and myopic adjustment represent well how humans choose decisions without anticipating the future. Accordingly, gradient-based learning methods are widely used as a bounded-rationality model for studying players' decisions. The gradient ascent algorithm is a common learning algorithm for matrix games. For two-player, two-action general-sum games, Singh et al. proved that, with infinitesimal step sizes, either the players' strategies converge to a Nash equilibrium or their average payoffs over time converge in the limit to the expected payoffs of a Nash equilibrium. Bowling et al. then proposed the WoLF ("Win or Learn Fast") infinitesimal gradient ascent algorithm by introducing a variable learning rate (the WoLF criterion) into infinitesimal gradient ascent: the learning rate is small when a player is winning and large when the player is losing. They proved that the algorithm converges to a Nash equilibrium in all such bimatrix games. Zinkevich et al. proposed a generalized infinitesimal gradient ascent algorithm, which extends infinitesimal gradient ascent to games with two or more strategies and is universally consistent. For two-action general-sum stochastic games, Banerjee et al. proposed a WoLF learning algorithm based on policy dynamics.
However, because the gradient ascent algorithm requires full knowledge of the opponents' strategies and is difficult to extend to multi-player, multi-action general-sum games, it cannot be applied to many practical problems. Mean-field terms are an effective way to address this. When the number of players is large, the behavior of all other players is encapsulated in a mean-field term, and each player only needs to know the value of this term when making a decision. The term captures the aggregate influence of the other players' strategies on the player's payoff function, rather than the individual strategy of each other player. Gradient learning has been used to reach Nash equilibria in different types of games, but it has not yet been used to analyze n-person non-cooperative games and their applications.
One of the key issues in wireless network systems is multi-user power control. The main purpose of power control is to give each signal sufficient quality without causing unnecessary interference to other signals, that is, to maximize the utility of each user (player). In recent years, multi-user power control in wireless networks has become closely linked to game theory. MacKenzie et al. showed that game theory is a suitable tool for addressing various problems in wireless communication systems. Yu et al. studied multi-user power control in frequency-selective interference channels and proposed the iterative water-filling (IWF) algorithm to reach its Nash equilibrium efficiently. Further, the convergence of the IWF algorithm indicates that the Nash equilibrium of the two-action matrix game is unique. Yamashita et al. formulated the multi-user power control problem for digital subscriber lines as a nonlinear complementarity problem and used a Newton-type smoothing (NS) method to compute the Nash equilibrium efficiently; they found the NS algorithm to be more robust to strong interference than the existing synchronous IWF algorithm. Meshkati et al. used a game-theoretic model to study power control in multi-carrier code-division multiple-access systems and proposed an iterative distributed algorithm for reaching the Nash equilibrium of the power control game, together with an analysis of its existence and uniqueness. Hao et al. designed a joint channel allocation and power control optimization algorithm based on a non-cooperative game to reduce interference in wireless sensor networks and balance network energy consumption, and further proved that the algorithm converges to a Nash equilibrium. Meanwhile, many scholars have designed different learning algorithms for analyzing multi-user power control games in wireless networks.
Goodman and Mandayam studied power control in wireless data networks and proposed an algorithm in which the pricing function is proportional to the transmit power, so that each user individually maximizes its utility. Luo et al. presented a convergence analysis of the IWF algorithm under more realistic channel settings and an arbitrary number of users for multi-user power control in digital subscriber lines. He et al. proposed a projection neural network to solve for the Nash equilibrium of the multi-user power control optimization problem in modern digital subscriber lines. Tao et al. proposed a centralized power control algorithm based on projected gradients for solving multi-user power control games in cognitive radio networks. Other power control algorithms based on non-cooperative games have also been proposed in the literature. Gulzar et al. proposed an adaptive power control algorithm for the power control problem in cognitive radio networks. Taghizadeh et al. proposed a mean-field gradient ascent learning algorithm to find the Nash equilibrium of n-person non-cooperative games in blockchain network mining and obtained sufficient conditions for the convergence of the algorithm.
Over the past two decades, wireless communication has become a broad field of application and research, including cellular networks, satellite networks, wireless mesh networks, cognitive radio networks, and wireless ad hoc and sensor networks.
Disclosure of Invention
In order to solve the problems in the prior art, the invention aims to provide a decision method for multi-user power control in a wireless network.
In order to achieve the above purpose, the invention adopts the following technical scheme: a decision method for multi-user power control in a wireless network, comprising the steps of:
step 1, modeling the multi-user power control problem as an n-person non-cooperative game and proving the existence of a Nash equilibrium of the multi-user power control game;
step 2, designing a mean-field gradient ascent algorithm with the WoLF criterion (WoLF-MFGA) by introducing a mean-field term into gradient ascent with the WoLF criterion; the WoLF-MFGA algorithm allows each player to update its strategy based on the current gradient and a variable learning rate.
As a further improvement of the present invention, the step 1 is specifically as follows:
assuming that all users are homogeneous and that the unit price of wireless power is the same for every user, the primary goal of each user is to maximize its own total utility; because each user's power transmission is random, the probability that a user wins in a given round of transmission equals the ratio of its transmit power to the total transmit power in the wireless network:

$$P_i^{suc} = \frac{x_i}{\sum_{j \in N} x_j}$$

where $N = \{1, \ldots, n\}$ is the set of all users, $n$ is the number of users, $P_i^{suc}$ is the probability that the $i$-th user transmits power successfully, and $x_i$ is the transmit power of user $i$ on the carrier; let $x_i \in X_i$, where […] and […];
defining a utility function suitable for data applications; the utility function of a user is defined as the ratio of its throughput to its transmit power:

$$u_i = \frac{T_i}{q_i}$$

where $q_i$ is the transmit power of the $i$-th user and $T_i$ is the throughput of the $i$-th user, that is, the net number of information bits transmitted without error per unit time (the effective transmission); the throughput is expressed as:

$$T_i = \frac{L}{M} R_i f(l_i)$$

where $L$ is the number of information bits and $M$ is the total number of bits in a data packet; $R_i$ and $l_i$ are the transmission rate and the signal-to-interference-plus-noise ratio (SINR) of the $i$-th user, respectively; $f(l_i)$ describes successful transmission, that is, the probability that a transmission is received without error;
a packet received with one or more bit errors is assumed to be retransmitted; $f(l_i)$ is an increasing, continuous, S-shaped function with $f(+\infty) = 1$, and $f(0) = 0$ is required to ensure that $A_i = 0$ when $p_i = 0$;
considering the cost of transmitting power, the payoff function $J_i$ of user $i$ can be expressed as:
where $p_i$ is the unit price of transmit power for the $i$-th user and $E_i$ is the expected revenue obtained when user $i$ transmits power successfully;
modeling the multi-user power control problem as an n-person non-cooperative game in which each user maximizes its utility by choosing how much power to transmit in the wireless network; the competition among users is modeled as a multi-user power control game defined by the user set, the strategy sets, and the payoff functions:

$$\Gamma = \langle N, X_i, J_i \rangle_{i \in N}$$
as a further improvement of the present invention, the step 2 is specifically as follows:
the gradient strategy update formula is as follows:
where […] is the variable learning-rate parameter and […] is a constant; at the $k$-th iteration, the step size of the algorithm in the gradient direction is […], where the update rule of […] is as follows:
using mean-field game theory, the behavior of a large number of users is analyzed by encapsulating it in a mean-field term, which is estimated with the following update formula:
where $Z(k)$ is the mean-field term at the $k$-th iteration;
substituting the mean-field term, the probability that the $i$-th user transmits power successfully becomes:
the payoff function of user $i$ is then rewritten as:
substituting the rewritten payoff function into the gradient strategy update formula yields the updated gradient learning strategy formula:
the strategy update of each user is thus obtained as a function of the mean-field term, and the WoLF mean-field gradient ascent update equation is rewritten as the following fixed-point iteration:
in the multi-user power control game $\Gamma$, each user encapsulates the effect of the other users in the mean-field term and ignores its own effect on that term.
Because the strategy of every player affects the expected utility of the other players through the success probability, the invention models the power control problem among users in a wireless network as an n-person non-cooperative game in order to analyze the players' strategic behavior. The Nash equilibrium of this game exists and is unique. The invention considers a gradient learning strategy for the players that preserves each player's private information, as a bounded-rationality learning model. To reach the Nash equilibrium of this game, a new mean-field gradient ascent algorithm with a variable learning rate (WoLF-MFGA) is proposed by introducing a variable learning rate and a mean-field term into the gradient ascent algorithm. Furthermore, the variable gradient based on multi-user decision behavior ensures that low-cost power control among users can be achieved. The method is new and provides an effective tool for the equilibrium analysis of power control games in wireless networks.
The beneficial effects of the invention are as follows:
1. The invention models the multi-user power control problem as an n-person non-cooperative game and uses existing analysis tools to prove the existence of a Nash equilibrium of the multi-user power control game, that is, at least one mixed-strategy Nash equilibrium exists in the strategy space of the game.
2. The WoLF-MFGA algorithm is designed by combining the mean-field term with a gradient ascent algorithm that has a variable learning rate. In addition, a sufficient condition for the convergence of the WoLF-MFGA algorithm is given using the Banach contraction fixed-point theorem.
3. Simulation results for reaching the Nash equilibrium of the multi-user power control game show that the WoLF-MFGA algorithm closely approximates the best response algorithm. To analyze the convergence of the proposed algorithm, the sensitivity of the WoLF-MFGA algorithm to its parameters is studied.
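The contraction argument mentioned in item 2 can be illustrated numerically. The following is a minimal sketch, not the convergence proof of the invention: the aggregate map `g`, its parameters `A`, `p`, `c`, and the starting point are hypothetical values chosen only so that the map is a contraction near its fixed point.

```python
def fixed_point(g, z0, tol=1e-8, max_iter=10_000):
    """Iterate z <- g(z) until successive values differ by less than tol."""
    z = z0
    for k in range(max_iter):
        z_next = g(z)
        if abs(z_next - z) < tol:
            return z_next, k + 1
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical aggregate update (an assumption, not a formula from the source):
# total power Z moves along the aggregated gradient A / Z - p with step c.
A, p, c = 80_000.0, 5.0, 1_000.0


def g(z):
    return z + c * (A / z - p)


z_star, steps = fixed_point(g, 12_000.0)

# Local contraction factor |g'(Z*)| = |1 - c * A / Z*^2|; a value below 1 is
# what the Banach fixed-point theorem needs for convergence near Z*.
contraction = abs(1.0 - c * A / z_star ** 2)
print(z_star, steps, contraction)
```

With these illustrative values the fixed point is $Z^* = A/p$; choosing a much larger step `c` would push the contraction factor above 1 and the iteration would no longer be guaranteed to converge.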
Drawings
FIG. 1 is a diagram of the multi-user power control game process according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the total utility of the WoLF-MFGA algorithm and the BR algorithm in the multi-user power control game under different learning parameters according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the total utility of the WoLF-MFGA algorithm and the BR algorithm in the multi-user power control game under different $A_i$ distributions according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the total utility of the WoLF-MFGA algorithm and the BR algorithm in the multi-user power control game under different $q_i$ distributions according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the evolution of the total utility under different learning-rate parameters according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the evolution of the total utility under different wireless network sizes according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the effect of increasing the users' $A_i$ at the 400th iteration according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Examples
The main objective of this embodiment is to explore the strategic decisions of multi-user power control in wireless networks from the perspective of mean-field game theory.
First, the multi-user power control problem is modeled as an n-person non-cooperative game, and the existence of a Nash equilibrium of the multi-user power control game is proved. Then, by introducing a mean-field term into gradient ascent with the WoLF criterion, a new mean-field gradient ascent algorithm with the WoLF criterion (WoLF-MFGA) is designed. The WoLF-MFGA algorithm allows each player to update its strategy based on the current gradient and a variable learning rate. Next, this embodiment studies a sufficient condition for the convergence of the proposed WoLF-MFGA algorithm. In addition, the WoLF-MFGA algorithm is used to reach the Nash equilibrium of the multi-user power control game, and simulation results are given. Finally, to analyze the convergence of the proposed algorithm, this embodiment studies the sensitivity of the WoLF-MFGA algorithm to its parameters.
A decision method for multi-user power control in a wireless network considers the decision problem of the multi-user power control game in the wireless network, which can be modeled as an n-person non-cooperative game. Because gradient ascent ensures that low-cost, high-benefit power control can be achieved among the decision behaviors of multiple users, designing a mean-field gradient ascent algorithm with a variable learning rate (WoLF-MFGA) to reach the Nash equilibrium of the n-person non-cooperative game is a research topic worthy of attention.
1. The n-person non-cooperative game in the multi-user power control problem:
Assume that all users are homogeneous and that the unit price of wireless power is the same for every user; the primary goal of each user is to maximize its own total utility. Because each user's power transmission is random, the probability that a user wins in a given round of transmission equals the ratio of its transmit power to the total transmit power in the wireless network:

$$P_i^{suc} = \frac{x_i}{\sum_{j \in N} x_j},$$

where $N = \{1, \ldots, n\}$ is the set of all users, $n$ is the number of users, $P_i^{suc}$ is the probability that the $i$-th user transmits power successfully, and $x_i$ is the transmit power of user $i$ on the carrier. Let $x_i \in X_i$, where […] and […].
The equation above gives the winning probability of an individual user; the expected share of the network reward that a user receives on each carrier can be expressed in the same way, because a user receives a share of the reward in proportion to its share of the total transmit power.
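As a small numerical illustration of the proportional rule stated above (a user's winning probability is its transmit power divided by the total transmit power), the following sketch computes the probabilities for three users. The function name and the power values are illustrative only.

```python
def success_probabilities(powers):
    """Return P_i = x_i / sum_j x_j for every user's transmit power x_i."""
    total = sum(powers)
    if total <= 0:
        raise ValueError("total transmit power must be positive")
    return [x / total for x in powers]


# Three users transmitting 1, 2 and 5 units of power (illustrative values).
probs = success_probabilities([1.0, 2.0, 5.0])
print(probs)  # [0.125, 0.25, 0.625]
```

The probabilities always sum to 1, and raising one user's power raises its own share while lowering everyone else's, which is the source of the competition modeled by the game.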
To model the power control problem as an n-person non-cooperative game, a utility function suitable for data applications must first be defined. A higher signal-to-interference-plus-noise ratio (SINR) at the receiver output generally results in a lower bit error rate and thus higher throughput. However, achieving a higher SINR requires the user terminal to transmit at higher power, which in turn shortens battery life. This trade-off can be quantified by defining the utility function of a user as the ratio of its throughput to its transmit power, namely:

$$u_i = \frac{T_i}{q_i},$$

where $q_i$ is the transmit power of the $i$-th user and $T_i$ is the throughput of the $i$-th user, that is, the net number of information bits transmitted without error per unit time (sometimes simply called the effective transmission). The throughput can be expressed as:

$$T_i = \frac{L}{M} R_i f(l_i),$$

where $L$ is the number of information bits and $M$ is the total number of bits in a data packet. $R_i$ and $l_i$ are the transmission rate and the SINR of the $i$-th user, respectively. $f(l_i)$ describes successful transmission, that is, the probability that a transmission is received without error.
A packet received with one or more bit errors is assumed to be retransmitted. $f(l_i)$ is an increasing, continuous, S-shaped function with $f(+\infty) = 1$, and $f(0) = 0$ is required to ensure that $A_i = 0$ when $p_i = 0$.
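The throughput-per-power trade-off described above can be sketched as follows. Two things here are assumptions rather than content of the source: the throughput is taken to have the common form $(L/M) R_i f(l_i)$, and the S-shaped function is taken to be $f(l) = (1 - e^{-l})^M$, which is merely one convenient choice satisfying $f(0) = 0$ and $f(+\infty) = 1$. All numeric values are illustrative.

```python
import math


def efficiency(sinr, packet_bits):
    """Assumed S-shaped function: f(0) = 0 and f -> 1 as the SINR grows."""
    return (1.0 - math.exp(-sinr)) ** packet_bits


def throughput(info_bits, packet_bits, rate, sinr):
    """Assumed form T = (L / M) * R * f(l): net error-free bits per unit time."""
    return (info_bits / packet_bits) * rate * efficiency(sinr, packet_bits)


def utility(info_bits, packet_bits, rate, sinr, power):
    """Utility as throughput per unit of transmit power."""
    if power <= 0:
        raise ValueError("transmit power must be positive")
    return throughput(info_bits, packet_bits, rate, sinr) / power


# Illustrative numbers: 64 information bits in an 80-bit packet, rate 1e4.
u_low = utility(64, 80, 1e4, sinr=5.0, power=0.5)
u_high = utility(64, 80, 1e4, sinr=10.0, power=0.5)
print(u_low, u_high)
```

At equal power the higher SINR gives the higher utility; in practice a higher SINR costs more power, which is the trade-off the ratio is meant to capture.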
Taking the cost of transmitting power into account, the payoff function $J_i$ of user $i$ can be expressed as:
where $p_i$ is the unit price of transmit power for the $i$-th user (electricity and other maintenance costs) and $E_i$ is the expected revenue obtained when user $i$ transmits power successfully. In a multi-user power system, each user may be affected by the strategies of other users, and with limited sensing capability it is difficult for a user to collect state information about all other users. This embodiment therefore considers a unified price, computed by the terminal, for deciding how much power to transmit, where the price is expressed in the same unit as the utility function. The multi-user power control problem is modeled as an n-person non-cooperative game in which each user maximizes its utility by choosing how much power to transmit in the wireless network. The competition among users can thus be modeled as a "multi-user power control game" defined by the user set, the strategy sets, and the payoff functions:

$$\Gamma = \langle N, X_i, J_i \rangle_{i \in N},$$

The multi-user power control problem is thus modeled as an n-person non-cooperative game; the multi-user power control game process is shown in FIG. 1.
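To make the structure of the game concrete, the following sketch evaluates one possible payoff function. It is an assumption, since the payoff equation itself is not reproduced in this text: the expected revenue is taken to be a reward $A_i$ times the winning probability, and the cost is taken to be the unit price $p_i$ times the transmit power $x_i$. The names `payoff`, `rewards`, and `prices` are hypothetical.

```python
def payoff(i, powers, rewards, prices):
    """Assumed payoff J_i = A_i * x_i / sum_j x_j - p_i * x_i (illustrative)."""
    total = sum(powers)
    win_prob = powers[i] / total if total > 0 else 0.0
    expected_revenue = rewards[i] * win_prob   # assumed form of the revenue
    cost = prices[i] * powers[i]               # unit price times transmit power
    return expected_revenue - cost


# Two users with equal reward and price but different transmit powers.
powers = [1.0, 3.0]
rewards = [80_000.0, 80_000.0]
prices = [4.0, 4.0]
print(payoff(0, powers, rewards, prices))  # 19996.0
```

Each user's payoff depends on every other user's power through the shared denominator, which is what makes the problem a game rather than n separate optimizations.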
2. WoLF-MFGA algorithm:
In the real world, players usually have bounded rationality, that is, they lack foresight. Gradient learning is therefore used as a general bounded-rationality model for analyzing players' decision-making behavior. The gradient strategy update formula is as follows:
where […] is the variable learning-rate parameter (the WoLF criterion) and […] is a constant. At the $k$-th iteration, the step size of the algorithm in the gradient direction is […]. The main idea of the WoLF criterion is to adjust the learning rate slowly and cautiously when a player is winning and to learn quickly when the player is losing or performing poorly. The update rule of […] is as follows:
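The "win or learn fast" idea described above can be sketched in a few lines. This is not the update rule of the invention, which is not reproduced in this text; the win test (comparing the current payoff with a reference payoff) and the two rate values are assumptions chosen for illustration.

```python
def wolf_learning_rate(current_payoff, reference_payoff,
                       rate_win=0.1, rate_lose=0.4):
    """Return a small rate when winning and a large rate when losing.

    Assumption: a player is "winning" when its current payoff exceeds a
    reference payoff, for example the payoff of its average past strategy.
    """
    if not 0 < rate_win < rate_lose:
        raise ValueError("require 0 < rate_win < rate_lose")
    return rate_win if current_payoff > reference_payoff else rate_lose


def gradient_step(x, gradient, base_step, rate, x_min=0.0):
    """One projected gradient-ascent step with the WoLF-scaled step size."""
    return max(x_min, x + base_step * rate * gradient)


rate = wolf_learning_rate(current_payoff=12.0, reference_payoff=10.0)
x_next = gradient_step(x=5.0, gradient=2.0, base_step=1.0, rate=rate)
print(rate, x_next)
```

A winning player therefore moves cautiously, while a losing player takes larger steps and adapts faster to the other players' changes.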
Because the payoff of user $i$ is affected by the strategies of the other users, solving directly for the Nash equilibrium of the multi-user power control game is neither realistic nor computationally tractable when the number of users is large. Conventional analysis then faces high computational complexity or even intractability. Mean-field game theory helps here: the behavior of a large number of users is encapsulated in a mean-field term, which can be estimated with the following update formula:
where $Z(k)$ is the mean-field term at the $k$-th iteration. A basic assumption behind mean-field games is that the number of players is large enough that a change in a single player's strategy does not significantly change the mean-field value. This assumption holds in the wireless network multi-user power control game, where the number of users is sufficiently large. Each user therefore only has to estimate the total transmit power of the network, that is, the mean-field term $Z(k)$, at each iteration.
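The large-population assumption above can be checked with a short numerical sketch. Following the text, the mean-field term is taken to be the total transmit power of the network; the helper names and the population of 1000 users with unit power are illustrative assumptions.

```python
def mean_field_term(powers):
    """Z(k): total transmit power of the network at the current iteration."""
    return sum(powers)


def approx_success_probability(own_power, z):
    """Mean-field estimate of a user's winning probability: own power over Z."""
    return own_power / z


# 1000 users with unit power; then a single user doubles its power.
powers = [1.0] * 1000
z_before = mean_field_term(powers)
powers[0] = 2.0
z_after = mean_field_term(powers)

relative_change = (z_after - z_before) / z_before
print(relative_change)  # about 0.001: one user barely moves the aggregate
```

Because the aggregate changes by only about 0.1% here, each user can reasonably treat $Z(k)$ as given when choosing its own power, without tracking any individual competitor.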
Substituting formula (2) into formula (1) gives the probability that the $i$-th user transmits power successfully:
The payoff function of user $i$ is then rewritten as:
Substituting formula (3) into formula (1) yields the updated gradient learning strategy formula:
Clearly, the strategy update of each user is obtained as a function of the mean-field term. The WoLF mean-field gradient ascent (WoLF-MFGA) update equation (3) can therefore be rewritten as the following fixed-point iteration:
In the multi-user power control game $\Gamma$, each user encapsulates the effect of the other users in the mean-field term and ignores its own effect on that term, so the mean-field equilibrium differs from the best response equilibrium.
In summary, the WoLF-MFGA algorithm is designed by introducing the WoLF criterion and the mean-field term into the gradient ascent algorithm. Its flow is shown in Algorithm 1, which alternates between updating the mean-field term and updating the user strategies.
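The alternation between the mean-field update and the strategy updates can be sketched as below. This is not Algorithm 1 itself, which is not reproduced in this text, but a minimal illustration under stated assumptions: the payoff is taken to be $A_i x_i / Z - p_i x_i$ with $Z$ held fixed by each user, the win test compares the current power with the user's running-average power, and the demonstration uses homogeneous users. The function name and all parameter values are hypothetical.

```python
def wolf_mfga(rewards, prices, x0, eta=20.0, rate_win=0.5, rate_lose=1.0,
              iterations=300, x_min=1e-9):
    """Sketch of mean-field gradient ascent with a WoLF-style variable rate."""
    n = len(x0)
    x = list(x0)
    x_avg = list(x0)  # running-average strategy, used only by the win test
    for k in range(1, iterations + 1):
        z = sum(x)  # mean-field term: total transmit power of the network
        new_x = []
        for i in range(n):
            # Assumed gradient of A_i * x_i / Z - p_i * x_i with Z held fixed.
            grad = rewards[i] / z - prices[i]
            # Assumed win test: for fixed Z the payoff is linear in x_i, so the
            # current power beats the average power when this product is > 0.
            winning = grad * (x[i] - x_avg[i]) > 0
            rate = rate_win if winning else rate_lose
            new_x.append(max(x_min, x[i] + eta * rate * grad))
        x = new_x
        x_avg = [a + (xi - a) / (k + 1) for a, xi in zip(x_avg, x)]
    return x, sum(x)


# Homogeneous users (illustrative values): the total power settles at A / p.
n_users = 100
x_final, z_final = wolf_mfga([80_000.0] * n_users, [5.0] * n_users,
                             [50.0] * n_users)
print(x_final[0], z_final)
```

In this simplified setting each user only ever reads the aggregate `z`, never another user's individual power, which is the information structure the mean-field term is meant to provide.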
3. Numerical simulation:
The multi-user power control game is modeled as an n-person non-cooperative game, and the proposed mean-field gradient ascent learning algorithm with a variable learning rate (WoLF-MFGA) is evaluated through numerical simulation. First, the WoLF-MFGA algorithm is compared with the best response (BR) algorithm, which is a good benchmark because it assumes complete information about the players' payoff functions. Second, the sensitivity of the WoLF-MFGA algorithm to its parameters is analyzed, and its convergence is studied under the dynamic conditions of a real wireless network.
Players update their strategies iteratively according to their best response functions and according to Algorithm 1, respectively. For simplicity, all simulation parameter settings are as shown in Table 1 unless otherwise specified.
Table 1. Parameter settings of the algorithm
The parameters are chosen using the multi-user power control problem in a wireless network as an example. $A_i$ denotes the utility the $i$-th user obtains from successfully transmitting power. To reflect the accessibility, ease of consumption, and energy value of a power control system, the expected utility $A_i$ is assumed to be uniformly distributed in the range 70,000 to 90,000. $p_i$ denotes the unit price of transmit power and is assumed to be uniformly distributed in the range 1 to 9. All simulations were run 6 times, and the average results are reported.
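The sampling and averaging procedure described above can be sketched as follows. The ranges and the six runs come from the text; the helper names, the seed, and the placeholder metric are assumptions, and any simulation function that takes the sampled rewards and prices could be passed in as `metric`.

```python
import random


def sample_parameters(n_users, rng):
    """Draw A_i ~ U(70000, 90000) and p_i ~ U(1, 9) for every user."""
    rewards = [rng.uniform(70_000, 90_000) for _ in range(n_users)]
    prices = [rng.uniform(1, 9) for _ in range(n_users)]
    return rewards, prices


def average_over_runs(metric, n_users, runs=6, seed=0):
    """Average metric(rewards, prices) over independent parameter draws."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    values = [metric(*sample_parameters(n_users, rng)) for _ in range(runs)]
    return sum(values) / len(values)


# Placeholder metric: the mean reward of the sampled population.
mean_reward = average_over_runs(lambda a, p: sum(a) / len(a), n_users=50)
print(mean_reward)
```

Averaging over several draws reduces the effect of a single lucky or unlucky parameter sample on the reported total utility.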
Comparing the WoLF-MFGA algorithm with the Best Response (BR) algorithm:
In the following figures, the total utility obtained in wireless network power control by the mean-field gradient ascent algorithm with a variable learning rate (WoLF-MFGA) is compared with that of the best response (BR) algorithm. The total utility is defined as the sum of the payoffs of all users. In all figures, […] is small enough to ensure the convergence of the proposed WoLF-MFGA algorithm. The total utility is computed for the wireless network and converges to its value at the Nash equilibrium of the corresponding n-person non-cooperative game.
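For reference, a best response can be written in closed form under one assumed payoff, $A_i x_i / (x_i + S) - p_i x_i$, where $S$ is the total power of the other users. This payoff form and the function names are assumptions, not the payoff or BR procedure of the source; the sketch only shows what a best response computation looks like and how a symmetric equilibrium can be checked.

```python
import math


def best_response(reward, price, others_power):
    """Power x >= 0 maximising reward * x / (x + S) - price * x (assumed payoff)."""
    if others_power <= 0:
        raise ValueError("others' total power S must be positive")
    # Setting the derivative reward * S / (x + S)^2 - price to zero gives
    # x = sqrt(reward * S / price) - S, clipped at zero.
    return max(0.0, math.sqrt(reward * others_power / price) - others_power)


def symmetric_nash_power(reward, price, n_users):
    """Per-user power at the symmetric Nash equilibrium of the assumed game."""
    return reward * (n_users - 1) / (price * n_users ** 2)


n_users = 100
x_star = symmetric_nash_power(80_000.0, 5.0, n_users)
# At equilibrium, each user's power is a best response to the other 99 users.
x_check = best_response(80_000.0, 5.0, (n_users - 1) * x_star)
print(x_star, x_check)
```

Under the same assumed payoff, a user that treats the total power $Z$ as fixed stops adjusting when $A/Z = p$, which gives $A/(pn) = 160$ per user for these values instead of $158.4$; the gap shrinks as the number of users grows, consistent with the mean-field equilibrium being close to, but not identical with, the best response equilibrium.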
FIG. 2 shows the total utility of the WoLF-MFGA algorithm and the BR algorithm in the multi-user power control game under different learning parameters, where […]. As FIG. 2 shows, the total utility of both algorithms is determined by all users. The variable learning-rate parameter […] is set to 0.3, 0.5, and 0.8, respectively. Because best response learning is the theoretically best solution, it is likely to achieve higher utility than the proposed WoLF-MFGA algorithm. When […] is small, WoLF-MFGA learning approximates best response learning more closely. When […] is large, it has little effect on the total utility of the WoLF-MFGA algorithm, but the best response fluctuates more, which means that best response learning is not stable. Nevertheless, even though users have incomplete information about their competitors, the equilibrium solution of the power control game in the wireless network is very close to the best response solution. The main cause of the fluctuations is the randomness of the selected parameters. The total utility of the WoLF-MFGA algorithm remains very stable however the learning parameters are changed; that is, the algorithm is insensitive to these parameters and has good stability. In summary, the equilibrium solution of the WoLF-MFGA algorithm is close to the optimal solution of the multi-user power control game. In addition, as the number of users increases, competition among users in the network intensifies, and the total utility decreases.
FIG. 3 shows the total utility of the WoLF-MFGA algorithm and the BR algorithm in the multi-user power control game under different $A_i$ distributions, comparing the total utility for different ranges of $A_i$. The ranges of $A_i$ are [20000, 40000], [40000, 60000], [60000, 80000], [80000, 100000], [100000, 120000], and [120000, 140000]. In each case, the user's reward $A_i$ is assumed to be uniformly distributed within the given range. As $A_i$ increases, the $i$-th user obtains a larger reward for successfully transmitting power, which raises the total utility of the whole wireless network. Because the relationship between a user's payoff and its $A_i$ is linear, the total utility increases almost linearly with $A_i$.
FIG. 4 shows the total utility of the WoLF-MFGA algorithm and the BR algorithm in the multi-user power control game under different $q_i$ distributions, that is, for different ranges of $q_i$ in the wireless network. In each case, $q_i$ is uniformly distributed within the given range. $q_i$ corresponds to the unit price of wireless power control. It is readily seen that the total utility decreases as $q_i$ increases. At the same time, the total utility exhibits reciprocal behavior with respect to the range of $q_i$.
Sensitivity of the WoLF-MFGA algorithm convergence:
To analyze the convergence of the proposed algorithm, this embodiment studies the sensitivity of the results to the parameters […], the number of users $n$, and $A_i$. In all simulations, the number of iterations is chosen large enough to show the convergence of the proposed WoLF-MFGA algorithm.
FIG. 5 shows the evolution of the total utility under different learning-rate parameters, with […] equal to 0.1, 0.3, 0.5, 0.7, 0.8, and 1. As FIG. 5 shows, when the variable learning-rate parameter […] is small, the proposed WoLF-MFGA algorithm converges more slowly, and the smaller the value, the slower the convergence. When the variable learning rate is large, the fluctuation of the users' total utility is small and the WoLF-MFGA algorithm converges quickly.
FIG. 6 shows the evolution of the total utility under different wireless network sizes, set to 100, 300, 600, 900, 1200, and 1400 users. FIG. 6 shows that, as the number of iterations increases, the total utility for each network size tends to a stable point under the WoLF-MFGA algorithm. The more users there are in the wireless network multi-user power control game, the higher the total utility; conversely, the fewer users, the lower the total utility.
In FIG. 7, assume A_i = 7×10^4 and p_i ~ Uniform(1, 9). After about 100 iterations, the total utility converges. To analyze the behavior of the proposed WoLF-MFGA algorithm when the parameter A_i changes, A_i is increased by 60% at iteration 400. This abrupt change in the parameter value causes the wireless network system to move to a new equilibrium.
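The shape of this experiment can be illustrated with a minimal sketch. It is not the simulation of this embodiment: it assumes homogeneous users (a single price p = 5, the mean of the stated uniform range), plain gradient ascent without the WoLF rule, and a hypothetical payoff A·x/(x + (n−1)·Z) − p·x in which Z stands in for the average power of the other users. The names `simulate_shock`, `eta`, `z0` and `shock_factor` are illustrative.

```python
def simulate_shock(A=7e4, p=5.0, n=100, eta=1.0, iters=1000,
                   shock_at=400, shock_factor=1.6, z0=50.0):
    """Scalar mean field gradient ascent with a reward jump at `shock_at`.

    Assumption: all users are identical, so every power equals the mean
    field term z, and each payoff is A*x/(x + (n-1)*z) - p*x.
    """
    z = z0
    z_hist, utility_hist = [], []
    for k in range(iters):
        if k == shock_at:
            A *= shock_factor  # the reward is raised by 60%
        # Derivative of the assumed payoff in x, evaluated at x = z.
        grad = A * (n - 1) / (n * n * z) - p
        z = max(z + eta * grad, 1e-6)  # keep the power strictly positive
        z_hist.append(z)
        # Sum of the n identical payoffs at the symmetric profile x = z.
        utility_hist.append(n * (A / n - p * z))
    return z_hist, utility_hist


z_hist, utility_hist = simulate_shock()
print(z_hist[399], z_hist[-1])              # power level before and after the jump
print(utility_hist[399], utility_hist[-1])  # total utility before and after
```

Under these assumptions the fixed point is Z = A(n−1)/(n²p), so a 60% increase in A moves the system to a power level and a total utility that are both 60% higher.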
In this embodiment, behavior decisions in the multi-user power control problem of a wireless network are studied from the perspective of mean field game theory. For the n-person non-cooperative game in multi-user power control, a variable learning rate mean field gradient ascent (WoLF-MFGA) algorithm is designed by introducing a variable learning rate (the WoLF criterion) and a mean field term into the gradient ascent algorithm, and it is applied to reach the Nash equilibrium of the n-person non-cooperative game. The convergence of WoLF-MFGA is proved with tools from nonlinear analysis, and a sufficient condition for its convergence is given. Simulation experiments show that, with suitably chosen parameters, the WoLF-MFGA algorithm closely approximates the exact solution of the game, namely the solution obtained by the best response (BR) algorithm. To verify the convergence of the proposed algorithm, WoLF-MFGA is applied to reach the Nash equilibrium of the wireless network multi-user power control game, and the sensitivity of the algorithm to its parameters is analyzed. Extending the analysis method of this embodiment to other scenarios, such as shared resources, multi-agent collaboration, blockchain and the network economy, is left as future work.
The foregoing examples merely illustrate specific embodiments of the invention, which are described in greater detail and are not to be construed as limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention.

Claims (3)

1. A decision method for multi-user power control in a wireless network, comprising the steps of:
step 1, modeling the multi-user power control problem as an n-person non-cooperative game, and proving the existence of a Nash equilibrium of the multi-user power control game;
step 2, designing a mean field gradient ascent algorithm with the WoLF criterion (the WoLF-MFGA algorithm) by introducing a mean field term into gradient ascent with the WoLF criterion; the WoLF-MFGA algorithm allows the players to update their strategies based on the current gradient and a variable learning rate.
2. The decision-making method for multi-user power control in a wireless network according to claim 1, wherein said step 1 is specifically as follows:
assuming that all users are homogeneous and that the unit price of wireless power is the same for each user, the primary goal of each user is to maximize its own total utility; because each user's power transmission is random, the probability that a user succeeds in each round of power transmission is equal to the ratio of its transmit power to the total capacity of the wireless power:
where N = {1, ..., n} is the set of all users, n is the number of users, P_i^suc is the probability that the i-th user successfully transmits power, and x_i is the transmission power of user i on the carrier; let x_i ∈ X_i, where [...] and [...];
a utility function suitable for data applications is defined; the utility function of a user is defined as the ratio of its throughput to its transmitted power:
wherein q_i is the transmission power of the i-th user and T_i is the throughput of the i-th user, i.e. the net number of information bits transmitted without error per unit time (the effective transmission); the throughput is expressed as:
where L is the number of information bits and M is the total number of bits in a data packet; R_i and l_i are the transmission rate and the signal-to-interference-plus-noise ratio of the i-th user, respectively; f(l_i) is the utility function of successful power transmission, i.e. the probability that the transmitted power is received without error;
it is assumed that a transmission containing one or more bit errors is retransmitted; f(l_i) is an increasing, continuous, S-shaped function with f(∞) = 1, and f(0) = 0 is required to ensure that A_i = 0 when p_i = 0;
considering the cost of transmitting power, the payoff function of user i can be expressed as:
wherein p_i is the unit price that the i-th user pays to transmit power, and E_i is the expected revenue that user i obtains from successful power transmission;
the multi-user power control problem is modeled as an n-person non-cooperative game in which each user chooses how much power to transmit in the wireless network so as to maximize its own utility; based on the user set, the strategy sets and the payoff functions, the competition among the users is modeled as the following multi-user power control game:
Γ = <N, X_i, J_i>_{i∈N}
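The ingredients of this game model can be written out as a short sketch. The formulas themselves are not reproduced in the text, so everything below is a reading of the surrounding definitions under stated assumptions: the success probability is taken as the user's share of the total transmitted power, the throughput as (L/M)·R_i·f(l_i), the S-shaped function as f(l) = (1 − e^(−l))^M (one common choice satisfying f(0) = 0 and f(∞) = 1), and the expected revenue as E_i = A_i·P_i^suc. All function names are illustrative.

```python
import numpy as np


def success_probability(x):
    """Share of the total transmitted power held by each user (assumed form)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()


def efficiency(sinr, packet_bits):
    """Assumed S-shaped function f with f(0) = 0 and f(inf) = 1."""
    return (1.0 - np.exp(-np.asarray(sinr, dtype=float))) ** packet_bits


def throughput(info_bits, packet_bits, rate, sinr):
    """Assumed throughput T_i = (L / M) * R_i * f(l_i)."""
    return (info_bits / packet_bits) * rate * efficiency(sinr, packet_bits)


def utility(throughput_value, power):
    """Utility as the ratio of throughput to transmitted power."""
    return throughput_value / power


def payoff(x, reward, price):
    """Assumed payoff J_i = A_i * P_i^suc - p_i * x_i (revenue minus cost)."""
    x = np.asarray(x, dtype=float)
    return reward * success_probability(x) - price * x


powers = [1.0, 1.0, 2.0]
print(success_probability(powers))            # each user's share of the total
print(payoff(powers, reward=8.0, price=1.0))  # revenue minus transmission cost
```

Writing the payoff as a function of the whole power vector makes the non-cooperative structure explicit: each user controls only its own entry of `x`, but its payoff depends on every other entry through the shared denominator.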
3. The decision-making method for multi-user power control in a wireless network according to claim 2, wherein said step 2 is specifically as follows:
the gradient strategy update formula is as follows:
where [...] is the variable learning rate parameter and [...] is a constant; at the k-th iteration, the step size of the algorithm in the gradient direction is [...], where the update rule of [...] is as follows:
using mean field game theory, the behavior of a large number of users is analyzed by encapsulating it in a mean field term, which serves as an estimate of that behavior; its update formula is as follows:
wherein Z(k) is the mean field term at the k-th iteration;
substituting the mean field term into the gradient strategy update formula, the probability of successful power transmission of the i-th user is obtained as follows:
the payoff function of user i is now rewritten as:
substituting the rewritten payoff function into the gradient strategy update formula yields the following gradient learning strategy update formula:
the strategy update of each user is thus a function of the mean field term, and the update equation of WoLF mean field gradient ascent learning is rewritten as the following fixed-point iteration:
in the multi-user power control game Γ, each user encapsulates the effects of the other users in the mean field term and ignores its own effect on the mean field term.
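The overall loop described in step 2 can be sketched as follows. The update formulas are not reproduced in the text, so this is only an illustration under explicit assumptions, not the claimed algorithm: the mean field term Z(k) is taken as the average transmit power, the payoff as A_i·x_i/(x_i + (n−1)·Z) − p_i·x_i with Z held fixed when differentiating (the user ignores its own effect on Z), and the WoLF rule as "small step when the current strategy beats the user's running-average strategy, large step otherwise". The names `wolf_mfga`, `eta`, `delta_win`, `delta_lose`, `x_min` and `x_max` are hypothetical.

```python
import numpy as np


def payoff(x, z, A, p, n):
    """Assumed mean field payoff: reward times success probability minus cost."""
    return A * x / (x + (n - 1) * z) - p * x


def payoff_gradient(x, z, A, p, n):
    """Derivative of the assumed payoff in the user's own power x, with z fixed."""
    s = (n - 1) * z
    return A * s / (x + s) ** 2 - p


def wolf_mfga(A, p, x0, iters=1000, eta=1.0, delta_win=0.5, delta_lose=1.0,
              x_min=1e-3, x_max=1e4):
    """Projected gradient ascent with a WoLF-style variable step (a sketch)."""
    A, p = np.asarray(A, dtype=float), np.asarray(p, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    x_avg = x.copy()  # running average of each user's strategy
    total_utility = []
    for k in range(1, iters + 1):
        z = x.mean()  # mean field term Z(k), assumed to be the average power
        # "Winning" means the current strategy earns more than the average one.
        winning = payoff(x, z, A, p, n) > payoff(x_avg, z, A, p, n)
        delta = np.where(winning, delta_win, delta_lose)
        step = eta * delta * payoff_gradient(x, z, A, p, n)
        x = np.clip(x + step, x_min, x_max)  # project back onto [x_min, x_max]
        x_avg += (x - x_avg) / (k + 1)
        total_utility.append(payoff(x, x.mean(), A, p, n).sum())
    return x, total_utility


x_final, history = wolf_mfga(A=7e4, p=5.0, x0=np.full(100, 50.0))
print(x_final.mean(), history[-1])
```

For identical users this iteration has the fixed point x = A(n−1)/(n²p), which is the symmetric equilibrium of the assumed payoff; heterogeneous `A` and `p` arrays can be passed in the same way.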
CN202310529999.8A 2023-05-11 2023-05-11 Decision method for multi-user power control in wireless network Pending CN116822329A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310529999.8A CN116822329A (en) 2023-05-11 2023-05-11 Decision method for multi-user power control in wireless network

Publications (1)

Publication Number Publication Date
CN116822329A true CN116822329A (en) 2023-09-29

Family

ID=88123050

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310529999.8A Pending CN116822329A (en) 2023-05-11 2023-05-11 Decision method for multi-user power control in wireless network

Country Status (1)

Country Link
CN (1) CN116822329A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination