CN110933679B

CN110933679B - Robust D2D power control method under probability-based active eavesdropping

Info

Publication number: CN110933679B
Application number: CN201911267451.0A
Authority: CN
Inventors: 王金龙; 罗屹洁; 杨旸; 龚玉萍; 崔丽; 童晓兵
Original assignee: Army Engineering University of PLA
Current assignee: Army Engineering University of PLA
Priority date: 2019-12-11
Filing date: 2019-12-11
Publication date: 2023-05-02
Anticipated expiration: 2039-12-11
Also published as: CN110933679A

Abstract

A robust D2D power control method under probability-based active eavesdropping comprises: S1, initializing system parameters; S2, inner loop: each D2D user updates its power selection probabilities for the next time slot according to a random learning method and selects a discrete power value from its discrete power set according to those probabilities; S3, outer loop: the base station randomly selects an interference cost factor, enters the inner loop, and then updates its choice of interference cost factor for the next time slot according to a better-response strategy; the outer loop ends when its convergence condition is met, and the D2D users then perform power control according to the power selection probabilities output by the inner loop at that moment. The random-learning-based robust D2D power control method improves the physical-layer security of the cellular network, meets the data transmission requirements of D2D users, and accounts for the influence of the active eavesdropper's attack probability on the total achievable secrecy rate of the system.

Description

Robust D2D power control method under probability-based active eavesdropping

Technical Field

The invention belongs to the technical field of power control in wireless communication. In particular, it relates to a random-learning-based robust D2D power control method under probabilistic active eavesdropping, which improves the physical-layer security of a D2D-enabled cellular network by adopting a power update strategy based on a random learning automaton and better response.

Background

D2D communication is direct communication between cellular users in a mobile communication system without relaying through a base station. Because of the openness of the wireless channel and the size and hardware limitations of D2D devices, communication between D2D devices is more vulnerable to eavesdropping. According to information theory, when the channel between legitimate transceivers is better than the channel between the legitimate transmitter and the eavesdropper, the transmitted information can be kept perfectly secure, and the eavesdropper cannot intercept and decode it. In a cellular mobile network, the mutual interference between D2D users and cellular users can be exploited to improve the overall physical-layer security of the system, so that the communication requirements of D2D users are met while the physical-layer security of cellular users is guaranteed.

The attacker considered here is not a traditional passive eavesdropper that only listens to the information sent by legitimate users, but a probability-based active eavesdropper that can either eavesdrop passively or interfere actively, choosing its attack mode at random with a certain probability. Such an attacker is more aggressive and destructive, and makes power control for D2D communication in a mobile cellular network considerably more challenging.

Disclosure of Invention

For the situation in which an active eavesdropper with a probabilistic attack strategy exists in a cellular network, the invention provides a robust D2D power control method based on random learning. The method improves the physical-layer security of the cellular network, meets the data transmission requirements of D2D users, and accounts for the influence of the active eavesdropper's attack probability on the total achievable secrecy rate of the system.

The technical scheme of the invention is as follows:


The invention provides a robust D2D power control method under probabilistic active eavesdropping, which comprises the following steps:

S1, initializing system parameters: setting the iteration counters of the outer loop and of the inner loop; setting the probability with which each D2D user selects each transmission power value in the discrete power set; setting the interference cost factor set; the base station randomly selects one interference cost factor from the set and broadcasts it to all D2D users;

S2, performing inner-loop iteration: each D2D user updates its power selection probabilities for the next time slot according to a random learning method, and selects a discrete power value from the discrete power set according to those probabilities;

S3, performing outer-loop iteration: the base station randomly selects an interference cost factor, enters the inner loop, and then updates its choice of interference cost factor for the next time slot according to a better-response strategy; the outer loop ends when its convergence condition is met; the D2D users then perform power control according to the power selection probabilities output by the inner loop at that moment.

Further, step S1 specifically comprises: initializing system parameters: setting the outer-loop iteration counter to k, with initial value k = 0 and k' = k + 1 at each iteration; setting the inner-loop iteration counter to t, with initial value t = 0 and t' = t + 1 at each iteration;

setting the probability that each D2D user selects the $l$-th transmission power value, i.e. the $l$-th discrete power level, in the discrete power set as

where $i$ denotes the index of a D2D user and $N$ the total number of D2D users; $l$ denotes the index (level) of a transmission power value and $L$ the number of power values in the discrete power set of D2D user $i$; and setting an interference cost factor set, from which the base station randomly selects one interference cost factor and broadcasts it to all D2D users.

Further, the step S2 specifically includes:

S2-a, each D2D user $i$ observes the attack mode selected by the active eavesdropper A and reports it to the base station;

S2-b, if the active eavesdropper A is observed to select passive eavesdropping, calculating the data transmission rate $R_i$ of the D2D user:

where $i$ and $j$ denote indices of D2D users; $P_B$ and $P_i$ denote the transmit powers of the base station and of the $i$-th D2D user; $N_0$ denotes the background additive white Gaussian noise power; $h_{Bi}$, $h_{ji}$ and $h_{ii}$ denote the channel gains between the base station and the $i$-th D2D user, between the $j$-th D2D user and the $i$-th D2D user, and between the transmitter and receiver of the $i$-th D2D user;

if the active eavesdropper A is observed to select active interference, calculating the data transmission rate $R_{iJ}$ of the D2D user:

where $P_A$ denotes the interference power when the active eavesdropper A selects active interference, and $h_{Ai}$ denotes the channel gain between the active eavesdropper A and the $i$-th D2D user;

S2-c, each D2D user $i$ establishes its utility function $U_i$, expressed as a trade-off between the average achievable rate of all D2D users on the one hand, and the interference cost of the $i$-th D2D user to the cellular user plus its own power consumption on the other:

where:

E denotes eavesdropping, J denotes external interference, $a = E$ means the attacker adopts the eavesdropping attack mode, $a = J$ means the attacker adopts the interference attack mode, and I denotes internal interference; $R_I$ denotes the internal interference cost factor determined by the base station; $h_{iC}$ denotes the channel gain between the $i$-th D2D user and the cellular user C; C denotes the cellular user; and $C_D$ denotes the unit power consumption factor of the D2D users;

S2-d, updating the probability with which each D2D user selects the $l$-th discrete power level according to the following two formulas:

where $b$ denotes the learning rate of each D2D user $i$,

denotes the normalized utility function of the $i$-th D2D user in the $t$-th time slot, i.e. the current time slot; $U_{i\min}(t)$ and $U_{i\max}(t)$ denote the minimum and maximum utility function values available to the $i$-th D2D user in the $t$-th time slot; $U_i(t)$ is the current real-time utility function value; and $a_i(t)$ denotes the discrete power level selected by the $i$-th D2D user in the $t$-th time slot;

S2-e, selecting the transmission power value of each user from the discrete power set according to the updated power selection probability; repeating steps S2-a to S2-d until all D2D users meet the convergence condition, then stopping the inner-loop iteration and proceeding to step S3; the convergence condition is $\|p(t) - p(t-1)\| / \|p(t)\| < \varepsilon$, where $p(t)$ denotes the power selection probability vector of the corresponding D2D user in the $t$-th time slot.

Further, in step S2-e, the convergence condition is: the inner-loop iteration reaches the preset number of iterations, or the probability with which each D2D user selects the $l$-th discrete power level at the current moment becomes consistent with that at the previous moment.

Further, the probability is regarded as consistent with that at the previous moment when $\|p(t) - p(t-1)\| / \|p(t)\| < \varepsilon$, where $p(t)$ denotes the power selection probability vector of all D2D users in the $t$-th time slot and $\varepsilon = 0.01$; the preset number of iterations is 1000.
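As a non-limiting illustration, the inner-loop stopping rule can be sketched in Python as follows. The sketch reads the convergence condition as the relative change $\|p(t) - p(t-1)\| / \|p(t)\| < \varepsilon$ and assumes the Euclidean norm; both the reading and the function names `has_converged` and `inner_loop_should_stop` are assumptions made for illustration and are not stated in the text.

```python
import numpy as np


def has_converged(p_now, p_prev, eps=0.01):
    """Relative change of the probability vector between two consecutive slots.

    Assumption: Euclidean norm; the text does not name the norm.
    """
    p_now = np.asarray(p_now, dtype=float)
    p_prev = np.asarray(p_prev, dtype=float)
    change = np.linalg.norm(p_now - p_prev) / np.linalg.norm(p_now)
    return bool(change < eps)


def inner_loop_should_stop(p_now, p_prev, t, eps=0.01, max_iters=1000):
    """Stop when the preset iteration count is reached or the probabilities settle."""
    return t >= max_iters or has_converged(p_now, p_prev, eps)
```

The defaults `eps=0.01` and `max_iters=1000` are the values given above.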

Further, the step S3 specifically includes:

S3-a, if the active eavesdropper A selects the passive eavesdropping attack mode, calculating the achievable secrecy rate $R_{CS}$ of the cellular user according to the following formula:

$R_{CS} = \max(R_C - R_A, 0)$ (5)

where:

denotes the achievable rate of the cellular user C under passive eavesdropping,

denotes the eavesdropping rate of the active eavesdropper A; $h_{BC}$ and $h_{jC}$ denote the channel gains between the base station B and the cellular user C and between the $j$-th D2D user and the cellular user C, respectively; $h_{BA}$ and $h_{jA}$ denote the channel gains between the base station B and the active eavesdropper A and between the $j$-th D2D user and the active eavesdropper A, respectively;

if the active eavesdropper selects the active interference attack mode, calculating the achievable rate $R_{CJ}$ of the cellular user as follows:

where $h_{AC}$ denotes the channel gain between the active eavesdropper A and the cellular user C;

S3-b, establishing the utility function of the cellular user C:

where:

the utility function of the cellular user is expressed as the sum of its achievable secrecy rate and the interference cost of all D2D users to the cellular user;

S3-c, the base station randomly selects another interference cost factor $R'_I$ from the interference cost factor set, updates the selection of the interference cost factor according to the following formula, and jumps to the inner loop S2;

S3-d, randomly selecting an interference cost factor from the interference cost factor set; repeating steps S3-a to S3-c until the outer-loop iteration reaches the preset number of iterations.

Further, in step S3-d, the convergence condition is: the outer loop reaches the preset number of iterations, or the interference cost factor selected by the cellular user at the current moment is consistent with that at the previous moment; the preset number of iterations is 1000.
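As a non-limiting illustration, one outer-loop update of the interference cost factor can be sketched in Python as follows. The update formula itself is not reproduced in the text, so the sketch assumes a plain better-response rule: a randomly drawn alternative factor is kept only if it strictly increases the cellular user's utility. The names `better_response_step` and `utility_of` are hypothetical; `utility_of` stands for whatever procedure runs the inner loop for a given factor and returns the resulting utility of the cellular user.

```python
import random


def better_response_step(current_factor, factor_set, utility_of, rng=random):
    """Try another cost factor and keep it only if the utility improves.

    Assumption: deterministic better response; the source formula may differ
    (for example, it may accept the trial factor only with some probability).
    """
    candidates = [r for r in factor_set if r != current_factor]
    if not candidates:
        return current_factor
    trial = rng.choice(candidates)
    if utility_of(trial) > utility_of(current_factor):
        return trial
    return current_factor
```

Under this rule the utility of the retained factor never decreases from one outer iteration to the next, which is the property that distinguishes better response from a purely random re-selection.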

The beneficial effects of the invention are as follows:

The random-learning-based robust D2D power control method improves the physical-layer security of the cellular network, meets the data transmission requirements of D2D users, and accounts for the influence of the active eavesdropper's attack probability on the total achievable secrecy rate of the system.

Additional features and advantages of the invention will be set forth in the detailed description which follows.

Drawings

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular descriptions of exemplary embodiments of the invention as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the invention.

Fig. 1 is a general block diagram of a D2D robust power control system model.

Fig. 2 shows the convergence curve of the power selection probabilities of one D2D user.

Fig. 3 shows the convergence curves of the power selection strategies of all D2D users.

Fig. 4 shows the total utility of the D2D users as a function of the number of D2D users.

Fig. 5 shows the total utility of the legitimate users as a function of the attack probability of the active eavesdropper.

Detailed Description

Preferred embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein.

The system block diagram of the invention is shown in Fig. 1. The system is simulated in Matlab, and the parameter settings below do not affect generality. Consider a single cell with one base station located at the cell center; one cellular user, three D2D communication pairs and one active eavesdropper are randomly distributed in a square area of 1 km × 1 km, and the distance between each D2D transmitter and its receiver is set to 20 m. The transmit power of the base station and the interference power of the active eavesdropper are set to $P_B = P_A = 1$ W; the set of discrete power values of each D2D user is $P_i = [0.2, 0.4, 0.6, 0.8, 1]$ W; and the interference cost factor set of the base station is $R_I = [0.2, 0.4, 0.6, 0.8, 1]$. The large-scale path loss of the channel is modeled with a transmission loss model whose loss factor is set to 3, and the background additive white Gaussian noise power is set to $N_0 = 10^{-10}$ W.
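As a non-limiting illustration, the simulation scenario above can be set up in Python roughly as follows (the original simulation uses Matlab). Several details are assumptions not given in the text: the gain model $h = d^{-3}$ with distances clamped to at least 1 m, the fixed random seed, the random orientation of each D2D pair, and the uniform initial power selection probabilities. All variable and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed: an assumption, for repeatability

SIDE_M = 1000.0                       # square area of 1 km x 1 km
N_D2D = 3                             # three D2D communication pairs
D2D_DIST_M = 20.0                     # transmitter-receiver distance of each pair
PATH_LOSS_EXP = 3.0                   # loss factor
P_B = P_A = 1.0                       # base station / eavesdropper power, in W
N0 = 1e-10                            # background noise power, in W
POWER_SET = np.array([0.2, 0.4, 0.6, 0.8, 1.0])   # discrete D2D powers, in W
COST_SET = np.array([0.2, 0.4, 0.6, 0.8, 1.0])    # interference cost factors


def path_gain(tx, rx, alpha=PATH_LOSS_EXP):
    """Large-scale gain d^(-alpha); clamping d to 1 m is an assumption."""
    d = max(float(np.linalg.norm(np.asarray(tx) - np.asarray(rx))), 1.0)
    return d ** (-alpha)


# Node positions: base station at the cell center, everything else random.
bs = np.array([SIDE_M / 2, SIDE_M / 2])
cell_user = rng.uniform(0, SIDE_M, size=2)
eavesdropper = rng.uniform(0, SIDE_M, size=2)
d2d_tx = rng.uniform(0, SIDE_M, size=(N_D2D, 2))
angles = rng.uniform(0, 2 * np.pi, size=N_D2D)
# Receivers sit 20 m from their transmitters (they may fall just outside the square).
d2d_rx = d2d_tx + D2D_DIST_M * np.column_stack([np.cos(angles), np.sin(angles)])

# Direct-link gains of the D2D pairs and uniform initial selection probabilities.
h_ii = np.array([path_gain(d2d_tx[i], d2d_rx[i]) for i in range(N_D2D)])
p0 = np.full((N_D2D, POWER_SET.size), 1.0 / POWER_SET.size)
```

The remaining gains (base station to D2D receivers, D2D transmitters to the cellular user and to the eavesdropper, and so on) can be obtained with the same `path_gain` helper.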

The specific implementation steps are as follows:

1) Initializing system parameters: $k = 0$, $t = 0$

denotes the probability that the $i$-th D2D user selects the $l$-th transmission power value in the discrete power set;

2) Incrementing the outer-loop iteration counter: $k = k + 1$

3) The base station randomly selects one interference cost factor from the interference cost factor set and broadcasts it to all D2D users;

4) Incrementing the inner-loop iteration counter: $t = t + 1$

5) In the $t$-th iteration, each D2D user selects a discrete power value from its set of discrete power values according to its power selection probability;

6) Each D2D user updates the power selection probability of the next slot according to a random learning algorithm, which specifically comprises the following steps:

a) Observing the attack mode selection of the active eavesdropper and reporting the attack mode selection to the base station;

b) If the active eavesdropper is observed to select passive eavesdropping, calculating the data transmission rate of the $i$-th D2D user:

where $i$ and $j$ denote indices of D2D users; $P_B$ and $P_i$ denote the transmit powers of the base station and of the $i$-th D2D user; $N_0$ denotes the background additive white Gaussian noise power; $h_{Bi}$, $h_{ji}$ and $h_{ii}$ denote the channel gains between the base station and the $i$-th D2D user, between the $j$-th D2D user and the $i$-th D2D user, and between the transmitter and receiver of the $i$-th D2D user;

c) If the active eavesdropper is observed to select active interference, calculating the data transmission rate of the $i$-th D2D user:

where $P_A$ denotes the interference power when the active eavesdropper A selects active interference, and $h_{Ai}$ denotes the channel gain between the active eavesdropper A and the $i$-th D2D user;
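As a non-limiting illustration, the two D2D rates of steps b) and c) can be sketched in Python as follows. The rate formulas are not reproduced in the text, so the sketch assumes the usual Shannon form $\log_2(1 + \mathrm{SINR})$ with unit bandwidth, treating the base station signal, the other D2D transmissions and, under active interference, the jamming signal as noise. The function name `d2d_rate` and its argument layout are hypothetical.

```python
import math


def d2d_rate(i, P, h_dd, h_Bd, P_B, N0, jam=0.0):
    """Assumed Shannon rate (bit/s/Hz) of D2D user i.

    P[j]       : transmit power of D2D user j
    h_dd[j][i] : gain from D2D transmitter j to D2D receiver i (h_ji, h_ii)
    h_Bd[i]    : gain from the base station to D2D receiver i (h_Bi)
    jam        : received jamming power P_A * h_Ai; 0 under passive eavesdropping
    """
    interference = P_B * h_Bd[i] + sum(
        P[j] * h_dd[j][i] for j in range(len(P)) if j != i
    )
    sinr = P[i] * h_dd[i][i] / (N0 + interference + jam)
    return math.log2(1.0 + sinr)
```

Calling the function with `jam=0.0` corresponds to the passive eavesdropping case, and with `jam=P_A * h_Ai` to the active interference case; with all other inputs equal, the second rate is never larger than the first.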

d) Designing the utility function of the $i$-th D2D user:

where:

the utility function of a D2D user is expressed as a trade-off between the average achievable rate of all D2D users on the one hand, and the interference cost of the $i$-th D2D user to the cellular user plus its own power consumption on the other;

E denotes eavesdropping, J denotes external interference, $a = E$ means the attacker adopts the eavesdropping attack mode, $a = J$ means the attacker adopts the interference attack mode, and I denotes internal interference; $R_I$ denotes the internal interference cost factor determined by the base station; $h_{iC}$ denotes the channel gain between the $i$-th D2D user and the cellular user C; C denotes the cellular user; and $C_D$ denotes the unit power consumption factor of the D2D users.
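As a non-limiting illustration, the verbal description of the D2D utility can be sketched in Python as follows. The utility formula is not reproduced in the text; the sketch takes the description literally and assumes the form "average rate of all D2D users, minus $R_I P_i h_{iC}$, minus $C_D P_i$", with the rates evaluated under the attack mode observed in the current slot. The name `d2d_utility` and the exact functional form are assumptions.

```python
def d2d_utility(rates, p_i, h_iC, R_I, C_D):
    """Assumed utility of D2D user i in one slot.

    rates : achievable rates of all D2D users under the observed attack mode
    p_i   : transmit power chosen by user i
    h_iC  : gain from user i to the cellular user C
    R_I   : interference cost factor broadcast by the base station
    C_D   : unit power consumption factor
    """
    avg_rate = sum(rates) / len(rates)
    interference_cost = R_I * p_i * h_iC   # price paid for disturbing user C
    power_cost = C_D * p_i                 # cost of the user's own consumption
    return avg_rate - interference_cost - power_cost
```

For example, with rates of 2 and 4 bit/s/Hz, $P_i = 0.5$, $h_{iC} = 0.1$, $R_I = 0.4$ and $C_D = 1$, this form gives $3 - 0.02 - 0.5 = 2.48$.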

e) All D2D users monitor the attack mode of the active eavesdropper, estimate their respective utility functions, and update the probability with which the $i$-th D2D user selects the $l$-th discrete power level according to the following two formulas:

where $b$ denotes the learning rate of each D2D user $i$,

denotes the normalized utility function of the $i$-th D2D user in the $t$-th time slot, i.e. the current time slot; $U_{i\min}(t)$ and $U_{i\max}(t)$ denote the minimum and maximum utility function values available to the $i$-th D2D user in the $t$-th time slot; $U_i(t)$ is the current real-time utility function value; and $a_i(t)$ denotes the discrete power level selected by the $i$-th D2D user in the $t$-th time slot.
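As a non-limiting illustration, the probability update of step e) can be sketched in Python as follows. The two update formulas are not reproduced in the text; the sketch assumes the standard linear reward-inaction rule of a stochastic learning automaton, driven by the min-max normalized utility, which matches the symbols defined above ($b$, $U_{i\min}$, $U_{i\max}$, $U_i$, $a_i$). The function names and the default learning rate `b=0.1` are assumptions.

```python
import numpy as np


def normalized_utility(u, u_min, u_max):
    """Min-max normalization of the current utility to the range [0, 1]."""
    if u_max == u_min:
        return 0.0
    return (u - u_min) / (u_max - u_min)


def update_probabilities(p, chosen, r, b=0.1):
    """Assumed linear reward-inaction update for one D2D user.

    p      : current selection probabilities over the L power levels
    chosen : index a_i(t) of the power level used in this slot
    r      : normalized utility in [0, 1] obtained in this slot
    b      : learning rate in (0, 1)
    """
    p = np.asarray(p, dtype=float)
    e = np.zeros_like(p)
    e[chosen] = 1.0
    # The chosen level gains b*r*(1 - p); every other level loses b*r*p.
    return p + b * r * (e - p)
```

With this rule the probabilities remain non-negative and still sum to one after every update, and a slot that yields zero normalized utility leaves them unchanged.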

7) The inner loop proceeds until the convergence condition $\|p(t) - p(t-1)\| / \|p(t)\| < \varepsilon$ is satisfied, where $p(t)$ denotes the power selection probability vector of all D2D users in the $t$-th time slot and $\varepsilon = 0.01$, or until the preset number of 1000 iterations is reached.

8) In the outer loop, the base station calculates the achievable secrecy rate of the cellular user according to the attack mode of the active eavesdropper reported by the D2D users;

9) If the active eavesdropper selects the passive eavesdropping attack mode, the achievable secrecy rate of the cellular user is calculated as follows:

$R_{CS} = \max(R_C - R_A, 0)$ (5)

where:

denotes the achievable rate of the cellular user under passive eavesdropping,

denotes the eavesdropping rate of the active eavesdropper; $h_{BC}$ and $h_{jC}$ denote the channel gains between the base station B and the cellular user C and between the $j$-th D2D user and the cellular user C, respectively; $h_{BA}$ and $h_{jA}$ denote the channel gains between the base station B and the active eavesdropper A and between the $j$-th D2D user and the active eavesdropper A, respectively.

10) If the active eavesdropper selects the active interference attack mode, the achievable rate of the cellular user is calculated as follows:

where $h_{AC}$ denotes the channel gain between the active eavesdropper and the cellular user.
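As a non-limiting illustration, the cellular user's rates in steps 9) and 10) can be sketched in Python as follows. Only formula (5) is reproduced in the text; the expressions for $R_C$, $R_A$ and $R_{CJ}$ are assumed here to be Shannon rates with unit bandwidth in which the D2D transmissions act as interference at both the cellular user and the eavesdropper. All function names are hypothetical.

```python
import math


def cellular_rate(P_B, h_BC, P, h_dC, N0, jam=0.0):
    """Assumed rate of cellular user C; jam = P_A * h_AC under active interference."""
    interference = sum(p * h for p, h in zip(P, h_dC))   # sum of P_j * h_jC
    return math.log2(1.0 + P_B * h_BC / (N0 + interference + jam))


def eavesdrop_rate(P_B, h_BA, P, h_dA, N0):
    """Assumed rate at which eavesdropper A can decode the base station signal."""
    interference = sum(p * h for p, h in zip(P, h_dA))   # sum of P_j * h_jA
    return math.log2(1.0 + P_B * h_BA / (N0 + interference))


def secrecy_rate(r_c, r_a):
    """Formula (5): R_CS = max(R_C - R_A, 0)."""
    return max(r_c - r_a, 0.0)
```

The sketch makes the role of the D2D interference visible: raising a D2D power lowers both `cellular_rate` and `eavesdrop_rate`, so the secrecy rate improves whenever the eavesdropper is hurt more than the cellular user.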

11) The cellular user calculates its utility function:

where:

the utility function of the cellular user is expressed as the sum of its achievable secrecy rate and the interference cost of all D2D users to the cellular user.
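As a non-limiting illustration, the cellular user's utility can be sketched in Python as follows. The formula is not reproduced in the text; the sketch assumes the form "rate under the observed attack mode plus $R_I \sum_i P_i h_{iC}$", using the secrecy rate under eavesdropping ($a = E$) and the jammed rate under active interference ($a = J$). The function name and the mode labels are assumptions.

```python
def cellular_utility(mode, r_cs, r_cj, R_I, P, h_dC):
    """Assumed utility of cellular user C in one slot.

    mode : "E" for passive eavesdropping, "J" for active interference
    r_cs : secrecy rate R_CS, used when mode == "E"
    r_cj : achievable rate R_CJ, used when mode == "J"
    R_I  : interference cost factor
    P    : transmit powers of the D2D users
    h_dC : gains from each D2D user to the cellular user (h_iC)
    """
    rate = r_cs if mode == "E" else r_cj
    interference_income = R_I * sum(p * h for p, h in zip(P, h_dC))
    return rate + interference_income
```

The second term is the counterpart of the interference cost that each D2D user pays in its own utility, which is what couples the base station's choice of $R_I$ to the D2D users' power choices.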

12) The base station randomly selects another interference cost factor $R'_I$ from the interference cost factor set, jumps to the inner loop at step 4), and then updates the selection of its interference cost factor according to the following equation:

13) Returning to step 2) until the preset number of 1000 iterations is reached.
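As a non-limiting illustration, the overall two-loop procedure described in the steps above can be sketched in Python as follows. The sketch makes several assumptions that are not in the text: the inner loop uses only the preset-iteration branch of the stopping rule, utilities are normalized with the running minimum and maximum observed so far, the probability update is linear reward-inaction, and the outer loop keeps a trial cost factor only if it raises the cellular user's utility. The callables `d2d_utilities` and `cell_utility` are hypothetical placeholders for the utility computations.

```python
import numpy as np


def inner_loop(cost_factor, d2d_utilities, power_set, n_users, rng,
               b=0.1, inner_iters=1000):
    """Random-learning power selection for a fixed interference cost factor."""
    L = len(power_set)
    p = np.full((n_users, L), 1.0 / L)            # uniform start (assumption)
    u_lo = np.full(n_users, np.inf)
    u_hi = np.full(n_users, -np.inf)
    for _ in range(inner_iters):
        # Each user draws a power level from its own probability vector.
        a = np.array([rng.choice(L, p=p[i]) for i in range(n_users)])
        u = np.asarray(d2d_utilities(power_set[a], cost_factor), dtype=float)
        u_lo, u_hi = np.minimum(u_lo, u), np.maximum(u_hi, u)
        span = u_hi - u_lo
        r = np.where(span > 0, (u - u_lo) / np.where(span > 0, span, 1.0), 0.0)
        for i in range(n_users):
            e = np.zeros(L)
            e[a[i]] = 1.0
            p[i] += b * r[i] * (e - p[i])         # reward-inaction update
    return p


def outer_loop(cost_set, cell_utility, d2d_utilities, power_set, n_users, rng,
               outer_iters=1000, **inner_kwargs):
    """Better-response search over the interference cost factor set."""
    cost_set = list(cost_set)
    R_I = cost_set[rng.integers(len(cost_set))]
    p = inner_loop(R_I, d2d_utilities, power_set, n_users, rng, **inner_kwargs)
    best = cell_utility(p, R_I)
    for _ in range(outer_iters):
        others = [c for c in cost_set if c != R_I]
        trial = others[rng.integers(len(others))]
        p_trial = inner_loop(trial, d2d_utilities, power_set, n_users, rng,
                             **inner_kwargs)
        u_trial = cell_utility(p_trial, trial)
        if u_trial > best:                        # keep only improvements
            R_I, p, best = trial, p_trial, u_trial
    return R_I, p
```

In a usage such as `outer_loop(COST_SET, cell_utility, d2d_utilities, POWER_SET, 3, np.random.default_rng(0))`, `d2d_utilities(powers, R_I)` would return one utility per D2D user and `cell_utility(p, R_I)` the cellular user's utility for the learned probabilities; both must be supplied by the caller.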

Figs. 2 to 5 are simulation curves of the invention. Figs. 2 and 3 show, for one simulation run, the convergence of the power selection probabilities of one D2D user and of the power selection strategies of all D2D users. Under the proposed random-learning-based discrete power selection algorithm, the power selection probability vector of the single D2D user converges after about 500 iterations, and the power selection strategies of all D2D users converge after about 800 iterations. The results in Figs. 4 and 5 are averages over $10^5$ independent trials. Fig. 4 compares, as a function of the number of D2D users, the total utility of the D2D users under the proposed random-learning-based discrete power control algorithm with that under a selfish power control algorithm, in which each D2D user maximizes its own utility, and under a random power selection algorithm. Fig. 5 compares the total utility of all users in the system under the same three algorithms as a function of the active eavesdropper's attack probability. Figs. 4 and 5 show that the proposed algorithm outperforms the other two, that the total utility of the D2D users increases with the number of D2D users, and that the total utility of the legitimate users decreases as the probability that the active eavesdropper selects active interference increases.

The foregoing description of embodiments of the invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described.

Claims

1. A robust D2D power control method under probabilistic active eavesdropping, comprising the steps of:

S3, performing outer-loop iteration: the base station randomly selects an interference cost factor, enters the inner loop, and then updates its choice of interference cost factor for the next time slot according to a better-response strategy; the outer loop ends when its convergence condition is met; the D2D users then perform power control according to the power selection probabilities output by the inner loop at that moment;

the step S2 specifically comprises the following steps:

if the active eavesdropper A is observed to select active interference, calculating the data transmission rate $R_{iJ}$ of the D2D user:

wherein:


where $b$ denotes the learning rate of each D2D user $i$,

denotes the normalized utility function of the $i$-th D2D user in the $t$-th time slot, i.e. the current time slot; $U_{i\min}(t)$ and $U_{i\max}(t)$ denote the minimum and maximum utility function values available to the $i$-th D2D user in the $t$-th time slot; $U_i(t)$ is the current real-time utility function value; and $a_i(t)$ denotes the discrete power level selected by the $i$-th D2D user in the $t$-th time slot;

S2-e, selecting the transmission power value of each user from the discrete power set according to the updated power selection probability; repeating steps S2-a to S2-d until all D2D users meet the convergence condition, then stopping the inner-loop iteration and proceeding to step S3; the convergence condition is $\|p(t) - p(t-1)\| / \|p(t)\| < \varepsilon$, where $p(t)$ denotes the power selection probability vector of the D2D user in the $t$-th time slot and $\varepsilon$ denotes the convergence threshold.

2. The robust D2D power control method under probabilistic active eavesdropping of claim 1, wherein step S1 is specifically: initializing system parameters: setting the iteration number of the outer layer loop as k, wherein the initial value k=0, and k' =k+1 during iteration; setting the iteration number of the inner layer loop as t, wherein the initial value t=0, and t' =t+1 during iteration;

3. The robust D2D power control method under probabilistic active eavesdropping of claim 1, wherein in step S2-e the convergence condition is: the inner-loop iteration reaches the preset number of iterations, or the probability with which each D2D user selects the $l$-th discrete power level at the current moment becomes consistent with that at the previous moment.

4. The robust D2D power control method under probabilistic active eavesdropping of claim 3, wherein the probability is regarded as consistent with that at the previous moment when $\|p(t) - p(t-1)\| / \|p(t)\| < \varepsilon$, where $p(t)$ denotes the power selection probability vector of the D2D user in the $t$-th time slot and $\varepsilon = 0.01$.

5. The method of claim 3, wherein the predetermined number of times is 1000.

6. The robust D2D power control method under probabilistic active eavesdropping of claim 1, wherein step S3 is specifically:

S3-a, if the active eavesdropper A selects the passive eavesdropping attack mode, calculating the achievable secrecy rate $R_{CS}$ of the cellular user according to the following formula:

$R_{CS} = \max(R_C - R_A, 0)$ (5)

Wherein:

s3-b, establishing a utility function of the cellular user C:

wherein:

S3-c, the base station randomly selects another interference cost factor $R'_I$ from the interference cost factor set, updates the selection of the interference cost factor according to the following formula, and jumps to the inner loop S2;

S3-d, randomly selecting an interference cost factor from the interference cost factor set; repeating steps S3-a to S3-c until the outer loop satisfies the convergence condition.

7. The robust D2D power control method under probabilistic active eavesdropping of claim 6, wherein in step S3-d the convergence condition is: the outer loop reaches the preset number of iterations, or the interference cost factor selected by the cellular user at the current moment is consistent with that at the previous moment; the preset number of iterations is 1000.