CN110933679A

CN110933679A - Robust D2D power control method under active eavesdropping according to probability

Info

Publication number: CN110933679A
Application number: CN201911267451.0A
Authority: CN
Inventors: 王金龙; 罗屹洁; 杨旸; 龚玉萍; 崔丽; 童晓兵
Original assignee: PLA University of Science and Technology
Current assignee: PLA University of Science and Technology
Priority date: 2019-12-11
Filing date: 2019-12-11
Publication date: 2020-03-27
Anticipated expiration: 2039-12-11
Also published as: CN110933679B

Abstract

A robust D2D power control method based on random learning for resisting probabilistic active eavesdropping comprises: S1, initializing system parameters; S2, inner loop: each D2D user updates its power selection probability for the next time slot according to a random learning method and selects a discrete power value from the discrete power set according to that probability; S3, outer loop: the base station randomly selects an interference cost factor, switches to the inner loop, and then updates its selection of the interference cost factor for the next time slot according to a better-response strategy; the loop ends when the outer loop meets the convergence condition, and each D2D user then controls its power according to the power selection probability output by the inner loop at the current moment. The method improves the physical-layer security performance of the cellular network, accommodates the data transmission requirements of D2D users, and accounts for the influence of the active eavesdropper's attack probability on the total secure achievable rate of the system.

Description

Robust D2D power control method under active eavesdropping according to probability

Technical Field

The invention belongs to the technical field of power control in wireless communication. It relates to a robust D2D power control method based on random learning under probabilistic active eavesdropping, and in particular to a method for improving the physical-layer security performance of a D2D cellular network by adopting a power updating strategy based on a random learning automaton and better response.

Background

D2D (device-to-device) communication is direct communication between cellular users in a mobile communication system without forwarding through the base station. Due to the openness of the wireless channel and the size and hardware limitations of D2D devices, communication between D2D devices is more susceptible to eavesdropping. According to information theory, when the channel gain between the legitimate transceivers is better than the channel gain between the legitimate transmitter and the eavesdropper, the security of the transmitted information can be perfectly ensured, and the information cannot be intercepted and decoded by the eavesdropper. In a cellular mobile communication network, the mutual interference between D2D users and cellular users can be exploited to improve the overall physical-layer security performance of the system. At the same time, the communication requirements of D2D users can be taken into account while the physical-layer security performance of cellular users is ensured.

The attacker considered here no longer merely eavesdrops passively on information sent by legitimate users, as a traditional passive eavesdropper does. Instead, it is a probabilistic active eavesdropper that can either eavesdrop passively or interfere actively, randomly selecting its attack strategy with a certain probability. Such an attacker is more aggressive and destructive, and poses great challenges to power control for D2D communication in mobile cellular networks.

Disclosure of Invention

Aiming at the case where an active eavesdropper that attacks according to probability exists in a cellular network, the invention provides a robust D2D power control method based on random learning. The method improves the physical-layer security performance of the cellular network, accommodates the data transmission requirements of D2D users, and accounts for the influence of the active eavesdropper's attack probability on the total secure achievable rate of the system.

The technical solution of the invention is as follows:

The invention provides a robust D2D power control method based on random learning for resisting probabilistic active eavesdropping, which comprises the following steps:

S1, initializing system parameters: setting the iteration counters of the outer loop and the inner loop, the probability with which each D2D user selects each transmission power value in the discrete power set, and the set of interference cost factors; the base station randomly selects one factor from the interference cost factor set and broadcasts it to all D2D users;

S2, performing the inner loop iteration: each D2D user updates its power selection probability for the next time slot according to a random learning method, and selects a discrete power value from the discrete power set according to the power selection probability;

S3, performing the outer loop iteration: the base station randomly selects an interference cost factor, switches to the inner loop, and then updates its selection of the interference cost factor for the next time slot according to a better-response strategy; the loop ends when the outer loop meets the convergence condition, and each D2D user controls its power according to the power selection probability output by the inner loop at the current moment.

Further, step S1 is specifically: initializing system parameters: denoting the iteration counter of the outer loop as k, with initial value k = 0 and update k = k + 1 at each iteration; denoting the iteration counter of the inner loop as t, with initial value t = 0 and update t = t + 1 at each iteration;

the probability that each D2D user i selects the l-th transmission power value, namely the l-th discrete power level, in the discrete power set is set as

i denotes the index of a D2D user and N denotes the total number of D2D users; l denotes the index of a transmission power value, i.e. the power level, and L denotes the number of power values in the discrete power set of D2D user i; an interference cost factor set is set, and the base station randomly selects one factor from the interference cost factor set and broadcasts it to all D2D users.
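The initialization in step S1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the initial probabilities are not reproduced in this text, so a uniform value 1/L is assumed, and the example cost-factor values and the names `init_system`, `probs` and `cost_factor` are hypothetical.

```python
import random

def init_system(num_users, num_levels, cost_factors, seed=None):
    """Set up counters, selection probabilities and the broadcast cost factor."""
    rng = random.Random(seed)
    # Assumption: every D2D user starts with a uniform distribution over its
    # L discrete power levels.
    probs = [[1.0 / num_levels] * num_levels for _ in range(num_users)]
    # The base station picks one interference cost factor at random and
    # broadcasts it to all D2D users.
    cost_factor = rng.choice(cost_factors)
    return {"k": 0, "t": 0, "probs": probs, "cost_factor": cost_factor}

# Hypothetical example: N = 3 users, L = 4 power levels, three cost factors.
state = init_system(num_users=3, num_levels=4,
                    cost_factors=[0.1, 0.5, 1.0], seed=0)
```

Each row of `state["probs"]` is one user's probability vector over its power levels, which is the quantity the inner loop later updates.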

Further, step S2 is specifically:

S2-a, each D2D user i observes the attack mode selected by the active eavesdropper A and reports it to the base station;

S2-b, if it is observed that the active eavesdropper A chooses passive eavesdropping, the data transmission rate R_i of the D2D user is calculated:

where: i and j denote indices of D2D users; P_B and P_i denote the transmit powers of the base station and the i-th D2D user, respectively; N₀ denotes the background additive white Gaussian noise power; h_Bi, h_ji and h_ii denote the channel gains between the base station and the i-th D2D user, between the j-th D2D user and the i-th D2D user, and between the transmitter and receiver of the i-th D2D user, respectively;

if it is observed that the active eavesdropper A selects active interference, the data transmission rate R_iJ of the D2D user is calculated:

where: P_A denotes the interference power used when the active eavesdropper A selects active interference, and h_Ai denotes the channel gain between the active eavesdropper A and the i-th D2D user;

S2-c, establishing a utility function U_i for each D2D user i; the utility function is expressed as a trade-off between the total average achievable rate of all D2D users on the one hand, and the interference cost imposed by the i-th D2D user on the cellular user together with its own power consumption on the other:

wherein:

E denotes eavesdropping, J denotes external interference, a = E denotes that the attacker adopts the eavesdropping attack mode, a = J denotes that the attacker adopts the interference attack mode, and I denotes internal interference; R_I denotes the internal interference cost factor determined by the base station; h_iC denotes the channel gain between the i-th D2D user and cellular user C; C denotes the cellular user, and C_D denotes the specific power consumption factor of the D2D user;

S2-d, each D2D user updates the probability of selecting the l-th discrete power level according to the following two equations:

where: b denotes the learning rate of each D2D user i,

denotes the normalized utility function of the i-th D2D user in the t-th time slot, i.e. the current time slot; U_imin(t) and U_imax(t) denote the minimum and maximum utility function values that the i-th D2D user can achieve in the t-th time slot, respectively; U_i(t) is the currently obtained real-time utility function value; and a_i(t) denotes the discrete power level selected by the i-th D2D user in the t-th time slot;

S2-e, selecting the transmission power value of the corresponding user from the discrete power set according to the updated power selection probability; repeating steps S2-a to S2-d until all D2D users meet the convergence condition, then stopping the inner loop iteration and proceeding to step S3; the convergence condition is: ||p(t) - p(t-1)|| / ||p(t)|| < ε, where p(t) denotes the power selection probability vector of the D2D users in the t-th time slot.
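The two rate calculations in step S2-b can be illustrated with the following Python sketch. The rate expressions themselves are not reproduced in this text, so the standard Shannon form log2(1 + SINR) is assumed here, with the interference terms taken from the symbol definitions above (base station, other D2D users and, under active interference, the eavesdropper's jamming power). The function name `d2d_rate` and all numeric values are hypothetical.

```python
import math

def d2d_rate(i, powers, h_d2d, h_bs, p_bs, noise, p_jam=0.0, h_jam=None):
    """Assumed Shannon-type rate (bit/s/Hz) of D2D pair i.

    powers[j]   : transmit power P_j of D2D user j
    h_d2d[j][i] : channel gain h_ji from D2D transmitter j to D2D receiver i
    h_bs[i]     : channel gain h_Bi from the base station to D2D receiver i
    p_bs, noise : base-station power P_B and noise power N0
    p_jam, h_jam: jamming power P_A and gains h_Ai (active interference only)
    """
    interference = p_bs * h_bs[i]
    interference += sum(powers[j] * h_d2d[j][i]
                        for j in range(len(powers)) if j != i)
    if h_jam is not None:
        # Active interference: the eavesdropper's signal adds to the interference.
        interference += p_jam * h_jam[i]
    sinr = powers[i] * h_d2d[i][i] / (noise + interference)
    return math.log2(1.0 + sinr)

# Toy numbers (hypothetical) for two D2D pairs.
powers = [0.1, 0.1]
h_d2d = [[1e-4, 1e-8], [1e-8, 1e-4]]
h_bs = [1e-9, 1e-9]
r_eavesdrop = d2d_rate(0, powers, h_d2d, h_bs, p_bs=1.0, noise=1e-10)
r_jammed = d2d_rate(0, powers, h_d2d, h_bs, p_bs=1.0, noise=1e-10,
                    p_jam=1.0, h_jam=[1e-8, 1e-8])
```

Under this assumed form, the only difference between the two attack modes is the extra jamming term in the denominator, so the rate under active interference can never exceed the rate under passive eavesdropping for the same powers.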

Further, in step S2-e, the convergence condition is: the inner loop iteration reaches the preset number of times, or the probability with which each D2D user selects the l-th discrete power level at the current moment is consistent with the probability at the previous moment.

Further, the probability at the current moment is regarded as consistent with that at the previous moment when ||p(t) - p(t-1)|| / ||p(t)|| < ε, where p(t) denotes the power selection probability vector of all D2D users in the t-th time slot and ε = 0.01; the preset number of times is 1000.
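The probability update of step S2-d and the convergence test above can be sketched as follows. The two update equations are not reproduced in this text; the sketch assumes the common linear reward-inaction rule of stochastic learning automata, a min-max normalization of the utility, and the Euclidean norm in the convergence test. The function names are hypothetical.

```python
import math
import random

def normalized_utility(u, u_min, u_max):
    """Map a utility value into [0, 1] (assumed min-max normalization)."""
    if u_max == u_min:
        return 0.0
    return (u - u_min) / (u_max - u_min)

def choose_level(probs, rng):
    """Draw a power-level index according to the selection probabilities."""
    return rng.choices(range(len(probs)), weights=probs)[0]

def sla_update(probs, chosen, reward, b):
    """Linear reward-inaction update (assumed form of the two equations).

    The chosen level gains probability in proportion to the reward;
    every other level loses probability so that the vector still sums to 1.
    """
    return [p + b * reward * (1.0 - p) if level == chosen
            else p - b * reward * p
            for level, p in enumerate(probs)]

def has_converged(p_new, p_old, eps=0.01):
    """Check ||p(t) - p(t-1)|| / ||p(t)|| < eps (Euclidean norm assumed)."""
    diff = math.sqrt(sum((a - c) ** 2 for a, c in zip(p_new, p_old)))
    norm = math.sqrt(sum(a * a for a in p_new))
    return diff / norm < eps

rng = random.Random(0)
p_old = [0.25, 0.25, 0.25, 0.25]
level = choose_level(p_old, rng)
reward = normalized_utility(3.0, u_min=1.0, u_max=5.0)
p_new = sla_update(p_old, level, reward, b=0.1)
```

With a reward in [0, 1] and a learning rate b in (0, 1), this update keeps every entry non-negative and the vector normalized, which is why no explicit renormalization step appears.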

Further, step S3 is specifically:

S3-a, if the active eavesdropper A selects the attack mode of passive eavesdropping, the secure achievable rate R_CS of the cellular user is calculated according to the following formula:

R_CS＝max(R_C-R_A,0) (5)

Wherein:

R_C denotes the achievable rate of cellular user C under passive eavesdropping,

R_A denotes the eavesdropping rate of the active eavesdropper A; h_BC and h_jC denote the channel gains between base station B and cellular user C and between the j-th D2D user and cellular user C, respectively; h_BA and h_jA denote the channel gains between base station B and the active eavesdropper A and between the j-th D2D user and the active eavesdropper A, respectively;

if the active eavesdropper selects the attack mode of active interference, the achievable rate R_CJ of the cellular user is calculated according to the following formula:

where: h_AC denotes the channel gain between the active eavesdropper A and cellular user C;

S3-b, establishing the utility function of cellular user C:

wherein:

the utility function of the cellular user is expressed as the sum of its secure achievable rate and the interference cost of all D2D users to the cellular user;

S3-c, the base station randomly selects another interference cost factor R_I' from the interference cost factor set, jumps to the inner loop S2, and then updates its selection of the interference cost factor according to the following formula:

S3-d, randomly selecting an interference cost factor from the interference cost factor set; steps S3-a to S3-c are repeated until the outer loop iteration reaches the preset number of times.

Further, in step S3-d, the convergence condition is: the outer loop iteration reaches the preset number of times, or the interference cost factor selected by the cellular user at the current time is consistent with that at the previous time; the preset number of times is 1000.
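The cellular-user rates of step S3-a can be illustrated with a short Python sketch. Formula (5), R_CS = max(R_C - R_A, 0), is taken from the text; the expressions for R_C, R_A and R_CJ are not reproduced in this text, so the Shannon form log2(1 + SINR) with D2D transmissions treated as interference is an assumption, as are the function names.

```python
import math

def shannon_rate(signal, interference, noise):
    """Assumed rate model: log2(1 + signal / (noise + interference))."""
    return math.log2(1.0 + signal / (noise + interference))

def cellular_secure_rate(p_bs, d2d_powers, h_bc, h_jc, h_ba, h_ja, noise):
    """Secure rate under passive eavesdropping, R_CS = max(R_C - R_A, 0)."""
    # D2D transmissions interfere at both the cellular user and the eavesdropper.
    i_c = sum(p * h for p, h in zip(d2d_powers, h_jc))
    i_a = sum(p * h for p, h in zip(d2d_powers, h_ja))
    r_c = shannon_rate(p_bs * h_bc, i_c, noise)   # legitimate link B -> C
    r_a = shannon_rate(p_bs * h_ba, i_a, noise)   # wiretap link B -> A
    return max(r_c - r_a, 0.0)

def cellular_rate_jammed(p_bs, d2d_powers, h_bc, h_jc, p_jam, h_ac, noise):
    """Rate R_CJ under active interference: jamming adds to the interference."""
    i_c = sum(p * h for p, h in zip(d2d_powers, h_jc)) + p_jam * h_ac
    return shannon_rate(p_bs * h_bc, i_c, noise)

# Toy numbers (hypothetical): one D2D user that hurts the eavesdropper more
# than it hurts the cellular user.
r_cs = cellular_secure_rate(p_bs=1.0, d2d_powers=[0.1], h_bc=1e-7,
                            h_jc=[1e-10], h_ba=1e-8, h_ja=[1e-7], noise=1e-10)
```

The toy call reflects the idea stated in the Background: D2D interference that degrades the eavesdropper's channel more than the cellular user's channel raises the secure rate.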

The invention has the beneficial effects that:

The robust D2D power control method based on random learning improves the physical-layer security performance of the cellular network, accommodates the data transmission requirements of D2D users, and accounts for the influence of the active eavesdropper's attack probability on the total secure achievable rate of the system.

Additional features and advantages of the invention will be set forth in the detailed description which follows.

Drawings

The above and other objects, features and advantages of the present invention will become more apparent by describing in more detail exemplary embodiments thereof with reference to the attached drawings, in which like reference numerals generally represent like parts throughout.

Fig. 1 is a general block diagram of a D2D robust power control system model.

Fig. 2 is a power selection probability convergence curve for a certain D2D user.

Fig. 3 is a power selection strategy convergence curve for all D2D users.

Fig. 4 is a plot of the total utility of D2D users as a function of their number.

Fig. 5 is a graph of the total utility of a legitimate user as a function of the probability of an active eavesdropper attack.

Detailed Description

Preferred embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein.

The system block diagram of the invention is shown in Fig. 1. Matlab is used for system simulation, and the parameter settings do not affect generality. Consider a cell with one base station located at its center; one cellular user, three D2D communication pairs and one active eavesdropper are randomly distributed in a square region of 1 km × 1 km, and the distance between each D2D transmitter and its receiver is set to 20 m. The transmit power of the base station and the interference power of the active eavesdropper are set to P_B = P_A = 1 W, and the discrete power value set of each D2D user is

The interference cost factor set of the base station is

The large-scale path loss of the channel is modeled with a transmission loss model, the path loss exponent is set to 3, and the background additive white Gaussian noise power is set to N₀ = 10^-10 W.
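The simulation geometry and channel model described above can be sketched in Python (the source uses Matlab; Python is used here only for illustration). The exact transmission loss model is not given, so a simple distance-power law h = d^(-3) is assumed, and the helper names `path_gain` and `place_d2d_pair` are hypothetical.

```python
import math
import random

PATH_LOSS_EXPONENT = 3.0   # loss factor from the text
NOISE_POWER = 1e-10        # N0 in watts, from the text
AREA_SIDE = 1000.0         # 1 km x 1 km square, in metres
D2D_DISTANCE = 20.0        # transmitter-receiver spacing, in metres
BASE_STATION = (AREA_SIDE / 2, AREA_SIDE / 2)  # cell centre

def path_gain(a, b, alpha=PATH_LOSS_EXPONENT):
    """Assumed large-scale channel gain d^(-alpha) between points a and b."""
    return math.dist(a, b) ** (-alpha)

def place_d2d_pair(rng):
    """Drop a transmitter uniformly in the square and a receiver 20 m away.

    The receiver direction is random; near the border it may fall slightly
    outside the square, which this sketch does not correct.
    """
    tx = (rng.uniform(0.0, AREA_SIDE), rng.uniform(0.0, AREA_SIDE))
    angle = rng.uniform(0.0, 2.0 * math.pi)
    rx = (tx[0] + D2D_DISTANCE * math.cos(angle),
          tx[1] + D2D_DISTANCE * math.sin(angle))
    return tx, rx

rng = random.Random(1)
pairs = [place_d2d_pair(rng) for _ in range(3)]       # three D2D pairs
h_ii = [path_gain(tx, rx) for tx, rx in pairs]        # direct-link gains
h_Bi = [path_gain(BASE_STATION, rx) for _, rx in pairs]  # base station to receivers
```

Because every D2D link is 20 m long, all direct-link gains equal 20^(-3) under this model, while the interference gains depend on the random positions.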

The specific implementation steps are as follows:

1) initializing system parameters: k = 0, t = 0 and

represents the probability that the i-th D2D user selects the l-th transmission power value in the discrete power set;

2) incrementing the outer loop iteration counter by 1: k = k + 1;

3) the base station randomly selects one factor from the interference cost factor set and broadcasts it to all D2D users;

4) incrementing the inner loop iteration counter by 1: t = t + 1;

5) in the t-th iteration, each D2D user selects a discrete power value from the discrete power value set according to its power selection probability;

6) each D2D user updates the power selection probability of the next slot according to the random learning algorithm, which specifically operates as follows:

a) observing attack mode selection of an active eavesdropper and reporting the attack mode selection to a base station;

b) if it is observed that the active eavesdropper chooses passive eavesdropping, the data transmission rate R_i of the i-th D2D user is calculated:

where: i and j denote indices of D2D users; P_B and P_i denote the transmit powers of the base station and the i-th D2D user, respectively; N₀ denotes the background additive white Gaussian noise power; h_Bi, h_ji and h_ii denote the channel gains between the base station and the i-th D2D user, between the j-th D2D user and the i-th D2D user, and between the transmitter and receiver of the i-th D2D user, respectively;

c) if it is observed that the active eavesdropper selects active interference, the data transmission rate R_iJ of the i-th D2D user is calculated:

where: P_A denotes the interference power used when the active eavesdropper A selects active interference, and h_Ai denotes the channel gain between the active eavesdropper A and the i-th D2D user;

d) designing the utility function U_i of the i-th D2D user:

Wherein:

the utility function of a D2D user is expressed as a trade-off between the total average achievable rate of all D2D users on the one hand, and the interference cost imposed by the i-th D2D user on the cellular user together with its own power consumption on the other,

E denotes eavesdropping, J denotes external interference, a = E denotes that the attacker adopts the eavesdropping attack mode, a = J denotes that the attacker adopts the interference attack mode, and I denotes internal interference; R_I denotes the internal interference cost factor determined by the base station; h_iC denotes the channel gain between the i-th D2D user and cellular user C; C denotes the cellular user, and C_D denotes the specific power consumption factor of the D2D user.

e) all D2D users monitor the attack mode of the active eavesdropper, estimate their respective utility functions, and update the probability with which the i-th D2D user selects the l-th discrete power level according to the following two equations:

where: b denotes the learning rate of each D2D user i,

denotes the normalized utility function of the i-th D2D user in the t-th time slot, i.e. the current time slot; U_imin(t) and U_imax(t) denote the minimum and maximum utility function values that the i-th D2D user can achieve in the t-th time slot, respectively; U_i(t) is the currently obtained real-time utility function value; and a_i(t) denotes the discrete power level selected by the i-th D2D user in the t-th time slot.

7) The inner loop is repeated until the convergence condition ||p(t) - p(t-1)|| / ||p(t)|| < ε is met, where p(t) denotes the power selection probability vector of all D2D users in the t-th time slot and ε = 0.01, or until the preset number of iterations, 1000, is reached.

8) In the outer loop, the base station calculates the secure achievable rate of the cellular user according to the attack mode of the active eavesdropper reported by the D2D users;

9) if the active eavesdropper selects the attack mode of passive eavesdropping, the secure achievable rate R_CS of the cellular user is calculated according to the following formula:

R_CS＝max(R_C-R_A,0) (5)

Wherein:

R_C denotes the achievable rate of the cellular user under passive eavesdropping,

R_A denotes the eavesdropping rate of the active eavesdropper; h_BC and h_jC denote the channel gains between base station B and cellular user C and between the j-th D2D user and cellular user C, respectively; h_BA and h_jA denote the channel gains between base station B and the active eavesdropper A and between the j-th D2D user and the active eavesdropper A, respectively.

10) If the active eavesdropper selects the attack mode of active interference, the achievable rate R_CJ of the cellular user is calculated according to the following formula:

where h_AC denotes the channel gain between the active eavesdropper and the cellular user.

11) The cellular user computes its utility function:

Wherein:

the utility function of the cellular user is expressed as the sum of its secure achievable rate and the interference cost of all D2D users to the cellular user.

12) The base station randomly selects another interference cost factor R'_I from the interference cost factor set, jumps to inner loop step 4), and then updates its selection of the interference cost factor according to the following equation:

13) Return to step 2) until the preset number of iterations, 1000, is reached.
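The outer loop in the steps above can be sketched as a simple better-response search over the interference cost factor set. The update equation of step 12) is not reproduced in this text, so the rule used here (keep a randomly drawn candidate only if it raises the cellular user's utility) is an assumption, and the callbacks `run_inner_loop` and `cellular_utility` are hypothetical stand-ins for steps 4) to 7) and 8) to 11).

```python
import random

def outer_loop(cost_factors, run_inner_loop, cellular_utility,
               max_iters=1000, seed=None):
    """Better-response search over interference cost factors (assumed rule)."""
    rng = random.Random(seed)
    current = rng.choice(cost_factors)
    current_utility = cellular_utility(current, run_inner_loop(current))
    for _ in range(max_iters):
        # Randomly pick another cost factor to try.
        candidates = [c for c in cost_factors if c != current]
        if not candidates:
            break
        candidate = rng.choice(candidates)
        # Let the D2D users settle on a power strategy for this factor.
        strategy = run_inner_loop(candidate)
        utility = cellular_utility(candidate, strategy)
        # Better response: switch only if the cellular utility improves.
        if utility > current_utility:
            current, current_utility = candidate, utility
    return current, current_utility

# Toy stand-ins (hypothetical): the utility peaks at a cost factor of 0.5.
best, best_utility = outer_loop(
    cost_factors=[0.1, 0.5, 1.0],
    run_inner_loop=lambda factor: None,
    cellular_utility=lambda factor, strategy: -(factor - 0.5) ** 2,
    max_iters=200,
    seed=0,
)
```

In a full simulation the two callbacks would wrap the D2D learning loop and the cellular-user utility calculation; the toy lambdas only demonstrate the control flow.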

Figs. 2 to 5 are simulation curves of the present invention. Figs. 2 and 3 show, for one simulation run, the convergence curves of the power selection probability of one D2D user and of the power selection strategies of all D2D users, respectively. Under the discrete power selection algorithm based on random learning proposed by the invention, the power selection probability vector of one D2D user converges after about 500 iterations, and the power selection strategies of all D2D users converge after about 800 iterations. Figs. 4 and 5 show simulation results averaged over 10⁵ independent experiments. Fig. 4 compares the total utility of D2D users as a function of the number of D2D users under three algorithms: the discrete power control algorithm based on random learning proposed herein, a selfish power control algorithm in which each D2D user maximizes its own utility, and a random power selection algorithm. Fig. 5 compares the total utility of all users in the system as a function of the attack probability of the active eavesdropper under the same three algorithms. Figs. 4 and 5 both show that the proposed random-learning-based D2D power control algorithm outperforms the other two algorithms, that the total utility of D2D users increases with the number of D2D users, and that the total utility of legitimate users decreases as the probability with which the active eavesdropper selects active interference increases.

Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments.

Claims

1. A robust D2D power control method based on random learning against probabilistic active eavesdropping, comprising the steps of:

2. The robust random learning-based D2D power control method against probabilistic active eavesdropping as claimed in claim 1, wherein step S1 is specifically: initializing system parameters: denoting the iteration counter of the outer loop as k, with initial value k = 0 and update k = k + 1 at each iteration; denoting the iteration counter of the inner loop as t, with initial value t = 0 and update t = t + 1 at each iteration;

i denotes the index of a D2D user and N denotes the total number of D2D users; l denotes the index of a transmission power value, i.e. the power level, and L denotes the number of power values in the discrete power set of D2D user i; an interference cost factor set is set, and the base station randomly selects one factor from the interference cost factor set and broadcasts it to all D2D users.

3. The robust random learning-based D2D power control method against probabilistic active eavesdropping as claimed in claim 1, wherein the step S2 is specifically:

wherein:

wherein: b denotes the learning rate of each D2D user i,

4. The robust random learning based D2D power control method against probabilistic active eavesdropping as claimed in claim 3, wherein in step S2-e, the convergence condition is: the inner loop iteration reaches the preset number of times, or the probability with which each D2D user selects the l-th discrete power level at the current moment is consistent with the probability at the previous moment.

5. The robust random learning based D2D power control method against probabilistic active eavesdropping according to claim 4, wherein the probability at the current moment is regarded as consistent with that at the previous moment when: ||p(t) - p(t-1)|| / ||p(t)|| < ε, where p(t) denotes the power selection probability vector of all D2D users in the t-th time slot and ε = 0.01.

6. The robust random learning based D2D power control method against probabilistic active eavesdropping according to claim 4, wherein the predetermined number of times is 1000.

7. The robust random learning-based D2D power control method against probabilistic active eavesdropping as claimed in claim 1, wherein the step S3 is specifically:

R_CS＝max(R_C-R_A,0) (5)

Wherein:

if the active eavesdropper selects the attack mode of active interference, the achievable rate R_CJ of the cellular user is calculated according to the following formula:

S3-b, establishing the utility function of cellular user C:

wherein:

the utility function of the cellular user is expressed as the sum of its secure achievable rate and the interference cost of all D2D users to the cellular user;

S3-d, randomly selecting an interference cost factor from the interference cost factor set; steps S3-a to S3-c are repeated until the outer loop satisfies the convergence condition.

8. The robust random learning based D2D power control method against probabilistic active eavesdropping as claimed in claim 7, wherein in step S3-d, the convergence condition is: the outer loop iteration reaches the preset number of times, or the interference cost factor selected by the cellular user at the current time is consistent with that at the previous time; the preset number of times is 1000.