CN114827947A - Internet of vehicles safety calculation unloading and resource allocation method, computer equipment and terminal - Google Patents

Internet of vehicles safety calculation unloading and resource allocation method, computer equipment and terminal

Info

Publication number
CN114827947A
Authority
CN
China
Prior art keywords
vehicle
resource allocation
representing
power
vehicles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210253563.6A
Other languages
Chinese (zh)
Inventor
俱莹
曹植伟
陈宇超
王浩宇
刘雷
裴庆祺
王励成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202210253563.6A
Publication of CN114827947A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • H04W4/40 Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W16/00 Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/02 Resource partitioning among network components, e.g. reuse partitioning
    • H04W16/10 Dynamic resource partitioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/50 Allocation or scheduling criteria for wireless resources
    • H04W72/53 Allocation or scheduling criteria for wireless resources based on regulatory allocation policies

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention belongs to the technical field of Internet of Vehicles (IoV) edge computing, and discloses an IoV secure computation offloading and resource allocation method, computer equipment and a terminal. First, the optimization problem is modeled as a multi-agent sequential decision problem and solved with reinforcement learning. Because the DQN (deep Q-network) method overestimates Q values, which degrades performance, the DDQN (double deep Q-network) method is adopted to train the multi-agent model. The arrival process of vehicles is modeled using queuing theory, so that the scenario is closer to practice. The method enables each user to select a reasonable strategy that minimizes the maximum delay among all vehicles.

Description

Internet of Vehicles secure computation offloading and resource allocation method, computer equipment and terminal
Technical Field
The invention belongs to the technical field of Internet of Vehicles (IoV) edge computing, and particularly relates to an IoV secure computation offloading and resource allocation method, computer equipment and a terminal.
Background
With the continuous progress of technology and growing demand, the application of big data to the Internet of Vehicles causes vehicles to generate more and more delay-sensitive tasks in support of new services such as traffic flow prediction. There are two ways to handle these tasks. One is to enhance the computing power of the onboard chip so that it can process them. The other is to use mobile edge computing. Mobile edge computing uses the radio access network to provide the required services and cloud computing functions close to the user, creating a communication service environment with high performance, low delay and high bandwidth. Mobile edge computing can effectively compensate for the insufficient computing capability of the vehicle. However, because the wireless channel is open, the computation offloading process carries a risk of information leakage. Physical layer security techniques exploit the characteristics of the wireless channel to protect user privacy, for example through signal processing, channel coding and multi-antenna modulation. With the application of big data in IoV scenarios, the contradiction between massive data transmission and limited spectrum resources is increasingly prominent. Emerging spectrum sharing techniques can significantly improve spectrum utilization and save spectrum resources while still meeting users' normal communication requirements.
Existing research has not yet studied the combination of physical layer security and spectrum sharing in vehicular edge computing networks. On the one hand, the high-speed movement of vehicles makes the network topology change rapidly, so conventional schemes cannot make decisions quickly enough. On the other hand, once security is taken into account, vehicular edge computing schemes struggle to meet ultra-low-latency requirements. Because high-speed vehicle movement and multiple eavesdroppers make the IoV scenario complex, traditional mathematical optimization methods adapt poorly to it, and the complex dynamic optimization problem needs to be solved with the decision-making and learning capacity of deep reinforcement learning (DRL). On this basis, a DRL-based transmission scheme for secure offloading and resource allocation (SoRA) is designed for IoV multi-user communication scenarios. It adapts quickly to complex, dynamic communication environments and reduces service delay as far as possible while ensuring the communication security of individual users.
In a practical IoV multi-user service scenario, multiple users may compete for the same segment of high-quality spectrum or for the same edge server computing resources, which gives rise to a competitive game. The problems to be solved in developing IoV computation offloading technology are therefore: how to combine frequency band selection and edge server selection so as to reduce the overall service delay while taking vehicle power into account; and how to adapt to the rapid changes of a dynamic IoV scenario, handle multiple eavesdroppers, and meet service requirements in that scenario. The drawback of physical layer security research based on mathematical derivation is that it can only be applied in static scenarios and cannot adapt to the high-speed dynamic scenarios of the Internet of Vehicles. Existing research on IoV physical layer security still considers only a single static eavesdropper, whereas in practice multiple eavesdroppers are often present. Once these problems are solved, communication in vehicular edge computing can be secured.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) Traditional optimization methods adapt poorly to complex, dynamic IoV computation offloading scenarios and cannot meet the requirements of highly reliable, high-speed data transmission services.
(2) Because vehicle nodes behave selfishly, some vehicles tend to minimize their own service delay, which pushes the service delay of the remaining vehicles beyond the tolerable range and reduces the overall performance of the system.
(3) The prior art considers only a single static eavesdropper, whereas practical scenarios typically contain multiple dynamic eavesdroppers, so the security of the computation offloading process is difficult to guarantee.
Disclosure of Invention
In view of the problems in the prior art, the invention provides an IoV secure computation offloading and resource allocation method, computer equipment and a terminal.
The invention is realized as follows. It provides an IoV secure computation offloading and resource allocation method that overcomes the limitation to static scenarios and makes real-time decisions for secure computation offloading in the Internet of Vehicles. First, the arrival process of vehicles is modeled using queuing theory, and multiple dynamic eavesdroppers are present in the scenario. Second, the optimization problem is modeled as a multi-agent sequential decision problem, and the multi-agent model is trained with the DDQN reinforcement learning method, so that each user can select a reasonable strategy that minimizes the maximum service delay among all vehicles while offloading securely. The invention promotes cooperation among IoV nodes, meets the communication requirements of ultra-low delay, high security and high reliability, and adapts to dynamic IoV scenarios. Further, the IoV secure computation offloading and resource allocation method comprises the following steps:
First, constructing a single-base-station IoV communication scenario in which the base station is connected to an edge server to provide computation offloading service; this sets up the dynamic IoV scenario used for subsequent modeling and analysis.
Second, modeling the communication process for the transmission processes of the different links; this lays the foundation for the communication channels used later in the method.
Third, modeling the optimization objective using the Wyner wiretap coding scheme; this lays the foundation for calculating the eavesdropping rate of each eavesdropper and for training the model.
Fourth, the base station obtains the state information of the current moment by interacting with the surrounding environment. The state information comprises the information of the target vehicle (vehicle speed, position coordinates and current state), the frequency band allocation information and the resource allocation information on the edge server, and is used as the state input of deep reinforcement learning, which uses the DDQN algorithm; this determines the state space for the subsequent training of the agents.
Fifth, the vehicle selects a corresponding action based on the current state information; the action consists of power selection, frequency band selection and selection of a computing resource block on the edge server; this determines the action space of the agents.
Sixth, designing the reward mechanism and the structure of the neural network according to the model and the strategy constructed in the second step; the reward mechanism is designed so that the user vehicles in the system cooperate better to minimize the maximum delay.
Seventh, using the DDQN neural network of the fifth step to extract the input features of the current state and fit the Q function, obtaining the Q values of the different actions in each input state; selecting the action in the current state according to an ε-greedy strategy; and training and updating the neural network parameters in combination with the reward mechanism of the fifth step.
Eighth, using the trained DDQN network, taking the state information of the current environment as the state input, outputting the sequence of Q values for the corresponding actions in the current state, and taking the action with the maximum Q value as the strategy for selecting the power, frequency band and edge server computing resources of the target vehicle in the current state; this provides guarantees for the convergence and the convergence time of the model.
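The state input described in the fourth step can be pictured as a flat feature vector. The following is a minimal sketch only: the class name `VehicleState`, its field names, and the integer encoding of the "current state" field are assumptions for illustration and are not specified in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VehicleState:
    """Hypothetical container for the state items listed in the fourth step."""
    speed: float                 # vehicle speed
    x: float                     # position coordinate (x)
    y: float                     # position coordinate (y)
    status: int                  # assumed integer encoding of the "current state"
    band_allocation: List[int]   # 1 if a frequency band is in use, else 0
    block_allocation: List[int]  # 1 if an edge-server resource block is in use, else 0

    def to_vector(self) -> List[float]:
        """Flatten all fields into one list usable as a Q-network input."""
        head = [self.speed, self.x, self.y, float(self.status)]
        bands = [float(b) for b in self.band_allocation]
        blocks = [float(b) for b in self.block_allocation]
        return head + bands + blocks


state = VehicleState(speed=15.0, x=120.0, y=40.0, status=1,
                     band_allocation=[1, 0, 0, 1],
                     block_allocation=[0, 1, 0])
state_vector = state.to_vector()
```

The vector length is 4 plus the number of bands plus the number of resource blocks, which would fix the input width of the Q network in such a design.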
Further, the process of the first step is as follows: the arrival process of vehicles is modeled using queuing theory; the inter-arrival time t of vehicles follows a negative exponential distribution, with probability density function:
$f(t) = \lambda e^{-\lambda t}, \quad t \ge 0$
where λ is the vehicle arrival rate and t is the time interval between vehicle arrivals.
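Exponentially distributed inter-arrival times correspond to a Poisson arrival process, which can be simulated by accumulating exponential gaps. The sketch below uses only the Python standard library; the function name, the arrival rate of 0.5 vehicles per second and the seed are illustrative assumptions.

```python
import random


def sample_arrival_times(rate, count, seed=0):
    """Return `count` cumulative arrival times with exponential gaps of mean 1/rate."""
    rng = random.Random(seed)
    current_time = 0.0
    arrivals = []
    for _ in range(count):
        current_time += rng.expovariate(rate)  # gap ~ Exp(rate)
        arrivals.append(current_time)
    return arrivals


arrivals = sample_arrival_times(rate=0.5, count=10000)
mean_gap = arrivals[-1] / len(arrivals)  # should be close to 1 / rate
```

With an arrival rate λ, the average gap approaches 1/λ as the number of simulated vehicles grows.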
Further, the process of the second step is as follows:
2.1 During communication, the channel gain $g_k$ between the transmitting end and the receiving end is composed of a large-scale fading component $\alpha_k$ and a small-scale fading component $h_k$:
$g_k = \alpha_k h_k$
2.2 The large-scale fading $\alpha_k$ consists of path loss and shadow fading. The path loss of V2V is divided into LOS and NLOS cases. In the LOS case:
[equation image BDA0003547924910000042]
where $f_c$ is the carrier frequency, $d$ is the distance, $d_{BP}$ is the effective distance, and $h_0$ and $h_1$ are the heights of the vehicles. The path loss in the NLOS case is:
$PL_{Nlos}(d_1, d_2) = PL_{los}(d_1) + 20 - 12.5 n_j + 10 n_j \log_{10} d_2 + 3 \log_{10}(f_c/5)$
where $n_j = \max(2.8 - 0.0024 d_1, 1.84)$, and $d_1$ and $d_2$ represent the length and width of each road grid in a Manhattan grid layout;
shadow fading of V2V:
[equation image BDA0003547924910000043]
where $D$ is the updated distance matrix, $D_{corr}$ is 10 on general urban roads, and $N_S(n)$ is an M×M matrix of normally distributed entries with mean 0 and variance 1;
the path loss of V2I is $PL_{V2I} = a + b \log_{10} R$, where $R$ represents the distance between the vehicle and the base station, and $a$ and $b$ are scenario-dependent path loss parameters; shadow fading of V2I:
[equation image BDA0003547924910000051]
where $D_i$ represents the updated distance matrix of the ith vehicle user, $D_{corr}$ is 50, $R$ is an M×M matrix whose diagonal elements are k and whose remaining elements are k/2, and $N_i(n)$ is an M×1 matrix for the ith vehicle user with normally distributed entries of mean 0 and variance 1;
2.3 The rate of the offloading link from the kth vehicle user to the base station via the mth sub-channel is
[equation image BDA0003547924910000052]
[equation image BDA0003547924910000053]
where $W$ is the channel bandwidth and
[equation image BDA0003547924910000054]
is the signal-to-interference-plus-noise ratio:
[equation image BDA0003547924910000055]
where
[equation image BDA0003547924910000056]
represents the transmit power from the kth vehicle user to the base station, $g_{k,B}[m]$ represents the channel gain from the kth vehicle to the base station on the mth frequency band, $\sigma^2$ represents the noise power,
[equation image BDA0003547924910000057]
is the interference experienced during the offloading of the kth user vehicle,
[equation image BDA0003547924910000058]
represents the transmission power of the m-th vehicle performing V2V communication, $g_{m,B}[m]$ represents the interference channel gain that the V2V communication of the m-th vehicle causes to V2I communication, $\rho_{k'}[m] = 1$ denotes that this band is used, and $\rho_{k'}[m] = 0$ denotes that this band is not used;
2.4 The rate at which the nth eavesdropper eavesdrops on the kth vehicle user on the mth sub-band,
[equation image BDA0003547924910000059]
is expressed as:
[equation image BDA00035479249100000510]
where
[equation image BDA00035479249100000511]
represents the power of the kth vehicle user, $g_{k,n}[m]$ represents the channel gain from the kth vehicle to the eavesdropper on the mth band, $\sigma^2$ represents the noise power,
[equation image BDA00035479249100000512]
is the interference experienced during eavesdropping,
[equation image BDA00035479249100000513]
represents the transmission power of the m-th vehicle performing V2V communication, $g_{m,n}[m]$ represents the channel gain from the m-th vehicle in V2V communication to the eavesdropper, $\rho_{k'}[m] = 1$ denotes that this band is used, and $\rho_{k'}[m] = 0$ denotes that this band is not used.
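The rate expressions of 2.3 and 2.4 are only available as images in this text, so the following sketch assumes the usual Shannon form: rate equals bandwidth times log2(1 + SINR), with the SINR built from the quantities defined above (transmit power, channel gain, noise power, and interference from other vehicles that occupy the same band). All function names and numeric values are hypothetical.

```python
import math


def sinr(tx_power, gain, noise_power, interferers):
    """SINR on one sub-band.

    `interferers` is a list of (rho, power, gain) tuples, where rho is 1 if the
    interfering vehicle uses this band and 0 otherwise.
    """
    interference = sum(rho * p * g for rho, p, g in interferers)
    return tx_power * gain / (noise_power + interference)


def link_rate(bandwidth_hz, sinr_value):
    """Assumed Shannon-type rate in bit/s."""
    return bandwidth_hz * math.log2(1.0 + sinr_value)


# Illustrative numbers only (linear scale, not taken from the source).
to_base_station = [(1, 0.1, 1e-9), (0, 0.1, 5e-9)]  # second vehicle is off this band
to_eavesdropper = [(1, 0.1, 1e-9)]

offload_rate = link_rate(1e6, sinr(0.2, 7e-9, 1e-10, to_base_station))
eavesdrop_rate = link_rate(1e6, sinr(0.2, 1.5e-9, 1e-10, to_eavesdropper))
```

The same two functions serve both links; only the channel gains differ, which is what makes the legitimate rate and the eavesdropping rate comparable in the secrecy analysis of the third step.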
Further, the process of the third step is as follows:
3.1 The secure offloading rate is expressed as
[equation image BDA00035479249100000514]
where $v_e$ represents the set of all eavesdroppers.
3.2 The transmission time from the kth vehicle user to the base station is
[equation image BDA00035479249100000515]
where $B_k$ represents the size of the computing task and
[equation image BDA0003547924910000061]
represents the secure offloading rate. The time for the task to be computed on the edge computing server is:
[equation image BDA0003547924910000062]
where $B_k$ represents the size of the computing task, $z_k[j] = 1$ means that the jth resource block is allocated to the kth vehicle user, $z_k[j] = 0$ means that the jth resource block is not allocated to the kth vehicle user, $N_{c,j}$ indicates the total number of edge server processing cores, and $u_E$ represents the processing rate of each core. The total delay is
[equation image BDA0003547924910000063]
[equation image BDA0003547924910000064]
3.3 To minimize the maximum service delay among all vehicles, the objective function is:
[equation image BDA0003547924910000065]
subject to:
$C_1$: [equation image BDA0003547924910000066]
$C_2$: [equation image BDA0003547924910000067]
$C_3$: [equation image BDA0003547924910000068]
$C_4$: [equation image BDA0003547924910000069]
$C_5$: [equation image BDA00035479249100000610]
where $N_u$ represents the total number of served vehicles, $N_b$ represents the number of edge server resource blocks, $N_c$ represents the total number of processing cores, and hence the processing capacity, of the MEC server, and $N_p$ represents the number of selectable vehicle power levels;
[equation image BDA00035479249100000611]
means that the kth vehicle user selects the ith power as the transmit power, and otherwise
[equation image BDA00035479249100000612]
$C_1$ ensures that the total number of processing cores used does not exceed the number of cores of the edge server; the three constraints $C_2$, $C_3$ and $C_4$ ensure that each vehicle user can select only one frequency band, one transmit power and one computing resource block; and $C_5$ specifies that the decision variables of the optimization objective are binary.
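Since the formulas of the third step survive only as images, the sketch below rests on stated assumptions: the secure offloading rate is taken as the legitimate rate minus the strongest eavesdropping rate, floored at zero (the common Wyner wiretap form); the transmission delay is task size divided by that rate; and the computing delay is task size divided by the allocated cores times the per-core rate. Function names and numbers are hypothetical.

```python
import math


def secrecy_rate(offload_rate, eavesdrop_rates):
    """Legitimate rate minus the worst-case eavesdropper rate, never negative."""
    worst = max(eavesdrop_rates, default=0.0)
    return max(offload_rate - worst, 0.0)


def transmission_delay(task_bits, secure_rate):
    """Time to offload the task; infinite if no secure rate is available."""
    return math.inf if secure_rate <= 0 else task_bits / secure_rate


def compute_delay(task_bits, allocation, cores_per_block, core_rate):
    """Assumed form: task size / (allocated cores * per-core processing rate)."""
    cores = sum(z * n for z, n in zip(allocation, cores_per_block))
    return math.inf if cores == 0 else task_bits / (cores * core_rate)


def is_one_hot(selection):
    """Constraints C2 to C5: binary entries with exactly one selected."""
    return all(v in (0, 1) for v in selection) and sum(selection) == 1


cores_per_block = [2, 4, 8]
vehicles = [  # (task bits, offload rate, eavesdropper rates, block allocation)
    (2e6, 3e6, [1e6, 5e5], [0, 1, 0]),
    (1e6, 2e6, [1.5e6], [1, 0, 0]),
]
total_delays = []
for bits, rate, eaves, alloc in vehicles:
    assert is_one_hot(alloc)
    secure = secrecy_rate(rate, eaves)
    total_delays.append(transmission_delay(bits, secure)
                        + compute_delay(bits, alloc, cores_per_block, 1e6))

objective = max(total_delays)  # the quantity the method seeks to minimize
```

Evaluating the maximum rather than the sum is what discourages a strategy in which one vehicle obtains a short delay at the expense of the others.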
Further, the process of the fifth step is as follows:
5.1 The action space can be represented by three-dimensional coordinates, where the x axis represents the frequency band selection, the y axis represents the vehicle transmit power selection, and the z axis represents the selection of a computing resource block on the edge server. Let there be $N_a$ choices of frequency band, $N_p$ choices of vehicle power and $N_b$ choices of edge server resource block; the number of possible actions for any vehicle needing service is then $N_a \times N_b \times N_p$.
5.2 An ε-greedy strategy is adopted to balance exploitation and exploration during training: at time t, the base station selects the action with the maximum Q value with probability 1-ε, and selects a random action from the action space A with probability ε.
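The three-dimensional action space can be flattened to a single index so that a Q network outputs one value per (band, power, resource block) combination. The sketch below shows one possible encoding together with an ε-greedy selector; the index layout and function names are assumptions, not taken from the source.

```python
import random


def encode_action(band, power, block, n_power, n_block):
    """Map (band, power, block) to a single index in [0, N_a * N_p * N_b)."""
    return (band * n_power + power) * n_block + block


def decode_action(index, n_power, n_block):
    """Inverse of encode_action."""
    band_power, block = divmod(index, n_block)
    band, power = divmod(band_power, n_power)
    return band, power, block


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the argmax of Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


N_A, N_P, N_B = 4, 3, 2            # illustrative sizes
num_actions = N_A * N_P * N_B      # 24 combinations
rng = random.Random(0)
q_values = [0.0] * num_actions
q_values[encode_action(2, 1, 0, N_P, N_B)] = 1.0
greedy_choice = decode_action(epsilon_greedy(q_values, 0.0, rng), N_P, N_B)
```

Setting ε to zero reduces the selector to the pure greedy policy used after training.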
Further, the process of the sixth step is as follows:
6.1 The reward is divided into $N_w$ grades according to the service delay;
6.2 When the computed offloading rate is too low, the delay is large and the reward is 0.
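The source does not give the grade boundaries or the reward assigned to each grade, so the following is only one way such a graded reward could look. The thresholds, the linear spacing of reward values and the minimum-rate cutoff are all assumptions.

```python
def graded_reward(delay, thresholds, offload_rate, min_rate):
    """Return a reward in [0, 1] that drops one grade per delay threshold exceeded.

    `thresholds` is an ascending list of delay boundaries, giving
    len(thresholds) + 1 grades (the N_w of step 6.1). A rate below `min_rate`
    yields a reward of 0, as in step 6.2.
    """
    if offload_rate < min_rate:
        return 0.0
    num_grades = len(thresholds) + 1
    grade = sum(delay > t for t in thresholds)  # 0 is the fastest grade
    return (num_grades - grade) / num_grades


thresholds = [0.5, 1.0, 2.0]  # hypothetical boundaries in seconds, so N_w = 4
rewards = [graded_reward(d, thresholds, offload_rate=2e6, min_rate=1e5)
           for d in (0.3, 1.5, 3.0)]
```

If every agent received the reward of the slowest vehicle instead of its own, the same function would express the cooperative min-max goal described in the sixth step.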
Further, the neural network training process of the seventh step is as follows:
7.1 Initializing the environment information and the Q network parameters, and generating vehicle operation data;
7.2 In each training round, updating and acquiring the current vehicle positions and the environment state, and resetting the frequency band and power selection and the edge server resource allocation strategy;
7.3 Selecting an action for the target vehicle according to the current state information and the ε-greedy algorithm, namely a combination of frequency band selection, vehicle power and edge server resource allocation, and updating the environment information;
7.4 Obtaining the action combinations of all target vehicles, and obtaining the capacity-related reward value $r_{c,i}$ and the returned reward value $r_t$;
7.5 Storing the state, action, reward and next state at time t as one sample in the experience pool;
7.6 When the number of samples in the experience pool is large enough, starting to train the model: randomly drawing mini-batches of samples $(s_t, a_t, r_t, s_{t+1})$ from the experience pool, training the network parameters and updating the target network weights.
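Steps 7.5 and 7.6 describe experience replay, and the DDQN update differs from DQN in how the training target is formed: the online network chooses the next action and the target network evaluates it. The sketch below shows both pieces with plain Python lists in place of neural networks; the class and function names, buffer capacity and discount factor are illustrative assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience pool of (state, action, reward, next_state) samples."""

    def __init__(self, capacity, seed=0):
        self.samples = deque(maxlen=capacity)  # oldest samples are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state):
        self.samples.append((state, action, reward, next_state))

    def sample(self, batch_size):
        return self.rng.sample(list(self.samples), batch_size)

    def __len__(self):
        return len(self.samples)


def ddqn_target(reward, next_q_online, next_q_target, gamma):
    """Double DQN target: argmax from the online net, value from the target net."""
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push([float(step)], step % 2, 1.0, [float(step + 1)])
batch = buffer.sample(2)

target = ddqn_target(1.0, next_q_online=[0.2, 0.9, 0.5],
                     next_q_target=[0.4, 0.3, 0.8], gamma=0.5)
```

In this example a plain DQN target would use the target network's own maximum (0.8), whereas the double estimator uses the value of the action preferred by the online network (0.3), which is how DDQN counteracts overestimation.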
Another object of the present invention is to provide computer equipment comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the IoV secure computation offloading and resource allocation method.
Another object of the present invention is to provide an information data processing terminal configured to execute the steps of the IoV secure computation offloading and resource allocation method.
In combination with the above technical solutions and the technical problems to be solved, the advantages and positive effects of the claimed technical solutions are analyzed from the following aspects:
First, with regard to the technical problems in the prior art and the difficulty of solving them, the technical problems addressed by the invention are closely tied to the results and data obtained during research and development, and solving them brings inventive technical effects. The specific description is as follows:
The invention overcomes the uncertainty caused by multiple eavesdroppers and by vehicle movement, and reduces service delay while using physical layer security to protect communication. First, the optimization problem is modeled as a multi-agent sequential decision problem and solved with reinforcement learning. Because the DQN (deep Q-network) method overestimates Q values, which degrades performance, the DDQN (double deep Q-network) method is adopted to train the multi-agent model. The arrival process of vehicles is modeled using queuing theory, so that the scenario is closer to practice. The method enables each user to select a reasonable strategy that minimizes the maximum delay among all vehicles.
Second, considering the technical solution as a whole or from the product perspective, its technical effects and advantages are as follows: the invention studies the multi-user, multi-eavesdropper service problem that arises when vehicles need to offload computation, and provides a DRL-based SoRA strategy that helps each vehicle quickly choose a strategy suited to the current environment so as to minimize service delay. The model takes into account the high-speed movement of vehicles, competition in frequency band selection and edge server resource block selection, and interference in the multi-user scenario. Simulation results of the model show that the proposed method reduces the overall delay of the vehicles' computation offloading service and improves communication security.
Third, as supplementary evidence of the inventiveness of the claims, the following important aspect is also presented:
The technical solution of the invention fills a technical gap in the industry at home and abroad:
The invention provides a secure computation offloading and resource allocation method that overcomes the limitation to static scenarios and makes dynamic real-time decisions in the Internet of Vehicles. It also addresses the increase in overall network delay caused by selfish behavior among nodes in the prior art by effectively encouraging cooperation among vehicles, thereby minimizing the maximum service delay in the network. By considering multiple dynamic eavesdroppers, it secures computation offloading and meets the IoV communication requirements of ultra-low delay, high reliability and high security, so that communication adapts to dynamic, complex IoV communication and edge computing scenarios. This fills a gap in the IoV industry at home and abroad and promotes the deployment of edge computing services.
Drawings
Fig. 1 is a flowchart of the IoV secure computation offloading and resource allocation method according to an embodiment of the present invention.
Fig. 2 is a flowchart of an implementation of the IoV secure computation offloading and resource allocation method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a millimeter wave multi-user communication scene of the internet of vehicles according to the embodiment of the invention.
Fig. 4 is a schematic diagram of a DDQN network according to an embodiment of the present invention.
Fig. 5 is a comparison diagram of system performance and vehicle performance under different traffic patterns according to different schemes provided by the embodiment of the invention.
Fig. 6 is a schematic diagram of average connection probabilities under different capacity threshold limits according to different solutions provided in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
First, illustrative embodiments. This section expands on the claims with explanatory embodiments so that those skilled in the art can fully understand how the invention is implemented.
As shown in Fig. 1, the IoV secure computation offloading and resource allocation method provided by the present invention includes the following steps:
S101: constructing a single-base-station IoV communication scenario in which the base station is connected to an edge server to provide computation offloading service;
S102: modeling the communication process for the transmission processes of the different links;
S103: improving the security of the IoV edge computing network using the Wyner wiretap coding scheme, and modeling the optimization objective;
S104: the base station acquires the state information of the current moment by interacting with the surrounding environment; the state information, comprising the information of the target vehicle (vehicle speed, position coordinates and current state), the frequency band allocation information and the resource allocation information on the edge server, is used as the state input of deep reinforcement learning, which uses the DDQN algorithm;
S105: based on the current state information, the vehicle selects a corresponding action; the action consists of power selection, frequency band selection and selection of a computing resource block on the edge server;
S106: designing the reward mechanism and the structure of the neural network according to the model and the strategy constructed in S102;
S107: using the DDQN neural network of S105 to extract the input features of the current state and fit the Q function, obtaining the Q values of the different actions in each input state; selecting the action in the current state according to an ε-greedy strategy; and training and updating the neural network parameters in combination with the reward mechanism of S105;
S108: using the trained DDQN network, taking the state information of the current environment as the state input, outputting the sequence of Q values for the corresponding actions in the current state, and taking the action with the maximum Q value as the strategy for power selection, frequency band selection and edge server computing resource selection of the target vehicle in the current state.
The process of step S101 is as follows: the arrival process of vehicles is modeled using queuing theory; the inter-arrival time t of vehicles follows a negative exponential distribution, with probability density function:
$f(t) = \lambda e^{-\lambda t}, \quad t \ge 0$
where λ is the vehicle arrival rate and t is the time interval between vehicle arrivals.
The process of step S102 is as follows:
S2.1 During communication, the channel gain $g_k$ between the transmitting end and the receiving end is composed of a large-scale fading component $\alpha_k$ and a small-scale fading component $h_k$:
$g_k = \alpha_k h_k$
S2.2 The large-scale fading $\alpha_k$ consists of path loss and shadow fading. The path loss of V2V is divided into LOS and NLOS cases. In the LOS case:
[equation image BDA0003547924910000102]
where $f_c$ is the carrier frequency, $d$ is the distance, $d_{BP}$ is the effective distance, and $h_0$ and $h_1$ are the heights of the vehicles. The path loss in the NLOS case is:
$PL_{Nlos}(d_1, d_2) = PL_{los}(d_1) + 20 - 12.5 n_j + 10 n_j \log_{10} d_2 + 3 \log_{10}(f_c/5)$
where $n_j = \max(2.8 - 0.0024 d_1, 1.84)$, and $d_1$ and $d_2$ represent the length and width of each road grid in the Manhattan grid layout.
Shadow fading of V2V:
[equation image BDA0003547924910000111]
where $D$ is the updated distance matrix, $D_{corr}$ is 10 on general urban roads, and $N_S(n)$ is an M×M matrix of normally distributed entries with mean 0 and variance 1.
The path loss of V2I is $PL_{V2I} = a + b \log_{10} R$, where $R$ represents the distance between the vehicle and the base station, and $a$ and $b$ are scenario-dependent path loss parameters. Shadow fading of V2I:
[equation image BDA0003547924910000112]
where $D_i$ represents the updated distance matrix of the ith vehicle user, $D_{corr}$ is 50, $R$ is an M×M matrix whose diagonal elements are k and whose remaining elements are k/2, and $N_i(n)$ is the M×1 matrix generated for the ith vehicle user, with normally distributed entries of mean 0 and variance 1.
S2.3 the rate of the unloading link from the kth vehicle user to the base station via the mth sub-channel is
Figure BDA0003547924910000113
Figure BDA0003547924910000114
where $W$ is the channel bandwidth and
Figure BDA0003547924910000115
denotes the signal-to-interference-plus-noise ratio (SINR):
Figure BDA0003547924910000116
where
Figure BDA0003547924910000117
represents the transmit power from the k-th vehicle user to the base station, $g_{k,B}[m]$ represents the channel gain from the k-th vehicle to the base station on the m-th frequency band, $\sigma^2$ represents the noise power,
Figure BDA0003547924910000118
is the interference experienced while the k-th vehicle user offloads,
Figure BDA0003547924910000119
represents the transmission power of the m-th vehicle performing V2V communication, $g_{m,B}[m]$ represents the gain of the interference channel from the m-th vehicle performing V2V communication to the V2I communication, $\rho_{k'}[m] = 1$ indicates that this band is used, and $\rho_{k'}[m] = 0$ indicates that this band is not used.
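To make the quantities in S2.3 concrete, here is a minimal sketch of the offloading rate. The rate and SINR equations survive only as figures, so the sketch assumes the usual Shannon form $W \log_2(1 + \text{SINR})$ and an interference term that sums, over the V2V vehicles, the band indicator times transmit power times interference channel gain. All function and parameter names are illustrative.

```python
import math
from typing import Sequence


def sinr(p_k: float, g_kb: float, noise: float,
         rho: Sequence[int], p_v2v: Sequence[float],
         g_v2v_b: Sequence[float]) -> float:
    """SINR of the k-th vehicle's V2I offloading link on one band.

    rho[i] is 1 if the i-th V2V vehicle shares this band, else 0;
    p_v2v[i] and g_v2v_b[i] are its transmit power and its
    interference channel gain towards the base station.
    """
    interference = sum(r * p * g for r, p, g in zip(rho, p_v2v, g_v2v_b))
    return p_k * g_kb / (noise + interference)


def offload_rate(bandwidth: float, sinr_value: float) -> float:
    """Assumed Shannon-form link rate: W * log2(1 + SINR)."""
    return bandwidth * math.log2(1.0 + sinr_value)
```

The eavesdropping rate of S2.4 would follow the same pattern with the gains towards the eavesdropper substituted for the gains towards the base station.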
S2.4 The rate at which the n-th eavesdropper eavesdrops on the k-th vehicle user on the m-th sub-band,
Figure BDA00035479249100001110
is expressed as
Figure BDA0003547924910000121
where
Figure BDA0003547924910000122
represents the transmit power of the k-th vehicle user, $g_{k,n}[m]$ represents the channel gain from the k-th vehicle to the eavesdropper on the m-th frequency band, $\sigma^2$ represents the noise power,
Figure BDA0003547924910000123
is the interference experienced during eavesdropping,
Figure BDA0003547924910000124
represents the transmission power of the m-th vehicle performing V2V communication, $g_{m,n}[m]$ represents the channel gain from the m-th vehicle performing V2V communication to the eavesdropper, $\rho_{k'}[m] = 1$ indicates that this band is used, and $\rho_{k'}[m] = 0$ indicates that this band is not used.
The process at step S103 is as follows:
S3.1 The secure offloading rate is expressed as
Figure BDA0003547924910000125
where $v_e$ denotes the set of all eavesdroppers.
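The secure offloading rate of S3.1 is given only as a figure. A common form under Wyner wiretap coding, assumed here rather than taken from the patent, is the legitimate link rate minus the rate of the strongest eavesdropper, floored at zero. A minimal sketch under that assumption:

```python
from typing import Iterable


def secure_offload_rate(r_offload: float,
                        r_eavesdroppers: Iterable[float]) -> float:
    """Assumed secrecy rate: [R_k - max_n R_{e,n}]^+.

    r_offload is the V2I offloading rate of the vehicle and
    r_eavesdroppers holds the eavesdropping rate of every
    eavesdropper in the set v_e (non-colluding eavesdroppers assumed).
    """
    worst_case = max(r_eavesdroppers, default=0.0)
    return max(r_offload - worst_case, 0.0)
```

Taking the maximum over the set $v_e$ makes the rate robust against whichever eavesdropper currently has the best channel.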
S3.2 The transmission time from the k-th vehicle user to the base station is
Figure BDA0003547924910000126
where $B_k$ is the size of the computation task and
Figure BDA0003547924910000127
is the secure offloading rate. The time required to compute the task on the edge computing server is
Figure BDA0003547924910000128
where $B_k$ is the size of the computation task, $z_k[j] = 1$ means that the j-th resource block is allocated to the k-th vehicle user, $z_k[j] = 0$ means that the j-th resource block is not allocated to the k-th vehicle user, $N_{c,j}$ denotes the total number of edge server processing cores, and $u_E$ is the processing rate of each core. The total delay is
Figure BDA0003547924910000129
Figure BDA00035479249100001210
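The delay model of S3.2 can be sketched as follows. The equations themselves are only available as figures, so this sketch assumes the natural reading of the definitions: transmission time is $B_k$ divided by the secure offloading rate, computation time is $B_k$ divided by the processing capacity of the allocated resource block ($\sum_j z_k[j] N_{c,j} u_E$), and the total delay is their sum. The interpretation of $N_{c,j}$ as the core count of block j is an assumption.

```python
import math
from typing import Sequence


def transmission_time(task_size: float, secure_rate: float) -> float:
    """Assumed T_tr = B_k / R_sec; infinite if nothing can be sent securely."""
    return math.inf if secure_rate <= 0 else task_size / secure_rate


def compute_time(task_size: float, z_k: Sequence[int],
                 cores_per_block: Sequence[int], u_e: float) -> float:
    """Assumed T_c = B_k / (sum_j z_k[j] * N_{c,j} * u_E)."""
    cores = sum(z * n for z, n in zip(z_k, cores_per_block))
    return math.inf if cores == 0 else task_size / (cores * u_e)


def total_delay(task_size: float, secure_rate: float, z_k: Sequence[int],
                cores_per_block: Sequence[int], u_e: float) -> float:
    """Total service delay: transmission time plus edge computation time."""
    return (transmission_time(task_size, secure_rate)
            + compute_time(task_size, z_k, cores_per_block, u_e))
```

The quantity to be minimized in S3.3 would then be the maximum of `total_delay` over all served vehicles.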
S3.3 The goal is to minimize the maximum service delay among all vehicles. The objective function is:
Figure BDA00035479249100001211
Subject to:
$C_1$:
Figure BDA00035479249100001212
$C_2$:
Figure BDA00035479249100001213
$C_3$:
Figure BDA00035479249100001214
$C_4$:
Figure BDA00035479249100001215
$C_5$:
Figure BDA00035479249100001216
where $N_u$ is the total number of served vehicles, $N_b$ is the number of edge server resource blocks, $N_c$ is the total number of processing cores and the processing capacity of the MEC server, and $N_p$ is the number of selectable vehicle power levels.
Figure BDA0003547924910000131
means that the k-th vehicle user selects the i-th power level as its transmit power; otherwise
Figure BDA0003547924910000132
$C_1$ ensures that the total number of allocated processing cores does not exceed the number of cores of the edge server. The three constraints $C_2$, $C_3$ and $C_4$ ensure that each vehicle user can select only one frequency band, one transmit power and one computing resource block. $C_5$ specifies that the decision variables of the optimization objective are binary.
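The verbal description of constraints $C_1$ to $C_5$ can be turned into a feasibility check. The sketch below is an illustration based on that description only, since the constraint formulas are figures; it assumes that "only one" means exactly one selection per vehicle, and the matrix names (`rho`, `power_sel`, `z`) and the per-block core counts are illustrative.

```python
from typing import Sequence

Matrix = Sequence[Sequence[int]]


def rows_one_hot(m: Matrix) -> bool:
    """C2-C5: every entry is binary and every vehicle picks exactly one option."""
    return all(all(v in (0, 1) for v in row) and sum(row) == 1 for row in m)


def is_feasible(rho: Matrix, power_sel: Matrix, z: Matrix,
                cores_per_block: Sequence[int], n_c: int) -> bool:
    """Check one joint allocation; each matrix has one row per vehicle.

    rho: band selection, power_sel: power-level selection,
    z: resource-block selection, n_c: core budget of the edge server.
    """
    if not (rows_one_hot(rho) and rows_one_hot(power_sel) and rows_one_hot(z)):
        return False
    # C1: cores used by all selected resource blocks stay within the budget.
    used = sum(zj * n for row in z for zj, n in zip(row, cores_per_block))
    return used <= n_c
```

For example, two vehicles selecting blocks with 2 and 4 cores are feasible against a budget of 6 cores but not against a budget of 5.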
The process at step S105 is as follows:
S5.1 The action space can be represented by three-dimensional coordinates, with the x-axis representing frequency band selection, the y-axis representing vehicle transmit power selection, and the z-axis representing selection of a computing resource block on the edge server. If there are $N_a$ choices of frequency band, $N_b$ choices of vehicle power and $N_p$ choices of edge server resource block, then any vehicle requiring service has $N_a \times N_b \times N_p$ possible actions.
S5.2 An ε-greedy strategy is adopted to balance exploitation and exploration. At time t, the base station selects the action with the largest Q value with probability 1 − ε, and selects a random action from the action space A with probability ε.
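As an illustration of S5.1 and S5.2, the three-dimensional action (band, power, resource block) can be flattened into a single index over $N_a \times N_b \times N_p$ actions, which is the form a Q-network output layer typically uses. The sketch below is one possible encoding together with ε-greedy selection; the function names and the index ordering are assumptions, not taken from the patent.

```python
import random
from typing import Sequence, Tuple


def encode_action(band: int, power: int, block: int,
                  n_b: int, n_p: int) -> int:
    """Flatten (band, power, block) into one index in [0, N_a * N_b * N_p)."""
    return (band * n_b + power) * n_p + block


def decode_action(index: int, n_b: int, n_p: int) -> Tuple[int, int, int]:
    """Inverse of encode_action."""
    band, rest = divmod(index, n_b * n_p)
    power, block = divmod(rest, n_p)
    return band, power, block


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Explore uniformly with probability epsilon, else take the arg-max Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In practice ε is usually decayed over training so that early rounds explore and later rounds exploit; the patent does not state a schedule.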
The procedure at step S106 is as follows:
S6.1 The reward is divided into $N_w$ levels according to the service delay.
S6.2 When the computed offloading rate is too low, the delay is large and the reward is 0.
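The graded reward of S6.1 and S6.2 can be sketched as below. The patent does not give the delay thresholds, the reward values per level or the rate cut-off, so `delay_thresholds`, `min_rate` and the linear mapping from level to reward are all hypothetical choices made for illustration.

```python
import bisect
from typing import Sequence


def graded_reward(delay: float, secure_rate: float,
                  delay_thresholds: Sequence[float],
                  min_rate: float) -> float:
    """Reward in N_w levels, where N_w = len(delay_thresholds) + 1.

    delay_thresholds must be sorted in ascending order. A rate below
    min_rate implies a large delay and yields a reward of 0 (S6.2).
    Otherwise, shorter delays fall into better levels (S6.1).
    """
    if secure_rate < min_rate:
        return 0.0
    n_w = len(delay_thresholds) + 1
    level = bisect.bisect_left(delay_thresholds, delay)  # 0 is the fastest level
    return (n_w - level) / n_w
```

With thresholds of 1, 2 and 3 time units this gives four levels with rewards 1.0, 0.75, 0.5 and 0.25.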
The neural network training process in step S107 is as follows:
S7.1 Initialize the environment information and the Q-network parameters, and generate vehicle operation data.
S7.2 In each training round, update and acquire the current vehicle positions and environment state, and reset the frequency band and power selections and the edge server resource allocation strategy.
S7.3 Select an action for the target vehicle according to the current state information and the ε-greedy algorithm, namely a combination of frequency band selection, vehicle power and edge server resource allocation, and update the environment information.
S7.4 Obtain the action combinations of all target vehicles, and from them the capacity-related reward $r_{c,i}$ and the returned reward $r_t$.
S7.5 Store the state, action, reward and next state at time t as a sample in the experience pool.
S7.6 When the experience pool holds enough samples, start training the model: randomly draw mini-batches of samples $(s_t, a_t, r_t, s_{t+1})$ from the experience pool, train the network parameters and update the target network weights.
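Steps S7.5 and S7.6 correspond to an experience replay buffer and a double-DQN style update. The following is a minimal, framework-free sketch of those two pieces, assuming the standard DDQN target in which the online network chooses the next action and the target network evaluates it. The discount factor `gamma` and the class and function names are not specified in the patent and are introduced here for illustration.

```python
import random
from collections import deque
from typing import Deque, List, Sequence, Tuple

Transition = Tuple[tuple, int, float, tuple]  # (s_t, a_t, r_t, s_{t+1})


class ReplayBuffer:
    """Fixed-size experience pool; the oldest samples are dropped first."""

    def __init__(self, capacity: int) -> None:
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    def push(self, state: tuple, action: int, reward: float,
             next_state: tuple) -> None:
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        """Draw a mini-batch uniformly at random without replacement."""
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self) -> int:
        return len(self.buffer)


def ddqn_target(reward: float, q_online_next: Sequence[float],
                q_target_next: Sequence[float], gamma: float) -> float:
    """Double-DQN target: r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]
```

Decoupling action selection from action evaluation in this way is what distinguishes DDQN from plain DQN and reduces over-estimation of Q values.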
II. Application embodiment. To demonstrate the inventiveness and technical value of the technical solution of the invention, this section presents an application example of the claimed technical solution on specific products or related technologies.
The method was applied and verified in a dynamic Internet of Vehicles computation offloading scenario. The application example considers the communication system of a two-way intersection: the inter-arrival times of vehicles on each road follow a negative exponential distribution, the vehicle arrival rate is 0.5, and the vehicle speed is 72 km/h. The scenario therefore requires the base station to quickly select edge server resource blocks, vehicle transmit power and frequency bands based on limited state information. The model training and verification analysis proposed by the invention were carried out on this application example. Figs. 5 and 6 show the performance analysis of this embodiment; the multi-dimensional performance analysis verifies the effectiveness and robustness of the proposed secure computation offloading and resource allocation method and shows that it can significantly improve the overall performance of the system. The method is of significance for promoting the development of Internet of Vehicles and edge computing technology.
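The arrival process used in this application example can be reproduced with a few lines of simulation. The sketch below draws exponentially distributed inter-arrival times with rate 0.5 and moves each vehicle at 72 km/h (20 m/s). The patent does not state the time unit of the arrival rate; vehicles per second is assumed here, and the function names are illustrative.

```python
import random
from typing import List


def simulate_arrivals(rate: float, horizon: float,
                      rng: random.Random) -> List[float]:
    """Arrival times on one road within [0, horizon].

    Inter-arrival times follow a negative exponential distribution
    with density f(t) = rate * exp(-rate * t).
    """
    arrivals: List[float] = []
    t = 0.0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return arrivals
        arrivals.append(t)


def position_along_road(arrival_time: float, now: float,
                        speed_kmh: float = 72.0) -> float:
    """Distance in metres travelled since the vehicle entered the road."""
    return (now - arrival_time) * speed_kmh / 3.6


if __name__ == "__main__":
    times = simulate_arrivals(rate=0.5, horizon=60.0, rng=random.Random(42))
    print(len(times), "vehicles arrived in 60 s")
```

With a rate of 0.5 the mean inter-arrival time is 2 time units, so roughly 30 arrivals per road are expected over a 60-unit horizon.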
III. Evidence of the relevant effects of the embodiment. The embodiment of the invention achieved positive effects during research, development and use, and has advantages over the prior art, as described below with reference to the data and figures obtained during testing.
Fig. 5 shows the total delay at different locations under the present invention. Ten location points are generated at random, and for each scheme the maximum delay over all vehicles is taken as the total processing delay. The figure shows that the delay performance of the SoRA scheme is far better than that of the local computation scheme and of the scheme without frequency band sharing. Because frequency band sharing interferes not only with the target but also with the eavesdropper, the eavesdropping rate is reduced, and the delay of the SoRA scheme is ultimately shorter than that of the scheme without sharing. At all random location points the SoRA scheme is very close to the optimal scheme. Whereas the optimal scheme spends a great deal of time traversing all possibilities, the DRL-based SoRA strategy adapts quickly to the characteristics of the Internet of Vehicles environment, which illustrates the efficiency of the scheme.
Fig. 6 shows the average connection probability of the present invention under different capacity thresholds. Setting different capacity thresholds reveals the performance of the different schemes. The figure shows that, as the capacity threshold increases, the connection probability of the random strategy first drops sharply and then declines slowly, while the connection probability of the optimal scheme remains unchanged; the SoRA strategy performs better than the random strategy without sharing and differs little from the optimal strategy.
It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided on a carrier medium such as a disk, CD- or DVD-ROM, programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier, for example. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., or by software executed by various types of processors, or by a combination of hardware circuits and software, e.g., firmware.
The above description is only for the purpose of illustrating the present invention and the appended claims are not to be construed as limiting the scope of the invention, which is intended to cover all modifications, equivalents and improvements that are within the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. The method for the safe calculation unloading and resource allocation of the Internet of vehicles is characterized in that the method for the safe calculation unloading and resource allocation of the Internet of vehicles firstly utilizes a queuing theory to model a dynamic process of a vehicle, and a plurality of dynamic eavesdroppers are arranged in a scene; secondly, modeling the optimization problem into a multi-agent sequential decision problem, and performing multi-agent training solution by using a DDQN reinforcement learning method, so that a user can select a reasonable strategy to minimize the maximum service delay in all vehicles while performing safe unloading.
2. The vehicle networking security computing offloading and resource allocation method of claim 1, wherein the vehicle networking security computing offloading and resource allocation method comprises the steps of:
the method comprises the steps that firstly, a vehicle networking communication scene of a single base station is constructed, and the base station is connected to an edge server to provide calculation unloading service;
secondly, modeling a communication process for transmission processes of different links;
thirdly, modeling an optimized target by utilizing a Wyner eavesdropping coding scheme;
fourthly, the base station obtains the state information of the current moment through its own actions and the surrounding environment information, wherein the state information comprises the information of the target vehicle (including the vehicle speed, the position coordinates and the current state), the frequency band allocation information and the resource allocation information on the edge server, which are used as the state input of the deep reinforcement learning, and the DDQN algorithm is used for the deep reinforcement learning;
fifthly, selecting corresponding actions by the vehicle based on the current state information; the current state action is power selection, frequency band selection and edge server calculation resource block selection;
sixthly, designing a reward mechanism and a structure of the neural network according to the model and the strategy constructed in the second step;
seventhly, extracting input characteristics of the current state by using the DDQN neural network in the fifth step, fitting a Q function to obtain Q values of different actions in various input states, selecting the action in the current state according to an ε-greedy strategy, and training and updating parameters of the neural network by combining with a reward mechanism in the fifth step;
and eighthly, using the trained DDQN network, taking the state information of the current environment as state input, outputting a Q value sequence adopting corresponding actions in the current state, and taking the action with the maximum Q value as a strategy for selecting power, frequency band and edge server computing resources of the target vehicle in the current state.
3. The vehicle networking security computation offload and resource allocation method according to claim 2, wherein the process of the first step is as follows: the arrival process of the vehicle is modeled by using queuing theory, the arrival time interval t of the vehicle obeys a negative exponential distribution, and the probability density function is as follows:
Figure FDA0003547924900000021
where λ is the vehicle arrival rate and t is the time interval between vehicle arrivals.
4. The vehicle networking security computation offload and resource allocation method of claim 2, wherein the second step is performed as follows:
2.1 during communication, the channel gain $g_k$ between the transmitting end and the receiving end is the product of a large-scale fading component $\alpha_k$ and a small-scale fading component $h_k$:
$g_k = \alpha_k h_k$
2.2 the large-scale fading $\alpha_k$ consists of path loss and shadow fading, and the path loss of V2V is divided into LOS and NLOS cases, where in the LOS case:
Figure FDA0003547924900000022
where $f_c$ is the carrier frequency, $d$ is the distance, $d_{BP}$ is the breakpoint (effective) distance, and $h_0$ and $h_1$ are the vehicle heights; the path loss in the NLOS case is:
$PL_{NLOS}(d_1, d_2) = PL_{LOS}(d_1) + 20 - 12.5 n_j + 10 n_j \log_{10} d_2 + 3 \log_{10}(f_c / 5)$
where $n_j = \max(2.8 - 0.0024 d_1, 1.84)$, and $d_1$ and $d_2$ represent the length and width of each road grid in the Manhattan grid layout;
shadow fading of V2V:
Figure FDA0003547924900000023
where $D$ is the updated distance matrix, $D_{corr}$ is 10 on a typical urban road, and $N_S(n)$ is an M × M matrix whose entries are normally distributed with mean 0 and variance 1;
the path loss of V2I is $PL_{V2I} = a + b \log_{10} R$, where $R$ represents the distance between the vehicle and the base station, and $a$ and $b$ are scenario-dependent path loss parameters; shadow fading of V2I:
Figure FDA0003547924900000031
where
Figure FDA0003547924900000032
represents the transmit power from the k-th vehicle user to the base station, $g_{k,B}[m]$ represents the channel gain from the k-th vehicle to the base station on the m-th frequency band, $\sigma^2$ represents the noise power,
Figure FDA0003547924900000033
is the interference experienced while the k-th vehicle user offloads,
Figure FDA0003547924900000034
represents the transmission power of the m-th vehicle performing V2V communication, $g_{m,n}[m]$ represents the channel gain of the m-th vehicle performing V2V communication, $\rho_{k'}[m] = 1$ indicates that this band is used, and $\rho_{k'}[m] = 0$ indicates that this band is not used;
2.3 the rate of the offloading link from the k-th vehicle user to the base station on the m-th sub-channel is
Figure FDA0003547924900000035
Figure FDA0003547924900000036
where $W$ is the channel bandwidth and
Figure FDA0003547924900000037
denotes the signal-to-interference-plus-noise ratio (SINR):
Figure FDA0003547924900000038
where
Figure FDA0003547924900000039
represents the transmit power from the k-th vehicle user to the base station, $g_{k,B}[m]$ represents the channel gain from the k-th vehicle to the base station on the m-th frequency band, $\sigma^2$ represents the noise power,
Figure FDA00035479249000000310
is the interference experienced while the k-th vehicle user offloads,
Figure FDA00035479249000000311
represents the transmission power of the m-th vehicle performing V2V communication, $g_{m,B}[m]$ represents the gain of the interference channel from the m-th vehicle performing V2V communication to the V2I communication, $\rho_{k'}[m] = 1$ indicates that this band is used, and $\rho_{k'}[m] = 0$ indicates that this band is not used;
2.4 the rate at which the n-th eavesdropper eavesdrops on the k-th vehicle user on the m-th sub-band,
Figure FDA00035479249000000312
is expressed as:
Figure FDA00035479249000000313
where
Figure FDA00035479249000000314
represents the transmit power of the k-th vehicle user, $g_{k,n}[m]$ represents the channel gain from the k-th vehicle to the eavesdropper on the m-th frequency band, $\sigma^2$ represents the noise power,
Figure FDA00035479249000000315
is the interference experienced during eavesdropping,
Figure FDA00035479249000000316
represents the transmission power of the m-th vehicle performing V2V communication, $g_{m,n}[m]$ represents the channel gain from the m-th vehicle performing V2V communication to the eavesdropper, $\rho_{k'}[m] = 1$ indicates that this band is used, and $\rho_{k'}[m] = 0$ indicates that this band is not used.
5. The vehicle networking security computation offload and resource allocation method according to claim 2, wherein the third step is as follows:
3.1 the secure offloading rate is expressed as
Figure FDA0003547924900000041
where $v_e$ denotes the set of all eavesdroppers;
3.2 the transmission time from the k-th vehicle user to the base station is
Figure FDA0003547924900000042
where $B_k$ is the size of the computation task and
Figure FDA0003547924900000043
is the secure offloading rate; the time required to compute the task on the edge computing server is:
Figure FDA0003547924900000044
where $B_k$ is the size of the computation task, $z_k[j] = 1$ means that the j-th resource block is allocated to the k-th vehicle user, $z_k[j] = 0$ means that the j-th resource block is not allocated to the k-th vehicle user, $N_{c,j}$ denotes the total number of edge server processing cores, and $u_E$ is the processing rate of each core; the total delay is
Figure FDA0003547924900000045
Figure FDA0003547924900000046
3.3 minimizing the maximum service delay among all vehicles, the objective function is:
Figure FDA0003547924900000047
Subject to:
$C_1$:
Figure FDA0003547924900000048
$C_2$:
Figure FDA0003547924900000049
$C_3$:
Figure FDA00035479249000000410
$C_4$:
Figure FDA00035479249000000411
$C_5$:
Figure FDA00035479249000000412
where $N_u$ is the total number of served vehicles, $N_b$ is the number of edge server resource blocks, $N_c$ is the total number of processing cores and the processing capacity of the MEC server, and $N_p$ is the number of selectable vehicle power levels,
Figure FDA00035479249000000413
means that the k-th vehicle user selects the i-th power level as its transmit power, otherwise
Figure FDA00035479249000000414
$N_c$ is the total number of processing cores and the processing capacity of the MEC server, and $N_u$ is the total number of served vehicles. $C_1$ ensures that the total number of allocated processing cores does not exceed the number of cores of the edge server, the three constraints $C_2$, $C_3$ and $C_4$ ensure that each vehicle user can select only one frequency band, one transmit power and one computing resource block, and $C_5$ specifies that the decision variables of the optimization objective are binary.
6. The vehicle networking security computation offload and resource allocation method according to claim 2, wherein the process of the fifth step is as follows:
5.1 the action space can be represented by three-dimensional coordinates, wherein the x-axis represents frequency band selection, the y-axis represents vehicle transmit power selection, and the z-axis represents selection of a computing resource block on the edge server; if there are $N_a$ choices of frequency band, $N_b$ choices of vehicle power and $N_p$ choices of edge server resource block, then any vehicle requiring service has $N_a \times N_b \times N_p$ possible actions;
5.2 an ε-greedy strategy is adopted to balance exploitation and exploration: at time t, the base station selects the action with the largest Q value with probability 1 − ε, and selects a random action from the action space A with probability ε.
7. The vehicle networking security computation offload and resource allocation method according to claim 2, wherein the process of the sixth step is as follows:
6.1 the reward is divided into $N_w$ levels according to the service delay;
6.2 when the computed offloading rate is too low, the delay is large and the reward is 0.
8. The vehicle networking security computation offload and resource allocation method according to claim 2, wherein the neural network training process of the seventh step is as follows:
7.1 initializing the environment information and the Q-network parameters, and generating vehicle operation data;
7.2 in each training round, updating and acquiring the current vehicle positions and environment state, and resetting the frequency band and power selections and the edge server resource allocation strategy;
7.3 selecting an action for the target vehicle according to the current state information and the ε-greedy algorithm, namely a combination of frequency band selection, vehicle power and edge server resource allocation, and updating the environment information;
7.4 obtaining the action combinations of all target vehicles, and from them the capacity-related reward $r_{c,i}$ and the returned reward $r_t$;
7.5 storing the state, action, reward and next state at time t as a sample in the experience pool;
7.6 when the experience pool holds enough samples, starting to train the model: randomly drawing mini-batches of samples $(s_t, a_t, r_t, s_{t+1})$ from the experience pool, training the network parameters and updating the target network weights.
9. A computer arrangement, characterized in that the computer arrangement comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of the car networking security calculation offloading and resource allocation method according to any of claims 1-7.
10. An information data processing terminal, characterized in that the information data processing terminal is used for executing the steps of the vehicle networking security computing unloading and resource allocation method according to any one of claims 1 to 7.
CN202210253563.6A 2022-03-15 2022-03-15 Internet of vehicles safety calculation unloading and resource allocation method, computer equipment and terminal Pending CN114827947A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210253563.6A CN114827947A (en) 2022-03-15 2022-03-15 Internet of vehicles safety calculation unloading and resource allocation method, computer equipment and terminal

Publications (1)

Publication Number Publication Date
CN114827947A true CN114827947A (en) 2022-07-29

Family

ID=82528510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210253563.6A Pending CN114827947A (en) 2022-03-15 2022-03-15 Internet of vehicles safety calculation unloading and resource allocation method, computer equipment and terminal

Country Status (1)

Country Link
CN (1) CN114827947A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115515101A (en) * 2022-09-23 2022-12-23 西北工业大学 Decoupling Q learning intelligent codebook selection method for SCMA-V2X system
CN115866559A (en) * 2022-11-25 2023-03-28 西安电子科技大学 Non-orthogonal multiple access assisted low-energy-consumption safety unloading method for Internet of vehicles
CN115866559B (en) * 2022-11-25 2024-04-30 西安电子科技大学 Non-orthogonal multiple access auxiliary Internet of vehicles low-energy-consumption safe unloading method

Similar Documents

Publication Publication Date Title
CN113950066B (en) Single server part calculation unloading method, system and equipment under mobile edge environment
CN109068391B (en) Internet of vehicles communication optimization algorithm based on edge calculation and Actor-Critic algorithm
CN111414252B (en) Task unloading method based on deep reinforcement learning
CN114827947A (en) Internet of vehicles safety calculation unloading and resource allocation method, computer equipment and terminal
CN111800828B (en) Mobile edge computing resource allocation method for ultra-dense network
Wu et al. Deep reinforcement learning-based computation offloading for 5G vehicle-aware multi-access edge computing network
CN113687875B (en) Method and device for unloading vehicle tasks in Internet of vehicles
CN112422352B (en) Edge computing node deployment method based on user data hotspot distribution
CN113645273B (en) Internet of vehicles task unloading method based on service priority
CN113359480A (en) Multi-unmanned aerial vehicle and user cooperative communication optimization method based on MAPPO algorithm
CN114885420A (en) User grouping and resource allocation method and device in NOMA-MEC system
CN111988787A (en) Method and system for selecting network access and service placement positions of tasks
US20230104220A1 (en) Radio resource allocation
CN115297171A (en) Edge calculation unloading method and system for cellular Internet of vehicles hierarchical decision
Hu et al. Quantifying the influence of intermittent connectivity on mobile edge computing
CN117098189A (en) Computing unloading and resource allocation method based on GAT hybrid action multi-agent reinforcement learning
CN115278693A (en) CVN (continuously variable transmission) spectrum scheduling method and system based on driving state priority and scene simulation
CN116321293A (en) Edge computing unloading and resource allocation method based on multi-agent reinforcement learning
Mafuta et al. Decentralized resource allocation-based multiagent deep learning in vehicular network
Jiao et al. Deep reinforcement learning-based optimization for RIS-based UAV-NOMA downlink networks
CN112445617A (en) Load strategy selection method and system based on mobile edge calculation
CN116996938A (en) Internet of vehicles task unloading method, terminal equipment and storage medium
CN116963034A (en) Emergency scene-oriented air-ground network distributed resource scheduling method
CN116916272A (en) Resource allocation and task unloading method and system based on automatic driving automobile network
CN116367231A (en) Edge computing Internet of vehicles resource management joint optimization method based on DDPG algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination