CN111465032A - Task offloading method and system based on A3C algorithm in multi-wireless body area network environment - Google Patents
Task offloading method and system based on A3C algorithm in multi-wireless body area network environment
- Publication number: CN111465032A (application CN202010221507.5A)
- Authority: CN (China)
- Prior art keywords: task, network, classifier, body area, decision
- Legal status: Granted
Classifications
- H04W16/22—Network planning: traffic simulation tools or models
- H04W52/0216—Power saving arrangements in terminal devices managed by the network, using a pre-established activity schedule, e.g. traffic indication frame
- H04W84/18—Network topologies: self-organising networks, e.g. ad-hoc networks or sensor networks
- Y02D30/70—Reducing energy consumption in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a task offloading method and system based on the A3C algorithm in a multi-wireless body area network environment. The method comprises the following steps: determining the network architecture of the multi-wireless body area network and initializing the network parameters; training a task classifier on sampled physiological data to obtain a stable classifier model; training on the network resource allocation problem with the A3C deep reinforcement learning algorithm to obtain a converged decision network; and offloading tasks according to the obtained models: at each moment, tasks are first classified with the classifier model, and then user channel access and edge-server computing resource allocation are performed according to the decision network. The method improves the delay and energy-consumption performance of task offloading in multi-wireless body area networks and can be widely applied to practical body area network scenarios such as telemedicine and health monitoring.
Description
Technical Field
The invention belongs to the field of wireless communication networks, and particularly relates to a task offloading method and system based on the A3C algorithm in a multi-wireless body area network environment.
Background
A wireless body area network is a wireless sensor network whose monitoring object is the human body. Because people move, inter-network interference arises easily when multiple body area networks coexist, and how to collect and manage data across multiple networks is an important research direction for body area networks. Current research shows that body area networks are characterized by mobility, intensive computation and low delay, and that task offloading can be assisted by edge computing: base stations equipped with edge servers are placed at the edge of the networks to collect and process tasks in a unified manner. Because a body area network that monitors a person imposes strict requirements on delay and energy consumption, a reasonable task offloading method must be designed to guarantee low-delay, low-energy data transmission.
Existing research on data transmission between multiple body area networks and a data center is mostly based on generic communication networks, without targeted study that combines the data characteristics and user characteristics of body area networks. In fact, the physiological data monitored by a body area network carries important practical meaning, and the movement trajectories of body area network users have their own characteristics. Because existing offloading methods do not consider these characteristics, they cannot meet the strict delay and energy-consumption requirements of wireless body area networks.
Disclosure of Invention
The invention aims to provide a task offloading method and system for a multi-wireless body area network environment, so that the task state and the mobility characteristics of users are fully considered when tasks are offloaded, thereby achieving lower system delay and energy consumption.
The technical solution for realizing the purpose of the invention is as follows: a task offloading method based on the A3C algorithm in a multi-wireless body area network environment, the method comprising the following steps:
step 1, constructing the network architectures of a plurality of wireless body area networks and initializing network parameters;
step 2, collecting physiological data of a user, training a classifier according to the data, and obtaining a task classifier;
step 3, training on the resource allocation problem during task offloading by using an A3C algorithm to obtain a decision network;
and step 4, offloading the tasks of the multi-wireless body area network according to the obtained task classifier and the decision network.
Further, in the network architecture of the multiple wireless body area networks in step 1, the network parameters include the user set, the base station set, the RGMM mobility model parameters of each user, the base station locations l_s = (x_s, y_s), the channel gains h_{d,s}(t), the data transmission rates R_{d,s}, the task categories β_d ∈ {0,1}, the task offloading energy consumption e_d, and the task offloading delay t_d.
Further, the classifier training in step 2 obtains the user task classifier, and the specific process includes:
Step 2-1, estimating a stationary interval for each physiological characteristic using the t-distribution; for a physiological characteristic x, the upper limit x_up and the lower limit x_low of the stationary interval are computed as a t-distribution confidence bound around the sample mean, where the sample mean and standard deviation s_x correspond to x, n is the number of physiological data samples for characteristic x, and t_{α,n-1} is the t-distribution coefficient for sample size n;
Step 2-2, adding a label to each physiological data sample according to its physiological characteristic, specifically: adding a label 0 to a physiological data sample inside the stationary interval to represent a normal task, and adding a label 1 to a physiological data sample outside the stationary interval to represent an emergency task.
Step 2-3, inputting the physiological data samples processed in step 2-2 into a support vector machine classifier for training to obtain the task classifier, i.e., a model that takes a data sample as input and outputs its task category.
Further, in step 3, the resource allocation problem during task offloading is trained by using the A3C algorithm, and the specific process includes:
Step 3-1, converting the resource allocation problem into a Markov decision problem, where the Markov decision problem model, i.e., the decision network, specifically comprises the state s_t, the action a_t and the reward value r_t;
the state s_t is set as {b_d(t), β_d(t), l_d(t), E_d(t)}, where the first two terms b_d(t) and β_d(t) are quantities related to the task data, representing the task data size and the task category flag respectively, the third term l_d(t) is the location state of user d, and the fourth term E_d(t) is the energy state;
the action a_t is set as α_{d,s} ∈ {0,1} and f_{d,s}, where α_{d,s} indicates whether the task of user d is offloaded to base station s and f_{d,s} is the computing resource allocated by base station s to user d;
the reward value r_t is set according to the system benefit K_d, where t_static and e_static denote the delay and energy consumption under a static allocation method, t_d and e_d denote the time for user d to complete its task and its total energy consumption, and the delay and energy-consumption weight factors balance the two terms;
Step 3-2, training the decision network, specifically: according to the current state s_t, the decision network determines the action a_t in this state, i.e., which base station each user should access and the computing resources allocated by that base station; the system then enters a new state and obtains the reward r_t, yielding an experience sequence (s_t, a_t, r_t); the advantage function A(s_t, a_t) = Q(s_t, a_t) - V(s_t) is defined to represent how advantageous action a_t is in state s_t, where Q(s_t, a_t) is the action-value (Q) function, V(s_t) is the value function, γ is the discount factor, and π_ω is the offloading decision policy;
the decision network parameters are updated iteratively until the reward function of the decision network converges, with a policy-gradient update proportional to E[∇_w log π_w(s_t, a_t) A(s_t, a_t)], where π_w(s_t, a_t) denotes the probability of selecting action a_t in state s_t, θ denotes the parameters of the decision network, E denotes the expectation (mean) operator, and ∇_w is the gradient operator.
Further, in step 4, task offloading for the multi-wireless body area network is performed according to the obtained task classifier and decision network, and the specific process includes: at each moment, classifying tasks with the trained task classifier, inputting the state of the multi-body-area-network system (including the classification result) into the decision network, and the network outputting which base station each user's channel accesses and the computing resources allocated by the base stations.
A task offloading system based on A3C algorithm in a multi-wireless body area network environment, the system comprising:
the network construction module is used for constructing network architectures of a plurality of wireless body area networks and initializing network parameters;
the task classifier generating module is used for acquiring physiological data of the user and training a classifier according to the data to obtain a task classifier;
the decision network generation module is used for training on the resource allocation problem during task offloading by utilizing the A3C algorithm to obtain a decision network;
and the task offloading module is used for offloading the tasks of the multi-wireless body area network according to the obtained task classifier and the decision network.
Compared with the prior art, the invention has the following notable advantages: 1) it comprehensively considers the data characteristics of the wireless body area network and the mobility characteristics of its users, reducing the delay and energy consumption of system task offloading; 2) it adopts the A3C algorithm based on deep reinforcement learning to optimize the task offloading process of the multi-wireless body area network, enabling intelligent, autonomous and dynamic offloading even when the system environment is unknown.
The present invention is described in further detail below with reference to the attached drawing figures.
Drawings
Fig. 1 is a flow diagram of a method for task offloading based on the A3C algorithm in a multi-wireless body area network environment, under an embodiment.
FIG. 2 is a flow diagram of training a task classifier in one embodiment.
Figure 3 is a diagram of a multi-wireless body area network architecture in one embodiment.
FIG. 4 is a graph of training benefit variation of the A3C algorithm in one embodiment.
FIG. 5 is a graph of variation in training benefit based on a greedy algorithm in one embodiment.
Detailed Description
In one embodiment, in conjunction with fig. 1, there is provided a method for task offloading based on the A3C algorithm in a multi-wireless body area network environment, the method comprising the following steps:
step 1, constructing the network architectures of a plurality of wireless body area networks and initializing network parameters;
step 2, collecting physiological data of a user, training a classifier according to the data, and obtaining a task classifier;
step 3, training on the resource allocation problem during task offloading by using an A3C algorithm to obtain a decision network;
and step 4, offloading the tasks of the multi-wireless body area network according to the obtained task classifier and the decision network.
Further, in one embodiment, the network parameters of the network architecture of the multiple wireless body area networks in step 1 include the user set, the base station set, the RGMM mobility model parameters of each user, the base station locations l_s = (x_s, y_s), the channel gains h_{d,s}(t), the data transmission rates R_{d,s}, the task categories β_d ∈ {0,1}, the task offloading energy consumption e_d, and the task offloading delay t_d.
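As an illustrative, non-limiting example, the network parameters listed above could be initialized in code roughly as follows. The container name NetworkParams, its field names, and the concrete values (20 users, 4 base stations, the placeholder coordinates) are assumptions made for illustration only; the RGMM mobility model and the channel-gain and data-rate computations are left as empty placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class NetworkParams:
    """Illustrative container for the multi-WBAN parameters of step 1 (assumed names)."""
    users: List[int]                                  # user set, indexed by d
    base_stations: List[int]                          # base station set, indexed by s
    bs_locations: Dict[int, Tuple[float, float]]      # l_s = (x_s, y_s)
    rgmm_params: Dict[int, dict]                      # RGMM mobility model parameters per user
    channel_gain: Dict[Tuple[int, int], float] = field(default_factory=dict)  # h_{d,s}(t)
    data_rate: Dict[Tuple[int, int], float] = field(default_factory=dict)     # R_{d,s}
    task_category: Dict[int, int] = field(default_factory=dict)               # beta_d in {0, 1}

params = NetworkParams(
    users=list(range(20)),                            # 20 users, as in the embodiment below
    base_stations=list(range(4)),                     # number of base stations is illustrative
    bs_locations={s: (100.0 * s, 0.0) for s in range(4)},
    rgmm_params={d: {} for d in range(20)},
)
```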
Further, in one embodiment, with reference to fig. 2, the training of the classifier in step 2 to obtain the user task classifier includes:
Step 2-1, estimating a stationary interval for each physiological characteristic using the t-distribution; for a physiological characteristic x, the upper limit x_up and the lower limit x_low of the stationary interval are computed as a t-distribution confidence bound around the sample mean, where the sample mean and standard deviation s_x correspond to x, n is the number of physiological data samples for characteristic x, and t_{α,n-1} is the t-distribution coefficient for sample size n;
Step 2-2, adding a label to each physiological data sample according to its physiological characteristic, specifically: adding a label 0 to a physiological data sample inside the stationary interval to represent a normal task, and adding a label 1 to a physiological data sample outside the stationary interval to represent an emergency task.
Step 2-3, inputting the physiological data samples processed in step 2-2 into a support vector machine classifier for training to obtain the task classifier, i.e., a model that takes a data sample as input and outputs its task category.
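As an illustrative, non-limiting sketch of steps 2-1 to 2-3, the code below estimates a stationary interval using a t-distribution coefficient, labels samples inside the interval as normal (0) and outside as emergency (1), and trains a support vector machine. The interval form mean ± t_{α,n-1}·s_x, the significance level α = 0.05, and the synthetic heart-rate data are assumptions for illustration; the patent's own interval formula is not reproduced here.

```python
import numpy as np
from scipy import stats
from sklearn.svm import SVC

def stationary_interval(samples, alpha=0.05):
    """Step 2-1: stationary interval of one physiological characteristic.
    Assumed form: mean +/- t_{alpha, n-1} * std (illustrative only)."""
    x = np.asarray(samples, dtype=float)
    n = len(x)
    mean, std = x.mean(), x.std(ddof=1)
    t_coef = stats.t.ppf(1 - alpha / 2, df=n - 1)   # t-distribution coefficient t_{alpha, n-1}
    return mean - t_coef * std, mean + t_coef * std

def label_samples(samples, low, up):
    """Step 2-2: label 0 = normal task (inside interval), 1 = emergency task (outside)."""
    x = np.asarray(samples, dtype=float)
    return ((x < low) | (x > up)).astype(int)

# Step 2-3: train a support vector machine on the labeled samples.
rng = np.random.default_rng(0)
heart_rate = rng.normal(75, 8, size=500)            # synthetic physiological data
low, up = stationary_interval(heart_rate)
labels = label_samples(heart_rate, low, up)
clf = SVC(kernel="rbf").fit(heart_rate.reshape(-1, 1), labels)
print(clf.predict([[72.0], [98.0]]))                # classify a typical and an abnormally high reading
```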
Further, in one embodiment, the resource allocation problem during task offloading is trained with the A3C algorithm in step 3 to obtain a decision network, and the specific process includes:
Step 3-1, converting the resource allocation problem into a Markov decision problem, where the Markov decision problem model, i.e., the decision network, specifically comprises the state s_t, the action a_t and the reward value r_t;
the state s_t is set as {b_d(t), β_d(t), l_d(t), E_d(t)}, where the first two terms b_d(t) and β_d(t) are quantities related to the task data, representing the task data size and the task category flag respectively, the third term l_d(t) is the location state of user d, and the fourth term E_d(t) is the energy state;
the action a_t is set as α_{d,s} ∈ {0,1} and f_{d,s}, where α_{d,s} indicates whether the task of user d is offloaded to base station s and f_{d,s} is the computing resource allocated by base station s to user d;
the reward value r_t is set according to the system benefit K_d, where t_static and e_static denote the delay and energy consumption under a static allocation method, t_d and e_d denote the time for user d to complete its task and its total energy consumption, and the delay and energy-consumption weight factors balance the two terms; one possible form is sketched below.
Step 3-2, training the decision network, specifically: according to the current state s_t, the decision network determines the action a_t in this state, i.e., which base station each user should access and the computing resources allocated by that base station; the system then enters a new state and obtains the reward r_t, yielding an experience sequence (s_t, a_t, r_t); the advantage function A(s_t, a_t) = Q(s_t, a_t) - V(s_t) is defined to represent how advantageous action a_t is in state s_t, where Q(s_t, a_t) is the action-value (Q) function, V(s_t) is the value function, γ is the discount factor, and π_ω is the offloading decision policy;
the decision network parameters are updated iteratively until the reward function of the decision network converges, with a policy-gradient update proportional to E[∇_w log π_w(s_t, a_t) A(s_t, a_t)], where π_w(s_t, a_t) denotes the probability of selecting action a_t in state s_t, θ denotes the parameters of the decision network, E denotes the expectation (mean) operator, and ∇_w is the gradient operator.
Further, in one embodiment, task offloading for the multi-wireless body area network is performed in step 4 according to the obtained task classifier and decision network, and the specific process includes: at each moment, classifying tasks with the trained task classifier, inputting the state of the multi-body-area-network system (including the classification result) into the decision network, and the network outputting which base station each user's channel accesses and the computing resources allocated by the base stations.
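Combining the two trained models, one per-time-step offloading decision as described above could be sketched as follows. This is a non-limiting illustration that reuses the clf and net objects from the two previous sketches; build_state is a placeholder encoding of the state, and the continuous resource-allocation part of the action is again omitted.

```python
import torch

def build_state(task_size, task_flag, location, energy):
    """Placeholder encoding of s_t = {b_d(t), beta_d(t), l_d(t), E_d(t)} as a vector."""
    return torch.tensor([task_size, float(task_flag), location, energy])

def offload_step(physiological_sample, task_size, location, energy):
    """One decision step: classify the task, then query the decision network."""
    task_flag = int(clf.predict([[physiological_sample]])[0])   # 0 = normal, 1 = emergency
    state = build_state(task_size, task_flag, location, energy)
    with torch.no_grad():
        logits, _ = net(state)
    base_station = int(torch.argmax(logits))                    # greedy choice at decision time
    return task_flag, base_station

print(offload_step(physiological_sample=118.0, task_size=0.5, location=0.2, energy=0.9))
```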
A task offloading system based on A3C algorithm in a multi-wireless body area network environment, the system comprising:
the network construction module is used for constructing network architectures of a plurality of wireless body area networks and initializing network parameters;
the task classifier generating module is used for acquiring physiological data of the user and training a classifier according to the data to obtain a task classifier;
the decision network generation module is used for training on the resource allocation problem during task offloading by utilizing the A3C algorithm to obtain a decision network;
and the task offloading module is used for offloading the tasks of the multi-wireless body area network according to the obtained task classifier and the decision network.
Further, in one embodiment, the task classifier generating module includes:
a stationary interval setting unit, used for estimating a stationary interval for each physiological characteristic using the t-distribution; for a physiological characteristic x, the upper limit x_up and the lower limit x_low of the stationary interval are computed as a t-distribution confidence bound around the sample mean, where the sample mean and standard deviation s_x correspond to x, n is the number of physiological data samples for characteristic x, and t_{α,n-1} is the t-distribution coefficient for sample size n;
the task labeling unit is used for adding a label to each physiological data sample according to its physiological characteristic, specifically: adding a label 0 to a physiological data sample inside the stationary interval to represent a normal task, and adding a label 1 to a physiological data sample outside the stationary interval to represent an emergency task.
And the classifier training unit is used for inputting the physiological data samples processed by the task labeling unit into a support vector machine classifier for training to obtain the task classifier, i.e., a model that takes a data sample as input and outputs its task category.
Further, in one embodiment, the decision network generating module includes:
a decision network construction unit, configured to convert the resource allocation problem into a Markov decision problem, where the Markov decision problem model, i.e., the decision network, specifically comprises the state s_t, the action a_t and the reward value r_t;
the state s_t is set as {b_d(t), β_d(t), l_d(t), E_d(t)}, where the first two terms b_d(t) and β_d(t) are quantities related to the task data, representing the task data size and the task category flag respectively, the third term l_d(t) is the location state of user d, and the fourth term E_d(t) is the energy state;
the action a_t is set as α_{d,s} ∈ {0,1} and f_{d,s}, where α_{d,s} indicates whether the task of user d is offloaded to base station s and f_{d,s} is the computing resource allocated by base station s to user d;
the reward value r_t is set according to the system benefit K_d, where t_static and e_static denote the delay and energy consumption under a static allocation method, t_d and e_d denote the time for user d to complete its task and its total energy consumption, and the delay and energy-consumption weight factors balance the two terms;
and the decision network training unit is used for training the decision network, specifically: according to the current state s_t, the decision network determines the action a_t in this state, i.e., which base station each user should access and the computing resources allocated by that base station; the system then enters a new state and obtains the reward r_t, yielding an experience sequence (s_t, a_t, r_t); the advantage function A(s_t, a_t) = Q(s_t, a_t) - V(s_t) is defined to represent how advantageous action a_t is in state s_t, where Q(s_t, a_t) is the action-value (Q) function, V(s_t) is the value function, γ is the discount factor, and π_ω is the offloading decision policy;
the decision network parameters are updated iteratively until the reward function of the decision network converges, with a policy-gradient update proportional to E[∇_w log π_w(s_t, a_t) A(s_t, a_t)], where π_w(s_t, a_t) denotes the probability of selecting action a_t in state s_t, θ denotes the parameters of the decision network, E denotes the expectation (mean) operator, and ∇_w is the gradient operator.
In one embodiment, the invention is further explained and verified with a specific example, as follows:
First, a multi-wireless body area network system is built according to the architecture of fig. 3 and the network parameters are initialized. Then, using the collected human physiological data, the stationary intervals are computed, the data labels are added and the classifier is trained as in step 2. On the basis of these data, the task offloading method based on the A3C algorithm is trained.
According to step 3-1, the state s_t, action a_t and reward r_t of the task offloading problem in this embodiment are modelled. Because a body area network aimed at health monitoring has stricter requirements on delay, the delay weight factor in step 3-1 is set larger than the energy-consumption weight factor. The decision network is then trained with the A3C algorithm according to step 3-2. The algorithm parameters are set as follows: the discount factor γ is 0.99 and the learning rate is 0.001.
In the training phase, after each task offloading finishes, the state vector s_t of the system is fed into the decision network, which outputs the offloading action for the next moment; the resulting delay and energy consumption are fed back to the decision network in the form of reward values. These values are recorded, the advantage function A(s_t, a_t) is computed, and the parameters of the decision network are updated, until the average reward converges.
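A high-level, single-worker training loop matching this description might look like the following sketch. The environment object env stands for a simulator of the multi-wireless body area network (its reset/step interface is assumed, not provided by the patent), and select_action and update are the illustrative functions from the earlier actor-critic sketch.

```python
import collections
import torch

def train(env, episodes=3000, window=100, tol=1e-3):
    """Run offloading episodes, feed the delay/energy rewards back to the decision
    network, and stop early once the moving-average reward stops changing."""
    recent = collections.deque(maxlen=window)
    prev_avg = float("-inf")
    for episode in range(episodes):
        state = torch.as_tensor(env.reset(), dtype=torch.float32)
        done, total_reward = False, 0.0
        while not done:
            action = select_action(state)                        # pi_w chooses a_t
            next_state, reward, done = env.step(action)          # assumed simulator interface
            next_state = torch.as_tensor(next_state, dtype=torch.float32)
            update(state, action, torch.tensor(reward), next_state)
            state, total_reward = next_state, total_reward + reward
        recent.append(total_reward)
        avg = sum(recent) / len(recent)
        if len(recent) == window and abs(avg - prev_avg) < tol:
            break                                                # average reward has converged
        prev_avg = avg
    return avg
```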
Fig. 4 and fig. 5 show how the system delay and energy-consumption benefit change when this embodiment adopts the A3C-based offloading method (A3C-based Offloading and Joint Resource Allocation, AOJRA) and the conventional offloading method, respectively. The conventional method is an offloading method based on a greedy strategy (GOJRA).
In fig. 4, over 3000 training episodes, the system benefit of the AOJRA method starts at around 0.8, rises rapidly with continued training, and stabilizes at around 7 after roughly 2000 episodes. According to the definition of the system benefit function in step 3-1, a benefit value of 7 means that the total delay and energy-consumption benefit relative to the static allocation method (SORA) is 7. Since this embodiment has 20 users, the total benefit averages 0.35 per user, meaning that compared with the SORA method the AOJRA method improves each user's delay and energy-consumption performance by 35% on average. A similar analysis of fig. 5 shows that the GOJRA method improves each user's delay and energy-consumption performance by 29% on average compared with the SORA method.
Compared with the traditional GOJRA method, the AOJRA method improves user delay and energy-consumption performance further: it considers not only the channel gain during task offloading but also the mutual interference between users transmitting data at the same time, and it can effectively avoid the network congestion and the extra delay and energy consumption that arise when a large number of users select the same base station simultaneously and its computing resources become insufficient.
In conclusion, by taking the data characteristics of the wireless body area network and the mobility characteristics of its users into account, the method reduces the delay and energy consumption of system task offloading. The invention enables wireless body area networks to serve people more promptly and can be widely applied in practical body area network scenarios such as telemedicine and health monitoring.
Claims (8)
1. A task offloading method based on A3C algorithm in a multi-wireless body area network environment, characterized by comprising the following steps:
step 1, constructing the network architectures of a plurality of wireless body area networks and initializing network parameters;
step 2, collecting physiological data of a user, training a classifier according to the data, and obtaining a task classifier;
step 3, training on the resource allocation problem during task offloading by using an A3C algorithm to obtain a decision network;
and step 4, offloading the tasks of the multi-wireless body area network according to the obtained task classifier and the decision network.
2. The method of claim 1, wherein the network parameters of the network architecture of the plurality of wireless body area networks of step 1 include the user set, the base station set, the RGMM mobility model parameters of each user, the base station locations l_s = (x_s, y_s), the channel gains h_{d,s}(t), the data transmission rates R_{d,s}, the task categories β_d ∈ {0,1}, the task offloading energy consumption e_d, and the task offloading delay t_d.
3. The method for task offloading based on A3C algorithm in a multi-wireless body area network environment according to claim 1 or 2, wherein the step 2 of training the classifier to obtain the user task classifier comprises the following specific steps:
step 2-1, estimating a stationary interval of each physiological characteristic using the t-distribution, wherein for a physiological characteristic x the upper limit x_up and the lower limit x_low of the stationary interval are computed as a t-distribution confidence bound around the sample mean, the sample mean and standard deviation s_x correspond to x, n is the number of physiological data samples for characteristic x, and t_{α,n-1} is the t-distribution coefficient for sample size n;
step 2-2, adding a label to each physiological data sample according to its physiological characteristic, specifically: adding a label 0 to a physiological data sample inside the stationary interval to represent a normal task, and adding a label 1 to a physiological data sample outside the stationary interval to represent an emergency task;
step 2-3, inputting the physiological data samples processed in step 2-2 into a support vector machine classifier for training to obtain the task classifier, i.e., a model that takes a data sample as input and outputs its task category.
4. The method for task offloading based on A3C algorithm in a multi-wireless body area network environment according to claim 3, wherein step 3 trains on the resource allocation problem during task offloading by using an A3C algorithm to obtain a decision network, and the specific process includes:
step 3-1, converting the resource allocation problem into a Markov decision problem, where the Markov decision problem model, i.e., the decision network, specifically comprises the state s_t, the action a_t and the reward value r_t;
the state s_t is set as {b_d(t), β_d(t), l_d(t), E_d(t)}, where the first two terms b_d(t) and β_d(t) are quantities related to the task data, representing the task data size and the task category flag respectively, the third term l_d(t) is the location state of user d, and the fourth term E_d(t) is the energy state;
the action a_t is set as α_{d,s} ∈ {0,1} and f_{d,s}, where α_{d,s} indicates whether the task of user d is offloaded to base station s and f_{d,s} is the computing resource allocated by base station s to user d;
the reward value r_t is set according to the system benefit K_d, where t_static and e_static denote the delay and energy consumption under a static allocation method, t_d and e_d denote the time for user d to complete its task and its total energy consumption, and the delay and energy-consumption weight factors balance the two terms;
step 3-2, training the decision network, specifically: according to the current state s_t, the decision network determines the action a_t in this state, i.e., which base station each user should access and the computing resources allocated by that base station; the system then enters a new state and obtains the reward r_t, yielding an experience sequence (s_t, a_t, r_t); the advantage function A(s_t, a_t) = Q(s_t, a_t) - V(s_t) is defined to represent how advantageous action a_t is in state s_t, where Q(s_t, a_t) is the action-value (Q) function, V(s_t) is the value function, γ is the discount factor, and π_ω is the offloading decision policy;
and iteratively updating the decision network parameters according to the policy-gradient update formula until the reward function of the decision network converges.
5. The method for task offloading based on A3C algorithm in a multi-wireless body area network environment according to claim 4, wherein the step 4 of offloading the tasks of the multiple wireless body area networks according to the obtained task classifier and decision network comprises: at each moment, classifying tasks with the trained task classifier, inputting the state of the multi-body-area-network system (including the classification result) into the decision network, and the network outputting which base station each user's channel accesses and the computing resources allocated by the base stations.
6. A task offloading system based on A3C algorithm in a multi-wireless body area network environment, the system comprising:
the network construction module is used for constructing network architectures of a plurality of wireless body area networks and initializing network parameters;
the task classifier generating module is used for acquiring physiological data of the user and training a classifier according to the data to obtain a task classifier;
the decision network generation module is used for training on the resource allocation problem during task offloading by utilizing the A3C algorithm to obtain a decision network;
and the task offloading module is used for offloading the tasks of the multi-wireless body area network according to the obtained task classifier and the decision network.
7. The system of claim 6, wherein the task classifier generation module comprises:
a stationary interval setting unit for estimating a stationary interval of each physiological characteristic using the t-distribution, wherein for a physiological characteristic x the upper limit x_up and the lower limit x_low of the stationary interval are computed as a t-distribution confidence bound around the sample mean, the sample mean and standard deviation s_x correspond to x, n is the number of physiological data samples for characteristic x, and t_{α,n-1} is the t-distribution coefficient for sample size n;
the task labeling unit is used for adding a label to each physiological data sample according to its physiological characteristic, specifically: adding a label 0 to a physiological data sample inside the stationary interval to represent a normal task, and adding a label 1 to a physiological data sample outside the stationary interval to represent an emergency task;
and the classifier training unit is used for inputting the physiological data samples processed by the task labeling unit into a support vector machine classifier for training to obtain the task classifier, i.e., a model that takes a data sample as input and outputs its task category.
8. The system for task offloading based on the A3C algorithm in a multi-wireless body area network environment of claim 7, wherein the decision network generation module comprises:
a decision network construction unit, configured to convert the resource allocation problem into a Markov decision problem, where the Markov decision problem model, i.e., the decision network, specifically comprises the state s_t, the action a_t and the reward value r_t;
the state s_t is set as {b_d(t), β_d(t), l_d(t), E_d(t)}, where the first two terms b_d(t) and β_d(t) are quantities related to the task data, representing the task data size and the task category flag respectively, the third term l_d(t) is the location state of user d, and the fourth term E_d(t) is the energy state;
the action a_t is set as α_{d,s} ∈ {0,1} and f_{d,s}, where α_{d,s} indicates whether the task of user d is offloaded to base station s and f_{d,s} is the computing resource allocated by base station s to user d;
the reward value r_t is set according to the system benefit K_d, where t_static and e_static denote the delay and energy consumption under a static allocation method, t_d and e_d denote the time for user d to complete its task and its total energy consumption, and the delay and energy-consumption weight factors balance the two terms;
and the decision network training unit is used for training the decision network, specifically: according to the current state s_t, the decision network determines the action a_t in this state, i.e., which base station each user should access and the computing resources allocated by that base station; the system then enters a new state and obtains the reward r_t, yielding an experience sequence (s_t, a_t, r_t); the advantage function A(s_t, a_t) = Q(s_t, a_t) - V(s_t) is defined to represent how advantageous action a_t is in state s_t, where Q(s_t, a_t) is the action-value (Q) function, V(s_t) is the value function, γ is the discount factor, and π_ω is the offloading decision policy;
the decision network parameters being updated iteratively according to the policy-gradient update formula until the reward function of the decision network converges.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010221507.5A (CN111465032B) | 2020-03-26 | 2020-03-26 | Task offloading method and system based on A3C algorithm in multi-wireless body area network environment |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010221507.5A (CN111465032B) | 2020-03-26 | 2020-03-26 | Task offloading method and system based on A3C algorithm in multi-wireless body area network environment |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN111465032A (en) | 2020-07-28 |
| CN111465032B (en) | 2023-04-21 |
Family
ID=71680230
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010221507.5A (granted as CN111465032B, Active) | Task offloading method and system based on A3C algorithm in multi-wireless body area network environment | 2020-03-26 | 2020-03-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111465032B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112241295A (en) * | 2020-10-28 | 2021-01-19 | 深圳供电局有限公司 | Cloud edge cooperative computing unloading method and system based on deep reinforcement learning |
CN113645637A (en) * | 2021-07-12 | 2021-11-12 | 中山大学 | Method and device for unloading tasks of ultra-dense network, computer equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109219101A (en) * | 2018-09-21 | 2019-01-15 | 南京理工大学 | Method for routing foundation based on Double moving average predicted method in wireless body area network |
CN110557769A (en) * | 2019-09-12 | 2019-12-10 | 南京邮电大学 | C-RAN calculation unloading and resource allocation method based on deep reinforcement learning |
Also Published As
Publication number | Publication date |
---|---|
CN111465032B (en) | 2023-04-21 |
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant