CN115002409A - Dynamic task scheduling method for video detection and tracking - Google Patents

Dynamic task scheduling method for video detection and tracking Download PDF

Info

Publication number
CN115002409A
CN115002409A CN202210551198.7A CN202210551198A CN115002409A CN 115002409 A CN115002409 A CN 115002409A CN 202210551198 A CN202210551198 A CN 202210551198A CN 115002409 A CN115002409 A CN 115002409A
Authority
CN
China
Prior art keywords
decision
slot
frame
terminal device
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210551198.7A
Other languages
Chinese (zh)
Other versions
CN115002409B (en
Inventor
王晓飞
王义兰
刘志成
赵云凤
仇超
张程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202210551198.7A priority Critical patent/CN115002409B/en
Publication of CN115002409A publication Critical patent/CN115002409A/en
Application granted granted Critical
Publication of CN115002409B publication Critical patent/CN115002409B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Abstract

The invention discloses a dynamic task scheduling method for video detection and tracking, which comprises the following steps: constructing a real-time target detection system comprising a plurality of terminal devices and an edge server, wherein a target tracker is arranged in the terminal devices, and a target detector is arranged in the edge server; constructing a combined optimization problem of a video frame unloading decision, a channel decision and a frame interval decision in a real-time target detection system as a Markov decision problem; in each decision time slot, each terminal device sends tracking precision, head frame information and video content change rate to an edge server, and the edge server constructs a joint decision model by using a DDQN deep reinforcement learning algorithm; and with the maximum gain function as a target, solving the joint optimization problem by using a joint decision model, and executing the terminal equipment according to the video frame unloading decision, the channel decision and the frame interval decision output by the edge server. The invention achieves the maximization of the accuracy of video frame detection under the delay limit.

Description

Dynamic task scheduling method for video detection and tracking
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a dynamic task scheduling method for video detection and tracking.
Background
The advanced machine vision is introduced into the terminal equipment of the Internet of things, so that wide autonomous depth vision application such as traffic monitoring, automatic driving, unmanned aerial vehicle scene analysis, robot vision and the like can be realized. In these applications, the ability of the terminal to detect objects from captured video frames is of paramount importance. However, in order to achieve accurate target detection, the target detection model usually has a complex structure and numerous parameters, and the calculation and storage requirements of the terminal device are high. Therefore, running a full-scale target detection model on a terminal device with limited resources is a challenge, and it is often difficult to meet the requirement of real-time performance, and even heat dissipation problems are encountered. Meanwhile, if the compression model is locally run, although the workload of a Deep Learning (DL) model can be greatly reduced, due to a basic tradeoff between the model size and the model precision, the techniques often result in the reduction of the model precision.
With the advent of 5G networks, offloading the computationally intensive object detection tasks to an edge server for execution has become a promising solution. And the edge server operates the large model, so that accurate detection is realized, and finally, the detection result is transmitted back to the terminal equipment. Some recent efforts have employed a Detection Based Tracking (DBT) approach, specifically running the target detector periodically on some video frames, while processing the frames in between using a lightweight target tracker. Therefore, the DBT-based framework has received more and more attention to real-time video frame detection and analysis. However, most existing solutions based on DBT consider a scenario where one edge server serves a single terminal device and has sufficient transmission resources when designing an offloading policy, and ignore a scenario where one edge server serves multiple terminal devices and limited communication resources negatively affect the offloading performance of competing terminal devices; in addition, most existing solutions based on the DBT adopt tracking of each frame when designing a tracking strategy of the terminal device, and neglect error accumulation of delay caused by tracking of each frame on a detection result; moreover, the conventional technical scheme based on the DBT is completely based on experimental evaluation to realize cooperative detection, system optimization is less realized through theoretical modeling, and specific model encapsulation, modeling and expression cannot be performed on the cooperative detection of the terminal equipment and the edge server.
Disclosure of Invention
Aiming at the technical problems, the invention provides a dynamic task scheduling method facing video detection and tracking. In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
a dynamic task scheduling method for video detection and tracking comprises the following steps:
s1, constructing a real-time target detection system comprising a plurality of terminal devices and an edge server, wherein the terminal devices are provided with target trackers, and the edge server is provided with target detectors;
s2, constructing a combined optimization problem of video frame unloading decision, channel decision and frame interval decision in the real-time target detection system into a Markov decision problem;
the video frame unloading decision means that the team first frame of the terminal equipment is continuously waited in a local queue of the terminal equipment when a slot is formed in each decision, is immediately unloaded to an edge server for detection or directly outputs a tracking result, the channel decision means whether the terminal equipment output by the edge server is allocated to a channel, and the frame interval decision means the number of frames of an interval between the next decision slot team first frame when the slot is formed in the current decision of the terminal equipment output by the edge server and the slot next-in-line first frame when the slot is formed in the current decision;
s3, each terminal device sends the tracking precision, the head frame information and the video content change rate to an edge server in each decision time slot, and the edge server constructs a joint decision model by using a DDQN deep reinforcement learning algorithm;
and S4, with the maximum gain function as a target, solving the joint optimization problem by using the joint decision model constructed in the step S3, and executing the terminal equipment according to the video frame unloading decision, the channel decision and the frame interval decision output by the edge server.
The step S2 includes the steps of:
s2.1, constructing a state space, wherein the expression of the state space is as follows:
S n (t)=(M n (t),h n (t),p n (t),v n (t));
in the formula, M n (t) queue head frame information of local queue of terminal device n at decision slot t, h n (t) denotes the channel gain between terminal device n and edge server, v n (t) represents the rate of change of video content of terminal device n at decision slot t, S n (t) represents the state space of terminal device n at decision slot t, p n (t) represents the tracking accuracy of the first frame of the team of the terminal equipment n when the slot is decided by t;
s2.2, constructing an action space, wherein the expression of the action space is as follows:
A n (t)=(a n (t),C n (t),I n (t));
in the formula, A n (t) represents the space of motion of terminal device n at decision slot t, a n (t) video frame unload decision representing the first frame of the local queue of the terminal device n output by the edge server in the decision slot t, i.e. whether to continue waiting in the local queue, unload to the edge server immediately or output the tracking result directly, C n (t) represents the channel decision of the terminal device n output by the edge server in the decision slot t, I n (t) the frame number of the interval between the first frame of the slot queue of the next decision time slot and the first frame of the slot queue of the current decision time, namely the frame interval decision, of the terminal device n output by the edge server at the decision time slot t is represented;
s2.3, constructing a reward function, wherein the expression of the reward function is as follows:
Figure BDA0003655086420000021
in the formula (I), the compound is shown in the specification,R n (t) represents a reward function, namely a gain function, of the terminal device n in the decision slot t, Acc represents the detection precision or tracking precision of the first frame of the terminal device n in the decision slot t, beta represents a weight coefficient, and beta is greater than 0,
Figure BDA0003655086420000031
representing the processing time of the first frame of the queue in the terminal equipment n at the time of the decision slot T, alpha is a performance improvement factor and is more than 0, T max Which represents the maximum value of the ideal range of video frame detection delay.
In step S2.1, the first frame information M of the local queue of terminal device n at time slot t is decided n The expression of (t) is:
Figure BDA0003655086420000032
in the formula, s n (t) represents the frame size of the head-of-queue frame of the local queue of terminal device n at decision slot t,
Figure BDA0003655086420000033
representing the arrival time of the head frame of the local queue of terminal device n,
Figure BDA0003655086420000034
indicating the time the head-of-line frame of the local queue of terminal device n at decision slot t has been waiting before processing.
In step S2.1, the channel gain h between the terminal device n and the edge server n The formula for calculation of (t) is:
Figure BDA0003655086420000035
in the formula, gamma n (t) represents a random channel fading factor that conforms to a rayleigh distribution,
Figure BDA0003655086420000036
indicating terminal devicePreparing the average channel gain of n;
average channel gain of the terminal device n
Figure BDA0003655086420000037
The calculation formula of (c) is:
Figure BDA0003655086420000038
in the formula, A d Denotes the antenna gain of the terminal equipment, delta denotes the path loss factor, d n Indicating the distance of the terminal device n from the edge server.
In step S2.1, the tracking accuracy p n (t) the calculation formula:
Figure BDA0003655086420000039
wherein G represents the true location area of the target and Y n And (t) represents the position area of the target detected by the tracking algorithm run by the terminal equipment n when the slot is decided by t.
In step S2.1, the video content change rate v of terminal device n at time slot t n The formula for calculation of (t) is:
Figure BDA00036550864200000310
in the formula (I), the compound is shown in the specification,
Figure BDA00036550864200000311
the pixel position of the kth feature of the i-th frame in the local queue of terminal device n at decision slot t,
Figure BDA00036550864200000312
the pixel position of the kth characteristic of the jth frame in the local queue of the terminal equipment n when the slot t is decided, m represents the characteristic number of the video frame in the local queue of the terminal equipment n when the slot t is decided, and j-i is more than or equal to1。
In step S2.3, if the head frame directly outputs the tracking result, the processing time of the head frame is determined
Figure BDA00036550864200000313
The calculation formula of (2) is as follows:
Figure BDA00036550864200000314
in the formula (I), the compound is shown in the specification,
Figure BDA00036550864200000315
represents the tracking time of the head frame of the queue in terminal device n at decision slot t,
Figure BDA00036550864200000316
representing the waiting time of the head frame of the local queue of the terminal equipment n before processing when the slot t is decided;
if the head frame is unloaded immediately and the channel is available, the processing time of the head frame is determined
Figure BDA0003655086420000041
The calculation formula of (2) is as follows:
Figure BDA0003655086420000042
in the formula, T e Indicating the time at which the edge server performs object detection,
Figure BDA0003655086420000043
the time of transmitting the first frame of the queue in the terminal equipment n through the channel when the slot t is decided is represented;
if the queue head frame decides to wait or decides to unload immediately but the wireless network between the terminal device and the edge server is unavailable at the moment, the queue head frame needs to continue waiting in the local queue until the channel is available and then unload to the edge server, the processing time of the queue head frame is shortened
Figure BDA0003655086420000044
The calculation formula of (2) is as follows:
Figure BDA0003655086420000045
in the formula (I), the compound is shown in the specification,
Figure BDA0003655086420000046
a decision slot indicating the beginning of the transmission of the head frame of the team,
Figure BDA0003655086420000047
presentation decision slot
Figure BDA0003655086420000048
The time when the head frame of the team in terminal device n is transmitted through the channel,
Figure BDA0003655086420000049
representing predicted time slot t to time slot t
Figure BDA00036550864200000410
The number of time slots.
The step S3 includes the following steps:
s3.1, setting a total training round M, initializing an experience playback memory D and a parameter theta of an evaluation network, and assigning the parameter theta of the evaluation network to a parameter theta' of a target network;
s3.2, setting the training cycle number epimode to be 1;
s3.3, to the State space S n (t) initialization, i.e. S n (t)=S n (0) Wherein S is n (t) represents the state space of terminal device n at decision slot t;
s3.4, setting the number T of decision time slot slots;
s3.5, performing t ═ t + 1;
s3.6, selecting action A according to probability epsilon n (t) the expression is:
Figure BDA00036550864200000411
wherein A represents such that Q (S) n (t),A n (t); θ) action at maximum value, A n (t) represents the action space of the terminal device n at the time of the decision slot t;
s3.7, according to the action A selected in the step S3.3 n (t) obtaining a reward R n (t) and the state space S of the next step n (t+1);
S3.8, experience (S) n (t),A n (t),R n (t),S n (t +1)) is stored in the experience replay memory D;
s3.9, randomly taking out G experiences from the experience playback memory D (S) n (t′),A n (t′),R n (t′),S n (t′+1));
S3.10, predicting benefits according to the experience taken out in the step S3.9, wherein the expression is as follows:
Figure BDA00036550864200000412
in the formula, R n (t ') represents the reward function of the terminal device n at the decision slot t ', γ represents the discount factor, A ' represents such that
Figure BDA0003655086420000051
The operation of obtaining the maximum value is performed,
Figure BDA0003655086420000052
represents the maximum gain in slot, S, at the time of t' +1 decision n (t '+ 1) represents the state space of terminal device n at decision slot t' + 1;
s3.11, updating a parameter theta of the evaluation network based on a gradient descent method;
s3.12, assigning the parameter theta of the evaluation network to the parameter theta' of the target network in each step C;
s3.13, judging that T is less than T, if so, returning to the step S3.5, otherwise, executing the step S3.14;
and S3.14, executing the epsilon +1, judging that the epsilon is less than M, if so, returning to the step S3.3, otherwise, outputting a joint decision model containing the target network.
In step S4, the expression of the maximized benefit function is:
Figure BDA0003655086420000053
s.t.C 1 (t)+C 2 (t)+...+C n (t)+...+C N (t)≤1;
a n (t)∈{0,1,2};
I n (t)∈{1,2,3};
in the formula, a n (t) indicating the video frame unloading decision of the head frame of the local queue of the terminal equipment n output by the edge server in the decision slot t, namely, continuously waiting in the local queue, immediately unloading to the edge server or directly outputting the tracking result when a n When (t) is 0, the first frame of the queue representing the terminal device n waits for the next decision slot, when a is n When (t) is 1, the method indicates that the head-of-line frame of the terminal device n is unloaded to the edge server immediately, and when a is n When (t) is 2, it means that the terminal device n directly outputs the tracking result, C n (t) represents the channel decision of the terminal device n output by the edge server in the decision slot t, when C n When (t) is 0, it means that terminal device n is not allocated to a channel in slot t at decision time, and when C is equal to n When (t) is 1, it means that terminal device n is allocated to a channel in decision slot t, I n (t) the frame number of the interval between the first frame of the slot queue of the next decision time slot and the first frame of the slot queue of the current time slot, namely the frame interval decision, R, of the terminal device n output by the edge server in the decision time slot t n (t) represents the reward function, i.e. gain function, of the terminal device N at the decision slot t, N representing the total number of terminal devices.
The invention has the beneficial effects that:
the DBT-based real-time target detection framework mainly aims at continuous video frame scenes with delay constraint, and establishes a target detection system based on the dynamic change network conditions and video content for the cooperative detection of terminal equipment and an edge server, and the characteristics of real-time target detection under a plurality of terminal equipment scenes based on the DBT framework can be further analyzed through the system; the influence of the video content change rate is introduced, the terminal equipment selects different tracking frequencies based on the video content change rate instead of tracking each frame traditionally, an optimization problem is formed by designing a gain function, and the video frame detection accuracy is maximized under the delay limit.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the present invention.
Fig. 2 is a diagram illustrating tracking accuracy at different frame intervals.
Fig. 3 is a diagram illustrating the change of the average tracking accuracy when the frame interval changes.
Fig. 4 is a graph comparing the effect of the present application with other algorithms.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
A dynamic task scheduling method for video detection and tracking, as shown in fig. 1, includes the following steps:
s1, constructing a real-time target detection system comprising a plurality of terminal devices and an edge server, wherein the terminal devices are provided with target trackers, the edge server is provided with target detectors, and each terminal device is in communication connection with the edge server through a wireless network;
the set of all end devices is denoted by N, N ═ 1,. lamda, N }, and the set of video frames captured by the nth end device is denoted by F n The representation is that all video frame sets captured by the terminal equipment are represented by F, and F is { F ═ F 1 ,...,F n ,...,F N }. The terminal equipment operates a light-weight target tracker, and the edge server operates a large-scale target detector so as to realize real-time detection of targets in captured video frames. However, the tracking performance may decrease with time and the change of the video content, and therefore, before the tracking performance decreases to be too low, that is, the tracking threshold, a new video frame should be sent to the edge server for detection to obtain a new detection result, so as to improve the accuracy of target tracking of the terminal device.
Each terminal device maintains a local queue for buffering video frames waiting for processing, the video frames in the local queue wait for processing according to the principle of first-come first-served, the system time is divided into continuous time slot slots, and if the time slot slots are small enough, only one frame can reach the local queue in each time slot. In each decision time slot t, that is, a time slot in which a video frame waits in a local queue, a video frame at the head of the queue of each terminal device is considered to be also called a head frame, and since the target tracker needs to use a bounding box (bounding box) detected by the edge server for initialization. Therefore, before starting tracking, the terminal device needs to send a first frame to the edge server for detection, obtain a detection result of the first frame, namely the bounding box, then operate the target tracker on a subsequent team first frame based on the detection result for tracking, send frame information and tracking precision to the edge server by the terminal device after tracking, and make a decision of channel allocation, video frame unloading and tracking frequency, namely a tracking frame interval, based on the whole situation and send the decision to the terminal device by the edge server; and finally, the terminal equipment makes corresponding action according to the decision of the edge server. Since the data amount of the result output by the edge server is much smaller than that of the frame itself, the time for returning the result is ignored, and only the uplink frame transmission process of the whole system is considered. And if the unloading decision is local tracking, directly outputting a tracking result. If the unloading decision is immediate unloading and the channel is available, the unloading decision can be unloaded to the edge server for detection, and the detection result is returned to the corresponding terminal equipment after the edge server is detected. If the offload decision is either pending or directly offloaded but the channel is not available, then the slot needs to wait for the next decision in the local queue.
Due to the limited wireless network resources, the wireless network bandwidth may become a bottleneck for the terminal device to offload video frames to the edge server. The present application addresses this challenge in two ways: on one hand, the video frame with reliable tracking performance directly outputs the tracking result to save bandwidth; on the other hand, for a video frame with lower tracking performance, due to the limitation of bandwidth resources and competition of terminal devices, a situation that no available wireless channel exists in slot t at the time of decision, and the video frame waits in a local queue of the terminal device until the channel is available.
S2, constructing a joint optimization Problem of video frame unloading Decision, channel Decision and frame interval Decision in the real-time target detection system as an MDP Problem (Markov Decision Problem), including the following steps:
s2.1, constructing a state space, wherein the expression of the state space is as follows:
S n (t)=(M n (t),h n (t),p n (t),v n (t));
in the formula, M n (t) queue head frame information of local queue of terminal device n at decision slot t, h n (t) denotes the channel gain between terminal device n and edge server, v n (t) represents the rate of change of video content of terminal device n at decision slot t, S n (t) represents the state space of terminal device n at decision slot t, p n (t) head of line of terminal device n at slot time of decision time tThe tracking accuracy of the frame.
The first frame information M of the local queue of the terminal device n at the time of the decision slot t n The expression of (t) is:
Figure BDA0003655086420000071
in the formula s n (t) represents the frame size of the head-of-queue frame of the local queue of terminal device n at decision slot t,
Figure BDA0003655086420000072
the arrival time of the head frame of the queue of the local queue representing terminal device n,
Figure BDA0003655086420000073
indicating the time the head of line frame of the local queue of terminal device n has waited before processing at decision slot t.
The channel gain h between the terminal equipment n and the edge server n (t) conforms to the rayleigh fading channel model, and is calculated by the formula:
Figure BDA0003655086420000074
in the formula, gamma n (t) represents a random channel fading factor that conforms to a rayleigh distribution,
Figure BDA0003655086420000075
representing the average channel gain of terminal device n.
Average channel gain of the terminal device n
Figure BDA0003655086420000081
The method conforms to a free space path loss model, and the calculation formula is as follows:
Figure BDA0003655086420000082
in the formula, A d Denotes the antenna gain of the terminal equipment, delta denotes the path loss factor, d n Indicating the distance of the terminal device n from the edge server.
Before the slot is finished in each decision, the local queue of the terminal equipment is updated, and the number of the video frame frames cached in the local queue of the terminal equipment n in the slot in t decision adopts X n (t) represents, X n The evolution of (t +1) depends on the arrival of a new video frame and the departure of an old video frame, and the updated expression is:
Figure BDA0003655086420000083
in the formula (I), the compound is shown in the specification,
Figure BDA0003655086420000084
is a random binary variable which indicates whether slot t will have a new video frame to reach terminal equipment n, O during decision making n (t) E {0, -1} is also a random binary variable that indicates whether the video frame with slot t at the head of the queue will leave the local queue of terminal device n at the time of decision, X n (t +1) represents the number of video frame frames buffered by the local queue of terminal device n at time slot t + 1. O is n (t) — 0 indicates that at decision slot t the head-of-line frame of the local queue of terminal device n will continue to wait until the next decision slot, O n And (t) — 1 indicates that the head-of-queue frame of the local queue of the decision slot t terminal device n will leave the local queue in the next decision slot, for example, directly output the tracking result of the video frame, or unload the video frame to an edge server for detection.
Based on experiments, it is found that it takes about 10ms for the terminal device to track a single target in a frame, and the duration of tracking the whole frame is increased in proportion to the increase of the number of targets in the frame. Therefore, in order to provide real-time video analysis processing, some frames must be skipped during tracking to keep up with the frame capture speed of the terminal device, such as a video camera, and therefore, I is used n (t) represents a frame interval determined at the time of decision slot t. Thus, the decision slot t +1 time terminal devicen local queue buffer video frame number X n (t +1) may be updated as:
Figure BDA0003655086420000085
in the formula, O n (t) becomes {0, -I n (t) }, 0 denotes that the head-of-line frame continues waiting in the local queue.
As shown in FIG. 2, the experiment measures a succession of 50 video frames, I n (t) is a value of at least 1 and at most 10. It can be seen from the figure that either I n (t) what value to take, the tracking accuracy decreases with the increase of the tracking frame number, and I n The larger the value of (t), the faster the tracking accuracy decreases, so I cannot be increased indefinitely to provide real-time processing n The value of (t). I of the present example n (t) e {1, 2, 3}, and as shown in FIG. 3, in the case of continuously tracking 50 frames, the average tracking accuracy is kept at I of 0.5 or more n The value of (t) is 1, 2, 3.
In the same I n (t), if the video content changes faster, the displacement between two tracked video frames is larger, and the tracking accuracy is more unreliable. Therefore, to ensure more reliable tracking accuracy of the terminal equipment, I n (t) the determination should introduce the effect of the rate of change of the video content, and the metric evaluating the rate of change of the video content must be lightweight to ensure that its calculations do not affect the tracking operation of the real-time target detection system. The method measures the change rate of the video content by using the intermediate result of tracking, so that additional calculation is hardly added, the average moving speed of all characteristics extracted from two adjacent frames is used as the change rate of the video content, and the change rate v of the video content of the terminal equipment n at the time slot t is used as the change rate of the video content n The formula for calculation of (t) is:
Figure BDA0003655086420000091
in the formula (I), the compound is shown in the specification,
Figure BDA0003655086420000092
the pixel position of the kth feature of the i-th frame in the local queue of terminal device n at decision slot t,
Figure BDA0003655086420000093
the pixel position of the kth characteristic of the jth frame in the local queue of the terminal equipment n when the slot t is decided is shown, m represents the characteristic number of the video frame in the local queue of the terminal equipment n when the slot t is decided, and j-i is more than or equal to 1, because some video frames are skipped when the target tracking is carried out. The video content change rate can be obtained by calculating the moving speed between two adjacent frame features, and a high moving speed means that the video content changes rapidly, i.e. the existing object moves out rapidly, and new objects may appear frequently.
According to the method, the target tracking of the frame is carried out based on the Lucas-Kanade method, the tracking precision is reduced along with the change of time and video content, and meanwhile, the first frame of the team with reliable tracking performance of the terminal equipment tends to directly output a tracking result, so that the bandwidth is saved. The following is to calculate the intersection ratio of the tracking result and the real result to measure the tracking performance, and the corresponding expression is as follows:
Figure BDA0003655086420000094
in the formula, Y n And (t) represents a position area of the target detected by the tracking algorithm run by the terminal equipment n when the slot is determined by t, and G represents a real position area of the target.
S2.2, constructing an action space, wherein the expression of the action space is as follows:
A n (t)=(a n (t),C n (t),I n (t));
in the formula, A n (t) represents the space of motion of terminal device n at decision slot t, a n (t) video frame unloading decision of head frame of local queue of terminal equipment n output by edge server in decision slot t, namely, continuing waiting in local queue and immediately unloading to edgeThe server also directly outputs the tracking result when a n When (t) is 0, the first frame of the queue representing the terminal device n waits for the next decision slot, when a is n When (t) is 1, the method indicates that the head-of-line frame of the terminal device n is unloaded to the edge server immediately, and when a is n When (t) is 2, it means that the terminal device n directly outputs the tracking result, C n (t) represents the channel decision of the terminal device n at decision slot t output by the edge server, when C n When (t) is 0, it means that terminal device n is not allocated to the channel in slot t at decision time, when C is equal to n When (t) is 1, it means that terminal device n is allocated to a channel in decision slot t, I n (t) the frame number of the interval between the first frame of the slot queue of the next decision time slot and the first frame of the slot queue of the current decision time slot, namely the frame interval decision, of the terminal device n output by the edge server at the decision time slot t.
S2.3, constructing a reward function, wherein the expression of the reward function is as follows:
Figure BDA0003655086420000095
in the formula, R n (t) represents a reward function, namely a gain function, of the terminal device n in the decision slot t, and Acc represents the detection precision or tracking precision p of the first frame of the terminal device n in the decision slot t n (T), the detection precision is set to 1.0, beta represents a weight coefficient, beta is more than 0, adjusting beta can balance the time weight between frame processing and frame transmission, alpha is a performance improvement factor, alpha is more than 0, the importance of adjusting inference performance in a reward function can be reflected through the factor, and T max Represents the maximum value of the ideal range of the video frame detection delay, which refers to the maximum delay of a frame that can be detected under the condition of satisfying the required detection delay,
Figure BDA0003655086420000101
representing the processing time of the head frame in terminal device n at decision slot t.
When the slot t is decided, if the first frame of the queue directly outputs the tracking nodeIf so, the processing time of the first frame of the team
Figure BDA0003655086420000102
Including the tracking time and the waiting time in the queue, the calculation formula is as follows:
Figure BDA0003655086420000103
in the formula (I), the compound is shown in the specification,
Figure BDA0003655086420000104
the tracking time of the head frame of the team in the terminal device n at the decision slot t is shown.
At decision time slot t, if the head frame is unloaded immediately and a channel is available, then the processing time of the head frame
Figure BDA0003655086420000105
The calculation formula of (2) is as follows:
Figure BDA0003655086420000106
in the formula, T e Indicating the time when the edge server performs the object detection,
Figure BDA0003655086420000107
and the time of transmission of the head frame of the queue in the terminal device n through the channel at the decision slot t is shown.
Time for transmitting first frame of queue in terminal equipment n through channel in decision time slot t
Figure BDA0003655086420000108
The calculation formula of (c) is:
Figure BDA0003655086420000109
in the formula, s n (t) represents the frame size of the head frame of the queue in terminal device n at decision slot t, i.e.Amount of data, r n (t) represents the transmission rate between the two in the case where the edge server allocates a terminal device n channel at the time of the slot decision t.
Considering the path loss and Rayleigh fading of the channel, based on Shannon's theorem, when the edge server allocates the n channel to the terminal device in the slot time of t decision, the transmission rate r between the two n The formula for (t) is:
Figure BDA00036550864200001010
where w represents the channel bandwidth, h n (t) denotes the channel gain of terminal device n as a function of decision slot t, P n Representing the transmission power of terminal equipment N, N 0 Representing the background noise power.
Since in order to efficiently utilize bandwidth resources, if a wireless network is unavailable or deteriorates, the first frame of the queue waits for the next decision slot t in the local queue, and these frames are often transmitted to the edge server for detection, rather than directly outputting the tracking result, otherwise, the frame should not be decided to wait. Therefore, in the decision slot t, if the head frame decides to wait, or decides to unload immediately but the wireless network is unavailable, the head frame needs to wait in the local queue until the channel is available and then unload to the edge server, then the processing time of the head frame is the same as the processing time of the head frame
Figure BDA00036550864200001011
The calculation formula of (c) is:
Figure BDA00036550864200001012
in the formula (I), the compound is shown in the specification,
Figure BDA00036550864200001013
a decision slot indicating the beginning of the transmission of the head frame of the team,
Figure BDA00036550864200001014
presentation decision slot
Figure BDA00036550864200001015
The time when the head frame of the team in terminal device n is transmitted through the channel,
Figure BDA0003655086420000111
representing predicted slot t to slot t
Figure BDA0003655086420000112
The number of time slots, and
Figure BDA0003655086420000113
is a positive integer.
S3, each terminal device will track the accuracy p in each decision slot n (t) head of line frame information M n (t) rate of change of video content v n (t) sending to an edge server, wherein the edge server constructs a joint decision model by using an algorithm of Deep Reinforcement Learning (DRL) of DDQN (double Deep Q network), and the method comprises the following steps:
s3.1, setting a total training round M, initializing an experience playback memory D and a parameter theta of an evaluation network, and assigning the parameter theta of the evaluation network to a parameter theta' of a target network;
s3.2, setting the training cycle number epimode to be 1;
s3.3, to the State space S n (t) initialization, i.e. S n (t)=S n (0);
S3.4, setting the number T of decision time slot slots;
s3.5, performing t ═ t + 1;
s3.6, selecting action A according to probability epsilon n (t), the expression of which is:
Figure BDA0003655086420000114
in the formula, θ represents a parameter for evaluating the network, and A represents Q (S) n (t),A n (t); θ) is the maximum value, and random in this equation refers to the random selection of motion from the motion space.
S3.7, according to action A selected in step S3.3 n (t) earning a reward R n (t) and the state space S of the next step n (t+1);
S3.8, experience (S) n (t),A n (t),R n (t),S n (t +1)) is stored in the experience replay memory D;
s3.9, randomly taking out G experiences from the experience playback memory D (S) n (t′),A n (t′),R n (t′),S n (t' +1)), wherein S n (t ') represents the state space of terminal device n at decision slot t', A n (t ') represents the action space of the terminal device n at the time of the decision slot t';
s3.10, predicting the profit according to the experience taken out in the step S3.9, wherein the expression is as follows:
Figure BDA0003655086420000115
in the formula, R n (t ') represents the reward function of the terminal device n at the decision slot t ', γ represents the discount factor for balancing the current benefit with the long-term reward, a ' represents such that
Figure BDA0003655086420000116
The operation of obtaining the maximum value is performed,
Figure BDA0003655086420000117
Figure BDA0003655086420000118
represents the maximum gain in slot, S, at the time of t' +1 decision n (t '+ 1) represents the state space of terminal device n at decision slot t' + 1.
S3.11, updating a parameter theta of the evaluation network based on a gradient descent method;
s3.12, assigning the parameter theta of the evaluation network to the parameter theta' of the target network in each step C, wherein C is integral multiple of T and is less than T;
s3.13, judging that T is less than T, if so, returning to the step S3.5, otherwise, executing the step S3.14;
and S3.14, executing the epsilon +1, judging that the epsilon is less than M, if so, returning to the step S3.3, otherwise, outputting a joint decision model containing the target network.
The DDQN algorithm comprises an evaluation network with a parameter theta and a target network with a parameter theta', the evaluation network is used for updating the parameters by reducing a loss function, the target network is used for calculating a target Q value, and the target network parameters are updated by the evaluation network at regular intervals. Meanwhile, the DDQN maintains a section of experience replay memory D, stores some past experiences, and updates the stored experiences when the experience replay memory D is full.
S4, with the goal of maximizing a gain function, solving the joint optimization problem by using the joint decision model constructed in the step S3, and executing the terminal equipment according to the video frame unloading decision, the channel decision and the frame interval decision output by the edge server;
the expression of the maximized revenue function is:
Figure BDA0003655086420000121
s.t.C 1 (t)+C 2 (t)+...+C n (t)+...+C N (t)≤1;
a n (t)∈{0,1,2};
I n (t)∈{1,2,3}。
in the following, Jetson Nano is taken as a terminal device, a Lucas-Kanade target tracker is operated, Jetson AGX Xavier is taken as an edge server, YOLOX is taken as a target detector, the time of tracking one frame by the terminal device and the time of detecting one frame by the edge server are measured really, and then a simulation environment is established based on the time. The system is divided into individual time slot slots, and the time slot slots are assumed to be small enough, so that at most one new frame arrives at the local queue at the terminal equipment in each time slot, and the arrival rate of the frames conforms to the Bernoulli process with the parameter P. The network simulation adopts a wireless channel Rayleigh fading model, wherein the gain of each terminal equipment antenna is set to be 4.11, the distance between the terminal equipment and the edge server is in accordance with the uniform distribution of U (2.5,5.2), the transmission power of the terminal equipment is 0.03, the background noise is 10e-10, the path loss coefficient is 2.8, and the bandwidth of an uplink is 2 MHZ. The DDQN algorithm based on the pytorech 1.7 was implemented using python, and the size of D was set to 1000, the total training round was set to 400, the batch size was 32, the learning rate was 0.0001, γ was set to 0.9, and ε was set to 0.9.
In order to show the superiority of the method in a continuous video frame scene, the method is compared with a Random algorithm and a Greedy algorithm, and the evaluation index is the system average reward. The random algorithm is to randomly select a decision without considering any environmental information, whose performance is always the worst. The greedy algorithm makes an optimal decision based on the current state, but does not consider the interaction between adjacent tasks. As shown in fig. 4, P is the arrival rate of the video frame in each timeslot slot, and the larger P indicates that the higher the video frame rate is, the more intensive the task is; the smaller p indicates that the video frame rate is smaller and the task is more sparse. It can be found that the algorithm of the present application is superior to the random algorithm and the greedy algorithm regardless of the fluctuation of the P value.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.

Claims (9)

1. A dynamic task scheduling method for video detection and tracking is characterized by comprising the following steps:
s1, constructing a real-time target detection system comprising a plurality of terminal devices and an edge server, wherein the terminal devices are provided with target trackers, and the edge server is provided with target detectors;
s2, constructing a combined optimization problem of video frame unloading decision, channel decision and frame interval decision in the real-time target detection system into a Markov decision problem;
the video frame unloading decision means that the team first frame of the terminal equipment is continuously waited in a local queue of the terminal equipment when a slot is formed in each decision, is immediately unloaded to an edge server for detection or directly outputs a tracking result, the channel decision means whether the terminal equipment output by the edge server is allocated to a channel, and the frame interval decision means the number of frames of an interval between the next decision slot team first frame when the slot is formed in the current decision of the terminal equipment output by the edge server and the slot next-in-line first frame when the slot is formed in the current decision;
s3, each terminal device sends the tracking precision, the head frame information and the video content change rate to an edge server in each decision time slot, and the edge server constructs a joint decision model by using a DDQN deep reinforcement learning algorithm;
and S4, solving the joint optimization problem by using the joint decision model constructed in the step S3 with the aim of maximizing a gain function, and executing the terminal equipment according to the video frame unloading decision, the channel decision and the frame interval decision output by the edge server.
2. The dynamic task scheduling method for video detection and tracking according to claim 1, wherein the step S2 comprises the following steps:
s2.1, constructing a state space, wherein the expression of the state space is as follows:
S n (t)=(M n (t),h n (t),p n (t),v n (t));
in the formula, M n (t) queue head frame information of local queue of terminal device n at decision slot t, h n (t) denotes the channel gain between terminal device n and edge server, v n (t) represents the rate of change of video content of terminal device n at decision slot t, S n (t) represents the state space of terminal device n at decision slot t, p n (t) represents the tracking precision of the head frame of the team of the terminal device n when the slot is determined by t;
s2.2, constructing an action space, wherein the expression of the action space is as follows:
A n (t)=(a n (t),C n (t),I n (t));
in the formula, A n (t) represents the space of motion of terminal device n at decision slot t, a n (t) video frame unload decision representing the first frame of the local queue of the terminal device n output by the edge server in the decision slot t, i.e. whether to continue waiting in the local queue, unload to the edge server immediately or output the tracking result directly, C n (t) denotes the channel decision of the terminal device n at decision slot t, I, output by the edge server n (t) the frame number of the interval between the first frame of the slot queue of the next decision time slot and the first frame of the slot queue of the current decision time, namely the frame interval decision, of the terminal device n output by the edge server at the decision time slot t is represented;
s2.3, constructing a reward function, wherein the expression of the reward function is as follows:
Figure FDA0003655086410000021
in the formula, R n (t) represents a reward function, namely a gain function, of the terminal device n at the time of the decision slot t, Acc represents the detection precision or tracking precision of the first frame of the terminal device n at the time of the decision slot t, β represents a weight coefficient, and β > 0,
Figure FDA0003655086410000022
representing the processing time of the first frame of the queue in the terminal equipment n at the time of the decision slot T, alpha is a performance improvement factor and is more than 0, T max Which represents the maximum value of the ideal range of video frame detection delay.
3. The video-detection-and-tracking-oriented dynamic task scheduling method according to claim 2, wherein in step S2.1, the first frame information M of the local queue of terminal device n at decision slot t is determined n The expression of (t) is:
Figure FDA0003655086410000023
in the formula s n (t) represents the frame size of the head-of-queue frame of the local queue of terminal device n at decision slot t,
Figure FDA0003655086410000024
representing the arrival time of the head frame of the local queue of terminal device n,
Figure FDA0003655086410000025
indicating the time the head-of-line frame of the local queue of terminal device n at decision slot t has been waiting before processing.
4. The video detection and tracking oriented dynamic task scheduling method of claim 2, wherein in step S2.1, the channel gain h between the terminal device n and the edge server n The formula for calculation of (t) is:
Figure FDA0003655086410000026
in the formula, gamma n (t) represents a random channel fading factor that conforms to a rayleigh distribution,
Figure FDA0003655086410000027
represents the average channel gain of terminal device n;
average channel gain of the terminal device n
Figure FDA0003655086410000028
The calculation formula of (2) is as follows:
Figure FDA0003655086410000029
in the formula, A d Denotes the antenna gain of the terminal equipment, delta denotes the path loss factor, d n Indicating the distance of the terminal device n from the edge server.
5. The video detection and tracking oriented dynamic task scheduling method of claim 2, wherein in step S2.1, the calculation formula of the tracking accuracy pn (t):
Figure FDA00036550864100000210
wherein G represents the true location area of the target and Y n And (t) represents the position area of the target detected by the tracking algorithm run by the terminal equipment n when the slot is decided by t.
6. The method for dynamic task scheduling for video detection and tracking according to claim 2, wherein in step S2.1, the video content change rate v of terminal device n at time slot t is n The formula for (t) is:
Figure FDA00036550864100000211
in the formula (I), the compound is shown in the specification,
Figure FDA00036550864100000212
the pixel position of the kth feature of the i-th frame in the local queue of terminal device n at the time of decision slot t,
Figure FDA0003655086410000031
the pixel position of the kth characteristic of the jth frame in the local queue of the terminal equipment n at the time of the decision slot t is shown, m represents the characteristic number of the video frame in the local queue of the terminal equipment n at the time of the decision slot t, and j-i is larger than or equal to 1.
7. The video detection and tracking oriented dynamic task scheduling method of claim 2, wherein in step S2.3, if the head frame of the teamDirectly outputting the tracking result to obtain the processing time of the first frame
Figure FDA0003655086410000032
The calculation formula of (2) is as follows:
Figure FDA0003655086410000033
in the formula (I), the compound is shown in the specification,
Figure FDA0003655086410000034
represents the tracking time of the head frame of the queue in terminal device n at decision slot t,
Figure FDA0003655086410000035
representing the waiting time of the head frame of the local queue of the terminal equipment n before processing when the slot t is decided;
if the head frame is unloaded immediately and the channel is available, the processing time of the head frame
Figure FDA0003655086410000036
The calculation formula of (2) is as follows:
Figure FDA0003655086410000037
in the formula, T e Indicating the time at which the edge server performs object detection,
Figure FDA0003655086410000038
the time of transmitting the first frame of the queue in the terminal equipment n through the channel when the slot t is decided is represented;
if the queue head frame decides to wait or decides to unload immediately but the wireless network between the terminal device and the edge server is unavailable at the moment, the queue head frame needs to continue waiting in the local queue until the channel is available and then unload to the edge server, the processing time of the queue head frame is shortened
Figure FDA0003655086410000039
The calculation formula of (2) is as follows:
Figure FDA00036550864100000310
in the formula (I), the compound is shown in the specification,
Figure FDA00036550864100000311
a decision slot indicating the beginning of the transmission of the head frame of the team,
Figure FDA00036550864100000312
presentation decision slot
Figure FDA00036550864100000316
The time when the head frame of the team in terminal device n is transmitted through the channel,
Figure FDA00036550864100000313
representing predicted time slot t to time slot t
Figure FDA00036550864100000314
The number of time slots.
8. The dynamic task scheduling method for video detection and tracking according to claim 1, wherein the step S3 comprises the following steps:
s3.1, setting a total training round M, initializing an experience playback memory D and a parameter theta of an evaluation network, and assigning the parameter theta of the evaluation network to a parameter theta' of a target network;
s3.2, setting the training cycle number epimode to be 1;
s3.3, to the State space S n (t) initialization, i.e. S n (t)=S n (0) Wherein S is n (t) represents the state space of terminal device n at decision slot t;
s3.4, setting the number T of decision time slot slots;
s3.5, performing t ═ t + 1;
s3.6, selecting action A according to probability epsilon n (t) the expression is:
Figure FDA00036550864100000315
wherein A represents such that Q (S) n (t),A n (t); θ) action at maximum value, A n (t) represents the action space of terminal device n at decision slot t;
s3.7, according to action A selected in step S3.3 n (t) earning a reward R n (t) and the state space S of the next step n (t+1);
S3.8, experience (S) n (t),A n (t),R n (t),S n (t +1)) is stored in the experience replay memory D;
s3.9, randomly taking out G experiences from the experience playback memory D (S) n (t′),A n (t′),R n (t′),S n (t′+1));
S3.10, predicting benefits according to the experience taken out in the step S3.9, wherein the expression is as follows:
Figure FDA0003655086410000041
in the formula, R n (t ') represents the reward function of the terminal device n at the decision slot t ', γ represents the discount factor, A ' represents such that
Figure FDA0003655086410000042
An operation of obtaining the maximum value of the data,
Figure FDA0003655086410000043
represents the maximum gain in slot, S, at the time of t' +1 decision n (t '+ 1) indicates that terminal device n is in decision slot t' +1A state space of (a);
s3.11, updating a parameter theta of the evaluation network based on a gradient descent method;
s3.12, assigning the parameter theta of the evaluation network to the parameter theta' of the target network in each step C;
s3.13, judging that T is less than T, if so, returning to the step S3.5, otherwise, executing the step S3.14;
and S3.14, executing the epsilon +1, judging that the epsilon is less than M, if so, returning to the step S3.3, otherwise, outputting a joint decision model containing the target network.
9. The method for dynamic task scheduling for video detection and tracking as claimed in claim 1, wherein in step S4, the expression of the maximized benefit function is:
Figure FDA0003655086410000044
s.t.C 1 (t)+C 2 (t)+...+C n (t)+...+C N (t)≤1;
a n (t)∈{0,1,2};
I n (t)∈{1,2,3};
in the formula, a n (t) indicating the video frame unloading decision of the head frame of the local queue of the terminal equipment n output by the edge server in the decision slot t, namely, continuously waiting in the local queue, immediately unloading to the edge server or directly outputting the tracking result when a n When (t) is 0, the head of line frame of the terminal device n waits for the next decision slot, when a n When (t) is 1, the method indicates that the head-of-line frame of the terminal device n is unloaded to the edge server immediately, and when a is n When (t) is 2, it means that the terminal device n directly outputs the tracking result, C n (t) represents the channel decision of the terminal device n output by the edge server in the decision slot t, when C n When (t) is 0, it means that terminal device n is not allocated to a channel in slot t at decision time, and when C is equal to n When (t) is 1, it means that terminal device n is allocated to a channel in decision slot t, I n (t) a frame number, namely a frame interval decision, representing the interval between the first frame of the next decision time slot queue and the first frame of the current time slot queue of the terminal equipment n output by the edge server at the decision time slot t, R n (t) represents the reward function, i.e. gain function, of the terminal device N at the decision slot t, N representing the total number of terminal devices.
CN202210551198.7A 2022-05-20 2022-05-20 Dynamic task scheduling method for video detection and tracking Active CN115002409B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210551198.7A CN115002409B (en) 2022-05-20 2022-05-20 Dynamic task scheduling method for video detection and tracking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210551198.7A CN115002409B (en) 2022-05-20 2022-05-20 Dynamic task scheduling method for video detection and tracking

Publications (2)

Publication Number Publication Date
CN115002409A true CN115002409A (en) 2022-09-02
CN115002409B CN115002409B (en) 2023-07-28

Family

ID=83028073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210551198.7A Active CN115002409B (en) 2022-05-20 2022-05-20 Dynamic task scheduling method for video detection and tracking

Country Status (1)

Country Link
CN (1) CN115002409B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019117970A1 (en) * 2017-12-15 2019-06-20 Google Llc Adaptive object tracking policy
CN113115072A (en) * 2021-04-09 2021-07-13 中山大学 Video target detection tracking scheduling method and system based on end cloud cooperation
CN113612843A (en) * 2021-08-02 2021-11-05 吉林大学 MEC task unloading and resource allocation method based on deep reinforcement learning
WO2021233053A1 (en) * 2020-05-22 2021-11-25 华为技术有限公司 Computing offloading method and communication apparatus
CN113821346A (en) * 2021-09-24 2021-12-21 天津大学 Computation uninstalling and resource management method in edge computation based on deep reinforcement learning
CN113873022A (en) * 2021-09-23 2021-12-31 中国科学院上海微系统与信息技术研究所 Mobile edge network intelligent resource allocation method capable of dividing tasks
WO2022000838A1 (en) * 2020-07-03 2022-01-06 南京莱斯信息技术股份有限公司 Markov random field-based method for labeling remote control tower video target
WO2022057811A1 (en) * 2020-09-17 2022-03-24 浙江大学 Edge server-oriented network burst load evacuation method
CN114375058A (en) * 2022-01-19 2022-04-19 上海大学 Task queue aware edge computing real-time channel allocation and task unloading method

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019117970A1 (en) * 2017-12-15 2019-06-20 Google Llc Adaptive object tracking policy
WO2021233053A1 (en) * 2020-05-22 2021-11-25 华为技术有限公司 Computing offloading method and communication apparatus
WO2022000838A1 (en) * 2020-07-03 2022-01-06 南京莱斯信息技术股份有限公司 Markov random field-based method for labeling remote control tower video target
WO2022057811A1 (en) * 2020-09-17 2022-03-24 浙江大学 Edge server-oriented network burst load evacuation method
CN113115072A (en) * 2021-04-09 2021-07-13 中山大学 Video target detection tracking scheduling method and system based on end cloud cooperation
CN113612843A (en) * 2021-08-02 2021-11-05 吉林大学 MEC task unloading and resource allocation method based on deep reinforcement learning
CN113873022A (en) * 2021-09-23 2021-12-31 中国科学院上海微系统与信息技术研究所 Mobile edge network intelligent resource allocation method capable of dividing tasks
CN113821346A (en) * 2021-09-24 2021-12-21 天津大学 Computation uninstalling and resource management method in edge computation based on deep reinforcement learning
CN114375058A (en) * 2022-01-19 2022-04-19 上海大学 Task queue aware edge computing real-time channel allocation and task unloading method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张文献;杜永文;张希权;: "面向多用户移动边缘计算轻量任务卸载优化" *
张文献;杜永文;张希权;: "面向多用户移动边缘计算轻量任务卸载优化", 小型微型计算机系统, no. 10 *

Also Published As

Publication number Publication date
CN115002409B (en) 2023-07-28

Similar Documents

Publication Publication Date Title
CN113950066B (en) Single server part calculation unloading method, system and equipment under mobile edge environment
CN112995913B (en) Unmanned aerial vehicle track, user association and resource allocation joint optimization method
CN107708152B (en) Task unloading method of heterogeneous cellular network
EP3491793B1 (en) System and method for resource-aware and time-critical iot frameworks
CN114205353B (en) Calculation unloading method based on hybrid action space reinforcement learning algorithm
CN110580199A (en) service migration method based on particle swarm in edge computing environment
CN113222118B (en) Neural network training method, apparatus, electronic device, medium, and program product
CN113778691A (en) Task migration decision method, device and system
CN114626298A (en) State updating method for efficient caching and task unloading in unmanned aerial vehicle-assisted Internet of vehicles
CN115002409A (en) Dynamic task scheduling method for video detection and tracking
CN112860409A (en) Mobile cloud computing random task sequence scheduling method based on Lyapunov optimization
CN112486685A (en) Computing task allocation method and device of power Internet of things and computer equipment
CN110300380B (en) Target tracking method for balancing system energy consumption and tracking precision in mobile WSN (wireless sensor network)
CN111930435A (en) Task unloading decision method based on PD-BPSO technology
CN115580900A (en) Unmanned aerial vehicle assisted cooperative task unloading method based on deep reinforcement learning
CN114828047A (en) Multi-agent collaborative computing unloading method in 5G mobile edge computing environment
CN115134776A (en) Unmanned aerial vehicle calculation unloading method facing mobile edge calculation
CN114337881A (en) Wireless spectrum intelligent sensing method based on multi-unmanned aerial vehicle distribution and LMS
Li et al. Optimal Offloading of Computing-intensive Tasks for Edge-aided Maritime UAV Systems
CN117544680B (en) Caching method, system, equipment and medium based on electric power Internet of things
CN113378369B (en) Path planning and task scheduling method based on unmanned aerial vehicle calculation unloading
Wang et al. Deep Reinforcement Learning Based on Actor-Critic for Task Offloading in Vehicle Edge Computing
CN115865965B (en) Method, system and equipment for detecting moving target based on hierarchical perception
CN115226130B (en) Multi-unmanned aerial vehicle data unloading method based on fairness perception and related equipment
CN117528658A (en) Edge collaborative caching method and system based on federal deep reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant