CN114756371A - Method and system for optimal configuration of terminal edge joint resources - Google Patents
- Publication number: CN114756371A
- Application number: CN202210455191.5A
- Authority
- CN
- China
- Prior art keywords
- video
- edge
- inference
- user
- dnn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Health & Medical Sciences (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a method and a system for optimal configuration of terminal-edge joint resources. The method comprises the following steps: the edge controller records the number of multiply-add operations required by the neural network and the identification accuracy of the neural network when videos with different frame numbers are input to the neural network; a control task is completed according to the video identification request of each mobile device, the control task comprising: sending sampling-frame-number control information to the video sampling management module corresponding to each mobile device, based on the number of video frames sampled by that device; determining, based on the user offloading decision, whether a user's inference task is completed in the local DNN inference module or the edge DNN inference module; and determining the uplink video transmission time proportion of each mobile device based on the communication resource allocation strategy for each mobile video device. Based on this optimization method, the average inference delay of the system and the average energy consumption of the mobile devices are reduced simultaneously, and the accuracy of model inference is improved.
Description
Technical Field
The invention mainly relates to the technical field of computers, in particular to a method and a system for optimally configuring joint resources at the edge of a terminal.
Background
The development of technologies such as networking, cloud computing, edge computing and artificial intelligence has sparked boundless imagination about the metaverse. To enable users to interact between the real world and the virtual world, Augmented Reality (AR) technology plays a crucial role. Meanwhile, artificial intelligence, with its learning and reasoning capabilities, plays an important role in fields such as automatic speech recognition, natural language processing and computer vision. With the aid of AI techniques, AR can enable deeper scene understanding and more immersive interaction.
However, the computational complexity of artificial intelligence algorithms, especially Deep Neural Networks (DNNs), is typically very high. Neural network inference is difficult to complete reliably and in a timely manner on mobile devices with limited computational and energy capacities. Experiments have shown that a typical single-frame image-processing AI inference task takes about 600 milliseconds, even with mobile GPU acceleration. Furthermore, continuous execution of such an inference task can last at most 2.5 hours on commercial devices. These problems mean that only a few AR applications currently use deep learning. To reduce the inference time of a DNN, one approach is network pruning of the neural network. However, if too many channels are pruned, the model may be damaged and satisfactory accuracy may not be restored by fine-tuning.
Mobile-edge-computing-assisted AI is another approach to addressing these problems. The integration of mobile edge computing and AI techniques has recently become a promising paradigm for supporting compute-intensive tasks. Edge computing shifts the inference and training of AI models to the network edge, close to the data source, thereby alleviating network traffic load, delay and privacy issues. However, edge-computing-assisted AI applications still face a number of challenges, as follows:
(1) although the computing resources at the edge are far stronger than those of the end user, they are still limited; relying on edge computing power alone cannot fully solve the problem of insufficient terminal computing capacity;
(2) offloading the AI inference task to the edge can mitigate the impact of insufficient computing capacity to some extent, but it introduces communication delay at the same time;
(3) delay, energy consumption and accuracy are mutually constrained: improving one of them inevitably degrades another.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, and provides a method and a system for optimizing configuration of terminal edge joint resources.
The invention provides a method for optimizing and configuring joint resources at the edge of a terminal, which comprises the following steps:
the edge controller records the number of multiply-add operations required by the neural network and the identification accuracy of the neural network when videos with different frame numbers are input to the neural network; when receiving a video identification request from each mobile device, completing a control task according to the video identification request of that mobile device, wherein the control task comprises the following steps:
determining the frame number of video samples of each mobile device, and sending the control information of the frame number of the samples to a video sample management module corresponding to the mobile device based on the frame number of the video samples of each mobile device;
determining a user offload decision, and determining whether the inference task of the user is completed in the local DNN inference module or the edge DNN inference module based on the user offload decision;
determining the time proportion of uplink video transmission of each mobile device based on the allocation strategy of the communication resource of each mobile video device;
and determining a resource allocation strategy of the edge DNN inference module to decide the CPU computing frequency of each offloading user.
The method further comprises the following steps:
the edge DNN inference module obtains the uploaded video from each device that needs to offload, completes inference according to the computing resources allocated by the edge controller, and sends the inference result to each mobile device.
The method further comprises the following steps:
the video sampling management module acquires the control information of the sampling frame number calculated at the edge, controls the sampling frame number of the mobile equipment based on the control information of the sampling frame, and determines the frame number of the input video for neural network inference.
The method further comprises the following steps:
and the local controller acquires the video from the video sampling management module and determines whether to transmit the video to the edge server according to the user unloading decision acquired from the edge controller.
The local controller acquires the video from the video sampling management module and determines whether to transmit the video to the edge server according to the user unloading decision acquired from the edge controller, wherein the method comprises the following steps:
if the user offloading decision is 1, transmitting the video to the edge server for inference, while the base station configures the communication resources for transmission;
and if the user offloading decision is 0, completing inference locally and allocating local CPU computing resources according to the local device information.
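The local controller's routing rule above can be sketched as follows; all names here (`route_video`, `edge_uplink`, `local_dnn`) are illustrative stand-ins, not part of the patent's actual implementation:

```python
# Sketch of the local controller's routing step: x_n == 1 uploads the video
# to the edge server, x_n == 0 runs DNN inference on the device itself.
def route_video(x_n: int, video, edge_uplink, local_dnn):
    """Route a sampled video according to the user offloading decision x_n."""
    if x_n == 1:
        # Offload: upload to the edge server; the base station has already
        # configured the uplink communication resources (time share t_n).
        return edge_uplink.send(video)
    else:
        # x_n == 0: run DNN inference locally with the allocated CPU frequency.
        return local_dnn.infer(video)
```

The controller only dispatches; the actual resource amounts (t_n, CPU frequency) are decided by the edge controller as described above.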
When the mobile device needs to complete DNN reasoning locally, the local DNN reasoning module acquires the video and completes DNN reasoning by using the allocated local computing resources.
Correspondingly, the invention also provides a video AI reasoning system with edge-end cooperation, which comprises:
The edge controller is used for recording the multiplication and addition number required by the neural network and the accuracy rate of neural network identification when videos with different frame numbers are input to the neural network; when receiving the video identification request of each mobile device, completing a control task according to the video identification request of the mobile device;
the edge DNN inference module is used for obtaining the uploaded video from each device that needs to offload, completing inference according to the computing resources allocated by the edge controller, and sending the inference result to each mobile device;
the video sampling management module is used for acquiring the control information of the sampling frame number of the edge calculation, controlling the sampling frame number of the mobile equipment based on the control information of the sampling frame, and determining the frame number of the input video for neural network inference;
the local controller is used for acquiring the video from the video sampling management module and determining whether to transmit the video to the edge server according to the user unloading decision acquired from the edge controller;
and the local DNN reasoning module is used for acquiring the video when the mobile equipment needs to locally finish DNN reasoning and finishing DNN reasoning by using the distributed local computing resources.
Completing the control task according to the video identification request of the mobile device comprises the following steps:
Determining the frame number of video samples of each mobile device, and sending the control information of the frame number of the samples to a video sample management module corresponding to the mobile device based on the frame number of the video samples of each mobile device;
determining a user offload decision, and determining whether the inference task of the user is completed in the local DNN inference module or the edge DNN inference module based on the user offload decision;
determining the time proportion of uplink video transmission of each mobile device based on the allocation strategy of the communication resource of each mobile video device;
and determining a resource allocation strategy of the edge DNN inference module to decide the CPU computing frequency of each offloading user.
If the user offloading decision is 1, the local controller transmits the video to the edge server for inference, while the base station configures the communication resources for transmission.
If the user offloading decision is 0, the video completes inference locally, and the local controller allocates local CPU computing resources according to the local device information.
The embodiment of the invention has the following beneficial effects:
(1) Based on an edge-terminal cooperative AI inference offloading framework, the invention provides an architecture for an edge-terminal cooperative video AI inference system: the edge server can determine the number of video frames each user uses for detection according to the number of requesting users, and provides the user offloading strategy and the user's communication and computing resource allocation scheme.
(2) Multi-dimensional performance optimization. Under the given system architecture, inference delay, terminal energy consumption and identification accuracy are jointly considered, and an effective algorithm is provided that improves the identification accuracy of the neural network while reducing delay and energy consumption.
(3) Performance trade-off analysis. The invention provides the trade-off relationship among inference delay, terminal energy consumption and identification accuracy; using this relationship, targeted system optimization can be performed to achieve the AI inference system performance required by different applications.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative effort.
FIG. 1 is a schematic structural diagram of a video AI inference system with edge-to-edge coordination according to an embodiment of the present invention;
FIG. 2 is a performance comparison diagram of different offloading policies in an embodiment of the invention;
fig. 3 is a schematic diagram of a trade-off relationship among delay, energy consumption, and recognition accuracy in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments. It is obvious that the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
The invention mainly solves the technical problem of providing an architecture for an edge-terminal cooperative video AI inference system for deep-learning-based video identification tasks, and provides a method, based on this architecture, for simultaneously optimizing the delay, energy consumption and accuracy of executing deep learning inference tasks. Meanwhile, the invention provides the trade-off relationship among delay, energy consumption and accuracy, which can serve as a reference for system design.
The invention provides an architecture for an edge-terminal cooperative video AI inference system. In a scenario where an edge server serves multiple mobile devices via wireless access and all devices need to perform AI-based video identification tasks, the system reduces the inference delay and energy consumption of the neural network by reasonably adjusting the number of video frames used for identification and jointly configuring wireless and computing resources, while improving the inference accuracy of the neural network.
Specifically, fig. 1 shows a schematic structural diagram of a video AI inference system with edge-to-end coordination in an embodiment of the present invention, where the system includes:
the edge controller is used for recording the multiplication and addition number required by the neural network and the accuracy rate of neural network identification when videos with different frame numbers are input to the neural network; when receiving the video identification request of each mobile device, completing a control task according to the video identification request of the mobile device;
the edge DNN inference module is used for obtaining the uploaded video from each device that needs to offload, completing inference according to the computing resources allocated by the edge controller, and sending the inference result to each mobile device;
the video sampling management module is used for acquiring the control information of the sampling frame number of the edge calculation, controlling the sampling frame number of the mobile equipment based on the control information of the sampling frame, and determining the frame number of the input video for neural network inference;
the local controller is used for acquiring the video from the video sampling management module and determining whether to transmit the video to the edge server according to the user unloading decision acquired from the edge controller;
and the local DNN reasoning module is used for acquiring the video when the mobile equipment needs to locally finish DNN reasoning and finishing DNN reasoning by using the distributed local computing resources.
Completing the control task according to the video identification request of the mobile device comprises the following steps: determining the number of video sampling frames of each mobile device, and sending the sampling-frame-number control information to the video sampling management module corresponding to each mobile device; determining the user offloading decision, and determining whether the user's inference task is completed in the local DNN inference module or the edge DNN inference module based on that decision; determining the uplink video transmission time proportion of each mobile device based on the communication resource allocation strategy for each mobile video device; and determining a resource allocation strategy of the edge DNN inference module to decide the CPU computing frequency of each offloading user.
If the user offloading decision is 1, the local controller transmits the video to the edge server for inference, while the base station configures the communication resources for transmission; if the user offloading decision is 0, the video completes inference locally, and local CPU computing resources are allocated according to the local device information.
Based on the system structure shown in fig. 1, the method for optimal configuration of terminal-edge joint resources in the embodiment of the present invention comprises: the edge controller records the number of multiply-add operations required by the neural network and the identification accuracy of the neural network when videos with different frame numbers are input to the neural network; when receiving a video identification request from each mobile device, a control task is completed according to the video identification request of that mobile device, the control task comprising the following steps: determining the number of video sampling frames of each mobile device, and sending the sampling-frame-number control information to the video sampling management module corresponding to each mobile device; determining the user offloading decision, and determining whether the user's inference task is completed in the local DNN inference module or the edge DNN inference module based on that decision; determining the uplink video transmission time proportion of each mobile device based on the communication resource allocation strategy for each mobile video device; and determining a resource allocation strategy of the edge DNN inference module to decide the CPU computing frequency of each offloading user.
Further, the edge DNN inference module obtains the uploaded video from each device that needs to offload, completes inference according to the computing resources allocated by the edge controller, and sends the inference result to each mobile device.
Further, the video sampling management module acquires the control information of the sampling frame number calculated at the edge, controls the sampling frame number of the mobile device based on the control information of the sampling frame, and determines the frame number of the input video for neural network inference.
Further, the local controller obtains the video from the video sampling management module, and determines whether to transmit the video to the edge server according to the user unloading decision obtained from the edge controller.
Further, the step of the local controller acquiring the video from the video sampling management module and determining whether to transmit the video to the edge server according to the user offloading decision acquired from the edge controller includes: if the user unloading decision is 1, transmitting the video to an edge server for reasoning, and meanwhile, configuring a transmitted communication resource by a base station; and if the user unloading decision is 0, allowing the video to locally finish reasoning, and distributing local CPU computing resources according to the local equipment information.
Further, when the mobile device needs to complete DNN reasoning locally, the local DNN reasoning module acquires the video and completes DNN reasoning by using the allocated local computing resources.
It should be noted that, based on the above system architecture, the present invention provides a corresponding optimization scheme that simultaneously reduces the average inference delay of the system and the average energy consumption of the mobile devices while improving the accuracy of model inference. The specific algorithm is as follows:
Firstly, the number of multiply-add operations $C(M_n)$ required to complete neural network model inference is obtained using the multiply-add model of the neural network. Considering that, for each layer of the neural network, the number of multiply-add operations needed to complete inference is proportional to the input size, it can be derived that the total number of multiply-add operations can be roughly represented by a linear function of the number of input video frames, noted as:

$$C(M_n) = m_{c,0} M_n + m_{c,1}$$

where $m_{c,0}$ and $m_{c,1}$ are parameters obtained by fitting, determined by the architecture of the model, and $M_n$ is the sampling-frame-number control information. From the multiply-add count, the computation delay $D_n$ and energy consumption $E_n$ are given by:

$$D_n = (1 - x_n)\,\frac{\rho\, C(M_n)}{f_n^{l}} + x_n\left(\frac{d\, M_n}{t_n R_n} + \frac{\rho\, C(M_n)}{f_n^{e}}\right)$$

$$E_n = (1 - x_n)\, k\, (f_n^{l})^{2}\, \rho\, C(M_n) + x_n\,\frac{p_n\, d\, M_n}{t_n R_n}$$

where $\rho$ represents the number of CPU cycles required for each multiply-add operation, $d$ represents the size of each frame of video, $R_n$ represents the communication rate of the $n$-th mobile device, $t_n$ represents the communication time proportion (i.e. the communication resource allocation proportion) of the $n$-th mobile device, $k$ represents the energy consumption coefficient, $p_n$ represents the transmission power of the mobile device, $f_n^{l}$ represents the CPU computing resource allocated locally, $f_n^{e}$ represents the computing resource allocated by the edge controller, and $x_n$ represents the user offloading decision ($x_n = 1$ means the task is offloaded to the edge; otherwise it is computed locally).
As for accuracy, the larger the number of input video frames, the higher the accuracy of model prediction; and as the number of input frames increases, the gain in prediction accuracy gradually diminishes. The accuracy $\Phi(M_n)$ can therefore be expressed as a concave, increasing function of $M_n$ with parameters $m_{a,0}$, $m_{a,1}$ and $m_{a,2}$ obtained by fitting, whose values are determined by the neural network model and task.
In summary, the optimization objective function is:

$$\min \sum_{n} \left(\beta_1 D_n + \beta_2 E_n - \beta_3 \Phi(M_n)\right)$$

where $\beta_1$, $\beta_2$, $\beta_3$ are weight coefficients; the optimization objective is to reduce the total delay $D_n$ and total energy consumption $E_n$ while improving the user identification accuracy $\Phi(M_n)$. The constraints are: the frame number of each user lies within a given range; the sum of the communication time proportions of the devices participating in offloading is less than 1; the sum of the computing frequencies allocated by the edge device is below its upper limit; the allocated communication time and computing frequencies are greater than 0; the computing frequency allocated to each mobile device is greater than 0 and less than its maximum value; and $x_n \in \{0, 1\}$, i.e. each user either offloads or does not.
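The per-device objective terms can be sketched as follows. The delay and energy expressions follow the definitions above; the accuracy curve `phi` and all numeric defaults are illustrative assumptions, since the patent's fitted parameters are not reproduced here:

```python
# Weighted per-device cost b1*D_n + b2*E_n - b3*Phi(M_n). All default
# parameter values and the saturating accuracy curve are illustrative.
import math

def cost(M_n, x_n, f_local, f_edge, t_n, R_n,
         rho=12.0, d=8e4, k=1e-28, p_n=0.1,
         m_c0=0.5e9, m_c1=1e8,            # multiply-add model (assumed fit)
         b1=0.2, b2=0.2, b3=0.6):
    C = m_c0 * M_n + m_c1                 # multiply-add count C(M_n)
    if x_n == 1:                          # offload: uplink transfer + edge compute
        D = d * M_n / (t_n * R_n) + rho * C / f_edge
        E = p_n * d * M_n / (t_n * R_n)   # device only spends energy transmitting
    else:                                 # local inference
        D = rho * C / f_local
        E = k * f_local ** 2 * rho * C
    phi = 1.0 - math.exp(-0.1 * M_n)      # assumed concave accuracy stand-in
    return b1 * D + b2 * E - b3 * phi
```

With these stand-in numbers, offloading is cheaper than local inference for a weak device, and fewer frames reduce the local cost, matching the trade-offs described in the text.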
To solve the optimization problem, the offloading policy $x_n$ is first taken as given; the problem then decomposes into two sub-problems: a resource optimization problem for the mobile devices that perform inference locally, and a resource optimization problem for the mobile devices that perform inference at the edge. The set of users that complete inference locally is $N_0$, and the set of users that complete inference at the edge is $N_1$; the optimization problem is thus transformed into two sub-optimization problems.
For the resource optimization problem over $N_0$, the objective is to minimize the cost function under local computation. The constraints to be satisfied are that, first, the number of user frames conforms to the given range and, second, the computing frequency allocated to the mobile device is greater than 0 and less than its maximum value. By derivation, a closed-form expression solving this sub-problem can be obtained.
for N1The resource optimization problem of (2), which is expressed as:
whereinHooking the cost function when unloading to the edge calculation. The constraint conditions to be satisfied are that firstly, the number of the user frames accords with a given range, secondly, the sum of the communication time proportion of the equipment participating in unloading is less than 1, thirdly, the sum of the distributed calculation frequency of the edge equipment is less than the upper limit, and fourthly, the distributed communication time and the distributed calculation frequency of the edge are more than 0.
the problem can then be solved by a convex optimization method.
For the offloading policy $x_n$, the embodiment of the invention uses a greedy iterative algorithm. Observe that when inference is performed locally, the cost function and the optimization variables $M_n$ and $f_n^{l}$ depend only on the parameters of that device and are unaffected by the parameters of other devices; for the edge set, however, the cost function depends on the number of devices in the set and on their parameters. The principle of the algorithm is as follows. First, compute the cost function of each device when its task is executed locally. Second, let all devices initially offload to the edge server for inference. In each iteration, for each device in the edge set, compare its edge cost function with its local cost function, and select as device y the device with the largest difference (edge cost minus local cost). Tentatively move device y from the edge set into the local set and compute the cost of the new sets. If the total cost of the new sets decreases, continue to the next iteration; otherwise, put device y back into the edge set and terminate the algorithm.
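The greedy iteration described above can be sketched as follows. The cost callables are placeholders for the two sub-problem solvers, and the simple congestion model in the usage example (edge cost grows with the number of offloaders) is illustrative, not the patent's exact cost function:

```python
# Greedy offloading: start with every device at the edge, then repeatedly
# move the device with the largest edge-vs-local cost gap to local execution,
# as long as the total cost keeps decreasing.
def greedy_offload(local_costs, edge_cost_fn):
    """local_costs[i]: cost of device i when inferring locally.
    edge_cost_fn(edge_set): dict {i: cost} when the devices in edge_set offload."""
    n = len(local_costs)
    edge = set(range(n))                  # start with all devices offloading
    local = set()
    while edge:
        e_costs = edge_cost_fn(edge)
        # Device y: largest saving from moving edge -> local.
        y = max(edge, key=lambda i: e_costs[i] - local_costs[i])
        trial_edge = edge - {y}
        old_total = sum(e_costs[i] for i in edge) + sum(local_costs[i] for i in local)
        new_total = (sum(edge_cost_fn(trial_edge)[i] for i in trial_edge)
                     + sum(local_costs[i] for i in local | {y}))
        if new_total < old_total:
            edge, local = trial_edge, local | {y}
        else:
            break                         # no improvement: keep y at the edge, stop
    return edge, local
```

For example, with per-device edge cost `1.0 + 0.5 * len(edge_set)` and local costs `[2.0, 3.0, 10.0, 10.0]`, the two cheap-to-run-locally devices migrate off the congested edge while the expensive ones stay offloaded.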
The downlink bandwidth is set to 5 MHz, and the path loss is modeled as $PL = 128.1 + 37.6 \log_{10}(D)$, where $D$ is the distance between the device and the wireless access point in kilometers. The devices are randomly distributed within a 500 m × 500 m area. The computing resources of the devices and the MEC server are set to 1.8 GHz and 22 GHz, respectively. The identification accuracy requirement and the maximum number of input video frames are determined by the corresponding device, and the energy coefficient is set to $k = 10^{-28}$. The size of the input video is $112 \times M_n$. Further, the computational complexity coefficient is set to $\rho = 12$, obtained through multiple experiments. The weights $\beta_1$, $\beta_2$, $\beta_3$ are set to 0.2, 0.2 and 0.6, respectively.
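As a small illustration of the experimental channel model, the path-loss formula above can be evaluated directly. The Shannon-rate step and its power/noise figures are assumptions added for illustration, not values from the patent:

```python
# Path-loss model from the setup: PL(dB) = 128.1 + 37.6 * log10(D), D in km.
import math

def path_loss_db(distance_km: float) -> float:
    return 128.1 + 37.6 * math.log10(distance_km)

def rate_bps(distance_km, bandwidth_hz=5e6, tx_power_dbm=23.0, noise_dbm=-104.0):
    """Illustrative Shannon rate for a device at the given distance.
    Transmit power and noise floor are assumed, not taken from the patent."""
    snr_db = tx_power_dbm - path_loss_db(distance_km) - noise_dbm
    return bandwidth_hz * math.log2(1.0 + 10 ** (snr_db / 10.0))
```

A device farther from the access point sees a higher path loss and thus a lower achievable rate $R_n$, which feeds the transmission-delay term of the cost model.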
The proposed offloading scheme is compared with a local inference scheme (Local), an edge inference scheme (Edge) and a random offloading scheme (Random); the experimental results are shown in fig. 2. When the number of devices is less than 10, the cost of the scheme that performs tasks only at the edge is almost equal to that of the proposed offloading scheme. This is because, when the number of devices is small, all devices can benefit from performing inference on the edge server. If the inference task is performed only locally, the average cost of the devices does not change, since the local resources of different devices do not affect each other. The curve of the Edge scheme is linear because the AI model is the same for all users in the experiment.
The experiment used different weights $\beta_1$, $\beta_2$, $\beta_3$ to analyze the trade-off among average delay, energy consumption and accuracy. The trade-off surface is obtained by the proposed offloading and allocation scheme under the constraint $\beta_1 + \beta_2 + \beta_3 = 1$. As shown in fig. 3, delay, energy consumption and accuracy mutually limit and balance one another. When the delay is held constant, higher identification accuracy requires higher energy consumption. From another perspective, to improve accuracy, delay and energy-consumption performance must be sacrificed. Furthermore, at the same accuracy, higher energy consumption makes devices more inclined to perform inference tasks locally, resulting in reduced latency.
The embodiment of the invention has the following beneficial effects:
(1) Based on an edge-terminal cooperative AI inference offloading framework, the invention provides an architecture for an edge-terminal cooperative video AI inference system: the edge server can determine the number of video frames each user uses for detection according to the number of requesting users, and provides the user offloading strategy and the user's communication and computing resource allocation scheme.
(2) Multi-dimensional performance optimization. Under the given system architecture, inference delay, terminal energy consumption, and recognition accuracy are considered jointly, and an effective algorithm is provided that improves the recognition accuracy of the neural network while reducing delay and energy consumption.
(3) Performance trade-off analysis. The invention provides the trade-off relation among inference delay, terminal energy consumption, and recognition accuracy; using this relation, targeted system optimization can be performed to improve the AI inference performance of the system in different applications.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by related hardware instructed by a program, and the program may be stored in a computer-readable storage medium; the storage medium may include a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, an optical disk, or the like.
In addition, the above embodiments of the present invention are described in detail; specific examples are used herein to illustrate the principle and implementation of the invention, and the above description of the embodiments is intended only to help understand the method and its core idea. Meanwhile, a person skilled in the art may, following the idea of the invention, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the invention.
Claims (10)
1. A method for optimal configuration of terminal edge joint resources, the method comprising:
the edge controller records, when videos with different frame counts are input to the neural network, the number of multiply-add operations required by the neural network and the recognition accuracy of the neural network; when receiving a video recognition request from each mobile device, the edge controller completes a control task according to the request, wherein the control task comprises the following steps:
determining the number of video sampling frames for each mobile device, and sending the sampling frame count control information to the video sampling management module of the corresponding mobile device;
determining a user offloading decision, and determining, based on the user offloading decision, whether the user's inference task is completed in the local DNN inference module or in the edge DNN inference module;
determining the uplink video transmission time proportion of each mobile device based on the communication resource allocation strategy for each mobile video device;
and determining a resource allocation strategy of the edge DNN inference module to decide the CPU computing frequency of each offloaded user.
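The control task enumerated in claim 1 can be sketched as a single decision routine. This is a minimal illustration with placeholder heuristics (a CPU-frequency threshold for offloading, and equal sharing of uplink time and edge CPU among offloaders); the patent's actual scheme solves a joint optimization rather than applying these rules.

```python
# Sketch of the edge controller's control task: per device, decide sampling
# frames, offloading, uplink time share, and edge CPU frequency. The decision
# rules are simple placeholders, not the patent's optimization algorithm.

from dataclasses import dataclass

@dataclass
class ControlDecision:
    sample_frames: int      # video frames the device samples per request
    offload: bool           # True -> edge DNN inference, False -> local
    uplink_share: float     # fraction of uplink transmission time (0 if local)
    edge_cpu_hz: float      # edge CPU frequency granted (0 if local)

def control_task(local_cpu_hz, edge_cpu_total_hz,
                 max_frames=16, cpu_threshold_hz=1.5e9):
    """Decide frames, offloading, and resources for each requesting device."""
    num_users = len(local_cpu_hz)
    # Fewer requesting users -> each may submit more frames for detection.
    frames = max(1, max_frames // max(1, num_users))
    offloaders = [f for f in local_cpu_hz if f < cpu_threshold_hz]
    decisions = []
    for f in local_cpu_hz:
        offload = f < cpu_threshold_hz   # weak devices offload to the edge
        decisions.append(ControlDecision(
            sample_frames=frames,
            offload=offload,
            uplink_share=(1.0 / len(offloaders)) if offload else 0.0,
            edge_cpu_hz=(edge_cpu_total_hz / len(offloaders)) if offload else 0.0,
        ))
    return decisions

d = control_task(local_cpu_hz=[1e9, 2e9, 1.2e9, 3e9], edge_cpu_total_hz=8e9)
print([x.offload for x in d])   # the two devices below the threshold offload
```

The four fields of `ControlDecision` correspond one-to-one to the four determining steps of claim 1.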
2. The method for optimal configuration of terminal edge joint resources according to claim 1, wherein the method further comprises:
the edge DNN inference module obtains the uploaded video from the devices that need offloading, completes inference according to the computing resources allocated by the edge controller, and sends the inference result to each mobile device.
3. The method for optimal configuration of terminal edge joint resources according to claim 1, wherein the method further comprises:
the video sampling management module acquires the sampling frame count control information computed at the edge, controls the sampling frame count of the mobile device based on that control information, and determines the number of frames of the input video for neural network inference.
4. The method for optimal configuration of terminal edge joint resources according to claim 1, wherein the method further comprises:
and the local controller acquires the video from the video sampling management module and determines whether to transmit the video to the edge server according to the user offloading decision acquired from the edge controller.
5. The method of claim 4, wherein the local controller obtaining video from a video sampling management module and deciding whether to transmit the video to the edge server based on the user offload decision obtained from the edge controller comprises:
if the user offloading decision is 1, transmitting the video to the edge server for inference, with the base station configuring the communication resources for the transmission;
and if the user offloading decision is 0, completing inference on the video locally, and allocating local CPU computing resources according to the local device information.
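The 0/1 dispatch of claims 4–5 can be sketched as follows; `upload_to_edge` and `run_local_dnn` are hypothetical stand-ins for the real transmission path and local inference path.

```python
# Sketch of the local controller's dispatch: decision 1 uploads the video
# for edge inference, decision 0 runs the local DNN. The two callables are
# hypothetical placeholders, not APIs defined by the patent.

def dispatch(video, offload_decision, upload_to_edge, run_local_dnn):
    if offload_decision == 1:
        # Base station configures the uplink; edge server performs inference.
        return upload_to_edge(video)
    elif offload_decision == 0:
        # Local CPU computing resources are allocated from device information.
        return run_local_dnn(video)
    raise ValueError("offload decision must be 0 or 1")

result = dispatch("frame_batch",
                  offload_decision=1,
                  upload_to_edge=lambda v: f"edge_result({v})",
                  run_local_dnn=lambda v: f"local_result({v})")
print(result)  # edge_result(frame_batch)
```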
6. The method for optimal configuration of terminal edge joint resources according to claim 1, wherein, when the mobile device needs to complete DNN inference locally, the local DNN inference module obtains the video and completes DNN inference using the allocated local computing resources.
7. An edge-side collaborative video AI inference system, the system comprising:
the edge controller is used for recording, when videos with different frame counts are input to the neural network, the number of multiply-add operations required by the neural network and the recognition accuracy of the neural network, and for completing a control task according to the video recognition request of each mobile device when such a request is received;
the edge DNN inference module is used for obtaining the uploaded video from the devices that need offloading, completing inference according to the computing resources allocated by the edge controller, and sending the inference result to each mobile device;
the video sampling management module is used for acquiring the sampling frame count control information computed at the edge, controlling the sampling frame count of the mobile device based on that control information, and determining the number of frames of the input video for neural network inference;
the local controller is used for acquiring the video from the video sampling management module and determining whether to transmit the video to the edge server according to the user offloading decision acquired from the edge controller;
and the local DNN inference module is used for acquiring the video when the mobile device needs to complete DNN inference locally, and completing DNN inference using the allocated local computing resources.
8. The edge-side collaborative video AI inference system of claim 7, wherein said completing a control task according to the video recognition request of a mobile device comprises:
determining the number of video sampling frames for each mobile device, and sending the sampling frame count control information to the video sampling management module of the corresponding mobile device;
determining a user offloading decision, and determining, based on the user offloading decision, whether the user's inference task is completed in the local DNN inference module or in the edge DNN inference module;
determining the uplink video transmission time proportion of each mobile device based on the communication resource allocation strategy for each mobile video device;
and determining a resource allocation strategy of the edge DNN inference module to decide the CPU computing frequency of each offloaded user.
9. The edge-side collaborative video AI inference system of claim 8, wherein, if the user offloading decision is 1, the local controller transmits the video to the edge server for inference, with the communication resources for the transmission configured by the base station.
10. The edge-side collaborative video AI inference system of claim 8, wherein, if the user offloading decision is 0, the local controller allows the video to complete inference locally and allocates local CPU computing resources according to the local device information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210455191.5A CN114756371A (en) | 2022-04-27 | 2022-04-27 | Method and system for optimal configuration of terminal edge joint resources |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114756371A true CN114756371A (en) | 2022-07-15 |
Family
ID=82333211
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210455191.5A Pending CN114756371A (en) | 2022-04-27 | 2022-04-27 | Method and system for optimal configuration of terminal edge joint resources |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114756371A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Elgendy et al. | Joint computation offloading and task caching for multi-user and multi-task MEC systems: reinforcement learning-based algorithms | |
CN109857546B (en) | Multi-server mobile edge computing unloading method and device based on Lyapunov optimization | |
CN113950066A (en) | Single server part calculation unloading method, system and equipment under mobile edge environment | |
CN110968426B (en) | Edge cloud collaborative k-means clustering model optimization method based on online learning | |
CN113268341B (en) | Distribution method, device, equipment and storage medium of power grid edge calculation task | |
CN113242568A (en) | Task unloading and resource allocation method in uncertain network environment | |
CN110798849A (en) | Computing resource allocation and task unloading method for ultra-dense network edge computing | |
CN114189892A (en) | Cloud-edge collaborative Internet of things system resource allocation method based on block chain and collective reinforcement learning | |
CN112422644B (en) | Method and system for unloading computing tasks, electronic device and storage medium | |
CN110096362B (en) | Multitask unloading method based on edge server cooperation | |
CN110519370B (en) | Edge computing resource allocation method based on facility site selection problem | |
CN113568727A (en) | Mobile edge calculation task allocation method based on deep reinforcement learning | |
CN112214301B (en) | Smart city-oriented dynamic calculation migration method and device based on user preference | |
CN113645637B (en) | Method and device for unloading tasks of ultra-dense network, computer equipment and storage medium | |
CN116112981B (en) | Unmanned aerial vehicle task unloading method based on edge calculation | |
CN113573363A (en) | MEC calculation unloading and resource allocation method based on deep reinforcement learning | |
CN113590279A (en) | Task scheduling and resource allocation method for multi-core edge computing server | |
CN114281718A (en) | Industrial Internet edge service cache decision method and system | |
CN114980160A (en) | Unmanned aerial vehicle-assisted terahertz communication network joint optimization method and device | |
CN115098115A (en) | Edge calculation task unloading method and device, electronic equipment and storage medium | |
CN115408072A (en) | Rapid adaptation model construction method based on deep reinforcement learning and related device | |
Li | Optimization of task offloading problem based on simulated annealing algorithm in MEC | |
CN114339891A (en) | Edge unloading resource allocation method and system based on Q learning | |
CN117560721A (en) | Resource allocation method and device, electronic equipment and storage medium | |
CN117202265A (en) | DQN-based service migration method in edge environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||