CN111522657A - Distributed equipment collaborative deep learning reasoning method - Google Patents

Distributed equipment collaborative deep learning reasoning method

Info

Publication number
CN111522657A
CN111522657A (application CN202010289197.0A)
Authority
CN
China
Prior art keywords
equipment
cache
neural network
edge
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010289197.0A
Other languages
Chinese (zh)
Other versions
CN111522657B (en)
Inventor
白跃彬
胡传文
王锐
刘畅
汪啸林
江文灏
程琨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202010289197.0A
Publication of CN111522657A
Application granted
Publication of CN111522657B
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5072 Grid computing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Neurology (AREA)
  • Computer And Data Communications (AREA)
  • Multi Processors (AREA)

Abstract

The invention discloses a cache-based method for deploying a deep neural network across distributed edge devices. The method partitions the neural network, prunes the layer immediately before the partition point, computes the first part of the network on the task-initiating device, and transmits a small intermediate result to another edge device that computes the remainder. In addition, the intermediate results produced on the edge devices are cached, reused, and shared among devices. This reduces the latency of edge intelligent applications and the performance demands the neural network places on edge devices; in particular, it reduces redundant computation when the edge side issues intelligent task requests on similar data, and makes full use of the computing resources available in edge scenarios.

Description

Distributed equipment collaborative deep learning reasoning method
Technical Field
The invention relates to the fields of artificial intelligence and edge computing in computer science, and in particular to a deep learning inference method that combines cooperative computation on dispersed devices with caching.
Background
Edge computing is a new computing paradigm that uses the computing and communication resources of edge devices to meet user requirements for real-time response, privacy, security, and computational autonomy of services. Driven by rapid advances in algorithms, computing power, and big data, deep learning, the most active field of artificial intelligence, has made great progress in many domains. With the development of the Internet of Things and cyber-physical systems (CPS), new applications such as autonomous driving, intelligent drone formations, and intelligent robot clusters are driving the fusion of edge computing and artificial intelligence and have spurred the emergence and rapid development of edge intelligence. Deploying and running deep learning functions on resource-constrained edge devices poses a major technical challenge for edge intelligence.
Deep learning computation on edge devices generally has to be accelerated beyond the conventional general-purpose approach. Common acceleration methods include:
(1) Model early exit: DNN models with high accuracy typically have deep structures, and executing them on terminal devices consumes a large amount of resources. To speed up inference, the model early-exit method uses the output of early layers to obtain the classification result, which means that inference is completed using only part of the DNN model. Reducing latency is the optimization goal of model early exit.
(2) Input filtering: input filtering is an effective way to accelerate DNN inference, especially for video analytics. Its key idea is to remove input frames that do not contain target objects and thus avoid redundant DNN inference, improving inference accuracy, shortening inference latency, and reducing energy consumption.
(3) Model selection: model selection is proposed to optimize DNN inference with respect to latency, accuracy, and energy consumption. The main idea is to first train, offline, a set of DNN models of different sizes for the same task, and then adaptively select a model for online inference. Model selection is similar to model early exit, in that each exit point of the early-exit mechanism can be viewed as a separate DNN model. The key difference is that an exit point shares part of its DNN layers with the main branch model, whereas the models in a model-selection mechanism are independent.
Each of these methods can accelerate a deep learning model in some respect, but none makes full use of the interconnection between edge devices, nor do they exploit the fact that an edge device is likely to run intelligent tasks repeatedly over a period of time on input data that are similar or even identical. The invention therefore provides a deep learning inference method that combines edge cooperative computation with caching for such edge scenarios.
Disclosure of Invention
The invention integrates caching, neural networks, and edge computing. It uses geographically dispersed computing devices in an edge scenario to perform deep learning computation cooperatively. The task-initiating device performs part of the computation and caches the intermediate result together with the final label. If the similarity between an intermediate result and the content of the cache exceeds a threshold, the cached label is returned as the result; otherwise the intermediate result is transmitted to another dispersed device, which completes the computation and returns the final result to the task-initiating device. Intelligent tasks can thus be computed in an edge scenario without any data exchange with a cloud data center. The method comprises a dispersed device selection step and a cache-based distributed neural network computation step, as follows:
the distributed equipment selection requires that each equipment registers own equipment information including IP addresses, port numbers and other equipment information in a registration center, the registration center acquires information such as network transmission speed with other equipment, memory, load, calculation performance, electric quantity and the like when initiating tasks, then the weighted summation is carried out on each piece of performance information according to specific conditions, and the equipment with the highest weighted value is selected as the other equipment for distributed neural network calculation.
For cache-based collaborative inference in the edge scenario, the neural network is partitioned according to the information of the two devices and of the neural network model. The partition exploits the layered structure of the neural network: different layers have different computational cost depending on the layer type and its position in the network. The latency of convolutional and pooling layers is much smaller than that of fully connected layers, especially on GPUs; convolutional and pooling layers usually sit at the front of the network and fully connected layers at the back; and the data volume generally shrinks through the early layers, so the output of later layers is usually much smaller than the original input. Transmitting or caching a later-layer intermediate result therefore takes far less time than transmitting or caching the raw data. After partitioning, structured pruning can be applied to the layer immediately before the partition point, further reducing the size of the transmitted intermediate result; because only a single layer is pruned, the complexity of the model is essentially unchanged and its accuracy is essentially preserved. The task-initiating device then performs the first part of the computation and uses the resulting features as a key to query the cache. If the similarity is high enough, the corresponding cached value is returned as the final result; otherwise the intermediate result is transmitted to the other device, which performs the remaining computation and returns the final result to the task-initiating device. The content of the cache can be adjusted to fit the available memory; even if only the most recent entry is kept, the extra time and computation of repeatedly initiating the same task can be effectively avoided.
The other device is invoked remotely to perform the remaining neural network computation using the dynamic proxy pattern, so that different tasks can share the same remote invocation module through dynamically generated proxy classes.
Compared with the prior art, the innovations of the invention are as follows: exploiting the similarity of intelligent task data in edge scenarios and the layered structure of neural networks, the neural network is run, combined with a distributed cache, on geographically dispersed edge devices, so that the computing resources of edge devices are used effectively and deep learning tasks that require large amounts of resources can be deployed efficiently on them; after the network is partitioned, structured pruning of the layer that produces the intermediate result substantially reduces the size of the transmitted intermediate result while leaving the model's complexity and accuracy essentially unaffected; the remaining neural network computation is invoked remotely on other devices using the dynamic proxy pattern, so that multiple computation tasks share one remote invocation framework without concerning themselves with the details of remote invocation; and the computation of the cache key is shared with the first half of the neural network of the intelligent task, requiring no extra computing resources or time.
Drawings
FIG. 1 Schematic diagram of dispersed device selection
FIG. 2 Schematic diagram of neural network model partitioning
FIG. 3 Schematic diagram of cache-based collaborative inference in an edge scenario
Detailed Description
The invention mainly comprises two steps: a dispersed device selection step and a cache-based distributed neural network computation step.
Dispersed device selection
Because devices in an edge scenario are generally heterogeneous, with different computing, network, and storage performance, the cooperative device must be selected sensibly before a cooperative intelligent computing task is initiated; and because an edge scenario is generally centerless, each device can act either as a task-initiating device or as a cooperative device. As shown in FIG. 1, so that other devices know of its existence, each device registers its IP address, port number, and other performance information with the registry and establishes a stable connection while ensuring network reliability. The task-initiating device obtains the information of other devices from the registry. It first screens out, from all connected devices, those with insufficient memory, insufficient battery, or excessive load, then ranks the remaining devices by a weighted combination of computing and network performance, and selects the top-ranked device as the cooperative device. If the two devices together still cannot meet the computation requirements, the second device selects the next cooperative device in the same way from among the devices with which it has a stable connection, and so on. Device information must be kept up to date: the registry must be notified promptly when a device's load, memory occupation, or similar conditions change.
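A minimal sketch of the screening and weighted ranking described above, written in Python, is given below; the attribute names, thresholds, and weights are illustrative assumptions rather than values fixed by the invention.

```python
from dataclasses import dataclass

@dataclass
class DeviceInfo:
    ip: str
    port: int
    free_memory_mb: float
    battery_pct: float
    load: float            # normalized load in [0, 1]
    compute_score: float   # higher is better (e.g. measured throughput)
    bandwidth_mbps: float  # measured link speed to the task-initiating device

def select_cooperative_device(candidates, min_memory_mb=512, min_battery_pct=20,
                              max_load=0.9, w_compute=0.5, w_network=0.5):
    """Screen out unsuitable devices, then rank the rest by a weighted score."""
    eligible = [d for d in candidates
                if d.free_memory_mb >= min_memory_mb
                and d.battery_pct >= min_battery_pct
                and d.load <= max_load]
    if not eligible:
        return None  # no connected device satisfies the constraints
    # Weighted sum of computing and network performance; highest score wins.
    return max(eligible,
               key=lambda d: w_compute * d.compute_score + w_network * d.bandwidth_mbps)
```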
To handle device failures, network disconnections, and similar events, each device must send heartbeat messages to the registry at regular intervals; when no heartbeat has been received for a continuous period, the device is considered failed and is removed from the registry.
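The heartbeat handling on the registry side might be sketched as follows; the interval and timeout values are illustrative assumptions.

```python
import time

HEARTBEAT_INTERVAL_S = 5   # assumed interval at which devices report liveness
FAILURE_TIMEOUT_S = 15     # assumed window after which a silent device is dropped

last_heartbeat = {}        # device_id -> timestamp of the last heartbeat received

def on_heartbeat(device_id):
    """Called by the registry whenever a heartbeat message arrives."""
    last_heartbeat[device_id] = time.time()

def evict_failed_devices():
    """Remove devices that have not sent a heartbeat within the timeout window."""
    now = time.time()
    for device_id, ts in list(last_heartbeat.items()):
        if now - ts > FAILURE_TIMEOUT_S:
            del last_heartbeat[device_id]   # treat the device as failed or unreachable
```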
Cache-based collaborative inference in the edge scenario
Because of the layered nature of neural networks, different layers have different computational cost depending on layer type and position: the latency of convolutional and pooling layers is much smaller than that of fully connected layers, especially on GPUs; convolutional and pooling layers usually sit at the front of the network and fully connected layers at the back; and the data volume generally shrinks through the early layers, so the output of later layers is usually much smaller than the original input. Transmitting or caching a later-layer intermediate result therefore takes far less time than transmitting or caching the raw data, and the network can be split into parts that run on different devices in a distributed manner. Distributed neural network inference first requires a sensible partition of the network. Two kinds of factors influence the optimal split point: static factors, such as the structure of the model, and dynamic factors, such as network transmission speed, the load of the edge server, and the remaining battery of the device. For the model structure, the per-layer running time of the intelligent inference application can be measured on the edge device and the edge server before deployment, which avoids having to model framework or hardware performance and is more accurate. Dynamic factors must be collected in real time; the network transmission speed, for example, can be measured with the iperf tool. As shown in FIG. 2, the network model is loaded, the type and parameters of each layer in the DNN model are analyzed, and a prediction model is used to predict the latency of each layer on each mobile device. The split position is then chosen from these predictions together with the current battery level and load of the edge device; when the battery is low, the split position is moved earlier to save energy.
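Assuming per-layer latencies have been measured offline on both devices and the link bandwidth is measured at run time (for example with iperf), the split-point search could be sketched as below; the low-battery adjustment is only an illustrative stand-in for the "move the split earlier" policy described above.

```python
def choose_split_point(device_latency, server_latency, layer_output_bytes,
                       bandwidth_bytes_per_s, battery_pct, low_battery_pct=30):
    """Pick the layer after which to split the DNN between the two devices.

    device_latency[i]     : measured latency of layer i on the task-initiating device
    server_latency[i]     : measured latency of layer i on the cooperative device
    layer_output_bytes[i] : size of the intermediate result produced by layer i
    """
    n = len(device_latency)
    best_split, best_total = 1, float("inf")
    for split in range(1, n):  # at least one layer runs on each device
        local = sum(device_latency[:split])
        transfer = layer_output_bytes[split - 1] / bandwidth_bytes_per_s
        remote = sum(server_latency[split:])
        total = local + transfer + remote
        if total < best_total:
            best_split, best_total = split, total
    if battery_pct < low_battery_pct:
        best_split = max(1, best_split // 2)  # low battery: shift the split earlier
    return best_split
```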
After the neural network is partitioned, structured pruning can be applied to the layer that produces the intermediate result to be transmitted. Specifically, a trained neural network model is obtained; the sums of the absolute values of the parameters of the different convolution kernels in that layer are sorted; a given fraction of the kernels with the largest absolute sums is kept and the remaining kernels are pruned (i.e., discarded); finally the neural network is fine-tuned.
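A sketch of this kernel-level structured pruning, written with PyTorch for concreteness, might look like the following; the keep ratio is an assumed parameter, and the fine-tuning pass and the adaptation of the following layer are omitted.

```python
import torch

def prune_conv_filters(conv, keep_ratio=0.7):
    """Keep the filters of `conv` whose summed absolute weights are largest.

    `conv` is a torch.nn.Conv2d; a smaller Conv2d with the surviving filters
    copied in is returned.  The following layer and a fine-tuning pass (not
    shown) must be adapted to the reduced number of output channels.
    """
    weight = conv.weight.data                      # shape: (out_channels, in_channels, kH, kW)
    scores = weight.abs().sum(dim=(1, 2, 3))       # importance of each convolution kernel
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    keep_idx = torch.argsort(scores, descending=True)[:n_keep]

    pruned = torch.nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                             stride=conv.stride, padding=conv.padding,
                             bias=conv.bias is not None)
    pruned.weight.data = weight[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned
```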
After the neural network has been partitioned and the layer producing the intermediate result has been pruned, distributed deep learning computation can proceed. The specific steps are shown in FIG. 3. The task-initiating device first computes the first half of the network; the features extracted by this first half are used, on the one hand, to query the cache and, on the other hand, are transmitted to the other device for the remaining computation. What is stored in the cache is the features of the input data rather than the whole input; because the features extracted by the middle layer of the network are much smaller than the original input, cache lookup and comparison are efficient. The intermediate result is compared against the cache using cosine distance; when the similarity exceeds a threshold, the value corresponding to that key is returned directly as the result. When no sufficiently similar key is found, the features are transmitted to the other device, which first queries its own edge cache with them; on a hit, the corresponding result is returned to the edge device, otherwise the other device performs the remaining computation and returns the result. On a multi-core processor, two threads can run the cache query and the transmission of the intermediate features in parallel. Both devices may update their caches at this point. Cache replacement uses the least-recently-used (LRU) policy: the entry that has gone longest without being queried is evicted and replaced with the newly computed content. As for the cache layout, each device can reserve part of its own memory as a cache, or one device can be designated as a dedicated cache server from which all devices fetch cached entries.
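A minimal sketch of such a feature cache, with cosine-similarity lookup and least-recently-used replacement, is given below; the similarity threshold and capacity are illustrative assumptions.

```python
from collections import OrderedDict
import numpy as np

class FeatureCache:
    """LRU cache mapping intermediate-feature vectors to final result labels."""

    def __init__(self, capacity=128, similarity_threshold=0.95):
        self.capacity = capacity
        self.threshold = similarity_threshold
        self.entries = OrderedDict()   # key: feature bytes, value: (feature, label)

    def lookup(self, feature):
        """Return the cached label whose key is most similar to `feature`, or None."""
        feature = feature / (np.linalg.norm(feature) + 1e-12)
        best_key, best_sim = None, -1.0
        for key, (cached_feature, _) in self.entries.items():
            sim = float(np.dot(feature, cached_feature))  # cosine similarity (both normalized)
            if sim > best_sim:
                best_key, best_sim = key, sim
        if best_key is not None and best_sim >= self.threshold:
            self.entries.move_to_end(best_key)            # mark as most recently used
            return self.entries[best_key][1]
        return None

    def insert(self, feature, label):
        feature = feature / (np.linalg.norm(feature) + 1e-12)
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)              # evict the least recently used entry
        self.entries[feature.tobytes()] = (feature, label)
```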
The task-initiating device remotely invokes the other device to perform the remaining neural network computation using the dynamic proxy pattern. Specifically, the task-initiating device calls a proxy class, which serializes the function name and the parameter list (the function name may be the name of the intelligent task; the parameter list contains the intermediate result to be transmitted, the layer at which the network is split, and other information) and sends them to the other device over a socket; the other device completes the function call through a reflection mechanism. In this way intelligent tasks are decoupled from remote invocation, and a remote invocation module does not have to be implemented separately for each task.
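A sketch of this proxy-style remote call is shown below, using Python's pickle and a plain socket as stand-ins for whatever serialization and transport the actual implementation uses; the message framing and the task_registry dispatch table are illustrative assumptions.

```python
import pickle
import socket

def remote_call(host, port, task_name, intermediate, split_layer):
    """Proxy side: serialize the task name and arguments, send them, return the result."""
    payload = pickle.dumps({"task": task_name,
                            "intermediate": intermediate,
                            "split_layer": split_layer})
    with socket.create_connection((host, port)) as conn:
        conn.sendall(len(payload).to_bytes(8, "big") + payload)
        size = int.from_bytes(conn.recv(8), "big")
        data = b""
        while len(data) < size:
            data += conn.recv(size - len(data))
    return pickle.loads(data)

def handle_request(conn, task_registry):
    """Cooperative-device side: look up the task by name (reflection-style) and run it."""
    size = int.from_bytes(conn.recv(8), "big")
    data = b""
    while len(data) < size:
        data += conn.recv(size - len(data))
    request = pickle.loads(data)
    fn = task_registry[request["task"]]              # dispatch by task name
    result = fn(request["intermediate"], request["split_layer"])
    reply = pickle.dumps(result)
    conn.sendall(len(reply).to_bytes(8, "big") + reply)
```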

Claims (7)

1. A deep learning inference method combining cooperative computation on dispersed devices with caching, the method using dispersed computing devices in an edge scenario to perform deep learning computation cooperatively, wherein a task-initiating device performs part of the computation and caches that partial result together with the final label; if the similarity between an intermediate result and the content of the cache reaches a given degree, the cached label is used as the result, otherwise the intermediate result is transmitted to another dispersed device to continue the computation, obtain the final result, and return it to the task-initiating device; the method comprises a dispersed device selection step and a cache-based distributed deep learning computation step, and is characterized in that:
the dispersed device selection step comprises:
1) every device registers its device information with a registry, the device information comprising an IP address, port number, network transmission speed, computing performance, memory, and load;
2) the most suitable device is selected, according to a weighted combination of each device's performance, to join the cooperative computation;
the cache-based distributed deep learning computation step comprises:
1) the network is partitioned according to the neural network model and the performance of the computing devices, and structured pruning is performed on the layer that produces the intermediate result;
2) the task-initiating device performs the first-half computation and queries the cache for a value similar to the intermediate result; if such a value exists, the cached result is used;
3) when no similar value exists in the cache, the intermediate result and the function name are sent to the other device through a dynamically generated proxy class, the other device is remotely invoked to perform the remaining computation, and the result obtained is transmitted back to the task-initiating device.
2. The method according to claim 1, wherein there is a registry, all devices register their own device information with the registry, and every device can obtain the information of the other devices from the registry.
3. The method of claim 1, wherein, after the network is partitioned, a structured pruning operation is performed on the network layer from which the intermediate result is obtained.
4. The method of claim 1, wherein the edge devices are geographically dispersed and heterogeneous.
5. The method of claim 1, wherein the cached content is a key-value pair consisting of the output of a middle layer of the neural network and the final result label.
6. The method of claim 1, wherein the computation that produces the cache key is shared with the neural network computation performed by the task-initiating device.
7. The method of claim 1, wherein distributed remote invocation is implemented using the dynamic proxy pattern, and the dynamically generated proxy class is responsible for all remote invocation details.
CN202010289197.0A 2020-04-14 2020-04-14 Distributed equipment collaborative deep learning reasoning method Expired - Fee Related CN111522657B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010289197.0A CN111522657B (en) 2020-04-14 2020-04-14 Distributed equipment collaborative deep learning reasoning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010289197.0A CN111522657B (en) 2020-04-14 2020-04-14 Distributed equipment collaborative deep learning reasoning method

Publications (2)

Publication Number Publication Date
CN111522657A (en) 2020-08-11
CN111522657B CN111522657B (en) 2022-07-22

Family

ID=71902665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010289197.0A Expired - Fee Related CN111522657B (en) 2020-04-14 2020-04-14 Distributed equipment collaborative deep learning reasoning method

Country Status (1)

Country Link
CN (1) CN111522657B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180336467A1 (en) * 2017-07-31 2018-11-22 Seematics Systems Ltd System and method for enriching datasets while learning
CN109997154A (en) * 2017-10-30 2019-07-09 上海寒武纪信息科技有限公司 Information processing method and terminal device
CN108268638A (en) * 2018-01-18 2018-07-10 浙江工业大学 A kind of generation confrontation network distribution type implementation method based on Spark frames
CN108846142A (en) * 2018-07-12 2018-11-20 南方电网调峰调频发电有限公司 A kind of Text Clustering Method, device, equipment and readable storage medium storing program for executing
WO2020042658A1 (en) * 2018-08-31 2020-03-05 华为技术有限公司 Data processing method, device, apparatus, and system
US20200082259A1 (en) * 2018-09-10 2020-03-12 International Business Machines Corporation System for Measuring Information Leakage of Deep Learning Models
CN110309914A (en) * 2019-07-03 2019-10-08 中山大学 Deep learning model reasoning accelerated method based on Edge Server Yu mobile terminal equipment collaboration
CN110795235A (en) * 2019-09-25 2020-02-14 北京邮电大学 Method and system for deep learning and cooperation of mobile web

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wu Linyang et al.: "A Deep Learning Compilation Framework with Co-optimization of Operations and Data", High Technology Letters *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112615736A (en) * 2020-12-10 2021-04-06 南京工业大学 Delay optimal distributed NNs collaborative optimization method facing linear edge network
CN112615736B (en) * 2020-12-10 2022-03-18 南京工业大学 Delay optimal distributed NNs collaborative optimization method facing linear edge network
WO2022138232A1 (en) * 2020-12-23 2022-06-30 ソニーグループ株式会社 Communication device, communication method, and communication system
CN112818788A (en) * 2021-01-25 2021-05-18 电子科技大学 Distributed convolutional neural network hierarchical matching method based on unmanned aerial vehicle cluster
CN112862083A (en) * 2021-04-06 2021-05-28 南京大学 Deep neural network inference method and device under edge environment
CN112862083B (en) * 2021-04-06 2024-04-09 南京大学 Deep neural network inference method and device in edge environment
CN114401063A (en) * 2022-01-10 2022-04-26 中国人民解放军国防科技大学 Edge equipment cooperative spectrum intelligent monitoring method and system based on lightweight model
CN114401063B (en) * 2022-01-10 2023-10-31 中国人民解放军国防科技大学 Edge equipment cooperative spectrum intelligent monitoring method and system based on lightweight model
WO2023197687A1 (en) * 2022-04-13 2023-10-19 西安广和通无线通信有限公司 Collaborative data processing method, system and apparatus, device, and storage medium
CN114881227A (en) * 2022-05-13 2022-08-09 北京百度网讯科技有限公司 Model compression method, image processing method, device and electronic equipment

Also Published As

Publication number Publication date
CN111522657B (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN111522657B (en) Distributed equipment collaborative deep learning reasoning method
CN110941667B (en) Method and system for calculating and unloading in mobile edge calculation network
CN110515732B (en) Task allocation method based on deep learning inference of resource-constrained robot
US20220351019A1 (en) Adaptive Search Method and Apparatus for Neural Network
CN108418718B (en) Data processing delay optimization method and system based on edge calculation
CN110069341B (en) Method for scheduling tasks with dependency relationship configured according to needs by combining functions in edge computing
CN110753107B (en) Resource scheduling system, method and storage medium under space-based cloud computing architecture
CN111553213B (en) Real-time distributed identity-aware pedestrian attribute identification method in mobile edge cloud
CN112540845B (en) Collaboration system and method based on mobile edge calculation
CN111049903A (en) Edge network load distribution algorithm based on application perception prediction
Miao et al. Adaptive DNN partition in edge computing environments
Xu et al. A meta reinforcement learning-based virtual machine placement algorithm in mobile edge computing
Cui Research and application of edge computing based on deep learning
CN116760722A (en) Storage auxiliary MEC task unloading system and resource scheduling method
Lu et al. Dynamic offloading on a hybrid edge–cloud architecture for multiobject tracking
US20230305894A1 (en) Controlling operation of edge computing nodes based on knowledge sharing among groups of the edge computing nodes
CN109617960A (en) A kind of web AR data presentation method based on attributed separation
Sung et al. Use of edge resources for DNN model maintenance in 5G IoT networks
Zhang et al. Vulcan: Automatic Query Planning for Live {ML} Analytics
Huang et al. Field of view aware proactive caching for mobile augmented reality applications
Nashaat et al. DRL-Based Distributed Task Offloading Framework in Edge-Cloud Environment
Jia et al. FedLPS: Heterogeneous Federated Learning for Multiple Tasks with Local Parameter Sharing
CN116128046B (en) Storage method of multi-input neural network model serial block of embedded equipment
CN114860345B (en) Calculation unloading method based on cache assistance in smart home scene
CN118409873B (en) Model memory occupation optimization method, equipment, medium, product and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220722