CN111522657A - Distributed equipment collaborative deep learning reasoning method - Google Patents
- Publication number: CN111522657A (application CN202010289197.0A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F9/5072 — Grid computing (under G06F9/50, Allocation of resources, e.g. of the CPU; G06F9/5061, Partitioning or combining of resources)
- G06N3/045 — Combinations of networks (under G06N3/04, Neural network architecture)
- G06N3/063 — Physical realisation of neural networks using electronic means
- G06N3/08 — Learning methods
Abstract
The invention discloses a method for deploying a cache-based deep neural network across distributed edge devices. The method partitions the neural network, applies pruning to the layer immediately before the partition point, computes the first part of the network on the task-initiating device, transmits the small intermediate result to another edge device, and computes the remainder there. In addition to this computation, the edge devices cache and reuse intermediate results of the neural network and share the cache among different devices. This reduces the latency of edge intelligent applications and lowers the performance the network demands of edge devices; in particular, it reduces the amount of repeated computation when the edge side issues intelligent task requests on similar data, lowers the performance requirements deep learning places on devices, and makes full use of the computing resources of edge scenarios.
Description
Technical Field
The invention relates to the fields of artificial intelligence and edge computing within computer science, and in particular to a deep learning inference method that combines cooperative computation across dispersed devices with caching.
Background
Edge computing is a new computing paradigm that aims to use the computing and communication resources of edge devices to meet user requirements for real-time response, privacy, security, and computational autonomy. Driven by rapid advances in algorithms, computing power, and big data, deep learning, the most active area of artificial intelligence, has made great progress in many fields. With the development of the Internet of Things and cyber-physical systems (CPS), emerging applications such as autonomous driving, intelligent UAV formation, and intelligent robot swarms are driving the convergence of edge computing and artificial intelligence, giving rise to and rapidly advancing edge intelligence technology. How to deploy and run deep learning functions on resource-constrained edge devices poses a major technical challenge for edge intelligence.
Deep learning computation on edge devices generally requires acceleration beyond conventional general-purpose algorithms. Common acceleration methods include:
(1) Model early exit: DNN models with high accuracy typically have deep structures, and executing them on terminal devices consumes significant resources. To speed up inference, the model-early-exit method uses the output of early layers to obtain classification results, meaning that inference is completed using only part of the DNN model. Reducing latency is the optimization goal of model early exit.
(2) Input filtering: input filtering is an effective way to accelerate DNN model inference, especially for video analytics. Its key idea is to discard input frames that contain no target objects, avoiding redundant DNN computation, thereby preserving inference accuracy while shortening inference latency and reducing energy consumption.
(3) Model selection: model selection is proposed to optimize DNN inference for latency, accuracy, and energy consumption. The main idea is to first train, offline, a set of DNN models of different sizes for the same task, and then adaptively select among them online at inference time. Model selection resembles model early exit, in that each exit point of the early-exit mechanism can be viewed as a separate DNN model. The key difference is that exit points share part of the DNN layers with the main-branch model, whereas the models in a model-selection mechanism are independent.
Each of these methods can accelerate deep learning models in some respect, but they neither exploit the interconnection between edge devices nor account for the fact that, over a period of time, edge devices are likely to run intelligent tasks whose input data are similar or even identical. The invention therefore proposes a deep learning inference method combining edge cooperative computing and caching, tailored to these characteristics of edge scenarios.
Disclosure of Invention
The invention integrates caching, neural networks, and edge computing. Geographically dispersed computing devices in an edge scenario cooperatively perform deep learning computation: the task-initiating device performs part of the computation and caches the result together with the final label. If the similarity between an intermediate result and cached content reaches a threshold, the cached label is used as the result; otherwise the intermediate result is transmitted to other dispersed devices, which complete the computation and return the final result to the task-initiating device. Intelligent tasks can thus be computed within the edge scenario without exchanging data with a cloud data center. The method comprises a dispersed-device selection step and a cache-based distributed neural network computation step, as follows:
For dispersed-device selection, each device registers its own information, including IP address and port number, in a registry. When a task is initiated, the registry collects information such as network transmission speed to the other devices, memory, load, computing performance, and remaining battery. Each piece of performance information is then weighted and summed according to the specific situation, and the device with the highest weighted score is selected as the partner for distributed neural network computation.
For cache-based collaborative inference in the edge scenario, the neural network is partitioned according to the information of the two devices and of the neural network model. Partitioning exploits the layered structure of neural networks: different layers have different computational costs depending on layer type and position. Convolutional and pooling layers have much lower latency than fully connected layers, especially on GPUs; convolutional and pooling layers usually sit at the front of the network, and fully connected layers at the back. The data size generally shrinks layer by layer, and the outputs of later layers are usually markedly smaller than the original input, so transmitting or caching a later intermediate result takes far less time than transmitting or caching the raw data. After partitioning, structured pruning can be applied to the layer immediately before the partition point, further reducing the size of the transmitted intermediate result; because only the single layer at the partition is pruned, the model's complexity is essentially preserved and its accuracy essentially unaffected. After partitioning, the task-initiating device performs the first part of the computation and uses the resulting features as a key to query the cache. If the similarity is high enough, the corresponding cached value is taken as the final result; otherwise the intermediate result is transmitted to the other device, which performs the remaining computation and returns the final result to the task-initiating device.
The content of the cache can be adjusted over time to fit the available memory; even a cache holding only the most recent entries effectively avoids the extra time and computation of repeatedly initiated computing tasks.
The remaining neural network computation on the other device is invoked remotely through a dynamic proxy, so that different tasks can be invoked remotely using the same remote-invocation module together with dynamically generated proxy classes.
Compared with the prior art, the innovations of the invention are as follows. Exploiting the similarity of intelligent-task data in edge scenarios and the layered structure of neural networks, the neural network is run, in combination with a distributed cache, on geographically dispersed edge devices; this makes effective use of edge computing resources and allows resource-hungry deep learning tasks to be deployed efficiently on edge devices. After the network is partitioned, structured pruning of the layer producing the intermediate result substantially reduces the size of the transmitted intermediate result while leaving the model's complexity and accuracy essentially unaffected. Remote invocation of other devices for the remaining neural network computation uses a dynamic proxy, so multiple computation tasks can share one remote-invocation framework without concern for the implementation details of remote calls. Finally, the cache key is computed by the same first half of the neural network that serves the intelligent task, so no extra computing resources or time are needed.
Drawings
FIG. 1 Schematic diagram of dispersed-device selection
FIG. 2 Schematic diagram of neural network model partitioning
FIG. 3 is a schematic diagram of edge scene collaborative inference based on caching
Detailed Description
The invention mainly comprises two steps: a dispersed-device selection step and a cache-based distributed neural network computation step.
Decentralized device selection
Edge-scenario devices are generally heterogeneous, i.e., they differ in computing, network, and storage performance, so a cooperative device must be chosen sensibly before a cooperative intelligent computing task is initiated. Because edge scenarios are generally centerless, every device can act as either a task initiator or a task collaborator. As shown in fig. 1, to let other devices know of its existence, each device registers its IP address, port number, and other performance information in the registry and establishes a stable connection while ensuring network reliability. The task-initiating device obtains information about the other devices from the registry. It first filters out, from all connected devices, those with insufficient memory, insufficient battery, or excessive load; the remaining devices are ranked by a weighted combination of computing and network performance, and the top-ranked device is selected as the cooperative device. If the two devices together still cannot meet the computational requirements, the second device selects the next cooperative device, in the same way, from the devices with which it has established stable connections, and so on. Device information must be kept up to date: the registry must be notified promptly when a device's load, memory occupancy, or other conditions change.
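The filter-then-rank selection described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: all field names, thresholds, and weights are assumptions chosen for the example.

```python
# Hypothetical cooperative-device selection: drop devices with too little
# memory or battery, or too much load, then rank the rest by a weighted sum
# of compute and network performance and pick the highest score.
def select_device(devices, min_mem=512, min_battery=20, max_load=0.8,
                  w_compute=0.6, w_network=0.4):
    candidates = [d for d in devices
                  if d["mem_mb"] >= min_mem
                  and d["battery_pct"] >= min_battery
                  and d["load"] <= max_load]
    if not candidates:
        return None
    # Higher weighted score = better cooperative device.
    return max(candidates,
               key=lambda d: w_compute * d["gflops"] + w_network * d["mbps"])

registry = [
    {"id": "dev-a", "mem_mb": 1024, "battery_pct": 80, "load": 0.3,
     "gflops": 10.0, "mbps": 50.0},
    {"id": "dev-b", "mem_mb": 2048, "battery_pct": 60, "load": 0.5,
     "gflops": 30.0, "mbps": 40.0},
    {"id": "dev-c", "mem_mb": 256, "battery_pct": 90, "load": 0.1,
     "gflops": 30.0, "mbps": 90.0},  # filtered out: too little memory
]
best = select_device(registry)
print(best["id"])  # dev-b: 0.6*30 + 0.4*40 = 34 beats dev-a's 0.6*10 + 0.4*50 = 26
```

If the first pair of devices is still insufficient, the same function would be called again on the second device's own connection list, mirroring the chained selection described above.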
To guard against device crashes, network disconnections, and similar failures, each device sends heartbeat messages to the registry at regular intervals; if no heartbeat is received for a sustained period, the device is considered failed and is removed from the registry.
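The heartbeat bookkeeping on the registry side can be sketched in a few lines. The class and timeout value here are illustrative assumptions; a real registry would receive heartbeats over the network.

```python
import time

# Hypothetical registry heartbeat tracking: each device reports periodically;
# devices whose last heartbeat is older than the timeout are treated as
# failed and evicted.
class Registry:
    def __init__(self, timeout=3.0):
        self.timeout = timeout
        self.last_seen = {}  # device id -> timestamp of last heartbeat

    def heartbeat(self, device_id, now=None):
        self.last_seen[device_id] = time.time() if now is None else now

    def evict_stale(self, now=None):
        now = time.time() if now is None else now
        stale = [d for d, t in self.last_seen.items() if now - t > self.timeout]
        for d in stale:
            del self.last_seen[d]   # device considered failed: remove it
        return stale

reg = Registry(timeout=3.0)
reg.heartbeat("dev-a", now=0.0)
reg.heartbeat("dev-b", now=2.0)
print(reg.evict_stale(now=4.0))  # ['dev-a'] — no heartbeat for more than 3 s
```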
Cache-based edge scene collaborative reasoning
Because of the layered structure of neural networks, different layers have different computational costs depending on layer type and position: convolutional and pooling layers have much lower latency than fully connected layers, especially on GPUs; convolutional and pooling layers usually sit at the front of the network and fully connected layers at the back; and the data size generally shrinks layer by layer, with later outputs markedly smaller than the original input, so transmitting or caching a later intermediate result takes far less time than transmitting or caching the raw data. The network can therefore be divided into parts that run on different devices in a distributed fashion. Distributed neural network inference first requires a sensible partition of the network. Two kinds of factors determine the optimal partition point: static factors, such as the structure of the model, and dynamic factors, such as network transmission speed, the load of the edge server, and the device's remaining battery. For the model structure, the per-layer running time of the inference application can be measured on the edge device and the edge server before deployment; this avoids having to model framework or hardware performance and is more accurate. Dynamic factors must be collected in real time; network transmission speed, for example, can be measured with the iperf tool. As shown in fig. 2, the network model is loaded, the type and parameters of each layer in the DNN model are analyzed, and a prediction model estimates each layer's latency on each mobile device. The partition point is then chosen from the predicted latencies combined with the edge device's current battery level and load; when the device's battery is low, the partition point is moved earlier to save energy.
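The partition-point search described above can be sketched as a simple exhaustive scan: given measured or predicted per-layer latencies on both devices, each layer's output size, and the link bandwidth, pick the cut minimizing total latency. All numbers below are made up for illustration; the function name and signature are assumptions, not the patent's API.

```python
# Hypothetical split-point search. dev_ms / srv_ms: per-layer latency (ms) on
# the initiating device and on the cooperative device; out_kb[i]: output size
# of layer i; input_kb: raw input size; bandwidth_kbps: measured link speed.
def best_split(dev_ms, srv_ms, out_kb, input_kb, bandwidth_kbps):
    n = len(dev_ms)
    best_k, best_cost = None, float("inf")
    for k in range(n + 1):  # run layers [0, k) locally, the rest remotely
        # k == 0 would ship the raw input; k == n ships nothing.
        sent_kb = input_kb if k == 0 else (0.0 if k == n else out_kb[k - 1])
        cost = (sum(dev_ms[:k]) + sum(srv_ms[k:])
                + sent_kb / bandwidth_kbps * 1000.0)  # kb / (kb/s) -> ms
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

split, cost = best_split(dev_ms=[5, 5, 20, 40], srv_ms=[2, 2, 5, 10],
                         out_kb=[600, 100, 20, 1], input_kb=800,
                         bandwidth_kbps=1000)
print(split, cost)  # 3 60.0 — cut after the third layer
```

The battery-aware adjustment mentioned above could be expressed by weighting `dev_ms` upward when the battery is low, which naturally pushes the chosen cut earlier.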
After the neural network is partitioned, structured pruning can be applied to the layer whose intermediate result will be transmitted. Specifically: take the trained neural network model; rank the convolution kernels of that layer by the sum of the absolute values of their parameters; keep the kernels with the largest sums, in a given proportion, and prune (i.e., discard) the rest; finally, fine-tune the network.
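The L1-norm ranking step of this structured pruning can be sketched as follows. For simplicity each kernel is represented as a flat list of weights; in a real framework this would operate on a 4-D weight tensor, and fine-tuning would follow.

```python
# Hypothetical sketch of L1-norm structured pruning: score each convolution
# kernel by the sum of absolute weights, keep the top fraction, drop the rest.
def prune_filters(filters, keep_ratio=0.5):
    """filters: list of flat weight lists, one per convolution kernel."""
    l1 = [sum(abs(w) for w in f) for f in filters]       # one score per kernel
    n_keep = max(1, int(len(filters) * keep_ratio))
    # Indices of the kernels with the largest L1 norms, in original order.
    keep = sorted(sorted(range(len(filters)), key=lambda i: -l1[i])[:n_keep])
    return [filters[i] for i in keep], keep

filters = [[0.1, -0.1], [2.0, 1.0], [0.5, 0.5], [-3.0, 0.2]]
kept_filters, kept_idx = prune_filters(filters, keep_ratio=0.5)
print(kept_idx)  # [1, 3] — the two kernels with the largest L1 norms
```

Because pruning removes whole kernels, the layer's output (the transmitted intermediate result) shrinks proportionally, which is exactly the transfer saving the text describes.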
Once the network is partitioned and the layer producing the intermediate result is pruned, distributed deep learning computation can proceed. The steps are shown in fig. 3. First, the task-initiating device computes the first half of the network. The features extracted by this first half are used, on one hand, to query the cache and, on the other hand, transmitted to the other device for the remaining computation. The cache stores the features of the input data rather than the whole input; because features extracted by a middle layer of the network are far smaller than the original input, cache lookup and comparison are efficient. The intermediate result is compared against the cache using cosine distance as the criterion; when the similarity exceeds a set threshold, the value corresponding to the matched key is returned directly as the result. When no sufficiently similar key is found in the cache, the features are transmitted to the other device, which first looks them up in its own edge cache; on a hit, the corresponding result is returned to the initiating device, otherwise the other device performs the remaining computation and returns the result. On a multi-core processor, two threads can run the cache-query task and the feature-transmission task in parallel. Both devices may then update their caches. Cache replacement uses the least-recently-used (LRU) policy: each time, the entry that has gone longest without being queried is evicted and replaced with the newly computed content.
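The cosine-similarity cache with LRU replacement described above can be sketched as follows. The class name, threshold, and capacity are illustrative assumptions; a production cache would index features for sub-linear lookup rather than scanning.

```python
import math
from collections import OrderedDict

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical feature cache: keys are feature vectors from the first half of
# the network, values are final result labels; a cosine-similarity threshold
# decides hits and LRU bounds the size.
class SemanticLRUCache:
    def __init__(self, capacity=100, threshold=0.9):
        self.capacity, self.threshold = capacity, threshold
        self.entries = OrderedDict()  # feature tuple -> label

    def lookup(self, features):
        for key, label in self.entries.items():
            if cosine(features, key) >= self.threshold:
                self.entries.move_to_end(key)  # mark as recently used
                return label
        return None  # cache miss

    def insert(self, features, label):
        self.entries[tuple(features)] = label
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

cache = SemanticLRUCache(threshold=0.95)
cache.insert([1.0, 0.0, 1.0], "cat")
print(cache.lookup([0.99, 0.01, 1.0]))  # cat — nearly identical features hit
print(cache.lookup([0.0, 1.0, 0.0]))    # None — dissimilar features miss
```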
For the cache layout, each device can set aside part of its own memory as a cache, or one device can be designated as a dedicated cache server from which all devices fetch cached entries.
The task-initiating device remotely invokes the other device through a dynamic proxy. Concretely, the task-initiating device calls a proxy class, which serializes the function name and the parameter list (the function name can be the name of the intelligent task; the parameter list contains the intermediate result to be transmitted, the layer index at which the network is divided, and similar information) and sends them to the other device over a socket; the other device completes the function call through a reflection mechanism. In this way, intelligent tasks are decoupled from remote invocation, and a remote-invocation module need not be implemented anew for each task.
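The serialize-then-reflect pattern can be sketched as below. The service class, method name, and payload shape are illustrative assumptions; here the two sides share memory for brevity, whereas a real system would carry `payload` over the socket the text mentions.

```python
import pickle

# Hypothetical service on the cooperative device.
class InferenceService:
    def finish_inference(self, features, split_layer):
        # Placeholder for "run the remaining layers after split_layer".
        return {"label": "cat", "layers_run": len(features) - split_layer}

# Proxy side: serialize function name + arguments (what the socket would carry).
def proxy_call(func_name, *args):
    return pickle.dumps((func_name, args))

# Server side: deserialize and dispatch via reflection (getattr), so one
# dispatcher serves any task name without task-specific wiring.
def dispatch(service, payload):
    func_name, args = pickle.loads(payload)
    return getattr(service, func_name)(*args)

payload = proxy_call("finish_inference", [0.1, 0.2, 0.3, 0.4], 2)
result = dispatch(InferenceService(), payload)
print(result)  # {'label': 'cat', 'layers_run': 2}
```

The reflection-based dispatch is what lets one remote-invocation module serve every intelligent task: only the serialized function name changes, not the transport code.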
Claims (7)
1. A deep learning inference method combining cooperative computation across dispersed devices with caching, in which dispersed computing devices in an edge scenario cooperatively perform deep learning computation: the task-initiating device performs part of the computation and caches that partial result together with the final label; if the similarity between the intermediate result and cached content reaches a threshold, the cached label is used as the result; otherwise the intermediate result is transmitted to other dispersed devices, which complete the computation and return the final result to the task-initiating device; the method comprises a dispersed-device selection step and a cache-based distributed deep learning computation step, and is characterized in that:
the dispersed-device selection step comprises:
1) every device registers its device information (IP address, port number, network transmission speed, computing performance, memory, and load) with a registry;
2) the most suitable device, selected by weighting the performance of each device, joins the cooperative computation;
the cache-based distributed deep learning calculation step comprises the following steps:
1) the network is partitioned according to the neural network model and the performance of the computing devices, and structured pruning is applied to the layer producing the intermediate result;
2) the task-initiating device performs the first half of the computation and queries the cache for a value similar to the intermediate result; if one exists, the cached result is used;
3) when no sufficiently similar value exists in the cache, the intermediate result and the function name are sent to the other device via a dynamically generated proxy class; the other device is remotely invoked to perform the remaining computation, and the obtained result is returned to the task-initiating device.
2. The method according to claim 1, characterized in that there is a registry with which all devices register their own device information, and through which every device can obtain the information of the other devices.
3. The method of claim 1, wherein after partitioning the network layer, performing a structured pruning operation on the network layer where the intermediate result is obtained.
4. The method of claim 1, wherein the edge devices are geographically dispersed and heterogeneous.
5. The method of claim 1, wherein the cached content is a key-value pair mapping the output of a middle layer of the neural network to the final result label.
6. The method of claim 1, wherein the computation of the cache key is shared with the neural network computation performed by the task-initiating device.
7. The method of claim 1, wherein distributed remote invocation is implemented using a dynamic proxy pattern, and the dynamically generated proxy class is responsible for all remote-invocation details.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010289197.0A CN111522657B (en) | 2020-04-14 | 2020-04-14 | Distributed equipment collaborative deep learning reasoning method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111522657A true CN111522657A (en) | 2020-08-11 |
CN111522657B CN111522657B (en) | 2022-07-22 |
Family
ID=71902665
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010289197.0A Expired - Fee Related CN111522657B (en) | 2020-04-14 | 2020-04-14 | Distributed equipment collaborative deep learning reasoning method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111522657B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112615736A (en) * | 2020-12-10 | 2021-04-06 | 南京工业大学 | Delay optimal distributed NNs collaborative optimization method facing linear edge network |
CN112818788A (en) * | 2021-01-25 | 2021-05-18 | 电子科技大学 | Distributed convolutional neural network hierarchical matching method based on unmanned aerial vehicle cluster |
CN112862083A (en) * | 2021-04-06 | 2021-05-28 | 南京大学 | Deep neural network inference method and device under edge environment |
CN114401063A (en) * | 2022-01-10 | 2022-04-26 | 中国人民解放军国防科技大学 | Edge equipment cooperative spectrum intelligent monitoring method and system based on lightweight model |
WO2022138232A1 (en) * | 2020-12-23 | 2022-06-30 | ソニーグループ株式会社 | Communication device, communication method, and communication system |
CN114881227A (en) * | 2022-05-13 | 2022-08-09 | 北京百度网讯科技有限公司 | Model compression method, image processing method, device and electronic equipment |
WO2023197687A1 (en) * | 2022-04-13 | 2023-10-19 | 西安广和通无线通信有限公司 | Collaborative data processing method, system and apparatus, device, and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108268638A (en) * | 2018-01-18 | 2018-07-10 | 浙江工业大学 | A kind of generation confrontation network distribution type implementation method based on Spark frames |
CN108846142A (en) * | 2018-07-12 | 2018-11-20 | 南方电网调峰调频发电有限公司 | A kind of Text Clustering Method, device, equipment and readable storage medium storing program for executing |
US20180336467A1 (en) * | 2017-07-31 | 2018-11-22 | Seematics Systems Ltd | System and method for enriching datasets while learning |
CN109997154A (en) * | 2017-10-30 | 2019-07-09 | 上海寒武纪信息科技有限公司 | Information processing method and terminal device |
CN110309914A (en) * | 2019-07-03 | 2019-10-08 | 中山大学 | Deep learning model reasoning accelerated method based on Edge Server Yu mobile terminal equipment collaboration |
CN110795235A (en) * | 2019-09-25 | 2020-02-14 | 北京邮电大学 | Method and system for deep learning and cooperation of mobile web |
WO2020042658A1 (en) * | 2018-08-31 | 2020-03-05 | 华为技术有限公司 | Data processing method, device, apparatus, and system |
US20200082259A1 (en) * | 2018-09-10 | 2020-03-12 | International Business Machines Corporation | System for Measuring Information Leakage of Deep Learning Models |
Non-Patent Citations (1)
Title |
---|
WU Linyang et al.: "A deep learning compilation framework with collaborative optimization of computation and data", High Technology Letters *
Also Published As
Publication number | Publication date |
---|---|
CN111522657B (en) | 2022-07-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111522657B (en) | Distributed equipment collaborative deep learning reasoning method | |
CN110941667B (en) | Method and system for calculating and unloading in mobile edge calculation network | |
CN110515732B (en) | Task allocation method based on deep learning inference of resource-constrained robot | |
US20220351019A1 (en) | Adaptive Search Method and Apparatus for Neural Network | |
CN108418718B (en) | Data processing delay optimization method and system based on edge calculation | |
CN110069341B (en) | Method for scheduling tasks with dependency relationship configured according to needs by combining functions in edge computing | |
CN110753107B (en) | Resource scheduling system, method and storage medium under space-based cloud computing architecture | |
CN111553213B (en) | Real-time distributed identity-aware pedestrian attribute identification method in mobile edge cloud | |
CN112540845B (en) | Collaboration system and method based on mobile edge calculation | |
CN111049903A (en) | Edge network load distribution algorithm based on application perception prediction | |
Miao et al. | Adaptive DNN partition in edge computing environments | |
Xu et al. | A meta reinforcement learning-based virtual machine placement algorithm in mobile edge computing | |
Cui | Research and application of edge computing based on deep learning | |
CN116760722A (en) | Storage auxiliary MEC task unloading system and resource scheduling method | |
Lu et al. | Dynamic offloading on a hybrid edge–cloud architecture for multiobject tracking | |
US20230305894A1 (en) | Controlling operation of edge computing nodes based on knowledge sharing among groups of the edge computing nodes | |
CN109617960A (en) | A kind of web AR data presentation method based on attributed separation | |
Sung et al. | Use of edge resources for DNN model maintenance in 5G IoT networks | |
Zhang et al. | Vulcan: Automatic Query Planning for Live {ML} Analytics | |
Huang et al. | Field of view aware proactive caching for mobile augmented reality applications | |
Nashaat et al. | DRL-Based Distributed Task Offloading Framework in Edge-Cloud Environment | |
Jia et al. | FedLPS: Heterogeneous Federated Learning for Multiple Tasks with Local Parameter Sharing | |
CN116128046B (en) | Storage method of multi-input neural network model serial block of embedded equipment | |
CN114860345B (en) | Calculation unloading method based on cache assistance in smart home scene | |
CN118409873B (en) | Model memory occupation optimization method, equipment, medium, product and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220722 |