CN112287609A - End, edge and cloud collaborative computing device for robot task division - Google Patents


Info

Publication number
CN112287609A
CN112287609A (application CN202011576823.0A)
Authority
CN
China
Prior art keywords
task
robot
layer
edge
cloud
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011576823.0A
Other languages
Chinese (zh)
Other versions
CN112287609B (en)
Inventor
向甜
张北北
张鸿轩
朱世强
顾建军
李特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202011576823.0A priority Critical patent/CN112287609B/en
Publication of CN112287609A publication Critical patent/CN112287609A/en
Application granted granted Critical
Publication of CN112287609B publication Critical patent/CN112287609B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/02 CAD in a network environment, e.g. collaborative CAD or distributed simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses an end, edge and cloud collaborative computing device for robot task division, belonging to the technical field of robots. The device comprises a task modeling computation layer, a task offline segmentation layer, and a task online execution layer. The task modeling computation layer computes the computation delay and transmission delay of different robot tasks when their layers execute on different devices. The task offline segmentation layer performs horizontal and vertical segmentation of the target network model according to these delays and generates an optimal task execution strategy. The task online execution layer performs task allocation, distribution, and scheduling according to the strategy output by the task offline segmentation layer to complete the online operation of the task. The collaborative computing device protects the data privacy of the robot terminal, effectively improves the execution efficiency of a given robot task, significantly improves task division performance, and has good practicability.

Description

End, edge and cloud collaborative computing device for robot task division
Technical Field
The invention belongs to the technical field of robots, and particularly relates to an end, edge and cloud collaborative computing device for robot task division.
Background
With the rapid development of new-generation information technologies such as the Internet of Things, cloud computing, big data, edge computing, and artificial intelligence, robots have progressed from perception to cognition, reasoning, and decision making. Robotics is gradually becoming one of the main trends of future development across industries and has great application value in industrial manufacturing, life services, and beyond. Existing robot technology integrates cloud intelligence, enhancing the robot's capability with cloud computing and cloud storage, but it is limited by network bandwidth and latency; most robot systems therefore rely mainly on on-board computation and use the cloud only as an auxiliary for non-real-time, computation-heavy tasks. To accurately perceive and understand the environment for human-robot interaction, a robot system typically integrates a large number of sensors and thus generates a large amount of data. For example, a robot equipped with sensors such as a high-definition camera, a depth camera, a microphone array, and a lidar can generate 250 MB or more of data per second. Transmitting such a huge volume of data in its entirety to the cloud for processing is neither practical nor efficient, and it also raises data privacy and security concerns. In addition, AI algorithms used by robots, such as face detection, face recognition, and speech recognition, usually require strong computing power; although the computing power of on-board platforms keeps increasing, it still falls short of what these AI algorithms demand. Introducing edge computing can therefore address both the limited capability of the robot terminal and the real-time response problem of cloud computing.
The integrated end-edge-cloud computing architecture can adapt the distribution of computing, storage, and cooperation to the robot's different AI inference tasks, providing computing support on both the cloud side and the edge side, and thereby enabling more effective and more economical computing deployment in large-scale robot application scenarios.
At present, research on inference task division for AI models, both at home and abroad, focuses mostly on end-cloud, end-edge, or edge-cloud cooperation: an AI model is divided at layer granularity and deployed between a terminal device and a cloud server, between a terminal device and an edge, or between an edge device and a cloud server. Representative work includes the end-cloud collaborative AI inference described in the conference paper "Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge", the end-edge collaborative inference framework described in the paper "Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing", and the edge-cloud collaborative approach described in the conference paper "Dynamic Adaptive DNN Surgery for Inference Acceleration on the Edge".
Neurosurgeon combines the per-layer computation delay and power consumption of the AI model with the end-cloud transmission delay and power consumption, and finds the optimal task division point by traversing the candidate split points: the end device computes the layers before the division point, sends the output at the division point to the cloud, the cloud completes the subsequent computation, and the result is sent back to the device for execution.
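The split-point traversal just described can be sketched as follows. This is a minimal illustration of the idea under simplified assumptions (latency only, no power term), not Neurosurgeon's actual implementation; the per-layer latencies, output sizes, and bandwidth are hypothetical inputs.

```python
def best_split(dev_lat, cloud_lat, out_bytes, input_bytes, bw):
    """Brute-force search for the layer split that minimizes end-to-end delay.

    dev_lat[i]/cloud_lat[i]: latency of layer i on the device / in the cloud
    out_bytes[i]           : size of layer i's output in bytes
    input_bytes            : size of the raw input in bytes
    bw                     : device-to-cloud bandwidth in bytes per second
    Layers [0, s) run on the device, layers [s, n) run in the cloud.
    """
    n = len(dev_lat)
    best_s, best_total = 0, float("inf")
    for s in range(n + 1):
        if s == n:
            tx = 0.0                    # everything stays on the device
        elif s == 0:
            tx = input_bytes / bw       # ship the raw input to the cloud
        else:
            tx = out_bytes[s - 1] / bw  # ship the intermediate feature map
        total = sum(dev_lat[:s]) + tx + sum(cloud_lat[s:])
        if total < best_total:
            best_s, best_total = s, total
    return best_s, best_total
```

A real system would also weigh power consumption and would look up the per-layer latencies from prediction models rather than fixed lists.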
Edgent, proposed in Edge AI, is an AI collaborative inference framework for mobile-device and edge synergy. Edgent combines model partitioning with model scale adjustment, designs a joint optimization search to select the model partition point and exit point, and finally deploys the model to the device side and the edge side for execution.
DNN Surgery converts the AI model into a directed acyclic graph, introduces edge nodes and cloud nodes to construct a new graph with additional edges, finds the optimal model division point with a graph optimization algorithm, and deploys the divided model to the edge side and the cloud side for execution.
The above methods have the following problems:
(1) the task is only divided into two parts for execution, so the capability of three-way end-edge-cloud cooperative processing is not fully utilized and optimal execution efficiency cannot be guaranteed, whereas most robot tasks require real-time, fast response;
(2) only horizontal segmentation of the task model is performed, and the optimal split point found in this way cannot ensure that the task execution time meets the response-time requirement in a real scene, so a solution for further accelerating the inference task is lacking.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an end, edge and cloud collaborative computing device for robot task division.
The technical purpose of the invention is achieved by the following technical scheme: an end, edge and cloud collaborative computing device for robot task division comprises a task modeling computation layer, a task offline segmentation layer and a task online execution layer which are connected in sequence;
the task modeling computation layer monitors the current hardware resource conditions and network bandwidth conditions in real time, predicts the computation delay of each layer of the target network model on different devices from the model's per-layer configuration information combined with the hardware resources, and computes the transmission delay of each layer's output between different devices from the model's per-layer output data sizes combined with the network bandwidth;
the task offline segmentation layer horizontally segments the target network model under the robot task according to the computation delay and the transmission delay, dividing it into sub-network models such that the inference delay from data input to output is minimized; it then judges whether this inference delay meets the task requirement: if so, it outputs a horizontal-segmentation task execution strategy; if not, it vertically segments one or more of the running sub-network models and outputs both a horizontal-segmentation and a vertical-segmentation task execution strategy; the sub-network models are connected in series;
and the task online execution layer performs task allocation, distribution and scheduling according to the task execution strategy output by the task offline segmentation layer to complete the online operation of the robot task.
Further, the device involves a robot terminal device, an edge device, and a cloud server, and the hardware resources include CPU, GPU, FPGA, ASIC, and NPU; real-time monitoring of the hardware resource conditions includes monitoring the utilization, clock frequency, memory occupation, and resource consumption of the CPU, GPU, FPGA, ASIC, and NPU.
Further, the per-layer configuration information of the target network model includes: the number of filters per layer, the filter sizes, filter-related parameters, the input data size, and the number of neurons; the network bandwidth comprises the end-edge, end-cloud, and edge-cloud bandwidths; real-time monitoring of the network bandwidth includes monitoring uplink and downlink traffic and network latency.
Further, the sub-network models are respectively run on the robot terminal, the edge device and the cloud server.
Further, the sub-network models are respectively operated on the robot terminal and the edge device, or respectively operated on the robot terminal and the cloud server, or respectively operated on the edge device and the cloud server.
Further, the horizontal segmentation method is specifically as follows: the target network model under the robot task is modeled as a directed acyclic graph; each node on the graph is numbered and represents one layer of the model; each node's feature information represents that layer's computation delay on different devices; and the edge weights between nodes represent the transmission delay of each layer's output between different devices.
Compared with the prior art, the invention has the following beneficial effects: through offline horizontal segmentation of the target network model under a robot task, the end, edge and cloud collaborative computing device divides the model into sub-network models that run on the end, the edge and the cloud, or on any two of them, so that the inference delay from data input to output is minimized; according to the task requirement, sub-network models can further be vertically segmented into several sub-blocks that run in parallel on multiple devices, accelerating the inference of the robot task. The device can meet the real-time, fast-inference requirements of robot tasks while protecting data privacy to the maximum extent.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic structural diagram of the end, edge and cloud collaborative computing device of the present invention;
FIG. 2 is a flowchart illustrating the operation of the end, edge and cloud collaborative computing device of the present invention;
FIG. 3 is a schematic diagram of the horizontal slicing method of the present invention;
FIG. 4 is a schematic diagram of the horizontal segmentation method for segmenting the network model of the multi-branch structure according to the present invention;
FIG. 5 is a schematic diagram of the vertical slicing method of the present invention;
FIG. 6 is a schematic diagram of cooperative computing of the robot task on the end, edge and cloud.
Detailed Description
In order to enable those skilled in the art to better understand the solution of the invention, the embodiments of the invention are described in detail below in conjunction with the accompanying drawings and implementation modes.
In the prior art, a task can only be divided into two parts that run between end and cloud, end and edge, or edge and cloud, so operation efficiency is limited and real-time fast inference cannot be guaranteed. To solve these problems, the invention provides an end, edge and cloud collaborative computing device for robot task division, so that robot task division is no longer limited to two-party cooperation but can be extended to three-way end-edge-cloud cooperative operation, greatly improving task execution efficiency. The robot tasks of the invention include, but are not limited to, tasks involving artificial intelligence algorithms such as face detection, face recognition, human body detection, pedestrian tracking, localization and navigation, speech recognition, and knowledge question answering.
Referring to fig. 1, a schematic structural diagram of the end, edge and cloud collaborative computing device of the invention is given. The device includes a task modeling computation layer, a task offline segmentation layer and a task online execution layer, which are connected in sequence; each component of the device is explained below.
The task modeling computation layer models and predicts the computation delay and transmission delay of different robot tasks executed layer by layer on different devices, specifically as follows:
(1) The task modeling computation layer monitors the current hardware resource and network bandwidth conditions in real time and predicts the computation delay of each layer of the target network model on different devices from the model's per-layer configuration information combined with the hardware resources. The devices comprise robot terminal devices, edge devices and cloud servers; the hardware resources include CPU, GPU, FPGA, ASIC and NPU; real-time monitoring of the hardware resources covers their utilization, clock frequency, memory occupation and resource consumption. The per-layer configuration information of the target network model includes the number of filters per layer, the filter sizes, filter-related parameters, the input data size, and the number of neurons. The computation delay can be obtained either from a delay prediction model or by actual measurement.
(2) The transmission delay of each layer's output between different devices is computed from the per-layer output data sizes of the target network model combined with the network bandwidth. The network bandwidth comprises the end-edge, end-cloud and edge-cloud bandwidths; real-time monitoring of the network bandwidth includes monitoring uplink and downlink traffic and network latency. The transmission delay can also be obtained by actual measurement.
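As a concrete illustration, the two delay quantities above can be estimated roughly as follows. The FLOP count, effective throughput, and link parameters are hypothetical stand-ins for the prediction models and monitoring data that the layer actually maintains.

```python
def conv_flops(c_in, c_out, k, h_out, w_out):
    """Approximate multiply-accumulate count of one convolutional layer,
    derived from its per-layer configuration (filter count, filter size,
    output resolution)."""
    return c_in * c_out * k * k * h_out * w_out

def compute_delay(flops, device_gflops, utilization):
    """Crude compute-delay estimate: work divided by effective throughput."""
    return flops / (device_gflops * 1e9 * utilization)

def transmission_delay(output_bytes, bandwidth_bps, rtt_s=0.0):
    """Time to move one layer's output between two tiers over a monitored link."""
    return rtt_s + output_bytes * 8 / bandwidth_bps
```

For example, shipping a 2 MB feature map over a 100 Mbit/s end-edge link with a 10 ms round-trip time takes roughly 0.178 s; estimates like these feed the segmentation decisions below.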
The task offline segmentation layer horizontally segments the target network model under the robot task according to the computation delay and the transmission delay, dividing it into three or two sub-network models. In the three-part case, the sub-network models correspond to an end-side task, an edge-side task and a cloud-side task, running respectively on the robot terminal, the edge device and the cloud server, and are connected in series: the output of the end-side task is the input of the edge-side task, the output of the edge-side task is the input of the cloud-side task, and the output of the cloud-side task is returned to the robot terminal as the final result. In the two-part case, the sub-network models run on the robot terminal and the edge device, on the robot terminal and the cloud server, or on the edge device and the cloud server, again connected in series. The specific division minimizes the inference delay from data input to output after the target network model runs; the layer then judges whether this inference delay meets the task requirement, and if so, outputs a horizontal-segmentation task execution strategy.
The horizontal segmentation method is specifically as follows: the target network model under the robot task is modeled as a directed acyclic graph; each node is numbered and represents one layer of the model; each node's feature information represents that layer's computation delay on different devices; and the edge weights between nodes represent the transmission delay of each layer's output between different devices. If the inference delay does not meet the task requirement, for example if the total delay requirement is 1 s but the actual running times on the robot terminal, the edge device and the cloud server are 0.3 s, 1.5 s and 0.2 s respectively (i.e. the time from robot task input to target network model output is 2 s), one or more of the running sub-network models are vertically segmented into several blocks that can be processed in parallel, achieving inference acceleration, and both the horizontal-segmentation and the vertical-segmentation task execution strategies are output. The horizontal-segmentation strategy distributes the three or two sub-network models to the corresponding devices for execution; the vertical-segmentation strategy determines the number of robot terminals, edge devices or cloud servers from the number of blocks into which the layered model running on that device needs to be vertically segmented, and assigns each device the coordinates of its initial segmentation layer for model inference within the layer number range.
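For a single-chain model, the three-way horizontal split can be found with a small dynamic program over (layer, device) states, since the series connection forces the device assignment to move only forward from end (0) through edge (1) to cloud (2). This is an illustrative sketch under that chain assumption, not the patent's graph algorithm; all input values are hypothetical.

```python
import math

def three_way_split(lat, tx, input_tx):
    """lat[i][d]  : compute delay of layer i on device d (0=end, 1=edge, 2=cloud)
    tx[i][a][b]: delay to ship layer i's output from device a to device b
    input_tx[d]: delay to ship the raw input to device d
    Returns the minimum total delay and the per-layer device assignment
    (the delay of returning the final result to the robot is ignored here).
    """
    n = len(lat)
    best = [input_tx[d] + lat[0][d] for d in range(3)]
    path = [[d] for d in range(3)]
    for i in range(1, n):
        new_best, new_path = [math.inf] * 3, [None] * 3
        for d in range(3):
            for prev in range(d + 1):  # device index never moves backward
                hop = 0.0 if prev == d else tx[i - 1][prev][d]
                cost = best[prev] + hop + lat[i][d]
                if cost < new_best[d]:
                    new_best[d], new_path[d] = cost, path[prev] + [d]
        best, path = new_best, new_path
    d = min(range(3), key=lambda j: best[j])
    return best[d], path[d]
```

Two-part splits fall out of the same search whenever the optimal assignment happens to use only two of the three devices.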
And the task online execution layer performs task allocation, distribution and scheduling according to the task execution strategy output by the task offline segmentation layer to complete the online operation of the robot task.
Referring to fig. 2, the operation flow of the end, edge and cloud collaborative computing device of the invention specifically includes the following steps:
step 201, inputting a target network model corresponding to a robot task, wherein the target network model is an artificial intelligence related deep learning model;
step 202, predicting the computation delay of each layer of the model on different devices from the per-layer configuration information of the target network model and the hardware resource information of the current devices (end, edge, cloud); the computation delay can also be obtained by actual measurement;
step 203, deriving the transmission delay of each layer's output between different devices from the per-layer output data sizes of the target network model and the network bandwidth information; the transmission delay can also be obtained by actual measurement;
step 204, horizontally segmenting the target network model according to the per-layer computation delay and transmission delay, dividing it into two or three sub-network models: two parts run on the end device and the edge device, on the end device and the cloud server, or on the edge device and the cloud server; three parts run on the end device, the edge device and the cloud server respectively; the specific division is chosen so that the inference delay from data input to output is minimized;
step 205, judging whether the inference time delay meets the task requirement;
step 206, when the inference delay meets the task requirement, outputting the horizontal-segmentation strategy and deploying the model accordingly to complete the inference task at run time;
step 207, when the inference delay does not meet the task requirement, vertically segmenting the model layers running on the end device, the edge device or the cloud server according to the running time of each part, extending single-device operation to multi-device parallel processing and accelerating the whole inference process;
step 208, on the basis of step 207, outputting both the horizontal-segmentation and vertical-segmentation strategies and deploying the model accordingly at run time to complete the inference task.
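Steps 205 to 208 amount to a simple decision. A minimal sketch, with a hypothetical vertical_planner callback standing in for the vertical segmentation of step 207:

```python
def choose_strategy(predicted_delay_s, requirement_s, horizontal_plan, vertical_planner):
    """Return the deployment strategy: the horizontal split alone when the
    predicted inference delay meets the requirement (step 206), otherwise
    the horizontal split plus a vertical split (steps 207-208)."""
    if predicted_delay_s <= requirement_s:
        return {"horizontal": horizontal_plan}
    return {"horizontal": horizontal_plan,
            "vertical": vertical_planner(horizontal_plan)}
```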
The horizontal segmentation method comprises the following steps:
the level is divided into three parts of sub-network model, the schematic diagram is shown in fig. 3: the horizontal segmentation takes a target network model, calculation time delay and transmission time delay corresponding to a robot task as input, the target network model is horizontally segmented, and three parts of sub-network models can be obtained after segmentation; outputting a node number n of a layer division point position where the sub-network model operates on the robot terminal and the edge device, and a node number m of a layer division point position where the sub-network model operates on the edge device and the cloud server; a sub-network model that is before the node number n and includes the node number n itself is run on the robot terminal device, a sub-network model that is before the node number m and includes the node number m itself is run on the edge device after the node number n, and a sub-network model that is after the node number m is run on the cloud server. In fig. 3, the three-part sub-network model divides the target network model into 301, 302, and 303 in sequence from the input layer, where 301 corresponds to the sub-network model running on the end device, 302 corresponds to the sub-network model running on the edge device, and 303 corresponds to the sub-network model running on the cloud server; the horizontal splitting method outputs layer splitting points after model splitting, in fig. 3, corresponding to number 2 shown by 301 and number 13 shown by 302, a network layer containing number 2 itself before number 2 runs on an end device, a network layer containing number 13 itself after number 2, that is, from number 3 to number 13 runs on an edge device, and a network layer containing number 13 itself after node number 13, that is, from number 14 to the last layer of the network runs on a cloud server.
In addition, if the model is horizontally segmented into two parts running on the robot terminal and the edge device, the node number p of the layer split point between them is output: the sub-network model up to and including node p runs on the robot terminal device, and the sub-network model after node p runs on the edge device.
If the two parts run on the robot terminal and the cloud server, the node number q of the layer split point between them is output: the sub-network model up to and including node q runs on the robot terminal device, and the sub-network model after node q runs on the cloud server.
If the two parts run on the edge device and the cloud server, the node number k of the layer split point between them is output: the sub-network model up to and including node k runs on the edge device, and the sub-network model after node k runs on the cloud server.
A split point may comprise one or more node numbers: only one if the target network model is a single-chain network, and possibly several if it contains a multi-branch structure. The horizontal segmentation method can therefore also segment models with multi-branch structures, as shown in fig. 4. If such a model is segmented into three parts, a single layer split point may correspond to several numbers: in fig. 4, numbers 3, 4 and 5 together form the split point between the end device and the edge device; the network layers up to and including numbers 3, 4 and 5 run on the end device, the layers after them up to and including number 11 run on the edge device, and the remaining layers after number 11 run on the cloud server.
The vertical segmentation method is as follows:
determine, for the sub-network model running on the robot terminal, the edge device or the cloud server, the layer number range that needs vertical segmentation, the number of segmentation blocks, and the coordinate information of the initial segmentation layer corresponding to each block; the specific process is shown in fig. 5:
(1) for the convolutional layers, determine the number of blocks into which each sub-network model is to be segmented and the layer number range to be segmented; for example, in fig. 5 the sub-network model running on the edge device in fig. 3 is vertically segmented, with layer numbers ranging from 3 to 13 and 4 segmentation blocks;
(2) within the layer number range, the model layer corresponding to the last number is divided into non-overlapping blocks according to the number of segmentation blocks, and the coordinate information of each block is recorded; for example, in fig. 5 the model layer numbered 13 is divided into 4 blocks, and the upper-left and lower-right corner coordinates of each block are recorded;
(3) from the block coordinates of the model layer corresponding to the last number, the method back-derives, layer by layer, the block coordinates of the model layer corresponding to the first number; for example, in fig. 5 the upper-left and lower-right corner coordinates of each block of the layer numbered 3 are derived backwards from those of the layer numbered 13;
(4) through vertical segmentation, a sub-network model can thus be further divided into several sub-blocks that are deployed on multiple devices for parallel processing; finally, the processing results of the devices are gathered and spliced together as the output of the corresponding model layers of the subtask.
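Step (3), back-deriving the input region each block needs, follows the usual receptive-field arithmetic for convolutional layers. A one-dimensional sketch with hypothetical (kernel, stride, padding) values; a real implementation would do this per spatial axis and clamp the result to the layer bounds:

```python
def back_project(block, layers):
    """Map a block's column range on the last layer's output back to the
    column range it needs on the first layer's input.

    block  : (x0, x1) inclusive column range on the final output
    layers : (kernel, stride, padding) per conv layer, first to last;
             walking them in reverse widens the range layer by layer.
    """
    x0, x1 = block
    for k, s, p in reversed(layers):
        x0 = x0 * s - p
        x1 = x1 * s - p + k - 1
    return x0, x1
```

Negative or out-of-range coordinates simply indicate zero-padding at the borders.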
According to the task execution strategies output by the horizontal and vertical segmentation methods, the resources of the end, the edge and the cloud can be used effectively: the robot task is deployed across the end, the edge and the cloud in actual operation, and the overall inference delay is minimized. Referring to fig. 6, which shows a schematic diagram of collaborative computing of a robot task on the end, the edge and the cloud, the process may include:
(1) the robot task is divided into three parts according to the horizontal segmentation method, which run on the end device, the edge device and the cloud server respectively;
(2) the subtask running on the edge device is divided into 4 sub-blocks according to the vertical segmentation method; each sub-block is computed on one edge device, the 4 sub-blocks are processed in parallel, and the results are collected by a designated edge device and output to the cloud server;
(3) finally, the cloud server returns the inference result of the task to the robot terminal for subsequent processing.
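The parallel edge stage in step (2) can be sketched as below, assuming each edge device is reachable through a `run_subblock` callable; the placeholder computation and the splicing along the width axis are illustrative assumptions, not from the patent:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def run_subblock(device_id, region):
    """Stand-in for the sub-network computation that one edge device
    performs on its assigned input region."""
    return np.full((4, 7), device_id)  # placeholder feature block

def edge_parallel_stage(regions):
    """Process the sub-blocks on the edge devices in parallel, then let
    the designated device gather and splice the results for the cloud."""
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        parts = list(pool.map(run_subblock, range(len(regions)), regions))
    return np.concatenate(parts, axis=1)  # spliced output sent onward

merged = edge_parallel_stage([None] * 4)  # 4 sub-blocks, as in fig. 6
```

In a real deployment the thread pool would be replaced by remote calls to the edge devices, but the gather-and-splice structure is the same.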
By segmenting the target network model under the robot task and running it on the end, the edge and the cloud, the collaborative computing device makes effective use of the computing resources of all three tiers and optimizes task execution efficiency. Tests on classical networks used for image feature extraction, such as ResNet50, VGG16 and Inception V4, show that with the task execution strategy output by the task offline segmentation layer, the online inference delay of the task can be improved by a factor of 3.5 compared with the prior art, greatly shortening inference time and meeting the real-time requirements of the task.
The foregoing detailed description of the embodiments of the present invention is intended only to illustrate the devices and methods of the invention and to aid understanding; for a person skilled in the art, the specific embodiments and the scope of application may be varied according to the idea of the present invention, and, as stated above, the content of this specification should not be construed as limiting the invention.

Claims (6)

1. A robot-task-division-oriented end, edge and cloud collaborative computing device, characterized by comprising a task modeling calculation layer, a task offline segmentation layer and a task online execution layer which are connected in sequence;
the task modeling calculation layer monitors the current hardware resource and network bandwidth conditions in real time; it predicts the computation delay of each layer of the target network model on different devices according to the layered configuration information of the target network model under the robot task combined with the hardware resources, and calculates the transmission delay of each layer's output between different devices according to the layered data output size information combined with the network bandwidth;
the task offline segmentation layer horizontally segments the target network model under the robot task according to the computation delay and the transmission delay, dividing it into sub-network models so that the inference delay from data input to output of the running target network model is minimized; it judges whether the inference delay meets the task requirement: if so, it outputs the horizontally segmented task execution strategy; if not, it vertically segments one or more of the running sub-network models and outputs both the horizontal and the vertical segmentation task execution strategies; the sub-network models are connected in series;
and the task online execution layer performs task allocation, issuing and scheduling according to the task execution strategy output by the task offline segmentation layer to complete the online operation of the robot task.
2. The robot-task-division-oriented end, edge and cloud collaborative computing device according to claim 1, wherein the devices comprise a robot terminal device, an edge device and a cloud server, and the hardware resources comprise a CPU, GPU, FPGA, ASIC and NPU; the real-time monitoring of hardware resources comprises monitoring the utilization rate, clock frequency, memory occupation and resource consumption of the CPU, GPU, FPGA, ASIC and NPU.
3. The robot-task-division-oriented end, edge and cloud collaborative computing device according to claim 1, wherein the hierarchical configuration information of the target network model comprises: the number of filters per layer, the filter size, filter-related parameters, the input data size, and the number of neurons; the network bandwidth comprises the network bandwidth between the end and the edge, between the end and the cloud, and between the edge and the cloud; the real-time monitoring of the network bandwidth comprises monitoring of uplink and downlink traffic and network delay.
4. The robot-task-division-oriented end, edge and cloud collaborative computing device according to claim 1, wherein the sub-network models run on the robot terminal, an edge device and a cloud server respectively.
5. The robot-task-division-oriented end, edge and cloud collaborative computing device according to claim 1, wherein the sub-network models run respectively on the robot terminal and an edge device, or on the robot terminal and a cloud server, or on an edge device and a cloud server.
6. The robot-task-division-oriented end, edge and cloud collaborative computing device according to claim 1, wherein the horizontal segmentation method specifically comprises: modeling the target network model under the robot task as a directed acyclic graph and numbering each node; each node of the directed acyclic graph represents one layer of the target network model, the feature information of each node represents the computation delay of that layer on different devices, and the edge weights between nodes represent the transmission delay of each layer's output between different devices.
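For a chain-structured model, the graph formulation of claim 6 reduces to a shortest-path computation that can be sketched with dynamic programming as below; the three-device setup and all delay numbers are illustrative assumptions, not values from the patent:

```python
def min_delay_assignment(compute, transfer):
    """compute[l][d]: delay of layer l on device d (0=end, 1=edge, 2=cloud);
    transfer[l][a][b]: delay of moving layer l's output from device a to b.
    Returns (minimum total inference delay, device chosen per layer)."""
    n, m = len(compute), len(compute[0])
    INF = float("inf")
    best = [compute[0][:]] + [[INF] * m for _ in range(n - 1)]
    prev = [[0] * m for _ in range(n)]
    for l in range(1, n):
        for d in range(m):            # device for layer l
            for p in range(m):        # device for layer l-1
                cand = best[l-1][p] + transfer[l-1][p][d] + compute[l][d]
                if cand < best[l][d]:
                    best[l][d], prev[l][d] = cand, p
    # trace the cheapest assignment back from the last layer
    d = min(range(m), key=lambda x: best[n-1][x])
    total, path = best[n-1][d], [d]
    for l in range(n - 1, 0, -1):
        d = prev[l][d]
        path.append(d)
    return total, path[::-1]

# illustrative 3-layer model: cheap on end, middle on edge, last on cloud
compute = [[1, 5, 9], [4, 2, 9], [9, 9, 1]]
transfer = [[[0, 1, 3], [1, 0, 2], [3, 2, 0]]] * 2
total, devices = min_delay_assignment(compute, transfer)  # -> (7, [0, 1, 2])
```

Contiguous runs of the same device in the returned assignment correspond to the horizontally segmented sub-network models of claim 1, connected in series.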
CN202011576823.0A 2020-12-28 2020-12-28 End, edge and cloud collaborative computing device for robot task division Active CN112287609B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011576823.0A CN112287609B (en) 2020-12-28 2020-12-28 End, edge and cloud collaborative computing device for robot task division

Publications (2)

Publication Number Publication Date
CN112287609A true CN112287609A (en) 2021-01-29
CN112287609B CN112287609B (en) 2021-03-30

Family

ID=74426451


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113315669A (en) * 2021-07-28 2021-08-27 江苏电力信息技术有限公司 Cloud edge cooperation-based throughput optimization machine learning inference task deployment method
CN115114663A (en) * 2022-07-01 2022-09-27 中铁第四勘察设计院集团有限公司 Face recognition method based on cloud edge-end cooperation
CN115277452A (en) * 2022-07-01 2022-11-01 中铁第四勘察设计院集团有限公司 ResNet self-adaptive acceleration calculation method based on edge-end cooperation and application
CN116306943A (en) * 2023-03-16 2023-06-23 中国科学院软件研究所 AIoT-oriented multi-task local collaborative reasoning method and system
WO2023116259A1 (en) * 2021-12-23 2023-06-29 大唐移动通信设备有限公司 Model segmentation assisting method and apparatus, and readable storage medium
CN116662283A (en) * 2023-07-28 2023-08-29 北京孔皆数智科技有限公司 Data sharing and calculating method with end-edge cooperation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107087019A (en) * 2017-03-14 2017-08-22 西安电子科技大学 A kind of end cloud cooperated computing framework and task scheduling apparatus and method
US20190245806A1 (en) * 2018-02-07 2019-08-08 Cisco Technology, Inc. Optimizing fog orchestration through edge compute resource reservation
CN111242282A (en) * 2020-01-09 2020-06-05 中山大学 Deep learning model training acceleration method based on end edge cloud cooperation
CN111475274A (en) * 2020-04-20 2020-07-31 北京邮电大学 Cloud collaborative multi-task scheduling method and device


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BING LIN et al.: "Cost-Driven Off-Loading for DNN-Based Applications Over Cloud, Edge, and End Devices", IEEE Transactions on Industrial Informatics *
JIACHEN MAO et al.: "MoDNN: Local distributed mobile computing system for Deep Neural Network", Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 *
SURAT TEERAPITTAYANON et al.: "Distributed Deep Neural Networks Over the Cloud, the Edge and End Devices" *
YUTAO HUANG et al.: "DeePar: A Hybrid Device-Edge-Cloud Execution Framework for Mobile Deep Learning Applications", 2019 IEEE INFOCOM Workshops: NI 2019: Network Intelligence: Machine Learning for Networking *
JIA Weijia et al.: "Concept, related research and applications of fog computing", Journal on Communications *


Also Published As

Publication number Publication date
CN112287609B (en) 2021-03-30

Similar Documents

Publication Publication Date Title
CN112287609B (en) End, edge and cloud collaborative computing device for robot task division
CN112351503B (en) Task prediction-based multi-unmanned aerial vehicle auxiliary edge computing resource allocation method
Yu et al. Intelligent edge: Leveraging deep imitation learning for mobile edge computation offloading
CN111199279A (en) Cloud edge calculation and artificial intelligence fusion method and device for police service industry
CN110852387B (en) Energy internet super real-time state studying and judging algorithm
CN113220457A (en) Model deployment method, model deployment device, terminal device and readable storage medium
CN113723279B (en) Multi-target tracking acceleration method based on time-space optimization in edge computing environment
CN114997337B (en) Information fusion method, data communication method, information fusion device, data communication device, electronic equipment and storage medium
CN114520768B (en) AI unloading optimization method for random tasks in industrial Internet of things
CN108337685B (en) Wireless sensor network data fusion method based on sub-clustering DGM
CN109819032B (en) Cloud robot task allocation method considering base station selection and computing migration in combined manner
Breitenmoser et al. Distributed coverage control on surfaces in 3d space
CN113703984A (en) SOA (service oriented architecture) -based cloud task optimization strategy method under 5G cloud edge collaborative scene
CN112115830A (en) Target distributed fusion recognition method based on bit domain feature extraction
Zhang et al. Dynamic DNN decomposition for lossless synergistic inference
CN114828095A (en) Efficient data perception layered federated learning method based on task unloading
CN113342029B (en) Maximum sensor data acquisition path planning method and system based on unmanned aerial vehicle cluster
Kashyap et al. DECENT: Deep learning enabled green computation for edge centric 6G networks
CN104732278A (en) Deep neural network training method based on sea-cloud collaboration framework
CN116367190A (en) Digital twin function virtualization method for 6G mobile network
CN116955750A (en) Target detection method and system based on cloud edge cooperative hybrid architecture
Liu et al. Parallel ant colony optimization algorithm
CN114629769A (en) Traffic map generation method of self-organizing network
CN114972429A (en) Target tracking method and system for cloud edge collaborative self-adaptive inference path planning
CN113742778A (en) Distributed machine learning method and system based on federal learning and ALQ compression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant