WO2022171066A1 - Task allocation method based on Internet of Things devices, network training method, and apparatus - Google Patents

Task allocation method based on Internet of Things devices, network training method, and apparatus (基于物联网设备的任务分配方法、网络训练方法及装置) Download PDF

Info

Publication number
WO2022171066A1
Authority
WO
WIPO (PCT)
Prior art keywords
resource
graph
subgraph
node
performance
Prior art date
Application number
PCT/CN2022/075450
Other languages
English (en)
French (fr)
Inventor
曲薇 (Qu Wei)
Original Assignee
China Mobile Communication Co., Ltd. Research Institute (中国移动通信有限公司研究院)
China Mobile Communications Group Co., Ltd. (中国移动通信集团有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communication Co., Ltd. Research Institute and China Mobile Communications Group Co., Ltd.
Priority to EP22752227.3A (published as EP4290824A1)
Priority to JP2023548262A (published as JP2024506073A)
Publication of WO2022171066A1

Classifications

    • H: Electricity; H04: Electric communication technique; H04L: Transmission of digital information, e.g. telegraphic communication
    • H04L 67/1008: Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L 67/101: Server selection for load balancing based on network conditions
    • H04L 67/1012: Server selection for load balancing based on compliance of requirements or conditions with available server resources
    • H04L 67/12: Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H04L 41/147: Network analysis or design for predicting network behaviour
    • H04L 41/16: Maintenance, administration or management of data switching networks using machine learning or artificial intelligence
    • H04L 41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • G: Physics; G06: Computing; calculating or counting; G06N: Computing arrangements based on specific computational models
    • G06N 3/04: Neural networks; architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06N 5/025: Extracting rules from data

Definitions

  • the present disclosure relates to the field of the Internet of Things (IoT), and in particular to a task allocation method, a network training method and an apparatus based on IoT devices.
  • Although cloud computing can meet the computing power and storage resource requirements of computation-intensive deep learning tasks, it is not suitable for latency-, reliability- and privacy-sensitive IoT scenarios such as autonomous driving, virtual reality (VR) and augmented reality (AR). Moreover, the resources on a single IoT device are extremely limited, so distributed edge computing that performs cross-device collaboration among a variety of interconnected heterogeneous IoT devices may become an efficient solution, in which an intelligent method for distributing computing tasks across heterogeneous devices is the key to its realization.
  • In the related art, distributed training and inference of deep learning models are mainly realized through layer-scheduling algorithms based on model partitioning: some layers of the model are allocated to the edge side and the remaining layers to the cloud center. Edge servers mainly process lower-level data, while cloud servers mainly process higher-level data.
  • This task allocation strategy does not involve the allocation of the underlying deep learning operators, which limits the optimization of task scheduling and resource usage.
  • Embodiments of the present disclosure provide a task assignment method, a network training method, and an apparatus based on an Internet of Things device.
  • an embodiment of the present disclosure provides a network training method based on an IoT device, the method comprising: determining a training data set, and training a first network based on the training data set; the training data set includes at least one task allocation strategy and the corresponding actual performance; an actual performance is obtained by actually executing the corresponding task allocation strategy; the first network is used to predict the performance of a task allocation strategy.
  • the method further includes: determining a computation graph corresponding to the task to be processed and a resource graph corresponding to the IoT devices, and generating at least one task allocation strategy based on the computation graph and the resource graph.
  • the generating at least one task allocation strategy based on the computation graph and the resource graph includes: generating at least one resource subgraph based on the computation graph and the resource graph, where each resource subgraph contains a task allocation strategy; the task allocation strategy is used to allocate at least one node of the corresponding resource graph to each node of the computation graph; a node in the resource subgraph represents at least part of the capabilities of an IoT device; an edge between two adjacent nodes in the resource subgraph represents the relationship between at least parts of the capabilities of IoT devices.
  • the generating at least one resource subgraph based on the computation graph and the resource graph includes: determining a first node in the computation graph; the first node is the node with the largest resource requirement;
  • determining at least one second node in the resource graph; the at least one second node is a node that satisfies the resource requirement of the first node;
  • a resource subgraph is determined based on each second node, and each resource subgraph contains a task allocation strategy.
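The three subgraph-construction steps above can be sketched as follows. This is a minimal illustration only: the demand/capacity dictionaries, the edge set and the one-hop neighbour expansion are assumptions, not the patent's actual algorithm.

```python
# Minimal sketch of resource-subgraph construction: pick the computation-graph
# node with the largest resource demand, then build one candidate subgraph
# around every resource-graph node whose capacity can satisfy that demand.

def build_resource_subgraphs(comp_nodes, resource_nodes, edges):
    """comp_nodes: {name: resource demand}; resource_nodes: {name: capacity};
    edges: set of (a, b) node pairs in the resource graph."""
    # Step 1: the first node is the computation-graph node with the largest demand.
    first = max(comp_nodes, key=comp_nodes.get)
    demand = comp_nodes[first]
    # Step 2: second nodes are resource-graph nodes that satisfy that demand.
    seconds = [n for n, cap in resource_nodes.items() if cap >= demand]
    # Step 3: one resource subgraph per second node (here: the node plus its
    # neighbours); each subgraph implicitly encodes a task-allocation strategy.
    subgraphs = []
    for s in seconds:
        neighbours = {b for a, b in edges if a == s} | {a for a, b in edges if b == s}
        subgraphs.append({"anchor": first, "root": s, "nodes": {s} | neighbours})
    return subgraphs
```

Each returned subgraph anchors the most demanding operator on one capable device and carries the device's neighbourhood, mirroring "a resource subgraph is determined based on each second node".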
  • the training of the first network includes: training the first network based on predicted performance and actual performance of at least one task allocation strategy.
  • obtaining the predicted performance of at least one task allocation strategy includes: obtaining the prediction performance corresponding to each resource subgraph through the first network based on the computation graph and each resource subgraph.
  • the obtaining the prediction performance corresponding to each resource subgraph through the first network includes: extracting features of the computation graph through a feature extraction module of the first network to obtain a first feature set;
  • the features of the at least one resource subgraph are respectively extracted by the feature extraction module to obtain at least one second feature set;
  • the prediction performance corresponding to each resource subgraph is obtained based on the first feature set, each second feature set and the prediction module of the first network.
  • the obtaining the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set and the prediction module of the first network includes: obtaining at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and a corresponding second feature set;
  • prediction data corresponding to each resource subgraph are obtained based on each third feature set and the prediction module, and the prediction performance corresponding to each resource subgraph is obtained based on the prediction data corresponding to that resource subgraph.
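As a toy illustration of the feature-combination step above (plain Python lists stand in for the feature sets, and a fixed linear layer stands in for the prediction module; both are assumptions, since the patent does not fix a concrete architecture):

```python
# Concatenate the computation-graph feature set (first set) with each
# resource-subgraph feature set (second set) to form a "third" feature set,
# then map every third set to prediction data via a prediction module.

def combine_and_predict(first_set, second_sets, prediction_module):
    third_sets = [first_set + second for second in second_sets]  # concatenation
    return [prediction_module(third) for third in third_sets]    # per-subgraph data

def linear_module(features, w=0.5):
    # Stand-in prediction module: a fixed linear layer over the combined features.
    return {"score": w * sum(features)}

# Two candidate resource subgraphs, each contributing one second feature set.
data = combine_and_predict([1.0, 2.0], [[0.5], [1.5]], linear_module)
```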
  • the prediction data includes at least one of the following:
  • the obtaining the prediction performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph includes: weighting the prediction data corresponding to each resource subgraph according to preset weights, so as to obtain the prediction performance corresponding to each resource subgraph.
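The weighting step can be made concrete with a small numeric example. The metric names and preset weights below are assumptions; the patent leaves the individual prediction-data items unspecified.

```python
# Illustrative weighted aggregation of per-subgraph prediction data into a
# single predicted-performance score (higher is better here by convention).

def predicted_performance(prediction_data, weights):
    """prediction_data and weights: {metric_name: value}; returns weighted sum."""
    return sum(weights[k] * v for k, v in prediction_data.items())

# Example: two candidate resource subgraphs scored with preset weights.
weights = {"latency": -0.5, "throughput": 0.4, "energy": -0.1}
scores = [predicted_performance(d, weights) for d in (
    {"latency": 10.0, "throughput": 8.0, "energy": 2.0},   # subgraph A
    {"latency": 6.0, "throughput": 5.0, "energy": 1.0},    # subgraph B
)]
best = max(range(len(scores)), key=scores.__getitem__)     # index of best subgraph
```

Negative weights penalise latency and energy, so subgraph B, with lower latency and energy, ends up with the better score.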
  • the training of the first network includes: training the feature extraction module and the prediction module based on the predicted performance and actual performance of each task allocation strategy.
  • the training of the feature extraction module and the prediction module includes: back-propagating the error between the predicted performance and the actual performance of each task allocation strategy, and updating the network parameters of the feature extraction module and the prediction module of the first network using a gradient descent algorithm, until the error between the predicted performance and the actual performance satisfies a preset condition.
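The training rule above (backpropagate the prediction error, apply gradient descent, stop when a preset error condition is met) can be sketched with a one-dimensional linear predictor standing in for the first network's feature-extraction and prediction modules. The stand-in model, learning rate and tolerance are assumptions made purely for illustration.

```python
# Minimal gradient-descent loop: predict performance, measure the error
# against actual performance, and update parameters until the mean squared
# error falls below a preset tolerance (the "preset condition").

def train_first_network(samples, lr=0.01, tol=1e-4, max_steps=10000):
    """samples: list of (feature, actual_performance) pairs."""
    w, b = 0.0, 0.0
    err = float("inf")
    n = len(samples)
    for _ in range(max_steps):
        grad_w = grad_b = err = 0.0
        for x, y in samples:
            diff = (w * x + b) - y       # predicted minus actual performance
            err += diff * diff
            grad_w += 2 * diff * x       # d(err)/dw, by the chain rule
            grad_b += 2 * diff           # d(err)/db
        err /= n
        if err < tol:                    # preset condition satisfied: stop
            break
        w -= lr * grad_w / n             # gradient-descent parameter update
        b -= lr * grad_b / n
    return w, b, err

w, b, err = train_first_network([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```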
  • the method further includes: updating the training data set, and the updated training data set is used to update the first network.
  • the updating of the training data set includes at least one of the following:
  • at least one resource subgraph is generated using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method; the actual performance corresponding to each resource subgraph is obtained after the task allocation strategy corresponding to that resource subgraph is actually executed; and the computation graph, each resource subgraph and the corresponding actual performance are added to the training data set;
  • at least one resource subgraph is generated using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method; the predicted performance corresponding to each resource subgraph is obtained through the first network; the resource subgraph with the best predicted performance is selected from the at least one resource subgraph; the actual performance is obtained after the task allocation strategy corresponding to that resource subgraph is actually executed; and the computation graph, the resource subgraph with the best predicted performance and the corresponding actual performance are added to the training data set;
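The second update mode (generate candidate subgraphs, keep the one with the best predicted performance, execute its strategy, and append the observed result to the training set) can be sketched as follows. `generate_candidates`, `predict` and `execute` are hypothetical placeholders for the subgraph generator, the first network and actual on-device execution; they are not APIs from the patent.

```python
# Sketch of the training-set update loop: generate candidate resource
# subgraphs, keep the one the first network predicts to perform best,
# actually execute its task-allocation strategy, and record the result
# as a new (computation graph, subgraph, actual performance) sample.

def update_training_set(dataset, comp_graph, generate_candidates, predict, execute):
    candidates = generate_candidates(comp_graph)           # e.g. graph search
    best = max(candidates, key=lambda sg: predict(comp_graph, sg))
    actual = execute(comp_graph, best)                     # real execution
    dataset.append((comp_graph, best, actual))             # new training sample
    return dataset
```

Feeding executed strategies back into the data set is what lets the first network keep improving as more allocations are actually run.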
  • an embodiment of the present disclosure further provides a method for assigning tasks based on IoT devices, the method comprising:
  • the generating at least one task allocation strategy based on the computation graph and the resource graph includes: generating at least one resource subgraph based on the computation graph and the resource graph, where each resource subgraph contains a task allocation strategy; the task allocation strategy is used to allocate at least one node of the corresponding resource graph to each node of the computation graph; a node in the resource subgraph represents at least part of the capabilities of an IoT device; an edge between two adjacent nodes in the resource subgraph represents a relationship between at least parts of the capabilities of IoT devices.
  • the generating at least one resource subgraph based on the computation graph and the resource graph includes: determining a first node in the computation graph; the first node is the node with the largest resource demand; and determining at least one second node in the resource graph;
  • the at least one second node is a node that satisfies the resource requirements of the first node; a resource subgraph is determined based on each second node, and each resource subgraph contains a task allocation strategy.
  • the first network is optimized by using the method described in the first aspect of the embodiments of the present disclosure.
  • the obtaining the prediction performance corresponding to each task allocation strategy includes: obtaining the prediction performance corresponding to each resource subgraph through the first network based on the computation graph and each resource subgraph .
  • obtaining the prediction performance corresponding to each resource subgraph through the first network includes: extracting features of the computation graph through a feature extraction module of the first network to obtain a first feature set; extracting features of the at least one resource subgraph respectively through the feature extraction module to obtain at least one second feature set; and obtaining the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set and the prediction module of the first network.
  • the obtaining the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set and the prediction module of the first network includes: obtaining at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and a corresponding second feature set; and obtaining prediction data corresponding to each resource subgraph based on each third feature set and the prediction module, and the prediction performance corresponding to each resource subgraph based on the prediction data.
  • the prediction data includes at least one of the following:
  • the obtaining the prediction performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph includes: weighting the prediction data corresponding to each resource subgraph according to preset weights, so as to obtain the prediction performance corresponding to each resource subgraph.
  • the first determining unit is configured to determine a training data set, the training data set includes at least one task allocation strategy and corresponding actual performance; an actual performance is obtained based on the actual execution of the corresponding task allocation strategy;
  • the training unit is configured to train a first network based on the training data set; the first network is used to predict the performance of the task allocation strategy.
  • the apparatus further includes a first generating unit configured to determine a computation graph corresponding to the task to be processed and a resource graph corresponding to the IoT device, based on the computation graph and the resource graph, Generate at least one task assignment strategy.
  • the first generating unit is configured to generate at least one resource subgraph based on the computation graph and the resource graph, each resource subgraph including a task allocation strategy; the task The allocation strategy is used to allocate at least one node of the corresponding resource graph to each node of the computing graph; a node in the resource subgraph represents at least part of the capabilities of the IoT device; the edges of two adjacent nodes in the resource subgraph represent The relationship between at least some of the capabilities of an IoT device.
  • the first generating unit is configured to: determine a first node in the computation graph, the first node being the node with the largest resource demand; determine at least one second node in the resource graph, the at least one second node being a node that satisfies the resource requirements of the first node; and determine a resource subgraph based on each second node, each resource subgraph including a task allocation strategy.
  • the training unit is configured to train the first network based on predicted performance and actual performance of at least one task allocation strategy.
  • the training unit is further configured to obtain the prediction performance corresponding to each resource subgraph through the first network based on the computation graph and each resource subgraph.
  • the prediction data includes at least one of the following:
  • the training unit is configured to perform weighting processing on the prediction data corresponding to each resource subgraph according to preset weights, so as to obtain the prediction performance corresponding to each resource subgraph.
  • the training unit is configured to train the feature extraction module and the prediction module based on the predicted performance and actual performance of each task allocation strategy.
  • the training unit is configured to back-propagate the error between the predicted performance and the actual performance of each task allocation strategy, and to update the network parameters of the feature extraction module and the prediction module of the first network using a gradient descent algorithm, until the error between the predicted performance and the actual performance meets the preset condition.
  • the apparatus further includes an update unit configured to update the training data set, and the updated training data set is used to update the first network.
  • the updating unit is configured to update the training data set in at least one of the following manners:
  • at least one resource subgraph is generated using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method; the actual performance corresponding to each resource subgraph is obtained after the task allocation strategy corresponding to that resource subgraph is actually executed; and the computation graph, each resource subgraph and the corresponding actual performance are added to the training data set;
  • at least one resource subgraph is obtained using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method; the predicted performance corresponding to each resource subgraph is obtained through the first network; the resource subgraph with the best predicted performance is selected from the at least one resource subgraph; the actual performance is obtained after the task allocation strategy corresponding to that resource subgraph is actually executed; and the computation graph, the resource subgraph with the best predicted performance and the corresponding actual performance are added to the training data set;
  • a random walk method is used to generate at least one resource subgraph; the actual performance is obtained after the task allocation strategy corresponding to each resource subgraph is actually executed; and the computation graph, each resource subgraph and the corresponding actual performance are added to the training data set.
  • an embodiment of the present disclosure further provides an apparatus for task assignment based on IoT devices, the apparatus includes: a second determination unit, a second generation unit, a prediction unit, and a task assignment unit; wherein,
  • the second determining unit is configured to determine the computation graph corresponding to the task to be processed, and the resource graph corresponding to the Internet of Things device;
  • the second generating unit is configured to generate at least one task allocation strategy based on the computation graph and the resource graph;
  • the prediction unit is configured to input the at least one task allocation strategy into the first network, and obtain the prediction performance corresponding to each task allocation strategy;
  • the task allocation unit is configured to determine a task allocation strategy with the best prediction performance, and perform task allocation based on the determined task allocation strategy.
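Taken together, the four units above implement a generate-score-select-execute pipeline. A minimal sketch follows; all three callables are hypothetical placeholders standing in for the second generating unit's strategy generator, the first network and the actual task executor, not APIs from the patent.

```python
# Sketch of the task-assignment pipeline: generate candidate strategies,
# score each with the first network, and execute the best-scoring one.

def assign_task(comp_graph, resource_graph, generate_strategies, first_network, execute):
    strategies = generate_strategies(comp_graph, resource_graph)      # second generating unit
    scored = [(first_network(comp_graph, s), s) for s in strategies]  # prediction unit
    best_score, best_strategy = max(scored, key=lambda t: t[0])       # best predicted performance
    return execute(best_strategy), best_strategy                      # task-allocation unit
```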
  • the second generating unit is configured to generate at least one resource subgraph based on the computation graph and the resource graph, and each resource subgraph includes a task allocation strategy;
  • the task allocation strategy is used to allocate at least one node of the corresponding resource graph for each node of the computing graph;
  • a node in the resource subgraph represents at least part of the capabilities of the IoT device;
  • an edge between two adjacent nodes in the resource subgraph represents a relationship between at least parts of the capabilities of IoT devices.
  • the second generating unit is configured to: determine a first node in the computation graph, the first node being the node with the largest resource demand; determine at least one second node in the resource graph, the at least one second node being a node that satisfies the resource requirements of the first node; and determine a resource subgraph based on each second node, each resource subgraph including a task allocation strategy.
  • the prediction unit is configured to obtain the prediction performance corresponding to each resource subgraph through the first network based on the computation graph and each resource subgraph.
  • the prediction unit is configured to extract features of the computation graph through a feature extraction module of the first network to obtain a first feature set; extract features of the at least one resource subgraph respectively through the feature extraction module to obtain at least one second feature set; and obtain the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set and the prediction module of the first network.
  • the prediction unit is configured to obtain at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and a corresponding second feature set; obtain the prediction data corresponding to each resource subgraph based on each third feature set and the prediction module; and obtain the prediction performance corresponding to each resource subgraph based on the prediction data corresponding to that resource subgraph.
  • the prediction data includes at least one of the following:
  • an embodiment of the present disclosure further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor, when executing the program, implements the steps of the method described in the first aspect or the second aspect of the embodiments of the present disclosure.
  • FIG. 3 is a schematic flowchart of a method for generating a resource subgraph in an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of generating a resource subgraph in an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of a method for obtaining prediction performance in an embodiment of the present disclosure
  • FIG. 6b is a schematic diagram of feature extraction of a resource subgraph in an embodiment of the disclosure.
  • FIG. 8 is a schematic diagram of a task assignment method based on an IoT device according to an embodiment of the present disclosure
  • FIG. 9 is a schematic diagram of the composition of a task allocation system according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic diagram 1 of the composition and structure of a network training apparatus based on an IoT device according to an embodiment of the present disclosure
  • FIG. 11 is a second schematic diagram of the composition and structure of a network training device based on an Internet of Things device according to an embodiment of the present disclosure
  • FIG. 12 is a third schematic diagram of the composition and structure of a network training apparatus based on an Internet of Things device according to an embodiment of the present disclosure
  • FIG. 13 is a schematic diagram of the composition and structure of an apparatus for assigning tasks based on IoT devices according to an embodiment of the present disclosure
  • FIG. 14 is a schematic structural diagram of a hardware composition of an electronic device according to an embodiment of the disclosure.
  • the main purpose of the embodiments of the present disclosure is to construct an end-to-end trainable network model for resource scheduling across heterogeneous IoT devices, and to implement an automatically optimized, high-performance and intelligently adaptive task assignment method.
  • appropriate computing power, storage and communication resources are allocated to the nodes in the computation graph in a task allocation manner that can obtain the best system performance, which promotes the realization of decentralized machine learning with cross-device collaboration, e.g. the training and inference of deep models.
  • Abbreviations used herein include: Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Graph Convolutional Network (GCN) and Graph Neural Network (GNN).
  • the scenario also includes multiple IoT devices on the edge side, and each IoT device may have different capabilities.
  • the capability can be embodied by computing resources (Computation Resource), storage resources (Storage/Memory Resource), and communication resources (Communication Resource).
  • the above-mentioned computing resources may refer to available or idle computing resources.
  • computing resources may include, for example, Central Processing Unit (CPU) resources, Graphics Processing Unit (GPU) resources, Field Programmable Gate Array (FPGA) resources and Digital Signal Processor (DSP) resources.
  • the above-mentioned storage resources may refer to available or idle storage resources; storage resources may include, for example, memory resources, cache (Cache) resources, random access memory (RAM), etc.;
  • the above-mentioned communication resources may refer to available or idle communication resources.
  • different IoT devices can provide at least one of different computing resources, storage resources, and communication resources, and the computing resources, storage resources, and communication resources of each IoT device can form a resource pool.
  • a computation graph is generated by abstracting each operator of a computing task into a corresponding node, and a resource graph is generated by abstracting the capabilities of each IoT device into corresponding nodes; resource subgraphs are then constructed based on the computation graph and the resource graph (Resource Sub-graph Construction), feature extraction (Feature Extraction) is performed on the computation graph and the resource subgraphs, and performance prediction (Performance Prediction) is performed on the implied task allocation strategies according to the extracted features, so as to achieve intelligent assignment of tasks.
  • the smart home service scenario shown in FIG. 1 is only an optional application scenario to which the technical solutions of the embodiments of the present disclosure are applicable, and other application scenarios may also fall within the protection scope of the embodiments of the present disclosure. There is no specific limitation in the embodiments of the present disclosure.
  • Embodiments of the present disclosure provide a network training method based on IoT devices, which is applied to various electronic devices, including but not limited to fixed devices and/or mobile devices.
  • the fixed device includes, but is not limited to, a personal computer (PC, Personal Computer) or a server, and the server may be a cloud server or a common server.
  • the mobile device includes, but is not limited to: one or more of a mobile phone, a tablet computer or a wearable device.
  • FIG. 2 is a schematic flowchart of a network training method based on an IoT device according to an embodiment of the present disclosure; as shown in FIG. 2 , the method includes:
  • Step 101 Determine a training data set; the training data set includes at least one task allocation strategy and corresponding actual performance; an actual performance is obtained based on the actual execution of the corresponding task allocation strategy;
  • Step 102 Train a first network based on the training data set; the first network is used to predict the performance of the task allocation strategy.
  • the task allocation strategy represents a strategy for allocating to-be-processed tasks to at least one IoT device for execution; in other words, through the task allocation strategy, at least one IoT device can be determined, and at least one IoT device can be The device executes pending tasks as directed by the task allocation policy.
  • the task allocation strategy may also be referred to as one of the following: a task allocation method, a task allocation scheme, a task scheduling strategy, a task scheduling method, a task scheduling scheme, etc.
  • the first network is trained by using each task allocation strategy in the training data set and the corresponding actual performance.
  • each IoT device in the system may be a heterogeneous IoT device.
  • heterogeneous IoT devices refer to the following: in a network containing multiple IoT devices and servers, the hardware of one IoT device is different from the hardware of another IoT device, and/or the server of one IoT device is different from the server of another IoT device.
  • that the hardware of one IoT device is different from the hardware of another IoT device means that the model or type of the hardware corresponding to at least one of the computing resources and storage resources differs between the two IoT devices.
• a task allocation strategy is used to assign at least one node of the corresponding resource graph to each node in the computation graph; in other words, it allocates or maps the task to be processed onto at least one IoT device, i.e., it matches pending tasks with IoT devices, or pending tasks with resources.
• at least one node of the resource graph allocated to each node of the computation graph may be the same or different; that is, an IoT device may use at least part of its own capabilities to implement the computing unit corresponding to an operator, and multiple IoT devices may implement the computing unit corresponding to an operator in a cooperative manner.
• nodes (or operators) without computational dependencies (i.e., data dependencies) in the computation graph can be executed (or computed) in parallel on the same or different IoT devices.
  • a task allocation strategy may be embodied by a resource subgraph, in other words, the training data set may include a computation graph, at least one resource subgraph, and corresponding actual performance.
• based on the relationship between the computing, storage, and communication resource requirements of the tasks to be processed and the resources or capabilities available on each IoT device, at least one resource subgraph is generated from the complete resource graph; that is, at least one task allocation strategy is generated.
  • Resource subgraph construction realizes full utilization of idle resources on IoT devices and fine-grained task allocation and optimization.
• the resource graph in this embodiment may also be referred to as a resource knowledge graph; a resource subgraph may also be referred to as a resource knowledge subgraph, and so on.
• FIG. 3 is a schematic flowchart of a method for generating a resource subgraph in an embodiment of the present disclosure; as shown in FIG. 3 , the method for generating a resource subgraph may include:
• Step 201 Determine a first node in the computation graph; the first node is the node with the largest resource demand;
• Step 202 Determine at least one second node in the resource graph; the at least one second node is a node that meets the resource requirements of the first node;
• Step 203 Determine a resource subgraph based on each second node; each resource subgraph includes a task allocation strategy.
  • FIG. 4 is a schematic diagram of generating a resource subgraph in an embodiment of the present disclosure; with reference to FIG. 4 , first, each node in a calculation graph is numbered.
• the nodes in the computation graph are numbered based on a uniform rule. For example, first determine the branch with the largest number of nodes in the computation graph and number its nodes in sequence; in FIG. 4 , the branch with the largest number of nodes contains 5 nodes, and those 5 nodes are numbered in sequence. Then all nodes in the branch with the second largest number of nodes are numbered, and so on, until all nodes of the computation graph are numbered.
• determine the first node among all nodes in the computation graph, where the first node is the node with the largest resource demand; the first node may also be called a bottleneck node.
• for example, the resource requirement of the node numbered 4 in the computation graph of FIG. 4 is the largest, so the node numbered 4 is determined as the first node (or bottleneck node).
  • the node with the greatest resource demand may refer to the node with the greatest demand for at least one resource among computing resources, storage resources, and communication resources.
• at least one second node in the resource graph is determined; that is, a suitable resource node (also called a device node or capability node) is allocated to the first node (or bottleneck node) to provide available resources for executing the to-be-processed task.
• all nodes in the resource graph that can meet the resource requirements of the first node (or bottleneck node) can be determined as second nodes; for example, the three nodes numbered 4 in the resource graph of FIG. 4 all meet the resource requirements of the first node (or bottleneck node), so all three nodes numbered 4 are determined as second nodes.
• then, taking each resource node corresponding to the first node (or bottleneck node) as a starting point (for example, the resource node on the right side of the resource graph in FIG. 4 , marked as node V3), search the resource graph for the resource nodes adjacent to it (for example, nodes V1, V4, and V5 in the resource graph of FIG. 4 ), and allocate appropriate resource nodes to the computation-graph nodes that are 1 hop away from the first node (for example, the nodes numbered 3, 5, and 6 in the computation graph) so as to meet the resource requirements of the corresponding workload.
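As a concrete illustration, the bottleneck-first expansion described above can be sketched roughly as follows. This is a simplified sketch under assumed conditions (a single scalar resource demand per node and first-fit selection of capable resource nodes); the function names and all values are hypothetical, not taken from the disclosure:

```python
def find_bottleneck(comp_demand):
    """Return the computation-graph node with the largest resource demand
    (the 'first node' / bottleneck node)."""
    return max(comp_demand, key=comp_demand.get)

def generate_subgraph(comp_adj, comp_demand, res_capacity):
    """Map computation-graph nodes onto capable resource nodes, expanding
    hop by hop (BFS) outward from the bottleneck node."""
    mapping, used, visited = {}, set(), set()
    frontier = [find_bottleneck(comp_demand)]
    while frontier:
        nxt = []
        for cnode in frontier:
            if cnode in visited:
                continue
            visited.add(cnode)
            # first-fit: pick any unused resource node meeting the demand
            for rnode, cap in res_capacity.items():
                if rnode not in used and cap >= comp_demand[cnode]:
                    mapping[cnode] = rnode
                    used.add(rnode)
                    break
            nxt.extend(comp_adj.get(cnode, []))  # 1-hop neighbours next
        frontier = nxt
    return mapping
```

For the example of FIG. 4, the node numbered 4 would be mapped first, and nodes 3, 5, and 6 would be assigned resource nodes in the next round of the expansion.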
  • the training of the first network includes: training the first network based on predicted performance and actual performance of at least one task allocation strategy.
  • obtaining the prediction performance of at least one task allocation strategy includes: obtaining the prediction performance corresponding to each resource subgraph through the first network based on the computation graph and each resource subgraph.
  • FIG. 5 is a schematic flowchart of a method for obtaining predicted performance in an embodiment of the present disclosure; as shown in FIG. 5 , the method for obtaining predicted performance may include:
• Step 301 Extract the features of the computation graph through a feature extraction module of the first network to obtain a first feature set;
• Step 302 Extract the features of the at least one resource subgraph respectively through the feature extraction module to obtain at least one second feature set;
• Step 303 Obtain the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set, and the prediction module of the first network.
  • the feature of the computation graph is extracted by the feature extraction module to obtain the first feature set.
  • the first feature set may also be referred to as a feature set, a feature, or a feature vector.
• the feature extraction module is used to extract the features of the computation graph and the resource subgraphs; the extracted features involve hardware such as the CPU, GPU, FPGA, DSP, and memory, and mainly cover dimensions such as computing power, storage, and communication.
• determine the input feature set of the computation graph; the input feature set (which may also be referred to as the input feature, input feature vector, or input feature matrix) includes the input feature information of each node in the computation graph. Determine the adjacency matrix of the computation graph; the adjacency matrix represents the topology information of the computation graph, or in other words, the relationships between the nodes in the computation graph. Based on the input feature set, the adjacency matrix, and the feature extraction module, the features of the computation graph are extracted to obtain the first feature set.
  • the input feature set of the computation graph includes a feature vector of each node in the computation graph; the feature vector of each node in the computation graph includes resource information required to execute the operator corresponding to each node.
  • the required resource information includes, for example, CPU, GPU, DSP, FPGA, and memory usage.
  • the elements in the adjacency matrix corresponding to the computation graph represent the strength of the relationship between each two nodes, and the numerical size of the elements is related to the size of the transmission data between the corresponding two nodes.
• FIG. 6a is a schematic diagram of feature extraction of a computation graph in an embodiment of the disclosure; as shown in FIG. 6a , the computation graph includes 6 nodes, and correspondingly, the input feature set includes 6 groups of feature vectors, for example expressed as a feature matrix with one row per node.
• each group of feature vectors corresponds to a node of the computation graph and includes the usage (or resource requirements) of each resource when the operator corresponding to the node is executed, that is, the hardware execution cost of the operator, or the hardware occupancy data of the operator; for example, it may include features such as CPU occupancy, GPU occupancy, DSP occupancy, FPGA occupancy, and storage occupancy.
• the occupancy rate may also be referred to as the occupancy, occupancy ratio, usage, usage ratio, utilization, or utilization ratio.
• the value of an element may be e = k·d, where k represents a preset coefficient and d represents the size of the data transmitted between the two adjacent nodes.
• for example, e34 = k·d34, where d34 represents the size of the data transmitted between node 3 and node 4 in the computation graph.
  • the input feature set and the adjacency matrix shown in the figure are input to the feature extraction module, so as to obtain the first feature set.
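A minimal sketch of building such inputs for a 6-node computation graph, assuming e = k·d for adjacent nodes; all occupancy values, data sizes, the edge set, and the coefficient k = 0.5 are made-up placeholders:

```python
import numpy as np

# Each row of X: per-node occupancy [CPU, GPU, DSP, FPGA, memory] (made up).
X = np.array([
    [0.20, 0.00, 0.00, 0.00, 0.10],  # node 1
    [0.10, 0.30, 0.00, 0.00, 0.20],  # node 2
    [0.40, 0.10, 0.00, 0.00, 0.15],  # node 3
    [0.60, 0.50, 0.10, 0.00, 0.40],  # node 4
    [0.15, 0.00, 0.20, 0.00, 0.10],  # node 5
    [0.05, 0.00, 0.00, 0.10, 0.05],  # node 6
])

K = 0.5  # preset coefficient k (assumed value)

def edge_weight(d, k=K):
    """e = k * d, where d is the data size transmitted between two nodes."""
    return k * d

# (i, j) -> transmitted data size d_ij, 0-indexed (e.g. (2, 3) is node 3 -> node 4).
edges = {(0, 2): 4.0, (1, 0): 2.0, (2, 3): 16.0, (2, 4): 6.0, (4, 5): 3.0}
A = np.zeros((6, 6))
for (i, j), d in edges.items():
    A[i, j] = edge_weight(d)  # e.g. e34 = k * d34
```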
• determine the input feature set of the resource subgraph, where the input feature set includes the input feature information of each node in the resource subgraph; determine the adjacency matrix of the resource subgraph, where the adjacency matrix represents the topology information of the resource subgraph, or in other words, the relationships between the nodes in the resource subgraph; based on the input feature set, the adjacency matrix, and the feature extraction module, the features of the resource subgraph are extracted to obtain the second feature set.
• the input feature set of the resource subgraph includes a feature vector of each node in the resource subgraph; the feature vector of each node includes at least part of the resource information (or capability information) of the IoT device corresponding to that node, for example, available resources such as the CPU, GPU, DSP, FPGA, and memory.
• the elements in the adjacency matrix corresponding to the resource subgraph represent the strength of the communication relationship between each two nodes, and the numerical value of an element is related to the transmission rate and/or delay between the corresponding two nodes, and the like.
• FIG. 6b is a schematic diagram of feature extraction of a resource subgraph in an embodiment of the disclosure; FIG. 6b shows one task allocation strategy, that is, one resource subgraph, which includes 6 nodes;
• correspondingly, the input feature set includes 6 groups of feature vectors, for example expressed as a feature matrix with one row per node.
  • each group of feature vectors corresponds to a node on the resource subgraph
  • each group of feature vectors includes the features of the resources corresponding to each node, such as CPU resources, GPU resources, DSP resources, FPGA resources, storage resources and other characteristics.
• based on the connection relationships between the nodes in the graph, it can be determined that node V2 is connected to node V1, node V1 is connected to node V3, node V3 is connected to node V4 and to node V5, and node V5 is connected to node V6; accordingly, the corresponding elements of the adjacency matrix (i.e., e13, e21, e34, e35, and e56 shown in the figure) take specific values, while all other elements are 0.
  • the numerical value of the above elements is related to the transmission rate and/or time delay between the corresponding two nodes.
• the value of an element may be e = k1·ts + k2·l, where k1 and k2 represent preset coefficients, ts represents the transmission rate between two adjacent nodes, and l represents the transmission delay between the two adjacent nodes.
• for example, e34 = k1·ts34 + k2·l34, where ts34 represents the transmission rate between node V3 and node V4 in the resource subgraph, and l34 represents the delay between node V3 and node V4 in the resource subgraph.
  • the input feature set and the adjacency matrix shown in the figure are input to the feature extraction module, so as to obtain the second feature set.
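The resource-subgraph adjacency matrix can be assembled the same way, now with e = k1·ts + k2·l per link; the link set follows FIG. 6b, while the rates, delays, and coefficients below are invented for illustration (k2 is chosen negative so that higher delay lowers the edge strength, which is one possible convention, not the patent's):

```python
import numpy as np

K1, K2 = 0.8, -0.5  # preset coefficients k1, k2 (assumed values and signs)

def edge_weight(ts, l, k1=K1, k2=K2):
    """e = k1 * ts + k2 * l: ts is the transmission rate, l the delay."""
    return k1 * ts + k2 * l

# Links of FIG. 6b, 0-indexed over V1..V6: (i, j) -> (rate ts, delay l).
links = {
    (0, 2): (10.0, 2.0),  # V1 - V3
    (1, 0): (5.0, 1.0),   # V2 - V1
    (2, 3): (8.0, 0.5),   # V3 - V4
    (2, 4): (6.0, 1.5),   # V3 - V5
    (4, 5): (4.0, 1.0),   # V5 - V6
}
A = np.zeros((6, 6))
for (i, j), (ts, l) in links.items():
    A[i, j] = edge_weight(ts, l)  # e.g. e34 = k1*ts34 + k2*l34
```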
• the input feature set and the adjacency matrix corresponding to the above computation graph or resource subgraph are used as the input of the feature extraction module, and the features are updated through the forward propagation algorithm of expression (5), for example of the form H(l+1) = σ(A·H(l)·W(l));
• the forward propagation of the multi-layer network yields a feature set (or feature vector) that combines the features of all nodes and edges of the computation graph or resource subgraph; that is, the first feature set corresponding to the computation graph and the second feature set corresponding to the resource subgraph are obtained.
• H(l) represents all node features of the l-th layer of the multi-layer network in the feature extraction module;
• H(0) represents the input of the feature extraction module, that is, the input feature set;
• W represents the trainable weight matrix (i.e., the network parameters) of the feature extraction module, and W(l) represents the trainable weight matrix of the l-th layer of the multi-layer network;
• σ(·) represents the activation function.
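A common concrete instance of this kind of multi-layer forward propagation is the graph convolutional form H(l+1) = σ(Â·H(l)·W(l)) with a normalized adjacency Â; the patent does not reproduce expression (5), so the sketch below should be read as one plausible realization, not the claimed formula:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def gcn_forward(X, A, weights, activation=relu):
    """Forward propagation of a multi-layer graph network:
    H(0) = X (input feature set), H(l+1) = sigma(A_hat @ H(l) @ W(l)),
    where A_hat is the adjacency with self-loops, symmetrically normalized."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    H = X
    for W in weights:                               # one trainable W(l) per layer
        H = activation(A_norm @ H @ W)
    return H
```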
• the obtaining the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set, and the prediction module of the first network includes: obtaining at least one third feature set based on the first feature set and each second feature set, where each third feature set includes the first feature set and one second feature set; obtaining, based on each third feature set and the prediction module, the prediction data corresponding to each resource subgraph; and obtaining the prediction performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.
• the input data of the prediction module is a fused feature (i.e., a third feature set) obtained from the first feature set and a second feature set, both produced by the feature extraction module.
• for example, the first feature set and the second feature set may be concatenated (spliced together) to obtain a third feature set.
• the third feature set is then input into the prediction module, and the prediction data corresponding to each resource subgraph is obtained through layer-by-layer iteration of the forward propagation algorithm of the multi-layer network in the prediction module.
• the prediction data may be a vector including three data items or components: one component represents the predicted execution time of the to-be-processed task and may be denoted as φt; one component represents the predicted energy consumption of executing the to-be-processed task and may be denoted as φe; and one component represents the predicted reliability of executing the to-be-processed task and may be denoted as φr. The prediction performance corresponding to each resource subgraph is then determined based on the above prediction data; this performance may also be referred to as the overall system performance.
• the weighting is performed according to the preset weight corresponding to each component; that is, the corresponding prediction performance φ is obtained according to expression (6), for example of the form φ = Q(φt, φe, φr), where Q(·) represents a function that includes the weighting information of each component, data item, or (key) performance indicator.
• the specific form of the function of expression (6), that is, the specific information of the preset weights, depends on how strongly different scenarios require or emphasize delay, energy consumption, reliability, and so on; in other words, a specific function is used to weight the different performance indicators so as to trade them off against one another, and the weighted value of each key performance indicator is calculated according to the set formula to obtain the overall system performance. The predicted performance obtained through expression (6) thus reflects the overall system performance related to Quality of Service (QoS).
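Feature fusion and the weighted combination of expression (6) might look like the following sketch; the concatenation-based fusion matches the example above, while the weights (and the negative signs penalizing time and energy) are purely illustrative assumptions:

```python
import numpy as np

def fuse(f_comp, f_res):
    """Third feature set: concatenation of the computation-graph feature set
    with one resource-subgraph feature set."""
    return np.concatenate([f_comp, f_res])

def overall_performance(phi_t, phi_e, phi_r, w_t=-1.0, w_e=-0.5, w_r=2.0):
    """One possible Q(.): a weighted sum trading off predicted execution time
    phi_t, energy consumption phi_e, and reliability phi_r.  Time and energy
    get negative weights so that faster, cheaper strategies score higher."""
    return w_t * phi_t + w_e * phi_e + w_r * phi_r
```

Given the prediction vectors of several candidate strategies, the one with the largest overall_performance value would be selected.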
  • the training of the first network includes: training the feature extraction module and the prediction module based on the predicted performance and actual performance of each task assignment strategy.
  • the network parameters of the feature extraction module and the prediction module are updated based on the prediction performance of each task allocation strategy and the actual performance in the training data set, thereby realizing the training of the feature extraction module and the prediction module.
• the prediction module is used to learn, from the correspondence between different task allocation strategies and system performance (execution time, power consumption, reliability), the inherent statistical laws of task scheduling on the different operating systems of the various heterogeneous IoT devices; this enables system performance prediction for a given task allocation strategy before the task is executed, so that the task allocation strategy with the best system performance can be selected from the different task allocation strategies contained in the multiple resource subgraphs. By achieving the best match between the computing tasks to be processed and the available resources of the IoT devices, resource utilization is maximized, thereby improving the overall system performance.
• at least one resource subgraph is generated by using at least one of a heuristic method, a graph search method, a graph optimization method, and a subgraph matching method; after the task allocation strategy corresponding to each resource subgraph is actually executed, the actual performance corresponding to each resource subgraph is obtained, and the computation graph, each resource subgraph, and the corresponding actual performance are added to the training data set;
• alternatively, a random walk method is used to generate at least one resource subgraph, the actual performance is obtained after the task allocation strategy corresponding to each resource subgraph is actually executed, and the computation graph, each resource subgraph, and the actual performance are added to the training data set.
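A random-walk generator of candidate resource subgraphs might be sketched as follows; the adjacency list, walk length, and the rule "the visited nodes induce one candidate subgraph" are all illustrative assumptions rather than the disclosed algorithm:

```python
import random

def random_walk_subgraph(res_adj, start, length, seed=None):
    """Walk randomly over the resource graph from `start`; the set of
    visited nodes induces one candidate resource subgraph (i.e. one
    candidate task allocation strategy)."""
    rng = random.Random(seed)
    node, visited = start, [start]
    for _ in range(length):
        neighbors = res_adj.get(node, [])
        if not neighbors:
            break                       # dead end: stop the walk
        node = rng.choice(neighbors)    # move to a random neighbour
        if node not in visited:
            visited.append(node)
    return visited
```

Repeating the walk with different seeds or starting nodes yields multiple candidate subgraphs whose strategies can then be scored by the first network.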
• Step 401 Determine the computation graph corresponding to the task to be processed and the resource graph corresponding to the IoT devices;
• Step 402 Generate at least one task allocation strategy based on the computation graph and the resource graph;
• Step 403 Obtain the predicted performance of each task allocation strategy through the first network;
• Step 404 Determine the task allocation strategy with the best predicted performance, and perform task allocation based on the determined task allocation strategy.
  • the generating at least one task allocation strategy based on the computation graph and the resource graph includes: generating at least one resource subgraph based on the computation graph and the resource graph, Each resource subgraph contains a task allocation strategy; the task allocation strategy is used to allocate at least one node of the corresponding resource graph to each node of the computing graph; a node in the resource subgraph represents at least one node of the IoT device. Partial capabilities; an edge between two adjacent nodes in the resource subgraph represents a relationship between at least part of the capabilities of an IoT device.
  • the generating at least one resource subgraph based on the computation graph and the resource graph includes: determining a first node in the computation graph; the first node is a resource the node with the greatest demand; determine at least one second node in the resource graph; the at least one second node is a node that satisfies the resource requirement of the first node; determine a resource subgraph based on each second node, Each resource subgraph contains a task allocation strategy.
• the obtaining the prediction performance corresponding to each resource subgraph through the first network includes: extracting features of the computation graph through a feature extraction module of the first network to obtain a first feature set; extracting the features of the at least one resource subgraph respectively through the feature extraction module to obtain at least one second feature set; and obtaining the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set, and the prediction module of the first network.
• the obtaining the prediction performance corresponding to each resource subgraph based on the first feature set, each second feature set, and the prediction module of the first network includes: obtaining at least one third feature set based on the first feature set and each second feature set, where each third feature set includes the first feature set and one second feature set; obtaining, based on each third feature set and the prediction module, the prediction data corresponding to each resource subgraph; and obtaining the prediction performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.
• the obtaining the prediction performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph includes: weighting the prediction data corresponding to each resource subgraph according to a preset weight to obtain the prediction performance corresponding to each resource subgraph.
• the determining the task allocation strategy with the best predicted performance and performing task allocation based on the determined strategy includes: according to the predicted performance corresponding to each resource subgraph, selecting the task allocation strategy whose predicted value of the performance indicator (that is, of the overall system performance) is the largest, and actually performing the task allocation according to this strategy.
• the method further includes: after the task allocation is performed, acquiring the actual performance of the to-be-processed task when it is executed according to the corresponding task allocation strategy, and storing the corresponding task allocation strategy and the acquired actual performance in a training data set; the training data set is used to update the first network.
• the tasks to be processed are actually executed according to the determined task allocation strategy to obtain the actual performance (or the actual overall system performance), and the corresponding task allocation strategy and the obtained actual performance are stored in the training data set used to update the first network, so as to constitute the training data set or update the training data set.
• FIG. 8 is a schematic diagram of a task allocation method based on an Internet of Things device according to an embodiment of the present disclosure; as shown in FIG. 8 , in the first step, for a task to be processed, the computation graph of the task to be processed is determined; the computation graph is then optimized, for example, some nodes are merged, to obtain an optimized computation graph. It should be noted that the computation graph in the foregoing embodiments of the present disclosure may refer to an optimized computation graph. Further, each node in the computation graph is numbered according to certain rules.
  • At least one resource subgraph is generated based on the computation graph and the resource graph constructed according to the resources and capabilities of each IoT device in the system.
• the method of generating the resource subgraph may be as described in the foregoing embodiments, which will not be repeated here. Also, for the resource subgraph shown in FIG. 8 , reference may be made to the resource subgraph in FIG. 4 .
• the first feature set and the second feature set are merged into a third feature set, and the third feature set is input to the prediction module for system performance prediction to obtain the prediction performance corresponding to each resource subgraph.
  • the prediction module may be implemented by a deep neural network (DNN).
• the actual performance obtained after each task is actually executed is recorded; the error between the predicted performance and the actual performance is determined by comparing the two; and then error backpropagation and gradient descent algorithms are used to update the network parameters of the feature extraction module and the prediction module included in the first network, thereby realizing the training of the feature extraction module and the prediction module.
• FIG. 9 is a schematic diagram of the composition of a task allocation system or platform according to an embodiment of the present disclosure; as shown in FIG. 9 , the task allocation system of this embodiment includes several parts: training data set construction, a training phase, an inference phase, and continuous learning. Wherein:
• Training phase: the input is all the training data (also called training samples) in the training data set, wherein each training datum includes: a computation graph, a resource subgraph, and the corresponding actual system performance.
• the training data is input into the network model of the embodiment of the present disclosure, and the error between the obtained predicted value φp of the system performance indicator and the actual value φt of the system performance is back-propagated through the gradient descent algorithm to update the network parameters (such as the weights) of the feature extraction module and the prediction module in the network model until convergence; within an acceptable error range (which can be set manually in the algorithm), the final network parameters (or model parameters) make the predicted value of the system performance indicator of the training samples closest to its actual value.
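The error-backpropagation and gradient-descent loop can be illustrated with a deliberately tiny stand-in: a single linear layer fitted by gradient descent on the mean squared error between the predicted performance φp and the actual performance φt. The real model is the multi-layer feature extraction and prediction network; this sketch only mirrors the update rule, and all names and hyperparameters are assumptions:

```python
import numpy as np

def train(features, actual_perf, lr=0.1, epochs=2000):
    """Fit stand-in parameters w by gradient descent on the MSE between
    the predicted performance (features @ w) and the measured performance."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=features.shape[1])   # trainable parameters (weights)
    n = len(actual_perf)
    for _ in range(epochs):
        pred = features @ w                  # predicted value phi_p
        grad = 2.0 * features.T @ (pred - actual_perf) / n
        w -= lr * grad                       # gradient-descent parameter update
    return w
```

After enough iterations the predicted values of the training samples approach the recorded actual values, which is the convergence criterion described above.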
• Continuous learning stage: the continuous learning mechanism uses the continuously updated training data set to periodically train the network parameters of the feature extraction module and the prediction module in the network model, so that the task allocation system and platform have self-learning and self-adaptive capabilities, achieving intelligent self-adaptation and the effect of "the more it is used, the smarter it becomes". Specifically, this can be realized through historical sample accumulation and random walks. Historical sample accumulation records the task allocation strategy adopted by each computing task and the actual system performance obtained after each computing task is actually executed, and stores them in the training data set as new training samples. The specific implementation of the random walk is as described above.
• the deep learning task intelligent allocation system and platform, ICTA (Intelligent Computing Task Allocation), mainly includes: resource subgraph construction, feature extraction, and performance prediction.
• the input is the computation graph constructed from the current deep learning task and the resource graph constructed from the IoT edge devices with idle resources.
• the resource subgraphs are constructed using methods such as graph search and subgraph matching to generate multiple resource subgraphs carrying different task allocation strategies, which realizes full utilization of the available resources on the IoT devices and operator-level allocation and optimization of the tasks to be processed. Multi-layer neural networks are used for feature extraction and performance prediction respectively: the resource subgraphs and the computation graph are put into the feature extraction module for feature extraction and fusion, fully exploiting the computing-power, storage, and communication features hidden in the nodes and topology structures of the two types of graphs; the fused features are then put into the performance prediction module to predict the system performance.
• by learning the inherent statistical laws of task scheduling, accurate system performance prediction for a given task allocation strategy is realized before the task is actually executed, so that the optimal allocation strategy can be selected from the alternatives and the best match between computing tasks and available resources is achieved, thereby maximizing resource utilization and improving the overall system performance. A continuous learning mechanism is introduced to periodically train ICTA with the continuously updated training data set, further improving the system performance and the adaptability to dynamically changing environments, and giving the system self-adaptive and self-learning capabilities so as to achieve intelligent self-adaptation, so that the task allocation system achieves the effect of "the more you use it, the smarter it gets".
• in the ICTA intelligent computing task allocation method:
• the feature extraction module extracts the node and topology features of the resource subgraph and the computation graph respectively, and performs feature fusion, realizing deep perception, feature extraction, and feature matching of the dimension features (such as computing power, storage, and communication) that play a key role in the performance of deep learning computing tasks;
• performance prediction fuses the features and uses multi-layer neural networks to learn the inherent statistical laws of task scheduling on different operating systems, that is, the correspondence between task allocation strategies and system performance; the task allocation strategy corresponding to the best (predicted) system performance is then selected from the alternatives to actually execute the deep learning task, achieving the best match between deep learning computing tasks and the available resources on the IoT devices, so as to maximize resource utilization and improve the system performance.
  • FIG. 10 is a schematic diagram 1 of the composition and structure of a network training apparatus based on an IoT device according to an embodiment of the present disclosure; as shown in FIG. 10 , the apparatus includes: a first determination unit 11 and a training unit 12 ; wherein,
  • the first determining unit 11 is configured to determine a training data set, the training data set includes at least one task allocation strategy and corresponding actual performance; an actual performance is obtained based on the actual execution of the corresponding task allocation strategy;
  • the training unit 12 is configured to train a first network based on the training data set; the first network is used to predict the performance of the task allocation strategy.
• the apparatus further includes a first generating unit 13 configured to determine a computation graph corresponding to the task to be processed and a resource graph corresponding to the IoT devices, and to generate at least one task allocation strategy based on the computation graph and the resource graph.
  • the first generating unit 13 is configured to generate at least one resource subgraph based on the computation graph and the resource graph, and each resource subgraph includes a task allocation strategy; the The task allocation strategy is used to allocate at least one node of the corresponding resource graph to each node of the computing graph; a node in the resource subgraph represents at least part of the capabilities of the IoT device; the edge of two adjacent nodes in the resource subgraph Represents a relationship between at least some of the capabilities of an IoT device.
  • the first generating unit 13 is configured to determine a first node in the computation graph, the first node being the node with the largest resource demand; determine at least one second node in the resource graph, the at least one second node being a node that satisfies the resource demand of the first node; and determine one resource subgraph based on each second node, each resource subgraph containing one task allocation strategy.
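The generation steps above (find the computation-graph node with the largest resource demand, find every resource node that satisfies it, and derive one candidate strategy per match) can be sketched as follows. This is a minimal illustration; the graph representation, the `demand`/`capacity` fields, and the function name are assumptions of this sketch, not the patent's implementation.

```python
def generate_candidates(comp_nodes, res_nodes):
    """One candidate allocation per resource node able to host the bottleneck.

    comp_nodes / res_nodes: lists of dicts with an 'id' plus a scalar
    'demand' / 'capacity' (stand-ins for compute/storage/communication needs).
    """
    # First node: the computation-graph node with the largest resource demand.
    first = max(comp_nodes, key=lambda n: n["demand"])
    # Second nodes: every resource node whose capacity satisfies that demand.
    seconds = [r for r in res_nodes if r["capacity"] >= first["demand"]]
    # Each second node anchors one candidate resource subgraph / strategy.
    return [{"bottleneck": first["id"], "anchor": r["id"]} for r in seconds]
```

Each returned candidate would then be expanded into a full resource subgraph by allocating resource nodes to the remaining computation-graph nodes.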
  • the training unit 12 is configured to train the first network based on predicted performance and actual performance of at least one task allocation strategy.
  • the training unit 12 is further configured to obtain, through the first network, the predicted performance corresponding to each resource subgraph based on the computation graph and each resource subgraph.
  • the training unit 12 is configured to extract features of the computation graph through a feature extraction module of the first network to obtain a first feature set; extract features of the at least one resource subgraph through the feature extraction module respectively to obtain at least one second feature set; and obtain the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and a prediction module of the first network.
  • the training unit 12 is configured to obtain at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and one second feature set; obtain prediction data corresponding to each resource subgraph based on each third feature set and the prediction module; and obtain the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.
  • the prediction data includes at least one of the following: a predicted execution duration of executing the task to be processed; a predicted energy consumption of executing the task to be processed; a predicted reliability of executing the task to be processed;
  • the training unit 12 is configured to weight the prediction data corresponding to each resource subgraph according to preset weights, so as to obtain the predicted performance corresponding to each resource subgraph.
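The weighting step can be illustrated as a simple weighted combination of the per-subgraph prediction data. The metric names and weight values here are hypothetical; in practice the metrics would need consistent normalization and sign conventions (e.g. lower latency is better).

```python
def predicted_performance(prediction_data, weights):
    """Collapse prediction data (one value per metric) into one score
    using preset weights; a higher score means better predicted performance."""
    return sum(weights[m] * prediction_data[m] for m in weights)

# rank hypothetical candidate subgraphs by the combined score
candidates = {
    "subgraph_1": {"speed": 0.8, "energy": 0.5, "reliability": 0.9},
    "subgraph_2": {"speed": 0.6, "energy": 0.9, "reliability": 0.7},
}
weights = {"speed": 0.5, "energy": 0.2, "reliability": 0.3}
best = max(candidates, key=lambda s: predicted_performance(candidates[s], weights))
```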
  • the training unit 12 is configured to train the feature extraction module and the prediction module based on the predicted performance and actual performance of each task allocation strategy.
  • the training unit 12 is configured to back-propagate the error between the predicted performance and the actual performance of each task allocation strategy and, using a gradient descent algorithm, update the network parameters of the feature extraction module and the prediction module of the first network until the error between the predicted performance and the actual performance satisfies a preset condition.
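The update rule just described (back-propagate the predicted-vs-actual error and apply gradient descent until the error meets a preset condition) can be sketched with a toy linear predictor standing in for the feature-extraction and prediction modules. Everything here is illustrative, not the patent's network.

```python
def train_predictor(samples, lr=0.01, tol=1e-4, max_epochs=10000):
    """samples: (features, actual_performance) pairs, one per executed
    task allocation strategy. Returns learned weights once the mean
    squared error between predicted and actual performance is below tol."""
    w = [0.0] * len(samples[0][0])
    for _ in range(max_epochs):
        mse = 0.0
        for x, y in samples:
            pred = sum(wi * xi for wi, xi in zip(w, x))  # forward pass
            err = pred - y
            mse += err * err
            # gradient descent on the squared error (the back-propagated error)
            w = [wi - lr * 2.0 * err * xi for wi, xi in zip(w, x)]
        if mse / len(samples) < tol:  # preset stopping condition
            break
    return w
```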
  • the apparatus further includes an update unit 14 configured to update the training data set, and the updated training data set is used to update (or referred to as training) the first network.
  • the updating unit 14 is configured to update the training data set in at least one of the following manners:
  • At least one resource subgraph is generated using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method; after the task allocation strategy corresponding to each resource subgraph is actually executed, the actual performance corresponding to each resource subgraph is obtained, and the computation graph, each resource subgraph and the corresponding actual performance are added to the training data set;
  • At least one resource subgraph is generated using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method; the predicted performance corresponding to each resource subgraph is obtained through the first network; the resource subgraph with the best predicted performance is selected from the at least one resource subgraph; the actual performance is obtained after the task allocation strategy corresponding to that resource subgraph is actually executed; and the computation graph, the resource subgraph with the best predicted performance and the corresponding actual performance are added to the training data set;
  • A random walk method is used to generate at least one resource subgraph, and the actual performance is obtained after the task allocation strategy corresponding to each resource subgraph is actually executed; the computation graph, the resource subgraphs and the actual performance are added to the training data set.
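Of the three update routes, the random-walk one is the easiest to sketch: starting from some resource node, repeatedly hop to a random neighbor to carve a candidate subgraph out of the resource graph. The node names and the adjacency-list representation are assumptions of this sketch.

```python
import random

def random_walk_subgraph(adjacency, start, max_len, seed=None):
    """Sample one candidate resource subgraph (as a node sequence) by a
    random walk over the resource graph's adjacency lists."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < max_len:
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:
            break  # dead end: no further resource node to hop to
        walk.append(rng.choice(neighbors))
    return walk
```

Each sampled walk would then be executed as a task allocation strategy and added, with its measured actual performance, to the training data set.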
  • the method for generating resource subgraphs described in the embodiments of the present disclosure may be a heuristic method, a graph search method, or a subgraph matching method; this embodiment is not limited to generating task allocation strategies with the above resource subgraph generation methods, and at least one of other heuristic methods, graph search methods and subgraph matching methods may also be used to generate the task allocation strategies.
  • the first determining unit 11 , the training unit 12 , the first generating unit 13 and the updating unit 14 in the apparatus may, in practical applications, each be implemented by a CPU, GPU, DSP, microcontroller unit (MCU), FPGA, TPU, ASIC, AI chip, or the like.
  • when the network training apparatus based on IoT devices provided in the above embodiment performs network training, the division into the above program modules is only used as an example for illustration; in practical applications, the above processing may be allocated to different program modules as needed, that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above.
  • the network training apparatus based on the Internet of Things device provided by the above embodiments and the network training method based on the Internet of Things device belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
  • Embodiments of the present disclosure also provide a task assignment apparatus based on an IoT device.
  • FIG. 13 is a schematic diagram of the composition and structure of an apparatus for task assignment based on IoT devices according to an embodiment of the present disclosure; as shown in FIG. 13 , the apparatus includes: a second determination unit 21 , a second generation unit 22 , a prediction unit 23 and a task assignment unit. 24; of which,
  • the second determining unit 21 is configured to determine the computation graph corresponding to the task to be processed and the resource graph corresponding to the IoT device;
  • the second generating unit 22 is configured to generate at least one task allocation strategy based on the computation graph and the resource graph;
  • the prediction unit 23 is configured to input the at least one task allocation strategy into the first network, and obtain the prediction performance corresponding to each task allocation strategy;
  • the task assignment unit 24 is configured to determine a task assignment strategy with the best predictive performance, and perform task assignment based on the determined task assignment strategy.
  • the second generating unit 22 is configured to generate at least one resource subgraph based on the computation graph and the resource graph, and each resource subgraph includes a task allocation strategy;
  • the task allocation strategy is used to allocate at least one node of the corresponding resource graph to each node of the computing graph;
  • a node in the resource subgraph represents at least part of the capabilities of the IoT device;
  • the edges between two adjacent nodes in the resource subgraph represent relationships between at least some of the capabilities of IoT devices.
  • the second generating unit 22 is configured to determine a first node in the computation graph, the first node being the node with the largest resource demand; determine at least one second node in the resource graph, the at least one second node being a node that satisfies the resource demand of the first node; and determine one resource subgraph based on each second node, each resource subgraph containing one task allocation strategy.
  • the first network is optimized by using the network training apparatus described in the foregoing embodiments of the present disclosure.
  • the predicting unit 23 is configured to obtain, through the first network, the predicted performance corresponding to each resource subgraph based on the computation graph and each resource subgraph.
  • the predicting unit 23 is configured to extract features of the computation graph through a feature extraction module of the first network to obtain a first feature set; extract features of the at least one resource subgraph through the feature extraction module respectively to obtain at least one second feature set; and obtain the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and a prediction module of the first network.
  • the predicting unit 23 is configured to obtain at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and one second feature set; obtain prediction data corresponding to each resource subgraph based on each third feature set and the prediction module; and obtain the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.
  • the prediction data includes at least one of the following: a predicted execution duration of executing the task to be processed; a predicted energy consumption of executing the task to be processed; a predicted reliability of executing the task to be processed;
  • the prediction unit 23 is configured to weight the prediction data corresponding to each resource subgraph according to preset weights, so as to obtain the predicted performance corresponding to each resource subgraph.
  • the task allocation unit 24 is configured to determine a task allocation strategy with the best predicted performance, and perform task allocation and execution in practice according to the strategy.
  • the apparatus further includes: an obtaining unit, configured to obtain the actual performance of the to-be-processed task when it is executed according to a corresponding task allocation strategy after the task allocation is performed;
  • the task allocation strategy and the obtained actual performance are stored in a training data set; the training data set is used to update the first network.
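The storage step above can be pictured as an append-only record store that later feeds retraining of the first network; the class and field names below are illustrative only.

```python
class TrainingDataSet:
    """Accumulates (strategy, actual_performance) records gathered after
    each real execution; the records are later replayed to update
    (retrain) the first network."""

    def __init__(self):
        self.records = []

    def add(self, strategy, actual_performance):
        self.records.append((strategy, actual_performance))

    def best_strategy(self):
        # e.g. inspect the best-performing strategy observed so far
        strategy, _ = max(self.records, key=lambda r: r[1])
        return strategy
```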
  • the second determination unit 21 , the second generation unit 22 , the prediction unit 23 , the task assignment unit 24 and the acquisition unit in the apparatus may each be implemented by a CPU, GPU, DSP, ASIC, AI chip, MCU, FPGA, or the like.
  • when the task allocation apparatus based on IoT devices performs task allocation, the division into the above program modules is only used as an example for illustration; in practical applications, the above processing may be allocated to different program modules as needed, that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above.
  • the task assignment apparatus based on IoT devices provided by the above embodiments and the embodiments of the task assignment method based on IoT devices belong to the same concept, and the specific implementation process is detailed in the method embodiments, which will not be repeated here.
  • FIG. 14 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the disclosure.
  • the electronic device includes a memory 32 , a processor 31 , and a computer program stored in the memory 32 and running on the processor 31 .
  • the processor 31 implements the steps of the network training method described in the foregoing embodiments of the present disclosure when executing the program; or, when executing the program, the processor implements the steps of the task allocation method described in the foregoing embodiments of the present disclosure.
  • bus system 33 is used to implement the connection communication between these components.
  • bus system 33 also includes a power bus, a control bus and a status signal bus.
  • the various buses are designated as bus system 33 in FIG. 14 .
  • the memory 32 may be either volatile memory or non-volatile memory, and may include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory.
  • the volatile memory may be a random access memory (RAM), for example a static random access memory (SRAM), a synchronous static random access memory (SSRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchronous link dynamic random access memory (SLDRAM), or a direct Rambus random access memory (DRRAM).
  • the memory 32 described in the embodiments of the present disclosure is intended to include, but not be limited to, these and any other suitable types of memory.
  • the methods disclosed in the above embodiments of the present disclosure may be applied to the processor 31 or implemented by the processor 31 .
  • the processor 31 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method can be completed by a hardware integrated logic circuit in the processor 31 or an instruction in the form of software.
  • the above-mentioned processor 31 may be a general-purpose processor, a DSP, or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
  • the processor 31 may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present disclosure.
  • a general purpose processor may be a microprocessor or any conventional processor or the like.
  • the steps of the methods disclosed in combination with the embodiments of the present disclosure can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium, and the storage medium is located in the memory 32, and the processor 31 reads the information in the memory 32 and completes the steps of the foregoing method in combination with its hardware.
  • the electronic device may be implemented by one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), FPGAs, general-purpose processors, controllers, MCUs, microprocessors, or other electronic components, for performing the aforementioned methods.
  • an embodiment of the present disclosure further provides a computer-readable storage medium, such as a memory 32 including a computer program, and the computer program can be executed by the processor 31 of the electronic device to complete the steps of the foregoing method.
  • the computer-readable storage medium can be memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disk, or CD-ROM; it can also be various devices including one or any combination of the above memories.
  • Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the network training method described in the foregoing embodiments of the present disclosure, or implements the steps of the task allocation method described in the foregoing embodiments of the present disclosure.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical or in other forms.
  • the unit described above as a separate component may or may not be physically separated, and the component displayed as a unit may or may not be a physical unit, that is, it may be located in one place or distributed to multiple network units; Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in the embodiments of the present disclosure may be fully integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit; the above integrated unit may be implemented either in the form of hardware or in the form of hardware plus software functional units.
  • the aforementioned program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments; the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic disk, an optical disk, or other media that can store program codes.
  • if the above-mentioned integrated units of the present disclosure are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • based on this understanding, the technical solutions of the embodiments of the present disclosure, in essence, or the part contributing to the prior art, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the embodiments of the present disclosure.
  • the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic disk or an optical disk and other mediums that can store program codes.

Abstract

Embodiments of the present application disclose a task allocation method, a network training method and apparatuses based on IoT devices. The network training method includes: determining a training data set, and training a first network based on the training data set; the training data set includes at least one task allocation strategy and the corresponding actual performance; an actual performance is obtained by actually executing the corresponding task allocation strategy; the first network is used to predict the performance of a task allocation strategy.

Description

Task allocation method, network training method and apparatus based on IoT devices

CROSS-REFERENCE TO RELATED APPLICATIONS

The present disclosure is based on, and claims priority to, the Chinese patent application No. 202110184998.5 filed on February 10, 2021, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to the field of the Internet of Things (IoT), and in particular to a task allocation method, a network training method, and apparatuses based on IoT devices.

BACKGROUND

With breakthrough advances in 5G mobile networks and deep learning technology, AI-based IoT applications and services are facing new opportunities and challenges, and the intelligent connection of everything is undoubtedly a key technology trend involved. Although cloud computing can meet the computing-power and storage demands of computation-intensive deep learning tasks, it is not suitable for IoT scenarios that are sensitive to latency, reliability and privacy, such as autonomous driving, virtual reality (VR) and augmented reality (AR), while the resources on any single IoT device are extremely limited. Therefore, distributed edge computing with cross-device collaboration over multiple interconnected heterogeneous IoT devices may become an effective solution, and an intelligent method for allocating computing tasks across heterogeneous devices is the key to its realization.

At present, in edge deep learning systems, distributed training and inference of deep learning models are mainly implemented by layer scheduling algorithms based on model partitioning: some layers of the model are allocated to the edge side and the remaining layers to the cloud center; edge servers mainly process lower-layer data, while cloud servers mainly process higher-layer data. Such a task allocation strategy does not involve the allocation of the underlying deep learning algorithms, which limits the effect of task scheduling and resource optimization.

SUMMARY

Embodiments of the present disclosure provide a task allocation method, a network training method, and apparatuses based on IoT devices.

The technical solutions of the embodiments of the present disclosure are implemented as follows:

In a first aspect, embodiments of the present disclosure provide a network training method based on IoT devices, the method including: determining a training data set, and training a first network based on the training data set; the training data set includes at least one task allocation strategy and the corresponding actual performance; an actual performance is obtained by actually executing the corresponding task allocation strategy; the first network is used to predict the performance of a task allocation strategy.
In some optional embodiments of the present disclosure, the method further includes: determining a computation graph corresponding to the task to be processed and a resource graph corresponding to the IoT devices, and generating at least one task allocation strategy based on the computation graph and the resource graph.

In some optional embodiments of the present disclosure, generating at least one task allocation strategy based on the computation graph and the resource graph includes: generating at least one resource subgraph based on the computation graph and the resource graph, each resource subgraph containing one task allocation strategy; the task allocation strategy is used to allocate at least one node of the resource graph to each node of the computation graph; a node in a resource subgraph represents at least part of the capabilities of an IoT device; an edge between two adjacent nodes in a resource subgraph represents a relationship between at least some capabilities of IoT devices.

In some optional embodiments of the present disclosure, generating at least one resource subgraph based on the computation graph and the resource graph includes:

determining a first node in the computation graph, the first node being the node with the largest resource demand;

determining at least one second node in the resource graph, the at least one second node being a node that satisfies the resource demand of the first node;

determining one resource subgraph based on each second node, each resource subgraph containing one task allocation strategy.

In some optional embodiments of the present disclosure, training the first network includes: training the first network based on the predicted performance and actual performance of at least one task allocation strategy.

In some optional embodiments of the present disclosure, obtaining the predicted performance of at least one task allocation strategy includes: obtaining, through the first network, the predicted performance corresponding to each resource subgraph based on the computation graph and each resource subgraph.

In some optional embodiments of the present disclosure, obtaining the predicted performance corresponding to each resource subgraph through the first network includes:

extracting features of the computation graph through a feature extraction module of the first network to obtain a first feature set;

extracting features of the at least one resource subgraph through the feature extraction module respectively to obtain at least one second feature set;

obtaining the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and a prediction module of the first network.

In some optional embodiments of the present disclosure, obtaining the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and the prediction module of the first network includes:

obtaining at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and one second feature set;

obtaining prediction data corresponding to each resource subgraph based on each third feature set and the prediction module, and obtaining the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, the prediction data includes at least one of the following:

a predicted execution duration of executing the task to be processed;

a predicted energy consumption of executing the task to be processed;

a predicted reliability of executing the task to be processed.

In some optional embodiments of the present disclosure, obtaining the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph includes: weighting the prediction data corresponding to each resource subgraph according to preset weights to obtain the predicted performance corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, training the first network includes: training the feature extraction module and the prediction module based on the predicted performance and actual performance of each task allocation strategy.

In some optional embodiments of the present disclosure, training the feature extraction module and the prediction module includes:

back-propagating the error between the predicted performance and the actual performance of each task allocation strategy, and updating the network parameters of the feature extraction module and the prediction module of the first network using a gradient descent algorithm until the error between the predicted performance and the actual performance satisfies a preset condition.

In some optional embodiments of the present disclosure, the method further includes: updating the training data set, the updated training data set being used to update the first network.

In some optional embodiments of the present disclosure, updating the training data set includes at least one of the following:

based on the computation graph and the resource graph, generating at least one resource subgraph using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method, obtaining the actual performance corresponding to each resource subgraph after actually executing the task allocation strategy corresponding to each resource subgraph, and adding the computation graph, each resource subgraph and the corresponding actual performance to the training data set;

based on the computation graph and the resource graph, generating at least one resource subgraph using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method, obtaining the predicted performance corresponding to each resource subgraph through the first network, selecting from the at least one resource subgraph the resource subgraph with the best predicted performance, obtaining the actual performance after actually executing the task allocation strategy corresponding to the resource subgraph with the best predicted performance, and adding the computation graph, the resource subgraph with the best predicted performance and the corresponding actual performance to the training data set;

based on the computation graph and the resource graph, generating at least one resource subgraph using a random walk method, obtaining the actual performance after actually executing the task allocation strategy corresponding to each resource subgraph, and adding the computation graph, the at least one resource subgraph and the actual performance to the training data set.
In a second aspect, embodiments of the present disclosure further provide a task allocation method based on IoT devices, the method including:

determining a computation graph corresponding to a task to be processed and a resource graph corresponding to IoT devices;

generating at least one task allocation strategy based on the computation graph and the resource graph;

inputting the at least one task allocation strategy into a first network to obtain the predicted performance corresponding to each task allocation strategy;

determining the task allocation strategy with the best predicted performance, and performing task allocation based on the determined task allocation strategy.

In some optional embodiments of the present disclosure, generating at least one task allocation strategy based on the computation graph and the resource graph includes: generating at least one resource subgraph based on the computation graph and the resource graph, each resource subgraph containing one task allocation strategy; the task allocation strategy is used to allocate at least one node of the resource graph to each node of the computation graph; a node in a resource subgraph represents at least part of the capabilities of an IoT device; an edge between two adjacent nodes in a resource subgraph represents a relationship between at least some capabilities of IoT devices.

In some optional embodiments of the present disclosure, generating at least one resource subgraph based on the computation graph and the resource graph includes: determining a first node in the computation graph, the first node being the node with the largest resource demand;

determining at least one second node in the resource graph, the at least one second node being a node that satisfies the resource demand of the first node;

determining one resource subgraph based on each second node, each resource subgraph containing one task allocation strategy.

In some optional embodiments of the present disclosure, the first network is optimized using the method described in the first aspect of the embodiments of the present disclosure.

In some optional embodiments of the present disclosure, obtaining the predicted performance corresponding to each task allocation strategy includes: obtaining, through the first network, the predicted performance corresponding to each resource subgraph based on the computation graph and each resource subgraph.

In some optional embodiments of the present disclosure, obtaining the predicted performance corresponding to each resource subgraph through the first network includes:

extracting features of the computation graph through a feature extraction module of the first network to obtain a first feature set;

extracting features of the at least one resource subgraph through the feature extraction module respectively to obtain at least one second feature set;

obtaining the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and a prediction module of the first network.

In some optional embodiments of the present disclosure, obtaining the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and the prediction module of the first network includes:

obtaining at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and one second feature set;

obtaining prediction data corresponding to each resource subgraph based on each third feature set and the prediction module, and obtaining the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, the prediction data includes at least one of the following:

a predicted execution duration of executing the task to be processed;

a predicted energy consumption of executing the task to be processed;

a predicted reliability of executing the task to be processed.

In some optional embodiments of the present disclosure, obtaining the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph includes: weighting the prediction data corresponding to each resource subgraph according to preset weights to obtain the predicted performance corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, the method further includes: after the task allocation is performed, acquiring the actual performance of the task to be processed when it is executed according to the corresponding task allocation strategy, and storing the corresponding task allocation strategy and the acquired actual performance into a training data set; the training data set is used to update the first network.
In a third aspect, embodiments of the present disclosure further provide a network training apparatus based on IoT devices, the apparatus including: a first determining unit and a training unit; wherein,

the first determining unit is configured to determine a training data set, the training data set including at least one task allocation strategy and the corresponding actual performance; an actual performance is obtained by actually executing the corresponding task allocation strategy;

the training unit is configured to train a first network based on the training data set; the first network is used to predict the performance of a task allocation strategy.

In some optional embodiments of the present disclosure, the apparatus further includes a first generating unit configured to determine a computation graph corresponding to the task to be processed and a resource graph corresponding to the IoT devices, and to generate at least one task allocation strategy based on the computation graph and the resource graph.

In some optional embodiments of the present disclosure, the first generating unit is configured to generate at least one resource subgraph based on the computation graph and the resource graph, each resource subgraph containing one task allocation strategy; the task allocation strategy is used to allocate at least one node of the resource graph to each node of the computation graph; a node in a resource subgraph represents at least part of the capabilities of an IoT device; an edge between two adjacent nodes in a resource subgraph represents a relationship between at least some capabilities of IoT devices.

In some optional embodiments of the present disclosure, the first generating unit is configured to determine a first node in the computation graph, the first node being the node with the largest resource demand; determine at least one second node in the resource graph, the at least one second node being a node that satisfies the resource demand of the first node; and determine one resource subgraph based on each second node, each resource subgraph containing one task allocation strategy.

In some optional embodiments of the present disclosure, the training unit is configured to train the first network based on the predicted performance and actual performance of at least one task allocation strategy.

In some optional embodiments of the present disclosure, the training unit is further configured to obtain, through the first network, the predicted performance corresponding to each resource subgraph based on the computation graph and each resource subgraph.

In some optional embodiments of the present disclosure, the training unit is configured to extract features of the computation graph through a feature extraction module of the first network to obtain a first feature set; extract features of the at least one resource subgraph through the feature extraction module respectively to obtain at least one second feature set; and obtain the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and a prediction module of the first network.

In some optional embodiments of the present disclosure, the training unit is configured to obtain at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and one second feature set; obtain prediction data corresponding to each resource subgraph based on each third feature set and the prediction module; and obtain the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, the prediction data includes at least one of the following:

a predicted execution duration of executing the task to be processed;

a predicted energy consumption of executing the task to be processed;

a predicted reliability of executing the task to be processed.

In some optional embodiments of the present disclosure, the training unit is configured to weight the prediction data corresponding to each resource subgraph according to preset weights to obtain the predicted performance corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, the training unit is configured to train the feature extraction module and the prediction module based on the predicted performance and actual performance of each task allocation strategy.

In some optional embodiments of the present disclosure, the training unit is configured to back-propagate the error between the predicted performance and the actual performance of each task allocation strategy, and update the network parameters of the feature extraction module and the prediction module of the first network using a gradient descent algorithm until the error between the predicted performance and the actual performance satisfies a preset condition.

In some optional embodiments of the present disclosure, the apparatus further includes an updating unit configured to update the training data set, the updated training data set being used to update the first network.

In some optional embodiments of the present disclosure, the updating unit is configured to update the training data set in at least one of the following manners:

based on the computation graph and the resource graph, generating at least one resource subgraph using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method, obtaining the actual performance corresponding to each resource subgraph after actually executing the task allocation strategy corresponding to each resource subgraph, and adding the computation graph, each resource subgraph and the corresponding actual performance to the training data set;

based on the computation graph and the resource graph, obtaining at least one resource subgraph using at least one of a heuristic method, a graph search method, a graph optimization method and a subgraph matching method, obtaining the predicted performance corresponding to each resource subgraph through the first network, selecting from the at least one resource subgraph the resource subgraph with the best predicted performance, obtaining the actual performance after actually executing the task allocation strategy corresponding to the resource subgraph with the best predicted performance, and adding the computation graph, the resource subgraph with the best predicted performance and the corresponding actual performance to the training data set;

based on the computation graph and the resource graph, generating at least one resource subgraph using a random walk method, obtaining the actual performance after actually executing the task allocation strategy corresponding to each resource subgraph, and adding the computation graph, the at least one resource subgraph and the actual performance to the training data set.
In a fourth aspect, embodiments of the present disclosure further provide a task allocation apparatus based on IoT devices, the apparatus including: a second determining unit, a second generating unit, a prediction unit and a task allocation unit; wherein,

the second determining unit is configured to determine a computation graph corresponding to a task to be processed and a resource graph corresponding to IoT devices;

the second generating unit is configured to generate at least one task allocation strategy based on the computation graph and the resource graph;

the prediction unit is configured to input the at least one task allocation strategy into a first network to obtain the predicted performance corresponding to each task allocation strategy;

the task allocation unit is configured to determine the task allocation strategy with the best predicted performance, and to perform task allocation based on the determined task allocation strategy.

In some optional embodiments of the present disclosure, the second generating unit is configured to generate at least one resource subgraph based on the computation graph and the resource graph, each resource subgraph containing one task allocation strategy; the task allocation strategy is used to allocate at least one node of the resource graph to each node of the computation graph; a node in a resource subgraph represents at least part of the capabilities of an IoT device; an edge between two adjacent nodes in a resource subgraph represents a relationship between at least some capabilities of IoT devices.

In some optional embodiments of the present disclosure, the second generating unit is configured to determine a first node in the computation graph, the first node being the node with the largest resource demand; determine at least one second node in the resource graph, the at least one second node being a node that satisfies the resource demand of the first node; and determine one resource subgraph based on each second node, each resource subgraph containing one task allocation strategy.

In some optional embodiments of the present disclosure, the first network is optimized by the apparatus described in the third aspect of the embodiments of the present disclosure.

In some optional embodiments of the present disclosure, the prediction unit is configured to obtain, through the first network, the predicted performance corresponding to each resource subgraph based on the computation graph and each resource subgraph.

In some optional embodiments of the present disclosure, the prediction unit is configured to extract features of the computation graph through a feature extraction module of the first network to obtain a first feature set; extract features of the at least one resource subgraph through the feature extraction module respectively to obtain at least one second feature set; and obtain the predicted performance corresponding to each resource subgraph based on the first feature set, each second feature set and a prediction module of the first network.

In some optional embodiments of the present disclosure, the prediction unit is configured to obtain at least one third feature set based on the first feature set and each second feature set, each third feature set including the first feature set and one second feature set; obtain prediction data corresponding to each resource subgraph based on each third feature set and the prediction module; and obtain the predicted performance corresponding to each resource subgraph based on the prediction data corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, the prediction data includes at least one of the following:

a predicted execution duration of executing the task to be processed;

a predicted energy consumption of executing the task to be processed;

a predicted reliability of executing the task to be processed.

In some optional embodiments of the present disclosure, the prediction unit is configured to weight the prediction data corresponding to each resource subgraph according to preset weights to obtain the predicted performance corresponding to each resource subgraph.

In some optional embodiments of the present disclosure, the apparatus further includes: an acquiring unit configured to acquire, after the task allocation is performed, the actual performance of the task to be processed when it is executed according to the corresponding task allocation strategy, and to store the corresponding task allocation strategy and the acquired actual performance into a training data set; the training data set is used to update the first network.

In a fifth aspect, embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the method described in the first or second aspect of the embodiments of the present disclosure.

In a sixth aspect, embodiments of the present disclosure further provide an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the steps of the method described in the first or second aspect of the embodiments of the present disclosure.

According to the technical solutions of the embodiments of the present disclosure, on the one hand, a training data set is determined and a first network is trained based on the training data set; the training data set includes at least one task allocation strategy and the corresponding actual performance; an actual performance, i.e. the actual value of performance, is obtained by actually executing the corresponding task allocation strategy; the first network is used to predict the performance of a task allocation strategy. On the other hand, a computation graph corresponding to a task to be processed and a resource graph corresponding to IoT devices are determined; at least one task allocation strategy is generated based on the computation graph and the resource graph; the at least one task allocation strategy is input into the first network to obtain the predicted performance, i.e. the predicted value of performance, corresponding to each task allocation strategy; the task allocation strategy with the best predicted performance is determined, and task allocation is performed based on the determined strategy. In this way, by training the first network, the system performance obtained when executing the task to be processed with a given task allocation strategy can be predicted, so that the best allocation strategy can be determined and the best match between the task to be processed and the available resources of the devices can be achieved during task allocation, thereby maximizing resource utilization and improving system performance.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of an optional scenario to which embodiments of the present disclosure are applied;

FIG. 2 is a schematic flowchart of a network training method based on IoT devices according to an embodiment of the present disclosure;

FIG. 3 is a schematic flowchart of a way of generating a resource subgraph in an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of generating a resource subgraph in an embodiment of the present disclosure;

FIG. 5 is a schematic flowchart of a way of obtaining predicted performance in an embodiment of the present disclosure;

FIG. 6a is a schematic diagram of feature extraction of a computation graph in an embodiment of the present disclosure;

FIG. 6b is a schematic diagram of feature extraction of a resource subgraph in an embodiment of the present disclosure;

FIG. 7 is a schematic flowchart of a task allocation method based on IoT devices according to an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of a task allocation method based on IoT devices according to an embodiment of the present disclosure;

FIG. 9 is a schematic diagram of the composition of a task allocation system according to an embodiment of the present disclosure;

FIG. 10 is a schematic diagram 1 of the composition and structure of a network training apparatus based on IoT devices according to an embodiment of the present disclosure;

FIG. 11 is a schematic diagram 2 of the composition and structure of a network training apparatus based on IoT devices according to an embodiment of the present disclosure;

FIG. 12 is a schematic diagram 3 of the composition and structure of a network training apparatus based on IoT devices according to an embodiment of the present disclosure;

FIG. 13 is a schematic diagram of the composition and structure of a task allocation apparatus based on IoT devices according to an embodiment of the present disclosure;

FIG. 14 is a schematic diagram of the hardware composition and structure of an electronic device according to an embodiment of the present disclosure.
DETAILED DESCRIPTION

The present disclosure is described in further detail below with reference to the accompanying drawings and specific embodiments.

On the one hand, with the breakthrough progress of deep learning technology and the promotion and popularization of 5G technology, fields such as the Internet of Vehicles, smart elderly care, smart communities and the industrial Internet will provide more and more intelligent services, whose realization often relies on artificial intelligence technology and deep learning models. On the other hand, with the rapid growth of the number and intelligence of IoT devices, in order to make full use of resource-constrained and highly heterogeneous IoT devices, the idle resources (also referred to as spare resources, available resources, idle capabilities, spare capabilities or available capabilities) on widely distributed IoT devices may be exploited, so that computation-intensive computing tasks are executed in parallel in a distributed manner through resource sharing and device collaboration. On this basis, the main purpose of the embodiments of the present disclosure lies in how to construct an end-to-end trainable network model for resource scheduling across heterogeneous IoT devices, and how to realize an automatically optimizable, high-performance and intelligently adaptive task allocation method. By learning long-term optimized resource management and task scheduling strategies, suitable computing, storage and communication resources are allocated to the nodes of the computation graph in the task allocation manner that achieves the best system performance, which facilitates decentralized machine learning with cross-device collaboration (e.g. training and inference of deep models), and further contributes to the realization of intelligent applications and services in IoT scenarios.

FIG. 1 is a schematic diagram of an optional scenario to which embodiments of the present disclosure are applied. As shown in FIG. 1, a Smart Home Service may include, but is not limited to, the following services: a Home Service Robot, Intelligent monitoring, Virtual Reality (VR), Intelligent control, and so on. Structured and unstructured data, including video, image, speech and text data, can be collected by IoT devices; the collected data are input into a designed network model, and the corresponding computing tasks are executed using the hardware resources on the IoT devices, thereby realizing various intelligent functions, such as Action Recognition, Natural Language Processing, Image Recognition and Face Recognition in an AI application. The above functions may be realized by executing computing tasks such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks, Graph Convolutional Networks (GCN) and Graph Neural Networks (GNN). The various abstract operations in the above networks used to implement computing tasks are split to obtain a series of operators, such as convolution (Conv), pooling and so on, each representing a certain type of operation in the task to be processed; the operators may form an operator library.

The scenario also includes multiple edge-side IoT devices, each of which may have different capabilities. The capabilities may be embodied by computation resources, storage/memory resources and communication resources. The computation resources may refer to available or idle computing resources, for example central processing unit (CPU) resources, graphics processing unit (GPU) resources, field-programmable gate array (FPGA) resources, digital signal processor (DSP) resources, etc.; the storage resources may refer to available or idle storage resources, for example memory resources, cache resources, random access memory (RAM), etc.; the communication resources may refer to available or idle communication resources. Different IoT devices may thus provide at least one of different computation, storage and communication resources, and the computation, storage and communication resources of the IoT devices may form a resource pool.

In this embodiment, each operator of a computing task is abstracted into a corresponding node to generate a computation graph, and the capabilities of each IoT device are abstracted into corresponding nodes to generate a resource graph; Resource Sub-graph Construction is performed based on the computation graph and the resource graph, Feature Extraction is performed on the computation graph and the resource subgraphs, and Performance Prediction is performed on the task allocation strategies implied therein according to the extracted features, thereby realizing intelligent task allocation.

It should be noted that the smart home service scenario shown in FIG. 1 is only one optional application scenario to which the technical solutions of the embodiments of the present disclosure are applicable; other application scenarios may also fall within the protection scope of the embodiments of the present disclosure, which is not specifically limited herein.

An embodiment of the present disclosure provides a network training method based on IoT devices, applied to various electronic devices including, but not limited to, fixed devices and/or mobile devices. For example, the fixed devices include but are not limited to a personal computer (PC) or a server, and the server may be a cloud server or an ordinary server. The mobile devices include but are not limited to one or more of a mobile phone, a tablet computer or a wearable device.
FIG. 2 is a schematic flowchart of the network training method based on IoT devices according to an embodiment of the present disclosure; as shown in FIG. 2, the method includes:

Step 101: determining a training data set; the training data set includes at least one task allocation strategy and the corresponding actual performance; an actual performance is obtained by actually executing the corresponding task allocation strategy;

Step 102: training a first network based on the training data set; the first network is used to predict the performance of a task allocation strategy.

In this embodiment, the task allocation strategy represents a strategy of allocating the task to be processed to at least one IoT device for execution; in other words, at least one IoT device can be determined through the task allocation strategy, and the task to be processed is executed by the at least one IoT device as indicated by the task allocation strategy. Optionally, the task allocation strategy may also be referred to as one of the following: a task allocation method, a task allocation manner, a task scheduling strategy, a task scheduling method, a task scheduling manner, a task orchestration strategy, a task orchestration method, a task orchestration manner, and the like.

In this embodiment, actually executing the task to be processed with different task allocation strategies may yield the same or different system performance, i.e. the above actual performance. The actual performance represents the performance of the system when actually executing the task to be processed (e.g. execution duration, energy consumption, reliability, etc.). In this embodiment, the first network is trained with the task allocation strategies in the training data set and the corresponding actual performance.

In this embodiment, the IoT devices in the system may be heterogeneous IoT devices. Here, heterogeneous IoT devices mean that, in a network containing multiple IoT devices and servers, the hardware of one IoT device differs from that of another, and/or the server of one IoT device differs from that of another. The hardware of one IoT device differing from that of another means that the model or type of the hardware corresponding to at least one of the computation resources and storage resources differs between the two devices. For example, taking the hardware corresponding to computation resources as an example, the model of at least one of the CPU, GPU, bus interface chip (BIC), DSP, FPGA, application-specific integrated circuit (ASIC), tensor processing unit (TPU) and artificial intelligence (AI) chip of one IoT device differs from the model of the corresponding hardware of another IoT device; taking the hardware corresponding to storage resources as an example, the model of at least one of the RAM, read-only memory (ROM) and cache of one IoT device differs from that of another. The server of one IoT device differing from that of another means that the back-end program corresponding to one IoT device differs from that of another; the back-end program may include an operating system, i.e. the operating systems corresponding to the two IoT devices differ; in other words, the two IoT devices differ at the software level.

Exemplarily, the IoT devices may include mobile phones, PCs, wearable smart devices, smart gateways, computing boxes, etc.; the PCs may include desktop computers, laptop computers, tablet computers, etc.; the wearable smart devices may include smart watches, smart glasses, etc.

In some optional embodiments of the present disclosure, the method further includes: determining a computation graph corresponding to the task to be processed and a resource graph corresponding to the IoT devices, and generating at least one task allocation strategy based on the computation graph and the resource graph.

In this embodiment, the computation graph may contain the task to be processed; the computation graph may include at least one node, each node corresponding to a certain operation or operator in the task to be processed, and the edges between nodes represent the relationships between adjacent nodes. Exemplarily, if a computation graph includes three nodes connected in sequence, it indicates that the task to be processed is implemented through the operators corresponding to the three nodes: after processing by the first operator of the first node, the output data is sent to the second node and processed by the second operator; the processed data is then sent to the third node and processed by the third operator, thereby completing the task to be processed.
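The three-node chain in the example above can be mimicked with three stand-in operators whose outputs feed forward along the edges; the operators below are arbitrary placeholders, not real Conv/Pooling kernels.

```python
def run_chain(operators, data):
    """Execute a linear computation graph: each node's operator consumes
    the previous node's output, as in the three-node example."""
    for op in operators:
        data = op(data)
    return data

# three placeholder operators standing in for the three graph nodes
result = run_chain([lambda v: v + 1, lambda v: v * 2, lambda v: v - 3], 4)
```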
本实施方式中,资源图中可包括系统中的各物联网设备的能力(或称为资源)。物联网设备的能力至少包括:计算能力、存储能力和通信能力中的至少一种。示例性的,资源图中可包括至少一个节点,每个节点对应于物联网设备的至少部分能力;一个示例中,一个节点可以表示一个物联网设备的所有能力(例如计算能力、存储能力和通信能力等);另一个示例中,一个节点可以表示一个物联网设备的部分能力,例如仅表示一个物联网设备的计算能力或存储能力,或者仅表示一个物联网设备的部分计算能力和/或部分存储能力。节点之间的边表示物联网设备的至少部分能力之间的关 系。
在一些可选实施方式中,所述基于所述计算图和资源图生成至少一种任务分配策略,包括:基于所述计算图和资源图生成至少一个资源子图,每个资源子图包含一种任务分配策略;所述任务分配策略用于为所述计算图的每个节点分配相应资源图的至少一个节点;资源子图中的一个节点表示物联网设备的至少部分能力;资源子图中两个相邻节点的边表示物联网设备的至少部分能力之间的关系。
本实施例中,一个任务分配策略用于为计算图中的每个节点分配相应资源图的至少一个节点,或者用于将待处理任务分配或映射到至少一个物联网设备上,或者,将待处理任务和物联网设备之间进行匹配,或者,将待处理任务与资源之间进行匹配。
实际应用时,为所述计算图的每个节点分配的所述资源图的至少一个节点可以相同或不同;也就是说,一个物联网设备可以利用自身的至少部分能力实现多个算子对应的计算单元,同时,多个物联网设备可以以协作的方式实现一个算子对应的计算单元。此外,计算图中没有计算依赖关系(即数据依赖关系)的节点(或算子)可以在相同或不同的物联网设备上并行地执行(或运算、计算)。
示例性的,一个任务分配策略可通过一个资源子图体现,换句话说,所述训练数据集中可包括计算图、至少一个资源子图和对应的实际性能。本实施例中,基于待处理任务构建的计算图以及包含有闲置资源的多种异构物联网设备构建的资源图,根据待处理任务对计算资源、存储资源以及通信资源的需求以及各物联网设备上的可用资源或能力之间的关系,从完整的资源图中生成至少一个资源子图,也即生成至少一种任务分配策略。资源子图构建实现了对物联网设备上闲散资源的充分利用以及细粒度的任务分配和优化。
可选地,本实施例中的资源图也可称为资源知识图或资源知识图谱;资源子图也可称为资源知识子图或资源知识子图谱等等。
可选地,图3为本公开实施例中一种资源子图的生成方式的流程示意图;如图3所示,资源子图的生成方法可包括:
步骤201:确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;
步骤202:确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;
步骤203:基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略。
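步骤201至步骤202的核心逻辑可用如下Python草图示意；其中各节点的资源需求与可用资源数值均为演示而假设：

```python
# 示意性草图：步骤201确定计算图中资源需求最大的第一节点(瓶颈节点)，
# 步骤202在资源图中找出所有满足该需求的第二节点；
# 每个第二节点即可按步骤203扩展为一个资源子图。数值均为假设。

def find_bottleneck(demands):
    """步骤201：返回计算图中资源需求最大的节点编号。"""
    return max(demands, key=demands.get)

def find_candidates(capacities, demand):
    """步骤202：返回资源图中可用资源不低于该需求的所有节点。"""
    return [v for v, cap in capacities.items() if cap >= demand]

demands = {1: 0.2, 2: 0.3, 3: 0.5, 4: 0.9, 5: 0.4, 6: 0.1}             # 计算图各节点的资源需求
capacities = {"V1": 0.6, "V2": 0.5, "V3": 1.0, "V4": 1.2, "V5": 0.95}  # 资源图各节点的可用资源

bottleneck = find_bottleneck(demands)                          # 第一节点：编号4
candidates = find_candidates(capacities, demands[bottleneck])  # 第二节点：V3、V4、V5
```

在上述假设数据下，编号为4的节点即瓶颈节点，三个满足其需求的资源节点各自对应一个候选资源子图。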
示例性的,图4为本公开实施例中的一种资源子图的生成示意图;结合图4所示,首先,针对一个计算图,对其中的每个节点进行编号。
示例性的,基于一个统一的规则给计算图中的节点进行标号。例如:首先确定计算图中节点数量最多的分支,按照该分支中的各节点的先后顺序依次编号。例如图4中的节点数量最多的分支具有5个节点,针对包含有5个节点的分支按照先后顺序编号;进一步给节点数量第二多的分支中所有节点编号,以此类推,直至计算图中所有节点均被编号。
其次,确定计算图的所有节点中的第一节点,所述第一节点为资源需求最大的节点,所述第一节点也可称为瓶颈节点。例如图4中计算图上编号为4的节点的资源需求最大,将编号为4的节点确定为第一节点或瓶颈节点。其中,资源需求最大的节点可以是指对计算资源、存储资源和通信资源中的至少一种资源的需求最大的节点。
进一步地，确定资源图中的至少一个第二节点，也即为第一节点（或瓶颈节点）分配合适的资源节点（也可称为设备节点，或能力节点），为待处理任务的执行提供可用的资源。示例性的，可以将资源图中能够满足第一节点（或瓶颈节点）的资源需求的节点均确定为第二节点，例如图4的资源图中三个编号为4的节点均能够满足第一节点（或瓶颈节点）的资源需求，则将三个编号为4的节点均确定为第二节点。通常情况下，资源图中能够满足第一节点（或瓶颈节点）资源需求的资源节点的数量不止一个，因此生成的资源子图也可以是不止一个。
再次,在资源图中,以每个对应于第一节点(或瓶颈节点)的资源节点为起点,例如以图4中资源图上的右侧标号为4的资源节点(记为节点V3)为起点,在资源图上搜索与其相邻的其他资源节点(例如图4资源图中的节点V1、V4和V5),分别为计算图中距离第一节点1跳远的节点(例如计算图中标号为3、5、6的节点)分配合适的资源节点,以满足相应工作负载的资源需求。例如,将资源图中的节点V1分配给了计算图中的节点3,将资源图中的节点V4分配给了计算图中的节点6,将资源图中的节点V5分配给了计算图中的节点5。进一步给计算图中距离第一节点2跳远的节点(例如计算图中标号为2的节点)分配资源节点。例如将资源图中的节点V2分配给了计算图中的节点2。依此类推,直至计算图中的所有节点均被分配了资源图中的资源节点。
其中,针对资源图中每个满足第一节点(或瓶颈节点)资源需求的资源节点,均按照上述步骤进行资源节点的分配,从而可得到如图4中右侧所示的三种任务分配策略,也即获得三个资源子图。通过多种任务分配策略(即资源子图)的构建,便于后续通过特征提取以及性能预测筛选出最优的任务分配策略。
本实施例上述资源子图的构建过程仅为一种示例,其他的任务分配策略的分配方式也可在本公开实施例的保护范围之内。
在一些可选实施例中,所述训练第一网络,包括:基于至少一种任务分配策略的预测性能和实际性能,训练所述第一网络。
本实施方式中,通过第一网络获得每种任务分配策略的预测性能,通过预测性能和实际性能以及误差反向传播方式,训练所述第一网络。
在一些可选实施例中,获得至少一种任务分配策略的预测性能,包括:基于计算图和每个资源子图,通过第一网络得到每个资源子图对应的预测性能。
本实施方式中,通过将计算图和一个资源子图输入至第一网络,得到所述资源子图对应的预测性能。
示例性的,图5为本公开实施例中一种预测性能的获得方式的流程示意图;如图5所示,预测性能的获得方法可包括:
步骤301:通过所述第一网络的特征提取模块提取所述计算图的特征,得到第一特征集;
步骤302:通过所述特征提取模块分别提取所述至少一个资源子图的特征,得到至少一个第二特征集;
步骤303:基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能。
本实施方式中,通过特征提取模块提取计算图的特征,得到第一特征集。所述第一特征集也可称为特征集合、特征、特征向量。为了通过图节点之间的消息传递来捕获图的依赖关系,利用特征提取模块分别提取计算图和资源子图的特征,包括CPU、GPU、FPGA、DSP以及内存等,主要涵盖了算力、存储、通信等维度的特征。
示例性的，确定计算图的输入特征集合，所述输入特征集合也可称为输入特征、输入特征向量、输入特征矩阵，包括所述计算图中的每个节点的输入特征信息；确定计算图的邻接矩阵，所述邻接矩阵表示所述计算图的拓扑结构信息，或者所述邻接矩阵表示所述计算图中的节点之间的关系；基于所述输入特征集合、所述邻接矩阵和所述特征提取模块提取所述计算图的特征，得到第一特征集。
可选地,所述计算图的输入特征集合包括所述计算图中的每个节点的特征向量;所述计算图中的每个节点的特征向量包括执行每个节点对应算子需要的资源信息,所需要的资源信息例如包括CPU、GPU、DSP、FPGA、内存占用率等。
可选地,对应于计算图的邻接矩阵中的元素表示每两个节点之间关系的强弱,元素的数值大小与对应的两个节点之间的传输数据大小相关。
示例性的,图6a为本公开实施例中的一种计算图的特征提取示意图;如图6a所示,计算图中包括6个节点,相应的,输入特征集合中包括6组特征向量,例如表示为:
$$X=\begin{bmatrix}x_{1}&x_{2}&x_{3}&x_{4}&x_{5}&x_{6}\end{bmatrix},\quad x_{i}=\begin{bmatrix}\text{CPU占用率}_{i}\\ \text{GPU占用率}_{i}\\ \text{DSP占用率}_{i}\\ \text{FPGA占用率}_{i}\\ \text{存储占用率}_{i}\end{bmatrix}$$
其中，每组特征向量（上述矩阵中的每一列）对应计算图上的一个节点，每组特征向量中包括对应于该节点的算子在被执行（或运算）时对各个资源的使用情况（或资源需求）的特征，即算子的硬件执行代价，或算子的硬件占用数据，例如可包括CPU占用率、GPU占用率、DSP占用率、FPGA占用率、存储占用率等特征。所述占用率也可称为占用、占用比例、占用比率、使用、使用率、使用比例、使用比率、利用、利用率、利用比例、利用比率。通过计算图中的节点之间的连接关系，可确定出节点1和节点2之间具有连接关系，节点2和节点3之间具有连接关系，节点3和节点4之间具有连接关系，节点4和节点5之间具有连接关系，节点1和节点6之间具有连接关系，节点6和节点4之间具有连接关系，由此可确定出邻接矩阵中相应元素（即图中所示的e_{12}、e_{16}、e_{23}、e_{34}、e_{45}、e_{64}）具有特定值，而其他元素的取值为0，邻接矩阵如下所示：
$$A=\begin{bmatrix}0&e_{12}&0&0&0&e_{16}\\ 0&0&e_{23}&0&0&0\\ 0&0&0&e_{34}&0&0\\ 0&0&0&0&e_{45}&0\\ 0&0&0&0&0&0\\ 0&0&0&e_{64}&0&0\end{bmatrix}$$
其中,上述元素的数值大小与相应的两个节点之间的传输数据大小相关。
示例性的，e = k·d；其中，k表示预设系数，d表示相邻两个节点之间的传输数据的大小。
例如：e_{34} = k·d_{34}；其中，d_{34}表示计算图中的节点3和节点4之间传输数据的大小。
则将例如图中所示的输入特征集合和邻接矩阵输入至特征提取模块,从而获得第一特征集。
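上述按 e = k·d 构建计算图邻接矩阵的过程，可用如下Python草图示意；连接关系取自上文（1-2、2-3、3-4、4-5、1-6、6-4），k与各传输数据量d的数值均为演示而假设：

```python
import numpy as np

# 示意性草图：按 e_ij = k * d_ij 填充计算图的邻接矩阵。
# k 与 d 的数值为假设，仅用于演示"有连接的元素取特定值、其余为0"。
k = 0.1
transfer = {(1, 2): 8.0, (2, 3): 4.0, (3, 4): 6.0,
            (4, 5): 2.0, (1, 6): 5.0, (6, 4): 3.0}  # d_ij：相邻节点间传输数据的大小

A = np.zeros((6, 6))
for (i, j), d in transfer.items():
    A[i - 1, j - 1] = k * d   # e_ij = k * d_ij，其余元素保持为0
```

由此得到的 A 中仅上文列出的六个位置非零，与邻接矩阵的描述一致。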
同理,通过特征提取模块提取资源子图的特征,得到第二特征集。所述第二特征集也可称为特征集合、特征、特征向量。
示例性的，确定资源子图的输入特征集合，所述输入特征集合包括所述资源子图中的每个节点的输入特征信息；确定资源子图的邻接矩阵，所述邻接矩阵表示所述资源子图的拓扑结构信息，或者所述邻接矩阵表示所述资源子图中的节点之间的关系；基于所述输入特征集合、所述邻接矩阵和所述特征提取模块提取所述资源子图的特征，得到第二特征集。
可选地,所述资源子图的输入特征集合包括所述资源子图中的每个节点的特征向量;所述资源子图中的每个节点的特征向量包括每个节点对应的物联网设备具有的至少部分资源信息(或称为能力信息),所述至少部分资源信息例如包括CPU、GPU、DSP、FPGA、内存等可用资源。
可选地,对应于资源子图的邻接矩阵中的元素表示每两个节点之间通信关系的强弱,元素的数值大小与对应的两个节点之间的传输速率和/或时 延等相关。
示例性的,图6b为本公开实施例中的一种资源子图的特征提取示意图;如图6b所示,为一种任务分配策略,即一种资源子图,图中包括6个节点,相应的,输入特征集合中包括6组特征向量,例如表示为:
$$X'=\begin{bmatrix}x_{V_1}&x_{V_2}&x_{V_3}&x_{V_4}&x_{V_5}&x_{V_6}\end{bmatrix},\quad x_{V_i}=\begin{bmatrix}\text{CPU资源}_{V_i}\\ \text{GPU资源}_{V_i}\\ \text{DSP资源}_{V_i}\\ \text{FPGA资源}_{V_i}\\ \text{存储资源}_{V_i}\end{bmatrix}$$
其中，每组特征向量（上述矩阵中的每一列）对应资源子图上的一个节点，每组特征向量中包括对应于各个节点的资源的特征，例如可包括CPU资源、GPU资源、DSP资源、FPGA资源、存储资源等特征。并且，通过资源子图中的节点之间的连接关系，可确定出节点V2和节点V1之间具有连接关系，节点V1和节点V3之间具有连接关系，节点V3和节点V4之间具有连接关系，节点V3和节点V5之间具有连接关系，节点V5和节点V6之间具有连接关系，由此可确定出邻接矩阵中相应元素（即图中所示的e_{13}、e_{21}、e_{34}、e_{35}、e_{56}）具有特定值，而其他元素的取值为0，邻接矩阵如下所示：
$$A'=\begin{bmatrix}0&0&e_{13}&0&0&0\\ e_{21}&0&0&0&0&0\\ 0&0&0&e_{34}&e_{35}&0\\ 0&0&0&0&0&0\\ 0&0&0&0&0&e_{56}\\ 0&0&0&0&0&0\end{bmatrix}$$
其中,上述元素的数值大小与相应的两个节点之间的传输速率和/或时延等相关。
示例性的，e = k_1·ts + k_2·l；其中，k_1和k_2分别表示预设系数，ts表示相邻两个节点之间的传输速率，l表示相邻两个节点之间的传输时延。
例如：e_{34} = k_1·ts_{34} + k_2·l_{34}；其中，ts_{34}表示资源子图中的节点V3和节点V4之间的传输速率；l_{34}表示资源子图中的节点V3和节点V4之间的时延。
则将例如图中所示的输入特征集合和邻接矩阵输入至特征提取模块,从而获得第二特征集。
本实施方式中,将上述计算图或资源子图对应的输入特征集合和邻接矩阵作为特征提取模块的输入,通过以下表达式(5)的前向传播算法进行特征更新;经过特征提取模块中的多层网络的前向传播,获得一个融合了计算图或资源子图中所有节点和边的特征的特征集(或称为特征向量),即得到对应于计算图的第一特征集和对应于资源子图的第二特征集。
$$H^{(l+1)}=\sigma\left(\tilde{D}^{-\frac{1}{2}}\,\tilde{A}\,\tilde{D}^{-\frac{1}{2}}\,H^{(l)}\,W^{(l)}\right)\qquad(5)$$
其中，$\tilde{A}=A+I$表示加入了自连接的邻接矩阵，I表示单位矩阵；$\tilde{D}_{ii}=\sum_{j}\tilde{A}_{ij}$，其中，i表示矩阵中的行数，j表示矩阵中的列数；H^{(l)}表示特征提取模块中的多层网络中的第l层的所有节点特征；H^{(0)}表示特征提取模块的输入层的所有节点的特征；W表示特征提取模块的可训练权重矩阵（即网络参数）；W^{(l)}表示特征提取模块中的多层网络中的第l层的可训练权重矩阵；σ(·)表示激活函数。
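表达式(5)的单层前向传播可用如下Python草图示意；其中的图结构、输入特征与权重数值均为演示而假设，激活函数取ReLU：

```python
import numpy as np

# 示意性草图：按表达式(5)实现一层前向传播
# H^(l+1) = σ( D̃^{-1/2} Ã D̃^{-1/2} H^(l) W^(l) )，σ 取 ReLU。

def gcn_layer(A, H, W):
    A_tilde = A + np.eye(A.shape[0])            # 加入自连接：Ã = A + I
    d = A_tilde.sum(axis=1)                     # 度：D̃_ii = Σ_j Ã_ij
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))      # D̃^{-1/2}
    return np.maximum(0, D_inv_sqrt @ A_tilde @ D_inv_sqrt @ H @ W)

A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])                    # 3节点链式图(对称邻接，数值为假设)
H0 = np.eye(3)                                  # 输入层各节点特征(假设为one-hot)
W0 = np.ones((3, 2))                            # 可训练权重(此处固定为1便于演示)
H1 = gcn_layer(A, H0, W0)                       # 更新后的节点特征，形状(3, 2)
```

多层堆叠该运算即可得到融合了图中所有节点和边特征的特征集。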
在一些可选实施例中,所述基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能,包括:基于所述第一特征集和每个第二特征集,获得至少一个第三特征集,每个第三特征集包括所述第一特征集和每个第二特征集;基于每个第三特征集和所述预测模块获得每个资源子图对应的预测数据,基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能。
本实施方式中,为了从至少一个任务分配策略中选择最佳的任务分配策略,则本实施方式中通过预测模块预测执行每个任务分配策略的系统性能。为了学习不同操作系统上任务调度和资源分配的内在统计学规律以适 应动态变化的物联网边缘计算环境,构建了预测模块,用于处理复杂问题的非线性回归预测,可以从隐含了不同任务分配策略和系统性能之间对应关系的大量历史样本中学习其统计学规律。
本实施方式中,预测模块的输入数据是利用特征提取模块分别得到的第一特征集和一个第二特征集的融合特征(即第三特征集)。示例性的,可将第一特征集和第二特征集拼接在一起,得到第三特征集。进一步将第三特征集输入至预测模块,通过预测模块中的多层网络的前向传播算法的层层迭代,获得每个资源子图对应的预测数据。
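上述"拼接得到第三特征集并输入预测模块"的过程，可用如下Python草图示意；其中的特征维度、网络层数与随机权重均为说明目的而假设，并非本公开限定的预测模块结构：

```python
import numpy as np

# 示意性草图：将第一特征集与第二特征集拼接为第三特征集，
# 再经一个两层全连接网络(预测模块的简化形式)得到预测数据。

rng = np.random.default_rng(0)

f1 = rng.random(8)                # 第一特征集(计算图特征，维度为假设)
f2 = rng.random(8)                # 第二特征集(某一资源子图特征)
f3 = np.concatenate([f1, f2])     # 第三特征集：拼接融合

W1, b1 = rng.random((16, 32)), np.zeros(32)   # 隐藏层参数(随机初始化，仅为演示)
W2, b2 = rng.random((32, 3)), np.zeros(3)     # 输出层参数

h = np.maximum(0, f3 @ W1 + b1)   # 隐藏层 + ReLU
pred = h @ W2 + b2                # 预测数据：三个分量，对应(λ_t, λ_e, λ_r)
```

对每个资源子图重复上述前向传播，即得到每个资源子图对应的预测数据。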
在一些可选实施例中，所述预测数据可以表示性能指标（或称关键性能指标、系统性能指标、关键系统性能指标）的预测值，包括以下至少之一：执行所述待处理任务的预测执行时长；执行所述待处理任务的预测能耗；执行所述待处理任务的预测可靠性。所述预测执行时长即执行时长的预测值，所述预测能耗即能耗的预测值，所述预测可靠性即可靠性的预测值。示例性的，预测数据可以是一个包括三个数据或三个分量的向量；其中，一个数据或分量表示执行所述待处理任务的预测执行时长，例如可记为λ_t，一个数据或分量表示执行所述待处理任务的预测能耗，例如可记为λ_e，一个数据或分量表示执行所述待处理任务的预测可靠性，例如可记为λ_r，则基于上述预测数据确定每个资源子图对应的预测性能。这里，所述性能也可称为整体系统性能。
在一些可选实施例中,所述基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能,包括:根据预设权重,对每个资源子图对应的预测数据进行加权处理,获得每个资源子图对应的预测性能。
本实施方式中，以预测数据包括三个分量为例，则按照每个分量对应的预设权重进行加权处理，即按照表达式(6)获得对应的预测性能η；其中，Q(·)表示一种函数，其中包括了对每个分量或数据或(关键)性能指标的加权信息。
η = Q(λ_t, λ_e, λ_r, …)           (6)
由于表达式(6)的函数表达式的具体形式,即预设权重的具体信息,取决于不同场景对时延、能耗、可靠性等的不同要求,或者说重视程度或关注度,即通过使用特定函数给不同性能指标进行加权来实现多种性能指标之间的权衡,根据所设定的公式计算各项关键性能指标的加权值以得到整体系统性能,也就是说,通过表达式(6)得到的预测性能反映了与服务质量(QoS,Quality of Service)相关的整体系统性能。
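表达式(6)的一种最简单的取法可用如下Python草图示意；此处假设Q为线性加权，并假设时延与能耗取负号（数值越小越好）、可靠性取正号，权重数值同样为假设，实际中取决于场景对各指标的关注度：

```python
# 示意性草图：按表达式(6)以预设权重对各项预测数据加权，得到整体预测性能η。
# 线性加权形式、符号约定与权重数值均为演示而假设。

def predict_performance(lam_t, lam_e, lam_r, w=(0.5, 0.3, 0.2)):
    """η = w_t*(-λ_t) + w_e*(-λ_e) + w_r*λ_r：时延、能耗越小越好，可靠性越大越好。"""
    w_t, w_e, w_r = w
    return w_t * (-lam_t) + w_e * (-lam_e) + w_r * lam_r

eta = predict_performance(lam_t=0.8, lam_e=0.5, lam_r=0.9)
```

不同场景只需替换权重（或Q的函数形式）即可实现多种性能指标之间的权衡。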
在一些可选实施例中,所述训练第一网络,包括:基于每种任务分配策略的预测性能和实际性能,训练所述特征提取模块和所述预测模块。
本实施方式中,具体基于每种任务分配策略的预测性能以及训练数据集中的实际性能,更新特征提取模块和预测模块的网络参数,从而实现对所述特征提取模块和所述预测模块的训练。
在一些可选实施例中,所述训练所述特征提取模块和所述预测模块,包括:将每种任务分配策略的预测性能和实际性能的误差进行反向传播,利用梯度下降算法,对第一网络的特征提取模块和预测模块的网络参数进行更新,直至预测性能和实际性能之间的误差满足预设条件。
示例性的,所述预测性能和实际性能之间的误差满足预设条件可以是,预测性能和实际性能之间的误差小于预设阈值。
本实施方式中,基于特征提取模块的多层网络提取到的特征,利用预测模块从不同任务分配策略和系统性能(执行时间、功耗、可靠性)的对应关系中学习多种异构物联网设备的不同操作系统上任务调度的内在统计学规律,实现了在任务执行前对于给定的任务分配策略的系统性能预测,以便于从多个资源子图包含的不同任务分配策略中选择能够得到最佳系统性能的任务分配策略。通过实现待处理计算任务和物联网设备可用资源之 间的最佳匹配,达到资源利用最大化,进而提升整体系统性能。
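"将预测性能与实际性能的误差反向传播、以梯度下降更新参数直至误差满足预设条件"的训练流程，可用如下Python草图示意；其中用一个单层线性模型代替特征提取模块与预测模块，数据与阈值均为演示而假设：

```python
import numpy as np

# 示意性草图：以均方误差衡量预测性能与实际性能的差距，
# 用梯度下降更新参数，直至误差小于预设阈值(对应"满足预设条件")。

rng = np.random.default_rng(1)
X = rng.random((32, 4))                  # 32个样本的融合特征(假设)
true_w = np.array([0.4, -0.2, 0.1, 0.3])
y = X @ true_w                           # 各样本的实际性能(假设由线性关系给出)

w = np.zeros(4)                          # 待训练的网络参数
lr, eps = 0.1, 1e-4                      # 学习率与预设误差阈值(假设)
for _ in range(5000):
    err = X @ w - y                      # 预测性能与实际性能的误差
    loss = float(np.mean(err ** 2))
    if loss < eps:                       # 误差满足预设条件则停止训练
        break
    w -= lr * (2 / len(y)) * X.T @ err   # 沿梯度方向更新参数(误差反向传播的最简形式)
```

实际的第一网络中，该更新同时作用于特征提取模块与预测模块的网络参数。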
在本公开的一些可选实施例中,所述方法还包括:更新所述训练数据集,更新后的所述训练数据集用于更新所述第一网络。所述更新所述第一网络即训练所述第一网络,或对第一网络进行优化(即优化第一网络)。其中,训练数据集的更新可包括:更新任务分配策略(例如资源子图)以及更新实际性能;可选地,训练数据集中还可包括计算图;则训练数据集的更新还可包括更新计算图。
在一些可选实施例中,所述更新所述训练数据集,包括以下至少之一:
基于计算图和资源图,采用启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法生成至少一种资源子图,并按照每种资源子图对应的任务分配策略实际执行后,得到每种资源子图对应的实际性能,将所述计算图、每种资源子图和对应的实际性能加入所述训练数据集;
基于计算图和资源图,采用启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法生成至少一种资源子图,通过第一网络得到每个资源子图对应的预测性能,从所述至少一种资源子图中选择预测性能最佳的资源子图,并按照所述预测性能最佳的资源子图对应的任务分配策略实际执行后得到实际性能,将所述计算图、所述预测性能最佳的资源子图和对应的实际性能加入所述训练数据集;
基于计算图和资源图,采用随机游走方法生成至少一种资源子图,并按照每种资源子图对应的任务分配策略实际执行后得到实际性能,将所述计算图、至少一种资源子图和实际性能加入所述训练数据集。
本实施方式中,通过持续更新的训练数据集周期性地训练第一网络,从而使基于第一网络的任务分配系统或平台具有自学习和自适应能力,实现智能自适应,达到“越用越聪明”的效果。
其中,训练数据集的更新至少包括上述几种方式:采用启发式方法、 图搜索方法、图优化方法和子图匹配方法中的至少一种方法获得至少一种任务分配策略,并依据相应的至少一种任务分配策略进行实际的任务执行后,记录对应的实际性能,将所采用的至少一种任务分配策略以及对应的实际性能作为新的样本数据加入训练数据集;第二种方式是首先采用启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法获得至少一种任务分配策略,通过第一网络确定其中预测性能最佳的任务分配策略,实际按照该预测性能最佳的任务分配策略进行任务执行后,将得到的实际性能进行记录,并将该预测性能最佳的任务分配策略及对应的实际性能作为新的样本数据加入训练数据集;第三种方式是,可以在系统不繁忙的时候通过在资源图上进行随机游走的方式进行待处理任务算子的分配,以生成带有不同分配策略的多种资源子图,并按照每种资源子图对应的任务分配策略实际执行后得到实际性能,将所述任务分配策略和实际性能加入所述训练数据集,从而克服基于贪婪启发式的简单固定且模式受限的资源子图构建方法易落入局部最优解的缺陷,增加任务分配策略的多样性,从而获得更可能得到最佳系统性能的任务分配策略。
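其中第三种方式所述的"在资源图上随机游走"可用如下Python草图示意；资源图的邻接表、起点与游走步数均为演示而假设：

```python
import random

# 示意性草图：在资源图上随机游走，游走经过的资源节点集合
# 即构成一个候选资源子图(对应一种任务分配策略)。数据为假设。

def random_walk_subgraph(adj, start, steps, seed=None):
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        neighbors = adj[path[-1]]
        if not neighbors:                # 无邻居则提前结束
            break
        path.append(rng.choice(neighbors))
    return set(path)                     # 去重后的节点集合即一个候选子图

adj = {"V1": ["V2", "V3"], "V2": ["V1"], "V3": ["V1", "V4", "V5"],
       "V4": ["V3"], "V5": ["V3", "V6"], "V6": ["V5"]}
sub = random_walk_subgraph(adj, start="V3", steps=4, seed=42)
```

以不同起点和随机种子重复游走，即可得到带有不同分配策略的多种资源子图，增加训练数据集的多样性。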
可选地,本公开实施例上述资源子图的生成方法(即确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略)可以是一种启发式方法,或者是一种图搜索方法,或者是一种图优化方法,又或者是一种子图匹配方法,本实施例中不限于上述资源子图的生成方法获得任务分配策略,也可采用其他的启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法获得任务分配策略。
本公开实施例还提供了一种基于物联网设备的任务分配方法。图7为本公开实施例的基于物联网设备的任务分配方法的流程示意图；如图7所示，所述方法包括：
步骤401:确定待处理任务对应的计算图,以及物联网设备对应的资源图;
步骤402:基于所述计算图和所述资源图,生成至少一种任务分配策略;
步骤403:将所述至少一种任务分配策略输入第一网络,获得每种任务分配策略对应的预测性能;
步骤404:确定预测性能最佳的任务分配策略,基于确定的任务分配策略进行任务分配。
本实施例中的第一网络可参照前述网络训练方法实施例中的详细阐述进行优化,以获得优化后的第一网络。
本实施方式中,所述任务分配策略表示将待处理任务分配到至少一个物联网设备执行的策略;换句话说,通过所述任务分配策略,可以确定至少一个物联网设备,通过至少一个物联网设备按照任务分配策略的指示执行待处理任务。可选地,所述任务分配策略也可以称为以下其中一种:任务分配方法、任务分配方式、任务调度策略、任务调度方法、任务调度方式、任务编排策略、任务编排方法、任务编排方式等等。
本实施例步骤401至步骤404具体可参照前述网络训练方法实施例的具体阐述。区别在于,本实施例中,针对每种任务分配策略均通过第一网络获得对应的预测性能,从中选择预测性能最佳的任务分配策略进行任务分配。示例性的,通过第一网络可获得每种任务分配策略对应的系统性能的预测值;选择最大预测值对应的任务分配策略进行任务分配。
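"选择最大预测值对应的任务分配策略"这一步可用如下Python草图示意；各策略的预测值为演示而假设：

```python
# 示意性草图：对每种任务分配策略(资源子图)的预测性能取最大值，
# 预测值最大者即为用于实际任务分配的最佳策略。数值为假设。

predicted = {"子图1": 0.62, "子图2": 0.81, "子图3": 0.74}   # 各策略的系统性能预测值
best = max(predicted, key=predicted.get)                    # 预测性能最佳的任务分配策略
```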
在本公开的一些可选实施例中,所述基于所述计算图和所述资源图生成至少一种任务分配策略,包括:基于所述计算图和所述资源图生成至少一个资源子图,每个资源子图包含一种任务分配策略;所述任务分配策略 用于为所述计算图的每个节点分配相应资源图的至少一个节点;资源子图中的一个节点表示物联网设备的至少部分能力;资源子图中两个相邻节点的边表示物联网设备的至少部分能力之间的关系。
本实施例具体可参照前述网络训练方法实施例中的详细阐述,这里不再赘述。
在本公开的一些可选实施例中,所述基于所述计算图和所述资源图生成至少一个资源子图,包括:确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略。
本实施例具体可参照前述网络训练方法实施例中的详细阐述(例如包括图3和图4所示的详细记载),这里不再赘述。
在本公开的一些可选实施例中,所述获得每种任务分配策略对应的预测性能,包括:基于计算图和每个资源子图,通过第一网络得到每个资源子图对应的预测性能。
本实施例具体可参照前述网络训练方法实施例中的详细阐述,这里不再赘述。
在一些可选实施例中,所述通过第一网络得到每个资源子图对应的预测性能,包括:通过所述第一网络的特征提取模块提取所述计算图的特征,得到第一特征集;通过所述特征提取模块分别提取所述至少一个资源子图的特征,得到至少一个第二特征集;基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能。
本实施例具体可参照前述网络训练方法实施例中的详细阐述(例如包括图5、图6a和图6b所示的详细记载),这里不再赘述。
在一些可选实施例中，所述基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能，包括：基于所述第一特征集和每个第二特征集，获得至少一个第三特征集，每个第三特征集包括所述第一特征集和每个第二特征集；基于每个第三特征集和所述预测模块获得每个资源子图对应的预测数据，基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能。
本实施例具体可参照前述网络训练方法实施例中的详细阐述,这里不再赘述。
在一些可选实施例中，所述预测数据包括以下至少之一：执行所述待处理任务的预测执行时长；执行所述待处理任务的预测能耗；执行所述待处理任务的预测可靠性。
本实施例具体可参照前述网络训练方法实施例中的详细阐述,这里不再赘述。
在一些可选实施例中,所述基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能,包括:根据预设权重,对每个资源子图对应的预测数据进行加权处理,获得每个资源子图对应的预测性能。
本实施例具体可参照前述网络训练方法实施例中的详细阐述,这里不再赘述。
在一些可选实施例中,所述确定预测性能最佳的任务分配策略,基于确定的任务分配策略进行任务分配,包括:根据每个资源子图对应的预测性能,选择预测性能(即整体系统性能指标的预测值)数值最大的一个对应的任务分配策略,并在实际中按该策略进行任务分配。
在本公开的一些可选实施例中,所述方法还包括:在进行任务分配后,获取所述待处理任务按照相应的任务分配策略被执行时的实际性能;并将相应的任务分配策略及获取的实际性能存储至训练数据集;所述训练数据集用于更新所述第一网络。
本实施方式中，按照上述确定的预测性能最佳的任务分配策略进行任务分配后，将待处理任务按照该任务分配策略实际执行，获得实际性能（或实际整体系统性能），将相应的任务分配策略及获取的实际性能存储至用于更新第一网络的训练数据集，以构成训练数据集，或者更新训练数据集。
下面结合一个具体的示例对本公开实施例的任务分配进行说明。
图8为本公开实施例的基于物联网设备的任务分配方法的示意图;如图8所示,第一步,针对一个待处理任务,确定待处理任务的计算图;可选地,可对计算图进行优化,例如将某些节点进行合并,得到优化后的计算图。需要说明的是,本公开前述实施例中的计算图,可以是指优化后的计算图。进一步按照一定规则对计算图中的各节点进行编号。
第二步,基于计算图以及根据系统中的各物联网设备的资源和能力情况构建的资源图,生成至少一个资源子图。其中,资源子图的生成方式可参照前述实施例中所述,这里不再赘述。并且图8中所示的资源子图具体可参见图4中的资源子图。
第三步,分别对计算图和资源子图进行预处理,具体是确定计算图对应的输入特征集合(也可称为输入特征,或输入特征矩阵)以及邻接矩阵,以及确定每个资源子图对应的输入特征集合(也可称为输入特征,或输入特征矩阵)以及邻接矩阵,进一步分别将计算图对应的输入特征集合(也可称为输入特征,或输入特征矩阵)以及邻接矩阵输入至特征提取模块进行特征提取,获得第一特征集。将每个资源子图对应的输入特征集合(也可称为输入特征,或输入特征矩阵)以及邻接矩阵输入至特征提取模块进行特征提取,获得第二特征集。其中,示例性的,特征提取模块可通过图卷积神经网络(GCN)实现。
第四步,将第一特征集和第二特征集融合为第三特征集,将第三特征集输入至预测模块进行系统性能预测,得到每个资源子图对应的预测性能。 其中,示例性的,所述预测模块可通过深度神经网络(DNN)实现。
这里，通过预测模块可获得例如至少包括执行所述待处理任务的预测执行时长λ_t、执行所述待处理任务的预测能耗λ_e、执行所述待处理任务的预测可靠性λ_r等性能指标的预测数据，并通过预设权重对这些预测数据进行加权处理，获得每个资源子图对应的预测性能η。进一步从获得的各资源子图对应的预测性能中选择η取值最大的资源子图，并根据其表示的任务分配策略进行任务分配。
可选地,在每次任务实际执行后记录获得的实际性能,通过将预测性能与实际性能进行比较,确定预测性能与实际性能之间的误差,再通过误差反向传播和梯度下降算法可对第一网络所包含的特征提取模块和预测模块的网络参数进行更新,从而实现对特征提取模块和预测模块的训练。
图9为本公开实施例的任务分配系统或平台的组成示意图;如图9所示,本实施例的任务分配系统包括训练数据集构建、训练阶段、推理阶段以及持续学习几个部分。其中:
训练数据集构建:将由不同的待处理任务构建的计算图和多种物联网设备的资源和能力情况构建的资源图作为输入,通过资源子图构建模块构建多个包含不同任务分配策略的资源子图,并将其在物联网设备上进行实际的部署,在任务执行后记录相应的实际性能,将每一组任务分配策略(即资源子图)及其所得到的实际性能作为训练数据,从而完成初始训练数据集的构建。实际情况下,训练数据集中还包括对应于各待处理任务的计算图。
训练阶段：输入为训练数据集中的所有训练数据（又称为训练样本），其中，每个训练数据包括：计算图、资源子图、对应的实际系统性能。将训练数据输入本公开实施例的网络模型中，将得到的系统性能指标预测值η_p和系统性能的实际值η_t之间的误差通过梯度下降算法进行反向传播，以更新网络模型中特征提取模块和预测模块的网络参数（例如权重）直至收敛，最终得到的网络参数（或称为模型参数）将是在误差可接受范围内（可在算法中手动设置），能够使得训练样本的系统性能指标预测值与实际值最接近的。
推理阶段：根据训练好的网络模型，基于待处理任务对资源的需求和物联网设备所能提供的能力或资源情况，得到预测性能最佳的任务分配策略，并按照该策略将待处理任务在物联网设备上进行实际的任务分配。输入数据是由待执行任务构建的计算图，以及由具有闲散资源的物联网设备构建的资源图。将两个包含大量关于算力、存储和通信的隐含信息的图结构（即计算图和资源图）输入到网络模型，通过其中的资源子图构建、特征提取和性能预测三个模块，得到对应于各资源子图（即不同任务分配策略）的系统性能预测值。选择系统性能预测值最大（即η_p最大）的任务分配策略作为最佳任务分配策略，并按此任务分配策略在实际执行过程中将待处理任务的算子部署在对应的物联网设备上，完成任务的部署和执行。
持续学习阶段:持续学习机制的实现是利用持续更新的训练数据集周期性地训练网络模型中的特征提取模块和预测模块中的网络参数,从而使任务分配系统、平台具有自学习和自适应能力,实现智能自适应,达到“越用越聪明”的效果。具体可采用历史样本积累和随机游走方式实现。历史样本积累是在每次计算任务实际执行后对其采用的任务分配策略以及得到的实际系统性能进行记录,并作为新的训练样本存入训练数据集中。随机游走的具体实施方法如前文所述。
本公开以上各实施例提出了一种使能物联网分布式边缘计算系统高效深度学习的智能计算任务分配方法(ICTA,Intelligent Computing Task Allocation),并基于ICTA构建了跨异构物联网设备的深度学习任务智能分配系统、平台。ICTA主要包括:资源子图构建、特征提取、性能预测。输 入为由当前深度学习任务构建的计算图和有闲散资源的物联网边缘设备构建的资源图,基于两者并采用图搜索和子图匹配等方法进行资源子图构建,以生成多个携带不同任务分配策略的资源子图,实现了对物联网设备上可用资源的充分利用以及待处理任务的算子级分配和优化;利用多层神经网络分别进行特征提取和性能预测,将资源子图和计算图分别放入特征提取模块进行特征提取和融合,充分挖掘两类图中隐藏在节点和图拓扑结构中的关于算力、存储和通信等维度的特征。随后将融合的特征放入性能预测模块中进行系统性能的预测,通过对特征提取和性能预测模块的端到端训练,学习不同任务分配策略和系统性能的对应关系,以及不同操作系统上任务调度的内在统计学规律,实现在任务实际执行前对给定任务分配策略的准确的系统性能预测,以便从备选方案中选择最优分配策略,实现了计算任务和可用资源之间的最佳匹配,从而最大化资源利用率,提升整体系统性能;引入持续学习机制,利用不断更新的训练数据集对ICTA进行周期性训练,进一步提升系统性能及对动态变化的环境的适应性,使其具备自适应和自学习能力,实现智能自适应,使得任务分配系统达到“越用越聪明”的效果。
具体而言,本公开及以上各实施例具有以下技术关键点和优势:
1.构建了基于智能的任务分配方法的跨异构物联网设备的深度学习任务智能分配系统、平台,包含训练数据集的构建阶段、训练阶段、推理阶段、持续学习阶段。为跨异构物联网设备的深度学习模型分布式训练和推理提供了构建思路,促进生成端到端自动优化的跨物联网异构设备的分布式边缘计算生态模式;
2.提出智能的计算任务分配方法(ICTA),使能物联网分布式边缘计算系统中的高效深度学习,包含资源子图构建、特征提取、性能预测等,实现跨异构物联网设备的高性能且智能自适应的深度学习计算任务的最优分 配;
3.特征提取模块分别提取资源子图和计算图的节点和拓扑结构特征,并进行特征融合。实现对深度学习计算任务的性能起到关键作用的算力、存储、通信等维度特征的深层感知、特征提取和特征匹配;
4.性能预测是基于融合特征、利用多层神经网络学习不同操作系统上任务调度的内在统计学规律,通过对特征提取和性能预测模块的端到端训练,挖掘不同任务分配策略和系统性能之间的对应关系,进行系统性能的预测,以便从备选方案中选择对应最佳系统性能(预测)的任务分配策略,来实际地执行深度学习任务。实现深度学习计算任务和物联网设备上可用资源之间的最佳匹配,从而最大化资源利用,提升系统性能。
5.持续学习机制的实现主要有两种方式:历史样本积累、随机游走,基于这两种方式不断更新训练集,并对特征提取模块和预测模块进行周期性地训练,以提升系统性能,并适应环境的动态变化,使其具备自适应和自学习能力,实现智能自适应,达到“越用越聪明”的效果。
本公开实施例提供了一种基于物联网设备的网络训练装置。图10为本公开实施例基于物联网设备的网络训练装置的组成结构示意图一;如图10所示,所述装置包括:第一确定单元11和训练单元12;其中,
所述第一确定单元11,配置为确定训练数据集,所述训练数据集中包括至少一种任务分配策略及对应的实际性能;一个实际性能是基于对应的任务分配策略进行实际执行而获得;
所述训练单元12,配置为基于所述训练数据集训练第一网络;所述第一网络用于预测任务分配策略的性能。
在本公开的一些可选实施例中,如图11所示,所述装置还包括第一生成单元13,配置为确定待处理任务对应的计算图,以及物联网设备对应的资源图,基于所述计算图和资源图,生成至少一种任务分配策略。
在本公开的一些可选实施例中,所述第一生成单元13,配置为基于所述计算图和资源图生成至少一个资源子图,每个资源子图包含一种任务分配策略;所述任务分配策略用于为所述计算图的每个节点分配相应资源图的至少一个节点;资源子图中的一个节点表示物联网设备的至少部分能力;资源子图中两个相邻节点的边表示物联网设备的至少部分能力之间的关系。
在本公开的一些可选实施例中,所述第一生成单元13,配置为确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略。
在本公开的一些可选实施例中,所述训练单元12,配置为基于至少一种任务分配策略的预测性能和实际性能,训练所述第一网络。
在本公开的一些可选实施例中,所述训练单元12,还配置为基于计算图和每个资源子图,通过第一网络得到每个资源子图对应的预测性能。
在本公开的一些可选实施例中,所述训练单元12,配置为通过所述第一网络的特征提取模块提取所述计算图的特征,得到第一特征集;通过所述特征提取模块分别提取所述至少一个资源子图的特征,得到至少一个第二特征集;基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能。
在本公开的一些可选实施例中,所述训练单元12,配置为基于所述第一特征集和每个第二特征集,获得至少一个第三特征集,每个第三特征集包括所述第一特征集和每个第二特征集;基于每个第三特征集和所述预测模块获得每个资源子图对应的预测数据,基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能。
在本公开的一些可选实施例中，所述预测数据包括以下至少之一：
执行所述待处理任务的预测执行时长;
执行所述待处理任务的预测能耗;
执行所述待处理任务的预测可靠性。
在本公开的一些可选实施例中,所述训练单元12,配置为根据预设权重,对每个资源子图对应的预测数据进行加权处理,获得每个资源子图对应的预测性能。
在本公开的一些可选实施例中,所述训练单元12,配置为基于每种任务分配策略的预测性能和实际性能,训练所述特征提取模块和所述预测模块。
在本公开的一些可选实施例中,所述训练单元12,配置为将每种任务分配策略的预测性能和实际性能的误差进行反向传播,利用梯度下降算法,对第一网络的特征提取模块和预测模块的网络参数进行更新,直至预测性能和实际性能之间的误差满足预设条件。
在本公开的一些可选实施例中,如图12所示,所述装置还包括更新单元14,配置为更新所述训练数据集,更新后的所述训练数据集用于更新(或称为训练)所述第一网络。
在本公开的一些可选实施例中,所述更新单元14,配置为采用以下至少一种方式更新所述训练数据集:
基于计算图和资源图,采用启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法生成至少一种资源子图,并按照每种资源子图对应的任务分配策略实际执行后,得到每种资源子图对应的实际性能,将所述计算图、每种资源子图和对应的实际性能加入所述训练数据集;
基于计算图和资源图，采用启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法生成至少一种资源子图，通过第一网络得到每个资源子图对应的预测性能，从所述至少一种资源子图中选择预测性能最佳的资源子图，并按照所述预测性能最佳的资源子图对应的任务分配策略实际执行后得到实际性能，将所述计算图、所述预测性能最佳的资源子图和对应的实际性能加入所述训练数据集；
基于计算图和资源图,采用随机游走方法生成至少一种资源子图,并按照每种资源子图对应的任务分配策略实际执行后得到实际性能,将所述计算图、至少一种资源子图和实际性能加入所述训练数据集。
其中,本公开实施例中所述的资源子图的生成方法可以是一种启发式方法,或者是一种图搜索方法,又或者是一种子图匹配方法,本实施例中不限于上述资源子图的生成方法生成任务分配策略,也可采用其他的启发式方法、图搜索方法和子图匹配方法中的至少一种方法生成任务分配策略。
本公开实施例中,所述装置中的第一确定单元11、训练单元12、第一生成单元13和更新单元14,在实际应用中均可由CPU、GPU、DSP、微控制单元(MCU,Microcontroller Unit)或FPGA、TPU、ASIC、或AI芯片等实现。
需要说明的是:上述实施例提供的基于物联网设备的网络训练装置在进行网络训练时,仅以上述各程序模块的划分进行举例说明,实际应用中,可以根据需要而将上述处理分配由不同的程序模块完成,即将装置的内部结构划分成不同的程序模块,以完成以上描述的全部或者部分处理。另外,上述实施例提供的基于物联网设备的网络训练装置与基于物联网设备的网络训练方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
本公开实施例还提供了一种基于物联网设备的任务分配装置。图13为本公开实施例基于物联网设备的任务分配装置的组成结构示意图;如图13所示,所述装置包括:第二确定单元21、第二生成单元22、预测单元23 和任务分配单元24;其中,
所述第二确定单元21,配置为确定待处理任务对应的计算图,以及物联网设备对应的资源图;
所述第二生成单元22,配置为基于所述计算图和所述资源图,生成至少一种任务分配策略;
所述预测单元23,配置为将所述至少一种任务分配策略输入第一网络,获得每种任务分配策略对应的预测性能;
所述任务分配单元24,配置为确定预测性能最佳的任务分配策略,基于确定的任务分配策略进行任务分配。
在本公开的一些可选实施例中,所述第二生成单元22,配置为基于所述计算图和所述资源图生成至少一个资源子图,每个资源子图包含一种任务分配策略;所述任务分配策略用于为所述计算图的每个节点分配相应资源图的至少一个节点;资源子图中的一个节点表示物联网设备的至少部分能力;资源子图中两个相邻节点的边表示物联网设备的至少部分能力之间的关系。
在本公开的一些可选实施例中,所述第二生成单元22,配置为确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略。
在本公开的一些可选实施例中,所述第一网络采用本公开前述实施例所述的网络训练装置进行优化。
在本公开的一些可选实施例中,所述预测单元23,配置为基于计算图和每个资源子图,通过第一网络得到每个资源子图对应的预测性能。
在本公开的一些可选实施例中，所述预测单元23，配置为通过所述第一网络的特征提取模块提取所述计算图的特征，得到第一特征集；通过所述特征提取模块分别提取所述至少一个资源子图的特征，得到至少一个第二特征集；基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能。
在本公开的一些可选实施例中,所述预测单元23,配置为基于所述第一特征集和每个第二特征集,获得至少一个第三特征集,每个第三特征集包括所述第一特征集和每个第二特征集;基于每个第三特征集和所述预测模块获得每个资源子图对应的预测数据,基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能。
在本公开的一些可选实施例中，所述预测数据包括以下至少之一：
执行所述待处理任务的预测执行时长;
执行所述待处理任务的预测能耗;
执行所述待处理任务的预测可靠性。
在本公开的一些可选实施例中,所述预测单元23,配置为根据预设权重,对每个资源子图对应的预测数据进行加权处理,获得每个资源子图对应的预测性能。
在本公开的一些可选实施例中,所述任务分配单元24,配置为确定预测性能最佳的任务分配策略,并依据该策略在实际中进行任务分配和执行。
在本公开的一些可选实施例中,所述装置还包括:获取单元,配置为在进行任务分配后,获取所述待处理任务按照相应的任务分配策略被执行时的实际性能;并将相应的任务分配策略及获取的实际性能存储至训练数据集;所述训练数据集用于更新所述第一网络。
本公开实施例中,所述装置中的第二确定单元21、第二生成单元22、预测单元23、任务分配单元24和获取单元,在实际应用中均可由CPU、GPU、DSP、ASIC、AI芯片、MCU或FPGA等实现。
需要说明的是:上述实施例提供的基于物联网设备的任务分配装置在进行任务分配时,仅以上述各程序模块的划分进行举例说明,实际应用中,可以根据需要而将上述处理分配由不同的程序模块完成,即将装置的内部结构划分成不同的程序模块,以完成以上描述的全部或者部分处理。另外,上述实施例提供的基于物联网设备的任务分配装置与基于物联网设备的任务分配方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
本公开实施例还提供了一种电子设备。图14为本公开实施例的电子设备的硬件组成结构示意图,如图14所示,所述电子设备包括存储器32、处理器31及存储在存储器32上并可在处理器31上运行的计算机程序,所述处理器31执行所述程序时实现本公开前述实施例所述的网络训练方法的步骤;或者,所述处理器执行所述程序时实现本公开前述实施例所述的任务分配方法的步骤。
可以理解,电子设备中的各个组件通过总线系统33耦合在一起。可理解,总线系统33用于实现这些组件之间的连接通信。总线系统33除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图14中将各种总线都标为总线系统33。
可以理解,存储器32可以是易失性存储器或非易失性存储器,也可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(ROM,Read Only Memory)、可编程只读存储器(PROM,Programmable Read-Only Memory)、可擦除可编程只读存储器(EPROM,Erasable Programmable Read-Only Memory)、电可擦除可编程只读存储器(EEPROM,Electrically Erasable Programmable Read-Only Memory)、磁性随机存取存储器(FRAM,ferromagnetic random access memory)、快闪存储器(Flash Memory)、磁表面存储器、光盘、或只读光盘(CD-ROM,Compact Disc  Read-Only Memory);磁表面存储器可以是磁盘存储器或磁带存储器。易失性存储器可以是随机存取存储器(RAM,Random Access Memory),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(SRAM,Static Random Access Memory)、同步静态随机存取存储器(SSRAM,Synchronous Static Random Access Memory)、动态随机存取存储器(DRAM,Dynamic Random Access Memory)、同步动态随机存取存储器(SDRAM,Synchronous Dynamic Random Access Memory)、双倍数据速率同步动态随机存取存储器(DDRSDRAM,Double Data Rate Synchronous Dynamic Random Access Memory)、增强型同步动态随机存取存储器(ESDRAM,Enhanced Synchronous Dynamic Random Access Memory)、同步连接动态随机存取存储器(SLDRAM,SyncLink Dynamic Random Access Memory)、直接内存总线随机存取存储器(DRRAM,Direct Rambus Random Access Memory)。本公开实施例描述的存储器32旨在包括但不限于这些和任意其它合适类型的存储器。
上述本公开实施例揭示的方法可以应用于处理器31中,或者由处理器31实现。处理器31可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器31中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器31可以是通用处理器、DSP,或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。处理器31可以实现或者执行本公开实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本公开实施例所公开的方法的步骤,可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于存储介质中,该存储介质位于存储器32,处理器31读取存储器32中的信息,结合其硬件完成前述方法的步骤。
在示例性实施例中,电子设备可以被一个或多个应用专用集成电路(ASIC,Application Specific Integrated Circuit)、DSP、可编程逻辑器件(PLD,Programmable Logic Device)、复杂可编程逻辑器件(CPLD,Complex Programmable Logic Device)、FPGA、通用处理器、控制器、MCU、微处理器(Microprocessor)、或其他电子元件实现,用于执行前述方法。
在示例性实施例中,本公开实施例还提供了一种计算机可读存储介质,例如包括计算机程序的存储器32,上述计算机程序可由电子设备的处理器31执行,以完成前述方法所述步骤。计算机可读存储介质可以是FRAM、ROM、PROM、EPROM、EEPROM、Flash Memory、磁表面存储器、光盘、或CD-ROM等存储器;也可以是包括上述存储器之一或任意组合的各种设备。
本公开实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现本公开前述实施例所述的网络训练方法的步骤;或者,该程序被处理器执行时实现本公开前述实施例所述的任务分配方法的步骤。
本公开所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。
本公开所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。
本公开所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
本公开的说明书实施例和权利要求书及上述附图中的术语“第一”、“第二”、和“第三”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元。方法、系统、产 品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
在本公开所提供的几个实施例中,应该理解到,所揭露的设备和方法,可以通过其它的方式实现。以上所描述的设备实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,如:多个单元或组件可以结合,或可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口,设备或单元的间接耦合或通信连接,可以是电性的、机械的或其它形式的。
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的,作为单元显示的部件可以是、或也可以不是物理单元,即可以位于一个地方,也可以分布到多个网络单元上;可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。
另外,在本公开各实施例中的各功能单元可以全部集成在一个处理单元中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于一计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:移动存储设备、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
或者,本公开上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本公开实施例的技术方案本质上或者说对现有技术做出 贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本公开各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储设备、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本公开的具体实施方式,但本公开的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本公开的保护范围之内。因此,本公开的保护范围应以所述权利要求的保护范围为准。

Claims (24)

  1. 一种基于物联网设备的网络训练方法,所述方法包括:
    确定训练数据集,基于所述训练数据集训练第一网络;所述训练数据集中包括至少一种任务分配策略及对应的实际性能;一个实际性能是基于对应的任务分配策略进行实际执行而获得;所述第一网络用于预测任务分配策略的性能。
  2. 根据权利要求1所述的方法,其中,所述方法还包括:
    确定待处理任务对应的计算图,以及物联网设备对应的资源图,基于所述计算图和资源图,生成至少一种任务分配策略。
  3. 根据权利要求2所述的方法,其中,所述基于所述计算图和资源图,生成至少一种任务分配策略,包括:
    基于所述计算图和资源图生成至少一个资源子图,每个资源子图包含一种任务分配策略;所述任务分配策略用于为所述计算图的每个节点分配相应资源图的至少一个节点;资源子图中的一个节点表示物联网设备的至少部分能力;资源子图中两个相邻节点的边表示物联网设备的至少部分能力之间的关系。
  4. 根据权利要求3所述的方法,其中,所述基于所述计算图和资源图生成至少一个资源子图,包括:
    确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;
    确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;
    基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略。
  5. 根据权利要求1所述的方法,其中,所述训练第一网络,包括:
    基于至少一种任务分配策略的预测性能和实际性能，训练所述第一网络；
    其中,基于计算图和每个资源子图,通过第一网络得到每个资源子图对应的预测性能。
  6. 根据权利要求5所述的方法,其中,所述通过第一网络得到每个资源子图对应的预测性能,包括:
    通过所述第一网络的特征提取模块提取所述计算图的特征,得到第一特征集;
    通过所述特征提取模块分别提取所述至少一个资源子图的特征,得到至少一个第二特征集;
    基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能。
  7. 根据权利要求6所述的方法,其中,所述基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能,包括:
    基于所述第一特征集和每个第二特征集,获得至少一个第三特征集,每个第三特征集包括所述第一特征集和每个第二特征集;
    基于每个第三特征集和所述预测模块获得每个资源子图对应的预测数据,基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能。
  8. 根据权利要求7所述方法,其中,所述训练第一网络,包括:
    基于每种任务分配策略的预测性能和实际性能,训练所述特征提取模块和所述预测模块。
  9. 根据权利要求8所述方法,其中,所述训练所述特征提取模块和所述预测模块,包括:
    将每种任务分配策略的预测性能和实际性能的误差进行反向传播，利用梯度下降算法，对第一网络的特征提取模块和预测模块的网络参数进行更新，直至预测性能和实际性能之间的误差满足预设条件。
  10. 根据权利要求1所述的方法,其中,所述方法还包括:更新所述训练数据集;
    所述更新所述训练数据集,包括以下至少之一:
    基于计算图和资源图,采用启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法生成至少一种资源子图,并按照每种资源子图对应的任务分配策略实际执行后,得到每种资源子图对应的实际性能,将所述计算图、每种资源子图和对应的实际性能加入所述训练数据集;
    基于计算图和资源图,采用启发式方法、图搜索方法、图优化方法和子图匹配方法中的至少一种方法生成至少一种资源子图,通过第一网络得到每个资源子图对应的预测性能,从所述至少一种资源子图中选择预测性能最佳的资源子图,并按照所述预测性能最佳的资源子图对应的任务分配策略实际执行后得到实际性能,将所述计算图、所述预测性能最佳的资源子图和对应的实际性能加入所述训练数据集;
    基于计算图和资源图,采用随机游走方法生成至少一种资源子图,并按照每种资源子图对应的任务分配策略实际执行后得到实际性能,将所述计算图、至少一种资源子图和实际性能加入所述训练数据集。
  11. 一种基于物联网设备的任务分配方法,所述方法包括:
    确定待处理任务对应的计算图,以及物联网设备对应的资源图;
    基于所述计算图和所述资源图,生成至少一种任务分配策略;
    将所述至少一种任务分配策略输入第一网络,获得每种任务分配策略对应的预测性能;
    确定预测性能最佳的任务分配策略,基于确定的任务分配策略进行任务分配。
  12. 根据权利要求11所述的方法，其中，所述基于所述计算图和所述资源图生成至少一种任务分配策略，包括：
    基于所述计算图和所述资源图生成至少一个资源子图,每个资源子图包含一种任务分配策略;所述任务分配策略用于为所述计算图的每个节点分配相应资源图的至少一个节点;资源子图中的一个节点表示物联网设备的至少部分能力;资源子图中两个相邻节点的边表示物联网设备的至少部分能力之间的关系。
  13. 根据权利要求12所述的方法,其中,所述基于所述计算图和所述资源图生成至少一个资源子图,包括:
    确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;
    确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;
    基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略。
  14. 根据权利要求11所述的方法,其中,所述获得每种任务分配策略对应的预测性能,包括:
    基于计算图和每个资源子图,通过第一网络得到每个资源子图对应的预测性能。
  15. 根据权利要求14所述的方法,其中,所述通过第一网络得到每个资源子图对应的预测性能,包括:
    通过所述第一网络的特征提取模块提取所述计算图的特征,得到第一特征集;
    通过所述特征提取模块分别提取所述至少一个资源子图的特征,得到至少一个第二特征集;
    基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能。
  16. 根据权利要求15所述的方法,其中,所述基于所述第一特征集、每个第二特征集和所述第一网络的预测模块获得每个资源子图对应的预测性能,包括:
    基于所述第一特征集和每个第二特征集,获得至少一个第三特征集,每个第三特征集包括所述第一特征集和每个第二特征集;
    基于每个第三特征集和所述预测模块获得每个资源子图对应的预测数据,基于每个资源子图对应的预测数据获得每个资源子图对应的预测性能。
  17. 根据权利要求11所述的方法,其中,所述方法还包括:
    在进行任务分配后,获取所述待处理任务按照相应的任务分配策略被执行时的实际性能;并将相应的任务分配策略及获取的实际性能存储至训练数据集;所述训练数据集用于更新所述第一网络。
  18. 一种基于物联网设备的网络训练装置,所述装置包括:第一确定单元和训练单元;其中,
    所述第一确定单元,配置为确定训练数据集,所述训练数据集中包括至少一种任务分配策略及对应的实际性能;一个实际性能是基于对应的任务分配策略进行实际执行而获得;
    所述训练单元,配置为基于所述训练数据集训练第一网络;所述第一网络用于预测任务分配策略的性能。
  19. 根据权利要求18所述的装置,其中,所述装置还包括第一生成单元,配置为确定待处理任务对应的计算图,以及物联网设备对应的资源图,基于所述计算图和资源图,生成至少一种任务分配策略。
  20. 根据权利要求19所述的装置，其中，所述第一生成单元，配置为基于所述计算图和资源图生成至少一个资源子图，每个资源子图包含一种任务分配策略；所述任务分配策略用于为所述计算图的每个节点分配相应资源图的至少一个节点；资源子图中的一个节点表示物联网设备的至少部分能力；资源子图中两个相邻节点的边表示物联网设备的至少部分能力之间的关系。
  21. 根据权利要求19所述的装置,其中,所述第一生成单元,配置为确定所述计算图中的第一节点;所述第一节点为资源需求最大的节点;确定所述资源图中的至少一个第二节点;所述至少一个第二节点为满足所述第一节点的资源需求的节点;基于每个第二节点确定一个资源子图,每个资源子图包含一种任务分配策略。
  22. 一种基于物联网设备的任务分配装置,所述装置包括:第二确定单元、第二生成单元、预测单元和任务分配单元;其中,
    所述第二确定单元,配置为确定待处理任务对应的计算图,以及物联网设备对应的资源图;
    所述第二生成单元,配置为基于所述计算图和所述资源图,生成至少一种任务分配策略;
    所述预测单元,配置为将所述至少一种任务分配策略输入第一网络,获得每种任务分配策略对应的预测性能;
    所述任务分配单元,配置为确定预测性能最佳的任务分配策略,基于确定的任务分配策略进行任务分配。
  23. 一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现权利要求1至10任一项所述方法的步骤;或者,该程序被处理器执行时实现权利要求11至17任一项所述方法的步骤。
  24. 一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现权利要求1至10任一项所述方法的步骤;或者,所述处理器执行所述程序时实现权利要求11至17任一项所述方法的步骤。
PCT/CN2022/075450 2021-02-10 2022-02-08 基于物联网设备的任务分配方法、网络训练方法及装置 WO2022171066A1 (zh)
