GB2571342A - Artificial Neural Networks - Google Patents

Artificial Neural Networks

Info

Publication number
GB2571342A
GB2571342A
Authority
GB
United Kingdom
Prior art keywords
data
layers
layer
neural network
signalling data
Prior art date
Legal status
Withdrawn
Application number
GB1803083.3A
Other versions
GB201803083D0
Inventor
Aytekin Caglar
Cricri Francesco
Baris Aksu Emre
Current Assignee
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date
Filing date
Publication date
Application filed by Nokia Technologies Oy
Priority to GB1803083.3A
Publication of GB201803083D0
Priority to PCT/FI2019/050119 (published as WO2019162568A1)
Publication of GB2571342A

Classifications

    • G PHYSICS — G06 COMPUTING; CALCULATING OR COUNTING — G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS — G06N3/00 Computing arrangements based on biological models — G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation using electronic means
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

A system comprises a controller and a plurality of communication devices 23-27, wherein the controller transmits signalling data to each device over a communications network. The signalling data causes each device to initialise a subset of local layers of a multi-layer artificial neural network architecture, with each device providing a different combination of layers. Assigning a combination of layers to each device forms a distributed neural network. The signalling data may indicate the topology of the layers, and this may define the number of nodes in each layer, the type of each layer, and the weights associated with each layer. The signalling data may also define the processing operations to be performed by the nodes in each layer. The controller may receive data indicating the computing resources and capabilities of each device, and may re-assign layers if an assigned device is incapable of providing a layer. The neural network architecture may be implemented in Internet of Things (IoT) devices. The method, apparatus and computer program of both the controller and one of the devices are claimed.

Description

Example embodiments relate to artificial neural networks.
Background
An artificial neural network (“neural network”) is a computer system inspired by the biological neural networks in human brains. A neural network may be considered a particular kind of algorithm or architecture used in machine learning. A neural network may comprise a plurality of discrete elements called “artificial neurons” which may be connected to one another in various ways, in order that the strengths or weights of the connections may be adjusted with the aim of optimising the neural network’s performance on a task in question. The artificial neurons may be organised into layers, typically an input layer, one or more hidden layers, and an output layer. The output from one layer becomes the input to the next layer, and so on, until the output is produced by the final layer.
Known applications of neural networks include pattern recognition, image processing and classification, merely given by way of example.
Summary
An example embodiment discloses an apparatus comprising: means for receiving signalling data from a controller device over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers; and means for initialising on the apparatus one or more local layers, being a subset of the layers of the neural network architecture, based on the received signalling data.
The received signalling data may indicate a prior assignment of local layers for the apparatus, the initialising means initialising said assigned local layers. The received signalling data may further define a topology of the one or more local layers, and the initialising means may be configured to initialise the one or more local layers with the signalled topology. The received signalling data may define the topology of each local layer by means of defining at least the number of nodes in each layer. The received signalling data may define the topology of each local layer by defining a type of layer. The received signalling data may define the topology of each of the local layers by defining weights associated with the one or more layers.
The received signalling data may further define one or more processing operations to be performed by one or more nodes of each local layer, and wherein the initialising means may be configured to initialise the nodes to perform said processing operations.
The received signalling data may indicate a type of neural network architecture or a neural network task, the initialising means being arranged to determine the one or more local layers to initialise based on prior knowledge of neural network architectures appropriate to the type of neural network architecture or neural network task.
The initialising means may be further configured to determine locally a topology of the one or more local layers based on prior information.
The apparatus may further comprise means for transmitting, to a controller device, data indicative of said apparatus’s capability to implement one or more layers of the neural network architecture, the received signalling data from the controller device being at least in part based on the transmitted capability data.
The transmitted capability data may be indicative of one or both of the device’s computational capability and memory capability.
The transmitted capability data may be indicative of one or more of the device’s battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
The apparatus may further comprise means to transmit output data produced by a final layer of the one or more local layers to one or more other apparatuses providing another layer of the neural network architecture.
The transmitting means may be configured to transmit, with the output data, data indicative of the processing order of the final local layer to the one or more other apparatuses for selective processing of the output data by another apparatus based on the processing order.
The processing order data may be received with the signalling data from the control apparatus.
The transmitting means may be configured to transmit the output data and the processing order data in a broadcast or multicast signal to the one or more other apparatuses.
The received signalling data may further identify one or more other apparatuses as implementing a subsequent layer to which data generated by the final layer is to be transmitted.
The apparatus may further comprise means to receive, from one or more other apparatuses, data indicative of output produced by, and an associated processing order of, one or more layers of the other apparatus, the receiving means being configured to selectively process the received output data as input data to a local layer based on the received processing order data.
The receiving means may be configured to selectively use the received output data as input data only if the apparatus has current capability to process the received output data as input data using said local layer.
The apparatus may further comprise means to self-train the one or more layers.
Another example embodiment may provide an apparatus comprising: means for transmitting signalling data to two or more different devices over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers for implementation in a distributed manner using said two or more devices based on the signalling data.
The signalling means may be configured to transmit signalling data for assigning one or more layers of the neural network architecture to the different devices such that each device is assigned a different layer or a different combination of layers of the neural network architecture for implementation thereat. The signalling means may be configured to transmit signalling data which further defines a topology of each assigned layer for implementation at the respective devices.
The signalling means may be configured to transmit signalling data which defines the topology of each layer by means of defining the number of nodes in the layer.
The signalling means may be configured to transmit signalling data which defines the topology of each layer by defining a type of layer. The signalling means may be configured to transmit signalling data which defines the type of layer as one of an input layer, an intermediate layer or an output layer.
The signalling data may define the topology of one or more layers by defining weights associated with the one or more layers. The signalling data may further define one or more processing operations to be performed by one or more nodes of each layer. The signalling data may further indicate the processing order of each layer.
The apparatus may further comprise means to receive, from one or more of the devices, data indicative of said device’s capability to implement one or more layers of the neural network architecture, and means for generating the signalling data at least in part based on the capability data from said one or more devices.
The received capability data may be indicative of one or both of the device’s computational capability and memory capability. The received capability data may be indicative of one or more of the device’s battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
The apparatus may further comprise means for determining from the capability data that a device is incapable of providing one or more layers of the neural network architecture, and for re-assigning the one or more layers to another device having sufficient capability.
The signalling data may indicate an assignment of one or more of the same layers to different devices, and wherein, if the capability data indicates that a device is incapable of providing said layer, the apparatus is configured to cause a re-direction of data destined for said layer to the same layer on another device.
The means in any above apparatus may comprise: at least one processor; and at least one memory including computer program code, the at least one memory and computer program code configured to, with the at least one processor, cause the performance of the apparatus.
Another example embodiment provides a method comprising: receiving signalling data from a controller device over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers; and initialising one or more local layers, being a subset of the layers of the neural network architecture, based on the received signalling data.
The received signalling data may indicate a prior assignment of local layers for the apparatus. The received signalling data may further define a topology of the one or more local layers. The received signalling data may define the topology of each local layer by means of defining at least the number of nodes in each layer. The received signalling data may define the topology of each local layer by defining a type of layer. The received signalling data may define the topology of each of the local layers by defining weights associated with the one or more layers. The received signalling data may further define one or more processing operations to be performed by one or more nodes of each local layer.
The received signalling data may indicate a type of neural network architecture or a neural network task, and wherein initialising the one or more local layers comprises determining the one or more local layers to initialise based on prior knowledge of neural network architectures appropriate to the type of neural network architecture or neural network task.
The method may further comprise determining locally a topology of the one or more local layers based on prior information.
The method may further comprise transmitting, to a controller device, data indicative of said apparatus’s capability to implement one or more layers of the neural network architecture, the received signalling data from the controller device being at least in part based on the transmitted capability data.
The transmitted capability data may be indicative of one or both of the device’s computational capability and memory capability.
The transmitted capability data may be indicative of one or more of the device’s battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
The method may further comprise transmitting output data produced by a final layer of the one or more local layers to one or more other apparatuses providing another layer of the neural network architecture.
The method may further comprise transmitting, with the output data, data indicative of the processing order of the final local layer to the one or more other apparatuses for selective processing of the output data by another apparatus based on the processing order. The processing order data may be received with the signalling data from the control apparatus.
The method may further comprise transmitting the output data and the processing order data in a broadcast or multicast signal to the one or more other apparatuses.
The received signalling data may further identify one or more other apparatuses as implementing a subsequent layer to which data generated by the final layer is to be transmitted.
The method may further comprise receiving, from one or more other apparatuses, data indicative of output produced by, and an associated processing order of, one or more layers of the other apparatus, and selectively processing the received output data as input data to a local layer based on the received processing order data.
The method may further comprise selectively using the received output data as input data only if the apparatus has current capability to process the received output data as input data using said local layer.
The method may further comprise self-training the one or more layers.
Another example embodiment provides a method comprising: transmitting signalling data to two or more different devices over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers for implementation in a distributed manner using said two or more devices based on the signalling data.
The method may further comprise transmitting signalling data for assigning one or more layers of the neural network architecture to the different devices such that each device is assigned a different layer or a different combination of layers of the neural network architecture for implementation thereat.
The signalling data may further define a topology of each assigned layer for implementation at the respective devices. The signalling data may define the topology of each layer by means of defining the number of nodes in the layer. The signalling data may define the topology of each layer by defining a type of layer. The signalling data may define the type of layer as one of an input layer, an intermediate layer or an output layer. The signalling data may define the topology of one or more layers by defining weights associated with the one or more layers. The signalling data may further define one or more processing operations to be performed by one or more nodes of each layer. The signalling data may further indicate the processing order of each layer.
The method may further comprise receiving, from one or more of the devices, data indicative of said device’s capability to implement one or more layers of the neural network architecture, and generating the signalling data at least in part based on the capability data from said one or more devices.
The received capability data may be indicative of one or both of the device’s computational capability and memory capability. The received capability data may be indicative of one or more of the device’s battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load. The method may further comprise determining from the capability data that a device is incapable of providing one or more layers of the neural network architecture, and re-assigning the one or more layers to another device having sufficient capability.
The signalling data may indicate an assignment of one or more of the same layers to different devices, and wherein if the capability data is indicative that a device is incapable of providing said layer, the apparatus is configured to cause a re-direction of data destined for said layer to the same layer on another device.
Another example embodiment may provide a computer program comprising instructions for causing an apparatus to perform at least the following: receiving signalling data from a controller device over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers; and initialising one or more local layers, being a subset of the layers of the neural network architecture, based on the received signalling data.
Another example embodiment may provide a computer program comprising instructions for causing an apparatus to perform at least the following: transmitting signalling data to two or more different devices over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers for implementation in a distributed manner using said two or more devices based on the signalling data.
Another aspect provides a non-transitory computer-readable medium having stored thereon computer-readable code, which, when executed by at least one processor, causes the at least one processor to perform a method according to any above method definition.
Another aspect provides an apparatus, the apparatus having at least one processor and at least one memory having computer-readable code stored thereon which when executed controls the at least one processor: to receive signalling data from a controller device over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers; and to initialise one or more local layers, being a subset of the layers of the neural network architecture, based on the received signalling data.
Another aspect provides an apparatus, the apparatus having at least one processor and at least one memory having computer-readable code stored thereon which when executed controls the at least one processor: to transmit signalling data to two or more different devices over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers for implementation in a distributed manner using said two or more devices based on the signalling data.
Another example embodiment may provide a system comprising: controller means; and a plurality of communications devices, wherein the controller means is configured to transmit signalling data to the plurality of communications devices over a communications network, the signalling data being indicative of at least part of a neural network architecture, for causing each of the communications devices to initialise one or more local layers, being a subset of layers of the neural network architecture, such that each communications device provides a different layer, or a different combination of layers, than the other communications devices.
Brief Description of the Drawings
Example embodiments will now be described with reference to the accompanying drawings, in which:
FIG. 1 is a graphical representation of an example neural network architecture;
FIG. 2 is a computer system comprising a plurality of networked devices and a control apparatus according to some example embodiments;
FIG. 3 is a schematic diagram of components of the FIG. 2 control apparatus according to some example embodiments;
FIG. 4 is a flow diagram showing example operations performed at the FIG.2 control apparatus according to some example embodiments;
FIG. 5 is a flow diagram showing other example operations performed at the FIG.2 control apparatus according to some example embodiments;
FIG. 6 is a flow diagram showing other example operations performed at the FIG.2 control apparatus according to some example embodiments;
FIG. 7 is a flow diagram showing example operations performed at one of the FIG. 2 devices according to some example embodiments;
FIG. 8 is a flow diagram showing example other operations performed at one of the FIG. 2 devices according to some example embodiments;
FIG. 9 is a flow diagram showing example other operations performed at one of the FIG. 2 devices according to some example embodiments;
FIG. 10 is a schematic diagram showing the FIG. 2 plurality of networked devices on which different combinations of layers of a neural network architecture are initialised;
FIG. 11 is a schematic diagram similar to FIG. 10, showing the flow of data between the layers of the neural network architecture during a training or inference phase; and
FIG. 12 is a flow diagram showing example operations performed at one of the FIG. 2 devices for determining whether to process data broadcast from another such device, according to some example embodiments.
Detailed Description
In the description and drawings, like reference numerals refer to like elements throughout.
An artificial neural network (“neural network”) is a computer system inspired by the biological neural networks in human brains. A neural network may be considered a particular kind of computational graph or architecture used in machine learning. A neural network may comprise a plurality of discrete processing elements called “artificial neurons” which may be connected to one another in various ways, in order that the strengths or weights of the connections may be adjusted with the aim of optimising the neural network’s performance on a task in question. The artificial neurons may be organised into layers, typically an input layer, one or more intermediate or hidden layers, and an output layer. The output from one layer becomes the input to the next layer, and so on, until the output is produced by the final layer.
For example, in image processing, the input layer and one or more intermediate layers close to the input layer may extract semantically low-level features, such as edges and textures. Later intermediate layers may extract higher-level features. There may be one or more intermediate layers, or a final layer, that performs a certain task on the extracted high-level features, such as classification, semantic segmentation, object detection, de-noising, style transferring, super-resolution processing and so on.
Neural networks may be termed “shallow” or “deep” which generally reflects the number of layers. A shallow neural network may in theory comprise only an input and an output layer. A deep neural network may comprise many hidden layers, possibly running into hundreds or thousands, depending on the complexity of the neural network.
It follows that the amount of processing and storage required for somewhat complex tasks may be very high.
Artificial neurons are sometimes referred to as “nodes”. Nodes perform processing operations, often non-linear. The strengths or weights between nodes are typically represented by numerical data and may be considered as weighted connections between nodes of different layers. There may be one or more other inputs called bias inputs.
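Purely as an illustrative sketch, and not part of the claimed embodiments, the computation at a single node can be expressed as a weighted sum of its inputs plus a bias, passed through a non-linear function; the sigmoid used below and all numeric values are hypothetical choices:

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of inputs plus a bias, passed through a sigmoid non-linearity."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes the result into (0, 1)

# A node with two weighted input connections and one bias input:
y = neuron_output([0.5, -1.0], [0.8, 0.2], bias=0.1)
```

Adjusting the `weights` and `bias` values is what changes the strength of the connections into this node.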
There are a number of different architectures of neural network, some of which will be briefly mentioned here.
The term architecture (alternatively topology) refers to characteristics of the neural network, for example how many layers it comprises, the number of nodes in a layer, how the artificial neurons are connected within or between layers and may also refer to characteristics of weights and biases applied, such as how many weights or biases there are, whether they use integer precision, floating point precision etc. It defines at least part of the structure of the neural network. Learned characteristics such as the actual values of weights or biases may not form part of the architecture.
The architecture or topology may also refer to characteristics of a particular layer of the neural network, for example one or more of its type (e.g. input, intermediate, output or convolutional layer), the number of nodes in the layer, the processing operations to be performed by each node, etc.
For example, a feedforward neural network (FFNN) is one where connections between nodes do not form a cycle, unlike recurrent neural networks. The feedforward neural network is perhaps the simplest type of neural network in that data or information moves in one direction, forwards from the input node or nodes, through hidden layer nodes (if any) to the one or more output nodes. There are no cycles or loops. Feedforward neural networks may be used in applications such as computer vision and speech recognition, and generally to classification applications.
For example, a convolutional neural network (CNN) is an architecture, typically of the feedforward type, in which convolution operations take place to help correlate features of the input data across space and time, making such networks useful for applications such as handwriting and speech recognition.
For example, a recurrent neural network (RNN) is an architecture that maintains some kind of state or memory from one input to the next, making it well-suited to sequential forms of data such as text. In other words, the output for a given input depends not just on the input but also on previous inputs.
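The "memory" of a recurrent network can be illustrated with a minimal single-unit recurrent step; the weight values below are hypothetical:

```python
import math

def rnn_step(state, x, w_in=0.5, w_rec=0.9):
    """The new state depends on the current input AND the previous state."""
    return math.tanh(w_in * x + w_rec * state)

state = 0.0
outputs = []
for x in [1.0, 0.0, 0.0]:  # only the first input is non-zero
    state = rnn_step(state, x)
    outputs.append(state)
# Later outputs remain non-zero: the state carries memory of earlier inputs.
```

In a feedforward network the second and third outputs would be identical regardless of the first input; here they are not, because the output depends on previous inputs.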
Example embodiments to be described herein may be applied to any form of neural network, for any application or task, although examples are focussed on feedforward neural networks.
When the architecture of a neural network is initialised, the neural network may operate in two phases, namely a training phase and an inference phase.
Initialised, initialisation or implementing, refers to setting up of at least part of the neural network architecture on one or more devices, and may comprise providing initialisation data to the devices prior to commencement of the training and/or inference phases. This may comprise reserving memory and/or processing resources at the particular device for the one or more layers, and may for example allocate resources for individual nodes, store data representing weights, and storing data representing other characteristics, such as where the output data from one layer is to be provided after execution. Initialisation may be incorporated as part of the training phase in some embodiments. Some aspect of the initialisation may be performed autonomously at one or more devices in some embodiments.
In the training phase, the values of the weights in the network are determined. Initially, random weights may be selected or, alternatively, the weights may take values from a previously-trained neural network as the initial values. Training may involve supervised or unsupervised learning. Supervised learning involves providing both input and desired output data; the neural network then processes the inputs, compares the resulting outputs against the desired outputs, and propagates the resulting errors back through the neural network, causing the weights to be adjusted with a view to minimising the errors iteratively. When an appropriate set of weights is determined, the neural network is considered trained. Unsupervised, or adaptive, training involves providing input data but not output data; it is for the neural network itself to adapt the weights according to one or more algorithms. However, described embodiments are not limited by the specific training approach or algorithm used.
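The supervised weight-update idea can be sketched on a single sigmoid node; full backpropagation extends the same error-driven update rule through all layers. This is a toy illustration, not the training method of any particular embodiment:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, lr=0.5, epochs=1000):
    """Supervised training of one sigmoid node by iterative gradient descent."""
    w, b = [0.0, 0.0], 0.0                # arbitrary initial weights
    for _ in range(epochs):
        for x, target in samples:
            y = sigmoid(x[0] * w[0] + x[1] * w[1] + b)
            err = y - target              # output compared against desired output
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err                 # adjust weights to reduce the error
    return w, b

# Learn the logical OR function from labelled (input, desired output) pairs.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train(data)
```

After training, the node's outputs fall on the correct side of 0.5 for each labelled example, i.e. an appropriate set of weights has been found.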
Once trained, the inference phase uses the trained neural network, with the weights determined during the training phase, to perform a task and generate output. For example, a task may be classification of an input image into one or more categories of images. Another task may consist of filling in a missing part of an image.
In example embodiments, a neural network architecture is implemented using multiple devices interconnected by a communications network. In other words, the neural network architecture may be distributed across multiple devices, where one or more layers are implemented on one device and a different layer, or a different combination of layers, is implemented on a different device. Any number of two or more devices may be used.
In this way, the resources required for potentially complex neural network architectures can be distributed across plural devices, utilising their respective processing and memory resources, rather than employing a single computer system which may require significant processing and memory resources.
A device as defined herein is any physical apparatus having its own processing and storage capability. The different devices may be co-located or physically separate. One or more of the devices may be remote. The communication network may be any form of data communication network, for example a local area network (LAN), a wide area network (WAN) or a peer-to-peer (P2P) network. A combination of the above network forms may be used. The data communications network may be established using one or both of wireless or wired channels or media, and may use protocols such as Ethernet, Bluetooth or WiFi, which are given merely by way of example.
A local layer is defined as a layer that is determined at, or assigned or allocated to a particular device. Each device may therefore initialise and execute a subset of the layers of the neural network architecture.
For example, the neural network architecture may be implemented on two or more so-called Internet of Things (IoT) devices. Such devices are usually small network devices for performing a specific task, such as weighing scales, a home surveillance module, an energy consumption monitor, a digital assistant etc. A given home or office may comprise any number of such Internet of Things devices, usually linked to a home or office network. Such Internet of Things devices tend to have processing and memory capabilities appropriate to their function. Such Internet of Things devices may be mains or battery powered, and may have other particular capabilities such as their battery status (charge capacity, consumption rate and/or charge status). Other capabilities may comprise one or more of expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
Some example embodiments also relate to a controller apparatus for transmitting signalling data to two or more interconnected devices in a communications network for establishing the neural network using two or more devices. The two or more interconnected devices may not be communicating with one another at the time of receiving the signalling data.
In some embodiments, the signalling data may indicate a type of neural network architecture or task that the devices are to implement, such that the devices can autonomously assign different neural network layers across the network. This may be based on known information, such as stored data at one or more of said devices, indicative of an appropriate neural network architecture for a given task.
In some embodiments, the signalling data may be indicative of an assignment of one or more layers of an artificial neural network to the devices, such that each device is assigned a different layer or a different combination of layers based on the signalling data. The signalling data may be generated at the controller apparatus, and may be based on knowledge of each device’s capabilities.
The devices themselves may autonomously determine the topology of the layers.
The signalling data from the controller apparatus may define the topology of each assigned layer for implementation at the respective devices. For example, this may be based on prior information and/or the capabilities of each device.
The term topology may refer to characteristics of each layer of the neural network, for example one or more of its type (e.g. input, intermediate or output layer), the number of nodes in the layer, the number of, or representational characteristics of, weights associated with nodes and the processing operation to be performed by each node. The values of the weights may not be referred to in the topology.
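By way of example only, the topology characteristics listed above might be captured in a data structure of the following form; the field names and values are illustrative assumptions and form no part of the described embodiments. Note that the record describes the number and representational characteristics of the weights, but not their values.

```python
# Hypothetical topology record for a single layer.
layer_topology = {
    "layer_type": "intermediate",   # input, intermediate or output
    "num_nodes": 64,                # number of nodes in the layer
    "weight_shape": (128, 64),      # number/shape of weights, not their values
    "operation": "relu",            # processing operation performed by each node
}

def num_weights(topology):
    """Number of weights implied by the topology (values still unset)."""
    rows, cols = topology["weight_shape"]
    return rows * cols
```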
Some example embodiments also relate to one or more devices for receiving the signalling data from the controller apparatus over a communications network, the signalling data being for establishing a neural network comprising one or more layers. The devices may be referred to as “training devices.”
In some embodiments, the received signalling data is indicative of an assignment of one or more layers, being a subset of layers of an artificial neural network, to the device.
The received signalling data may define a topology of the one or more assigned layers. The devices may implement the assigned one or more layers based on the received signalling data. In some embodiments, the signalling data may simply indicate the assignment of the layers, leaving it to the device and one or more other devices themselves to autonomously determine the topology of each layer. For example, this may be based on prior information and/or the capabilities of each device. In some embodiments, the signalling data may simply indicate a type of neural network or task that the neural network is required to perform.
FIG. 1 is an example neural network architecture 10, comprising a plurality of nodes 11, each for performing a respective processing operation. The neural network architecture 10 comprises an input layer 12, one or more intermediate layers 14 and an output layer 16. The interconnections 17 between nodes 11 of different layers may have associated weights. The weights may be set in an implementation or initialisation phase, varied during the training phase of the neural network, and may remain set during the inference phase of the neural network.
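Purely by way of illustration, the layered structure of FIG. 1 may be sketched as follows: each interconnection carries a weight and each node sums its weighted inputs (per-node activation functions are omitted for brevity). The structure and values are assumptions, not taken from FIG. 1 itself.

```python
def forward(stages, inputs):
    """stages: list of weight matrices, one per interconnection stage;
    weights[i][j] connects node i of one layer to node j of the next."""
    activations = inputs
    for weights in stages:
        n_out = len(weights[0])
        # Each node of the next layer sums its weighted inputs.
        activations = [sum(activations[i] * weights[i][j]
                           for i in range(len(activations)))
                       for j in range(n_out)]
    return activations

# Two input nodes -> two intermediate nodes -> one output node.
net = [[[1.0, 0.5], [0.5, 1.0]], [[1.0], [1.0]]]
out = forward(net, [1.0, 2.0])
```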
FIG. 2 is an example computer system 20 on which the FIG. 1 neural network architecture 10 may be implemented. The computer system 20 comprises a network 21, a neural network control apparatus 22 and first to fifth devices 23 - 27. The network 21 may be any form of data network, for example a local area network (LAN), a wide area network (WAN), the Internet, a peer-to-peer (P2P) network, and may use wired or wireless communications as mentioned previously.
The neural network control apparatus 22 and the first to fifth devices 23 - 27 may be distinct, physically separate computer devices having their own processing and memory resources. In some example embodiments, the neural network control apparatus 22 may be provided as part of one of the first to fifth devices 23 - 27. In embodiments herein, it is assumed that the neural network control apparatus 22 is separate from the first to fifth devices 23 - 27.
The neural network control apparatus 22 and the first to fifth devices 23 - 27 may be any form of computer device, for example one or more of a personal computer (PC), laptop, tablet computer, smartphone, router and an Internet-of-Things (IoT) device. The processing and memory capabilities of the first to fifth devices 23 - 27 may be different, and may change during operation, for example during use for some other purpose. In some embodiments, data representing the capabilities of the first to fifth devices 23 - 27 may be fed back to the neural network control apparatus 22 which may adapt allocation of the layers based on capability, for example to assign a computationally intensive layer to a device having greater capacity and/or to release one or more layers from a device having restricted capacity, which may be due to a new or ongoing other processing task.
In some embodiments, a plurality of copies of the same layer may be allocated to different ones of the first to fifth devices 23 - 27 to provide a level of redundancy, for example to provide access to a different copy of the layer on a different device if the current device is in error or significant resources are being used by another processing task.
In some embodiments, one or more of the first to fifth devices 23 - 27 may be portable devices. In some embodiments, one or more of the first to fifth devices 23 - 27 may be battery powered or self-powered by one or more energy harvesting sources, such as for example a solar panel or a kinetic energy converter. One or more of the first to fifth devices 23 - 27 may perform the operations to be described below in parallel with other processing tasks, for example when running other applications, handling telephone calls, retrieving and displaying browser data, or performing an Internet of Things (IoT) operation.
The neural network control apparatus 22 may be considered a centralised computer apparatus that causes assignment of neural network layers to one or more networked devices, for example the networked devices 23 - 27 shown in FIG. 2.
Neural Network Control Apparatus
FIG. 3 is a schematic diagram of components of the neural network control apparatus 22. However, it should be appreciated that the same or similar components may be provided in each of the first to fifth devices 23 - 27.
The neural network control apparatus 22 may have a controller 30, a memory 31 closely coupled to the controller 30 and comprised of a RAM 33 and a ROM 34, and a network interface 32. It may additionally, but not necessarily, comprise a display and hardware keys. The controller 30 may be connected to each of the other components to control operation thereof. The term memory may refer to a storage space.
The network interface 32 may be configured for connection to the network 21, e.g. a modem which may be wired or wireless. An antenna (not shown) may be provided for wireless connection, which may use WiFi, 3GPP NB-IoT, and/or Bluetooth, for example.
The memory 31 may comprise a hard disk drive (HDD) or a solid state drive (SSD). The ROM 34 of the memory 31 stores, amongst other things, an operating system 35 and may store one or more software applications 36. The RAM 33 is used by the controller 30 for the temporary storage of data. The operating system 35 may contain code which, when executed by the controller 30 in conjunction with the RAM 33, controls operation of each of the hardware components.
The controller 30 may take any suitable form. For instance, it may be a microcontroller, plural microcontrollers, a processor, plural processors, or processor circuitry.
In some example embodiments, the neural network control apparatus 22 may also be associated with external software applications. These may be applications stored on a remote server device and may run partly or exclusively on the remote server device. These applications may be termed cloud-hosted applications or data. The neural network control apparatus 22 may be in communication with the remote server device in order to utilize the software application stored there.
Example embodiments herein will now be described in greater detail. The processing operations to be described below may be performed by the one or more software applications 36 provided in the memory 31, or in hardware, or a combination thereof.
In an example embodiment, the neural network control apparatus 22 may simply transmit to one or more of the first to fifth devices 23 - 27 an indication of a type of neural network or a particular task that it is to perform. Two or more of the first to fifth devices 23 - 27 may then autonomously establish the neural network based on prior information. For example, the prior information may comprise knowledge of a neural network architecture useful for the same or similar task that the current neural network is to implement. The task or task type may be indicated through signalling information from the neural network control apparatus 22. For example, if the signalling information indicates a task or task type for image classification, the first to fifth devices 23 - 27 (or a subset thereof) may store information indicating that a sequence of approximately n layers each comprising a convolutional layer, a batch normalisation layer and a rectified linear unit (ReLU) activation layer is required. Another example may be a de-noising auto-encoder for learning to de-noise images. Based on signalling data of this type, the first to fifth devices 23 - 27 (or a subset thereof) may store information indicating that a sequence of approximately m encoding convolutional layers using batch normalisation and rectified linear unit activations will provide the encoder, and a sequence of about ten decoding deconvolution layers using batch normalisation and rectified linear unit activations will provide the decoder.
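By way of illustration only, the prior information a device might store could map a signalled task type to an appropriate layer sequence, as sketched below. The dictionary keys, layer names and layer counts (twenty classification blocks; ten encoding and ten decoding layers) are illustrative assumptions, not part of the described embodiments.

```python
# Hypothetical stored prior information: task type -> layer sequence.
PRIOR_INFO = {
    "image_classification":
        [["conv", "batch_norm", "relu"]] * 20,   # conv/batch-norm/ReLU blocks
    "image_denoising":
        ([["conv", "batch_norm", "relu"]] * 10     # encoder
         + [["deconv", "batch_norm", "relu"]] * 10),  # decoder
}

def architecture_for(task_type):
    """Return the stored layer sequence for a signalled task type."""
    return PRIOR_INFO[task_type]
```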
In another example embodiment, the neural network control apparatus 22 may transmit to the first to fifth devices 23 - 27 (or a subset thereof) signalling information indicative of an assignment of layers to the devices.
FIG. 4 is a flow diagram showing example operations that may be performed by the neural network control apparatus 22. The operations may be performed in hardware, software or a combination thereof. One or more operations may be omitted. The number of operations is not necessarily indicative of the order of processing.
One example operation 4.1 may comprise providing a neural network architecture, which architecture may comprise a plurality of layers. The architecture may be provided in any suitable data form. Each layer may have a particular topology, for example the type of layer (e.g. input, intermediate, output, convolutional), the number of nodes, processing operations to be performed by each node, input and outputs, weights, biases etc.
The signalling data may also comprise an indication of processing order. For example, a layer may be assigned an order number “k” (e.g. an integer) which may indicate that the data generated by said layer is to be provided as input to one or more layers having the next sequential order number “k+1” or indeed a plurality of other layers in some cases, e.g. “k+1”, “k+2” and “k+3”. As will be described later on, the devices 23 - 27 may in some example embodiments transmit as a broadcast or multi-cast signal the order number together with the generated data from the last (highest-order) active layer, in order to advertise to the other devices. Based on the order number, one or more of the other devices may selectively process the advertised data, e.g. if the order number matches their expected order number.
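The order-number mechanism above may be sketched, purely by way of non-limiting illustration, as follows: a device broadcasts its generated data tagged with the order number, and a receiving device processes the data only if the tag matches the order number it expects. The function and field names are assumptions.

```python
def make_advertisement(order_number, generated_data):
    """Broadcast/multi-cast payload: generated data tagged with its order."""
    return {"order": order_number, "data": generated_data}

def maybe_process(advertisement, expected_order):
    """Return the advertised data only if it matches the order number
    this device expects as input; otherwise ignore it."""
    if advertisement["order"] == expected_order:
        return advertisement["data"]
    return None

# The device hosting layer k advertises; only a matching receiver accepts.
ad = make_advertisement(3, [0.1, 0.7])
```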
Another operation 4.2 may comprise transmitting signalling data to two or more devices, for example the first to fifth devices 23 - 27, the signalling data being indicative of an assignment of layers to the devices.
Each of the two or more devices 23 - 27 may be assigned a different layer or a different combination of layers. Thus, for a neural network architecture comprising five layers, a first device 23 may be assigned first and second layers of the neural network architecture, a second device 24 may be assigned third and fourth layers of the neural network architecture and a third device 25 may be assigned fourth and fifth layers of the neural network architecture. As mentioned above, one or more devices may be assigned a common layer, in this case the fourth layer, which provides resilience should one device fail or become overloaded. It is also possible that a device may be assigned a set of layers which are not adjacent to one another, for example where the output of one layer in the device is not the input to any of the other layers in the device, and where the input to any layer in the device is not the output of a layer in the device. This may be reasonable if a certain intermediate layer occupies more memory than the device can provide, or requires more computational capability than the device can provide.
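The five-layer example above, with the fourth layer duplicated for redundancy, may be expressed as a simple assignment map; the device names below are illustrative only.

```python
# Assignment of the five layers across three devices; layer 4 is
# duplicated on two devices to provide resilience.
assignment = {
    "device23": [1, 2],
    "device24": [3, 4],
    "device25": [4, 5],
}

def devices_hosting(layer):
    """Return the devices to which a given layer is assigned."""
    return [d for d, layers in assignment.items() if layer in layers]
```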
FIG. 5 is a flow diagram showing example operations in another example embodiment that may be performed by the neural network control apparatus 22. The operations may be performed in hardware, software or a combination thereof. One or more operations may be omitted. The number of operations is not necessarily indicative of the order of processing.
One example operation 5.1 may comprise providing a neural network architecture, which architecture may comprise data representing a plurality of layers. The architecture may be provided in any suitable data form. Each layer may have a particular topology, for example the type of layer (e.g. input, intermediate, output, convolutional), the number of nodes, processing operations to be performed by each node, input and outputs, weights etc.
Another example operation 5.2 may comprise transmitting signalling data to two or more devices, for example the first to fifth devices 23 - 27, the signalling data being indicative of an assignment of layers to the devices and further defining a topology of each assigned layer for implementation at the respective devices. In this embodiment, the topology for each layer is therefore signalled to each of the first to fifth devices 23 - 27 rather than leaving it for them to autonomously determine the topologies to use. A combination of both methods may be employed in some embodiments.
FIG. 6 is a flow diagram showing example operations in another example embodiment that may be performed by the neural network control apparatus 22. The operations may be performed in hardware, software or a combination thereof. One or more operations may be omitted. The number of operations is not necessarily indicative of the order of processing.
One example operation 6.1 may comprise providing a neural network architecture, which architecture may comprise data representing a plurality of layers. Each layer may have a topology, for example the type of layer (e.g. input, intermediate, output, convolutional), the number of nodes, processing operations to be performed by each node, input and outputs, weights etc.
Another example operation 6.2 may comprise receiving capability data from the two or more devices 23 - 27. Capability data may be indicative of each device’s capability to implement the assigned layer or combination of layers. For example, the received capability data may be indicative of one or both of a device’s computational capability and memory capability. Alternatively, or additionally, the received capability data may be indicative of one or more of the device’s battery status, current or expected computational workload, current or expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load. The capability data may comprise one or more of the above in combination. The capability data may represent the current capability or capability types, i.e. at substantially the time of receipt, or may be based in part on historical data.
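By way of example only, capability data of the kind described in operation 6.2 might be structured as follows; the field names and values are illustrative assumptions and form no part of the described embodiments.

```python
# Hypothetical capability report sent to the neural network control apparatus.
capability_report = {
    "device_id": 23,
    "available_memory_mb": 256,          # memory capability
    "max_processor_speed_mhz": 1200,     # computational capability
    "average_processor_load": 0.35,
    "battery_status": {"charge_capacity_mah": 2000, "charge_status": 0.8},
    "expected_computational_workload": "low",
}

def is_low_capacity(report, threshold_mb=128):
    """Simple check the control apparatus might apply to a report."""
    return report["available_memory_mb"] < threshold_mb
```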
Another example operation 6.3 may comprise assigning layers to the two or more devices 23 - 27 based on their respective capabilities. For example, a computationally-heavy layer may be assigned to a device having a faster processor and more available memory. For example, a computationally-light layer may be assigned to a device having a slower processor and limited available memory. For example, a device having a limited battery supply remaining may have only one or a limited number of assigned layers. In this case, one or more of said device’s layers may be duplicated in another device, as backup should the battery expire.
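A minimal, non-limiting sketch of capability-based assignment as in operation 6.3 follows: the costliest layers are assigned, greedily, to the devices reporting the most remaining memory. This greedy heuristic and all names are illustrative assumptions, not the assignment algorithm of the described embodiments.

```python
def assign_layers(layers, capabilities):
    """layers: {layer_name: memory_cost_mb};
    capabilities: {device_name: free_memory_mb}.
    Greedily assigns the costliest layer to the device with the most
    remaining memory, debiting that memory as layers are placed."""
    free = dict(capabilities)
    result = {}
    for name, cost in sorted(layers.items(), key=lambda kv: -kv[1]):
        device = max(free, key=free.get)   # device with most free memory
        result[name] = device
        free[device] -= cost
    return result

layers = {"conv1": 40, "conv2": 30, "fc": 80}     # memory cost in MB
capabilities = {"device23": 100, "device24": 60}  # free memory in MB
plan = assign_layers(layers, capabilities)
```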
Another operation 6.4 may comprise transmitting signalling data to two or more devices, for example the first to fifth devices 23 - 27, the signalling data being indicative of the assignment of layers determined in operation 6.3.
The operations 6.2 - 6.4 may be repeated at subsequent times when new capability data is received from the first to fifth devices 23 - 27. This is represented by operation 6.5. The new capability data may be received periodically, continuously or responsive to requests from the neural network control apparatus 22.
Operation 6.4 may comprise transmitting signalling data defining a topology of each assigned layer for implementation at the respective devices. In this embodiment, the topology for each layer is therefore signalled to each of the first to fifth devices 23 - 27 rather than leaving it for them to autonomously determine the topologies to use. A combination of both methods may be employed in some embodiments.
In some example embodiments, the neural network control apparatus 22 may transmit the signalling data only once, i.e. to establish or initialise the neural network, prior to execution of the neural network nodes. In other example embodiments, the neural network control apparatus 22 may transmit the signalling data more than once. This may occur if the neural network changes, whether partially (only some layers) or fully (an updated network or one for another task). As noted previously, another situation in which updated signalling may be required is if the capabilities of one or more of the first to fifth devices 23 - 27 changes, e.g. degrades, or if there is a failure. This may require updated signalling data to change the assignment of layers based on the change in capability or failure of one or more of the first to fifth devices 23 - 27.
As already mentioned, the signalling data from the neural network control apparatus 22 may assign plural instances of a common layer to multiple ones of the first to fifth devices 23 - 27. This is useful in that if one device 23 - 27 fails, is otherwise in error, or its capabilities are degraded, then one or more layers implemented on said device may be activated on one or more other devices. This may involve a re-assignment of the topology in the signalling data so that output data from a preceding layer is diverted to the new device or devices having the required duplicate layer.
First to Fifth Devices 23 - 27
The following describes example embodiments relating to the first to fifth devices 23 - 27 shown in FIG. 2. The following will focus on the first device 23 for ease of explanation, but it will be appreciated that the description may be applicable to the second to fifth devices 24 - 27.
The first device 23 may be termed a “learning device” on account that one of its roles is to be trained for the inference phase. The first device 23 at the hardware level may comprise the same components as the neural network control apparatus 22 shown in FIG. 3. The operations performed by the first device 23 are somewhat different, however, in that it receives the signalling information from the neural network control apparatus 22 for establishing part of the neural network architecture. Other parts of the neural network architecture are established on one or more of the second to fifth devices 24 - 27, as will be appreciated.
In an example embodiment, the first device 23 may receive signalling data from the neural network control apparatus 22 indicative of a type of neural network or a particular task that collectively the first to fifth devices 23 - 27 are to perform. Two or more of the first to fifth devices 23 - 27 may then autonomously establish the neural network based on prior information stored at one or more of the first to fifth devices 23 - 27.
For example, the prior information may comprise knowledge of a neural network architecture useful for the same or similar task that the current neural network is to implement. For example, if the received signalling information indicates a task or task type for image classification, one or more of the first to fifth devices 23 - 27 (or a subset thereof) may store information indicating that a sequence of approximately twenty layers each comprising a convolutional layer, a batch normalisation layer and a rectified linear unit (ReLU) activation layer is appropriate. Another example may be a de-noising auto-encoder for learning to de-noise images. Based on signalling data of this type, one or more of the first to fifth devices 23 - 27 (or a subset thereof) may store information indicating that a sequence of approximately ten encoding convolutional layers using batch normalisation and rectified linear unit activations will provide the encoder, and a sequence of about ten decoding deconvolution layers using batch normalisation and rectified linear unit activations will provide the decoder.
FIG. 7 is a flow diagram showing example operations that may be performed by the first device 23. The operations may be performed in hardware, software or a combination thereof. One or more operations may be omitted. The number of operations is not necessarily indicative of the order of processing.
One example operation 7.1 may comprise receiving signalling data indicative of an assignment of one or more layers to the device.
Another example operation 7.2 may comprise implementing the one or more assigned layers of the neural network based on the signalling data.
In some embodiments, the first device 23 may autonomously decide the topology of its layers based on prior information, e.g. the type of neural network. In other embodiments, the topology may be signalled with the signalling data.
FIG. 8 is a flow diagram showing example operations in another example embodiment that may be performed by the first device 23. The operations may be performed in hardware, software or a combination thereof. One or more operations may be omitted. The number of operations is not necessarily indicative of the order of processing.
One example operation 8.1 may comprise receiving signalling data, the signalling data being indicative of an assignment of one or more layers to the device and further defining a topology of each assigned layer for implementation at the respective device. In this embodiment, the topology for each layer is therefore signalled to the first device 23 rather than leaving it to said device to autonomously determine the topologies to use. A combination of both methods may be employed in some embodiments. Each layer may have a particular topology, for example the type of layer (e.g. input, intermediate, output), the number of nodes, processing operations to be performed by each node, input and outputs, weights etc.
The received signalling data may also comprise an indication of processing order. For example, a layer assigned to the first device 23 may be assigned an order number “k” which may indicate that the data generated by said layer is to be provided as input to one or more layers having the next sequential order number “k+1” or indeed a plurality of other layers in some cases, e.g. “k+1”, “k+2” and “k+3”.
As will be described later on, the first to fifth devices 23 - 27 may in some example embodiments transmit as a broadcast or multi-cast signal the order number together with the generated data from the last (highest-order) active layer, in order to advertise to other devices which may selectively receive as input the generated data based on the order number.
Within the FIG. 2 example, each of the first to fifth devices 23 - 27 may be assigned a different layer or a different combination of layers.
FIG. 9 is a flow diagram showing example operations in another example embodiment that may be performed by the first device 23. The operations may be performed in hardware, software or a combination thereof. One or more operations may be omitted. The number of operations is not necessarily indicative of the order of processing.
One example operation 9.1 may comprise providing capability data from the first device 23. Capability data may be indicative of the first device’s capability to implement the assigned layer or combination of layers, as signalled, or it may be provided before initialisation of the neural network. For example, the provided capability data may be indicative of one or both of a device’s computational capability and memory capability. Alternatively, or additionally, the received capability data may be indicative of one or more of the device’s battery status, current or expected computational workload, current or expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load. The capability data may comprise one or more of the above in combination. The capability data may represent the current capability or capability types or may be based in part on historical data. In some embodiments, the capability data may be indicative of a failure, error condition and/or degradation in performance of capability.
Another example operation 9.2 may comprise transmitting the capability data to the neural network control apparatus 22. As will be appreciated from the above, particularly in relation to FIG. 6, the neural network control apparatus 22 may assign the layers and/or topology of layers to the first to fifth devices 23 - 27 based on their respective capabilities, both at the time of initialisation and possibly during the training and inference phases.
Another example operation 9.3 may comprise receiving signalling data indicative of an assignment of one or more layers to the first device 23 based on the capability data. For example, a computationally-heavy layer may be assigned to the first device 23 if it has a faster processor and more available memory. For example, a computationally-light layer may be assigned to a different device having a slower processor and limited available memory. For example, a device having a limited battery supply remaining may have only one or a limited number of assigned layers. In this case, one or more of said device’s layers may be duplicated in another device, as backup should the battery expire.
The operations 9.2 - 9.3 may be repeated at subsequent times when new capability data is available at the first device 23. This is represented by operation 9.4.
Operation 9.3 may comprise receiving signalling data defining a topology of each assigned layer for implementation at the first device 23, rather than leaving it for the device to autonomously determine the topology or topologies to use. A combination of both methods may be employed in some embodiments.
In some example embodiments, the first device 23 may receive the signalling data only once, i.e. to establish part of the neural network, prior to execution of the neural network nodes in the training and/or inference phases. In other example embodiments, the first device 23 may receive the signalling data more than once. This may occur if the neural network changes, whether partially (only some layers) or fully (an updated network or one for another task). As noted previously, another situation in which updated signalling may be required is if the capabilities of the first device 23 change, e.g. degrade, or if there is a failure. This may require updated signalling data to change the assignment of layers based on the change in capability or failure of the first device 23.
As already mentioned, the signalling data received from the neural network control apparatus 22 may assign plural instances of a common layer to multiple ones of the first to fifth devices 23 - 27. This is useful in that if one device 23 - 27 fails, is otherwise in error, or its capabilities are degraded, then one or more layers implemented on said device may be activated on one or more other devices. This may involve a re-assignment of the topology in the signalling data so that output data from a preceding layer is diverted to the new device or devices having the required duplicate layer.
It will be appreciated that the above-described description of the first device 23 is equally applicable to the second to fifth devices 24 - 27 and their collective roles in implementing the signalled neural network using their distributed resources.
In terms of training the neural network, training data may be received from the neural network control apparatus 22. This training data may be provided as part of the signalling data, and may indicate the weights between nodes of different layers, for example.
In other embodiments, the training may be performed autonomously by the first to fifth devices 23 - 27 themselves.
FIG. 10 is a schematic diagram of an example assignment 100 of seven neural network layers 102 - 108 distributed across the first to fifth devices 23 - 27. The assignment 100 may be responsive to received signalling data from the neural network control apparatus 22 using any of the above embodiments.
Further details will now be described, by way of example.
We assume that each device 23 - 27 has received from the neural network control apparatus 22 one, or different combinations of, layers 102 - 108 of a feedforward neural network. The number and type of layers 102 - 108 which are received by a certain device 23 - 27 depend on the memory capabilities and the computational capabilities of that device, in addition to the expected workload of that device regarding any other tasks that the device was designed for. Each device 23 - 27 may inform the neural network control apparatus 22 about its capabilities, for example the amount of total memory or memory currently available for neural network processing, battery status, and expected workloads in terms of both memory consumption and computational load (e.g. maximum memory consumption, average memory consumption, maximum CPU and/or GPU load, average CPU and/or GPU load). The neural network control apparatus 22 may determine the distribution of the layers based on the received information about each device's capabilities.
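The capability-based distribution described above may, purely by way of a non-limiting illustrative sketch, be realised as a greedy allocation in which the controller places each layer on the first device whose reported free memory can accommodate it. All names below are hypothetical and do not correspond to any particular implementation:

```python
def assign_layers(layer_costs, device_capacities):
    """Assign each layer (in order) to the first device with enough
    remaining memory; returns {device_id: [layer indices]}.
    `layer_costs` is a list of per-layer memory costs; `device_capacities`
    maps a device identifier to its reported free memory."""
    remaining = dict(device_capacities)
    assignment = {dev: [] for dev in device_capacities}
    for idx, cost in enumerate(layer_costs):
        for dev, free in remaining.items():
            if free >= cost:
                assignment[dev].append(idx)
                remaining[dev] -= cost
                break
        else:
            # No device reported enough free memory for this layer.
            raise RuntimeError(f"no device can host layer {idx}")
    return assignment
```

A real controller might additionally weigh battery status and processor load, as described above; the sketch considers memory only.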
In embodiments where the training of the layers 102 - 108 is performed by the devices 23 - 27 themselves, the neural network control apparatus 22 may transmit beforehand (i.e., before the execution of the layers commences) only the architecture or topology of the layers, e.g. the type of layer, how many units for each layer, etc. In response to receiving the topology information from the neural network control apparatus 22, a device 23 - 27 may reserve appropriate resources for training/execution of the required one or more layers of the neural network. Also, the devices 23 - 27 may start building the computational graphs representing the layers 102 - 108 to be hosted, and also initialize the weights of those layers, for example by assigning an initial value to each weight in the one or more layers, which will represent the starting value for the training process. For example, this may be by using one of the commonly-used initialization methods, such as Xavier initialization. Alternatively, this may be by copying the weight values from a neural network previously trained by the same devices or by a third-party entity for the same task, similar task or different task.
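As a non-limiting sketch of the Xavier (Glorot) uniform initialization mentioned above, each weight may be drawn from a uniform distribution whose limit depends on the layer's fan-in and fan-out (function names are illustrative only):

```python
import random

def xavier_init(fan_in, fan_out, seed=0):
    """Xavier (Glorot) uniform initialization: each weight is drawn from
    U(-limit, +limit), with limit = sqrt(6 / (fan_in + fan_out))."""
    rng = random.Random(seed)
    limit = (6.0 / (fan_in + fan_out)) ** 0.5
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]
```

A device hosting a layer could call this once per layer after receiving the topology, before training commences.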
In other embodiments, the devices 23 - 27 may autonomously decide the topology of the layers 102 - 108 to use, based on prior information and the capabilities of each device. The prior information may consist of knowledge of a neural network architecture known to be able to solve a similar task to the task at hand, as described previously. For example, in the case of models handling temporal data, the topology may consist of a number of feature extraction layers, followed by layers forming a recurrent neural network, followed by a classification layer or by one or more data reconstruction layers. In this case, there may be no need to have the neural network control apparatus 22 distribute the topology or weights of the layers 102 - 108. Instead, in one example embodiment, the neural network control apparatus 22 may send an initialization message, including the type of task to be performed, to the devices 23 - 27. Each device 23 - 27 may for example be preconfigured with a table that associates a particular type of task with a corresponding topology. Alternatively, this table may be communicated by the neural network control apparatus 22 to each device 23 - 27. In some embodiments, if the training of the layers 102 - 108 on the devices 23 - 27 is not performed by the devices themselves, the neural network control apparatus 22 may send the topology, including the weights of the layers, beforehand.
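The preconfigured table associating a task type with a topology may be sketched as a simple lookup, as below. The task names and layer labels here are hypothetical placeholders, not values defined by this specification:

```python
# Hypothetical preconfigured table: task type -> layer topology. With such
# a table on each device, the controller need only send the task type in
# its initialization message.
TASK_TO_TOPOLOGY = {
    "temporal_classification": ["feature_extraction", "feature_extraction",
                                "recurrent", "classification"],
    "temporal_reconstruction": ["feature_extraction", "recurrent",
                                "reconstruction"],
}

def topology_for_task(task_type):
    """Resolve an initialization message's task type to a local topology."""
    return TASK_TO_TOPOLOGY[task_type]
```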
In receiving multiple layers 102 - 108, a particular device 23 - 27 may receive adjacent layers, i.e. layers which are processed one after the other, with no missing intermediate layers. In this way, the communication bandwidth may be optimized. However, in some cases, a device 23 - 27 may receive layers 102 - 108 which are not adjacent, for example because the device's limited memory does not allow for adjacent layers.
In an example embodiment, the layers 102 - 108 are distributed by the neural network control apparatus 22 to the devices 23 - 27 only once, before computation of the layers is started.
In another example embodiment, the layers 102 - 108 may be distributed more than once. One example is when the neural network is changed partially (only some layers are changed) or fully (e.g. for an updated network, or a different network for a different task).
In another example embodiment, the setup may dynamically adapt to the current status of the devices 23 - 27, for example relating to changes in workload expectations and/or a failure of one or more devices, etc. Therefore, if one device 23 - 27 has a failure, the layers 102 - 108 originally assigned to that device may be re-distributed (by the neural network control apparatus 22, or independently among the devices) to other devices which can accommodate those layers.
This initialization phase may also comprise determining the order of computation of the layers 102 - 108, which may be known originally to the neural network control apparatus 22, or derived from the prior information about the topology. In an example embodiment, the distribution of layers 102 - 108 may comprise the assignment of layers and, for each layer, that layer's order number. For example, a first layer 102 may have order number 1, and a second layer 103 may have order number 2.
In an alternative embodiment, the distribution of layers 102 - 108 may comprise the layers, the order number and information about the device 23 - 27 which contains the next layer to be computed, after the device to which the information is communicated.
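The two embodiments above may be sketched together as follows: given an assignment of order numbers to devices, the controller can derive, for each layer, an optional pointer to the device holding the next layer. All identifiers are hypothetical:

```python
def make_distribution(assignment):
    """Build per-device distribution records: each layer's order number,
    plus (for the alternative embodiment) the device holding the next
    layer in the computation order, or None for the last layer.
    `assignment` maps device -> list of layer order numbers (1-based)."""
    # Invert the assignment: order number -> hosting device.
    holder = {n: dev for dev, nums in assignment.items() for n in nums}
    records = {}
    for dev, nums in assignment.items():
        records[dev] = [{"order": n, "next_device": holder.get(n + 1)}
                        for n in nums]
    return records
```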
As will be seen in FIG. 10, a third layer 104 and a fourth layer 105 are duplicated and assigned to multiple devices, i.e. the second, third and fourth devices 24, 25, 26. As described above, this redundancy provides robustness against device failures or high workloads or communication failures.
Referring now to FIG. 11, which shows the FIG. 10 neural network assignment 100 during an execution or inference phase, the first device 23 assigned the first layer 102 requires input data. The input data may be captured in real-time, or may be stored or received from another device. For a real-time example, the input data may be captured by a module of the first device 23 itself, for example a microphone or a camera, or by a remote module which is not physically part of the first device. In one embodiment, the input data is received from the neural network control apparatus 22. In a non-real-time example, the data may be stored within the first device 23, or stored within a remote device. A combination of these different options is also possible.
The input data may carry (or be associated with) a data identifier. This data identifier may then be communicated together with all modifications or processed versions of that data. This may be useful for relating the output of the last layer 108 to the input data. We refer to the data identifier as the data-ID. The data-ID may therefore be used for distinguishing between data belonging to different rounds of execution of the neural network. For example, devices 23 - 27 including initial layers may already process input data relating to the next processing round while other devices are still executing the final layers of a previous execution round.
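A minimal sketch of data-ID propagation is given below: a fresh identifier is attached to new input data and preserved by every processing step, so outputs of different execution rounds remain distinguishable (names are illustrative only):

```python
import itertools

_ids = itertools.count(1)  # monotonically increasing data-ID source

def tag_input(payload):
    """Attach a fresh data-ID to new input data; the ID travels with all
    processed versions of that data."""
    return {"data_id": next(_ids), "payload": payload}

def process(message, fn):
    """Apply a layer function to the payload while preserving the data-ID."""
    return {"data_id": message["data_id"], "payload": fn(message["payload"])}
```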
The first device 23 with the first layer 102 may input the input data to the first layer, execute the layer, and the output may be input to any other adjacent layer or layers which are provided on that device. In this example, there is only one layer 102 on the first device 23.
The output of the last layer in a device 23 - 27 needs to be sent out to be processed by the next layer in the computation order, which is provided in one or more of the other devices.
For example, the first device 23 may transmit its output data from the first layer 102 to all other devices 24 - 27, together with the order number of the last layer used to obtain that output data and possibly with the data identifier. This transmission may be a broadcast or multicast-like communication. The other devices 24 - 27 will all receive such data, and may check whether the received order number precedes that of one of their layers. If a suitable layer is found (e.g. a layer whose order number follows the received order number), the device will run that layer and any other adjacent layers, if current workload permits.
In the shown example, the second device 24 may identify the order number “1” with the data from the first device 23 and then process the received data sequentially using the second and third layers 103, 104 before broadcasting the output data from the third layer 104 with the associated order number, e.g. “3”.
Broadcasting the output in this way allows execution of the distributed neural network without having to inform each device 23 - 27 about the device configured to execute the next layer.
In an alternative embodiment, the device 23 - 27 which has computed an output may know the device (“target device”) which contains the next layer. Identification of the target device may be received as part of signalling information from another device, for example neural network control apparatus 22. In this case, instead of broadcasting the output data and/or the order number, the device will simply send its output data to the target device, possibly together with the data-ID.
Another alternative embodiment may comprise a combination of the previous two embodiments. One device 23 - 27 which has computed an output may broadcast only the order number of the last computed layer, and will send the output data together with the data-ID only to the devices which can execute the next layer.
For a device 24 - 27 which does not contain the first layer 102, the sequence of operations is similar to that for the device 23 with the first layer. A difference is in the input to its layers 103 - 108, because these devices 24 - 27 receive data output by another device 23. These other devices 24 - 27 need to determine whether or not they need to process it, based on the order number (if the data was broadcast) or based on the fact that it was sent directly to this device.
The device or devices 27 providing the last layer 108 may be informed that they contain the last layer, for example by being notified during the distribution of layers 102 - 108. They may also be informed about what to do with the output of the last layer 108, for example to send the output to the neural network control apparatus 22 or to a third party entity, not shown.
As can be seen in FIG. 11, there are redundant layers 104, 105 which have been distributed to more than one device 24, 25, 26. This redundancy can be exploited, for example, if one device 24, 25, 26 is temporarily busy, e.g. performing some operation for which the device was designed. Instead of making the whole chain of distributed layers 102 - 108 wait, an affected device may signal that it is busy or in error, and processing will be done by another device containing the same layer. As an example, if the second device 24 at the current time can only execute the second layer 103 but not the third layer 104, it may send out the output of the second layer with the order number “2” in a broadcast, and also to itself. If there are other devices 25 with the third layer 104 which are available, i.e. have capacity, they may process the output from the second device 24. Otherwise, the second device 24 may wait until it can execute the third layer 104 itself.
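The fallback to a duplicate layer may be sketched as a simple routing check: given a broadcast order number, find any available device hosting the following layer, otherwise wait (device records below are illustrative):

```python
def route_to_duplicate(order_number, devices):
    """Pick the first available device that hosts the layer following
    `order_number`; return None if every copy is busy, in which case
    the sender waits and retries the layer itself.
    Each device record: {"name": ..., "layers": set of order numbers,
    "busy": bool}."""
    for dev in devices:
        if (order_number + 1) in dev["layers"] and not dev["busy"]:
            return dev["name"]
    return None
```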
In one example embodiment, devices 23 - 27 may be configured to send an acknowledgement of data that has been processed by a layer. The acknowledgement may include identification of the layer and the data. The acknowledgement may be sent to the device that provided the data. Alternatively, the acknowledgement may be broadcast or multicast to all devices. In response to receiving an acknowledgement, a device may be configured not to process the identified data at the identified layer. This may improve efficiency of the distributed neural network by preventing unnecessary processing of data that has already been processed by one of the devices 23 - 27.
In another example embodiment, if there are no devices 23 - 27 having a redundant layer, for example in the case of the second device 24 having the second layer 103, the layer may be sent to one or more devices 23, 25, 26, 27 which have free memory to accommodate this layer and which can execute it. The distribution of the second layer 103 may be performed by the second device 24, or by the neural network control apparatus 22. The communicated data may comprise the second layer topology, weights and order number, and may be broadcast to all devices 23, 25, 26, 27. Those of the devices 23, 25, 26, 27 which can accommodate and execute the layer will receive it and execute it on the data output by the second device 24.
If many devices 23 - 27 ask for inference at approximately the same time, and there are insufficient resources to serve such requests, the neural network control apparatus 22 may queue the requests based on the query time and may then orchestrate the inference at a later time accordingly.
FIG. 12 is a flow diagram illustrating example operations performed by a single one of the devices 23 - 27 for an example embodiment of determining whether to execute a particular layer. The operations may be performed in hardware, software or a combination thereof. One or more operations may be omitted. The number of operations is not necessarily indicative of the order of processing.
A first operation 120 may comprise receiving broadcast data from another device, which broadcast data comprises at least the output data and the order number, and possibly the data-ID.
A second operation 121 may comprise matching the order number with any present layer.
A third operation 122, responsive to the order number matching, may comprise evaluating the current workload of the respective device and determining how many present layers to execute.
A fourth operation 123, responsive to there being available capability based on the current workload of the respective device, may be to execute the one or more layers.
A fifth operation 124 may comprise broadcasting the output data from the respective device, again comprising the order number of the last executed layer, and possibly the data-ID.
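The operations of FIG. 12 can be sketched as a single handler, purely as a non-limiting illustration (layer functions and capacity accounting are hypothetical simplifications):

```python
def handle_broadcast(order_number, data, local_layers, capacity):
    """Sketch of the FIG. 12 flow: match the received order number against
    locally held layers, execute as many adjacent layers as current
    capacity allows, and return (last_executed_order, output) for
    re-broadcast, or None if no local layer follows the received order
    number. `local_layers` maps order number -> layer function."""
    n = order_number + 1
    if n not in local_layers:
        return None                  # operations 121/122: no match
    out = data
    while n in local_layers and capacity > 0:
        out = local_layers[n](out)   # operation 123: execute the layer
        capacity -= 1
        n += 1
    return n - 1, out                # operation 124: broadcast this pair
```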
The example embodiments above may apply to both inference and training phases. However, for the training phase, additional features may be used.
For example, the final output of the implemented neural network, i.e. the output of the device 27 comprising the last layer 108, may need to be processed in order to derive training signals for all the distributed layers.
The training signals may be based on the gradient of a loss function with respect to the layers' learnable parameters (i.e. the weights). The loss function may be one or more of the mean squared error, cross-entropy, adversarial loss (as used in generative adversarial networks), a loss which is output by an auxiliary neural network, or any other suitable loss function.
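Of the losses listed above, the mean squared error is the simplest to illustrate, as sketched below (a non-limiting example; function names are illustrative):

```python
def mse_loss(predictions, targets):
    """Mean squared error: the mean of (p - t)^2 over all outputs."""
    return sum((p - t) ** 2
               for p, t in zip(predictions, targets)) / len(targets)
```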
The training signals may use the gradient of the loss to derive a weight update, based on any suitable method used by deep learning practitioners such as Stochastic Gradient Descent, Adam, Momentum, meta-learning based derivation of update rule, etc.
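As a minimal sketch of the simplest of the update rules named above, a plain stochastic gradient descent step scales the gradient by a learning rate and subtracts it from the weights (the learning-rate value is an arbitrary illustration):

```python
def sgd_step(weights, grads, lr=0.1):
    """Plain stochastic gradient descent update: w <- w - lr * dL/dw."""
    return [w - lr * g for w, g in zip(weights, grads)]
```

Adam, Momentum or a meta-learned rule would keep additional per-weight state, but follow the same pattern of mapping gradients to weight updates.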
The training signals may be computed locally by the device 27 containing the last layer 108, or by the neural network control apparatus 22, or by a different device.
The computation of the gradient of the loss may require backpropagation, where the gradient is computed using the chain rule for derivatives backwards from the last layer 108 to the first layer 102. Thus, each device 23 - 27 may need to send data back to the one or more devices containing the preceding layer or layers. This can be done using either the broadcasting embodiment or directed communication to target devices, described previously. Alternatively, during the forward pass, each device “appends” to the output data an identifier for the device and layer which contributed to obtaining the final output of the distributed neural network, so that the chain of used layers is clear and backward propagation can be done by following the chain in reverse order.
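The chain-appending alternative above may be sketched as follows: the forward pass records a (device, layer) identifier for each executed layer, and reversing that record yields the backpropagation route (all names hypothetical):

```python
def forward_with_chain(device_layers, x):
    """Forward pass that appends a (device, layer) identifier for each
    executed layer; the reversed chain is the backpropagation route.
    `device_layers` is a list of (device, [(layer_no, fn), ...]) pairs
    in computation order."""
    chain = []
    for device, layers in device_layers:
        for layer_no, fn in layers:
            x = fn(x)
            chain.append((device, layer_no))
    return x, list(reversed(chain))
```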
Each device 23 - 27 may compute its layers' gradients, i.e. the gradients of the loss with respect to the weights of the layers contained in the particular device. The gradient is then used to compute the weight update. Once the weight update has been computed by a particular device 23 - 27, it will be applied to its layers. Thus, each device 23 - 27 may only update its own layers. However, if one or more layers of a device are also present in other devices, the modified weights of these layers may also be signalled to those other devices, either by a broadcast/multicast method or by signalling to the target devices directly. In this way, all the copies of the same layers will be kept in synchronization with respect to their weights.
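The synchronization of duplicated layers may be sketched as below: after an update, the new weights of a given layer are copied to every device hosting a duplicate, leaving other layers untouched (data structures are illustrative only):

```python
def sync_duplicates(layer_no, new_weights, devices):
    """After a weight update, copy the modified weights of `layer_no` to
    every device hosting a duplicate of that layer. Each device record
    holds {"weights": {layer_no: [weight values]}}."""
    for dev in devices:
        if layer_no in dev["weights"]:
            dev["weights"][layer_no] = list(new_weights)
    return devices
```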
Once the devices 23 - 27 have updated their layers 102 - 108, new input data can be input to the first layer 102 contained in one or more devices 23, and a new training iteration starts, which may consist of a forward pass and a backward pass. The training stops based on any suitable stopping criterion, which may be evaluated by the particular device computing the loss function, or by another device.
The example embodiments offer technical improvements, for example by using distributed processing resources of multiple devices, possibly Internet of Things devices, to implement a particular neural network architecture. The assignment of parts of said architecture to the different devices may be based on the capabilities of the devices, and the assignments may dynamically change based on current conditions, such as performance issues and errors. This allows neural networks to be implemented, trained and executed using resources other than a dedicated, high-performance computer system. A number of low-memory devices may be used, for example. Embodiments may also overcome issues with failures at such a dedicated computer system. By performing data analysis locally, for example within a home or office environment, there is improved data security as compared with sending input data to a neural network hosted remotely in the cloud.
Where a structural feature has been described, it may be replaced by means for performing one or more of the functions of the structural feature, whether that function or those functions are explicitly or implicitly described.
The term apparatus may be replaced with the term device.
In this brief description, reference has been made to various examples. The description of features or functions in relation to an example indicates that those features or functions are present in that example. The use of the term ‘example’ or ‘for example’ or ‘may’ in the text denotes, whether explicitly stated or not, that such features or functions are present in at least the described example, whether described as an example or not, and that they can be, but are not necessarily, present in some of or all other examples. Thus ‘example’, ‘for example’ or ‘may’ refers to a particular instance in a class of examples. A property of the instance can be a property of only that instance or a property of the class or a property of a sub-class of the class that includes some but not all of the instances in the class. It is therefore implicitly disclosed that a feature described with reference to one example but not with reference to another example can, where possible, be used in that other example but does not necessarily have to be used in that other example.
Although embodiments of the present invention have been described in the preceding paragraphs with reference to various examples, it should be appreciated that modifications to the examples given can be made without departing from the scope of the invention as claimed.
Features described in the preceding description may be used in combinations other than the combinations explicitly described.
Although functions have been described with reference to certain features, those functions may be performable by other features whether described or not.
Although features have been described with reference to certain embodiments, those features may also be present in other embodiments whether described or not.
Whilst endeavouring in the foregoing specification to draw attention to those features of the invention believed to be of particular importance, it should be understood that the Applicant claims protection in respect of any patentable feature or combination of features hereinbefore referred to and/or shown in the drawings, whether or not particular emphasis has been placed thereon.

Claims (72)

1. An apparatus comprising:
means for receiving signalling data from a controller device over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers; and means for initialising on the apparatus one or more local layers, being a subset of the layers of the neural network architecture, based on the received signalling data.
2. The apparatus of claim 1, wherein the received signalling data indicates a prior assignment of local layers for the apparatus, the initialising means initialising said assigned local layers.
3. The apparatus of claim 2, wherein the received signalling data further defines a topology of the one or more local layers, the initialising means being configured to initialise the one or more local layers with the signalled topology.
4. The apparatus of claim 3, wherein the received signalling data defines the topology of each local layer by means of defining at least the number of nodes in each layer.
5. The apparatus of claim 3 or claim 4, wherein the received signalling data defines the topology of each local layer by defining a type of layer.
6. The apparatus of any of claims 3 to 5, wherein the received signalling data defines the topology of each of the local layers by defining weights associated with the one or more layers.
7. The apparatus of any of claims 1 to 6, wherein the received signalling data further defines one or more processing operations to be performed by one or more nodes of each local layer, and wherein the initialising means is configured to initialise the nodes to perform said processing operations.
8. The apparatus of claim 1, wherein the received signalling data indicates a type of neural network architecture or a neural network task, the initialising means being arranged to determine the one or more local layers to initialise based on prior knowledge of neural network architectures appropriate to the type of neural network architecture or neural network task.
9. The apparatus of claim 2 or claim 8, wherein the initialising means is further configured to determine locally a topology of the one or more local layers based on prior information.
10. The apparatus of any preceding claim, further comprising means for transmitting, to a controller device, data indicative of said apparatus’s capability to implement one or more layers of the neural network architecture, the received signalling data from the controller device being at least in part based on the transmitted capability data.
11. The apparatus of claim 10, wherein the transmitted capability data is indicative of one or both of the device’s computational capability and memory capability.
12. The apparatus of claim 10 or claim 11, wherein the transmitted capability data is indicative of one or more of the device’s battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
13. The apparatus of any of claims 1 to 12, further comprising means to transmit output data produced by a final layer of the one or more local layers to one or more other apparatuses providing another layer of the neural network architecture.
14. The apparatus of claim 13, wherein the transmitting means is configured to transmit, with the output data, data indicative of the processing order of the final local layer to the one or more other apparatuses for selective processing of the output data by another apparatus based on the processing order.
15. The apparatus of claim 14, wherein the processing order data is received with the signalling data from the control apparatus.
16. The apparatus of claim 15, wherein the transmitting means is configured to transmit the output data and the processing order data in a broadcast or multicast signal to the one or more other apparatuses.
17. The apparatus of any of claims 14 to 16, wherein the received signalling data further identifies one or more other apparatuses as implementing a subsequent layer to which data generated by the final layer is to be transmitted.
18. The apparatus of any preceding claim, further comprising means to receive, from one or more other apparatuses, data indicative of output produced by, and an associated processing order of, one or more layers of the other apparatus, the receiving means being configured to selectively process the received output data as input data to a local layer based on the received processing order data.
19. The apparatus of claim 18, wherein the receiving means is configured to selectively use the received output data as input data only if the apparatus has current capability to process the received output data as input data using said local layer.
20. The apparatus of any of claims 15 to 19, further comprising means to self-train the one or more layers.
21. An apparatus comprising:
means for transmitting signalling data to two or more different devices over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers for implementation in a distributed manner using said two or more devices based on the signalling data.
22. The apparatus of claim 21, wherein the signalling means is configured to transmit signalling data for assigning one or more layers of the neural network architecture to the different devices such that each device is assigned a different layer or a different combination of layers of the neural network architecture for implementation thereat.
23. The apparatus of claim 22, wherein the signalling means is configured to transmit signalling data which further defines a topology of each assigned layer for implementation at the respective devices.
24. The apparatus of claim 23, wherein the signalling means is configured to transmit signalling data which defines the topology of each layer by means of defining the number of nodes in the layer.
25. The apparatus of claim 23 or claim 24, wherein the signalling means is configured to transmit signalling data which defines the topology of each layer by defining a type of layer.
26. The apparatus of claim 25, wherein the signalling means is configured to transmit signalling data which defines the type of layer as one of an input layer, an intermediate layer or an output layer.
27. The apparatus of any of claims 23 to 26, wherein the signalling data defines the topology of one or more layers by defining weights associated with the one or more layers.
28. The apparatus of any of claims 22 to 27, wherein the signalling data further defines one or more processing operations to be performed by one or more nodes of each layer.
29. The apparatus of any of claims 22 to 28, wherein the signalling data further indicates the processing order of each layer.
30. The apparatus of any of claims 21 to 29, further comprising means to receive, from one or more of the devices, data indicative of said device’s capability to implement one or more layers of the neural network architecture, and means for generating the signalling data at least in part based on the capability data from said one or more devices.
31. The apparatus of claim 30, wherein the received capability data is indicative of one or both of the device’s computational capability and memory capability.
32. The apparatus of claim 30 or claim 31, wherein the received capability data is indicative of one or more of the device’s battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
33. The apparatus of any of claims 30 to 32, further comprising means for determining from the capability data that a device is incapable of providing one or more layers of the neural network architecture, and for re-assigning the one or more layers to another device having sufficient capability.
34. The apparatus of claim 33, wherein the signalling data indicates an assignment of one or more of the same layers to different devices, and wherein if the capability data is indicative that a device is incapable of providing said layer, the apparatus is configured to cause a re-direction of data destined for said layer to the same layer on another device.
35. The apparatus of any preceding claim, wherein the means comprises:
at least one processor; and at least one memory including computer program code, the at least one memory and computer program code configured to, with the at least one processor, cause the performance of the apparatus.
36. A method comprising:
receiving signalling data from a controller device over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers; and initialising one or more local layers, being a subset of the layers of the neural network architecture, based on the received signalling data.
37. The method of claim 36, wherein the received signalling data indicates a prior assignment of local layers for the apparatus.
38. The method of claim 37, wherein the received signalling data further defines a topology of the one or more local layers.
39. The method of claim 38, wherein the received signalling data defines the topology of each local layer by means of defining at least the number of nodes in each layer.
40. The method of claim 38 or claim 39, wherein the received signalling data defines the topology of each local layer by defining a type of layer.
41. The method of any of claims 38 to 40, wherein the received signalling data defines the topology of each of the local layers by defining weights associated with the one or more layers.
42. The method of any of claims 36 to 41, wherein the received signalling data further defines one or more processing operations to be performed by one or more nodes of each local layer.
43. The method of claim 36, wherein the received signalling data indicates a type of neural network architecture or a neural network task, and wherein initialising the one or more local layers comprises determining the one or more local layers to initialise based on prior knowledge of neural network architectures appropriate to the type of neural network architecture or neural network task.
44. The method of claim 37 or claim 43, further comprising determining locally a topology of the one or more local layers based on prior information.
45. The method of any of claims 36 to 44, further comprising transmitting, to a controller device, data indicative of said apparatus’s capability to implement one or more layers of the neural network architecture, the received signalling data from the controller device being at least in part based on the transmitted capability data.
46. The method of claim 45, wherein the transmitted capability data is indicative of one or both of the device’s computational capability and memory capability.
47. The method of claim 45 or claim 46, wherein the transmitted capability data is indicative of one or more of the device’s battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
48. The method of any of claims 36 to 47, further comprising transmitting output data produced by a final layer of the one or more local layers to one or more other apparatuses providing another layer of the neural network architecture.
49. The method of claim 48, further comprising transmitting, with the output data, data indicative of the processing order of the final local layer to the one or more other apparatuses for selective processing of the output data by another apparatus based on the processing order.
50. The method of claim 49, wherein the processing order data is received with the signalling data from the control apparatus.
51. The method of claim 50, further comprising transmitting the output data and the processing order data in a broadcast or multicast signal to the one or more other apparatuses.
52. The method of any of claims 49 to 51, wherein the received signalling data further identifies one or more other apparatuses as implementing a subsequent layer to which data generated by the final layer is to be transmitted.
53. The method of any of claims 36 to 52, further comprising receiving, from one or more other apparatuses, data indicative of output produced by, and an associated processing order of, one or more layers of the other apparatus, and selectively processing the received output data as input data to a local layer based on the received processing order data.
54. The method of claim 53, further comprising selectively using the received output data as input data only if the apparatus has current capability to process the received output data as input data using said local layer.
55. The method of any of claims 50 to 54, further comprising self-training the one or more layers.
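The receiver-side method of claims 36 to 55 can be sketched in code. This is a minimal illustrative sketch only: the claims define no concrete message format, so the dictionary structure, field names ("assigned_to", "nodes", "weights") and device identifiers below are assumptions, not part of the claimed subject matter.

```python
# Hypothetical sketch of a device initialising its local subset of layers
# from received signalling data (cf. claims 36-42). The signalling message
# format is an assumption for illustration.

def initialise_local_layers(signalling_data, device_id):
    """Build only the layers assigned to this device (the 'local layers')."""
    local_layers = []
    for layer in signalling_data["layers"]:
        if layer["assigned_to"] != device_id:
            continue  # layer is hosted by a different device
        local_layers.append({
            "index": layer["index"],          # processing order within the network
            "type": layer["type"],            # e.g. input / intermediate / output (claim 40)
            "nodes": layer["nodes"],          # topology: number of nodes (claim 39)
            "weights": layer.get("weights"),  # optional weights (claim 41)
        })
    # Sort by processing order so output flows to the correct subsequent layer.
    return sorted(local_layers, key=lambda l: l["index"])

signalling = {
    "layers": [
        {"index": 0, "assigned_to": "dev-A", "type": "input", "nodes": 784},
        {"index": 1, "assigned_to": "dev-B", "type": "intermediate", "nodes": 128},
        {"index": 2, "assigned_to": "dev-A", "type": "output", "nodes": 10},
    ]
}
print([l["index"] for l in initialise_local_layers(signalling, "dev-A")])  # [0, 2]
```

Note that each device initialises only a subset of the architecture; here the hypothetical device "dev-A" hosts the input and output layers while "dev-B" hosts the intermediate layer.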
56. A method comprising:
transmitting signalling data to two or more different devices over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers for implementation in a distributed manner using said two or more devices based on the signalling data.
57. The method of claim 56, further comprising transmitting signalling data for assigning one or more layers of the neural network architecture to the different devices such that each device is assigned a different layer or a different combination of layers of the neural network architecture for implementation thereat.
58. The method of claim 57, wherein the signalling data further defines a topology of each assigned layer for implementation at the respective devices.
59. The method of claim 58, wherein the signalling data defines the topology of each layer by means of defining the number of nodes in the layer.
60. The method of claim 58 or claim 59, wherein the signalling data defines the topology of each layer by defining a type of layer.
61. The method of claim 60, wherein the signalling data defines the type of layer as one of an input layer, an intermediate layer or an output layer.
62. The method of any of claims 57 to 61, wherein the signalling data defines the topology of one or more layers by defining weights associated with the one or more layers.
63. The method of any of claims 57 to 62, wherein the signalling data further defines one or more processing operations to be performed by one or more nodes of each layer.
64. The method of any of claims 57 to 63, wherein the signalling data further indicates the processing order of each layer.
65. The method of any of claims 56 to 64, further comprising receiving, from one or more of the devices, data indicative of said device's capability to implement one or more layers of the neural network architecture, and generating the signalling data at least in part based on the capability data from said one or more devices.
66. The method of claim 65, wherein the received capability data is indicative of one or both of the device's computational capability and memory capability.
67. The method of claim 65 or claim 66, wherein the received capability data is indicative of one or more of the device's battery status, expected computational workload, expected memory workload, maximum processor speed, average memory consumption, maximum processor load and average processor load.
68. The method of any of claims 65 to 67, further comprising determining from the capability data that a device is incapable of providing one or more layers of the neural network architecture, and re-assigning the one or more layers to another device having sufficient capability.
69. The method of claim 67, wherein the signalling data indicates an assignment of one or more of the same layers to different devices, and wherein, if the capability data is indicative that a device is incapable of providing said layer, the method further comprises causing a re-direction of data destined for said layer to the same layer on another device.
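The controller-side assignment and re-assignment of claims 56 to 68 can be sketched as follows. The claims do not prescribe any assignment algorithm or capability model; the greedy strategy and the single numeric "capacity" per device below are simplifying assumptions for illustration only.

```python
# Hypothetical sketch of a controller assigning layers to devices based on
# reported capability data (cf. claims 65-67) and re-assigning layers away
# from a device found to be incapable (cf. claim 68).

def assign_layers(layer_costs, device_capacity):
    """Greedily assign each layer to the device with the most spare capacity."""
    remaining = dict(device_capacity)
    assignment = {}
    for index, cost in enumerate(layer_costs):
        device = max(remaining, key=remaining.get)  # most spare capacity
        if remaining[device] < cost:
            raise RuntimeError(f"no device can host layer {index}")
        assignment[index] = device
        remaining[device] -= cost
    return assignment

def reassign_from(assignment, layer_costs, failed, device_capacity):
    """Move a failed device's layers to the remaining capable devices."""
    capable = {d: c for d, c in device_capacity.items() if d != failed}
    for index, device in assignment.items():
        if device == failed:
            replacement = max(capable, key=capable.get)
            if capable[replacement] < layer_costs[index]:
                raise RuntimeError(f"layer {index} cannot be re-assigned")
            assignment[index] = replacement
            capable[replacement] -= layer_costs[index]
    return assignment

assignment = assign_layers([4, 2, 3], {"dev-A": 6, "dev-B": 5})
print(assignment)  # {0: 'dev-A', 1: 'dev-B', 2: 'dev-B'}
```

The signalling data transmitted to each device (claim 57) would then carry only the layers mapped to that device, so that each device implements a different layer or combination of layers.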
70. A computer program comprising instructions for causing an apparatus to perform at least the following:
receiving signalling data from a controller device over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers; and initialising one or more local layers, being a subset of the layers of the neural network architecture, based on the received signalling data.
71. A computer program comprising instructions for causing an apparatus to perform at least the following:
transmitting signalling data to two or more different devices over a communications network, the signalling data being indicative of at least part of a neural network architecture comprising a plurality of layers for implementation in a distributed manner using said two or more devices based on the signalling data.
72. A system comprising:
controller means; and a plurality of communications devices, wherein the controller means is configured to transmit signalling data to the plurality of communications devices over a communications network, the signalling data being indicative of at least part of a neural network architecture, for causing each of the communications devices to initialise one or more local layers, being a subset of layers of the neural network architecture, such that each communications device provides a different layer, or a different combination of layers, than the other communications devices.
GB1803083.3A 2018-02-26 2018-02-26 Artificial Neural Networks Withdrawn GB2571342A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB1803083.3A GB2571342A (en) 2018-02-26 2018-02-26 Artificial Neural Networks
PCT/FI2019/050119 WO2019162568A1 (en) 2018-02-26 2019-02-15 Artificial neural networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1803083.3A GB2571342A (en) 2018-02-26 2018-02-26 Artificial Neural Networks

Publications (2)

Publication Number Publication Date
GB201803083D0 GB201803083D0 (en) 2018-04-11
GB2571342A true GB2571342A (en) 2019-08-28

Family

ID=61903348

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1803083.3A Withdrawn GB2571342A (en) 2018-02-26 2018-02-26 Artificial Neural Networks

Country Status (2)

Country Link
GB (1) GB2571342A (en)
WO (1) WO2019162568A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200184321A1 (en) * 2018-12-11 2020-06-11 Fotonation Limited Multi-processor neural network processing apparatus
CN111832428B (en) * 2020-06-23 2024-02-23 北京科技大学 Data enhancement method applied to cold rolling mill broken belt fault diagnosis
JP2023105469A (en) * 2022-01-19 2023-07-31 ソニーセミコンダクタソリューションズ株式会社 Information processing device, information processing system, computer-readable recording medium, and information processing method

Citations (2)

Publication number Priority date Publication date Assignee Title
US20170076195A1 (en) * 2015-09-10 2017-03-16 Intel Corporation Distributed neural networks for scalable real-time analytics
WO2018126065A1 (en) * 2016-12-30 2018-07-05 Intel Corporation Decentralized data storage and processing for iot devices

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US11288575B2 (en) * 2017-05-18 2022-03-29 Microsoft Technology Licensing, Llc Asynchronous neural network training
US20190050714A1 (en) * 2017-08-09 2019-02-14 Ants Technology (Hk) Limited Modular distributed artificial neural networks
US11403517B2 (en) * 2018-09-27 2022-08-02 Intel Corporation Proximity-based distributed sensor processing

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
US20170076195A1 (en) * 2015-09-10 2017-03-16 Intel Corporation Distributed neural networks for scalable real-time analytics
WO2018126065A1 (en) * 2016-12-30 2018-07-05 Intel Corporation Decentralized data storage and processing for iot devices

Non-Patent Citations (1)

Title
DE CONINCK et al., "DIANNE: Distributed artificial neural networks for the internet of things", 2nd Workshop on Middleware for Context-Aware Applications in the IoT, 7-11 December 2015 *

Also Published As

Publication number Publication date
GB201803083D0 (en) 2018-04-11
WO2019162568A1 (en) 2019-08-29

Similar Documents

Publication Publication Date Title
US11521067B2 (en) Decentralized distributed deep learning
US10783436B2 (en) Deep learning application distribution
Yang et al. A framework for partitioning and execution of data stream applications in mobile cloud computing
WO2019162568A1 (en) Artificial neural networks
CN113632078A (en) Responding to machine learning requests from multiple clients
CN112667400B (en) Edge cloud resource scheduling method, device and system managed and controlled by edge autonomous center
US10776721B1 (en) Accelerating configuration of machine-learning models
US11138504B2 (en) Deployment of deep neural networks (DNN) in embedded devices by means of peer-to-peer routing between computational points
Qazi et al. Towards quantum computing algorithms for datacenter workload predictions
Mao et al. AdaLearner: An adaptive distributed mobile learning system for neural networks
WO2019180314A1 (en) Artificial neural networks
Felemban et al. PicSys: Energy-efficient fast image search on distributed mobile networks
CN109491956B (en) Heterogeneous collaborative computing system
Qu et al. Stochastic cumulative DNN inference with RL-aided adaptive IoT device-edge collaboration
CN113220459A (en) Task processing method and device
Zhang et al. Modelling and analysis of real-time and reliability for WSN-based CPS
WO2020189498A1 (en) Learning device, method and program
CN116996941A (en) Calculation force unloading method, device and system based on cooperation of cloud edge ends of distribution network
US11366699B1 (en) Handling bulk requests for resources
CN111049900B (en) Internet of things flow calculation scheduling method and device and electronic equipment
CN113821313A (en) Task scheduling method and device and electronic equipment
Ahmed-Zaki et al. Peer-to-peer mapreduce platform
WO2024066791A1 (en) Data processing method, apparatus and system, and medium and program product
Hu et al. Adaptive Device-Edge Collaboration on DNN Inference in AIoT: A Digital Twin-Assisted Approach
US20230195532A1 (en) System and method for performing preemptive scaling of micro service instances in cloud network

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)