EP3803712A1 - Apparatus, method and computer program for selecting a neural network - Google Patents

Apparatus, method and computer program for selecting a neural network

Info

Publication number
EP3803712A1
Authority
EP
European Patent Office
Prior art keywords
auxiliary
task
neural network
data
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP19814193.9A
Other languages
German (de)
English (en)
Other versions
EP3803712A4 (fr)
Inventor
Francesco Cricri
Caglar AYTEKIN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Publication of EP3803712A1
Publication of EP3803712A4
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00 - Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/24 - Negotiation of communication capabilities
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00 - Image coding
    • G06T 9/002 - Image coding using neural networks

Definitions

  • Various example embodiments relate to selecting a neural network from a plurality of neural networks.
  • Neural networks are being utilized in an ever increasing number of applications for many different types of device, such as mobile phones. Examples of applications comprise image and video analysis and processing, social media data analysis, device usage data analysis, etc.
  • Various different neural networks may be available for various different tasks. It may be difficult to determine which of the different neural networks is optimal for performing a specific task when ground-truth data or an entity providing guidance is not available.
  • an apparatus comprising means for receiving data to be processed by one of a plurality of main neural networks; providing the data and signalling information associated with the data to a plurality of devices each comprising a main neural network and an auxiliary neural network, the auxiliary neural network comprising a subset of layers of the main neural network, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural networks at the plurality of devices; receiving, from the plurality of devices, indications of performance of the auxiliary neural networks for performing the auxiliary task; and selecting, based on the indications of performance of the auxiliary neural networks, one of the plurality of main neural networks for performing a main task on the data.
  • the apparatus further comprises means for requesting the selected main neural network to perform the main task on the data; and receiving an output of the main task.
  • the apparatus further comprises means for requesting one of the plurality of devices comprising the selected main neural network to provide the selected main neural network; receiving the selected main neural network; and performing the main task on the data using the selected main neural network.
  • the apparatus is a cell phone.
  • an apparatus comprising a main neural network and an auxiliary neural network comprising a subset of layers of the main neural network, further comprising means for: receiving data and signalling information associated with the data, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural network; training the auxiliary network for performing the auxiliary task; providing an indication of performance of the auxiliary neural network for performing the auxiliary task; and receiving, in response to providing the indication of performance, a request to perform a main task by a selected main neural network or to provide the selected main neural network to another device.
  • the auxiliary task is an unsupervised task or a self-supervised task.
  • the signalling information further comprises at least one of an identifier of the main task; one or more parameters for the auxiliary neural networks; and one or more training parameters for the auxiliary neural networks.
  • the indication of performance comprises a convergence speed of the auxiliary neural network.
  • the auxiliary networks are trained using, as initial values for the parameters of the subset of layers, the values of the corresponding layers of the main neural network.
  • a learning rate of the subset of layers is lower than a learning rate of other layers of the auxiliary neural network.
  • the data is image data or video data and the auxiliary task is an image denoising task, an image inpainting task, an image compression task, a single-image super-resolution task, a next frame prediction task and/or sound generation task from image data or video data.
  • the auxiliary task is an image denoising task, an image inpainting task, an image compression task, a single-image super-resolution task, a next frame prediction task and/or sound generation task from image data or video data.
  • the data is image data or video data and the main task is an image classification task, an image segmentation task, an image object detection task, an image or a video captioning task, a salient object detection task and/or a video object tracking task.
  • the means comprises at least one processor; at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the performance of the apparatus.
  • a method comprising receiving data to be processed by one of a plurality of main neural networks; providing the data and signalling information associated with the data to a plurality of devices each comprising a main neural network and an auxiliary neural network, the auxiliary neural network comprising a subset of layers of the main neural network, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural networks at the plurality of devices; receiving, from the plurality of devices, indications of performance of the auxiliary neural networks for performing the auxiliary task; and selecting, based on the indications of performance of the auxiliary neural networks, one of the plurality of main neural networks for performing a main task on the data.
  • a method comprising receiving data and signalling information associated with the data, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural network; training the auxiliary network for performing the auxiliary task; providing an indication of performance of the auxiliary neural network for performing the auxiliary task; and receiving, in response to providing the indication of performance, a request to perform a main task by a selected main neural network or to provide the selected main neural network to another device.
  • a computer program comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to: receive data to be processed by one of a plurality of main neural networks; provide the data and signalling information associated with the data to a plurality of devices each comprising a main neural network and an auxiliary neural network, the auxiliary neural network comprising a subset of layers of the main neural network, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural networks at the plurality of devices; receive, from the plurality of devices, indications of performance of the auxiliary neural networks for performing the auxiliary task; and select, based on the indications of performance of the auxiliary neural networks, one of the plurality of main neural networks for performing a main task on the data
  • a computer program comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to: receive data and signalling information associated with the data, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural network; train the auxiliary network for performing the auxiliary task; provide an indication of performance of the auxiliary neural network for performing the auxiliary task; and receive, in response to providing the indication of performance, a request to perform a main task by a selected main neural network or to provide the selected main neural network to another device.
  • Fig. 1a shows, by way of example, a system and devices for selecting a neural network;
  • Fig. 1b shows, by way of example, a block diagram of an apparatus;
  • Fig. 2 shows, by way of example, a flowchart of a method for selecting a neural network;
  • Figs. 3a, 3b and 3c show, by way of example, communication and signalling between a user device and other devices;
  • Fig. 4 shows, by way of example, a process of transfer learning and training of an auxiliary network;
  • Fig. 5 shows, by way of example, a flowchart of a method for selecting a neural network.
  • a neural network is a computation graph comprising several layers of computation. Each layer may comprise one or more units, where each unit performs an elementary computation. A unit is connected to one or more other units, and the connection may have an associated weight. The weight may be used for scaling a signal passing through the associated connection. Weights are usually learnable parameters, i.e., values which may be learned from training data. There may be other learnable parameters, such as those of batch-normalization layers.
  • weights of neural networks may be referred to as learnable parameters or simply as parameters.
  • Feed-forward neural networks are such that there is no feedback loop: each layer takes input from one or more of the layers before and provides its output as the input for one or more of the subsequent layers. Also, units inside a certain layer take input from units in one or more of the preceding layers, and provide output to one or more of the following layers.
  • Initial layers, i.e. layers close to the input data, may extract semantically low-level features.
  • these low-level features may be e.g. edges and textures in images.
  • the intermediate and final layers may extract more high-level features.
  • in recurrent neural networks there is a feedback loop, so that the network may become stateful, i.e., it may be able to memorize information or a state.
  • the neural networks and other machine learning tools are able to learn properties from input data.
  • Learning may be e.g. supervised or unsupervised or semi-supervised.
  • Such learning is a result of a training algorithm, or of a meta-level neural network providing the training signal.
  • the training algorithm may comprise changing some properties of the neural network so that its output is as close as possible to a desired output.
  • the output of the neural network may be used to derive a class or category index which indicates the class or category that the object in the input image belongs to.
  • Training may be carried out by minimizing or decreasing the output’s error, also referred to as the loss. Examples of losses are mean squared error, cross-entropy, etc.
  • Training may be an iterative process, where at each iteration the algorithm modifies the weights of the neural network to make a gradual improvement of the network’s output, i.e. to gradually decrease the loss.
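By way of a hedged illustration only (the patent does not prescribe any implementation), such an iterative, loss-decreasing training process could look as follows; the model, data and values here are hypothetical:

```python
import numpy as np

def train_step(w, x, y, lr=0.01):
    """One gradient-descent iteration for a linear model under a mean
    squared error loss: modify the weights to gradually decrease the loss."""
    y_pred = x @ w
    grad = 2.0 * x.T @ (y_pred - y) / len(y)   # gradient of the MSE w.r.t. w
    return w - lr * grad                       # gradual improvement of the output

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))                  # toy training data
y = x @ np.array([1.0, -2.0, 0.5])             # desired output
w = np.zeros(3)                                # learnable parameters
for _ in range(200):                           # iterative training process
    w = train_step(w, x, y)
```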
  • Training a neural network is an optimization process.
  • the goal of the optimization or training process is to make the model learn the properties of the data distribution from a limited training dataset.
  • the goal is to learn to use a limited training dataset in order to learn to generalize to previously unseen data, i.e. data which was not used for training the model. This may be referred to as generalization.
  • the data may be split into at least two sets, the training set and the validation set.
  • the training set is used for training the network, i.e. to modify its learnable parameters in order to minimize the loss.
  • the validation set is used for checking the performance of the network on data which was not used to minimize the loss, as an indication of the final performance of the model.
  • the errors on the training set and on the validation set may be monitored during the training process.
  • the network is learning if the training set error decreases. Otherwise the model is considered to be in the regime of underfitting.
  • the network is learning to generalize if also the validation set error decreases and is not too much higher than the training set error. If the training set error is low, but the validation set error is much higher than the training set error, or it does not decrease, or it even increases, the model is considered to be in the regime of overfitting. This means that the model has just memorized the training set’s properties and performs well on that set, but may perform poorly on a set not used for tuning its parameters.
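A minimal sketch of this monitoring logic follows; the gap factor used to decide that the validation error is "too much higher" is an illustrative assumption, not a value from the text:

```python
def training_regime(train_errors, val_errors, gap_factor=1.5):
    """Classify the training regime from monitored error curves.

    The gap_factor threshold is an assumption for illustration only.
    """
    if train_errors[-1] >= train_errors[0]:
        return "underfitting"   # training set error is not decreasing
    if val_errors[-1] > gap_factor * train_errors[-1] or val_errors[-1] >= val_errors[0]:
        return "overfitting"    # validation error much higher, or not decreasing
    return "generalizing"       # both errors decrease and stay close

print(training_regime([1.0, 0.5, 0.2], [1.1, 0.6, 0.9]))   # -> overfitting
print(training_regime([1.0, 0.5, 0.2], [1.1, 0.6, 0.25]))  # -> generalizing
```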
  • Various different neural networks may be available for various different tasks, such as classification, segmentation, future prediction etc. Different neural networks may be trained for the same task. However, each neural network may be trained on a specific and/or narrow data-domain, rather than on a wide domain.
  • the domain means the context and/or conditions in which the data was captured.
  • an image scene classification task involves classes such as “outdoor”, “cityscape”, “nature”, “kitchen”, “inside car”, etc.
  • Each network may be trained to perform the scene classification task on data captured in one of the following lighting and/or weather conditions: “rainy” domain, “dark” domain, “sunny” domain, “foggy” domain, etc.
  • the domain is not limited to lighting conditions, but it may be one of the scene classes described above, i.e. “outdoor”, “cityscape”, “nature”, “kitchen”, “inside-car”, etc.
  • one network may be trained on data from the “outdoor” domain, another network on the “indoor” domain, and still another network on the “kitchen” domain, etc.
  • neural nets may perform better if they are trained on a narrow domain, as it is a simpler problem to solve.
  • the training device performing the training of a certain network may be capable of capturing data only or mainly from a specific domain. For example, in the British Isles the most common weather condition domain may be “cloudy”, whereas in California the most common weather condition domain may be “sunny”.
  • a neural network trained on a narrow domain may need fewer weights to perform well, as opposed to another network trained on a wide domain, which may need many more weights to perform well on that wide domain.
  • a network with fewer weights may occupy less memory and storage and thus be more suitable for memory-limited devices such as Internet of Things (IoT) devices and mobile devices.
  • a user device receives content and the content needs to be processed or analyzed by a neural net.
  • the user device may have limited memory and/or computational capabilities.
  • this neural network may have been trained for narrow domains and thus probably not suitable for performing a task of interest on data which may be from a different domain.
  • the user device may be connected to other devices having neural networks. These other devices may also have limited memory and/or computational capabilities. Thus, it may be difficult to choose between the neural networks which one is the optimal neural network for performing a specific task on a certain input data.
  • An apparatus performing a method disclosed herein is able to select, and efficiently execute or obtain an output from, the most optimal neural network among a plurality of neural networks, without the availability of any ground-truth data or of an oracle providing indications or approximations of the ground-truth labels.
  • an oracle refers to an entity, such as a human or a neural network trained on a big dataset, which may provide ground-truth data or guidance about the performance of other neural networks.
  • the most optimal or the best neural network is the network having the best performance on the domain of the data provided by the user device.
  • the approach proposed in this invention may provide an approximation of the most optimal neural network, i.e., it may select a network which is not the most optimal but which may be one of the most optimal neural networks.
  • Fig. 1 a shows, by way of example, a system and devices for selecting a neural network.
  • a user device 110 may be e.g. a mobile device such as a cell phone, e.g. smartphone 125, or the user device may be a personal computer or a laptop 120.
  • the user device may be able to capture content or receive content from another entity, e.g. a database.
  • the content, i.e. data, needs to be processed and/or analyzed by a neural network.
  • a device may be a user device if it has availability of the content.
  • a server 115 may be considered as a user device, if it has availability of the data.
  • the different devices may be connected to each other via a communication connection 100.
  • the server 115 may be connected to and controlled by another device, e.g. another user device.
  • Devices 130, 131, 132 are devices having at least one neural network (NN).
  • the NN devices 130, 131, 132 may be connected among themselves.
  • the NN devices have, and are able to run, at least one neural network.
  • the at least one neural network may be trained on a narrow domain and each NN device may have at least one neural network which has been trained on a different domain than the network on another NN device.
  • the server 115 may be one of the NN devices.
  • Fig. 1 b shows, by way of example, a block diagram of an apparatus.
  • the apparatus may be the user device 110 and/or the NN device.
  • the apparatus may comprise a user interface 102.
  • the user interface may receive user input e.g. through a touch screen and/or a keypad. Alternatively, the user interface may receive user input from the internet or a personal computer or a smartphone via a communication interface 108.
  • the apparatus may comprise means such as circuitry and electronics for handling, receiving and transmitting data.
  • the apparatus may comprise a memory 106 for storing data and computer program code which can be executed by a processor 104 to carry out various embodiments of the method as disclosed herein.
  • the elements of the method may be implemented as a software component residing in the apparatus or distributed across several apparatuses.
  • Processor 104 may include processor circuitry.
  • the computer program code may be embodied on a non-transitory computer readable medium.
  • Fig. 2 shows a flowchart of a method 200 for selecting a neural network.
  • the method 200 may be carried out e.g. by the user device 110.
  • the method may comprise receiving 210 data to be processed by one of a plurality of main neural networks.
  • the method may comprise providing 220 the data and signalling information associated with the data to a plurality of devices each comprising a main neural network and an auxiliary neural network.
  • the auxiliary neural network may comprise a subset of layers of the main neural network.
  • the signalling information may comprise an identifier of an auxiliary task to be performed on the data by the auxiliary neural networks at the plurality of devices.
  • the method may comprise receiving 230, from the plurality of devices, indications of performance of the auxiliary neural networks for performing the auxiliary task.
  • the method may comprise selecting 240, based on the indications of performance of the auxiliary neural networks, one of the plurality of main neural networks for performing a main task on the data.
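A hedged sketch of method 200 from the user device's perspective follows; the device interface and field names are hypothetical, chosen only to mirror steps 210-240, and are not defined by the patent:

```python
from dataclasses import dataclass

@dataclass
class Signalling:
    aux_task_id: str    # AuxTaskID: auxiliary task to be performed on the data
    main_task_id: str   # Task ID: the task of interest for the user device

def method_200(data, nn_devices, signalling):
    """Steps 220-240: broadcast data + signalling, collect indications, select.

    Each element of nn_devices is assumed to expose train_auxiliary(), which
    trains its auxiliary network on the data and returns a loss-like
    indication of performance (lower is better). This interface is an
    assumption, not part of the patent.
    """
    indications = {dev: dev.train_auxiliary(data, signalling)  # steps 220/230
                   for dev in nn_devices}
    return min(indications, key=indications.get)               # step 240
```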
  • Fig. 3a, 3b and 3c show, by way of examples, communication and signalling between a user device and other devices.
  • the user device 110 may be connected by a bi-directional channel to the NN devices 130, 131, 132.
  • the NN devices may also be connected among themselves.
  • the user device 110 has availability of data.
  • the data may be received e.g. from a memory of the user device, by capturing the data by the user device or received from another entity, such as a database.
  • the data may be any type of data which is, or is pre-processed to be, in a suitable format for inputting the data to a neural network.
  • the data may be pre-processed to tensor form, i.e. to a multidimensional array.
  • the data may be e.g. image data captured by a camera or video data captured by a video camera.
  • Other examples of data may be e.g. text data or audio data, such as audio or speech signal.
  • a task, i.e. a main task, needs to be performed on the data, for any reason.
  • the task may be e.g. an analysis task such as classification, e.g. classification of image data, or a processing task such as denoising of an image.
  • the task may be a speech recognition task.
  • the task is to be performed by a neural network.
  • the user device 110 may have one or more neural networks. However, these networks may be trained on a narrow domain or for a different task. If the narrow domains on which these neural networks are trained do not correspond to the domain of the data on which the main task needs to be performed, the networks are not optimal for performing the task.
  • the user device may determine that the data is from a different domain, i.e. the probability that the data is from the same domain as the training data of the neural network of the user device is less than 1. The determination of whether the domains are different may be carried out such that the neural network of the user device performs the task, for example a classification task.
  • the user device may initiate a process for identifying the best or optimal neural network which is able to perform the main task, i.e. a task of interest, on the domain to which the data belongs. However, the user device may initiate the identification of the best neural network even without first verifying that the domain of the input data is different from the domain of the neural network(s) in the user device.
  • the NN devices have one or more neural networks. Furthermore, the NN devices have sufficient memory and computational capabilities for running the neural networks. The NN devices also have capabilities for training the neural networks.
  • the neural networks on different devices may have been trained on different data domains and for the same task. Alternatively, the neural networks may have been trained for different tasks, but for each task there may be different networks trained on different data domains. In general, the assumption is that at least a sub-set of all NN devices has one or more neural networks trained for the task of interest for the user device. There is no availability of ground-truth labels, nor of an oracle providing indications or approximations of the ground-truth labels.
  • Fig. 3a shows, by way of example, the signalling from the user device to the NN devices.
  • the user device 110 provides the data 310 to a plurality of the NN devices 130, 131, 132.
  • the NN devices comprise a main neural network and an auxiliary neural network.
  • the auxiliary network comprises a subset of layers of the main neural network.
  • signalling information 320 associated with the data may be provided.
  • the signalling information comprises an auxiliary task ID, i.e. an AuxTaskID.
  • the AuxTaskID may be an identifier of an auxiliary task to be performed on the data by the auxiliary neural network to be trained.
  • the signalling information may further comprise an identifier of the main task, i.e. a Task ID.
  • the main task is the task of interest for the user device 110.
  • the Task ID may be understood by the NN devices.
  • Examples of main tasks may be object detection, image classification, image segmentation, image enhancement, image captioning, etc.
  • the signalling information may further comprise e.g. hyper-parameters for the architecture of the auxiliary network, training hyper-parameters for the auxiliary network and/or the number K of initial layers to transfer from the main network to the auxiliary network.
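Purely as an illustrative sketch, the signalling information 320 could be serialized as follows; the field names and values are assumptions, since the patent does not define a concrete wire format:

```python
import json

signalling_info = {
    "AuxTaskID": "image-denoising",          # auxiliary task identifier
    "TaskID": "image-classification",        # main task identifier
    "aux_architecture": {"channels": 16},    # hyper-parameters for the AuxNet
    "aux_training": {"lr_new_layers": 0.01,  # training hyper-parameters
                     "lr_transferred_layers": 0.001,
                     "epochs": 5},
    "K": 4,                                  # number of initial layers to transfer
}
message = json.dumps(signalling_info)        # sent to the NN devices with the data
```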
  • the user device may send the data, whereas the additional information may either be sent to the NN devices by a third-party entity or be negotiated among the NN devices themselves.
  • Fig. 3c shows an example, wherein there is an intermediate device 350, e.g. a third-party entity, communicating between the user device 110 and the NN devices 130, 131, 132.
  • the auxiliary task ID and/or the main task ID may comprise e.g. a script to be run by the NN device, the script executing a task on data.
  • the auxiliary task may be e.g. an image denoising task, an image inpainting task, an image compression task, a single-image super-resolution task, a next frame prediction task and/or sound generation task from image data or video data.
  • the main task may be e.g. an image classification task, an image segmentation task, an image object detection task, an image or a video captioning task, a salient object detection task and/or a video object tracking task.
  • the NN devices receive data and signalling information associated with the data, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural network.
  • the signalling information may further comprise other information as described above.
  • the NN devices comprise one or more neural networks for performing the main task, referred to as main neural networks. At least some of these main neural networks may have been trained on different data domains.
  • the NN devices may use the data to train an auxiliary network for performing the auxiliary task.
  • the training may be carried out in an unsupervised way or in a self-supervised way.
  • the auxiliary network may be derived from the main network.
  • the auxiliary network comprises the already-trained initial layers of the main network and the new layers which are not yet trained. Using the initial layers is an example, which may be very common in practice, but the method presented herein is not limited to such a case.
  • the auxiliary network may re-use any combination of layers, or a subset thereof, from the main network.
  • the training may comprise using, as initial values for the parameters of the re-used subset of layers, the values of the corresponding layers of the main neural network.
  • Features extracted by the initial layers of the main neural network transfer better to another neural network if the domain of the input data and the domain of the data used to train the main neural network are similar. In that case, the features extracted by the re-used or transferred layers perform well also for an auxiliary task, which may be an unsupervised or a self-supervised task.
  • the transfer learning is described in more detail in the context of Fig. 4.
  • Fig. 3b shows, by way of example, the signalling from the NN devices 130, 131, 132 to the user device 110.
  • Indications of performance 330 of the auxiliary neural networks are sent from the NN devices to the user device.
  • Indications of performance indicate how well the auxiliary neural network performed the auxiliary task. Indications of performance need to be comparable among different NN devices such that the most optimal neural network may be chosen.
  • a method for determining performance of the auxiliary network and/or a format of providing the indication of performance may be indicated in the signalling information 320 sent from the user device 110 to the plurality of NN devices 130, 131, 132.
  • the indications of performance may comprise a convergence speed. The convergence speed may be measured, for example, by how quickly the training loss of the auxiliary neural network decreases.
  • the performance of the auxiliary net may be described by loss values. Examples of losses are mean squared error, cross-entropy, etc.
  • a loss may be computed based on the input to the auxiliary neural network, i.e. the data received from the user device, and the output of the neural network.
  • the performance of the auxiliary neural network may be described by how much the auxiliary neural network modified the weights of the initial layers, i.e. re-used or transferred layers.
  • the received data may be divided into two parts. For example, if the data is an image, it may be divided into two halves. One part is used to train the auxiliary neural network and the other part is used as validation data. The loss computed on the validation data is the value which will be compared among different NN devices. This approach is more robust in cases where the auxiliary task comprises reconstructing the input data and where the input data is not corrupted by any noise or other modification.
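A sketch of this split-based indication follows, assuming hypothetical train_fn and loss_fn callables and an array-like image (the patent does not fix these interfaces):

```python
def indication_of_performance(image, train_fn, loss_fn):
    """Divide the received image into two halves: train the auxiliary
    network on one half, compute the validation loss on the other.

    The returned validation loss is the value compared among NN devices.
    `image` is assumed to be an array with a leading spatial dimension.
    """
    h = image.shape[0] // 2
    train_part, val_part = image[:h], image[h:]
    aux_net = train_fn(train_part)        # train AuxNet on the first half only
    return loss_fn(aux_net, val_part)     # loss on data not used for training
```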
  • a decision may be made about which NN device has the best main network for the main task.
  • the decision may be made based on the training session of the auxiliary networks. Thus, one of the plurality of main neural networks is selected, based on the indications of performance of the auxiliary neural networks, for performing a main task on the data.
  • the selection may be made by comparing the indication(s) of performance received from the plurality of NN devices to at least one predetermined criterion, e.g., which auxiliary neural network reached the lowest loss, which auxiliary neural net converged fastest, or which auxiliary neural net’s training modified the weights of the re-used or transferred K layers the least, in case the re-used or transferred K layers were also trained, i.e. their weights or parameters were tuned. See Fig. 4 for the K layers.
  • the user device may select the first NN device or network that fulfils the at least one predetermined criterion.
  • the selected neural network is not necessarily the best according to at least one predetermined criterion mentioned above.
  • the user device may select the NN device 130 if it provides an indication of sufficient performance before receiving the requested indications of performance from the other NN devices 131, 132.
  • Sufficient performance may be defined by a predetermined threshold, e.g. a specific convergence speed.
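A sketch of this early-selection rule follows; the streaming interface and the meaning of the threshold (here a convergence speed, higher taken as better) are assumptions for illustration:

```python
def select_first_sufficient(indications, threshold):
    """Pick the first NN device whose indication of performance meets a
    predetermined threshold, without waiting for the remaining devices."""
    for device, convergence_speed in indications:   # in order of arrival
        if convergence_speed >= threshold:
            return device
    return None  # no device reported sufficient performance

# e.g. device "130" answers first and already meets the threshold:
print(select_first_sufficient([("130", 0.9), ("131", 0.95)], threshold=0.8))
```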
  • the user device may request the selected main neural network to perform the main task on the data and receive an output of the main task.
  • the user device may request one of the plurality of devices comprising the selected main neural network to provide the selected main neural network.
  • the user device may receive the selected main neural network and perform the main task on the data using the selected main neural network.
  • the NN device may receive a request to perform a main task by the selected main neural network or to provide the selected main neural network to another device, e.g. to the user device 110 or to an intermediate device 350.
  • the NN device may perform the main task and transmit the results to the user device 110, or transmit its main neural network to the user device 110 or another device identified in the request.
  • the output of the main task performed by the main neural network may also be provided to the user device together with the indications of performance, i.e. the AuxNet’s performance.
  • the user device may compare the training information of the auxiliary networks, make a selection of the best main neural network, and use the output of the desired neural network, for example the best performing main neural network.
  • the output of the main task performed by the main neural network may also be provided to the user device together with the main neural network.
  • the user device may compare the training information of the auxiliary networks, make a selection of the best main neural network, and use this best main neural network to obtain the desired output.
  • the intermediate device 350 may perform e.g. receiving the data and the main task ID from the user device 110.
  • the intermediate device 350 may broadcast the data and the main task ID to the NN devices 130, 131, 132, with associated signalling information similar to the signalling information 320.
  • the intermediate device 350 may receive the information on performance of the trained auxiliary networks from the NN devices and make the decision about the model.
  • the intermediate device 350 may request the NN device having the selected model to provide the main NN’s output to the user device.
  • if the NN devices already sent the outputs of their main networks, this entity will just forward the selected main task output to the user device.
  • if the NN devices already sent the main networks to the intermediate device 350, then this entity will either run the best main network on the given input and send the obtained output back to the user device, or forward the best main network to the user device.
  • Fig. 4 shows, by way of example, a process of transfer learning and training of an auxiliary network.
  • the main network comprises the initial layers 430, which may extract low/mid-level features 435.
  • the layers 450 are layers which are more specific to the main task.
  • the layers 450 may be one or a few layers which are needed for performing classification, such as one or more convolutional layers, one or more fully-connected layers, and a final softmax layer.
  • the layers 450 may comprise one fully-connected layer and a softmax layer.
  • in the example of Fig. 4, the transferred layers are the first K layers, but it is to be noted that any subset of K layers may be used.
  • the NN device may transfer the initial K layers 430 from the main network MainNet 410 to the auxiliary network AuxNet 420.
  • the transfer may be carried out, for example, by copying the initial layers 430 to obtain re-used or transferred layers 440.
  • the auxiliary network may comprise a subset of layers of the main neural network.
  • the first K layers 440 may also be trained during training of the auxiliary network. Any modification to these layers done during auxiliary training will not affect the original copy of these layers in the main network.
  • These re-used or transferred K layers may function as feature extraction layers.
  • the device will add new layers to complete the auxiliary network.
  • the new layers 460 of the auxiliary network may be chosen based on the received additional information, i.e. the signalling information.
  • the new layers 460 may be completely untrained, for example initialized with one of the common initialization methods, or may be pre-trained on another dataset.
  • the architecture of the auxiliary net may be the same in different devices. Similarity of the architectures may be agreed e.g. by receiving the architecture information from the user device or from a third-party entity. Alternatively the similarity of the architectures may be negotiated and agreed by the NN devices with each other. The NN device may set the learning rate and other training hyper-parameters as instructed in the signalling information received from the user device or from the third-party entity or as agreed with other NN devices. The NN device may train the auxiliary neural network for the number of iterations or epochs specified in the additional information.
  • the auxiliary network 420 may comprise the pre-trained re-used or transferred layers 440, followed by new layers 460.
  • the new layers may be pre-trained or only initialized with one of the methods known to a skilled person.
  • the new network is then trained on a new task and/or a new data domain.
  • the new layers 460 are trained with a sufficiently high learning rate, e.g. 0.01, whereas the pre-trained layers 440 may be left unmodified, or be fine-tuned using a smaller learning rate than for the new layers, e.g. 0.001.
  • Learning rate above 0.007 may be considered high.
  • Learning rate below 0.002 may be considered low.
  • The learning rate is the update step size: a smaller learning rate means that the weights are updated by a smaller amount.
  • a learning rate of the subset of layers 440, i.e. the re-used or transferred layers from the main neural network, may be lower than a learning rate of the other layers 460 of the auxiliary neural network.
  • the new layers 460 may be trained more than the subset of layers 440 transferred, e.g. copied, from the main neural network 410. With lower learning rate of the subset of layers 440, the subset of layers may be preserved close to the initial K layers 430. This may ensure that the performance of the auxiliary network provides a good estimate of the performance of the main neural network.
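A hedged PyTorch sketch of this transfer and two-rate training setup follows. The architectures are toy stand-ins; only the copying of the first K layers and the per-group learning rates (0.001 vs. 0.01, the example values above) reflect the text:

```python
import copy
import torch.nn as nn
import torch.optim as optim

K = 2  # number of initial layers to transfer (an illustrative value)

main_net = nn.Sequential(                        # toy MainNet 410
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),    # initial layers 430
    nn.Flatten(), nn.Linear(8 * 32 * 32, 10),    # task-specific layers 450
)

transferred = copy.deepcopy(main_net[:K])        # re-used layers 440; a copy, so
                                                 # auxiliary training cannot affect
                                                 # the original layers in MainNet
new_layers = nn.Sequential(                      # new layers 460, freshly initialized
    nn.Conv2d(8, 3, 3, padding=1),
)
aux_net = nn.Sequential(transferred, new_layers) # AuxNet 420

optimizer = optim.SGD([
    {"params": transferred.parameters(), "lr": 0.001},  # fine-tune gently
    {"params": new_layers.parameters(), "lr": 0.01},    # train new layers faster
])
```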
  • the initial layers may be considered to be good if they transfer well to the new training, i.e., if the new network, after being trained, performs well on the new task and/or domain. Transfer learning may be applied to the initial layers and/or to any subset of layers.
  • the NN devices may transfer the initial layers of the main network to a new network, referred to as the auxiliary network, which is then trained to perform an auxiliary task.
  • This auxiliary task may be an unsupervised task or a self-supervised task.
  • Unsupervised training or self-supervised training 470 are training approaches where the input data to the neural network is obtained by modifying the data 480, and the original version of the data is used as the ground-truth desired data to which the neural network’s output is compared for computing the loss and the weight updates.
  • Modification of data may be a degradation operation, such as addition of noise, removal of portions of the data, etc., or other modifications.
  • the auxiliary neural network’s unsupervised task is to recover the original version of the data, i.e., to denoise or to fill-in the missing portions.
  • Another type of modification may comprise splitting the data into different parts. Splitting may be done for example spatially, e.g. by splitting an image into different crops or blocks. Alternatively, splitting may be done temporally, e.g. by splitting a video into past frames and future frames. Such modifications may be used for the unsupervised task of predicting one split from the other. For example, a neural network may be trained to take as input one image’s crop/patch/block, and to output a prediction of the neighbouring block. Then, the loss may be computed by computing the mean squared error or other suitable loss measure between the network’s predicted block and the corresponding real block.
  • a network may be trained to get as input the past frames of a video and to output a prediction of the following frames. Then, the loss may be computed by computing the mean squared error or other suitable loss measure between the predicted future frames and the real future frames.
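A small sketch of these two kinds of self-supervised pair construction (additive-noise denoising and temporal splitting); shapes and the noise level are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def denoising_pair(data, noise_std=0.1):
    """Degrade the data with additive noise; the original is the ground truth."""
    return data + rng.normal(0.0, noise_std, size=data.shape), data

def temporal_split(video, n_past):
    """Split a video into past frames (input) and future frames (target)."""
    return video[:n_past], video[n_past:]

image = rng.random((32, 32))
noisy_input, target = denoising_pair(image)
loss = np.mean((noisy_input - target) ** 2)   # e.g. MSE between a network's
                                              # output and the original data
                                              # (shown on the raw pair here)
```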
  • the data is image data or video data.
  • the auxiliary task may be an image denoising task.
  • the auxiliary task may be an image inpainting task.
  • the auxiliary task may be an image compression task.
  • the auxiliary task may be a single-image super-resolution task.
  • A super-resolution task may be realized by a neural network which is trained to perform upsampling of the input image, or to improve the quality, e.g. the mean squared error, of a previously upsampled image.
  • the auxiliary task may be a next frame prediction task.
  • the auxiliary task may be a sound generation task, if the received data comprises also an audio track.
  • the auxiliary task may be any combination of these tasks.
  • the data is image data or video data.
  • the main task may be an image classification task.
  • the main task may be an image segmentation task.
  • the main task may be an image object detection task.
  • the main task may be an image or video captioning task.
  • the main task may be a salient object detection task.
  • the main task may be a video object tracking task.
  • the main task may be any combination of these tasks.
  • Fig. 5 shows, by way of example, a flowchart of a method 500 for selecting a neural network.
  • the method 500 may be carried out e.g. by the devices 130, 131, 132 having at least one neural network.
  • the method may comprise receiving 510 data and signalling information associated with the data, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural network.
  • the method may comprise training 520 the auxiliary network for performing the auxiliary task.
  • the method may comprise providing 530 an indication of performance of the auxiliary neural network for performing the auxiliary task.
  • the method may comprise receiving 540, in response to providing the indication of performance, a request to perform a main task by a selected main neural network or to provide the selected main neural network to another device.
  • the other device may be e.g. the user device 110 or the intermediate device 350.
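A hedged sketch of method 500 from the NN device's side; every callable here is a hypothetical stand-in for the device's I/O and training machinery, not an interface defined by the patent:

```python
def method_500(receive, send, build_aux_net, train, measure):
    """Steps 510-540 as a single pass on the NN device."""
    data, signalling = receive()           # step 510: data + AuxTaskID etc.
    aux_net = build_aux_net(signalling)    # derive AuxNet from MainNet layers
    train(aux_net, data, signalling)       # step 520: train for the auxiliary task
    send(measure(aux_net, data))           # step 530: indication of performance
    request = receive()                    # step 540: perform the main task, or
    return request                         #           provide the main network
```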
  • the determination of whether the domains are different may be carried out by the user device such that the user device builds an auxiliary neural network.
  • the user device needs to have capabilities for training neural networks.
  • the auxiliary neural network may be built as described above in the context of the NN device. Then, the user device may train the auxiliary neural network in self-supervised way. If the performance is not high enough, the domain of the neural network of the user device may be determined to be different from the domain of the input data.
  • circuitry may refer to one or more or all of the following: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry) and (b) combinations of hardware circuits and software, such as (as applicable):
  • circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware.
  • circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.
  • a system may comprise a first apparatus and a plurality of second apparatuses.
  • the first apparatus may be a user device 110.
  • the second apparatus may be an NN device 130, 131, 132.
  • the plurality of second apparatuses each comprise a main neural network and an auxiliary neural network.
  • the system may comprise means for receiving, by the first apparatus, data to be processed by one of a plurality of main neural networks.
  • the system may comprise means for providing the data and signalling information associated with the data to a plurality of devices each comprising a main neural network and an auxiliary neural network, the auxiliary neural network comprising a subset of layers of the main neural network, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural networks at the plurality of devices.
  • the system may comprise means for receiving, by the first apparatus from the plurality of devices, indications of performance of the auxiliary neural networks for performing the auxiliary task.
  • the system may comprise means for selecting, based on the indications of performance of the auxiliary neural networks, one of the plurality of main neural networks for performing a main task on the data.
  • the system may comprise means for receiving, by the plurality of second apparatuses, data and signalling information associated with the data, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural network.
  • the system may comprise means for training, by the plurality of second apparatuses, the auxiliary network for performing the auxiliary task.
  • the system may comprise means for providing, by the plurality of second apparatuses to the first apparatus, an indication of performance of the auxiliary neural network for performing the auxiliary task.
  • the system may comprise means for receiving by the second apparatus comprising a selected main neural network, in response to providing the indication of performance, a request to perform a main task by the selected main neural network or to provide the selected main neural network to another device.
  • the system may comprise means for requesting, by the first apparatus, the selected main neural network to perform the main task on the data.
  • the system may comprise means for receiving an output of the main task from the second apparatus comprising the selected main neural network.
  • the system may comprise means for requesting, by the first apparatus, the second apparatus comprising the selected main neural network to provide the selected main neural network.
  • the system may comprise means for receiving, by the first apparatus, the selected main neural network.
  • the system may comprise means for performing, by the first apparatus, the main task on the data using the selected main neural network.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an apparatus comprising means for receiving data to be processed by one of a plurality of main neural networks (210). The apparatus comprises means for providing the data and signalling information associated with the data to a plurality of devices each comprising a main neural network and an auxiliary neural network, the auxiliary neural network comprising a subset of layers of the main neural network, wherein the signalling information comprises an identifier of an auxiliary task to be performed on the data by the auxiliary neural networks at the plurality of devices (220). The apparatus comprises means for receiving, from the plurality of devices, indications of performance of the auxiliary neural networks for performing the auxiliary task (230). The apparatus comprises means for selecting, based on the indications of performance of the auxiliary neural networks, one of the plurality of main neural networks for performing a main task on the data (240).
EP19814193.9A 2018-06-08 2019-05-21 Apparatus, method and computer program for selecting a neural network Pending EP3803712A4 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FI20185527 2018-06-08
PCT/FI2019/050393 WO2019234291A1 (fr) 2018-06-08 2019-05-21 Apparatus, method and computer program for selecting a neural network

Publications (2)

Publication Number Publication Date
EP3803712A1 (fr) 2021-04-14
EP3803712A4 EP3803712A4 (fr) 2022-04-20

Family

ID=68770064

Family Applications (1)

Application Number Title Priority Date Filing Date
EP19814193.9A Pending EP3803712A4 (fr) Apparatus, method and computer program for selecting a neural network

Country Status (2)

Country Link
EP (1) EP3803712A4 (fr)
WO (1) WO2019234291A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401474B (zh) * 2020-04-13 2023-09-08 Oppo Guangdong Mobile Telecommunications Co., Ltd. Training method, apparatus, device and storage medium for a video classification model
CN114494800B (zh) * 2022-02-17 2024-05-10 Ping An Technology (Shenzhen) Co., Ltd. Prediction model training method, apparatus, electronic device and storage medium
CN115328661B (zh) * 2022-09-09 2023-07-18 Zhongcheng Hualong Computer Technology Co., Ltd. Computing-power-balanced execution method and chip based on speech and image features

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9633315B2 (en) * 2012-04-27 2017-04-25 Excalibur Ip, Llc Method and system for distributed machine learning
GB2539845B (en) * 2015-02-19 2017-07-12 Magic Pony Tech Ltd Offline training of hierarchical algorithms
US10878318B2 (en) * 2016-03-28 2020-12-29 Google Llc Adaptive artificial neural network selection techniques
US10733534B2 (en) * 2016-07-15 2020-08-04 Microsoft Technology Licensing, Llc Data evaluation as a service

Also Published As

Publication number Publication date
EP3803712A4 (fr) 2022-04-20
WO2019234291A1 (fr) 2019-12-12

Similar Documents

Publication Publication Date Title
Huang et al. Efficient uncertainty estimation for semantic segmentation in videos
Mathieu et al. Deep multi-scale video prediction beyond mean square error
Roy et al. Impulse noise removal using SVM classification based fuzzy filter from gray scale images
CN113159073B (zh) Knowledge distillation method and apparatus, storage medium, and terminal
US11062210B2 (en) Method and apparatus for training a neural network used for denoising
WO2019234291A1 (fr) Apparatus, method and computer program for selecting a neural network
EP3818502A1 (fr) Method, apparatus and computer program product for image compression
US11200648B2 (en) Method and apparatus for enhancing illumination intensity of image
Juefei-Xu et al. Rankgan: a maximum margin ranking gan for generating faces
EP3767549A1 (fr) Provision of compressed neural networks
CN114511576B (zh) Image segmentation method and system based on a scale-adaptive feature-enhanced deep neural network
WO2021042857A1 (fr) Processing method and processing apparatus for an image segmentation model
CN113283368B (zh) Model training method, face attribute analysis method, apparatus and medium
CN113469283A (zh) Image classification method, and training method and device for an image classification model
WO2016142285A1 (fr) Method and apparatus for image search using scattering analysis operators
CN115293348A (zh) Pre-training method and apparatus for a multimodal feature extraction network
US20220277430A1 (en) Spatially adaptive image filtering
WO2020165490A1 (fr) Method, apparatus and computer program product for video coding and decoding
WO2022246986A1 (fr) Data processing method, apparatus and device, and computer-readable storage medium
CN112966754B (zh) Sample screening method, sample screening apparatus and terminal device
Li et al. Cyclic annealing training convolutional neural networks for image classification with noisy labels
CN110490876B (zh) Image segmentation method based on a lightweight neural network
Dahanayaka et al. Robust open-set classification for encrypted traffic fingerprinting
CN114118207B (zh) Incremental-learning image recognition method based on network expansion and a memory recall mechanism
Li et al. Automatic channel pruning with hyper-parameter search and dynamic masking

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20210111

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
A4 Supplementary search report drawn up and despatched

Effective date: 20220317

RIC1 Information provided on ipc code assigned before grant

Ipc: G06T 9/00 20060101ALN20220311BHEP

Ipc: H04L 69/24 20220101ALI20220311BHEP

Ipc: G06N 3/08 20060101ALI20220311BHEP

Ipc: G06N 3/04 20060101AFI20220311BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20240201